MongoDB aggregate and return multiple document values - mongodb

Assuming I have the following JSON structure, I want to group by gender and return multiple document values in the same field:
[
{
"id": 0,
"age": 40,
"name": "Tony Bond",
"gender": "male"
},
{
"id": 1,
"age": 30,
"name": "Nikki Douglas",
"gender": "female"
},
{
"id": 2,
"age": 23,
"name": "Kasey Cardenas",
"gender": "female"
},
{
"id": 3,
"age": 25,
"name": "Latasha Burt",
"gender": "female"
}
]
Now I know I can do something like this, but I need to join both age and name into one field:
.aggregate().group({ _id:'$gender', age: { $addToSet: "$age" }, name: { $addToSet: "$name"}})

Yes you can, just have a sub-document as the argument:
db.collection.aggregate([
{ "$group": {
"_id": "$gender",
"person": { "$addToSet": { "name": "$name", "age": "$age" } }
}}
])
Of course, if you do not actually expect any duplicates here, $push does the same thing:
db.collection.aggregate([
{ "$group": {
"_id": "$gender",
"person": { "$push": { "name": "$name", "age": "$age" } }
}}
])
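To make the result shape concrete, here is a minimal plain JavaScript sketch (not MongoDB code) that mimics what the $push variant computes for the sample data in the question. The helper name groupPeopleByGender is made up for illustration.

```javascript
// Sample documents copied from the question.
const people = [
  { id: 0, age: 40, name: "Tony Bond", gender: "male" },
  { id: 1, age: 30, name: "Nikki Douglas", gender: "female" },
  { id: 2, age: 23, name: "Kasey Cardenas", gender: "female" },
  { id: 3, age: 25, name: "Latasha Burt", gender: "female" },
];

// Hypothetical helper: emulates { $group: { _id: "$gender",
//   person: { $push: { name: "$name", age: "$age" } } } } in memory.
function groupPeopleByGender(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.gender)) {
      groups.set(doc.gender, { _id: doc.gender, person: [] });
    }
    // Each entry is a sub-document holding both values together.
    groups.get(doc.gender).person.push({ name: doc.name, age: doc.age });
  }
  return [...groups.values()];
}

const grouped = groupPeopleByGender(people);
console.log(JSON.stringify(grouped, null, 2));
```

Note that the real $group stage does not guarantee the order of the groups it returns, and $addToSet does not guarantee the order of the elements in the array.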

In addition to what Neil mentioned about the aggregation query, keep in mind that each document holding a group's collected records cannot exceed 16 MB (in v2.5+), and that the entire aggregation result cannot exceed 16 MB (in v2.4 and below).
So if you have a huge data set, be aware of the limits this imposes on the model you use to aggregate data.

Related

Restructuring a collection in MongoDB

My collection called "sets" currently looks like this:
[{
"_id": {
"$oid": "61c2c90b04a5c1fd873bca6c"
},
"exercise": "Flat Barbell Bench Press",
"repetitions": 8,
"rpe": 8,
"__v": 0,
"weight": 90,
"createdAt": {
"$date": {
"$numberLong": "1640155403594"
}
}
}]
It's an array of about 1500 documents, several months' worth of workouts.
What I'm trying to accomplish is this:
[{
"_id": {
"$oid": "62f3cee8d149f0c3534d848c"
},
"user": {
"$oid": "62d11eaa0caf6d2b3133b4b9"
},
"sets": [
{
"weight": 50,
"exercise": "Bench Press",
"repetitions": 8,
"rpe": 8,
"notes": "some note",
"_id": {
"$oid": "62f3cee8d149f0c3534d848d"
}
},
{},
{}
],
"createdAt": {
"$date": {
"$numberLong": "1660145384923"
}
}
}]
Essentially, I want to embed an array of "set" objects as the value of a "sets" field, so that instead of a list of sets I have a list of workouts, each storing its sets as an array of objects in a field called "sets".
Each "set" object has a date stamp, and I also need to group these sets by day. In the end, each new document represents one workout and has id, user and sets fields, where every set is from that day.
My Stack Overflow research tells me that I need to use aggregation, but I can't quite wrap my mind around how exactly I would do that.
Any help would be greatly appreciated!
/* UPDATE */
Here's the final query I came up with, hope someone will find it useful.
db.collection.aggregate([
{
$group: {
_id: {
$dateToString: {
format: "%Y-%m-%d",
date: "$createdAt"
}
},
sets: {
$push: {
_id: "$_id",
exercise: "$exercise",
repetitions: "$repetitions",
rpe: "$rpe",
__v: "$__v",
weight: "$weight",
createdAt: "$createdAt"
}
}
}
},
{
"$addFields": {
"user": "UserID",
"date": "$_id"
}
},
{
$project: {
"_id": 0
}
}
])
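For readers who want to see what the grouping does without a database at hand, the following is a rough in-memory sketch in plain JavaScript. The sample sets and the groupSetsByDay helper are hypothetical (only the first timestamp comes from the question), and "UserID" is the same placeholder literal used in the query above.

```javascript
// Hypothetical sample data; only the first createdAt value is from the question.
const sets = [
  { exercise: "Flat Barbell Bench Press", repetitions: 8, weight: 90, createdAt: new Date(1640155403594) },
  { exercise: "Squat", repetitions: 5, weight: 120, createdAt: new Date("2021-12-22T07:10:00Z") },
  { exercise: "Deadlift", repetitions: 5, weight: 140, createdAt: new Date("2021-12-24T18:00:00Z") },
];

// Hypothetical helper: emulates grouping on
// $dateToString with format "%Y-%m-%d" and pushing each set into "sets".
function groupSetsByDay(docs, userId) {
  const workouts = new Map();
  for (const doc of docs) {
    // toISOString() is always UTC, so the first 10 characters are YYYY-MM-DD in UTC.
    const day = doc.createdAt.toISOString().slice(0, 10);
    if (!workouts.has(day)) {
      workouts.set(day, { user: userId, date: day, sets: [] });
    }
    workouts.get(day).sets.push(doc);
  }
  return [...workouts.values()];
}

const workouts = groupSetsByDay(sets, "UserID");
console.log(workouts.map((w) => `${w.date}: ${w.sets.length} set(s)`));
```

One thing to keep in mind: $dateToString formats dates in UTC unless a timezone option is given, so sets logged around midnight local time may end up in a different day than expected.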

Merge two documents in the same collection in MongoDB

I've been trying to merge some duplicate documents in my collection called Cities.
Consider the following:
{
"_id": 1,
"Name": "Santa Monica",
"State": "California"
},
{
"_id": {"$oid":"5ec5fcc993ce00388429278a"},
"Name": "Santa Monica",
"State": "California",
"TimeZone": "UTC−8",
"Population": 90401,
"State": "California"
}
The second entry does have more information, like TimeZone and Population.
I want to merge both in only one object, preserving the first _id like this:
{
"_id": 1,
"Name": "Santa Monica",
"State": "California",
"TimeZone": "UTC−8",
"Population": 90401,
"State": "California"
}
I know that I have to use aggregation but don't know how. The documentation lacks a clear example of how to do this within a single collection.
Thanks in advance.
You need to group documents by Name and State fields and merge them.
Try this one:
db.Cities.aggregate([
{
$group: {
_id: {
"Name": "$Name",
"State": "$State"
},
data: {
$push: "$$ROOT"
}
}
},
{
"$replaceWith": {
"$mergeObjects": {
"$reverseArray": "$data"
}
}
}
])
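Note: $mergeObjects lets later objects overwrite earlier ones, so $reverseArray flips the grouped array to make the fields of the first document (including its _id) win.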
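The merge order can be illustrated with plain JavaScript, since Object.assign has the same "last one wins" behaviour as $mergeObjects. This is only a sketch of the semantics, assuming the two documents reach $group in the order shown in the question; the ObjectId is written as a plain string to keep the snippet self-contained.

```javascript
// The two duplicate documents from the question (ObjectId simplified to a string).
const first = { _id: 1, Name: "Santa Monica", State: "California" };
const second = {
  _id: "5ec5fcc993ce00388429278a",
  Name: "Santa Monica",
  State: "California",
  TimeZone: "UTC−8",
  Population: 90401,
};

// Emulates data: { $push: "$$ROOT" } for one group (arrival order is an assumption).
const data = [first, second];

// Emulates $mergeObjects over $reverseArray: objects are merged left to right,
// so after reversing, "first" is applied last and its _id overwrites the other one.
const merged = Object.assign({}, ...[...data].reverse());

console.log(merged);
```

Two caveats worth checking before relying on this: $group does not guarantee the order in which documents are pushed unless a $sort stage precedes it, and the pipeline only returns the merged document; it does not modify the collection.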

Get distinct values from each field within mongodb collection

How can I get the distinct values of every field in a MongoDB collection using a single query?
{ "_id": "1", "Gender": "male", "car": "bmw" , "house":"2bhk" , "married_to": "kalpu"},
{ "_id": "2", "Gender": "female", "car": nan , "house":"3bhk", "married_to": "kalpu"},
{ "_id": "3", "Gender": "female", "car": "audi", "house":"1bhk", "married_to": "deepa"},
This is an example with only a few fields. In my actual collection, each document has at least 50 fields. So how can I write an efficient query that returns the unique values of each field? Thanks in advance for your help.
Answer expected:
for each field,
Gender:"male", "female"
car :"bmw", "audi",.....
house : "3bhk","2bhk","1bhk"
married_to: "kalpu","deepa",....
....
....
...
You can use the aggregation pipeline's $group stage with the $addToSet operator:
db.collection.aggregate([
{
$group: {
_id: null,
Gender: {
"$addToSet": "$Gender"
},
car: {
"$addToSet": "$car"
},
house: {
"$addToSet": "$house"
},
married_to: {
"$addToSet": "$married_to"
},
}
}
])
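Since the real collection has 50 or more fields, writing the $group stage by hand gets tedious. One possible approach is to build the stage from a list of field names in JavaScript. The function name buildDistinctValuesPipeline is hypothetical, and the sketch assumes simple top-level field names (no dots, no leading $).

```javascript
// Hypothetical helper: builds a $group stage with one $addToSet accumulator
// per field name, matching the hand-written pipeline above.
function buildDistinctValuesPipeline(fieldNames) {
  const groupStage = { _id: null };
  for (const field of fieldNames) {
    groupStage[field] = { $addToSet: "$" + field };
  }
  return [{ $group: groupStage }];
}

const distinctPipeline = buildDistinctValuesPipeline([
  "Gender",
  "car",
  "house",
  "married_to",
]);

console.log(JSON.stringify(distinctPipeline, null, 2));
// In the mongo shell, this array could then be passed to
// db.collection.aggregate(distinctPipeline).
```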

Filtering a mongodb query result based on the position of a field in an array

Apologies for the confusing title, I am not sure how to summarize this.
Suppose I have the following list of documents in a collection:
{ "name": "Lorem", "source": "A" }
{ "name": "Lorem", "source": "B" }
{ "name": "Ipsum", "source": "A" }
{ "name": "Ipsum", "source": "B" }
{ "name": "Ipsum", "source": "C" }
{ "name": "Foo", "source": "B" }
as well as an ordered list of accepted sources, where lower indexes signify higher priority:
sources = ["A", "B"]
My query should:
Take a list of available sources and a list of wanted names
Return a maximum of one document per name.
In case of multiple matches, the document with the most prioritized source should be chosen.
Example:
wanted_names = ['Lorem', 'Ipsum', 'Foo', 'NotThere']
Result:
{ "name": "Lorem", "source": "A" }
{ "name": "Ipsum", "source": "A" }
{ "name": "Foo", "source": "B" }
The results don't necessarily have to be ordered.
Is it possible to do this with a Mongo query alone? If so could someone point me towards a resource detailing how to accomplish it?
My current solution doesn't support a list of names, and instead relies on a Python script to execute multiple queries:
db.collection.aggregate([
{$match: {
"name": "Lorem",
"source": {
$in: sources
}}},
{$addFields: {
"order": {
$indexOfArray: [sources, "$source"]
}}},
{$sort: {
"order": 1
}},
{$limit: 1}
]);
Note: _id fields are omitted in this question for the sake of brevity
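How about this: within $group, the $min operator picks the lowest source for each name. This matches the priority order only because the sources ["A", "B"] happen to be sorted alphabetically.
Note: If you prioritize as ['B', 'A'], use $max instead.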
db.collection.aggregate([
{
$match: {
"name": {
$in: [
"Lorem",
"Ipsum",
"Foo",
"NotThere"
]
},
"source": {
$in: [
"A",
"B"
]
}
}
},
{
$group: {
_id: "$name",
source: {
$min: "$source"
}
}
},
{
$project: {
_id: 0,
name: "$_id",
source: 1
}
}
])
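If the priority list is not in alphabetical (or reverse alphabetical) order, $min and $max no longer help. A possible generalisation, sketched here as an assumption rather than a tested solution, combines the asker's $indexOfArray idea with $sort and a $group that takes the $first source per name. The function name buildPriorityPipeline is made up for illustration.

```javascript
// Hypothetical helper: builds a pipeline that keeps, for each wanted name,
// the document whose source appears earliest in the "sources" priority list.
function buildPriorityPipeline(wantedNames, sources) {
  return [
    // Keep only wanted names coming from an accepted source.
    { $match: { name: { $in: wantedNames }, source: { $in: sources } } },
    // Position of the source in the priority list (0 = highest priority).
    { $addFields: { order: { $indexOfArray: [sources, "$source"] } } },
    // Best priority first, so $first in the next stage picks it.
    { $sort: { order: 1 } },
    { $group: { _id: "$name", source: { $first: "$source" } } },
    { $project: { _id: 0, name: "$_id", source: 1 } },
  ];
}

const priorityPipeline = buildPriorityPipeline(
  ["Lorem", "Ipsum", "Foo", "NotThere"],
  ["A", "B"]
);

console.log(JSON.stringify(priorityPipeline, null, 2));
// In the mongo shell: db.collection.aggregate(priorityPipeline)
```

The snippet only constructs the pipeline; it would still need to be run against the real collection to confirm the results.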

How to find document that contains array with two equal values?

I have a chats collection with a participants array:
[{
"_id": ObjectId("5d12b2a10507cfe0bad6d93c"),
"participants": [{
"_id": ObjectId("5ce4af580507cfe0ba1c6f5b"),
"firstname": "John",
"lastname": "Anderson",
"icon": "/assets/images/avatars/small/2.jpg"
},
{
"_id": ObjectId("5ce4af580507cfe0ba1c6f5b"),
"firstname": "John",
"lastname": "Anderson",
"icon": "/assets/images/avatars/small/2.jpg"
}
]
}, {
"_id": ObjectId("5d1124a50507cfe0baba7909"),
"participants": [{
"_id": ObjectId("5ce4af580507cfe0ba1c6f5b"),
"firstname": "John",
"lastname": "Anderson",
"icon": "/assets/images/avatars/small/2.jpg"
},
{
"_id": ObjectId("5ce54cb80507cfe0ba25d74b"),
"firstname": "Milosh",
"lastname": "Jersi",
"icon": "/assets/images/avatars/small/3.jpg"
}
]
}]
I fetch it by
req.db.collection('chats').findOne({'participants._id': {$all: [req.userID, new mongo.ObjectID(req.params.to)]}});
where req.userID is also an ObjectID and, in this case, equal to the other one.
Usually a chat has different participants, but a user can also send messages to themselves, which many social networks allow.
In this situation, our user "John Anderson" sent a message to himself and we inserted a chat document for it.
Now I have a problem: how do I get the document whose array values are equal?
{'participants._id': { '$all': [ 5ce4af580507cfe0ba1c6f5b, 5ce4af580507cfe0ba1c6f5b] }}
// returns every chat that contains our id in at least one entry, but we need both to be equal
// same for $in
{'participants._id': { '$eq': [ 5ce4af580507cfe0ba1c6f5b, 5ce4af580507cfe0ba1c6f5b] }}
// returns nothing
What else can I do?
You can achieve this with the aggregation framework, using a $group stage. First, unwind the participants, group by the chat's _id and use $addToSet to keep only unique users in a new array, and then add a filter to keep only the documents with one participant:
db.collection.aggregate([
{
"$unwind": "$participants"
},
{
"$group": {
"_id": "$_id",
"participants": {
"$addToSet": "$participants._id"
}
}
},
{
"$match": {
"participants": {
"$size": 1
}
}
}
])
result:
[
{
"_id": ObjectId("5d12b2a10507cfe0bad6d93c"),
"participants": [
ObjectId("5ce4af580507cfe0ba1c6f5b")
]
}
]
You can try it online: mongoplayground.net/p/plB-gsNIxRd
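The deduplicate-then-count idea behind that pipeline can also be written in a few lines of plain JavaScript, which may help when checking the logic. This is an in-memory sketch only: the ObjectIds are simplified to strings, the other participant fields are left out, and findSelfChats is a hypothetical helper name.

```javascript
// The two chats from the question, reduced to the ids that matter here.
const chats = [
  {
    _id: "5d12b2a10507cfe0bad6d93c",
    participants: [
      { _id: "5ce4af580507cfe0ba1c6f5b", firstname: "John" },
      { _id: "5ce4af580507cfe0ba1c6f5b", firstname: "John" },
    ],
  },
  {
    _id: "5d1124a50507cfe0baba7909",
    participants: [
      { _id: "5ce4af580507cfe0ba1c6f5b", firstname: "John" },
      { _id: "5ce54cb80507cfe0ba25d74b", firstname: "Milosh" },
    ],
  },
];

// Hypothetical helper: a Set plays the role of $addToSet, and the size check
// plays the role of { $size: 1 }.
function findSelfChats(allChats) {
  return allChats.filter((chat) => {
    const uniqueIds = new Set(chat.participants.map((p) => String(p._id)));
    return uniqueIds.size === 1;
  });
}

const selfChats = findSelfChats(chats);
console.log(selfChats.map((chat) => chat._id));
```

Like the pipeline above, this finds every chat with a single distinct participant; to restrict it to one particular user, an extra check on that user's id would be needed.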