How to Find MongoDB Documents with the Same Value and Count Them?

In MongoDB, I believe I need to use group aggregation to count the documents in a collection that share the same value. I need the results returned from greatest count to least, along with the common value for each result.
E.g.:
I have a normal query with a range (e.g. field "value" > 5). I assume I should use the "$match" stage for this when aggregating.
Get all documents with the same value for the "id" field that also match the above query parameters.
Sort the results from most matching documents to least.
Return the common value of "id" for each result.
Sample documents:
#1. Type: "Like", value: 6, id: 123
#2. Type: "Like", value: 7, id: 123
#3. Type: "Like", value: 7, id: 123
#4. Type: "Like", value: 8, id: 12345
#5. Type: "Like", value: 7, id: 12345
#6. Type: "Like", value: 6, id: 1234
#7. Type: "Like", value: 2, id: 1234
#8. Type: "Like", value: 2, id: 1234
#9. Type: "Like", value: 2, id: 1234
Expected output (assume I have a limit of 3 documents, and the query asks for only documents with the "value" field > 5):
1. id: 123
2. id: 12345
3. id: 1234
I expect these in this order, as the id 123 is most popular, and 1234 is least popular, of the documents where the "value" field > 5.
Ideally, I would have a method that would return something like a String[] of the resulting Ids, in order.

db.data.aggregate([
{$match: {value:{$gt:5}}},
{$group: {'_id':"$id", num:{$sum:1}, avg:{$avg:"$value"}}},
{$sort:{num:-1}}, { $limit : 50}
])

db.getCollection('my_collection').aggregate([
    // Only include documents whose field named "value" is greater than 5
    {
        $match: {
            value: {
                $gt: 5
            }
        }
    },
    // Using the documents gathered from the $match above, create a new set of
    // documents grouped by the "id" field, and use the "id" field as the "_id"
    // for the group. Make a new field called "num" that increments by 1 for
    // every matching document. Make a new field named "avg" that is the average
    // of the field named "value".
    {
        $group: {
            '_id': "$id",
            num: {
                $sum: 1
            },
            avg: {
                $avg: "$value"
            }
        } // end $group
    },
    // -- //
    // Note: you could do another $match here, which would run on the new
    // documents created by $group.
    // -- //
    // Sort the new documents by the "num" field in descending order
    {
        $sort: {
            num: -1
        }
    },
    // Only return the first 3 of the new documents
    {
        $limit: 3
    }
])
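The question also asks for a method that returns the ids in order. As a rough in-memory illustration of what the $match, $group, $sort and $limit stages compute, here is a plain JavaScript sketch run against the sample documents. It is not a MongoDB driver call: the helper name topIds and the docs array are made up for this example.

```javascript
// In-memory sketch of the pipeline's logic. `topIds` is a hypothetical helper,
// not part of MongoDB or any driver.
function topIds(docs, minValue, limit) {
  const counts = new Map();
  for (const doc of docs) {
    if (doc.value > minValue) {                            // $match: value > minValue
      counts.set(doc.id, (counts.get(doc.id) || 0) + 1);   // $group with num: { $sum: 1 }
    }
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])                           // $sort: { num: -1 }
    .slice(0, limit)                                       // $limit
    .map(([id]) => String(id));                            // keep only the group key
}

// The nine sample documents from the question.
const docs = [
  { type: "Like", value: 6, id: 123 },
  { type: "Like", value: 7, id: 123 },
  { type: "Like", value: 7, id: 123 },
  { type: "Like", value: 8, id: 12345 },
  { type: "Like", value: 7, id: 12345 },
  { type: "Like", value: 6, id: 1234 },
  { type: "Like", value: 2, id: 1234 },
  { type: "Like", value: 2, id: 1234 },
  { type: "Like", value: 2, id: 1234 },
];

const result = topIds(docs, 5, 3);
console.log(result); // [ '123', '12345', '1234' ]
```

With a real driver, the same ordered list would presumably come from mapping each aggregation result to its _id field.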

Related

Building mongo query

I have a model like this:
[{item: {
    _id: 123,
    field1: someValue,
    price: [
        {_id: 456,
         field1: anotherValue,
         type: []},
        {_id: 789,
         field1: anotherValue,
         type: ['super']}
    ]
}}]
I need to find an item by 3 parameters: the item _id, the price _id, and whether the price's type array is empty. Both price conditions must be checked against the same price entry.
Model.findOneAndUpdate({_id: 123, "price._id": 456, "price.type": {$size: 0}})
This query always returns the item, because the two price conditions can be satisfied by different price entries.
Model.findOneAndUpdate({_id: 123, price: {_id: 456, type: {$size: 0}}})
This query returns an error (a cast error on the array value, or something similar).
I tried building the query with $in and $and, but I still get an error.
Use $elemMatch:
The $elemMatch operator matches documents that contain an array field
with at least one element that matches all the specified query
criteria.
db.inventory.find({
price: {
"$elemMatch": {
_id: 456,
type: {
$size: 0
}
}
}
})
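To make the per-element check concrete, here is a small in-memory sketch in plain JavaScript. It only mimics the condition that $elemMatch applies to each price entry. The helper name hasPriceWithEmptyType is hypothetical, and the item is a trimmed-down version of the model from the question.

```javascript
// Hypothetical in-memory stand-in for
// { price: { $elemMatch: { _id: priceId, type: { $size: 0 } } } }.
function hasPriceWithEmptyType(item, priceId) {
  // A single price entry must satisfy both conditions at once.
  return item.price.some(
    (p) => p._id === priceId && Array.isArray(p.type) && p.type.length === 0
  );
}

const item = {
  _id: 123,
  price: [
    { _id: 456, type: [] },
    { _id: 789, type: ["super"] },
  ],
};

const matches456 = hasPriceWithEmptyType(item, 456); // price 456 has an empty type array
const matches789 = hasPriceWithEmptyType(item, 789); // price 789 has a non-empty type array
console.log(matches456, matches789); // true false
```

In an actual query, the item _id condition from the question can sit next to the price condition in the same filter document.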

Mongodb nested find

My database schema is somewhat like
{
"_id" : ObjectId("1"),
"createdAt" : ISODate("2017-03-10T00:00:00.000+0000"),
"user_list" : [
{
"id" : "a",
"some_flag" : 1,
},
{
"id" : "b",
"some_flag" : 0,
}
]
}
What I want to do is get the document where id is b & some_flag for the user b is 0.
My query is
db.collection.find({
'createdAt': {
$gte: new Date()
},
'user_list.id': 'b',
'user_list.some_flag': 1
}).sort({
createdAt: 1
})
When I run the query in the shell, it returns the document with _id 1, which it shouldn't, as the value of some_flag for b is 0.
What is happening here is:
the condition 'user_list.id': 'b' matches the nested object where "id" is "b";
the condition 'user_list.some_flag': 1 matches some_flag of the nested object where "id" is "a" (as the value of some_flag is 1 there).
What modifications should I make to compare id and some_flag on the same nested object?
P.S. The amount of data is quite large, and using aggregate would be a performance bottleneck.
You should use $elemMatch; otherwise MongoDB applies the query conditions to the array items independently. In your case, 'user_list.some_flag': 1 is matched by the array item with id a, and 'user_list.id': 'b' is matched by the array item with id b. So, if you want to query an array field with AND logic on a single element, use $elemMatch as follows:
db.collection.find({
'createdAt': {
$gte: new Date()
},
user_list: {$elemMatch: {id: 'b', some_flag: 1}} // true only if a single array element fulfills both conditions
}).sort({
createdAt: 1
})
You need to try something like:
db.collection.find({
'createdAt': {
$gte: new Date()
},
user_list: {
$elemMatch: {
id: 'b',
some_flag: 1
}
}
}).sort({
createdAt: 1
});
This will match only documents that have a user_list entry where id is b and some_flag is 1.
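The difference between the two query styles can be illustrated in memory with plain JavaScript. This is only a sketch of the matching semantics described above, not of MongoDB's implementation. The function names matchesIndependently and matchesSameElement are invented for the example.

```javascript
const doc = {
  user_list: [
    { id: "a", some_flag: 1 },
    { id: "b", some_flag: 0 },
  ],
};

// Dotted-path style ('user_list.id' and 'user_list.some_flag'):
// each condition may be satisfied by a different array element.
function matchesIndependently(d, id, flag) {
  return (
    d.user_list.some((u) => u.id === id) &&
    d.user_list.some((u) => u.some_flag === flag)
  );
}

// $elemMatch style: one array element must satisfy both conditions.
function matchesSameElement(d, id, flag) {
  return d.user_list.some((u) => u.id === id && u.some_flag === flag);
}

const independent = matchesIndependently(doc, "b", 1);
const sameElement = matchesSameElement(doc, "b", 1);
console.log(independent, sameElement); // true false
```

The first result is the unwanted match from the question; the second is the behaviour the $elemMatch answers aim for.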

How to check if multiple documents exist

Is there such a query that gets multiple fields, and returns which of these exists in the collection?
For example, if the collection has only:
{id : 1}
{id : 2}
And I want to know which of [{id : 1} , {id : 3}] exists in it, then the result will be something like [{id : 1}].
You are looking for the $in-operator.
db.collection.find({ id: { $in: [ 1, 3 ] } });
This will get you any documents where the id-field (different from the special _id field) is 1 or 3. When you only want the values of the id field and not the whole documents:
db.collection.find({ id: { $in: [ 1, 3 ] } }, { _id: false, id:true });
If you want to check whether a given key/value pair is present in the collection, you can match the values and combine the conditions with the $or operator.
This assumes id is different from _id in MongoDB.
You can use $or to get the expected output; the query is as follows:
db.collection.find({$or:[{"id":1},{"id":3}]},{"_id":0,"id":1})
If you want to match on _id, use the following query:
db.collection.find({$or:[{"_id":ObjectId("557fda78d077e6851e5bf0d3")},{"_id":ObjectId("557fda78d077e6851e5bf0d5")}]})
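As a quick in-memory illustration of what these queries return for the example in the question, the following plain JavaScript filters a stand-in collection array by a list of wanted ids. The variable names are made up for the sketch, and the extra "missing" list is an assumption about what a caller might also want.

```javascript
// Stand-in data for the example in the question.
const collection = [{ id: 1 }, { id: 2 }];
const wanted = [1, 3];

// Same idea as find({ id: { $in: wanted } }, { _id: false, id: true }):
// keep the documents whose id is one of the wanted values.
const wantedSet = new Set(wanted);
const existing = collection
  .filter((d) => wantedSet.has(d.id))
  .map((d) => ({ id: d.id }));

// The wanted values that were not found in the collection.
const foundIds = new Set(existing.map((d) => d.id));
const missing = wanted.filter((v) => !foundIds.has(v));

console.log(existing, missing); // [ { id: 1 } ] [ 3 ]
```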

usage for MongoDB sort in array

I would like to sort, in descending order, the documents in the names array by their number value.
Here's the relevant part of my collection's structure:
_id: ObjectId("W"),
var1: "X",
var2: "Y",
var3: "Z",
comments: {
    names: [
        {
            number: 1
        },
        {
            number: 3
        },
        {
            number: 2
        }
    ],
    field: "Y"
}
But none of my requests using db.collection.find().sort( { "comments.names.number": -1 } ) work.
The desired sorted output is:
{ "_id" : ObjectId("W"), "var1" : "X", "var3" : "Z", "comments" : { "names" : [ { "number" : 3 }, { "number" : 2 }, { "number" : 1 } ], "field" : "Y" } }
Can you help me?
You need to aggregate the result, as below:
Unwind the names array.
Sort the records based on comments.names.number in descending order.
Group the records based on the _id field.
Project the required structure.
Code:
db.collection.aggregate([
{$unwind:"$comments.names"},
{$sort:{"comments.names.number":-1}},
{$group:{"_id":"$_id",
"var1":{$first:"$var1"},
"var2":{$first:"$var2"},
"var3":{$first:"$var3"},
"field":{$first:"$comments.field"},
"names":{$push:"$comments.names"}}},
{$project:{"comments":{"names":"$names","field":"$field"},"var1":1,
"var2":1,"var3":1}}
],{"allowDiskUse":true})
If your collection is large, you might want to add a $match stage at the beginning of the aggregation pipeline to filter records, or use allowDiskUse: true to facilitate sorting a large number of records.
db.collection.aggregate([
{$match:{"_id":someId}},
{$unwind:"$comments.names"},
{$sort:{"comments.names.number":-1}},
{$group:{"_id":"$_id",
"var1":{$first:"$var1"},
"var2":{$first:"$var2"},
"var3":{$first:"$var3"},
"field":{$first:"$comments.field"},
"names":{$push:"$comments.names"}}},
{$project:{"comments":{"names":"$names","field":"$field"},"var1":1,
"var2":1,"var3":1}}
])
What the query below does:
db.collection.find().sort( { "comments.names.number": -1 } )
is find all the documents, then sort those documents by the number field in descending order. For each document it takes the largest comments.names.number value, then sorts the parent documents by that number. It doesn't manipulate the names array inside each parent document.
You need to update the document to sort the array.
db.collection.update(
{ _id: 1 },
{
$push: {
"comments.names": {
$each: [ ],
$sort: { number: -1 }
}
}
}
)
Check the documentation here:
http://docs.mongodb.org/manual/reference/operator/update/sort/#use-sort-with-other-push-modifiers
MongoDB queries sort the result documents based on the collection of fields specified in the sort. They do not sort arrays within a document. If you want the array sorted, you need to sort it yourself after you retrieve the document, or store the array in sorted order. See this old SO answer from Stennie.
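For the "sort it yourself after you retrieve the document" option, a minimal client-side sketch in plain JavaScript could look like this. The doc object stands in for a document already fetched from the collection (the ObjectId is replaced by a plain string here), and withSortedNames is a hypothetical helper name.

```javascript
// Stand-in for a document already retrieved from the collection.
const doc = {
  _id: "W",
  var1: "X",
  var2: "Y",
  var3: "Z",
  comments: {
    names: [{ number: 1 }, { number: 3 }, { number: 2 }],
    field: "Y",
  },
};

// Returns a copy of the document with comments.names sorted by number, descending.
// The original document is left untouched.
function withSortedNames(d) {
  const names = [...d.comments.names].sort((a, b) => b.number - a.number);
  return { ...d, comments: { ...d.comments, names } };
}

const sorted = withSortedNames(doc);
console.log(sorted.comments.names); // [ { number: 3 }, { number: 2 }, { number: 1 } ]
```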

MongoDB: how to aggregate array field that may be missing

How do I get MongoDB to calculate the sum of array values when the array field may be missing completely (as is the case for month 10)?
For example:
> db.month.save({MonthNum: 10,
... NumWeekdays: 23});
> db.month.save({MonthNum: 11,
... NumWeekdays: 21,
... Holidays: [ {Description: "Thanksgiving", NumDays: 2} ] });
> db.month.save({MonthNum: 12,
... NumWeekdays: 22,
... Holidays: [ {Description: "Christmas", NumDays: 6},
... {Description: "New Year's Eve", NumDays: 1} ] });
> db.month.aggregate( { $unwind: "$Holidays" },
... { $group: { _id: "$MonthNum",
... total: { $sum: "$Holidays.NumDays" } } });
{
"result" : [
{
"_id" : 12,
"total" : 7
},
{
"_id" : 11,
"total" : 2
}
],
"ok" : 1
}
How do I get month 10 to show up in the above results (showing "total" as 0)?
Bonus: How do I get the above to show the available weekdays (the NumWeekdays minus the sum of the Holidays)?
I've tried $project to get the data into a canonical format first but without success so far... thanks!
$unwind isn't passing along your document with MonthNum 10 because the Holidays array is missing from that document (see the note at the bottom of the $unwind docs). Assuming that Holidays is always either an array containing at least one item or completely absent from a document, you can use the $ifNull operator inside $project to substitute a placeholder Holidays array, containing a single entry with NumDays = 0, whenever Holidays is null:
db.month.aggregate([
// Make "Holidays" = [{NumDays:0}] if "Holidays" is null for this document (i.e. absent).
// The field is spelled "NumWeekdays", as in the saved documents.
{$project:{NumWeekdays:1, MonthNum:1, Holidays:{$ifNull:["$Holidays", [{"NumDays":0}]]}}},
// Now you can unwind + group as normal
{$unwind:"$Holidays"},
{$group:{_id:"$MonthNum", NumWeekdays:{$first:"$NumWeekdays"}, "total":{$sum:"$Holidays.NumDays"}}},
// This should take care of "available weekdays"
{$project:{total:1, available:{$subtract:["$NumWeekdays", "$total"]}}}
]);
Note that $ifNull won't work if for some of your documents Holidays is an empty array; it has to be absent completely.
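As a cross-check of the numbers the pipeline should produce, here is an in-memory sketch in plain JavaScript over the three sample months. It treats a missing Holidays array as empty, which is an assumption of this sketch that also covers the empty-array case. The variable names months and summary are made up for the example.

```javascript
// The three sample documents from the question.
const months = [
  { MonthNum: 10, NumWeekdays: 23 },
  { MonthNum: 11, NumWeekdays: 21,
    Holidays: [{ Description: "Thanksgiving", NumDays: 2 }] },
  { MonthNum: 12, NumWeekdays: 22,
    Holidays: [{ Description: "Christmas", NumDays: 6 },
               { Description: "New Year's Eve", NumDays: 1 }] },
];

const summary = months.map((m) => {
  // A missing (or empty) Holidays array contributes a total of 0.
  const total = (m.Holidays || []).reduce((sum, h) => sum + h.NumDays, 0);
  return { _id: m.MonthNum, total, available: m.NumWeekdays - total };
});

console.log(summary);
// [ { _id: 10, total: 0, available: 23 },
//   { _id: 11, total: 2, available: 19 },
//   { _id: 12, total: 7, available: 15 } ]
```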