MongoDB query using aggregation not returning expected results - mongodb

I have a few documents that look like this example:
{
"_id": ObjectId("540f4b6496f35c16af001dc4"),
"groups": [
1,
46105,
46106,
53241,
55397,
55406,
62840
],
"vehicleid": 123,
"vehiclename": "123 - CAN BC",
"totaldistancetraveled": 472.0,
"date_num": 20140901
}
I need to find the total distance driven by all vehicles that belong to group 46105 and where theie date_num matches with 20140901.
I tried the following aggregation query:
db.vehicle_performance_monthly.aggregate(
{ $unwind : "$groups"},
{$group:
{_id: "$groups",
totalMiles: { $sum: "$totaldistancetraveled"}}},
{$match:{_id: {$in:[46106]}},{"$date_num":{$in:20140901}}}
)
But multiple matches are not being returned. Any help is appreciated.

This should work.
db.vehicle_performance_monthly.aggregate([ {
$match : {
groups : 46106,
date_num : 20140901
}
}, {
$unwind : "$groups"
}, {
$match : {
groups : 46106
}
}, {
$group : {
_id : "$groups",
totalMiles : {
$sum : "$totaldistancetraveled"
}
}
} ]);
Analysis for your original answer:
db.vehicle_performance_monthly.aggregate(
{ $unwind : "$groups"},
{$group:
{_id: "$groups",
totalMiles: { $sum: "$totaldistancetraveled"}}}, // $group doesn't map "date_name" then it will lost.
{$match:{_id: {$in:[46106]}},{"$date_num":{$in:20140901}}} // syntax error: {$match:{_id: {$in:[46106]}},{"$date_num":{$in:20140901}}} should be {$match:{_id: {$in:[46106]},"$date_num":{$in:[20140901]}}}
)
$match first to improve performance

Related

Select latest document after grouping them by a field in MongoDB

I got a question that I would expect to be pretty simple, but I cannot figure it out. What I want to do is this:
Find all documents in a collection and:
sort the documents by a certain date field
apply distinct on one of its other fields, but return the whole document
Best shown in an example.
This is a mock input:
[
{
"commandName" : "migration_a",
"executionDate" : ISODate("1998-11-04T18:46:14.000Z")
},
{
"commandName" : "migration_a",
"executionDate" : ISODate("1970-05-09T20:16:37.000Z")
},
{
"commandName" : "migration_a",
"executionDate" : ISODate("2005-11-08T11:58:52.000Z")
},
{
"commandName" : "migration_b",
"executionDate" : ISODate("2016-06-02T19:48:34.000Z")
}
]
The expected output is:
[
{
"commandName" : "migration_a",
"executionDate" : ISODate("2005-11-08T11:58:52.000Z")
},
{
"commandName" : "migration_b",
"executionDate" : ISODate("2016-06-02T19:48:34.000Z")
}
]
Or, in other words:
Group the input data by the commandName field
Inside each group sort the documents
Return the newest document from each group
My attempts to write this query have failed:
The distinct() function will only return the value of the field I am distinct-ing on, not the whole document. That makes it unsuitable for my case.
Tried writing an aggregate query, but ran into an issue of how to sort-and-select a single document from inside of each group? The sort aggreation stage will sort the groups among one other, which is not what I want.
I am not too well-versed in Mongo and this is where I hit a wall. Any ideas on how to continue?
For reference, this is the work-in-progress aggregation query I am trying to expand on:
db.getCollection('some_collection').aggregate([
{ $group: { '_id': '$commandName', 'docs': {$addToSet: '$$ROOT'} } },
{ $sort: {'_id.docs.???': 1}}
])
Post-resolved edit
Thank you for the answers. I got what I needed. For future reference, this is the full query that will do what was requested and also return a list of the filtered documents, not groups.
db.getCollection('some_collection').aggregate([
{ $sort: {'executionDate': 1}},
{ $group: { '_id': '$commandName', 'result': { $last: '$$ROOT'} } },
{ $replaceRoot: {newRoot: '$result'} }
])
The query result without the $replaceRoot stage would be:
[
{
"_id": "migration_a",
"result": {
"commandName" : "migration_a",
"executionDate" : ISODate("2005-11-08T11:58:52.000Z")
}
},
{
"_id": "migration_b",
"result": {
"commandName" : "migration_b",
"executionDate" : ISODate("2016-06-02T19:48:34.000Z")
}
}
]
The outer _id and _result are just "group-wrappers" around the actual document I want, which is nested under the result key. Moving the nested document to the root of the result is done using the $replaceRoot stage. The query result when using that stage is:
[
{
"commandName" : "migration_a",
"executionDate" : ISODate("2005-11-08T11:58:52.000Z")
},
{
"commandName" : "migration_b",
"executionDate" : ISODate("2016-06-02T19:48:34.000Z")
}
]
Try this:
db.getCollection('some_collection').aggregate([
{ $sort: {'executionDate': -1}},
{ $group: { '_id': '$commandName', 'doc': {$first: '$$ROOT'} } }
])
I believe this will result in what you're looking for:
db.collection.aggregate([
{
$group: {
"_id": "$commandName",
"executionDate": {
"$last": "$executionDate"
}
}
}
])
You can check it out here
Of course, if you want to match your expected output exactly, you can add a sort (this may not be necessary since your goal is to simply return the newest document from each group):
{
$sort: {
"executionDate": 1
}
}
You can check this version out here.
The use-case the question presents is nearly covered in the $last aggregation operator documentation.
Which summarises:
the $group stage should follow a $sort stage to have the input
documents in a defined order. Since $last simply picks the last
document from a group.
Query: Link
db.collection.aggregate([
{
$sort: {
executionDate: 1
}
},
{
$group: {
_id: "$commandName",
executionDate: {
$last: "$executionDate"
}
}
}
]);

need me use aggregation mongodb in arrays

I need help in aggregate this query, I need aggregate values of debito
{
"_id" : ObjectId("5a088f6584ccb0a665900726"),
"usuario" : "tamura",
"creditos" : [
{
"nome_do_credito" : "credito inicial",
"credito" : 0
}
],
"debitos" : [
{
"nome_do_debito" : "debito inicial",
"debito" : 0
},
{
"nome_do_debito" : "Faculdade",
"debito" : "150.00"
}
]
}
I need the output
debito : 150
(0+150)
You will first need to turn all your debito fields into a numerical type (as in 150.00) since you cannot do Maths on strings (as in "150.00"). And then the following query should do the trick:
db.collection.aggregate({
$project: {
"debitos": {
$sum: "$debitos.debito"
}
}
})
In case you have more than one document in your collection and you want the total sum over all documents you can run this:
db.collection.aggregate({
$unwind: "$debitos" // flatten the "debitos" array
}, {
$group: {
"_id": null, // do not really group, just throw all documents in the same group
"debitos": {
$sum: "$debitos.debito" // sum up all debito fields
}
}
})

MongoDB: Create Object in Aggregation result

I want to return Object as a field in my Aggregation result similar to the solution in this question. However in the solution mentioned above, the Aggregation results in an Array of Objects with just one item in that array, not a standalone Object. For example, a query like the following with a $push operation
$group:{
_id: "$publisherId",
'values' : { $push:{
newCount: { $sum: "$newField" },
oldCount: { $sum: "$oldField" } }
}
}
returns a result like this
{
"_id" : 2,
"values" : [
{
"newCount" : 100,
"oldCount" : 200
}
]
}
}
not one like this
{
"_id" : 2,
"values" : {
"newCount" : 100,
"oldCount" : 200
}
}
}
The latter is the result that I require. So how do I rewrite the query to get a result like that? Is it possible or is the former result the best I can get?
You don't need the $push operator, just add a final $project pipeline that will create the embedded document. Follow this guideline:
var pipeline = [
{
"$group": {
"_id": "$publisherId",
"newCount": { "$sum": "$newField" },
"oldCount": { "$sum": "$oldField" }
}
},
{
"$project" {
"values": {
"newCount": "$newCount",
"oldCount": "$oldCount"
}
}
}
];
db.collection.aggregate(pipeline);

MongoDB sum() data

I am new to mongoDB and nosql, what is the syntax to get a sum?
In MySQL, I would do something like this:
SELECT SUM(amount) from my_table WHERE member_id = 61;
How would I convert that to MongoDB? Here is what I have tried:
db.bigdata.aggregate({
$group: {
_id: {
memberId: 61,
total: {$sum: "$amount"}
}
}
})
Using http://docs.mongodb.org/manual/tutorial/aggregation-zip-code-data-set/ for reference you want:
db.bigdata.aggregate(
{
$match: {
memberId: 61
}
},
{
$group: {
_id: "$memberId",
total : { $sum : "$amount" }
}
})
From the MongoDB docs:
The aggregation pipeline is a framework for data aggregation modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated results.
It would be better to match first and then group, so that you system only perform group operation on filtered records. If you perform group operation first then system will perform group on all records and then selects the records with memberId=61.
db.bigdata.aggregate(
{ $match : {memberId : 61 } },
{ $group : { _id: "$memberId" , total : { $sum : "$amount" } } }
)
db.bigdata.aggregate(
{ $match : {memberId : 61 } },
{ $group : { _id: "$memberId" , total : { $sum : "$amount" } } }
)
would work if you are summing data which is not a part of array, if you want to sum the data present in some array in a document then use
db.collectionName.aggregate(
{$unwind:"$arrayName"}, //unwinds the array element
{
$group:{_id: "$arrayName.arrayField", //id which you want to see in the result
total: { $sum: "$arrayName.value"}} //the field of array over which you want to sum
})
and will get result like this
{
"result" : [
{
"_id" : "someFieldvalue",
"total" : someValue
},
{
"_id" : "someOtherFieldvalue",
"total" : someValue
}
],
"ok" : 1
}

MongoDB Aggregation: Counting distinct fields

I am trying to write an aggregation to identify accounts that use multiple payment sources. Typical data would be.
{
account:"abc",
vendor:"amazon",
}
...
{
account:"abc",
vendor:"overstock",
}
Now, I'd like to produce a list of accounts similar to this
{
account:"abc",
vendorCount:2
}
How would I write this in Mongo's aggregation framework
I figured this out by using the $addToSet and $unwind operators.
Mongodb Aggregation count array/set size
db.collection.aggregate([
{
$group: { _id: { account: '$account' }, vendors: { $addToSet: '$vendor'} }
},
{
$unwind:"$vendors"
},
{
$group: { _id: "$_id", vendorCount: { $sum:1} }
}
]);
Hope it helps someone
I think its better if you execute query like following which will avoid unwind
db.t2.insert({_id:1,account:"abc",vendor:"amazon"});
db.t2.insert({_id:2,account:"abc",vendor:"overstock"});
db.t2.aggregate([
{ $group : { _id : { "account" : "$account", "vendor" : "$vendor" }, number : { $sum : 1 } } },
{ $group : { _id : "$_id.account", number : { $sum : 1 } } }
]);
Which will show you following result which is expected.
{ "_id" : "abc", "number" : 2 }
You can use sets
db.test.aggregate([
{$group: {
_id: "$account",
uniqueVendors: {$addToSet: "$vendor"}
}},
{$project: {
_id: 1,
vendorsCount: {$size: "$uniqueVendors"}
}}
]);
I do not see why somebody would have to use $group twice
db.t2.aggregate([ { $group: {"_id":"$account" , "number":{$sum:1}} } ])
This will work perfectly fine.
This approach doesn't make use of $unwind and other extra operations. Plus, this won't affect anything if new things are added into the aggregation. There's a flaw in the accepted answer. If you have other accumulated fields in the $group, it would cause issues in the $unwind stage of the accepted answer.
db.collection.aggregate([{
"$group": {
"_id": "$account",
"vendors": {"$addToSet": "$vendor"}
}
},
{
"$addFields": {
"vendorCount": {
"$size": "$vendors"
}
}
}])
To identify accounts that use multiple payment sources:
Use grouping to count data from multiple account records and group the result by account with count
Use a match case is to filter only such accounts having more than one payment method
db.payment_collection.aggregate([ { $group: {"_id":"$account" ,
"number":{$sum:1}} }, {
"$match": {
"number": { "$gt": 1 }
}
} ])
This will work perfectly fine,
db.UserModule.aggregate(
{ $group : { _id : { "companyauthemail" : "$companyauthemail", "email" : "$email" }, number : { $sum : 1 } } },
{ $group : { _id : "$_id.companyauthemail", number : { $sum : 1 } } }
);
An example
db.collection.distinct("example.item").forEach( function(docs) {
print(docs + "==>>" + db.collection.count({"example.item":docs}))
});