I am trying to sort the output of a MongoDB aggregate and I don't understand what is happening. I searched for solutions on Stack Overflow, but they didn't work and I don't know why.
My idea is to return a ranking of the values in an array field (tags). I managed to list the count for each value, but I cannot sort the list.
This is the query I have so far, and it seems to work:
db.getCollection("metadata").aggregate(
{$unwind: '$tags'},
{$group: {_id:'$tags.name', total: {$sum: 1}}}
);
I say that because I get this result, which makes sense:
{
"_id" : "kite",
"total" : 1.0
}
{
"_id" : "piggy bank",
"total" : 1.0
}
{
"_id" : "sorrel",
"total" : 1.0
}
{
"_id" : "eggnog",
"total" : 4.0
}
{
"_id" : "Weimaraner",
"total" : 1.0
}
{
"_id" : "bassinet",
"total" : 15.0
}
{
"_id" : "squirrel monkey",
"total" : 1.0
}
{
"_id" : "bath towel",
"total" : 6.0
}
ATTEMPTS
When I tried something like this:
db.getCollection("metadata").aggregate(
{$unwind: '$tags'},
{$group: {_id:'$tags.name', total: {$sum: 1}}},
{$sort: {total: -1}}
);
RESULT OF THE ATTEMPT:
{
"_id" : "baboon",
"total" : 12.0
}
{
"_id" : "snow leopard",
"total" : 4.0
}
{
"_id" : "green lizard",
"total" : 5.0
}
{
"_id" : "Dandie Dinmont",
"total" : 7.0
}
{
"_id" : "echidna",
"total" : 8.0
}
{
"_id" : "bee eater",
"total" : 6.0
}
or like this:
db.getCollection("metadata").aggregate(
{$unwind: '$tags'},
{$group: {_id: { name:'$tags.name', total: {$sum: 1}}}},
{$sort: {total: -1}}
);
The result is either not sorted or the values are not summed at all.
EXTRA
This is the query if I want to list all the entries with the array:
db.getCollection('metadata').find({tags: {$exists: true}})
And the result is:
/* 2 */
{
"_id" : ObjectId("5900af3ff6844d2f7519fe13"),
"user_id" : 23,
"company_id" : 1,
"created" : ISODate("2017-04-26T14:31:27.000Z"),
"md5file" : "fdd30b1ca52e1c15f330f46c0079498c",
"path" : "/storage/emulated/0/DCIM/Camera/IMG_20160605_133703.jpg",
"image_width" : 3456,
"image_height" : 4608,
"originalTags" : [
{
"name" : "sleeping bag",
"percentage" : 0.7529412
},
{
"name" : "diaper",
"percentage" : 0.05490196
},
{
"name" : "bib",
"percentage" : 0.039215688
}
],
"tags" : [
{
"name" : "sleeping bag",
"percentage" : 0.7529412
}
]
}
/* 3 */
{
"_id" : ObjectId("5900af3ff6844d2f7519fe14"),
"user_id" : 23,
"company_id" : 1,
"created" : ISODate("2017-04-26T14:31:27.000Z"),
"md5file" : "22612c8bc99d1031146f7c9918555572",
"path" : "/storage/emulated/0/DCIM/Camera/IMG_20160605_164243.jpg",
"image_width" : 4608,
"image_height" : 3456,
"originalTags" : [
{
"name" : "bath towel",
"percentage" : 0.62352943
},
{
"name" : "quilt",
"percentage" : 0.101960786
},
{
"name" : "cradle",
"percentage" : 0.043137256
}
],
"tags" : [
{
"name" : "bath towel",
"percentage" : 0.62352943
}
]
}
The aggregation pipeline is an array, so the stages should be wrapped in square brackets []:
db.getCollection("metadata").aggregate(
[
{$unwind: '$tags'},
{$group: {_id:'$tags.name', total: {$sum: 1}}},
{$sort: {total: -1}}
]
);
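To make clear what the three stages are expected to produce, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB runs the pipeline; the function name rankTags and the sample documents are made up for illustration.

```javascript
// In-memory sketch of $unwind -> $group -> $sort for a tag ranking.
// rankTags and sampleDocs are hypothetical; they only mimic the "metadata" collection.
function rankTags(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // $unwind: visit every element of the tags array (documents without tags are skipped)
    for (const tag of doc.tags || []) {
      // $group with { $sum: 1 }: count occurrences per tag name
      counts.set(tag.name, (counts.get(tag.name) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .map(([name, total]) => ({ _id: name, total }))
    .sort((a, b) => b.total - a.total); // $sort: { total: -1 }
}

const sampleDocs = [
  { tags: [{ name: "bath towel" }] },
  { tags: [{ name: "bath towel" }, { name: "kite" }] },
  { tags: [{ name: "eggnog" }] },
  { tags: [{ name: "bath towel" }] },
];
const ranking = rankTags(sampleDocs);
console.log(ranking);
```

The most frequent tag comes first, which is the ordering the bracketed pipeline above should also return.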
I have a collection called transaction with the documents below:
/* 0 */
{
"_id" : ObjectId("5603fad216e90d53d6795131"),
"statusId" : "65c719e6727d",
"relatedWith" : "65c719e67267",
"status" : "A",
"userId" : "100",
"createdTs" : ISODate("2015-09-24T13:15:36.609Z")
}
/* 1 */
{
"_id" : ObjectId("5603fad216e90d53d6795134"),
"statusId" : "65c719e6727d",
"relatedWith" : "65c719e6726d",
"status" : "B",
"userId" : "100",
"createdTs" : ISODate("2015-09-24T13:14:31.609Z")
}
/* 2 */
{
"_id" : ObjectId("5603fad216e90d53d679512e"),
"statusId" : "65c719e6727d",
"relatedWith" : "65c719e6726d",
"status" : "C",
"userId" : "100",
"createdTs" : ISODate("2015-09-24T13:13:36.609Z")
}
/* 3 */
{
"_id" : ObjectId("5603fad216e90d53d6795132"),
"statusId" : "65c719e6727d",
"relatedWith" : "65c719e6726d",
"status" : "D",
"userId" : "100",
"createdTs" : ISODate("2015-09-24T13:16:36.609Z")
}
When I run the aggregation query below, without $group,
db.transaction.aggregate([
{
"$match": {
"userId": "100",
"statusId": "65c719e6727d"
}
},
{
"$sort": {
"createdTs": -1
}
}
])
I get the result in the expected order, i.e. sorted by createdTs in descending order (minimal result shown):
/* 0 */
{
"result" : [
{
"_id" : ObjectId("5603fad216e90d53d6795132"),
"createdTs" : ISODate("2015-09-24T13:16:36.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795131"),
"createdTs" : ISODate("2015-09-24T13:15:36.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795134"),
"createdTs" : ISODate("2015-09-24T13:14:31.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d679512e"),
"createdTs" : ISODate("2015-09-24T13:13:36.609Z")
}
],
"ok" : 1
}
If I run the aggregation below, with $group, the result is sorted the other way round (i.e. in ascending order):
db.transaction.aggregate([
{
"$match": {
"userId": "100",
"statusId": "65c719e6727d"
}
},
{
"$sort": {
"createdTs": -1
}
},
{
$group: {
"_id": {
"statusId": "$statusId",
"relatedWith": "$relatedWith",
"status": "$status"
},
"status": {$first: "$status"},
"statusId": {$first: "$statusId"},
"relatedWith": {$first: "$relatedWith"},
"createdTs": {$first: "$createdTs"}
}
}
]);
I get the result in inverse order, i.e. sorted by createdTs in ascending order:
/* 0 */
{
"result" : [
{
"_id" : ObjectId("5603fad216e90d53d679512e"),
"createdTs" : ISODate("2015-09-24T13:13:36.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795134"),
"createdTs" : ISODate("2015-09-24T13:14:31.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795131"),
"createdTs" : ISODate("2015-09-24T13:15:36.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795132"),
"createdTs" : ISODate("2015-09-24T13:16:36.609Z")
}
],
"ok" : 1
}
Where am I going wrong?
The $group stage doesn't guarantee the ordering of its results. See here the first paragraph.
If you want the results to be sorted after a $group, you need to add a $sort after the $group stage.
In your case, you should move the $sort after the $group. And before you ask: no, the $sort won't be able to use an index after the $group like it does before the $group :-).
The internal algorithm of $group seems to keep some sort of ordering (reversed, apparently), but I would not count on that and would add a $sort.
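As a rough sketch of that advice, the pipeline from the question could be written as follows. Keeping the first $sort is my assumption, made so that $first still picks the newest document of each group; the variable name sortedGroupPipeline is made up.

```javascript
// Sketch only: the pipeline from the question with an explicit $sort after $group.
// sortedGroupPipeline is a hypothetical name; field names come from the question.
const sortedGroupPipeline = [
  { $match: { userId: "100", statusId: "65c719e6727d" } },
  // Assumption: sort first so that $first sees the newest document of each group.
  { $sort: { createdTs: -1 } },
  {
    $group: {
      _id: { statusId: "$statusId", relatedWith: "$relatedWith", status: "$status" },
      createdTs: { $first: "$createdTs" },
    },
  },
  // Sort again, because $group does not promise any output order.
  { $sort: { createdTs: -1 } },
];

// In the mongo shell this would be run as: db.transaction.aggregate(sortedGroupPipeline)
console.log(JSON.stringify(sortedGroupPipeline, null, 2));
```

The final $sort works on the grouped documents, so it cannot use an index, as noted above.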
You are not doing anything wrong here; it is the behavior of $group in MongoDB.
Let's have a look at this example.
Suppose you have the following documents in a collection:
{ "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2, "date" : ISODate("2014-01-01T08:00:00Z") }
{ "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1, "date" : ISODate("2014-02-03T09:00:00Z") }
{ "_id" : 3, "item" : "xyz", "price" : 5, "quantity" : 5, "date" : ISODate("2014-02-03T09:05:00Z") }
{ "_id" : 4, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-02-15T08:00:00Z") }
{ "_id" : 5, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-02-15T09:05:00Z") }
{ "_id" : 6, "item" : "xyz", "price" : 5, "quantity" : 5, "date" : ISODate("2014-02-15T12:05:10Z") }
{ "_id" : 7, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-02-15T14:12:12Z") }
Now if you run this
db.collection.aggregate([{ $sort: { item: 1,date:1}} ] )
the output will be in ascending order of item and date.
Now if you add a $group stage to the aggregation pipeline, the output comes back in reverse order.
db.collection.aggregate([{ $sort: { item: 1,date:1}},{$group:{_id:"$item"}} ] )
Output will be
{ "_id" : "xyz" }
{ "_id" : "jkl" }
{ "_id" : "abc" }
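Now the solution for your problem:
change "createdTs": -1 to "createdTs": 1 in the $sort before the $group. Note that this relies on the observed output order of $group, which is not guaranteed.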
I have the following data:
{ "id" : 1, "lsPairs" :[{"location" : "L0", "service" : "S0" }]}
{ "id" : 2, "lsPairs" :[{"location" : "L0", "service" : "S0" },{"location" : "L1", "service" : "S1"}]}
{ "id" : 3, "lsPairs" :[{"location" : "L0", "service" : "S0" },{"location" : "L1", "service" : "S1"}, {"location" : "L2", "service" : "S2"}]}
{ "id" : 4, "lsPairs" :[{"location" : "L0", "service" : "S0" },{"location" : "L1", "service" : "S1"},{"location" : "L2", "service" : "S2"}, {"location" : "L3", "service" : "S3"}]}
I want to get the count per location, the count per service, and the count per (location, service) pair:
{ "_id" : "L3" , "count" : 1}
{ "_id" : "L2" , "count" : 2}
{ "_id" : "L1" , "count" : 3}
{ "_id" : "L0" , "count" : 4}
{ "_id" : "S3" , "count" : 1}
{ "_id" : "S2" , "count" : 2}
{ "_id" : "S1" , "count" : 3}
{ "_id" : "S0" , "count" : 4}
{ "_id" : { "loc" : "L2" , "srv" : "S2"} , "count" : 2}
{ "_id" : { "loc" : "L1" , "srv" : "S1"} , "count" : 3}
{ "_id" : { "loc" : "L3" , "srv" : "S3"} , "count" : 1}
{ "_id" : { "loc" : "L0" , "srv" : "S0"} , "count" : 4}
Right now I run the group function three times, grouping by a different id each time.
Is there a way to get these results with a single group?
You will need to deconstruct the array with $unwind then $group the documents.
collection.aggregate([
{ $unwind: "$lsPairs" },
{ $group: {
_id: {
"loc": "$lsPairs.location",
"srv": "$lsPairs.service"
},
"count": { $sum: 1 }
}}
])
Output
{ "_id" : { "loc" : "L3", "srv" : "S3" }, "count" : 1 }
{ "_id" : { "loc" : "L2", "srv" : "S2" }, "count" : 2 }
{ "_id" : { "loc" : "L1", "srv" : "S1" }, "count" : 3 }
{ "_id" : { "loc" : "L0", "srv" : "S0" }, "count" : 4 }
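If all three counts are wanted from one aggregate call, the $facet stage (available since MongoDB 3.4) can run several $group sub-pipelines over the same unwound documents. The following is a sketch under that assumption; the facet names locations, services and pairs and the variable name facetPipeline are made up.

```javascript
// Sketch only: one pipeline that produces all three counts with $facet (MongoDB 3.4+).
// facetPipeline and the facet names are hypothetical; field names come from the question.
const facetPipeline = [
  { $unwind: "$lsPairs" },
  {
    $facet: {
      // count per location
      locations: [{ $group: { _id: "$lsPairs.location", count: { $sum: 1 } } }],
      // count per service
      services: [{ $group: { _id: "$lsPairs.service", count: { $sum: 1 } } }],
      // count per (location, service) pair
      pairs: [
        {
          $group: {
            _id: { loc: "$lsPairs.location", srv: "$lsPairs.service" },
            count: { $sum: 1 },
          },
        },
      ],
    },
  },
];

// In the mongo shell this would be run as: db.locservice.aggregate(facetPipeline)
console.log(JSON.stringify(facetPipeline, null, 2));
```

The result is a single document with one array per facet, so it has to fit within the document size limit.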
Write the first round of location-service pair counts to a collection and reuse it.
db.locservice.aggregate([ {$unwind:"$lsPairs"},
{$group:{_id:"$lsPairs",count: { $sum: 1}}},
{$sort:{_id:1}},
{$out:"lsp"} ])
Take the location from the temporary collection and group by it.
db.lsp.aggregate([{$project:{_id:0, loc:"$_id.location", count:1}},
{$group:{_id:"$loc", cnt:{$sum:"$count"}}}, {$sort:{_id:1}} ])
Take the service from the temporary collection and group by it.
db.lsp.aggregate([{$project:{_id:0, srv:"$_id.service", count:1}},
{$group:{_id:"$srv", cnt:{$sum:"$count"}}}, {$sort:{_id:1}} ])
In the following I push the location and the service into two arrays. Can I group both arrays at the same time?
db.locservice.aggregate([ {$unwind:"$lsPairs"},
{$group:{_id:"$lsPairs",count: { $sum: 1},
locs:{$push:{item:"$lsPairs.location"}},
srvs:{$push:{item:"$lsPairs.service"}}}},
{$project:{count:1, locs:1, srvs:1}} ])
{ "_id" : { "location" : "L3", "service" : "S3" }, "count" : 1, "locs" : [ { "item" : "L3" } ], "srvs" : [ { "item" : "S3" } ] }
{ "_id" : { "location" : "L2", "service" : "S2" }, "count" : 2, "locs" : [ { "item" : "L2" }, { "item" : "L2" } ], "srvs" : [ { "item" : "S2" }, { "item" : "S2" } ] }
{ "_id" : { "location" : "L1", "service" : "S1" }, "count" : 3, "locs" : [ { "item" : "L1" }, { "item" : "L1" }, { "item" : "L1" } ], "srvs" : [ { "item" : "S1" }, { "item" : "S1" }, { "item" : "S1" } ] }
{ "_id" : { "location" : "L0", "service" : "S0" }, "count" : 4, "locs" : [ { "item" : "L0" }, { "item" : "L0" }, { "item" : "L0" }, { "item" : "L0" } ], "srvs" : [ { "item" : "S0" }, { "item" : "S0" }, { "item" : "S0" }, { "item" : "S0" } ] }
I have a collection containing records like:
{ "type" : "me", "tid" : "1" }
{ "type" : "me", "tid" : "1" }
{ "type" : "me", "tid" : "1" }
{ "type" : "you", "tid" : "1" }
{ "type" : "you", "tid" : "1" }
{ "type" : "me", "tid" : "2" }
{ "type" : "me", "tid" : "2"}
{ "type" : "you", "tid" : "2"}
{ "type" : "you", "tid" : "2" }
{ "type" : "you", "tid" : "2"}
I want a result like the one below:
[
{"tid" : "1","me" : 3,"you": 2},
{"tid" : "2","me" : 2,"you": 3}
]
I have tried the group and aggregate queries, but they don't return the required result format.
Below is the group query:
db.coll.group({
key: {tid : 1,type:1},
cond: { tid : { "$in" : [ "1","2"]} },
reduce: function (curr,result) {
result.total = result.total + 1
},
initial: { total : 0}
})
Its result looks like this:
[
{"tid" : "1", "type" : "me" ,"total": 3 },
{"tid" : "1","type" : "you" ,"total": 2 },
{"tid" : "2", "type" : "me" ,"total": 2 },
{"tid" : "2","type" : "you" ,"total": 3 }
]
The following is the aggregate query:
db.coll.aggregate([
{$match : { "tid" : {"$in" : ["1","2"]}}},
{$group : { _id : {tid : "$tid",type : "$type"},total : {"$sum" : 1}}}
])
It gives the following result:
{
"result" :
[
{"_id" : {"tid" : "1","type" : "me"},"total" : 3},
{"_id" : {"tid" : "2","type" : "me" },"total" : 2},
{"_id" : {"tid" : "2","type" : "you"},"total" : 3}
],
"ok" : 1
}
Is it possible to obtain the result I specified, or do I have to do some manipulation in my code?
Thanks
If you change your aggregation to this:
db.so.aggregate([
{ $match : { "tid" : { "$in" : ["1", "2"] } } },
{ $group : {
_id : { tid : "$tid", type : "$type" },
total : { "$sum" : 1 }
} },
{ $group : {
_id : "$_id.tid",
values: { $push: { type: "$_id.type", total: '$total' } }
} }
])
Then your output is:
{
"result" : [
{
"_id" : "1",
"values" : [
{ "type" : "you", "total" : 2 },
{ "type" : "me", "total" : 3 }
]
},
{
"_id" : "2",
"values" : [
{ "type" : "me", "total" : 2 },
{ "type" : "you", "total" : 3 }
]
}
],
"ok" : 1
}
Although that is not the same as what you want, it is the closest you can get. In your application, you can easily pull out the values in the same way as you would with the format you asked for.
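As an illustration of that application-side step, here is a small plain JavaScript sketch that turns the values arrays into the flat format from the question. The function name pivotCounts is made up, and the input is the result array shown above.

```javascript
// Sketch only: reshape the aggregation output into { tid, me, you } rows in application code.
// pivotCounts is a hypothetical helper; `aggregated` copies the "result" array shown above.
function pivotCounts(result) {
  return result.map(({ _id, values }) => {
    const row = { tid: _id };
    for (const { type, total } of values) {
      row[type] = total; // promote the type ("me", "you") to a key
    }
    return row;
  });
}

const aggregated = [
  { _id: "1", values: [{ type: "you", total: 2 }, { type: "me", total: 3 }] },
  { _id: "2", values: [{ type: "me", total: 2 }, { type: "you", total: 3 }] },
];
const pivoted = pivotCounts(aggregated);
console.log(pivoted);
// [ { tid: '1', you: 2, me: 3 }, { tid: '2', me: 2, you: 3 } ]
```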
Just keep in mind that, in general, you cannot promote a value (you, me) to a key, unless your keys come from a limited set (3-4 items max).
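For such a limited, known set of keys, one possible approach is a conditional sum per key using $cond inside $group. This is a sketch that assumes the only types of interest are "me" and "you"; the variable name pivotPipeline is made up.

```javascript
// Sketch only: promote a fixed set of type values to keys with conditional sums.
// pivotPipeline is a hypothetical name; the keys "me" and "you" must be known in advance.
const pivotPipeline = [
  { $match: { tid: { $in: ["1", "2"] } } },
  {
    $group: {
      _id: "$tid",
      // add 1 for documents of the given type, 0 otherwise
      me: { $sum: { $cond: [{ $eq: ["$type", "me"] }, 1, 0] } },
      you: { $sum: { $cond: [{ $eq: ["$type", "you"] }, 1, 0] } },
    },
  },
  // rename _id to tid to match the format requested in the question
  { $project: { _id: 0, tid: "$_id", me: 1, you: 1 } },
];

// In the mongo shell this would be run as: db.coll.aggregate(pivotPipeline)
console.log(JSON.stringify(pivotPipeline, null, 2));
```

Every additional type needs its own accumulator line, which is why this only scales to a handful of keys.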
I have a collection of the following data:
{
"_id" : ObjectId("51f1fcc08188d3117c6da351"),
"cust_id" : "abc123",
"ord_date" : ISODate("2012-10-03T18:30:00Z"),
"status" : "A",
"price" : 25,
"items" : [{
"sku" : "ggg",
"qty" : 7,
"price" : 2.5
}, {
"sku" : "ppp",
"qty" : 5,
"price" : 2.5
}]
}
I am using the query:
cmd { "aggregate" : "orders" , "pipeline" : [
{ "$unwind" : "$items"} ,
{ "$match" : { "items" : { "$elemMatch" : { "qty" : { "$in" : [ 7]}}}}} ,
{ "$group" : { "price" : { "$first" : "$price"} , "items" : { "$push" : { "sku" : "$items.sku"}} , "_id" : { "items" : "$items"}}} ,
{ "$sort" : { "price" : -1}} ,
{ "$project" : { "_id" : 0 , "price" : 1 , "items" : 1}}
]}
I am not able to understand what is going wrong.
It's because you're doing $match after $unwind. $unwind generates a new stream of documents where items is no longer an array (see docs), so $elemMatch, which only matches array fields, finds nothing.
It emits each document as many times as there are items in it.
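The effect can be sketched in plain JavaScript. This is a simplified in-memory imitation of $unwind, not MongoDB's implementation; the helper name unwind is made up and the sample order is shortened from the question.

```javascript
// Sketch only: a simplified in-memory imitation of $unwind on an array field.
// unwind is a hypothetical helper; non-array fields simply yield no output here.
function unwind(doc, field) {
  const values = Array.isArray(doc[field]) ? doc[field] : [];
  // emit one copy of the document per array element, with the field replaced by that element
  return values.map((value) => ({ ...doc, [field]: value }));
}

const order = {
  cust_id: "abc123",
  price: 25,
  items: [
    { sku: "ggg", qty: 7, price: 2.5 },
    { sku: "ppp", qty: 5, price: 2.5 },
  ],
};

const unwound = unwind(order, "items");
// After unwinding, items is a single object, so the filter uses the path items.qty
const matched = unwound.filter((doc) => [7].includes(doc.items.qty));
console.log(unwound.length, Array.isArray(unwound[0].items), matched);
```

Because items is now an object rather than an array, an array operator such as $elemMatch has nothing to match against.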
If you want to select the documents that contain the desired element and then process all of their items, you should call $match first:
db.orders.aggregate([
{ "$match" : { "items" : { "$elemMatch" : { "qty" : { "$in" : [ 7]}}}}},
{ "$unwind" : "$items"},
...
]);
If you want to select the items to be processed after $unwind, you should remove $elemMatch:
db.orders.aggregate([
{ "$unwind" : "$items"},
{ "$match" : { "items.qty" : { "$in" : [7]}}},
...
]);
In the first case you'll get two documents:
{
"price" : 25,
"items" : [
{"sku" : "ppp"}
]
},
{
"price" : 25,
"items" : [
{"sku" : "ggg"}
]
}
and in the second case you'll get one:
{
"price" : 25,
"items" : [
{"sku" : "ggg"}
]
}
Update. After $unwind your documents will look like:
{
"_id" : ObjectId("51f1fcc08188d3117c6da351"),
"cust_id" : "abc123",
"ord_date" : ISODate("2012-10-03T18:30:00Z"),
"status" : "A",
"price" : 25,
"items" : {
"sku" : "ggg",
"qty" : 7,
"price" : 2.5
}
}
For a small number of documents, unwind followed by match is fine. But for a large number of documents, it is better to match (with $elemMatch), unwind, and then match again, so that fewer documents are unwound.
db.orders.aggregate([
{ "$match" : { "items" : { "$elemMatch" : { "qty" : { "$in" : [ 7]}}}}},
{ "$unwind" : "$items"},
{ "$match" : { "items.qty" : { "$in" : [7]}}},
...
...
]);
The first match keeps only the documents that contain an item matching the qty criteria. Within those documents, the second match removes the unwound items that do not match the qty criteria.