MongoDB: Sort in combination with Aggregation group

I have a collection called transaction with the documents below:
/* 0 */
{
"_id" : ObjectId("5603fad216e90d53d6795131"),
"statusId" : "65c719e6727d",
"relatedWith" : "65c719e67267",
"status" : "A",
"userId" : "100",
"createdTs" : ISODate("2015-09-24T13:15:36.609Z")
}
/* 1 */
{
"_id" : ObjectId("5603fad216e90d53d6795134"),
"statusId" : "65c719e6727d",
"relatedWith" : "65c719e6726d",
"status" : "B",
"userId" : "100",
"createdTs" : ISODate("2015-09-24T13:14:31.609Z")
}
/* 2 */
{
"_id" : ObjectId("5603fad216e90d53d679512e"),
"statusId" : "65c719e6727d",
"relatedWith" : "65c719e6726d",
"status" : "C",
"userId" : "100",
"createdTs" : ISODate("2015-09-24T13:13:36.609Z")
}
/* 3 */
{
"_id" : ObjectId("5603fad216e90d53d6795132"),
"statusId" : "65c719e6727d",
"relatedWith" : "65c719e6726d",
"status" : "D",
"userId" : "100",
"createdTs" : ISODate("2015-09-24T13:16:36.609Z")
}
When I run the aggregation query below, without $group,
db.transaction.aggregate([
{
"$match": {
"userId": "100",
"statusId": "65c719e6727d"
}
},
{
"$sort": {
"createdTs": -1
}
}
])
I get the result in the expected order, i.e. sorted by createdTs in descending order (minimal result):
/* 0 */
{
"result" : [
{
"_id" : ObjectId("5603fad216e90d53d6795132"),
"createdTs" : ISODate("2015-09-24T13:16:36.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795131"),
"createdTs" : ISODate("2015-09-24T13:15:36.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795134"),
"createdTs" : ISODate("2015-09-24T13:14:31.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d679512e"),
"createdTs" : ISODate("2015-09-24T13:13:36.609Z")
}
],
"ok" : 1
}
If I apply the aggregation below, with $group, the result is sorted inversely (i.e. in ascending order):
db.transaction.aggregate([
{
"$match": {
"userId": "100",
"statusId": "65c719e6727d"
}
},
{
"$sort": {
"createdTs": -1
}
},
{
$group: {
"_id": {
"statusId": "$statusId",
"relatedWith": "$relatedWith",
"status": "$status"
},
"status": {$first: "$status"},
"statusId": {$first: "$statusId"},
"relatedWith": {$first: "$relatedWith"},
"createdTs": {$first: "$createdTs"}
}
}
]);
I get the result in inverse order, i.e. sorted by createdTs in ascending order:
/* 0 */
{
"result" : [
{
"_id" : ObjectId("5603fad216e90d53d679512e"),
"createdTs" : ISODate("2015-09-24T13:13:36.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795134"),
"createdTs" : ISODate("2015-09-24T13:14:31.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795131"),
"createdTs" : ISODate("2015-09-24T13:15:36.609Z")
},
{
"_id" : ObjectId("5603fad216e90d53d6795132"),
"createdTs" : ISODate("2015-09-24T13:16:36.609Z")
}
],
"ok" : 1
}
Where am I going wrong?

The $group stage doesn't ensure the ordering of its results (see the first paragraph of the linked documentation).
If you want the results to be sorted after a $group, you need to add a $sort after the $group stage.
In your case, you should put a $sort after the $group; keep the earlier $sort as well if $first has to pick the newest document of each group. And before you ask: no, the $sort won't be able to use an index after the $group like it does before the $group :-).
The internal algorithm of $group seems to keep some sort of ordering (reversed, apparently), but I would not count on that and would add a $sort.
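As a minimal sketch of that advice, the pipeline from the question could be written with an explicit $sort after the $group. The field names come from the question. Keeping the first $sort is an assumption here, made so that $first still sees the newest document of each group.

```javascript
// Sketch only: the pipeline is built as plain data so it can be inspected
// before being passed to the shell.
const pipeline = [
  { $match: { userId: "100", statusId: "65c719e6727d" } },
  // Sort before $group so that $first picks the newest document per group.
  { $sort: { createdTs: -1 } },
  {
    $group: {
      _id: {
        statusId: "$statusId",
        relatedWith: "$relatedWith",
        status: "$status"
      },
      status: { $first: "$status" },
      statusId: { $first: "$statusId" },
      relatedWith: { $first: "$relatedWith" },
      createdTs: { $first: "$createdTs" }
    }
  },
  // $group does not guarantee output order, so order the groups explicitly.
  { $sort: { createdTs: -1 } }
];

// In the mongo shell this would be run as: db.transaction.aggregate(pipeline)
```

The second $sort works on the grouped documents, so it cannot use an index on createdTs.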

You are not doing anything wrong here; this is how $group behaves in MongoDB.
Let's have a look at this example.
Suppose you have the following documents in a collection:
{ "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2, "date" : ISODate("2014-01-01T08:00:00Z") }
{ "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1, "date" : ISODate("2014-02-03T09:00:00Z") }
{ "_id" : 3, "item" : "xyz", "price" : 5, "quantity" : 5, "date" : ISODate("2014-02-03T09:05:00Z") }
{ "_id" : 4, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-02-15T08:00:00Z") }
{ "_id" : 5, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-02-15T09:05:00Z") }
{ "_id" : 6, "item" : "xyz", "price" : 5, "quantity" : 5, "date" : ISODate("2014-02-15T12:05:10Z") }
{ "_id" : 7, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-02-15T14:12:12Z") }
Now if you run this
db.collection.aggregate([{ $sort: { item: 1,date:1}} ] )
the output will be in ascending order of item and date.
Now, if you add a $group stage to the aggregation pipeline, the order comes out reversed:
db.collection.aggregate([{ $sort: { item: 1,date:1}},{$group:{_id:"$item"}} ] )
Output will be
{ "_id" : "xyz" }
{ "_id" : "jkl" }
{ "_id" : "abc" }
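Now the solution for your problem:
change "createdTs": -1 to "createdTs": 1 before the $group. Note that this depends on the observed output order of $group, which MongoDB does not guarantee.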

Related

Mongo aggregation - Sorting using a field value from previous pipeline as the sort field

I have produced the below output using mongodb aggregation (including $group pipeline inside levelsCount field) :
{
"_id" : "1",
"name" : "First",
"levelsCount" : [
{ "_id" : "level_One", "levelNum" : 1, "count" : 1 },
{ "_id" : "level_Three", "levelNum" : 3, "count" : 1 },
{ "_id" : "level_Four", "levelNum" : 4, "count" : 8 }
]
}
{
"_id" : "2",
"name" : "Second",
"levelsCount" : [
{ "_id" : "level_One", "levelNum" : 1, "count" : 5 },
{ "_id" : "level_Two", "levelNum" : 2, "count" : 2 },
{ "_id" : "level_Three", "levelNum" : 3, "count" : 1 },
{ "_id" : "level_Four", "levelNum" : 4, "count" : 3 }
]
}
{
"_id" : "3",
"name" : "Third",
"levelsCount" : [
{ "_id" : "level_One", "levelNum" : 1, "count" : 1 },
{ "_id" : "level_Two", "levelNum" : 2, "count" : 3 },
{ "_id" : "level_Three", "levelNum" : 3, "count" : 2 },
{ "_id" : "level_Four", "levelNum" : 4, "count" : 3 }
]
}
Now, I need to sort these documents based on the levelNum and count fields of the levelsCount array elements, i.e. if two documents both have count 5 for levelNum: 1 (level_One), then the sort goes on to compare the count of levelNum: 2 (level_Two), and so on.
I see how the $sort stage works on multiple fields (something like { $sort : { level_One : 1, level_Two: 1 } }), but the problem is how to access the levelNum value of each array element and use it as a field name to sort on. (I couldn't manage it even after $unwinding the levelsCount array.)
P.s: The initial order of levelsCount array's elements may differ on each document and is not important.
Edit:
The expected output of the above structure would be:
// Sorted result:
{
"_id" : "2",
"name" : "Second",
"levelsCount" : [
{ "_id" : "level_One", "levelNum" : 1, "count" : 5 }, // "level_One's count: 5" is greater than "level_One's count: 1" in two other documents, regardless of other level_* fields. Therefore this whole document with "name: Second" is ordered first.
{ "_id" : "level_Two", "levelNum" : 2, "count" : 2 },
{ "_id" : "level_Three", "levelNum" : 3, "count" : 1 },
{ "_id" : "level_Four", "levelNum" : 4, "count" : 3 }
]
}
{
"_id" : "3",
"name" : "Third",
"levelsCount" : [
{ "_id" : "level_One", "levelNum" : 1, "count" : 1 },
{ "_id" : "level_Two", "levelNum" : 2, "count" : 3 }, // "level_Two's count" in this document exists with value (3) while the "level_Two" doesn't exist in the below document which mean (0) value for count. So this document with "name: Third" is ordered higher than the below document.
{ "_id" : "level_Three", "levelNum" : 3, "count" : 2 },
{ "_id" : "level_Four", "levelNum" : 4, "count" : 3 }
]
}
{
"_id" : "1",
"name" : "First",
"levelsCount" : [
{ "_id" : "level_One", "levelNum" : 1, "count" : 1 },
{ "_id" : "level_Three", "levelNum" : 3, "count" : 1 },
{ "_id" : "level_Four", "levelNum" : 4, "count" : 8 }
]
}
Of course, I'd prefer to have an output document in the below format, But the first problem is to sort all docs:
{
"_id" : "1",
"name" : "First",
"levelsCount" : [
{ "level_One" : 1 },
{ "level_Three" : 1 },
{ "level_Four" : 8 }
]
}
You can sort by levelNum in descending order and count in ascending order:
db.collection.aggregate([
{
$sort: {
"levelsCount.levelNum": -1,
"levelsCount.count": 1
}
}
])
For a key-value format of the levelsCount array:
use $map to iterate over the levelsCount array
build a key-value pair array for each element and convert it to an object using $arrayToObject
{
$addFields: {
levelsCount: {
$map: {
input: "$levelsCount",
in: {
$arrayToObject: [
[{ k: "$$this._id", v: "$$this.count" }]
]
}
}
}
}
}
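If the ordering described in the question is needed (compare the count of level 1 first, then level 2, and so on, with a missing level counting as 0), one possible approach is to extract one sort key per level and sort on those keys. The sketch below assumes the levels are known in advance to be 1 to 4; countForLevel and sortKeys are hypothetical names introduced only for illustration.

```javascript
// Builds an expression that yields the count for one levelNum, or 0 when
// that level is missing from the levelsCount array.
function countForLevel(levelNum) {
  return {
    $ifNull: [
      {
        $arrayElemAt: [
          {
            $map: {
              input: {
                $filter: {
                  input: "$levelsCount",
                  cond: { $eq: ["$$this.levelNum", levelNum] }
                }
              },
              in: "$$this.count"
            }
          },
          0
        ]
      },
      0
    ]
  };
}

const levels = [1, 2, 3, 4]; // assumption: the set of levels is known up front

const sortKeys = {};
const sortSpec = {};
for (const n of levels) {
  sortKeys["l" + n] = countForLevel(n);
  sortSpec["sortKeys.l" + n] = -1; // higher counts first
}

const pipeline = [
  { $addFields: { sortKeys: sortKeys } }, // temporary per-level sort keys
  { $sort: sortSpec },
  { $project: { sortKeys: 0 } } // drop the temporary keys again
];

// In the mongo shell this would be run as: db.collection.aggregate(pipeline)
```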

How to get last document of each day in MongoDB collection?

I have a model Entry which includes details of a hospital at a particular time. The data looks like this:
{
"_id": "5ef9c7337874820008c1a026",
"date": 1593427763640,
//... some data
"hospital": {
"_id": "5ef8d06630c364000840bb6d",
"name": "City Hospital",
//... some data
},
}
I want to get the last query of each day grouped by the hospital ID. In MySQL, it can be achieved using INNER JOIN. How can I do it using MongoDB?
Given a day, calculate the start and end of that day as epoch values.
These are used to filter records in $match:
start_of_day_ephocs=
end_of_day_ephocs=
Aggregate query:
sort by date, group by hospital ID, and select the first document:
db.Entry.aggregate(
[
{ "$match": { "date": {"$gte":start_of_day_ephocs,"$lte":end_of_day_ephocs }} },
{ "$sort": { "date": -1 } },
{
$group:
{
"_id": "$hospital._id",
"last_document": { "$first": "$$ROOT" }
}
}
]
)
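The two boundary values are left blank above. A minimal sketch of one way to compute them follows, assuming that date is stored as epoch milliseconds (as in the sample document) and that a "day" means a UTC day. dayBoundsUtc is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the first and last millisecond (UTC) of the
// day that contains the given epoch-millisecond timestamp.
function dayBoundsUtc(epochMs) {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  const start = Math.floor(epochMs / MS_PER_DAY) * MS_PER_DAY;
  return { start: start, end: start + MS_PER_DAY - 1 };
}

// 1593427763640 is the "date" value of the sample document in the question.
const bounds = dayBoundsUtc(1593427763640);
const start_of_day_ephocs = bounds.start;
const end_of_day_ephocs = bounds.end;

// Same stages as the query above, built as plain data.
const pipeline = [
  { $match: { date: { $gte: start_of_day_ephocs, $lte: end_of_day_ephocs } } },
  { $sort: { date: -1 } },
  { $group: { _id: "$hospital._id", last_document: { $first: "$$ROOT" } } }
];

// In the mongo shell this would be run as: db.Entry.aggregate(pipeline)
```

For a local-time day, the boundaries would have to be shifted by the time zone offset, which this sketch does not handle.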
Consider a sales collection with the following documents:
{ "_id" : 1, "item" : "abc", "date" : ISODate("2014-01-01T08:00:00Z"), "price" : 10, "quantity" : 2 }
{ "_id" : 2, "item" : "jkl", "date" : ISODate("2014-02-03T09:00:00Z"), "price" : 20, "quantity" : 1 }
{ "_id" : 3, "item" : "xyz", "date" : ISODate("2014-02-03T09:05:00Z"), "price" : 5, "quantity" : 5 }
{ "_id" : 4, "item" : "abc", "date" : ISODate("2014-02-15T08:00:00Z"), "price" : 10, "quantity" : 10 }
{ "_id" : 5, "item" : "xyz", "date" : ISODate("2014-02-15T09:05:00Z"), "price" : 5, "quantity" : 10 }
{ "_id" : 6, "item" : "xyz", "date" : ISODate("2014-02-15T12:05:10Z"), "price" : 5, "quantity" : 5 }
{ "_id" : 7, "item" : "xyz", "date" : ISODate("2014-02-15T14:12:12Z"), "price" : 5, "quantity" : 10 }
The following operation first sorts the documents by item and date, and then in the following $group stage, groups the now sorted documents by the item field and uses the $last accumulator to compute the last sales date for each item:
db.sales.aggregate(
[
{ $sort: { item: 1, date: 1 } },
{
$group:
{
_id: "$item",
lastSalesDate: { $last: "$date" }
}
}
]
)
The operation returns the following results:
{ "_id" : "xyz", "lastSalesDate" : ISODate("2014-02-15T14:12:12Z") }
{ "_id" : "jkl", "lastSalesDate" : ISODate("2014-02-03T09:00:00Z") }
{ "_id" : "abc", "lastSalesDate" : ISODate("2014-02-15T08:00:00Z") }
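Applied to the Entry collection from the question, the same $sort plus $last pattern might look like the sketch below, which returns the last entry per hospital for every day at once. It assumes date holds epoch milliseconds, that days are UTC days, and that $toDate is available (MongoDB 4.0 or later). The field names day and lastEntry are made up for illustration.

```javascript
const pipeline = [
  // Derive a "YYYY-MM-DD" day string from the epoch-millisecond date field.
  {
    $addFields: {
      day: {
        $dateToString: { format: "%Y-%m-%d", date: { $toDate: "$date" } }
      }
    }
  },
  // Ascending sort, so that $last sees the newest entry of each group.
  { $sort: { date: 1 } },
  {
    $group: {
      _id: { hospital: "$hospital._id", day: "$day" },
      lastEntry: { $last: "$$ROOT" }
    }
  },
  // $group does not order its output, so sort the groups explicitly.
  { $sort: { "_id.day": 1 } }
];

// In the mongo shell this would be run as: db.Entry.aggregate(pipeline)
```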

What is $$ROOT in MongoDB aggregate and how it works?

I am watching a tutorial. I can understand how this aggregate works, but what is the use of pings and $$ROOT in it?
import pymongo

client = pymongo.MongoClient(MY_URL)
pings = client['mflix']['watching_pings']
cursor = pings.aggregate([
{
"$sample": { "size": 50000 }
},
{
"$addFields": {
"dayOfWeek": { "$dayOfWeek": "$ts" },
"hourOfDay": { "$hour": "$ts" }
}
},
{
"$group": { "_id": "$dayOfWeek", "pings": { "$push": "$$ROOT" } }
},
{
"$sort": { "_id": 1 }
}
])
Let's assume that our collection looks like below:
{
"_id" : ObjectId("b9"),
"key" : 1,
"value" : 20,
"history" : ISODate("2020-05-16T00:00:00Z")
},
{
"_id" : ObjectId("ba"),
"key" : 1,
"value" : 10,
"history" : ISODate("2020-05-13T00:00:00Z")
},
{
"_id" : ObjectId("bb"),
"key" : 3,
"value" : 50,
"history" : ISODate("2020-05-12T00:00:00Z")
},
{
"_id" : ObjectId("bc"),
"key" : 2,
"value" : 0,
"history" : ISODate("2020-05-13T00:00:00Z")
},
{
"_id" : ObjectId("bd"),
"key" : 2,
"value" : 10,
"history" : ISODate("2020-05-16T00:00:00Z")
}
Now suppose you want to group by the history field and insert the whole documents into an array field 'items'. Here the $$ROOT variable is helpful: it refers to the top-level document currently being processed by the pipeline stage.
So, the aggregation query to achieve this will be:
db.collection.aggregate([{
$group: {
_id: '$history',
items: {$push: '$$ROOT'}
}
}])
It will result in the following output:
{
"_id" : ISODate("2020-05-12T00:00:00Z"),
"items" : [
{
"_id" : ObjectId("bb"),
"key" : 3,
"value" : 50,
"history" : ISODate("2020-05-12T00:00:00Z")
}
]
},
{
"_id" : ISODate("2020-05-13T00:00:00Z"),
"items" : [
{
"_id" : ObjectId("ba"),
"key" : 1,
"value" : 10,
"history" : ISODate("2020-05-13T00:00:00Z")
},
{
"_id" : ObjectId("bc"),
"key" : 2,
"value" : 0,
"history" : ISODate("2020-05-13T00:00:00Z")
}
]
},
{
"_id" : ISODate("2020-05-16T00:00:00Z"),
"items" : [
{
"_id" : ObjectId("b9"),
"key" : 1,
"value" : 20,
"history" : ISODate("2020-05-16T00:00:00Z")
},
{
"_id" : ObjectId("bd"),
"key" : 2,
"value" : 10,
"history" : ISODate("2020-05-16T00:00:00Z")
}
]
}
I hope it helps.
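To make the role of $$ROOT concrete, here is a rough plain JavaScript emulation of the $group stage above. It is not how MongoDB implements the stage. groupPushRoot is a hypothetical helper, and the _id and history values are simplified to plain strings.

```javascript
// Emulates { $group: { _id: "$<keyField>", items: { $push: "$$ROOT" } } }
// on an in-memory array of documents.
function groupPushRoot(docs, keyField) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    if (!groups.has(key)) {
      groups.set(key, { _id: key, items: [] });
    }
    // "$$ROOT" stands for the whole input document, so push doc itself.
    groups.get(key).items.push(doc);
  }
  return Array.from(groups.values());
}

const docs = [
  { _id: "b9", key: 1, value: 20, history: "2020-05-16" },
  { _id: "ba", key: 1, value: 10, history: "2020-05-13" },
  { _id: "bb", key: 3, value: 50, history: "2020-05-12" },
  { _id: "bc", key: 2, value: 0, history: "2020-05-13" },
  { _id: "bd", key: 2, value: 10, history: "2020-05-16" }
];

const grouped = groupPushRoot(docs, "history");
console.log(JSON.stringify(grouped, null, 2));
```

In the pipeline from the question, "pings" plays the same role as "items" here: it is just the name of the output array that collects the full documents for each day of the week.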

Mongodb query to group by multiple fields and filter

I want to be able to group each "Place" to show over time, how many "PatientIds" they are seeing on a given day and then be able to filter this by what the action is.
Basically Total Patients on y-axis, Date on x-axis and then a filter or stacked chart to show the action. I also thought about a mapreduce, but have never done that in mongo
I can't figure out the correct mongo query. Right now I have:
db.collection.aggregate({"$group":{_id:{place:"$place",date:"$date",action:"$action",count:{$sum:1}}},{$sort:{"_id.date":1,"_id.place":1}})
However, this is just listing out the data. I tried to do a match on all places, but that didn't give me the results I was looking for either. Any ideas?
Example json:
{
"_id" : ObjectId(""),
"patientId" : "100",
"place" : "1",
"action" : "DIAGNOSED",
"date" : ISODate("2017-01-20")
}
{
"_id" : ObjectId(""),
"patientId" : "101",
"place" : "1",
"action" : "PATIENT IN",
"date" : ISODate("2017-01-20")
}
{
"_id" : ObjectId(""),
"patientId" : "200",
"place" : "2",
"action" : "MEDICINE",
"date" : ISODate("2017-01-05")
}
{
"_id" : ObjectId(""),
"patientId" : "300",
"place" : "2",
"action" : "DIAGNOSED",
"date" : ISODate("2017-01-31")
}
EDIT - mapreduce
> var map = function(){emit(this.place,1)}
> var reduce = function(key,values){var res = 0;values.forEach(function(v){res+=1});return{count:res};}
> db.new.mapReduce(map,reduce,{out:"mapped_places"});
{
"result" : "mapped_places",
"timeMillis" : 88,
"counts" : {
"input" : 4,
"emit" : 4,
"reduce" : 2,
"output" : 2
},
"ok" : 1
}
> db.mapped_places.find({})
{ "_id" : "1", "value" : { "count" : 2 } }
{ "_id" : "2", "value" : { "count" : 2 } }
>
You can try the aggregation query below.
db.collection.aggregate([
{
"$group": {
"_id": {
"date": "$date",
"place": "$place"
},
"actions": {
"$push": "$action"
},
"count": {
"$sum": 1
}
}
},
{
"$unwind": "$actions"
},
{
"$sort": {
"_id.date": 1,
"_id.place": 1
}
}
]);
This should output something like:
{ "_id" : { "date" : ISODate("2017-01-20T00:00:00Z"), "place" : "1"}, "count" : 2, "actions" : "PATIENT IN" }
{ "_id" : { "date" : ISODate("2017-01-20T00:00:00Z"), "place" : "1"}, "count" : 2, "actions" : "DIAGNOSED" }
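An alternative sketch, closer to the query in the question, keeps action in the group key and moves count out of _id so that it acts as an accumulator. This gives one row per place, date and action, which may suit a stacked chart. It counts documents rather than distinct patients, which is an assumption about what is wanted. countsForAction is a hypothetical helper for the filtered case.

```javascript
// One count per (place, date, action) combination.
const pipeline = [
  {
    $group: {
      _id: { place: "$place", date: "$date", action: "$action" },
      count: { $sum: 1 } // accumulator sits next to _id, not inside it
    }
  },
  { $sort: { "_id.date": 1, "_id.place": 1 } }
];

// Hypothetical helper: the same pipeline restricted to a single action.
function countsForAction(action) {
  return [{ $match: { action: action } }].concat(pipeline);
}

// In the mongo shell:
//   db.collection.aggregate(pipeline)
//   db.collection.aggregate(countsForAction("DIAGNOSED"))
```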

Does $sort work for sub-array documents?

I have a collection which has a field of array type. I want to sort on the basis of a field of the sub-array, but Mongo is not sorting the data.
My collection is:
{
"_id" : ObjectId("51f1fcc08188d3117c6da351"),
"cust_id" : "abc123",
"ord_date" : ISODate("2012-10-03T18:30:00Z"),
"status" : "A",
"price" : 25,
"items" : [{
"sku" : "ggg",
"qty" : 7,
"price" : 2.5
}, {
"sku" : "ppp",
"qty" : 5,
"price" : 2.5
}]
}
My Query is:
db.orders.aggregate([
{ "$unwind" : "$items"} ,
{ "$match" : { }} ,
{ "$group" : { "items" : { "$addToSet" : { "sku" : "$items.sku"}} , "_id" : { }}} ,
{ "$sort" : { "items.sku" : 1}} ,
{ "$project" : { "_id" : 0 , "items" : 1}}
])
Result is:
{
"result" : [
{
"items" : [
{
"sku" : "ppp"
},
{
"sku" : "ggg"
}
]
}
],
"ok" : 1
}
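Whereas "sku":"ggg" should come first when the order is ascending.
You want to do the sort BEFORE you regroup, and collect with $push instead of $addToSet, because $addToSet does not guarantee the order of the elements in its output array: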
db.orders.aggregate([
{ "$unwind" : "$items"} ,
{ "$sort" : { "items.sku" : 1}},
{ "$match" : { }} ,
{ "$group" : { "items" : { "$push" : { "sku" : "$items.sku"}} , "_id" : null}} ,
{ "$project" : { "_id" : 0 , "items" : 1}}
])