I have a collection in MongoDB to which a document with sampling data is added every day. I want to observe changes in a field.
I want to use MongoDB aggregation to collapse each run of adjacent documents with the same value into the first document of that run:
+----+---------+-------+
| id | field   | date  |
+----+---------+-------+
|  1 | hello   | date1 |
+----+---------+-------+
|  2 | foobar  | date2 | \_ Condense these into one row with date2
+----+---------+-------+ /
|  3 | foobar  | date3 |
+----+---------+-------+
|  4 | hello   | date4 |
+----+---------+-------+
|  5 | world   | date5 | \__ Condense these into a row with date5
+----+---------+-------+ /
|  6 | world   | date6 |
+----+---------+-------+
|  7 | puppies | date7 |
+----+---------+-------+
|  8 | kittens | date8 | \__ Condense these into a row with date8
+----+---------+-------+ /
|  9 | kittens | date9 |
+----+---------+-------+
Is it possible to create a MongoDB aggregation for this problem?
Here is an answer to a similar problem in MySQL:
Grouping similar rows next to each other in MySQL
Sample Data
Data are already sorted by date.
These documents:
{ "_id" : "566ee064d56d02e854df756e", "date" : "2015-12-14T15:29:40.432Z", "score" : 59 },
{ "_id" : "566a8c70520d55771f2e9871", "date" : "2015-12-11T08:42:23.880Z", "score" : 60 },
{ "_id" : "566932f5572bd1720db7a4ef", "date" : "2015-12-10T08:08:21.514Z", "score" : 60 },
{ "_id" : "5667e652c021206f34e2c9e4", "date" : "2015-12-09T08:29:06.696Z", "score" : 60 },
{ "_id" : "5666a468cc45e9d9a82b81c9", "date" : "2015-12-08T09:35:35.837Z", "score" : 61 },
{ "_id" : "56653fe099799049b66dab97", "date" : "2015-12-07T08:14:24.494Z", "score" : 60 },
{ "_id" : "5663f6b3b7d0b00b74d9fdf9", "date" : "2015-12-06T08:49:55.299Z", "score" : 60 },
{ "_id" : "56629fb56099dfe31b0c72be", "date" : "2015-12-05T08:26:29.510Z", "score" : 60 }
should group to:
{ "_id" : "566ee064d56d02e854df756e", "date" : "2015-12-14T15:29:40.432Z", "score" : 59 }
{ "_id" : "566a8c70520d55771f2e9871", "date" : "2015-12-11T08:42:23.880Z", "score" : 60 }
{ "_id" : "5666a468cc45e9d9a82b81c9", "date" : "2015-12-08T09:35:35.837Z", "score" : 61 }
{ "_id" : "56653fe099799049b66dab97", "date" : "2015-12-07T08:14:24.494Z", "score" : 60 }
If you don't insist on using the aggregation framework, this could be done by iterating over the cursor and comparing each document to the previous one:
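// toArray() returns a plain array, newest document first
var docs = db.test.find().sort({date: -1}).toArray();
var result = [];
if (docs.length > 0) {
    result.push(docs[0]); // the first document always starts a new run
}
for (var i = 1; i < docs.length; i++) {
    // keep a document only when its score differs from the previous one
    if (docs[i].score !== docs[i - 1].score) {
        result.push(docs[i]);
    }
}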
result:
[
{
"_id" : "566ee064d56d02e854df756e",
"date" : "2015-12-14T15:29:40.432Z",
"score" : 59
},
{
"_id" : "566a8c70520d55771f2e9871",
"date" : "2015-12-11T08:42:23.880Z",
"score" : 60
},
{
"_id" : "5666a468cc45e9d9a82b81c9",
"date" : "2015-12-08T09:35:35.837Z",
"score" : 61
},
{
"_id" : "56653fe099799049b66dab97",
"date" : "2015-12-07T08:14:24.494Z",
"score" : 60
}
]
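If the aggregation framework is preferred after all, one possible sketch uses a window stage to look at the preceding document. This assumes MongoDB 5.0 or later, where $setWindowFields and $shift are available; the collection name test is taken from the loop above, and the helper field prevScore is made up for illustration.

```javascript
// Assumption: MongoDB 5.0+ (needed for $setWindowFields and $shift).
// `prevScore` is a hypothetical helper field, removed again at the end.
const changePointsPipeline = [
  {
    $setWindowFields: {
      sortBy: { date: -1 },
      output: {
        // score of the preceding document in sort order (null for the first one)
        prevScore: { $shift: { output: "$score", by: -1, default: null } }
      }
    }
  },
  // keep only documents whose score differs from the preceding document
  { $match: { $expr: { $ne: ["$score", "$prevScore"] } } },
  // drop the helper field from the output
  { $unset: "prevScore" }
];

// In the mongo shell: db.test.aggregate(changePointsPipeline)
```

One caveat of this sketch: a first document whose score is itself null would be filtered out, because it equals the default used for the missing predecessor.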
Related
I'm working with mongo and node. I have a collection with a large number of records, an unknown number of which are duplicates. I'm trying to remove the duplicates following Remove duplicate records from mongodb 4.0 and https://docs.mongodb.com/manual/aggregation/ .
So far I have:
db.hayes.aggregate([
... {"$group" : {_id:"$PropertyId", count:{$sum:1}}}
... ]
... );
{ "_id" : "R135418", "count" : 10 }
{ "_id" : "R47410", "count" : 17 }
{ "_id" : "R130794", "count" : 10 }
{ "_id" : "R92923", "count" : 18 }
{ "_id" : "R107811", "count" : 11 }
{ "_id" : "R91389", "count" : 15 }
{ "_id" : "R22047", "count" : 12 }
{ "_id" : "R103664", "count" : 10 }
{ "_id" : "R121349", "count" : 12 }
{ "_id" : "R143168", "count" : 8 }
{ "_id" : "R85918", "count" : 13 }
{ "_id" : "R41641", "count" : 13 }
{ "_id" : "R160910", "count" : 11 }
{ "_id" : "R48919", "count" : 11 }
{ "_id" : "M119387", "count" : 10 }
{ "_id" : "R161734", "count" : 12 }
{ "_id" : "R41259", "count" : 13 }
{ "_id" : "R156538", "count" : 7 }
{ "_id" : "R60868", "count" : 10 }
To get the number of groups, I tried this in the mongo shell:
> const cursor = db.hayes.aggregate([{"$group" :
{_id:"$PropertyId", count:{$sum:1}}} ]);
> cursor.count()
uncaught exception: TypeError: cursor.count is not a function :
#(shell):1:1
Apparently this works with the db.collection.find statement. How do I do this with the aggregation framework?
Add the following stage after the $group stage to see the number of groups:
{$count:"Total"}
The count method on the cursor changes the query being sent from find to count. This only works if you are sending a find query to begin with, i.e., not when you are aggregating.
See https://docs.mongodb.com/manual/reference/method/cursor.count/#cursor.count which includes guidance for how to count when aggregating.
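Put together with the $group stage from the question, the pipeline could look like the sketch below. The collection name hayes and the field PropertyId come from the question; the variable name is only illustrative.

```javascript
// Group by PropertyId, then count how many groups were produced.
const countGroupsPipeline = [
  { $group: { _id: "$PropertyId", count: { $sum: 1 } } },
  // $count emits a single document with the given field name
  { $count: "Total" }
];

// In the mongo shell:
//   db.hayes.aggregate(countGroupsPipeline)
// Client-side alternative on the original pipeline (pulls all groups into memory):
//   db.hayes.aggregate([{ $group: { _id: "$PropertyId" } }]).toArray().length
```

The result should be a single document with the field Total holding the number of distinct PropertyId values.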
Below is a sample collection result sorted by an attribute.
{
"_id" : ObjectId("5d96f8245e1ffa18e26dd2e2"),
"name" : "A",
"discount" : 10
},
{
"_id" : ObjectId("5d96f8245e1ffa18e26dd2e2"),
"name" : "B",
"discount" : 15
},
{
"_id" : ObjectId("5d96f8245e1ffa18e26dd2e2"),
"name" : "C",
"discount" : 20
},
{
"_id" : ObjectId("5d96f8245e1ffa18e26dd2e2"),
"name" : "D",
"discount" : 30
}
I want to write a query that returns the leading documents for which the running sum(discount) stays within n.
Example: find all docs until sum(discount) = 25.
This should return the first 2 docs.
Output:
{
"_id" : ObjectId("5d96f8245e1ffa18e26dd2e2"),
"name" : "A",
"discount" : 10
},
{
"_id" : ObjectId("5d96f8245e1ffa18e26dd2e2"),
"name" : "B",
"discount" : 15
}
Find all docs until sum(discount) = 45.
This should return the first 3 docs.
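The rule described above can be sketched client-side in plain JavaScript. The helper name takeWhileSumAtMost is hypothetical, the limit is treated as inclusive because that is what the two examples imply, and the documents are assumed to be already sorted with non-negative discounts.

```javascript
// Hypothetical helper: returns the leading documents whose running total of
// `discount` stays at or below `limit`. Assumes `docs` is already sorted and
// discounts are non-negative, so scanning can stop at the first overshoot.
function takeWhileSumAtMost(docs, limit) {
  const taken = [];
  let runningTotal = 0;
  for (const doc of docs) {
    runningTotal += doc.discount;
    if (runningTotal > limit) break;
    taken.push(doc);
  }
  return taken;
}

// Sample data from the question (the _id values are omitted here).
const discountDocs = [
  { name: "A", discount: 10 },
  { name: "B", discount: 15 },
  { name: "C", discount: 20 },
  { name: "D", discount: 30 }
];

console.log(takeWhileSumAtMost(discountDocs, 25).map(d => d.name)); // [ 'A', 'B' ]
console.log(takeWhileSumAtMost(discountDocs, 45).map(d => d.name)); // [ 'A', 'B', 'C' ]
```

In the mongo shell the input array could come from a sorted find(...).toArray(). On MongoDB 5.0 or later, a server-side variant could compute the running total with $setWindowFields and then $match on it.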
I have a MongoDB collection with unique user traits and I'm trying to combine it with their orders, produce a new collection with the first order date and sum the total of orders by the user.
Given this example collection:
{
"_id" : 1,
"name" : "bob",
"orders" : [ { "date" : "2019-01-01", "amount" : 10 }, { "date" : "2019-01-02", "amount" : 10 } ]
}
{
"_id" : 2,
"name" : "lisa",
"orders" : [ { "date" : "2019-01-02", "amount" : 10 }, { "date" : "2019-01-03", "amount" : 15 } ]
}
this would be my desired output:
{
"_id" : 1,
"name" : "bob",
"first_order" : "2019-01-01",
"total_amount" : 20
}
{
"_id" : 2,
"name" : "lisa",
"first_order" : "2019-01-02",
"total_amount" : 25
}
Thank you
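One possible approach, sketched under a few assumptions: the orders are embedded in each user document as shown, the dates are ISO-formatted strings (so the smallest string is the earliest date), and the target collection name user_order_summary is made up for illustration.

```javascript
// $min and $sum accept an array expression inside $project, so
// "$orders.date" and "$orders.amount" can be reduced per document.
const firstOrderPipeline = [
  {
    $project: {
      name: 1,
      first_order: { $min: "$orders.date" },
      total_amount: { $sum: "$orders.amount" }
    }
  },
  // Hypothetical target collection; $out replaces it if it already exists.
  { $out: "user_order_summary" }
];

// In the mongo shell, with a hypothetical source collection `users`:
//   db.users.aggregate(firstOrderPipeline)
```

Dropping the $out stage returns the summarized documents directly instead of writing them to a new collection.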
I have some input data:
Brand | Model | Number
Peugeot | 208 | 1
Peugeot | 4008 | 2
Renault | Clio | 3
Renault | Megane | 4
I would like to get both:
- the sum for each brand
- the global sum
Here is my expected output:
Brand | Number
Peugeot | 3
Renault | 7
Total | 10
I think I have to create two $group operations and set Total with $literal.
What is the right way to do so?
As you said, this can be done with two $group stages. Let's start by inserting some data into MongoDB similar to your example input:
> db.cars.insertMany([
{ "Brand" : "Peugeot", "Model" : "208", "Number": 1 },
{ "Brand" : "Peugeot", "Model" : "4008", "Number": 2 },
{ "Brand" : "Renault", "Model" : "Clio", "Number": 3 },
{ "Brand" : "Renault", "Model" : "Megane", "Number": 4 }
]);
Now that all our cars are inserted, we can aggregate them using two $group stages:
db.cars.aggregate([
{ $group : { "_id" : "$Brand", "Number" : { $sum : "$Number" }}},
{ $group : { "_id" : null, "Rows" : { $push : { "Brand" : "$$ROOT._id", "Number" : "$Number" } }, "Total" : {$sum : "$Number" } }}
])
This will give us the following output:
{
"_id" : null,
"Rows" : [
{
"Brand" : "Renault",
"Number" : 7
},
{
"Brand" : "Peugeot",
"Number" : 3
}
],
"Total" : 10
}
We can then clean it up with a projection:
db.cars.aggregate([
{ "$group" : { "_id" : "$Brand", "Number" : { $sum : "$Number" }}},
{ "$group" : { "_id" : null, "Rows" : { $push : { "Brand" : "$$ROOT._id", "Number" : "$Number" } }, "Total" : {$sum : "$Number" } } },
{ "$project" : { "_id" : 0, "Data" : { "$concatArrays" : [ "$Rows", [ { "Brand": { $literal : "Total" }, "Number" : "$Total" } ] ] } } }
])
This gives us the following result:
{
"Data" : [
{
"Brand" : "Renault",
"Number" : 7
},
{
"Brand" : "Peugeot",
"Number" : 3
},
{
"Brand" : "Total",
"Number" : 10
}
]
}
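If one document per row is preferred, closer to the table in the question, the Data array can be flattened with two more stages. This is only a sketch: it repeats the three stages above and appends $unwind and $replaceRoot (available since MongoDB 3.4); the variable name is illustrative.

```javascript
const brandTotalsPipeline = [
  { $group: { _id: "$Brand", Number: { $sum: "$Number" } } },
  { $group: { _id: null, Rows: { $push: { Brand: "$_id", Number: "$Number" } }, Total: { $sum: "$Number" } } },
  { $project: { _id: 0, Data: { $concatArrays: ["$Rows", [{ Brand: { $literal: "Total" }, Number: "$Total" }]] } } },
  // emit one document per element of Data ...
  { $unwind: "$Data" },
  // ... and promote each element to the top level
  { $replaceRoot: { newRoot: "$Data" } }
];

// In the mongo shell: db.cars.aggregate(brandTotalsPipeline)
```

Each output document then has just the Brand and Number fields, with the Total row last. The order of the brand rows is not guaranteed unless a $sort stage is added after the first $group.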
I'm attempting to use find (not aggregation) to give me the max score by student on a test. Essentially, this would be a sort by student, then finding the top score for that student, then sorting the result set. Here is the data:
{ "_id" : 1, "name" : "Pat", "score" : 97 }
{ "_id" : 1, "name" : "Pat", "score" : 92 }
{ "_id" : 2, "name" : "Pat", "score" : 89 }
{ "_id" : 3, "name" : "Ken", "score" : 91 }
{ "_id" : 4, "name" : "Ken", "score" : 81 }
I'm looking for the result to look like this (where only the student's top score is returned):
{ "_id" : 1, "name" : "Pat", "score" : 97 }
{ "_id" : 2, "name" : "Ken", "score" : 91 }
I've tried many different combinations but can't get it to work. I know in SQL how I'd do it. Here is my current code, which is just sorting it:
db.grades.find().sort({score: -1})
You can use:
db.grades.find().sort('-score').distinct('name').sort('score')
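Keep in mind that distinct yields only the distinct values of a single field (here the names), not whole documents. If the full top-scoring document per student is needed and aggregation is off the table, one hedged alternative is to fetch the documents with find and reduce them client-side. The helper name topScoreByName is hypothetical, and the sample array mirrors the question's data without the _id values.

```javascript
// Hypothetical helper: keeps the highest-scoring document per name,
// then sorts the survivors by score, highest first.
function topScoreByName(docs) {
  const best = new Map();
  for (const doc of docs) {
    const current = best.get(doc.name);
    if (current === undefined || doc.score > current.score) {
      best.set(doc.name, doc);
    }
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}

const grades = [
  { name: "Pat", score: 97 },
  { name: "Pat", score: 92 },
  { name: "Pat", score: 89 },
  { name: "Ken", score: 91 },
  { name: "Ken", score: 81 }
];

console.log(topScoreByName(grades));
// [ { name: 'Pat', score: 97 }, { name: 'Ken', score: 91 } ]
```

In the mongo shell the input could be db.grades.find().toArray(). For large collections, a $group stage with $max would avoid transferring every document to the client.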