How to group by date and by specific field in MongoDB

I want to print results grouped by date, and by "productId" within each date. In this example, the output should be as follows:
[
{
"_id": "2018-03-04",
"product1": 2,
"product2": 2
}
]
Data: https://mongoplayground.net/p/gzvm11EIPn2
How can I do this?

When you first use the $group stage in an aggregation, you usually learn to group by a single field, like so: { $group: { "_id": "$field1"...
When you want to group by two or more fields, "_id" needs to be a subdocument, and you pass the fields as key-value pairs inside that subdocument, like so:
db.mycollection.aggregate([
{
$group:
{
"_id": { "product1": "$product1", "product2": "$product2", ... }
}
}
])
... etc.
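To make the compound-key idea concrete, here is a minimal in-memory sketch in plain JavaScript (it does not call MongoDB). The helper name countByFields and the sample documents are made up for illustration. It counts documents per distinct combination of the chosen fields, which is roughly what $group with a subdocument "_id" and { $sum: 1 } would compute.

```javascript
// Hypothetical helper: counts documents per distinct combination of `fields`,
// mimicking $group with a subdocument _id and { $sum: 1 }.
function countByFields(docs, fields) {
  const groups = new Map();
  for (const doc of docs) {
    // Build the compound key, e.g. { day: "2018-03-04", productId: "p1" }.
    const id = Object.fromEntries(fields.map((f) => [f, doc[f]]));
    const key = JSON.stringify(id);
    const group = groups.get(key) ?? { _id: id, count: 0 };
    group.count += 1;
    groups.set(key, group);
  }
  return [...groups.values()];
}

// Made-up sample data, not taken from the question.
const sampleDocs = [
  { day: "2018-03-04", productId: "p1" },
  { day: "2018-03-04", productId: "p1" },
  { day: "2018-03-04", productId: "p2" },
];
const groupedByDayAndProduct = countByFields(sampleDocs, ["day", "productId"]);
console.log(groupedByDayAndProduct);
```

Each returned element has the same shape as a $group result: the compound key under _id plus the accumulated count.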

1. $group - Group by createdAt (as a date string) and productId, and count the documents via $sum.
2. $group - Group by createdAt and push the data from (1) into a products array field.
3. $replaceRoot - Replace the input document with a new document.
3.1. $arrayToObject - Convert the products array into key-value pairs, with productId as the key and count as the value.
3.2. $mergeObjects - Merge an object holding _id with the object from (3.1) into a single object.
db.collection.aggregate([
{
$group: {
_id: {
createdAt: {
$dateToString: {
format: "%Y-%m-%d",
date: "$createdAt"
}
},
productId: "$productId"
},
count: {
$sum: 1
}
}
},
{
$group: {
_id: "$_id.createdAt",
products: {
$push: {
productId: "$_id.productId",
count: "$count"
}
}
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
{
_id: "$_id"
},
{
$arrayToObject: {
$map: {
input: "$products",
in: {
k: {
$toString: "$$this.productId"
},
v: "$$this.count"
}
}
}
}
]
}
}
}
])
Sample Mongo Playground
Output
[
{
"5e345223b3aa703b8a9a4f34": 2,
"5e345223b3aa703b8a9a4f35": 2,
"_id": "2018-03-04"
}
]
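For readers who want to check what the pipeline above computes, the following is a rough plain-JavaScript equivalent that runs in memory. It is only a sketch: pivotCountsByDay and the sample orders are hypothetical, and the date string is taken in UTC on the assumption that $dateToString is used without a timezone option.

```javascript
// Hypothetical in-memory equivalent of the three pipeline stages.
function pivotCountsByDay(docs) {
  const byDay = new Map();
  for (const { createdAt, productId } of docs) {
    // Stage 1: group by (date string, productId) and count.
    const day = createdAt.toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
    const counts = byDay.get(day) ?? {};
    counts[productId] = (counts[productId] ?? 0) + 1;
    byDay.set(day, counts);
  }
  // Stages 2 and 3: one document per day, with the counts merged next to _id.
  return [...byDay].map(([day, counts]) => ({ _id: day, ...counts }));
}

// Made-up sample data using readable product ids.
const sampleOrders = [
  { createdAt: new Date("2018-03-04T10:00:00Z"), productId: "product1" },
  { createdAt: new Date("2018-03-04T11:00:00Z"), productId: "product2" },
  { createdAt: new Date("2018-03-04T12:00:00Z"), productId: "product1" },
  { createdAt: new Date("2018-03-04T13:00:00Z"), productId: "product2" },
];
const pivoted = pivotCountsByDay(sampleOrders);
console.log(pivoted);
```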

Related

Flatten group of multiple fields in mongo?

I want to get the sum of amounts grouped by address and date:
db.coolCollection.aggregate([
{ $group:
{ _id:
{ address: "$address",
date: { $dateFromString: { dateString: "$block.time" } } },
sum: { $sum: "$amount" } } } ])
Great, except the results look like this:
{
_id: {
address: "abc123",
date: 2021-03-22T00:00:00.000+00:00
},
sum: 48645
}
I want this:
{
address: "abc123",
date: 2021-03-22T00:00:00.000+00:00,
sum: 48645
}
Usually you'd just add a $project stage to restructure the result. Here is how to do it using $replaceRoot instead, appended after the $group stage, on the assumption that _id may contain many fields you don't want to map manually:
db.collection.aggregate([
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
"$_id",
"$$ROOT"
]
}
}
},
{
$project: {
_id: 0
}
}
])
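The effect of these two stages on a single grouped document can be sketched in plain JavaScript. This is an illustration only: flattenGroupId is a hypothetical name, and the date is written as a plain string to keep the example short.

```javascript
// Hypothetical helper mirroring $mergeObjects: ["$_id", "$$ROOT"]
// followed by $project: { _id: 0 }.
function flattenGroupId(doc) {
  // Later keys win, so after the merge _id still holds the subdocument.
  const merged = { ...doc._id, ...doc };
  // Equivalent of { $project: { _id: 0 } }.
  delete merged._id;
  return merged;
}

const groupedDoc = {
  _id: { address: "abc123", date: "2021-03-22" },
  sum: 48645,
};
const flattened = flattenGroupId(groupedDoc);
console.log(flattened);
```

Because the fields of _id are spread generically, nothing needs to change if more fields are added to the group key later.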
Mongo Playground

MongoDB - convert an object to an array

I have two documents (obtained by other steps in an aggregation pipeline):
{
'_id': '2021-01-04',
'value': 1234.55
},
{
'_id': '2021-01-05',
'value': 345.67
}
I would now like to convert these two documents into an array that would look like this:
[
{ '2021-01-04': 1234.55 },
{ '2021-01-05': 345.67 }
]
I've tried to first convert the key/value pairs using a $group stage like so:
$group: {
_id: null,
data: {
$push: {
"k": "$_id",
"v": "$value"
}
}
}
This yields:
[
{
"_id": null,
"data": [
{
"k": "2021-01-04",
"v": 1234.55
},
{
"k": "2021-01-05",
"v": 345.67
}
]
}
]
While this would be useful as input for $arrayToObject, I don't want an object (as I need the objects to be ordered), but I cannot see how to get from here to the desired final output.
$sort - order the documents by _id in ascending order
$arrayToObject - convert each single k/v pair into an object
$group - group by null and push each converted object into data
db.collection.aggregate([
{ $sort: { _id: 1 } },
{
$group: {
_id: null,
data: {
$push: {
$arrayToObject: [
[{ k: "$_id", v: "$value" }]
]
}
}
}
}
])
Playground
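As a quick sanity check of the idea, here is a small in-memory sketch in plain JavaScript; toOrderedPairs is a hypothetical name, and the input is the two documents from the question given in reverse order. Sorting first and then building one single-key object per document keeps the order in the array rather than relying on object key order.

```javascript
// Hypothetical helper: sort by _id, then turn each { _id, value }
// into a single-key object { [_id]: value }.
function toOrderedPairs(docs) {
  return [...docs]
    .sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0)) // $sort: { _id: 1 }
    .map(({ _id, value }) => ({ [_id]: value })); // $arrayToObject on one k/v pair
}

const dailyValues = [
  { _id: "2021-01-05", value: 345.67 },
  { _id: "2021-01-04", value: 1234.55 },
];
const orderedPairs = toOrderedPairs(dailyValues);
console.log(orderedPairs);
```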

Finding top 3 students in each subject MongoDB

I have tried searching for ways to solve my problem, but the answers I found assume a database that is set up differently from mine.
The documents in my collection look something like this:
{name:"MAX",
date:"2020-01-01",
Math:98,
Science:60,
English:80},
{name:"JANE",
date:"2020-01-01",
Math:80,
Science:70,
English:79},
{name:"ALEX",
date:"2020-01-01",
Math:95,
Science:68,
English:70},
{name:"JOHN",
date:"2020-01-01",
Math:95,
Science:68,
English:70},
{name:"MAX",
date:"2020-06-01",
Math:97,
Science:78,
English:90},
{name:"JANE",
date:"2020-06-01",
Math:78,
Science:76,
English:66},
{name:"ALEX",
date:"2020-06-01",
Math:93,
Science:75,
English:82},
{name:"JOHN",
date:"2020-06-01",
Math:92,
Science:80,
English:50}
I want to find the top 3 students for each subject without regard for the dates. I only managed to find the top 3 students in 1 subject.
So I group the students by name first and add a column for the max score of a subject (Math in this case), then sort in descending order and limit the results to 3.
db.student_scores.aggregate(
[
{$group:{
_id: "$name",
maxMath: { $max: "$Math" }}},
{$sort:{"maxMath":-1}},
{$limit : 3}
]
)
Is there any way to get the top 3 students for each subject?
So, it would be the top 3 for Math, the top 3 for Science, and the top 3 for English:
{
Math:{MAX, JANE, JOHN},
Science:{JOHN, ALEX, JANE},
English:{JANE, MAX, JOHN}
}
I just applied your code three times, using $facet.
If you prefer a more compact result, add this stage at the end:
{$project:{English:"$Eng._id", Science:"$sci._id", Math:"$math._id"}}
PLAYGROUND
PIPELINE
db.collection.aggregate([
{
"$facet": {
"math": [
{
$group: {
_id: "$name",
maxMath: {
$max: "$Math"
}
}
},
{
$sort: {
"maxMath": -1
}
},
{
$limit: 3
}
],
"sci": [
{
$group: {
_id: "$name",
maxSci: {
$max: "$Science"
}
}
},
{
$sort: {
"maxSci": -1
}
},
{
$limit: 3
}
],
"Eng": [
{
$group: {
_id: "$name",
maxEng: {
$max: "$English"
}
}
},
{
$sort: {
"maxEng": -1
}
},
{
$limit: 3
}
]
}
}
])
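The logic of the $facet pipeline (best score per student, sort descending, keep three) can be sketched in plain JavaScript to see the compact result shape. This is only an in-memory illustration: topStudents is a hypothetical name and the scores below are made up, with no ties so the ordering is unambiguous.

```javascript
// Hypothetical helper: best score per student for each subject,
// then the names of the top `n` students.
function topStudents(docs, subjects, n = 3) {
  const result = {};
  for (const subject of subjects) {
    const best = new Map(); // name -> max score, like $group with $max
    for (const doc of docs) {
      const previous = best.get(doc.name) ?? -Infinity;
      best.set(doc.name, Math.max(previous, doc[subject]));
    }
    result[subject] = [...best]
      .sort((a, b) => b[1] - a[1]) // $sort descending by score
      .slice(0, n) // $limit
      .map(([name]) => name);
  }
  return result;
}

// Made-up scores, two subjects only.
const sampleScores = [
  { name: "MAX", Math: 98, Science: 60 },
  { name: "JANE", Math: 80, Science: 70 },
  { name: "ALEX", Math: 95, Science: 68 },
  { name: "JOHN", Math: 90, Science: 65 },
  { name: "MAX", Math: 97, Science: 78 },
];
const topBySubject = topStudents(sampleScores, ["Math", "Science"]);
console.log(topBySubject);
```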
Your question is not entirely clear, but I can think of two scenarios.
Repeated students, along with the date:
$project to show the required fields and convert the subjects object to an array using $objectToArray
$unwind the subjects array
$sort by score (subjects.v) in descending order
$group by subject name and collect an array of students
$project to keep the top 3 students from the students array using $slice
db.collection.aggregate([
{
$project: {
name: "$name",
date: "$date",
subjects: {
$objectToArray: {
Math: "$Math",
Science: "$Science",
English: "$English"
}
}
}
},
{ $unwind: "$subjects" },
{ $sort: { "subjects.v": -1 } },
{
$group: {
_id: "$subjects.k",
students: {
$push: {
name: "$name",
date: "$date",
score: "$subjects.v"
}
}
}
},
{
$project: {
_id: 0,
subject: "$_id",
students: { $slice: ["$students", 3] }
}
}
])
Playground
Sum of scores across all dates (each student appears once):
$group by name and total each subject using $sum
$project to convert the subjects object to an array using $objectToArray
$unwind the subjects array
$sort by score (subjects.v) in descending order
$group by subject name and collect an array of students
$project to keep the top 3 students from the students array using $slice
db.collection.aggregate([
{
$group: {
_id: "$name",
Math: { $sum: "$Math" },
Science: { $sum: "$Science" },
English: { $sum: "$English" }
}
},
{
$project: {
subjects: {
$objectToArray: {
Math: "$Math",
Science: "$Science",
English: "$English"
}
}
}
},
{ $unwind: "$subjects" },
{ $sort: { "subjects.v": -1 } },
{
$group: {
_id: "$subjects.k",
students: {
$push: {
name: "$_id",
score: "$subjects.v"
}
}
}
},
{
$project: {
_id: 0,
subject: "$_id",
students: { $slice: ["$students", 3] }
}
}
])
Playground

Mongodb while aggregate group value as key

I am trying to aggregate and group values, but I want one of the fields to be used as the key.
[
{id:1, value: "x"},
{id:2, value: "y"},
{id:1, value: "a"},
{id:2, value: "b"},
]
I used this query, but with no luck:
db.getCollection('Test').aggregate([
{
$group: {
_id: "$id",
"value": {$push: "$$ROOT" }
}
}
])
Was trying to achieve this
[
{ 1:[x,a] },
{ 2:[y,b] }
]
Can anyone help me with this query?
You need to run $group twice to get a single document that contains an array of k,v pairs. Then you can run $arrayToObject on that document, along with $replaceRoot, to promote the new object to the root level:
db.collection.aggregate([
{
$group: {
_id: "$id",
values: { $push: "$value" }
}
},
{
$group: {
_id: null,
root: { $push: { k: { $toString: "$_id" }, v: "$values" } }
}
},
{
$replaceRoot: {
newRoot: {
$arrayToObject: "$root"
}
}
}
])
Mongo Playground
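The same transformation can be sketched in plain JavaScript to show the resulting shape. groupValuesById is a hypothetical name, and the code runs in memory rather than against MongoDB. Note that the result is a single object with one key per id, which is what $arrayToObject produces here, not an array of single-key objects.

```javascript
// Hypothetical helper: collects values per id into one object keyed by id.
function groupValuesById(docs) {
  const out = {};
  for (const { id, value } of docs) {
    const key = String(id); // keys must be strings, hence $toString in the pipeline
    if (!out[key]) out[key] = [];
    out[key].push(value); // $push: "$value"
  }
  return out;
}

const testDocs = [
  { id: 1, value: "x" },
  { id: 2, value: "y" },
  { id: 1, value: "a" },
  { id: 2, value: "b" },
];
const valuesById = groupValuesById(testDocs);
console.log(valuesById);
```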

MongoDB - aggregating with nested objects, and changeable keys

I have a document which describes counts of different things observed by a camera within a 15 minute period. It looks like this:
{
"_id" : ObjectId("5b1a709a83552d002516ac19"),
"start" : ISODate("2018-06-08T11:45:00.000Z"),
"end" : ISODate("2018-06-08T12:00:00.000Z"),
"recording" : ObjectId("5b1a654683552d002516ac16"),
"data" : {
"counts" : {
"5b434d05da1f0e00252566be" : 12,
"5b434d05da1f0e00252566cc" : 4,
"5b434d05da1f0e00252566ca" : 1
}
}
}
The keys inside the data.counts object change with each document and refer to additional data that is fetched at a later date. There can be any number of keys inside data.counts (but usually there are about 20).
I am trying to aggregate all these 15 minute documents up to daily aggregated documents.
I have this query at the moment to do that:
db.getCollection("segments").aggregate([
{$match:{
"recording": ObjectId("5bf7f68ad8293a00261dd83f")
}},
{$project:{
"start": 1,
"recording": 1,
"data": 1
}},
{$group:{
_id: { $dateToString: { format: "%Y-%m-%d", date: "$start" } },
"segments": { $push: "$$ROOT" }
}},
{$sort: {_id: -1}},
]);
This does the grouping and returns all the segments in an array.
I want to also aggregate the information inside data.counts, so that I get the sum of values for all keys that are the same within the daily group.
This would save me from having another service loop through each 15 minute segment summing values with the same keys. E.g. the query would return something like this:
{
"_id" : "2019-02-27",
"counts" : {
"5b434d05da1f0e00252566be" : 351,
"5b434d05da1f0e00252566cc" : 194,
"5b434d05da1f0e00252566ca" : 111
... any other keys that were found within a day
}
}
How might I amend the query I already have, or use a different query?
Thanks!
You could use the $facet pipeline stage to create two sub-pipelines; one for segments and another for counts. These sub-pipelines can be joined by using $zip to stitch them together and $map to merge each 2-element array produced from zip. Note this will only work correctly if the sub-pipelines output sorted arrays of the same size, which is why we group and sort by start_date in each sub-pipeline.
Here's the query:
db.getCollection("segments").aggregate([{
$match: {
recording: ObjectId("5b1a654683552d002516ac16")
}
}, {
$project: {
start: 1,
recording: 1,
data: 1,
start_date: { $dateToString: { format: "%Y-%m-%d", date: "$start" }}
}
}, {
$facet: {
segments_pipeline: [{
$group: {
_id: "$start_date",
segments: {
$push: {
start: "$start",
recording: "$recording",
data: "$data"
}
}
}
}, {
$sort: {
_id: -1
}
}],
counts_pipeline: [{
$project: {
start_date: "$start_date",
count: { $objectToArray: "$data.counts" }
}
}, {
$unwind: "$count"
}, {
$group: {
_id: {
start_date: "$start_date",
count_id: "$count.k"
},
count_sum: { $sum: "$count.v" }
}
}, {
$group: {
_id: "$_id.start_date",
counts: {
$push: {
$arrayToObject: [[{
k: "$_id.count_id",
v: "$count_sum"
}]]
}
}
}
}, {
$project: {
counts: { $mergeObjects: "$counts" }
}
}, {
$sort: {
_id: -1
}
}]
}
}, {
$project: {
result: {
$map: {
input: { $zip: { inputs: ["$segments_pipeline", "$counts_pipeline"] }},
in: { $mergeObjects: "$$this" }
}
}
}
}, {
$unwind: "$result"
}, {
$replaceRoot: {
newRoot: "$result"
}
}])
Try it out here: Mongoplayground.
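To illustrate what the counts sub-pipeline computes, here is a small plain-JavaScript sketch that sums data.counts per day in memory. The name sumCountsByDay, the short keys "a" and "b", and the sample segments are all made up for the example, and the day is taken in UTC on the assumption that $dateToString is used without a timezone option.

```javascript
// Hypothetical helper: sums the values of data.counts per key within each day.
function sumCountsByDay(segments) {
  const days = new Map();
  for (const { start, data } of segments) {
    const day = start.toISOString().slice(0, 10); // "YYYY-MM-DD" in UTC
    const totals = days.get(day) ?? {};
    // Object.entries plays the role of $objectToArray followed by $unwind.
    for (const [key, n] of Object.entries(data.counts)) {
      totals[key] = (totals[key] ?? 0) + n; // $sum per (day, key)
    }
    days.set(day, totals);
  }
  return [...days].map(([day, counts]) => ({ _id: day, counts }));
}

// Made-up 15-minute segments with changing keys.
const sampleSegments = [
  { start: new Date("2019-02-27T11:45:00Z"), data: { counts: { a: 12, b: 4 } } },
  { start: new Date("2019-02-27T12:00:00Z"), data: { counts: { a: 3 } } },
  { start: new Date("2019-02-28T09:00:00Z"), data: { counts: { b: 7 } } },
];
const dailyCounts = sumCountsByDay(sampleSegments);
console.log(dailyCounts);
```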