After doing a $facet I receive this output:
[
{
"confirmed": [
{
"confirmed": 100
}
],
"denied": [
{
"denied": 50
}
],
"pending": [
{
"pending": 20
}
]
}
]
How can I project it into something like this?
[
{
category: "confirmed", count: 100,
category: "denied", count: 50,
category: "pending", count: 20
}
]
I need the $facet part because, to extract those numbers, I have to run several $match stages on the same data. I don't know if there is a better option.
Thank you!
What you ask for is not a valid format: it is an object with duplicate keys. You may want:
[{"confirmed": 100, "denied": 50, "pending": 20}]
or
[
{category: "confirmed", count: 100},
{category: "denied", count: 50},
{category: "pending", count: 20}
]
Both of these are valid options.
I guess you want the second option. If you want the generic solution, one option is:
db.collection.aggregate([
{$project: {res: {$objectToArray: "$$ROOT"}}},
{$project: {
res: {$map: {
input: "$res",
in: {category: "$$this.k", count: {$objectToArray: {$first: "$$this.v"}}}
}}
}},
{$project: {
res: {$map: {
input: "$res",
in: {category: "$$this.category", count: {$first: "$$this.count.v"}}
}}
}},
{$unwind: "$res"},
{$replaceRoot: {newRoot: "$res"}}
])
See how it works on the playground example - generic
If you want the literal option, just use:
db.collection.aggregate([
{$project: {
res: [
{category: "confirmed", count: {$first: "$confirmed.confirmed"}},
{category: "denied", count: {$first: "$denied.denied"}},
{category: "pending", count: {$first: "$pending.pending"}}
]
}
},
{$unwind: "$res"},
{$replaceRoot: {newRoot: "$res"}}
])
See how it works on the playground example - literal
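To make the target shape concrete, here is a plain JavaScript sketch (no MongoDB needed) of the same reshaping the generic pipeline performs. The helper name `facetToCategories` is hypothetical, and the sketch assumes each facet holds a one-element array whose single document has one numeric field, as in the sample output.

```javascript
// Sample $facet output from the question.
const facetOutput = [
  {
    confirmed: [{ confirmed: 100 }],
    denied: [{ denied: 50 }],
    pending: [{ pending: 20 }],
  },
];

// Hypothetical helper: turn each facet key into {category, count}.
// Roughly mirrors $objectToArray + $map + $first in the generic pipeline.
function facetToCategories(facetDoc) {
  return Object.entries(facetDoc).map(([category, docs]) => {
    const first = docs[0] ?? {};          // like {$first: "$$this.v"}
    const [count] = Object.values(first); // the only value in that document
    return { category, count };
  });
}

// One output document per category, like $unwind + $replaceRoot.
const result = facetOutput.flatMap(facetToCategories);
console.log(result);
```

Each element of `result` corresponds to one document that the pipeline emits after $replaceRoot.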
I have a collection of docs like
{'id':1, 'score': 1, created_at: ISODate(...)}
{'id':1, 'score': 2, created_at: ISODate(...)}
{'id':2, 'score': 1, created_at: ISODate(...)}
{'id':2, 'score': 20, created_at: ISODate(...)}
etc.
Does anyone know how to find docs that were created within the past 24hrs where the difference of the score value between the two most recent docs of the same id is less than 5?
So far I can only find all docs created within the past 24hrs:
[{
$project: {
_id: 0,
score: 1,
created_at: 1
}
}, {
$match: {
$expr: {
$gte: [
'$created_at',
{
$subtract: [
'$$NOW',
86400000
]
}
]
}
}
}]
Any advice appreciated.
Edit: Of the two most recent docs, the older one may have been created more than 24hrs ago. So the most recent doc must be created within the past 24hrs, but the older doc could be created over 24hrs ago.
If I understand you correctly, you want something like:
db.collection.aggregate([
{$match: {$expr: {$gte: ["$created_at", {$subtract: ["$$NOW", 86400000]}]}}},
{$sort: {created_at: -1}},
{$group: {_id: "$id", data: {$push: "$$ROOT"}}},
{$project: {pair: {$slice: ["$data", 0, 2]}, scores: {$slice: ["$data.score", 0, 2]}}},
{$match: {$expr: {
$lte: [{$abs: {$subtract: [{$first: "$scores"}, {$last: "$scores"}]}}, 5]
}}},
{$unset: "scores"}
])
See how it works on the playground example
EDIT:
According to your comment, one option is:
db.collection.aggregate([
{$setWindowFields: {
partitionBy: "$id",
sortBy: {created_at: -1},
output: {data: {$push: "$$ROOT", window: {documents: ["current", 1]}}}
}},
{$group: {
_id: "$id",
created_at: {$first: "$created_at"},
pair: {$first: "$data"}
}},
{$match: {$expr: {$and: [
{$gte: ["$created_at", {$dateAdd: {startDate: "$$NOW", unit: "day", amount: -1}}]},
{$eq: [{$size: "$pair"}, 2]},
{$lte: [{$abs: {$subtract: [{$first: "$pair.score"},
{$last: "$pair.score"}]}}, 5]}
]}}},
{$project: {_id: 0, pair: 1}}
])
See how it works on the playground example
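As a cross-check of the logic, here is a plain JavaScript sketch of what the edited pipeline is meant to compute. The helper name `recentClosePairs`, the `maxDiff` parameter, and the timestamps in the sample data are assumptions made for illustration; the comparison uses "less than or equal" to match the $lte in the pipeline.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: for each id, take the two most recent docs and keep
// the pair when the newest is at most 24h old and the scores differ by <= maxDiff.
function recentClosePairs(docs, now, maxDiff = 5) {
  const byId = new Map();
  for (const doc of docs) {
    if (!byId.has(doc.id)) byId.set(doc.id, []);
    byId.get(doc.id).push(doc);
  }
  const pairs = [];
  for (const group of byId.values()) {
    // Newest first, like sortBy: {created_at: -1}.
    const sorted = [...group].sort((a, b) => b.created_at - a.created_at);
    const pair = sorted.slice(0, 2);
    if (pair.length < 2) continue;                   // need two docs to compare
    if (now - pair[0].created_at > DAY_MS) continue; // newest doc is too old
    if (Math.abs(pair[0].score - pair[1].score) <= maxDiff) pairs.push(pair);
  }
  return pairs;
}

// Illustrative data: the dates are made up, the ids and scores follow the question.
const now = new Date("2022-11-03T12:00:00Z");
const docs = [
  { id: 1, score: 1, created_at: new Date("2022-11-01T12:00:00Z") }, // older than 24h
  { id: 1, score: 2, created_at: new Date("2022-11-03T10:00:00Z") },
  { id: 2, score: 1, created_at: new Date("2022-11-03T08:00:00Z") },
  { id: 2, score: 20, created_at: new Date("2022-11-03T11:00:00Z") },
];
const pairs = recentClosePairs(docs, now);
console.log(pairs.map((p) => p.map((d) => d.score)));
```

With this data only the pair for id 1 is kept: its newest doc is two hours old and the scores differ by 1, while the scores for id 2 differ by 19.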
If I've understood correctly, you can try this query:
First, $match as you already do, to get the documents created within the last day.
Then $sort by date so that the most recent documents come first.
$group by id; because the most recent documents came first, the first two elements of the array built with $push are the two most recent scores.
Now you only need to $sum these two values.
Finally, filter again to keep only the results that are less than ($lt) 5.
db.collection.aggregate([
{
$match: {
$expr: {
$gte: [
"$created_at",
{
$subtract: [
"$$NOW",
86400000
]
}
]
}
}
},
{
"$sort": {
"created_at": -1
}
},
{
"$group": {
"_id": "$id",
"score": {
"$push": "$score"
}
}
},
{
"$project": {
"score": {
"$sum": {
"$firstN": {
"n": 2,
"input": "$score"
}
}
}
}
},
{
"$match": {
"score": {
"$lt": 5
}
}
}
])
Example here
Edit: $firstN is new in version 5.2. On older versions you can use $slice instead.
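A minimal sketch of the $slice alternative mentioned in the edit note, written as a stage object that could take the place of the $project stage in the pipeline above. The variable name `projectStage` is only for illustration.

```javascript
// Possible replacement for the $project stage on servers without $firstN.
// {$slice: [<array>, n]} with a positive n returns the first n elements.
const projectStage = {
  $project: {
    score: { $sum: { $slice: ["$score", 2] } },
  },
};

console.log(JSON.stringify(projectStage));
```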
I have the following documents in my MongoDB:
_id: ObjectId(...)
'timestamp': 2022-11-03T10:00:00.000+00:00
score: 1
_id: ObjectId(...)
'timestamp': 2022-11-03T09:00:00.000+00:00
score: 3
_id: ObjectId(...)
'timestamp': 2022-11-03T10:00:00.000+00:00
score: 6
_id: ObjectId(...)
'timestamp': 2022-11-03T10:00:00.000+00:00
score: 10
I want to make an aggregation that counts the score within the range of (gte)1-(lt)5 as poor, (gte)5-(lt)7 as ok, (gte)7-(lt)8.5 as good and (gte)8.5-(lte)10 as excellent.
So the result would look like this:
{
"data": [
{
"name": "excellent",
"count": 1
},
{
"name": "good",
"count": 0
},
{
"name": "ok",
"count": 1
},
{
"name": "poor",
"count": 2
}
]
}
How do I achieve that?
If you can accept a result that only includes the categories with a non-zero count, you can do:
db.collection.aggregate([
{$project: {
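// Map each score to the ranges stated in the question.
_id: {$switch: {
branches: [
{case: {$lt: ["$score", 5]}, then: "poor"},
{case: {$lt: ["$score", 7]}, then: "ok"},
{case: {$lt: ["$score", 8.5]}, then: "good"}
],
default: "excellent"
}}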
}},
{$group: {_id: "$_id", count: {$sum: 1}}}
])
Otherwise you need to create all categories:
db.collection.aggregate([
{$group: {
_id: 0,
excellent: {$sum: {$cond: [{$gte: ["$score", 8.5]}, 1, 0]}},
good: {$sum: {$cond: [{$and: [{$gte: ["$score", 7]}, {$lt: ["$score", 8.5]}]}, 1, 0]}},
ok: {$sum: {$cond: [{$and: [{$gte: ["$score", 5]}, {$lt: ["$score", 7]}]}, 1, 0]}},
poor: {$sum: {$cond: [{$lt: ["$score", 5]}, 1, 0]}}
}},
{$unset: "_id"},
{$project: {data: {$objectToArray: "$$ROOT"}}},
{$project: {
data: {$map: {
input: "$data",
in: {name: "$$this.k", count: "$$this.v"}
}}
}}
])
See how it works on the playground example
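For reference, the bucketing itself can be sketched in plain JavaScript using the ranges stated in the question. The function names `bucketName` and `countBuckets` are hypothetical; the point is that every category starts at zero, so empty categories still appear in the result.

```javascript
// Ranges from the question: [1,5) poor, [5,7) ok, [7,8.5) good, [8.5,10] excellent.
function bucketName(score) {
  if (score < 5) return "poor";
  if (score < 7) return "ok";
  if (score < 8.5) return "good";
  return "excellent";
}

function countBuckets(scores) {
  // Start every category at 0 so categories without documents are still listed.
  const counts = { excellent: 0, good: 0, ok: 0, poor: 0 };
  for (const s of scores) counts[bucketName(s)] += 1;
  return {
    data: Object.entries(counts).map(([name, count]) => ({ name, count })),
  };
}

// Scores of the four sample documents in the question.
const result = countBuckets([1, 3, 6, 10]);
console.log(JSON.stringify(result));
```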
I'm receiving an array of documents; each document has the data of one participant of a study.
"a" has some anatomic metrics, here represented as "foo" and "bar" (e.g. height, weight, etc.).
"b" has the performance per second on other tests:
"t" is the time in seconds and
"e" are the test results measured at that specific time (e.g. cardiac rhythm, blood pressure, temperature, etc.).
Example of data:
[
{
"a": { "foo":1, "bar": 100 },
"b": [
{ "t":1, "e":[3,4,5] },
{ "t":2, "e":[4,4,4] },
{ "t":3, "e":[7,4,7] }
],
},
{
"a": { "foo":2, "bar": 111 },
"b": [
{ "t":1, "e":[9,4,0] },
{ "t":2, "e":[1,4,2] },
{ "t":3, "e":[3,4,5] }
],
},
{
"a": { "foo":4, "bar": 200 },
"b": [
{ "t":1, "e":[1,4,2] },
{ "t":2, "e":[3,1,3] },
{ "t":3, "e":[2,4,1] }
],
}
]
I'm trying to get some averages of the participants.
I already managed to get the averages of the anatomic values stored in "a".
I used:
db.collection.aggregate([
{
$group: {
_id: null,
barAvg: {
$avg: {
$avg: "$a.bar"
}
}
}
}
])
However, I'm failing to get the average of every test per second. So that would be the average on every "t" of every individual element of "e".
Expected result:
"average": [
{ "t":1, "e":[4.33, 3.00, 2.33] },
{ "t":2, "e":[2.66, 3.00, 3.00] },
{ "t":3, "e":[4.33, 3.00, 5.00] }
]
Here, 4.33 is the average of every first test ( e[0] ), but just of the first second ( t=1 ), of every person.
One option is to $unwind b, $group the e arrays by their t value, and use $zip to transpose them before calculating the averages:
db.collection.aggregate([
{$unwind: "$b"},
{$group: {_id: "$b.t", data: {$push: "$b.e"}}},
{$set: {data: {$zip: {inputs: [
{$arrayElemAt: ["$data", 0]},
{$arrayElemAt: ["$data", 1]},
{$arrayElemAt: ["$data", 2]}
]
}
}
}
},
{$project: {
t: "$_id",
e: {$map: {input: "$data", in: {$trunc: [{$avg: "$$this"}, 2]}}}
}
},
{$sort: {t: 1}},
{$group: {_id: 0, average: {$push: {t: "$t", e: "$e"}}}},
{$unset: "_id"}
])
See how it works on the playground example - zip
Another option is to $unwind twice and build the whole calculation from pieces. It has more stages, but the advantage is that you don't need to hard-code one $arrayElemAt per array passed to $zip (one per document for each t value here):
db.collection.aggregate([
{$project: {b: 1, t: 1, _id: 0}},
{$unwind: "$b"},
{$unwind: {path: "$b.e", includeArrayIndex: "index"}},
{$group: {_id: {t: "$b.t", index: "$index"}, data: {$push: "$b.e"}}},
{$sort: {"_id.index": 1}},
{$group: {_id: "$_id.t", average: {$push: {$avg: "$data"}}}},
{$sort: {_id: 1}},
{$group: {_id: 0, average: {$push: {t: "$_id", e: "$average"}}}},
{$unset: "_id"}
])
See how it works on the playground example - unwind twice
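To check the numbers either pipeline should produce, here is a plain JavaScript sketch of the same column-wise average. The function name `averagePerSecond` is hypothetical, no rounding is applied, and it assumes every "e" array for a given "t" has the same length.

```javascript
// Sample data from the question.
const participants = [
  { a: { foo: 1, bar: 100 }, b: [{ t: 1, e: [3, 4, 5] }, { t: 2, e: [4, 4, 4] }, { t: 3, e: [7, 4, 7] }] },
  { a: { foo: 2, bar: 111 }, b: [{ t: 1, e: [9, 4, 0] }, { t: 2, e: [1, 4, 2] }, { t: 3, e: [3, 4, 5] }] },
  { a: { foo: 4, bar: 200 }, b: [{ t: 1, e: [1, 4, 2] }, { t: 2, e: [3, 1, 3] }, { t: 3, e: [2, 4, 1] }] },
];

function averagePerSecond(docs) {
  // t -> list of e arrays, like $unwind followed by $group with $push.
  const byT = new Map();
  for (const doc of docs) {
    for (const { t, e } of doc.b) {
      if (!byT.has(t)) byT.set(t, []);
      byT.get(t).push(e);
    }
  }
  return [...byT.entries()]
    .sort(([t1], [t2]) => t1 - t2)
    .map(([t, rows]) => ({
      t,
      // Column-wise mean: walk the indexes of the first row, average each column.
      e: rows[0].map((_, i) => rows.reduce((sum, row) => sum + row[i], 0) / rows.length),
    }));
}

const average = averagePerSecond(participants);
console.log(JSON.stringify(average));
```

Running this against the sample data is a quick way to compare the pipeline output with hand-computed values.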
Let's say I have these collections members and positions
[
{
"church":"60dbb265a75a610d90b45c6b", "parentId":"60dbb265a75a610d90b45c6b", name: "Jonah John", status: 1, birth: "1983-01-01", position: "60f56f59-08be-49ec-814a-2a421f21bc08"
},
{
"church":"60dbb265a75a610d90b45c6b", "parentId":"60dbb265a75a610d90b45c6b", name: "March John", status: 1, birth: "1981-01-23", position: "60f56f59-08be-49ec-814a-2a421f21bc08"
},
{
"church":"60dbb265a75a610d90b45c6b", "parentId":"60dbb265a75a610d90b45c6b",name: "Jessy John", status: 0, birth: "1984-08-01", position: "e5bba609-082c-435a-94e3-0997fd229851"
}
]
[
{_id: "60f56f59-08be-49ec-814a-2a421f21bc08", name: "Receptionist"},
{_id: "5c78ba5a-3e6c-4d74-8d4a-fa23d02b8003", name: "Curtain"},
{_id: "e5bba609-082c-435a-94e3-0997fd229851", name: "Doorman"}
]
I want to aggregate in a way I can get:
inactiveMembers
activeMembers
totalMembers
totalPositionsOcuppied
And two arrays with:
positionsOcuppied {name, quantity}
birthdays {month, quantity}
I need an output like this:
{
"_id": {
"church":"60dbb265a75a610d90b45c6b",
"parentId":"60dbb265a75a610d90b45c6b"
},
"inactiveMembers":1,
"activeMembers":2,
"totalMembers":3,
"birthdays": [
{january:2}, {august:1}
],
"positionsOcuppied": [
{Doorman: 1}, {Receptionist:2}
],
"totalPositionsOcuppied": 3
}
How can I do that?
PS.: Very sorry for unclear values...
Update:
$addFields with birthMonth string
$lookup to add positions
$facet to $group by birthdays, positionsOcuppied, and all docs together as other
$map to format birthdays and positionsOcuppied
Format the answer
db.people.aggregate([
{$addFields: {
birthMonth: {
$arrayElemAt: [
["","Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"],
{$month: {$toDate: "$birth"}}
]
}
}
},
{$lookup: {from: "positions", localField: "position", foreignField: "_id",
as: "position"}},
{$facet: {
birthdays: [{$group: {_id: "$birthMonth", count: {$sum: 1}}}],
positionsOcuppied: [{$group: {_id: {$first: "$position.name"}, count: {$sum: 1}}}],
other: [
{$group: {_id: 0,
activeMembers: {$sum: "$status"},
totalMembers: {$sum: 1},
church: {$first: "$church"},
parentId: {$first: "$parentId"},
totalPositionsOcuppied: {$sum: {$size: "$position"}}
}
}
]
}
},
{$set: {
birthdays: {
$map: {input: "$birthdays", in: [{k: "$$this._id", v: "$$this.count"}]}
},
positionsOcuppied: {
$map: {input: "$positionsOcuppied", in: [{k: "$$this._id", v: "$$this.count"}]}
},
other: {$first: "$other"}
}
},
{$set: {
"other.birthdays": {
$map: {input: "$birthdays", in: {$arrayToObject: "$$this"}}
},
"other.positionsOcuppied": {
$map: {input: "$positionsOcuppied", in: {$arrayToObject: "$$this"}}
},
"other.inactiveMembers": {
$subtract: ["$other.totalMembers", "$other.activeMembers"]
},
"other._id": {church: "$other.church", parentId: "$other.parentId"},
birthdays: "$$REMOVE",
"other.church": "$$REMOVE",
"other.parentId": "$$REMOVE",
positionsOcuppied: "$$REMOVE"
}
},
{$replaceRoot: {newRoot: "$other"}}
])
See how it works on the playground example
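The steps above can also be sketched in plain JavaScript to see which values the final document should contain. The helper names `tally` and `summarize` are hypothetical; like the `_id: 0` group in the pipeline, the sketch assumes all members share one church and parentId, and it uses the same short month names as the pipeline.

```javascript
const members = [
  { church: "60dbb265a75a610d90b45c6b", parentId: "60dbb265a75a610d90b45c6b", name: "Jonah John", status: 1, birth: "1983-01-01", position: "60f56f59-08be-49ec-814a-2a421f21bc08" },
  { church: "60dbb265a75a610d90b45c6b", parentId: "60dbb265a75a610d90b45c6b", name: "March John", status: 1, birth: "1981-01-23", position: "60f56f59-08be-49ec-814a-2a421f21bc08" },
  { church: "60dbb265a75a610d90b45c6b", parentId: "60dbb265a75a610d90b45c6b", name: "Jessy John", status: 0, birth: "1984-08-01", position: "e5bba609-082c-435a-94e3-0997fd229851" },
];
const positions = [
  { _id: "60f56f59-08be-49ec-814a-2a421f21bc08", name: "Receptionist" },
  { _id: "5c78ba5a-3e6c-4d74-8d4a-fa23d02b8003", name: "Curtain" },
  { _id: "e5bba609-082c-435a-94e3-0997fd229851", name: "Doorman" },
];
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Count occurrences and emit one {key: count} object per distinct key.
function tally(keys) {
  const counts = {};
  for (const k of keys) counts[k] = (counts[k] ?? 0) + 1;
  return Object.entries(counts).map(([k, v]) => ({ [k]: v }));
}

function summarize(memberDocs, positionDocs) {
  const positionName = new Map(positionDocs.map((p) => [p._id, p.name])); // like $lookup
  const occupied = memberDocs.filter((m) => positionName.has(m.position));
  const active = memberDocs.filter((m) => m.status === 1).length;
  return {
    _id: { church: memberDocs[0].church, parentId: memberDocs[0].parentId },
    inactiveMembers: memberDocs.length - active,
    activeMembers: active,
    totalMembers: memberDocs.length,
    // Month is read from the "YYYY-MM-DD" string to avoid time zone effects.
    birthdays: tally(memberDocs.map((m) => MONTHS[Number(m.birth.slice(5, 7)) - 1])),
    positionsOcuppied: tally(occupied.map((m) => positionName.get(m.position))),
    totalPositionsOcuppied: occupied.length,
  };
}

const summary = summarize(members, positions);
console.log(JSON.stringify(summary, null, 2));
```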
I have the following documents:
_id: "Team 1"
count: 1200
_id: "Team 2"
count: 1170
_id: "Team 3"
count: 1006
_id: "Team 4"
count: 932
_id: "Team 5"
count: 931
_id: "Team 6"
count: 899
_id: "Team 7"
count: 895
The list is already sorted and everything, I just need to project this as an array of top 5 based on count and then the rest should be summed as 'others'. If possible I'd like to also add the percentage that each element in the list makes up of the full count. Like this:
[
{"name":"Team 1", "count":1200, "percent":25},
{"name":"Team 2", "count":1170,"percent":15},
{"name":"Team 3", "count":1006,"percent":10},
{"name":"Team 4", "count":932,"percent":5},
{"name":"Team 5", "count":931,"percent":5},
{"name":"Other", "count":1794, "percent":40}
]
Query
$setWindowFields to sort and add the sort-rank to each document
group by null with 3 accumulators
push the first 5 documents unchanged
sum the count of the rest (rank>5)
total sum
$map to divide the counts with the total sum for the 5 top documents, to get the percentage also
add also the percentage for the rest of documents
unwind and replace the root, with those documents that have count and percentage
Playmongo (put the mouse at the end of each stage to see the stage in and out)
db.collection.aggregate(
[{"$setWindowFields":
{"output": {"rank": {"$rank": {}}}, "sortBy": {"count": -1}}},
{"$group":
{"_id": null,
"top5":
{"$push": {"$cond": [{"$lte": ["$rank", 5]}, "$$ROOT", "$$REMOVE"]}},
"other": {"$sum": {"$cond": [{"$lte": ["$rank", 5]}, 0, "$count"]}},
"all": {"$sum": "$count"}}},
{"$project":
{"_id": 0,
"docs":
{"$concatArrays":
[{"$map":
{"input": "$top5",
"in":
{"name": "$$this._id",
"count": "$$this.count",
"percentage":
{"$multiply": [{"$divide": ["$$this.count", "$all"]}, 100]}}}},
[{"name": "other",
"count": "$other",
"percentage":
{"$multiply": [{"$divide": ["$other", "$all"]}, 100]}}]]}}},
{"$unwind": "$docs"}, {"$replaceRoot": {"newRoot": "$docs"}}])
Another way to do it is with $facet, since $setWindowFields only works with MongoDB v5 or later:
mongoPlayground
db.collection.aggregate([
{ $sort: { count: -1 } },
{
"$facet": {
others: [
{ "$skip": 5 },
{
"$group": {
"_id": "others",
"count": { "$sum": "$count" }
}
}
],
top5: [ { "$limit": 5 } ]
}
},
{
"$project": { result: { "$concatArrays": [ "$others", "$top5" ] } }
},
{
"$addFields": { totalCount: { "$sum": "$result.count" } }
},
{ $unwind: "$result" },
{
$project: {
_id: "$result._id",
count: "$result.count",
percent: {
$round: [
{ "$multiply": [ { $divide: [ "$result.count", "$totalCount" ] }, 100 ] },
0
]
}
}
}
])
If you have MongoDB version 5.0 or higher you can use $setWindowFields like in @Takis's nice answer. Otherwise, you can $group, $slice and $reduce your way to the answer:
$sort to have the highest count on top and group to put them all in one array called all and to $sum up.
$slice the all array to keep only the top N.
$reduce the top N to sum them up.
Add the others to the top N array with count sum-sum(topN)
$unwind and format
db.collection.aggregate([
{$sort: {count: -1}},
{$group: {_id: null, all: {$push: "$$ROOT"}, sum: {$sum: "$count"}}},
{$project: {_id: null, sum: 1, res: {$slice: ["$all", 5]}}},
{$project: {sum: 1, res: 1, topN: {
$reduce: {
input: "$res",
initialValue: 0,
in: {$add: ["$$value", "$$this.count"]}
}
}
}
},
{
$project: {_id: 0, sum: 1, res: {
$concatArrays: [
[{_id: "other", count: {$subtract: ["$sum", "$topN"]}}],
"$res"
]
}
}
},
{$unwind: "$res"},
{$project: {_id: "$res._id", count: "$res.count",
percent: { $round: [{$multiply:
[{$divide: ["$res.count", "$sum"]}, 100]}, 0]
}
}
}
])
Playground example
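The same "top N plus other" calculation can be sketched in plain JavaScript to sanity-check the counts and percentages. The function name `topNWithOther` is hypothetical; note that Math.round rounds halves up, while $round in MongoDB rounds halves to even, so results could differ for values ending in exactly .5.

```javascript
// Sample documents from the question.
const teams = [
  { _id: "Team 1", count: 1200 },
  { _id: "Team 2", count: 1170 },
  { _id: "Team 3", count: 1006 },
  { _id: "Team 4", count: 932 },
  { _id: "Team 5", count: 931 },
  { _id: "Team 6", count: 899 },
  { _id: "Team 7", count: 895 },
];

function topNWithOther(docs, n = 5) {
  // Highest count first, like {$sort: {count: -1}}.
  const sorted = [...docs].sort((a, b) => b.count - a.count);
  const total = sorted.reduce((sum, d) => sum + d.count, 0);
  const top = sorted.slice(0, n).map((d) => ({ name: d._id, count: d.count }));
  const topSum = top.reduce((sum, d) => sum + d.count, 0);
  // Everything outside the top n is folded into a single "Other" row.
  const rows = [...top, { name: "Other", count: total - topSum }];
  return rows.map((r) => ({ ...r, percent: Math.round((r.count / total) * 100) }));
}

const ranking = topNWithOther(teams);
console.log(ranking);
```

The "Other" row here sums Team 6 and Team 7, which gives the 1794 shown in the question.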