Filter nested objects - mongodb

I have a collection of docs like
{'id':1, 'score': 1, created_at: ISODate(...)}
{'id':1, 'score': 2, created_at: ISODate(...)}
{'id':2, 'score': 1, created_at: ISODate(...)}
{'id':2, 'score': 20, created_at: ISODate(...)}
etc.
Does anyone know how to find docs that were created within the past 24hrs where the difference of the score value between the two most recent docs of the same id is less than 5?
So far I can only find all docs created within the past 24hrs:
[{
$project: {
_id: 0,
score: 1,
created_at: 1
}
}, {
$match: {
$expr: {
$gte: [
'$created_at',
{
$subtract: [
'$$NOW',
86400000
]
}
]
}
}
}]
Any advice appreciated.
Edit: By the two most recent docs, the oldest of the two can be created more than 24hrs ago. So the most recent doc would be created within the past 24hrs, but the oldest doc could be created over 24hrs ago.

If I understand you correctly, you want something like:
db.collection.aggregate([
{$match: {$expr: {$gte: ["$created_at", {$subtract: ["$$NOW", 86400000]}]}}},
{$sort: {created_at: -1}},
{$group: {_id: "$id", data: {$push: "$$ROOT"}}},
{$project: {pair: {$slice: ["$data", 0, 2]}, scores: {$slice: ["$data.score", 0, 2]}}},
{$match: {$expr: {
$lt: [{$abs: {$subtract: [{$first: "$scores"}, {$last: "$scores"}]}}, 5]
}}},
{$unset: "scores"}
])
See how it works on the playground example
EDIT:
According to your comment, one option is:
db.collection.aggregate([
{$setWindowFields: {
partitionBy: "$id",
sortBy: {created_at: -1},
output: {data: {$push: "$$ROOT", window: {documents: ["current", 1]}}}
}},
{$group: {
_id: "$id",
created_at: {$first: "$created_at"},
pair: {$first: "$data"}
}},
{$match: {$expr: {$and: [
{$gte: ["$created_at", {$dateAdd: {startDate: "$$NOW", unit: "day", amount: -1}}]},
{$eq: [{$size: "$pair"}, 2]},
{$lt: [{$abs: {$subtract: [{$first: "$pair.score"},
{$last: "$pair.score"}]}}, 5]}
]}}},
{$project: {_id: 0, pair: 1}}
])
See how it works on the playground example
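To sanity-check what either pipeline returns, the rule from the question can be written as a small in-memory function. This is only a sketch in plain JavaScript, not a MongoDB API: the name recentPairsWithSmallDiff and its parameters are made up for illustration, created_at is assumed to be a Date, and it uses a strict "less than 5" as the question states.

```javascript
// Hypothetical helper: for each id, take the two most recent docs and keep the
// pair when the newest doc is within the window and the scores differ by < maxDiff.
function recentPairsWithSmallDiff(docs, now, maxDiff = 5, windowMs = 24 * 60 * 60 * 1000) {
  // Group documents by their `id` field.
  const byId = new Map();
  for (const doc of docs) {
    if (!byId.has(doc.id)) byId.set(doc.id, []);
    byId.get(doc.id).push(doc);
  }

  const result = [];
  for (const [id, group] of byId) {
    // Newest first, then keep only the two most recent documents.
    const pair = [...group].sort((a, b) => b.created_at - a.created_at).slice(0, 2);
    if (pair.length < 2) continue; // an id with a single doc has no pair to compare

    // Only the newest doc must fall inside the window; the older one may not.
    const newestIsRecent = now - pair[0].created_at <= windowMs;
    const diff = Math.abs(pair[0].score - pair[1].score);
    if (newestIsRecent && diff < maxDiff) result.push({ id, pair });
  }
  return result;
}
```

Running it over a small sample of your own documents and comparing with the aggregation output is a quick way to confirm the window and the comparison operator behave as intended.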

If I've understood correctly you can try this query:
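First, the $match you already have, to get documents created within the last day.
Then $sort by date so the most recent documents come first.
$group by id; because the most recent documents come first, the first two elements of the array built with $push are the two most recent scores.
Now you only need the absolute difference ($abs of $subtract) between these two values.
Finally, filter again to keep only the ids whose difference is less than ($lt) 5.
db.collection.aggregate([
{
$match: {
$expr: {
$gte: [
"$created_at",
{
$subtract: [
"$$NOW",
86400000
]
}
]
}
}
},
{
"$sort": {
"created_at": -1
}
},
{
"$group": {
"_id": "$id",
"score": {
"$push": "$score"
}
}
},
{
"$project": {
// absolute difference between the two most recent scores;
// evaluates to null (and is filtered out below) when an id has only one document
"score": {
"$let": {
"vars": {
"top2": {
"$firstN": {
"n": 2,
"input": "$score"
}
}
},
"in": {
"$abs": {
"$subtract": [
{ "$arrayElemAt": ["$$top2", 0] },
{ "$arrayElemAt": ["$$top2", 1] }
]
}
}
}
}
}
},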
{
"$match": {
"score": {
"$lt": 5
}
}
}
])
Example here
Edit: $firstN is new in version 5.2. On older versions, you can use $slice instead.
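For reference, the $slice alternative could look roughly like the expression below. It is written as a plain JavaScript object so it can be dropped in wherever the $firstN expression appears; the constant name firstTwoScores and the helper firstTwo are only illustrative.

```javascript
// Expression that could replace { $firstN: { n: 2, input: "$score" } }
// on servers older than 5.2: $slice with a positive count keeps the first n elements.
const firstTwoScores = { $slice: ["$score", 2] };

// Plain-JavaScript equivalent of what the expression evaluates to (illustration only).
function firstTwo(scores) {
  return scores.slice(0, 2);
}
```

Because the array was pushed after sorting by created_at descending, the first two elements are the two most recent scores.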

Related

How do I group and count values by value range in MongoDB

I have the following documents in my MongoDB:
_id: ObjectId(...)
'timestamp': 2022-11-03T10:00:00.000+00:00
score: 1
_id: ObjectId(...)
'timestamp': 2022-11-03T09:00:00.000+00:00
score: 3
_id: ObjectId(...)
'timestamp': 2022-11-03T10:00:00.000+00:00
score: 6
_id: ObjectId(...)
'timestamp': 2022-11-03T10:00:00.000+00:00
score: 10
I want to make an aggregation that counts the score within the range of (gte)1-(lt)5 as poor, (gte)5-(lt)7 as ok, (gte)7-(lt)8.5 as good and (gte)8.5-(lte)10 as excellent.
So the result would look like this:
{
"data": [
{
"name": "excellent",
"count": 1
},
{
"name": "good",
"count": 0
},
{
"name": "ok",
"count": 1
},
{
"name": "poor",
"count": 2
}
]
}
How do I achieve that?
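If it is acceptable to return only the categories that contain at least one document, you can do:
db.collection.aggregate([
{$project: {
_id: {
$switch: {
branches: [
{case: {$lt: ["$score", 5]}, then: "poor"},
{case: {$lt: ["$score", 7]}, then: "ok"},
{case: {$lt: ["$score", 8.5]}, then: "good"}
],
default: "excellent"
}}
}},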
{$group: {_id: "$_id", count: {$sum: 1}}}
])
Otherwise you need to create all categories:
db.collection.aggregate([
{$group: {
_id: 0,
excellent: {$sum: {$cond: [{$gte: ["$score", 8.5]}, 1, 0]}},
good: {$sum: {$cond: [{$and: [{$gte: ["$score", 7]}, {$lt: ["$score", 8.5]}]}, 1, 0]}},
ok: {$sum: {$cond: [{$and: [{$gte: ["$score", 5]}, {$lt: ["$score", 7]}]}, 1, 0]}},
poor: {$sum: {$cond: [{$lt: ["$score", 5]}, 1, 0]}}
}},
{$unset: "_id"},
{$project: {data: {$objectToArray: "$$ROOT"}}},
{$project: {
data: {$map: {
input: "$data",
in: {name: "$$this.k", count: "$$this.v"}
}}
}}
])
See how it works on the playground example
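If it helps to check the expected counts against the ranges stated in the question (poor below 5, ok from 5 to below 7, good from 7 to below 8.5, excellent from 8.5 up), here is a small in-memory sketch in plain JavaScript. The function name countByRange is made up, and it assumes every score is a number between 1 and 10.

```javascript
// Hypothetical helper: count scores per category, keeping categories with zero hits.
function countByRange(scores) {
  const counts = { excellent: 0, good: 0, ok: 0, poor: 0 };
  for (const s of scores) {
    if (s >= 8.5) counts.excellent += 1;
    else if (s >= 7) counts.good += 1;
    else if (s >= 5) counts.ok += 1;
    else counts.poor += 1;
  }
  // Same shape as the desired output: an array of {name, count} objects under "data".
  return { data: Object.entries(counts).map(([name, count]) => ({ name, count })) };
}

// For the four sample scores 1, 3, 6 and 10:
// excellent 1, good 0, ok 1, poor 2
console.log(JSON.stringify(countByRange([1, 3, 6, 10])));
```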

Mongo: Average on each position of a nested array for multiple documents

I'm receiving an array of documents; each document has the data of one participant of a study.
"a" has some anatomic metrics, here represented as "foo" and "bar" (e.g. height, weight, etc.).
"b" has the performance per second on other tests:
"t" is the time in seconds and
"e" are the test results measured at that specific time (e.g. cardiac rhythm, blood pressure, temperature, etc.).
Example of data:
[
{
"a": { "foo":1, "bar": 100 },
"b": [
{ "t":1, "e":[3,4,5] },
{ "t":2, "e":[4,4,4] },
{ "t":3, "e":[7,4,7] }
],
},
{
"a": { "foo":2, "bar": 111 },
"b": [
{ "t":1, "e":[9,4,0] },
{ "t":2, "e":[1,4,2] },
{ "t":3, "e":[3,4,5] }
],
},
{
"a": { "foo":4, "bar": 200 },
"b": [
{ "t":1, "e":[1,4,2] },
{ "t":2, "e":[3,1,3] },
{ "t":3, "e":[2,4,1] }
],
}
]
I'm trying to get some averages of the participants.
I already managed to get the averages of the anatomic values stored in "a".
I used:
db.collection.aggregate([
{
$group: {
_id: null,
barAvg: {
$avg: {
$avg: "$a.bar"
}
}
}
}
])
However, I'm failing to get the average of every test per second. So that would be the average on every "t" of every individual element of "e".
Expected result:
"average": [
{ "t":1, "e":[4.33, 4.00, 2.33] },
{ "t":2, "e":[2.66, 3.00, 3.00] },
{ "t":3, "e":[4.00, 4.00, 4.33] }
]
Here, 4.33 is the average of every first test ( e[0] ), but just of the first second ( t=1 ), of every person.
One option is to $unwind b so that the entries can be grouped by their t value, and then use $zip to transpose the collected e arrays before calculating the averages:
db.collection.aggregate([
{$unwind: "$b"},
{$group: {_id: "$b.t", data: {$push: "$b.e"}}},
{$set: {data: {$zip: {inputs: [
{$arrayElemAt: ["$data", 0]},
{$arrayElemAt: ["$data", 1]},
{$arrayElemAt: ["$data", 2]}
]
}
}
}
},
{$project: {
t: "$_id",
e: {$map: {input: "$data", in: {$trunc: [{$avg: "$$this"}, 2]}}}
}
},
{$sort: {t: 1}},
{$group: {_id: 0, average: {$push: {t: "$t", e: "$e"}}}},
{$unset: "_id"}
])
See how it works on the playground example - zip
Another option is to $unwind twice and build the entire calculation from pieces. It takes more stages, but the advantage is that you don't need to hard-code one $arrayElemAt per array being zipped:
db.collection.aggregate([
{$project: {b: 1, t: 1, _id: 0}},
{$unwind: "$b"},
{$unwind: {path: "$b.e", includeArrayIndex: "index"}},
{$group: {_id: {t: "$b.t", index: "$index"}, data: {$push: "$b.e"}}},
{$sort: {"_id.index": 1}},
{$group: {_id: "$_id.t", average: {$push: {$avg: "$data"}}}},
{$sort: {_id: 1}},
{$group: {_id: 0, average: {$push: {t: "$_id", e: "$average"}}}},
{$unset: "_id"}
])
See how it works on the playground example - unwind twice
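The transpose-then-average idea behind both pipelines can also be sketched in plain JavaScript, which may help when checking the numbers by hand. The function name averagePerSecond is hypothetical, values are truncated to two decimals like $trunc does, and it assumes all "e" arrays for a given "t" have the same length.

```javascript
// Hypothetical helper: average each position of "e" across participants, per "t".
function averagePerSecond(participants) {
  // Collect, for every t, the list of "e" arrays (one per participant).
  const byT = new Map();
  for (const p of participants) {
    for (const { t, e } of p.b) {
      if (!byT.has(t)) byT.set(t, []);
      byT.get(t).push(e);
    }
  }

  // Truncate (not round) to two decimals.
  const trunc2 = (x) => Math.trunc(x * 100) / 100;

  return [...byT.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([t, rows]) => ({
      t,
      // Column i of the transposed rows is the i-th test of every participant.
      e: rows[0].map((_, i) =>
        trunc2(rows.reduce((sum, row) => sum + row[i], 0) / rows.length)
      ),
    }));
}
```

Unlike the $zip version, this does not depend on how many participants there are, which is the same property the double $unwind pipeline has.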

Count the documents and sum of values of fields in all documents of a mongodb

I have a set of documents modified from mongodb using
[{"$project":{"pred":1, "base-url":1}},
{"$group":{
"_id":"$base-url",
"invalid":{"$sum": { "$cond": [{ "$eq": ["$pred", "invalid"] }, 1, 0] }},
"pending":{"$sum": { "$cond": [{ "$eq": ["$pred", "null"] }, 1, 0] }},
}},
]
to get the below documents
[{'_id': 'https://www.example1.org/', 'invalid': 3, 'pending': 6},
{'_id': 'https://example2.com/', 'invalid': 10, 'pending': 4},
{'_id': 'https://www.example3.org/', 'invalid': 2, 'pending': 6}]
How can I get the count of documents and the sum of the other fields, to obtain the following result?
{"count":3, "invalid":15,"pending":16}
You just need a $group stage with $sum (see the playground).
The $sum documentation has good examples.
db.collection.aggregate([
{
$group: {
_id: null,
pending: {
$sum: "$pending"
},
invalid: {
$sum: "$invalid"
},
count: {
$sum: 1 //counting each record
}
}
},
{
$project: {
_id: 0 //removing _id field from the final output
}
}
])
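For comparison, the same count-and-sum can be expressed as a single reduce in plain JavaScript. This is just an in-memory sketch to show what the $group stage computes; the function name summarize is made up.

```javascript
// Hypothetical helper: one pass over the grouped documents, accumulating
// a document count plus the sums of "invalid" and "pending".
function summarize(docs) {
  return docs.reduce(
    (acc, doc) => ({
      count: acc.count + 1,
      invalid: acc.invalid + doc.invalid,
      pending: acc.pending + doc.pending,
    }),
    { count: 0, invalid: 0, pending: 0 }
  );
}

// With the three documents from the question this gives
// {"count":3,"invalid":15,"pending":16}
console.log(JSON.stringify(summarize([
  { _id: "https://www.example1.org/", invalid: 3, pending: 6 },
  { _id: "https://example2.com/", invalid: 10, pending: 4 },
  { _id: "https://www.example3.org/", invalid: 2, pending: 6 },
])));
```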

How can I project top 5 counts and sum the rest in MongoDB?

I have the following documents:
_id: "Team 1"
count: 1200
_id: "Team 2"
count: 1170
_id: "Team 3"
count: 1006
_id: "Team 4"
count: 932
_id: "Team 5"
count: 931
_id: "Team 6"
count: 899
_id: "Team 7"
count: 895
The list is already sorted and everything, I just need to project this as an array of top 5 based on count and then the rest should be summed as 'others'. If possible I'd like to also add the percentage that each element in the list makes up of the full count. Like this:
[
{"name":"Team 1", "count":1200, "percent":25},
{"name":"Team 2", "count":1170,"percent":15},
{"name":"Team 3", "count":1006,"percent":10},
{"name":"Team 4", "count":932,"percent":5},
{"name":"Team 5", "count":931,"percent":5},
{"name":"Other", "count":1794, "percent":40}
]
Query
$setWindowFields to sort and add the sort-rank to each document
group by null with 3 accumulators
push the first 5 documents unchanged
sum the count of the rest (rank>5)
total sum
$map to divide the counts with the total sum for the 5 top documents, to get the percentage also
add also the percentage for the rest of documents
unwind and replace the root, with those documents that have count and percentage
Playmongo (put the mouse at the end of each stage to see the stage in and out)
db.collection.aggregate(
[{"$setWindowFields":
{"output": {"rank": {"$rank": {}}}, "sortBy": {"count": -1}}},
{"$group":
{"_id": null,
"top5":
{"$push": {"$cond": [{"$lte": ["$rank", 5]}, "$$ROOT", "$$REMOVE"]}},
"other": {"$sum": {"$cond": [{"$lte": ["$rank", 5]}, 0, "$count"]}},
"all": {"$sum": "$count"}}},
{"$project":
{"_id": 0,
"docs":
{"$concatArrays":
[{"$map":
{"input": "$top5",
"in":
{"name": "$$this._id",
"count": "$$this.count",
"percentage":
{"$multiply": [{"$divide": ["$$this.count", "$all"]}, 100]}}}},
[{"name": "other",
"count": "$other",
"percentage":
{"$multiply": [{"$divide": ["$other", "$all"]}, 100]}}]]}}},
{"$unwind": "$docs"}, {"$replaceRoot": {"newRoot": "$docs"}}])
Another way to do it is with $facet, since $setWindowFields only works with MongoDB v5.0 or later:
mongoPlayground
db.collection.aggregate([
{ $sort: { count: -1 } },
{
"$facet": {
others: [
{ "$skip": 5 },
{
"$group": {
"_id": "others",
"count": { "$sum": "$count" }
}
}
],
top5: [ { "$limit": 5 } ]
}
},
{
"$project": { result: { "$concatArrays": [ "$top5", "$others" ] } }
},
{
"$addFields": { totalCount: { "$sum": "$result.count" } }
},
{ $unwind: "$result" },
{
$project: {
_id: "$result._id",
count: "$result.count",
percent: {
$round: [
{ "$multiply": [ { $divide: [ "$result.count", "$totalCount" ] }, 100 ] },
0
]
}
}
}
])
If you have MongoDB version 5.0 or higher, you can use $setWindowFields as in @Takis's nice answer. Otherwise, you can $group, $slice and $reduce your way to the answer:
$sort to put the highest counts on top, then $group to collect all documents in one array called all and to $sum the counts.
$slice the all array to keep only the top N.
$reduce the top N to sum their counts.
Add an "other" entry to the top N array, with count = sum - sum(topN).
$unwind and format.
db.collection.aggregate([
{$sort: {count: -1}},
{$group: {_id: null, all: {$push: "$$ROOT"}, sum: {$sum: "$count"}}},
{$project: {_id: null, sum: 1, res: {$slice: ["$all", 5]}}},
{$project: {sum: 1, res: 1, topN: {
$reduce: {
input: "$res",
initialValue: 0,
in: {$add: ["$$value", "$$this.count"]}
}
}
}
},
{
$project: {_id: 0, sum: 1, res: {
$concatArrays: [
"$res",
[{_id: "other", count: {$subtract: ["$sum", "$topN"]}}]
]
}
}
},
{$unwind: "$res"},
{$project: {_id: "$res._id", count: "$res.count",
percent: { $round: [{$multiply:
[{$divide: ["$res.count", "$sum"]}, 100]}, 0]
}
}
}
])
Playground example
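To see what numbers to expect from any of these pipelines, the top-N-plus-other logic can be sketched in plain JavaScript. The function name topNWithOther is hypothetical, percentages are rounded to whole numbers as in the $round versions, and a non-zero total count is assumed.

```javascript
// Hypothetical helper: keep the n largest counts, fold the rest into "Other",
// and attach each entry's share of the overall total as a rounded percentage.
function topNWithOther(teams, n = 5) {
  const sorted = [...teams].sort((a, b) => b.count - a.count);
  const total = sorted.reduce((sum, t) => sum + t.count, 0);
  const percent = (count) => Math.round((count / total) * 100);

  const top = sorted.slice(0, n).map((t) => ({
    name: t._id,
    count: t.count,
    percent: percent(t.count),
  }));

  // Everything after the first n entries is summed into a single bucket.
  const otherCount = sorted.slice(n).reduce((sum, t) => sum + t.count, 0);
  return [...top, { name: "Other", count: otherCount, percent: percent(otherCount) }];
}
```

With the seven teams from the question the total is 7033, so "Team 1" comes out at 17 percent and "Other" (899 + 895 = 1794) at 26 percent.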

how to get last character from a string in mongodb?

Data:
{code: "XXXXXXXX1", total: 400},
{code: "YYYYY2", total: 500},
{code: "ZZZZZZ3", total: 100},
{code: "AAA5", total: 200}
I want to create an aggregation that groups the data above by the last character of the code field. The code field is a string and can vary in length. I only want to get its last character/number. Something like:
db.transactions.aggregate([
{$project: {
last_index: {$getMyLastCharInMyCode: "$code"},
total: 1
}},
{$group: {_id: "$last_index", total: {$sum: "$total"}}}
])
I searched the internet and mongodb manuals, it seems impossible. Any ideas? Thank you
There you go:
db.transactions.aggregate([{
$addFields: {
"last_index": { $substrCP: [ "$code", { $subtract: [ { $strLenCP: "$code" }, 1 ] }, 1 ] }
}
}])
db.transactions.aggregate([
{"$project": {
total: 1,
code: 1,
last_index: { $substrCP: [ "$code", { $subtract: [ {"$strLenCP": "$code"}, 1 ] }, 1 ]}
}
},
{"$group": {
"_id": "$last_index",
"total": {"$sum": "$total"}
}
}
])
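As a quick cross-check of the grouping, here is the same idea in plain JavaScript: take the last character of each code and sum the totals per character. The function name totalsByLastChar is made up, and it assumes every code is a non-empty string.

```javascript
// Hypothetical helper: sum "total" per last character of "code".
function totalsByLastChar(docs) {
  const totals = {};
  for (const { code, total } of docs) {
    // Array.from splits by code point, so multi-byte characters stay intact.
    const chars = Array.from(code);
    const last = chars[chars.length - 1];
    totals[last] = (totals[last] || 0) + total;
  }
  return totals;
}

// With the four sample documents: "1" -> 400, "2" -> 500, "3" -> 100, "5" -> 200
console.log(totalsByLastChar([
  { code: "XXXXXXXX1", total: 400 },
  { code: "YYYYY2", total: 500 },
  { code: "ZZZZZZ3", total: 100 },
  { code: "AAA5", total: 200 },
]));
```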