Get chain of nested documents using MongoDB aggregation framework

Suppose there is a document that contains an array of documents, each of which in turn contains an array of documents, like
{_id: 1, level1: [
{_id: 10, level2: [
{_id: 100},
{_id: 101},
{_id: 102},
]},
{_id: 11, level2: [
{_id: 103},
{_id: 104},
{_id: 105},
]},
{_id: 12, level2: [
{_id: 106},
{_id: 107},
{_id: 108},
]}
]}
and I want to search for an inner (third-level) document by its _id, say _id = 101. I would like to get the inner document and all of its enclosing documents in one result, i.e.
{
doc1: {_id: 1, level1: [
{_id: 10, level2: [
{_id: 100},
{_id: 101},
{_id: 102},
]},
{_id: 11, level2: [
{_id: 103},
{_id: 104},
{_id: 105},
]},
{_id: 12, level2: [
{_id: 106},
{_id: 107},
{_id: 108},
]}
]},
doc2: {_id: 10, level2: [
{_id: 100},
{_id: 101},
{_id: 102},
]},
doc3: {_id: 101}
}
Is it possible to achieve this using the Aggregation Framework?
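One possible approach, sketched under a few assumptions: the collection is called collection, every level1 element has a level2 array, and $filter, $in and $arrayElemAt are available (MongoDB 3.4 or later). The names targetId and chainPipeline are illustrative only. The idea is to match the top-level document, then use $filter to pick the enclosing level1 element and the inner document.

```javascript
// Hypothetical name: the _id of the inner (third-level) document to look for.
const targetId = 101;

const chainPipeline = [
  // Keep only top-level documents that contain the target in level1.level2.
  {$match: {"level1.level2._id": targetId}},
  {$project: {
    _id: 0,
    // The whole top-level document.
    doc1: "$$ROOT",
    // First level1 element whose level2 array contains the target _id.
    doc2: {$arrayElemAt: [
      {$filter: {input: "$level1", cond: {$in: [targetId, "$$this.level2._id"]}}},
      0
    ]}
  }},
  {$project: {
    doc1: 1,
    doc2: 1,
    // The level2 element of doc2 that carries the target _id.
    doc3: {$arrayElemAt: [
      {$filter: {input: "$doc2.level2", cond: {$eq: ["$$this._id", targetId]}}},
      0
    ]}
  }}
];

// Runs only inside the mongo shell, where db is defined.
if (typeof db !== "undefined") {
  printjson(db.collection.aggregate(chainPipeline).toArray());
}
```

If the same inner _id could appear under several level1 elements, this sketch returns only the first match for each top-level document.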

Related

How do I group and count values by value range in MongoDB

I have the following documents in my MongoDB:
_id: ObjectId(...)
'timestamp': 2022-11-03T10:00:00.000+00:00
score: 1
_id: ObjectId(...)
'timestamp': 2022-11-03T09:00:00.000+00:00
score: 3
_id: ObjectId(...)
'timestamp': 2022-11-03T10:00:00.000+00:00
score: 6
_id: ObjectId(...)
'timestamp': 2022-11-03T10:00:00.000+00:00
score: 10
I want to write an aggregation that counts the scores falling in each range: (gte)1-(lt)5 as poor, (gte)5-(lt)7 as ok, (gte)7-(lt)8.5 as good and (gte)8.5-(lte)10 as excellent.
So the result would look like this:
{
"data": [
{
"name": "excellent",
"count": 1
},
{
"name": "good",
"count": 0
},
{
"name": "ok",
"count": 1
},
{
"name": "poor",
"count": 2
}
]
}
How do I achieve that?
If it is acceptable to return only the categories that have at least one document, you can do:
db.collection.aggregate([
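{$project: {
_id: {
// $switch maps each score to the name of its range
$switch: {
branches: [
{case: {$lt: ["$score", 5]}, then: "poor"},
{case: {$lt: ["$score", 7]}, then: "ok"},
{case: {$lt: ["$score", 8.5]}, then: "good"}
],
default: "excellent"
}}
}},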
{$group: {_id: "$_id", count: {$sum: 1}}}
])
Otherwise, to also report categories with a count of zero, you need to create all categories explicitly:
db.collection.aggregate([
{$group: {
_id: 0,
excellent: {$sum: {$cond: [{$gte: ["$score", 8.5]}, 1, 0]}},
good: {$sum: {$cond: [{$and: [{$gte: ["$score", 7]}, {$lt: ["$score", 8.5]}]}, 1, 0]}},
ok: {$sum: {$cond: [{$and: [{$gte: ["$score", 5]}, {$lt: ["$score", 7]}]}, 1, 0]}},
poor: {$sum: {$cond: [{$lt: ["$score", 5]}, 1, 0]}}
}},
{$unset: "_id"},
{$project: {data: {$objectToArray: "$$ROOT"}}},
{$project: {
data: {$map: {
input: "$data",
in: {name: "$$this.k", count: "$$this.v"}
}}
}}
])
See how it works on the playground example

Finding ranges of continuous values

I have the following Mongo collection:
[
{
"key": 1,
"user": "A",
"comment": "commentA1"
},
{
"key": 2,
"user": "A",
"comment": "commentA2"
},
{
"key": 5,
"user": "A",
"comment": "commentA5"
},
{
"key": 2,
"user": "B",
"comment": "commentB2"
},
{
"key": 3,
"user": "B",
"comment": "commentB3"
},
{
"key": 6,
"user": "B",
"comment": "commentB6"
}
]
and I need to find the first continuous keys, with no gaps, per user.
So, for user A I should get the first 2 documents, and for user B the first two also.
The collection might contain more than 2M documents, so the query should work fast.
I have found SQL solutions for this problem (http://www.silota.com/docs/recipes/sql-gap-analysis-missing-values-sequence.html in section number 3), but I am looking for a Mongo solution.
How can I do this in Mongo 4.0 (DocumentDB)?
EDIT: following the further elaboration in the comments, one option is:
db.collection.aggregate([
{$sort: {key: 1}},
{$group: {
_id: "$user",
data: {$push: {key: "$key", comment: "$comment"}},
shadow: {$push: {$add: ["$key", 1]}}
}},
{$project: {
data: 1,
shadow: {$filter: {input: "$shadow", cond: {$in: ["$$this", "$data.key"]}}}
}},
{$project: {data: 1, shadow: 1, firstItem: {$subtract: [{$arrayElemAt: ["$shadow", 0]}, 1]}}},
{$project: {data: 1, firstItem: 1, shadow: {$concatArrays: [["$firstItem"], "$shadow"]}}},
{$project: {
data: 1,
shadow: {$reduce: {
input: {$range: [0, {$size: "$shadow"}]},
initialValue: [],
in: {
$concatArrays: [
"$$value",
{$cond: [
{$eq: [
{$arrayElemAt: ["$shadow", "$$this"]},
{$add: ["$$this", "$firstItem"]}
]},
[{$arrayElemAt: ["$shadow", "$$this"]}],
[]
]},
]
}
}
}
}
},
{$project: {data: {$filter: {input: "$data", cond: {$in: ["$$this.key", "$shadow"]}}}}},
{$unwind: "$data"},
{$project: {comment: "$data.comment", key: "$data.key"}}
])
See how it works on the playground example
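To sanity-check the idea outside the database, the same logic can be mirrored in plain JavaScript. This is only an illustrative sketch, not part of the pipeline, and the names firstConsecutiveRun and sampleComments are made up here. For each user it finds the first key whose successor also exists, then keeps keys for as long as they stay consecutive, which is what the shadow array and the $reduce stage compute above.

```javascript
// Illustrative helper (hypothetical name); mirrors the pipeline's logic in memory.
function firstConsecutiveRun(docs) {
  // Group documents by user in ascending key order (like $sort + $group).
  const byUser = new Map();
  for (const doc of [...docs].sort((a, b) => a.key - b.key)) {
    if (!byUser.has(doc.user)) byUser.set(doc.user, []);
    byUser.get(doc.user).push(doc);
  }
  const result = [];
  for (const [user, data] of byUser) {
    const byKey = new Map(data.map(d => [d.key, d]));
    // First key whose successor also exists: the start of the first run.
    const start = data.map(d => d.key).find(k => byKey.has(k + 1));
    if (start === undefined) continue; // this user has no two consecutive keys
    // Walk forward while the keys stay consecutive.
    for (let k = start; byKey.has(k); k++) {
      result.push({user, key: k, comment: byKey.get(k).comment});
    }
  }
  return result;
}

// The sample collection from the question.
const sampleComments = [
  {key: 1, user: "A", comment: "commentA1"},
  {key: 2, user: "A", comment: "commentA2"},
  {key: 5, user: "A", comment: "commentA5"},
  {key: 2, user: "B", comment: "commentB2"},
  {key: 3, user: "B", comment: "commentB3"},
  {key: 6, user: "B", comment: "commentB6"}
];

console.log(firstConsecutiveRun(sampleComments));
```

Like the pipeline, this sketch returns nothing for a user who has no pair of consecutive keys.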

MongoDB collect / aggregate time series into an array

Following the examples I have two types of data in the same time series
db.weather.insertMany( [
{
"metadata": { "sensorId": 5578, "type": "temperature" },
"timestamp": ISODate("2021-05-18T00:00:00.000Z"),
"temp": 72
},//....
and..
db.weather.insertMany([
{
"metadata": {"sensorId": 5578, "type": "humidity" },
"timestamp": ISODate("2021-05-18T00:00:01Z"),
"humpercent": 78
},//...
and I want to be able to serve simple requests by aggregating the data as:
{
sensorId: 5578,
humidityData: [78, 77, 75 ...],
tempData: [72, 72, 71...]
}
This seems like the obvious use case, but
db.foo.aggregate([{$group: {_id: "$sensorId"}}])
grouped on sensorId only returns the ids with no other fields. Am I missing a simple identity aggregation function or a way to collect values into an array?
What you are looking for is the $addToSet operator:
db.foo.aggregate([{
$group: {
_id: "$metadata.sensorId",
temp: {
$addToSet: "$temp"
},
humidity: {
$addToSet: "$humpercent"
}
}
}])
Note that $addToSet drops duplicate values and that the order of elements in the returned array is not specified.
If all you have is two categories, you can simply $push them:
db.collection.aggregate([
{$sort: {timestamp: 1}},
{$group: {
_id: {sensorId: "$metadata.sensorId"},
temp: {$push: "$temp"},
humidity: {$push: "$humpercent"}
}
}
])
See how it works on the playground example - small
But if you want the generic solution for multiple measurements you need something like:
db.collection.aggregate([
{$sort: {timestamp: 1}},
{$set: {m: "$$ROOT"}},
{$unset: ["m.metadata", "m.timestamp", "m._id"]},
{$set: {m: {$first: {$objectToArray: "$m"}}}},
{$group: {
_id: {type: "$metadata.type", sensorId: "$metadata.sensorId"},
data: {$push: "$m.v"}}
},
{$project: {_id: 0, data: 1, type: {k: "type", v: "$_id.type"}, sensorId: "$_id.sensorId"}},
{$group: {
_id: "$sensorId",
data: {$push: {k: "$type.v", v: "$data"}}
}},
{$project: {_id: 0, data: {"$mergeObjects": [{$arrayToObject: "$data"}, {sensorId: "$_id"}]}
}},
{$replaceRoot: {newRoot: "$data"}}
])
See how it works on the playground example - generic

MongoDB - orderby syntax breaks statement

The following query works until I add the orderby stage; once I add it, running the query throws an exception:
db.games.aggregate([
{$unwind: "$games"},
{$project: {_id: 0, year: {$substr: ["$join_date", 0, 4]},
status: 1,
games: 1}},
{$match: {status: {$ne: "disabled"}}},
{$group: {_id: {"Year": "$year", code: "$games.code"}, "AverageWins": {$avg: "$games.wins"}}},
{$orderby: {"AverageWins": -1}}
])
Can anyone help please?
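The aggregation framework has no $orderby stage; ordering inside a pipeline is done with $sort, so the unrecognized stage name is the likely cause of the exception. Below is a sketch of the same pipeline with only the last stage renamed (the variable name sortedPipeline is illustrative):

```javascript
const sortedPipeline = [
  {$unwind: "$games"},
  {$project: {_id: 0, year: {$substr: ["$join_date", 0, 4]}, status: 1, games: 1}},
  {$match: {status: {$ne: "disabled"}}},
  {$group: {
    _id: {Year: "$year", code: "$games.code"},
    AverageWins: {$avg: "$games.wins"}
  }},
  // $sort replaces the non-existent $orderby stage; -1 means descending.
  {$sort: {AverageWins: -1}}
];

// Runs only inside the mongo shell, where db is defined.
if (typeof db !== "undefined") {
  printjson(db.games.aggregate(sortedPipeline).toArray());
}
```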

How can I make aggregator to output N documents instead of one document with N rows?

db.entities.aggregate([
{$limit: 80000},
{$unwind: '$documents'},
{$unwind: '$documents.subjects'},
{$unwind: '$documents.subjects.journal_entries'},
{$match: {'documents.subjects.journal_entries.date':{$gte: ISODate("2014-01-01T00:00:00.0Z"), $lte: ISODate("2015-01-01T00:00:00.0Z")}}},
{$group: {_id: "$_id", "total": {$sum: '$documents.subjects.journal_entries.amount'}}},
]);
Here is my aggregation query. It outputs a single document with a results field, and that field holds the rows. Each row is basically a document.
I want each row to be its own document. I will be querying a large number of rows, and this behavior of Mongo breaks my querying because the result exceeds the maximum document size (16 MB).
How can I achieve this?
EDIT:
Current output:
{
results: [
{_id: ..., total: ...},
{_id: ..., total: ...},
{_id: ..., total: ...}
{_id: ..., total: ...}
{_id: ..., total: ...}
{_id: ..., total: ...}
]
}
Expected output:
{_id: ..., total: ...};
{_id: ..., total: ...};
{_id: ..., total: ...};
{_id: ..., total: ...};
{_id: ..., total: ...};
{_id: ..., total: ...};
The semicolon separates one document from the next.
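A hedged sketch, assuming MongoDB 2.6 or later: from that version on, aggregate() can return a cursor instead of a single result document, so each group arrives as its own document and the 16 MB limit applies per document rather than to the whole result set. The variable name totalsPipeline and the batch size are illustrative.

```javascript
const totalsPipeline = [
  {$limit: 80000},
  {$unwind: "$documents"},
  {$unwind: "$documents.subjects"},
  {$unwind: "$documents.subjects.journal_entries"},
  // new Date(...) is used here instead of ISODate(...) so the snippet also parses outside the shell.
  {$match: {"documents.subjects.journal_entries.date": {
    $gte: new Date("2014-01-01T00:00:00Z"),
    $lte: new Date("2015-01-01T00:00:00Z")
  }}},
  {$group: {_id: "$_id", total: {$sum: "$documents.subjects.journal_entries.amount"}}}
];

// Runs only inside the mongo shell, where db is defined.
if (typeof db !== "undefined") {
  db.entities
    .aggregate(totalsPipeline, {cursor: {batchSize: 1000}, allowDiskUse: true})
    .forEach(function (doc) {
      // Each doc is one {_id, total} result, delivered as a separate document.
      printjson(doc);
    });
}
```

If the results should be stored rather than iterated, a final $out stage (also available from 2.6) writes them to a collection instead.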