Let's consider the example:
https://docs.mongodb.com/manual/reference/operator/aggregation/last/
I'd like to include additional fields, such as the price or quantity of the last sale. In other words, how do I include a field that is neither part of the key nor an aggregate expression? Is it possible?
Yes, it is possible to get the entire document as the output of a $group stage. The special variable $$ROOT is helpful in such situations. For instance, if you want to get the last processed document for each item, you can use the following code:
db.sales.aggregate([
    { "$sort": { "date": 1 } },
    {
        $group: {
            _id: "$item",
            lastDocument: { $last: "$$ROOT" }
        }
    }
])
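If only a few fields of the last sale are needed (for example price and quantity), $last can be applied to each field instead of to $$ROOT. The sketch below builds such a pipeline as a plain JavaScript array and, so the result can be checked without a server, mirrors its semantics with a small in-memory helper. The helper lastSalePerItem and the sample documents are hypothetical; the field names item, price, quantity and date are assumed to match the sales collection on the linked documentation page.

```javascript
// Pipeline to pass to db.sales.aggregate(lastFieldsPipeline) in mongosh.
const lastFieldsPipeline = [
  { $sort: { date: 1 } },
  {
    $group: {
      _id: "$item",
      lastPrice: { $last: "$price" },
      lastQuantity: { $last: "$quantity" }
    }
  }
];

// Hypothetical in-memory equivalent, for illustration only.
function lastSalePerItem(sales) {
  const sorted = [...sales].sort((a, b) => a.date - b.date);
  const groups = new Map();
  for (const sale of sorted) {
    // Later sales overwrite earlier ones, so the last one wins.
    groups.set(sale.item, {
      _id: sale.item,
      lastPrice: sale.price,
      lastQuantity: sale.quantity
    });
  }
  return [...groups.values()];
}

// Made-up sample data.
const sales = [
  { item: "abc", price: 10, quantity: 2, date: new Date("2014-01-01") },
  { item: "abc", price: 12, quantity: 5, date: new Date("2014-02-03") },
  { item: "xyz", price: 5, quantity: 1, date: new Date("2014-01-15") }
];

const lastSales = lastSalePerItem(sales);
console.log(lastSales);
```

Keeping $$ROOT is more convenient when most fields are needed; naming individual fields keeps the output documents small.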
Related
I have a MongoDB collection whose documents look like this:
{ _id, field: { product: { _id } } }
What I need is to $group by product so that the product _id is unique (no duplicates) while preserving the original document structure. How can I do that?
What I did: nothing so far (just reading the MongoDB docs on how to do this).
The $group stage alters the document structure, so you'll need to add another stage after it to convert the documents back to their original shape. Here is an example that saves the first document for each product _id and then replaces the root with that document:
db.collection.aggregate([
    {
        $group: {
            _id: "$field.product._id",
            firstRoot: {
                $first: "$$ROOT"
            }
        }
    },
    {
        $replaceRoot: {
            newRoot: "$firstRoot"
        }
    }
])
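Without a preceding $sort, which document $first picks for each group depends on the order in which documents reach the $group stage. To illustrate what the two stages compute, here is a plain-JavaScript sketch (not MongoDB API) that keeps the first document seen for each field.product._id. The helper name firstPerProduct and the sample documents are hypothetical.

```javascript
// Hypothetical in-memory mirror of $group + $first: "$$ROOT" + $replaceRoot.
function firstPerProduct(docs) {
  const seen = new Map();
  for (const doc of docs) {
    const key = doc.field.product._id;
    // Keep only the first document encountered for each product _id.
    if (!seen.has(key)) seen.set(key, doc);
  }
  return [...seen.values()];
}

// Made-up sample data: two documents share product "p1".
const productDocs = [
  { _id: 1, field: { product: { _id: "p1" } } },
  { _id: 2, field: { product: { _id: "p1" } } },
  { _id: 3, field: { product: { _id: "p2" } } }
];

const uniqueDocs = firstPerProduct(productDocs);
console.log(uniqueDocs.map(d => d._id));
```

The returned documents keep their original shape, which is what the $replaceRoot stage achieves in the pipeline.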
I have a collection "TokenBalance" holding documents with this structure:
{
_id:"SvVV1qdUcxNwSnSgxw6EG125"
balance:Array
address:"0x6262998ced04146fa42253a5c0af90ca02dfd2a3"
timestamp:1648156174658
_created_at:2022-03-24T21:09:34.737+00:00
_updated_at:2022-03-24T21:09:34.737+00:00
}
Each address has multiple documents with the structure above, distinguished by their timestamps.
So address X can have 1000 objects with different timestamps.
What I want is to get only the last created document per address while also passing all of its fields into the next stage, which is where I am stuck. I don't even know whether grouping with the $last operator is the correct approach. I would appreciate some guidance on how to achieve this.
What I have is this
$group stage (1st stage)
{
_id: '$address',
timestamp: {$last: '$timestamp'}
}
This gives me a result of
_id:"0x6262998ced04146fa42253a5c0af90ca02dfd2a3"
timestamp:1648193827320
But I want the other fields of each document as well so I can further process them.
Questions
1) Is it the correct way to get the last created document per "address" field?
2) How can I get the other fields into the result of that group stage?
Use $denseRank with $setWindowFields (both require MongoDB 5.0 or later):
db.collection.aggregate([
    {
        $setWindowFields: {
            partitionBy: "$address",
            sortBy: { timestamp: -1 },
            output: { rank: { $denseRank: {} } }
        }
    },
    {
        $match: { rank: 1 }
    }
])
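One property of the rank-based approach worth knowing: documents that tie on the latest timestamp all receive rank 1, so an address can come back with more than one document. A plain-JavaScript sketch of that behavior (not MongoDB API), with a hypothetical helper latestPerAddress and made-up data:

```javascript
// Hypothetical in-memory mirror of: partitionBy address, sortBy timestamp
// descending, $denseRank, then $match on rank 1.
function latestPerAddress(docs) {
  const maxByAddress = new Map();
  for (const doc of docs) {
    const current = maxByAddress.get(doc.address);
    if (current === undefined || doc.timestamp > current) {
      maxByAddress.set(doc.address, doc.timestamp);
    }
  }
  // Rank 1 means "timestamp equals the partition's maximum"; ties are kept.
  return docs
    .filter(doc => doc.timestamp === maxByAddress.get(doc.address))
    .map(doc => ({ ...doc, rank: 1 }));
}

// Made-up sample data: address 0xaaa has two documents tied at timestamp 3.
const balances = [
  { address: "0xaaa", timestamp: 1 },
  { address: "0xaaa", timestamp: 3 },
  { address: "0xaaa", timestamp: 3 },
  { address: "0xbbb", timestamp: 2 }
];

const latestBalances = latestPerAddress(balances);
console.log(latestBalances);
```

If exactly one document per address is required, the $sort plus $group with $last: "$$ROOT" approach picks a single document even when timestamps tie.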
I guess you mean this:
{ $group: {
_id: '$address',
timestamp: {$last: '$timestamp'},
data: { $push: "$$ROOT" }
} }
If the latest timestamp is also the last document when sorted by _id, you can use something like this:
[
    { $sort: { _id: 1 } },
    { $group: {
        _id: '$address',
        latest: { $last: '$$ROOT' }
    } },
    { $replaceRoot: { newRoot: '$latest' } }
]
Let's say I have a collection of blog posts with the fields _id, userName, and age. A user may have made more than one blog post; I want to find the users that have made 4 posts.
db.blogs.aggregate([{$group: {_id: {"$userName", "age"}, : {$sum: ""eq", 3}}])
To count how many times a field value occurs, use a $group stage to group by that field and add an extra field for the count using the $count accumulator. Then, to filter by that count, add a $match stage on the new field:
db.blogs.aggregate([
    {
        $group: {
            _id: "$userName",
            count: { $count: {} }
        }
    },
    {
        $match: {
            count: { $gte: 4 }
        }
    }
])
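The $count accumulator was added in MongoDB 5.0; on older servers { $sum: 1 } is the usual equivalent. The sketch below shows that variant as a plain pipeline array, together with a small in-memory helper that mirrors the two stages so the logic can be checked without a server. The helper usersWithAtLeast and the sample posts are hypothetical, and the field name userName is taken from the question.

```javascript
// Stages to pass to db.blogs.aggregate(countPipeline); no $count needed.
const countPipeline = [
  { $group: { _id: "$userName", count: { $sum: 1 } } },
  { $match: { count: { $gte: 4 } } }
];

// Hypothetical in-memory mirror of the $group and $match stages.
function usersWithAtLeast(posts, minPosts) {
  const counts = new Map();
  for (const post of posts) {
    counts.set(post.userName, (counts.get(post.userName) || 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count >= minPosts)
    .map(([userName, count]) => ({ _id: userName, count }));
}

// Made-up sample data: ann has four posts, bob has two.
const posts = [
  { _id: 1, userName: "ann", age: 30 },
  { _id: 2, userName: "bob", age: 41 },
  { _id: 3, userName: "ann", age: 30 },
  { _id: 4, userName: "ann", age: 30 },
  { _id: 5, userName: "bob", age: 41 },
  { _id: 6, userName: "ann", age: 30 }
];

const frequentUsers = usersWithAtLeast(posts, 4);
console.log(frequentUsers);
```

To match users with exactly four posts rather than four or more, the $match stage can use { count: 4 } instead.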
There are documents with structure:
{"appId":<id>,"time":<number>}
For example, assume we have:
{"appId":"A","time":1}
{"appId":"A","time":3}
{"appId":"A","time":5}
{"appId":"B","time":1}
{"appId":"B","time":2}
{"appId":"B","time":4}
{"appId":"B","time":6}
Is it possible to group the documents by appId, sort each group by time, and order the groups by their latest time, so that the result looks like this:
{"appId":"B","time":6}
{"appId":"B","time":4}
{"appId":"B","time":2}
{"appId":"B","time":1}
{"appId":"A","time":5}
{"appId":"A","time":3}
{"appId":"A","time":1}
I tried this query:
collection.aggregate([{"$group":{"_id":{"a":"$appId"},"ttt":{"$max":"$time"}}},
{"$sort":{"_id.ttt":-1,"time":-1}}])
but I received only the latest time for each appId (2 results), and this query changes the structure of the data.
I want to keep the structure of the documents and only group and sort them as in the example.
You can try the aggregation below:
db.collection.aggregate([
    {
        $sort: { time: -1 }
    },
    {
        $group: {
            _id: "$appId",
            max: { $max: "$time" },
            items: { $push: "$$ROOT" }
        }
    },
    {
        $sort: { max: -1 }
    },
    {
        $unwind: "$items"
    },
    {
        $replaceRoot: {
            newRoot: "$items"
        }
    }
])
You can $sort before grouping to get the right order inside each group, then use the special variable $$ROOT while grouping to capture the whole original document. In the next step you can sort by the $max value, then use $unwind with $replaceRoot to get back the same number of documents and promote the original shape to the root level.
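To check this stage-by-stage reasoning against the sample data from the question, here is a plain-JavaScript sketch (not MongoDB API) that mirrors each stage in memory. The helper name sortGroupsByLatest is hypothetical.

```javascript
// Hypothetical in-memory mirror of the pipeline stages.
function sortGroupsByLatest(docs) {
  // $sort: { time: -1 }
  const sorted = [...docs].sort((a, b) => b.time - a.time);
  // $group with $push: "$$ROOT"; each group's first item holds its max time.
  const groups = new Map();
  for (const doc of sorted) {
    if (!groups.has(doc.appId)) groups.set(doc.appId, []);
    groups.get(doc.appId).push(doc);
  }
  // $sort: { max: -1 }, then $unwind + $replaceRoot flatten the groups again.
  return [...groups.values()]
    .sort((a, b) => b[0].time - a[0].time)
    .flat();
}

// Sample documents from the question.
const appDocs = [
  { appId: "A", time: 1 },
  { appId: "A", time: 3 },
  { appId: "A", time: 5 },
  { appId: "B", time: 1 },
  { appId: "B", time: 2 },
  { appId: "B", time: 4 },
  { appId: "B", time: 6 }
];

const ordered = sortGroupsByLatest(appDocs);
console.log(ordered.map(d => d.appId + d.time).join(","));
// prints "B6,B4,B2,B1,A5,A3,A1"
```

Grouping first and flattening afterwards keeps each appId's documents contiguous even when two groups share the same latest time.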
See if the find and sort operation below works with your real data.
collection.find({}, {_id : 0}).sort({appId:1, time:-1})
If this is a huge collection and this is going to be a repetitive query, make sure to create a compound index on these two fields.
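A sketch of the index that sentence suggests, assuming the collection is reachable as db.collection in mongosh (the name is a placeholder). The key order and directions mirror the sort specification, so the same object can be reused for both calls.

```javascript
// Index keys matching .sort({ appId: 1, time: -1 }).
const indexKeys = { appId: 1, time: -1 };

// In mongosh (placeholder collection name):
//   db.collection.createIndex(indexKeys)
//   db.collection.find({}, { _id: 0 }).sort(indexKeys)
console.log(JSON.stringify(indexKeys));
```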
I want to find the accepted body parts that have status active.
I tried this:
db.patients.find({
    "injury.injurydata.injuryinformation.dateofinjury": {
        "$gte": ISODate("2014-05-21T08:00:00Z"),
        "$lt": ISODate("2014-06-03T08:00:00Z")
    }
},
{
    "injury.injurydata.acceptedbodyparts": 1,
    "injury.injurydata.injuryinformation.dateofinjury": 1,
    "injury": {
        $elemMatch: {
            "injury.injurydata.acceptedbodyparts.status": "current"
        }
    }
})
but I still get both array elements.
If acceptedbodyparts is an array, you can't query acceptedbodyparts.status. If status is a field on the documents contained in the array, you would need to use another $elemMatch clause in your query. So the last part would look something like this:
{"injury":{ "$elemMatch": { "injurydata.acceptedbodyparts": {"$elemMatch": {"status":"current"} }} }}
I also removed the injury. prefix in the first $elemMatch because you're querying data within the injury array.
Note that this will return the entire document with the full array, as long as it contains the document you're searching for. If your intention is to retrieve a particular element in an array, $elemMatch is the wrong approach.
Standard projection will not work with nested arrays or limiting any fields inside arrays. For that you need the aggregation framework:
db.patients.aggregate([
    // First match: select the documents in the date range
    { "$match": {
        "injury.injurydata.injuryinformation.dateofinjury": {
            "$gte": ISODate("2014-05-21T08:00:00Z"),
            "$lt": ISODate("2014-06-03T08:00:00Z")
        }
    }},
    // Unwind the arrays
    { "$unwind": "$injury" },
    { "$unwind": "$injury.injurydata" },
    { "$unwind": "$injury.injurydata.acceptedbodyparts" },
    // Now match the required data in the array
    { "$match": {
        "injury.injurydata.acceptedbodyparts.status": "current"
    }},
    // Group only the wanted fields; note the "$" prefix on the field path
    { "$group": {
        "_id": "$_id",
        "acceptedbodyparts": {
            "$push": "$injury.injurydata.acceptedbodyparts"
        }
    }}
])
You can add in other fields outside of the array either by using $first or by making them part of the _id in the grouping.
This is simply outside the scope of the standard projection; the aggregation framework, with its extended manipulation capabilities, solves it.
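To make the unwind, match and group steps concrete, here is a plain-JavaScript sketch (not MongoDB API) that performs the same filtering in memory. It assumes that injury, injurydata and acceptedbodyparts are all arrays, as the three $unwind stages imply. The helper currentBodyParts, the bodypart field and the sample patients are hypothetical.

```javascript
// Hypothetical in-memory mirror of the $unwind / $match / $group stages.
function currentBodyParts(patients) {
  const results = [];
  for (const patient of patients) {
    // The nested flatMap calls play the role of the $unwind stages.
    const parts = patient.injury
      .flatMap(injury => injury.injurydata)
      .flatMap(data => data.acceptedbodyparts)
      .filter(part => part.status === "current");
    // Patients with no matching element drop out, as they would after $match.
    if (parts.length > 0) {
      results.push({ _id: patient._id, acceptedbodyparts: parts });
    }
  }
  return results;
}

// Made-up sample data.
const patients = [
  {
    _id: 1,
    injury: [{
      injurydata: [{
        acceptedbodyparts: [
          { bodypart: "knee", status: "current" },
          { bodypart: "wrist", status: "inactive" }
        ]
      }]
    }]
  },
  {
    _id: 2,
    injury: [{
      injurydata: [{
        acceptedbodyparts: [{ bodypart: "ankle", status: "inactive" }]
      }]
    }]
  }
];

const matched = currentBodyParts(patients);
console.log(JSON.stringify(matched));
```

Unlike a query with $elemMatch, which returns each matching patient with its full arrays, this returns only the array elements whose status matches.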