mongodb aggregation: How to return the object with min/max instead of the value - mongodb

Say my document has a date field, and I want to get the first and last occurring documents in an aggregation. Using $group and $min or $max, it's easy to get the dates themselves, e.g.:
db.mycollection.aggregate([
{ $group: {
_id: 1, // for the example say I'm grouping them all ...
first: { $min: "$date" },
result: { $push: { ... } } // ... and returning them all
}}
])
This would return a result like:
{ _id: 1, first: ISODate(...), result: [...] }
But what I want isn't the first date, but rather the result with the first date. How would I get at this using the pipeline?
I've been tinkering with using $project to scan the array afterwards for the object with the matching date, which seems like it could work, but I thought I'd see if there was a proper way to do this before I stumbled on an improper one.

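You can use $first here, which returns the value from the first document of a sorted set ( http://docs.mongodb.org/manual/reference/aggregation/first/#_S_first ). Sort by date before the $group, and take "$$ROOT" rather than "$date" so that you get the whole document and not just its date:
db.mycollection.aggregate([
{ $sort: { date: 1 } }, // $first is only meaningful on sorted input
{ $group: {
_id: 1, // for the example say I'm grouping them all ...
first: { $first: "$$ROOT" }, // the whole document with the earliest date
result: { $push: { ... } } // ... and returning them all
}}
])
This also lets the $sort ahead of the $group use an index on date, which can improve performance.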
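Since the question asks for both the first and the last document, the same idea extends to $last. What follows is only a minimal sketch: firstAndLastPipeline is a hypothetical helper name, the field name is passed in as an assumption, and "$$ROOT" requires a server version that supports it.

```javascript
// Hypothetical helper (not part of MongoDB): builds a pipeline that returns
// the whole first and last documents, ordered by `dateField`.
function firstAndLastPipeline(dateField) {
  return [
    // Sort first so that $first / $last refer to a well-defined order.
    { $sort: { [dateField]: 1 } },
    {
      $group: {
        _id: 1, // a single group, as in the question
        first: { $first: "$$ROOT" }, // document with the earliest date
        last: { $last: "$$ROOT" }, // document with the latest date
      },
    },
  ];
}

// Usage in the shell (not executed here):
// db.mycollection.aggregate(firstAndLastPipeline("date"))
```

Building the pipeline in a function keeps the sort field and the accumulators in one place, so they cannot drift apart.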

Related

MongoDB get only the last documents per grouping based on field

I have a collection "TokenBalance" like this holding documents of this structure
{
_id:"SvVV1qdUcxNwSnSgxw6EG125"
balance:Array
address:"0x6262998ced04146fa42253a5c0af90ca02dfd2a3"
timestamp:1648156174658
_created_at:2022-03-24T21:09:34.737+00:00
_updated_at:2022-03-24T21:09:34.737+00:00
}
Each address has multiple documents with the structure above, distinguished by their timestamps.
So address X can have 1000 objects with different timestamps.
What I want is to only get the last created documents per address but also pass all the document fields into the next stage which is where I am stuck. I don't even know if the way I am grouping is correctly done with the $last operator. I would appreciate some guidance on how to achieve this task.
What I have is this
$group stage (1st stage)
{
_id: '$address',
timestamp: {$last: '$timestamp'}
}
This gives me a result of
_id:"0x6262998ced04146fa42253a5c0af90ca02dfd2a3"
timestamp:1648193827320
But I want the other fields of each document as well so I can further process them.
Questions
1) Is it the correct way to get the last created document per "address" field?
2) How can I get the other fields into the result of that group stage?
Use $denseRank
db.collection.aggregate([
{
$setWindowFields: {
partitionBy: "$address",
sortBy: { timestamp: -1 },
output: { rank: { $denseRank: {} } }
}
},
{
$match: { rank: 1 }
}
])
I guess you mean this:
{ $group: {
_id: '$address',
timestamp: {$last: '$timestamp'},
data: { $push: "$$ROOT" }
} }
If the document with the latest timestamp is also the last one when sorted by _id, you can use something like this:
[{$group: {
_id: '$address',
latest: {
$last: '$$ROOT'
}
}}, {$replaceRoot: {
newRoot: '$latest'
}}]
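On question 1: $last only means "latest" when the documents reach $group in a known order, so an explicit $sort is the safer route. Below is a minimal sketch under that assumption; latestPerKeyPipeline is a hypothetical helper name, and the field names are taken from the question.

```javascript
// Hypothetical helper: keeps, for every value of `keyField`, the whole
// document with the greatest `orderField`.
function latestPerKeyPipeline(keyField, orderField) {
  return [
    // Ascending sort, so the last document seen per group is the newest.
    { $sort: { [orderField]: 1 } },
    { $group: { _id: "$" + keyField, latest: { $last: "$$ROOT" } } },
    // Promote the stored document back to the top level, so that later
    // stages see all of its original fields.
    { $replaceRoot: { newRoot: "$latest" } },
  ];
}

// Usage in the shell (not executed here):
// db.TokenBalance.aggregate(latestPerKeyPipeline("address", "timestamp"))
```

Each output document has the same shape as a stored document, which addresses question 2 as well.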

How to sort a dictionary keys and pick the first in MongoDb?

I'm running the following query as described in the docs.
db.getCollection('things')
.find(
{ _id: UUID("...") },
{ _id: 0, history: 1 }
)
It produces a single element that, when unfolded in the GUI, shows the dictionary history. When I unfold that, I get to see the contents: a bunch of keys and their corresponding values.
Now, I'd like to sort the keys alphabetically and pick n first ones. Please note that it's not an array but a dictionary that is stored. Also, it would be great if I could flatten the structure and pop up my history to be the head (root?) of the document returned.
I understand it's about projection and slicing. However, I'm not getting anywhere, despite many attempts. I get syntax errors or a full list of elements. Being rather nooby, I fear that I require a few pointers on how to diagnose my issue to begin with.
Based on the comments, I tried with aggregate and $sort. Regrettably, I only seem to be sorting the current output (that produces a single document due to the match condition). I want to access the elements inside history.
db.getCollection('things')
.aggregate([
{ $match: { _id: UUID("...") } },
{ $sort: { history: 1 } }
])
I'm sensing that I should use projection to pull out a list of elements residing under history but I'm getting no success using the below.
db.getCollection('things')
.aggregate([
{ $match: { _id: UUID("...") } },
{ $project: { history: 1, _id: 0 } }
])
Sorting an object's properties alphabetically takes several stages:
$objectToArray converts the history object to an array of key-value pairs
$unwind deconstructs that array into one document per pair
$sort orders the pairs by key (1 = ascending, -1 = descending)
$group by _id rebuilds the history key-value array
$slice takes the first n properties of the dictionary; n is 1 here
$arrayToObject converts the key-value array back to an object
db.getCollection('things').aggregate([
{ $match: { _id: UUID("...") } },
{ $project: { history: { $objectToArray: "$history" } } },
{ $unwind: "$history" },
{ $sort: { "history.k": 1 } },
{
$group: {
_id: "$_id",
history: { $push: "$history" }
}
},
{
$project: {
history: {
$arrayToObject: { $slice: ["$history", 1] }
}
}
}
])
There is another option, but MongoDB does not guarantee that it reproduces the same result:
$objectToArray converts the history object to an array of key-value pairs
$setUnion returns the unique elements of an array; in practice it has returned them sorted by key in ascending order, but MongoDB documents the output order as unspecified, so do not rely on it
$slice takes the first n properties of the dictionary; n is 1 here
$arrayToObject converts the key-value array back to an object
db.getCollection('things').aggregate([
{ $match: { _id: UUID("...") } },
{
$project: {
history: {
$arrayToObject: {
$slice: [
{ $setUnion: { $objectToArray: "$history" } },
1
]
}
}
}
}
])
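On newer servers the sort can be done without $unwind or $setUnion. The sketch below assumes MongoDB 5.2 or later, where the $sortArray operator is available, and assumes every matched document has a history object. firstHistoryKeysPipeline is a hypothetical helper name. It also promotes history to the root, as the question asked.

```javascript
// Hypothetical helper: returns the first `n` keys of `history`, sorted
// alphabetically, as the root of the result document.
function firstHistoryKeysPipeline(id, n) {
  return [
    { $match: { _id: id } },
    {
      $replaceRoot: {
        newRoot: {
          $arrayToObject: {
            $slice: [
              {
                // Sort the { k, v } pairs by key, ascending.
                $sortArray: {
                  input: { $objectToArray: "$history" },
                  sortBy: { k: 1 },
                },
              },
              n,
            ],
          },
        },
      },
    },
  ];
}

// Usage in the shell (not executed here):
// db.getCollection('things').aggregate(firstHistoryKeysPipeline(UUID("..."), 3))
```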

MongoDb Aggregate group and sort applications

There are documents with structure:
{"appId":<id>,"time":<number>}
For example, assume we have:
{"appId":"A","time":1}
{"appId":"A","time":3}
{"appId":"A","time":5}
{"appId":"B","time":1}
{"appId":"B","time":2}
{"appId":"B","time":4}
{"appId":"B","time":6}
Is it possible to group the documents by appId, sort each group by time, and show all results starting with the group that has the latest time, like:
{"appId":"B","time":6}
{"appId":"B","time":4}
{"appId":"B","time":2}
{"appId":"B","time":1}
{"appId":"A","time":5}
{"appId":"A","time":3}
{"appId":"A","time":1}
I tried this query:
collection.aggregate([{"$group":{"_id":{"a":"$appId"},"ttt":{"$max":"$time"}}},
{"$sort":{"_id.ttt":-1,"time":-1}}])
but I received only the latest time for each appId (2 results), and this query changes the structure of the data.
I want to keep the structure of the documents and only to group and sort them like the example.
You can try the aggregation below:
db.collection.aggregate([
{
$sort: { time: -1 }
},
{
$group: {
_id: "$appId",
max: { $max: "$time" },
items: { $push: "$$ROOT" }
}
},
{
$sort: { max: -1 }
},
{
$unwind: "$items"
},
{
$replaceRoot: {
newRoot: "$items"
}
}
])
You can $sort before grouping to get the right order inside each group. While grouping, the special variable $$ROOT captures the whole original document. In the next step you can sort by the max value, then use $unwind with $replaceRoot to get back the same number of documents and promote the original shape to the root level.
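To make the intended ordering concrete, here is a plain JavaScript sketch of the same logic running in memory on the sample data from the question. It is an illustration only, not what the server executes, and groupAndSort is a hypothetical name.

```javascript
// Sample documents from the question.
const sample = [
  { appId: "A", time: 1 }, { appId: "A", time: 3 }, { appId: "A", time: 5 },
  { appId: "B", time: 1 }, { appId: "B", time: 2 }, { appId: "B", time: 4 },
  { appId: "B", time: 6 },
];

function groupAndSort(docs) {
  // Latest time per appId (what $group with $max computes).
  const maxByApp = new Map();
  for (const d of docs) {
    const cur = maxByApp.get(d.appId);
    if (cur === undefined || d.time > cur) maxByApp.set(d.appId, d.time);
  }
  // Order groups by their latest time (descending), keep each group
  // contiguous, and order documents inside a group by time (descending).
  return [...docs].sort(
    (a, b) =>
      maxByApp.get(b.appId) - maxByApp.get(a.appId) ||
      (a.appId < b.appId ? -1 : a.appId > b.appId ? 1 : 0) ||
      b.time - a.time
  );
}

console.log(groupAndSort(sample).map((d) => d.appId + d.time).join(","));
// prints "B6,B4,B2,B1,A5,A3,A1"
```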
See if the below find & sort operation works with your real data.
collection.find({}, {_id : 0}).sort({appId:1, time:-1})
If this is a huge collection and this is going to be a repetitive query, make sure to create a compound index on these two fields.

Mongo Query to return common values in array

I need a Mongo query that returns the values common to an array field across documents.
So if 4 documents match, a value is returned only if it is present in all 4 documents.
Suppose I have the below documents in my db
Mongo Documents
{
"id":"0",
"merchants":["1","2"]
}
{
"id":"1",
"merchants":["1","2","4"]
}
{
"id":"2",
"merchants":["4","5"]
}
Input : List of id
(i) Input with id "0" and "1"
Then it should return me merchants:["1","2"] as both are present in documents with id "0" & id "1"
(ii) Input with id "1" and "2"
Then it should return me merchants:["4"] as it is common and present in both documents with id "1" & id "2"
(iii) Input with id "0" and "2"
Should return empty merchants:[] as no common merchants between these 2 documents
You can try the aggregation below. Note that it intersects only the first and last matched documents, so it fits the case of exactly two input ids.
db.collection.aggregate([
{$match:{id: {$in: ["1", "2"]}}},
{$group:{_id:null, first:{$first:"$merchants"}, second:{$last:"$merchants"}}},
{$project: {commonToBoth: {$setIntersection: ["$first", "$second"]}, _id: 0 } }
])
Say you have a function query that does the required DB query for you, and you'll call that function with idsToMatch which is an array containing all the elements you want to match. I have used JS here as the driver language, replace it with whatever you are using.
The following code is dynamic, will work for any number of ids you give as input:
const query = (idsToMatch) => {
return db.collectionName.aggregate([
{ $match: { id: {$in: idsToMatch} } },
{ $unwind: "$merchants" },
{ $group: { _id: { id: "$id", data: "$merchants" } } },
{ $group: { _id: "$_id.data", count: {$sum: 1} } },
{ $match: { count: { $gte: idsToMatch.length } } },
{ $group: { _id: 0, result: {$push: "$_id" } } },
{ $project: { _id: 0, result: "$result" } }
])
}
The first $group stage makes sure there are no repetitions within the merchants attribute of a single document. If you are certain that no individual document repeats a value in merchants, you need not include it.
The real work is done by the end of the 2nd $match stage. The last two stages ($group and $project) only prettify the result; you may choose not to use them, and instead use the language of your choice to transform the result into the form you want.
Assuming you want to reduce the phases as per the points given above, the actual code will reduce to:
aggregate([
{ $match: { id: {$in: idsToMatch} } },
{ $unwind: "$merchants" },
{ $group: { _id: "$merchants", count: {$sum: 1} } },
{ $match: { count: { $gte: idsToMatch.length } } }
])
Your required values will be at the _id attribute of each element of the result array.
The answer provided by @jgr0 is mostly correct. The only mistake is in the intermediate $match stage.
(i) So if input ids are "1" & "0" then the query becomes
aggregate([
{"$match":{"id":{"$in":["1","0"]}}},
{"$unwind":"$merchants"},
{"$group":{"_id":"$merchants","count":{"$sum":1}}},
{"$match":{"count":{"$eq":2}}},
{"$group":{"_id":null,"merchants":{"$push":"$_id"}}},
{"$project":{"_id":0,"merchants":1}}
])
(ii) So if input ids are "1", "0" & "2" then the query becomes
aggregate([
{"$match":{"id":{"$in":["1","0", "2"]}}},
{"$unwind":"$merchants"},
{"$group":{"_id":"$merchants","count":{"$sum":1}}},
{"$match":{"count":{"$eq":3}}},
{"$group":{"_id":null,"merchants":{"$push":"$_id"}}},
{"$project":{"_id":0,"merchants":1}}
])
The count in the intermediate $match stage should equal the number of ids in the input: 2 in case (i) and 3 in case (ii).
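The two hard-coded cases above can be generated from the input list. This is a minimal sketch, assuming id is unique per document; commonMerchantsPipeline is a hypothetical helper name. It also keeps the per-document de-duplication step from the earlier answer, so a merchant repeated inside one document is counted once.

```javascript
// Hypothetical helper: builds the pipeline for any number of input ids.
function commonMerchantsPipeline(idsToMatch) {
  // Ignore repeated ids in the input so the expected count is accurate.
  const ids = [...new Set(idsToMatch)];
  return [
    { $match: { id: { $in: ids } } },
    { $unwind: "$merchants" },
    // Collapse merchants that are repeated within a single document.
    { $group: { _id: { id: "$id", merchant: "$merchants" } } },
    // Count in how many documents each merchant appears.
    { $group: { _id: "$_id.merchant", count: { $sum: 1 } } },
    // Keep merchants that appear in every requested document.
    { $match: { count: ids.length } },
    { $group: { _id: null, merchants: { $push: "$_id" } } },
    { $project: { _id: 0, merchants: 1 } },
  ];
}

// Usage in the shell (not executed here):
// db.collection.aggregate(commonMerchantsPipeline(["1", "0", "2"]))
```

When nothing is common, the final $group receives no input, so the result is empty rather than a document with merchants: []. The caller has to treat an empty result as an empty list.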

Mongo query using aggregation for dates

In the following query I'm trying to find entries in my articles collection made in the last week, sorted by the number of votes on that article. The $match doesn't seem to work (maybe I don't know how to use it). The following query works perfectly, so it's not a date format issue:
db.articles.find({
timestamp:{
'$lte':new Date(),
'$gte':new Date(ISODate().getTime()-7*1000*86400)}
})
But this one doesn't fetch any results. Without the $match, it does fetch the expected results (articles sorted by vote count).
db.articles.aggregate([
{
$project:{
_id:1,
numVotes:{$subtract:[{$size:"$votes.up"},{$size:"$votes.down"}]}}
},
{
$sort:{numVotes:-1}
},
{
$match:{
timestamp:{
'$lte':new Date(),
'$gte':new Date(ISODate().getTime()-7*1000*86400)}
}
}
])
You are matching at the end of your pipeline, which requires the timestamp field to still be present, but your $project stage has dropped it (it keeps only _id and numVotes).
I believe you want to filter the data before aggregating, so place the $match at the top of the pipeline.
Try this:
db.articles.aggregate([{
$match: {
timestamp: {
'$lte': new Date(),
'$gte': new Date(ISODate().getTime() - 7 * 1000 * 86400)
}
}
}, {
$project: {
_id: 1,
numVotes: {
$subtract: [{
$size: "$votes.up"
}, {
$size: "$votes.down"
}]
}
}
}, {
$sort: {
numVotes: -1
}
}])
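The same pipeline can be built from a single reference time, which keeps the two ends of the window consistent. A minimal sketch follows; lastWeekVotesPipeline and WEEK_MS are hypothetical names, and the field names come from the question.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000; // 604800000 ms

// Hypothetical helper: articles from the week before `now`, by net votes.
function lastWeekVotesPipeline(now = new Date()) {
  return [
    // Filter first, while `timestamp` is still present in the documents.
    {
      $match: {
        timestamp: { $gte: new Date(now.getTime() - WEEK_MS), $lte: now },
      },
    },
    {
      $project: {
        numVotes: {
          $subtract: [{ $size: "$votes.up" }, { $size: "$votes.down" }],
        },
      },
    },
    { $sort: { numVotes: -1 } },
  ];
}

// Usage in the shell (not executed here):
// db.articles.aggregate(lastWeekVotesPipeline())
```

Putting $match first also lets the server use an index on timestamp, if one exists, before any documents are reshaped.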