MongoDB partition window: getting the document where a field has the greatest value

I am translating some queries from SQL to MongoDB.
I have a MongoDB collection in which each document can contain a lot of information. There are two ids, id_1 and id_2; id_2 defaults to -1. There is a 1->n relationship between id_1 and id_2. For instance, the data can look like:
id_1 | id_2 | info
---- | ---- | ----
120  | -1   | 'dont'
120  | 444  | 'show'
123  | -1   | 'test'
124  | -1   | 'hello'
125  | -1   | 'world'
125  | 123  | 'oh wait'
126  | -1   | 'help'
126  | 201  | 'me'
127  | -1   | 'sql'
127  | 206  | 'hell'
I want to have a MongoDB query that gets the highest id_2 associated with an id_1.
Here is what the answer should look like given id_1 containing (123,124,125,126,127) and id_2 containing (-1,-1,123,201,206):
id_1 | id_2 | info
---- | ---- | ----
123  | -1   | 'test'
124  | -1   | 'hello'
125  | 123  | 'oh wait'
126  | 201  | 'me'
127  | 206  | 'hell'
In SQL this could be done using the following:
WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY id_1 ORDER BY id_2 DESC) rn
    FROM ids
    WHERE id_1 IN (123, 124, 125, 126, 127) AND
          id_2 IN (-1, -1, 123, 201, 206)
)
SELECT id_1, id_2, info
FROM cte
WHERE rn = 1;
In MongoDB this can be done with a $group stage, but it is very slow. See below:
{
    '$sort' : {
        'id_1': 1,
        'id_2': 1
    }
},
{
    '$group' : {
        '_id': '$id_1',
        'id_1': {'$first': '$id_1'},
        'info': { '$last': '$info'}
    }
}
I found this in the documentation:
https://docs.mongodb.com/manual/reference/operator/aggregation/setWindowFields/#std-label-setWindowFields-window
However, I'm not getting good results. I think I'm misunderstanding window. Here is what I have:
{
    '$match' : {
        'id_1': {'$in' : [123,124,125,126,127]},
        'id_2': {'$in' : [-1,-1,123,201,206]}
    }
},
{
    '$setWindowFields': {
        'partitionBy': 'id_1',
        'sortBy' : {
            'id_2': -1
        },
        'output': {
            'info': {
                '$last': '$info'
            },
        }
    }
},
{
    '$project' : {
        'id_1' : 1,
        'id_2' : 1,
        'info' : 1
    }
}
This doesn't really seem to do anything except output every info for every combination of id_1 and id_2. Similarly, adding a range of [0,1] to a window in the output just results in an error that says:
Missing _sortExpr with range-based bounds
Does anyone know how to get the same results that I got in SQL?

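$match: filter on id_1 and id_2.
$setWindowFields: compute the maximum id_2 over each whole partition, using an unbounded window. Note that partitionBy takes an expression, so the field is written as "$id_1", with the $ prefix.
$match: keep only the documents whose id_2 equals max, i.e. the document with the largest id_2 in its partition.
$unset: remove the helper field max, since it is no longer needed.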
db.collection.aggregate([
  {
    "$match": {
      "id_1": { "$in": [ 123, 124, 125, 126, 127 ] },
      "id_2": { "$in": [ -1, 123, 201, 206 ] }
    }
  },
  {
    $setWindowFields: {
      partitionBy: "$id_1",
      sortBy: { id_2: 1 },
      output: {
        max: {
          $max: "$id_2",
          window: {
            documents: [ "unbounded", "unbounded" ]
          }
        }
      }
    }
  },
  {
    "$match": {
      $expr: { "$eq": [ "$id_2", "$max" ] }
    }
  },
  {
    "$unset": "max"
  }
])
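For a closer match to the SQL ROW_NUMBER() approach, $setWindowFields also offers the $documentNumber operator (MongoDB 5.0+), which numbers the documents inside each partition in sortBy order. The following is a sketch rather than the answer's method: the helper field name rn is borrowed from the SQL query, and the collection name ids is assumed from its FROM clause.

```javascript
// Sketch: emulate ROW_NUMBER() OVER (PARTITION BY id_1 ORDER BY id_2 DESC).
// "rn" is a helper field name chosen here (borrowed from the SQL query).
const pipeline = [
  {
    $match: {
      id_1: { $in: [123, 124, 125, 126, 127] },
      id_2: { $in: [-1, 123, 201, 206] }
    }
  },
  {
    $setWindowFields: {
      partitionBy: "$id_1",   // field path, so the "$" prefix is required
      sortBy: { id_2: -1 },   // the highest id_2 gets document number 1
      output: { rn: { $documentNumber: {} } }
    }
  },
  { $match: { rn: 1 } },      // same role as WHERE rn = 1
  { $unset: "rn" }            // drop the helper field
];

// Assumed usage in mongosh, with a collection named "ids":
// db.ids.aggregate(pipeline)
```

Unlike the comparison against $max, this keeps exactly one document per id_1 even if several documents tie on the highest id_2.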

Related

How to search in MongoDB an element depending on the previous one?

I'm dealing with a somewhat unusual query. I'm creating an app for boat tracking: I have a collection of documents, each with a timestamp and the Port ID where the boat was at that moment.
After sorting all the documents of this collection by timestamp descending, I need to grab the elements that have the same Port ID in the most recent range of time.
For example:
timestamp | port_id
2021-11-10T23:00:00.000Z | 1
2021-11-10T22:00:00.000Z | 1
2021-11-10T21:00:00.000Z | 1
2021-11-10T20:00:00.000Z | 2
2021-11-10T19:00:00.000Z | 2
2021-11-10T18:00:00.000Z | 2
2021-11-10T17:00:00.000Z | 1
2021-11-10T16:00:00.000Z | 1
2021-11-10T15:00:00.000Z | 1
Given this data (sorted by timestamp), I would need to grab the first 3 documents. The way I'm doing this now is to fetch 2000 documents and apply a filter function at the application level.
Another approach would be to grab the first element and then filter by its port ID, but that returns 6 elements, not the first 3.
Do you know any way to perform a query like this in Mongo? Thanks!
Use $setWindowFields
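db.collection.aggregate([
  // 1. For each document, fetch the port_id of the previous (more recent) document.
  {
    $setWindowFields: {
      partitionBy: "",
      sortBy: { timestamp: -1 },
      output: {
        c: {
          $shift: {
            output: "$port_id",
            by: -1,
            default: "Not available"
          }
        }
      }
    }
  },
  // 2. Flag the documents where port_id changes (the first document is always flagged).
  {
    $set: {
      c: {
        $cond: {
          if: { $eq: [ "$port_id", "$c" ] },
          then: 0,
          else: 1
        }
      }
    }
  },
  // 3. Running sum of the flags: each run of equal port_id values gets its own number.
  {
    $setWindowFields: {
      partitionBy: "",
      sortBy: { timestamp: -1 },
      output: {
        c: {
          $sum: "$c",
          window: { documents: [ "unbounded", "current" ] }
        }
      }
    }
  },
  // 4. Keep only the first run, then drop the helper field.
  {
    $match: { c: 1 }
  },
  {
    $unset: "c"
  }
])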
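The pipeline follows a "flag the changes, then take a running sum" pattern. The plain JavaScript sketch below applies the same steps to an in-memory array so the intermediate values are easier to follow. The function name leadingRun and the sample array are illustrative only, and this is not meant to replace running the aggregation on the server.

```javascript
// Sketch: mirror the pipeline's logic on an in-memory array.
function leadingRun(docs) {
  // Same order as sortBy: { timestamp: -1 } (ISO-8601 strings sort lexicographically).
  const sorted = [...docs].sort((a, b) =>
    a.timestamp < b.timestamp ? 1 : a.timestamp > b.timestamp ? -1 : 0
  );

  let runningSum = 0;
  const result = [];
  sorted.forEach((doc, i) => {
    // $shift with by: -1: port_id of the previous document, or a default for the first one.
    const previous = i === 0 ? "Not available" : sorted[i - 1].port_id;
    // $cond gives 1 when the port changes; $sum over ["unbounded", "current"] accumulates it.
    runningSum += doc.port_id === previous ? 0 : 1;
    // $match: { c: 1 } keeps only the first run.
    if (runningSum === 1) result.push(doc);
  });
  return result;
}

// Illustrative data, a subset of the table in the question.
const sample = [
  { timestamp: "2021-11-10T17:00:00.000Z", port_id: 1 },
  { timestamp: "2021-11-10T20:00:00.000Z", port_id: 2 },
  { timestamp: "2021-11-10T21:00:00.000Z", port_id: 1 },
  { timestamp: "2021-11-10T22:00:00.000Z", port_id: 1 },
  { timestamp: "2021-11-10T23:00:00.000Z", port_id: 1 }
];

console.log(leadingRun(sample).map((d) => d.timestamp));
```

Only the three most recent documents with port_id 1 come back; the older document at 17:00 belongs to a later run (running sum 3) and is dropped.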

MongoDB: insert, then conditional sum, followed by updating the sum in another collection

I am trying to:
1. insert a new record with a points field into the reputationActivity collection
2. get the sum of points from the reputationActivity collection where the user id matches
3. write the resulting sum to the users collection
Here is mongo playground which does not work right now - https://mongoplayground.net/p/tHgPpODjD6j
await req.db.collection('reputationActivity').insertOne({
    _id: nanoid(),
    "userId": userId,//insert a new record for this user
    "points": points,//insert integer points
},
function(){
    req.db.collection('reputationActivity').aggregate([ { $match: { userId: userId } },
        { TotalSum: { $sum: "$points" } } ]); // sum of point for this user
    req.db.collection('users').updateOne(
        {_id: userId},
        {$set: {"userTotalPoints": TotalSum}},// set sum in users collection
    )
}
)
});
The above code gives me an error that TotalSum is not defined. Is it better to do this without a callback function and, if so, how?
You can do this in a single block of commands. Be careful with the drop command; it is used here only to show the example from start to finish, for illustration purposes.
> db.reputation.drop();
true
> db.reputation.insertOne(
... {_id: 189, "userId": 122, "points": 60});
{ "acknowledged" : true, "insertedId" : 189 }
> var v_userId = 122;
> db.reputation.aggregate([
... {
... $match: {
... userId: v_userId
... }
... },
... {
... $group: {
... "_id": null, "TotalSum": {$sum: "$points"}}
... },
... ]).forEach(function(doc)
... {print("TotalSum: " +doc.TotalSum,
... "userId: " +v_userId);
... db.users.updateOne(
... {"userId":v_userId}, {$set:{"userPoints":doc.TotalSum}});
... }
... );
TotalSum: 60 userId: 122
> db.users.find();
{ "_id" : 189, "userId" : 122, "points" : 60, "userPoints" : 60 }
>
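In Node.js, the same steps can be written without a callback by awaiting each call in turn. The sketch below assumes db is a connected Db handle from the official MongoDB Node.js driver (req.db in the question); the function name addPointsAndUpdateTotal and the newId parameter are made up for illustration.

```javascript
// Sketch: insert a reputation record, re-sum the user's points, store the total.
// `db` is assumed to be a connected Db handle from the official Node.js driver.
async function addPointsAndUpdateTotal(db, userId, points, newId) {
  // 1. Insert the new activity record.
  await db.collection("reputationActivity").insertOne({ _id: newId, userId, points });

  // 2. Sum all points for this user. $sum is an accumulator, so it lives inside $group.
  const [row] = await db
    .collection("reputationActivity")
    .aggregate([
      { $match: { userId } },
      { $group: { _id: null, TotalSum: { $sum: "$points" } } }
    ])
    .toArray();
  const totalSum = row ? row.TotalSum : 0;

  // 3. Store the total on the user document.
  await db
    .collection("users")
    .updateOne({ _id: userId }, { $set: { userTotalPoints: totalSum } });

  return totalSum;
}
```

Because each step is awaited, the aggregation only runs after the insert has completed, and totalSum is an ordinary local variable by the time updateOne uses it.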

MongoDB grouping and max item?

For the life of me, I can't figure out how to write this query in MongoDB!
For each item, I need the query to return the row that has the max "Revision".
I have two parts:
1. Return a count without actually getting the data.
2. Return all the data (not necessarily ordered by column A, though that would be helpful) based on the grouping.
Here's a sample set (In our real world data, we have ~40 columns and potentially millions of rows):
A       | B     | C     | D     | Revision
--------+-------+-------+-------+------------
Item1   | 100   | 200   | 300   | 1
Item1   | 111   | 222   | 333   | 2
Item2   | 200   | 500   | 800   | 1
Item2   | 222   | 555   | 888   | 2
Item2   | 223   | 556   | 889   | 3
Item3   | 300   | 600   | 900   | 1
Item4   | 400   | 700   | 1000  | 1
What I need to be returned:
The returned count: 4
The returned data:
A       | B     | C     | D     | Revision
--------+-------+-------+-------+--------
Item1   | 111   | 222   | 333   | 2
Item2   | 223   | 556   | 889   | 3
Item3   | 300   | 600   | 900   | 1
Item4   | 400   | 700   | 1000  | 1
I've been trying many combinations of $group and $count, and even tried SQL to MongoDB query translation tools, but can't seem to make it work.
Please help and thanks in advance!!
Try the query below:
db.collection.aggregate([
  {
    $group: {
      _id: "$A",
      A: { $first: "$A" },
      B: { $max: "$B" },
      C: { $max: "$C" },
      D: { $max: "$D" },
      Revision: { $max: "$Revision" },
    }
  },
  {
    $project: { _id: 0 },
  },
  {
    $group: {
      _id: "",
      count: { $sum: 1 },
      data: { $push: "$$ROOT" }
    }
  }
]);
In the data field you'll have the actual documents, whereas count is the number of unique values of A in the collection.
Alternatively, if you don't want the result in that shape, you can try the query below:
db.collection.aggregate([
  {
    $group: {
      _id: "$A",
      A: { $first: "$A" },
      B: { $max: "$B" },
      C: { $max: "$C" },
      D: { $max: "$D" },
      Revision: { $max: "$Revision" },
    }
  },
  {
    $project: { _id: 0 },
  }
])
This returns an array of objects: an aggregation always returns a cursor, which resolves to an array in code, so you can call .length on that array to get the total count. This is an alternative to using the second $group stage of the first query just to get the count.
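If B, C, and D do not always grow together with Revision, taking $max of each field separately can combine values from different rows. A common alternative, sketched below, sorts by Revision and keeps the whole first document per item. The helper field name latest and the variable names are assumptions, and the count uses its own small pipeline instead of pushing every document into one array.

```javascript
// Sketch: keep the full document with the highest Revision for each value of A.
const latestPerItem = [
  { $sort: { A: 1, Revision: -1 } },                        // newest revision first within each A
  { $group: { _id: "$A", latest: { $first: "$$ROOT" } } },  // whole document, not per-field maxima
  { $replaceRoot: { newRoot: "$latest" } },                 // unwrap the helper field
  { $sort: { A: 1 } }                                       // optional: order the output by A
];

// Count only: number of distinct values of A, without returning the data.
const countItems = [
  { $group: { _id: "$A" } },
  { $count: "count" }
];

// Assumed usage in mongosh:
// db.collection.aggregate(latestPerItem)
// db.collection.aggregate(countItems)   // a single document of the form { count: <n> }
```

For the sample data in the question, the count pipeline yields 4, one per distinct item.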

Count from mongo of the same query with limit 1 (Node.js)

My query is:
event.find({
    id:{$get:id},
    num:"5"
}).limit(1)
I also want to run the same query and get a count of its results, so one result is the latest document (with limit 1) and the second is the count without the limit.
How do I do that in MongoDB?
One option I was thinking about is to run the same query without the limit, fetch for example 2000 results, and count them in the Node.js code.
My Node code:
result = await this.collection.find({
    'val':obj.val,
    'id' : {$lt: id}
}).sort({id:-1}).limit(1).project({_id:0})
let count = await result.count()
But count always returns 1, whereas the count should ignore the id condition.
Is this possible?
Example:
The request is name=yy and id=3.
1 - { id: 3, name: yy }
2 - { id: 2, name: yy }
The result will be row 2 (3 is greater than 2), with count = 2.
Another example:
1 - { id: 3, name: yy }
2 - { id: 2, name: xx }
The result will be [] with count = 1 (the same query without the id < id condition).
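You can try the aggregation below, using $facet:
// aggregate() returns a cursor, so resolve it with toArray();
// $facet always yields exactly one document.
const [facets] = await this.collection.aggregate([
  { "$facet": {
    "count": [
      { "$match": { "id": { "$lte": id }}},
      { "$count": "count" }
    ],
    "data": [
      { "$match": { "val": obj.val, "id" : id }},
      { "$sort": { "id": -1 }},
      { "$limit": 1 },
      { "$project": { "_id": 0 }}
    ]
  }}
]).toArray()
// $count emits no document when nothing matches, so fall back to 0
const count = facets.count.length ? facets.count[0].count : 0
result = facets.data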
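Going by the question's wording, the count should ignore the id condition while the single result should apply it. A possible variant under that reading is sketched below. The function name findLatestWithCount is hypothetical, and collection is assumed to be a Collection object from the official Node.js driver.

```javascript
// Sketch: one round trip that returns the latest matching document and the total count.
async function findLatestWithCount(collection, val, id) {
  const [facets] = await collection
    .aggregate([
      { $match: { val } },                     // shared filter, applied once
      {
        $facet: {
          count: [{ $count: "count" }],        // counts every document with this val
          data: [
            { $match: { id: { $lt: id } } },   // the id condition only affects this branch
            { $sort: { id: -1 } },
            { $limit: 1 },
            { $project: { _id: 0 } }
          ]
        }
      }
    ])
    .toArray();

  return {
    // $count emits no document when nothing matches, so fall back to 0.
    count: facets.count.length ? facets.count[0].count : 0,
    latest: facets.data[0] ?? null
  };
}
```

A call such as `const { latest, count } = await findLatestWithCount(this.collection, obj.val, id)` would then replace the separate find and count calls.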

How to pass multiple elements in $match (aggregation) in MongoDB

db.tst_col.aggregate( { $match : { abc : {1,2,3} }})
How can I check for multiple values of abc? For example, I need to check whether abc is between 1 and 20 in one statement.
How do I do that?
You can use any query arguments in the aggregation framework like you can with normal queries, so you can do:
db.tst_col.aggregate( [ { $match: { abc: { $gte: 1, $lte: 20 } } } ] );
If you don't want a range, you can do it as follows:
db.tst_col.aggregate( [ { $match: { abc: { $in: [ 1, 4, 12, 17, 20 ] } } } ] );
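Note that the range form also matches non-integer values such as 1.5. If only the whole numbers from 1 to 20 should match, the $in list can be generated rather than typed out. This is a small sketch; the variable names are illustrative.

```javascript
// Sketch: build [1, 2, ..., 20] and use it with $in.
const values = Array.from({ length: 20 }, (_, i) => i + 1);
const pipeline = [{ $match: { abc: { $in: values } } }];

// Assumed usage in mongosh:
// db.tst_col.aggregate(pipeline)
```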