Cumulative SUM with MongoDB 4.2 - mongodb

I am using MongoDB 4.2, the collection is the following:
{ "_id" : 1, "group_id": 1, "amount" : 10 }
{ "_id" : 2, "group_id": 1, "amount" : -5 }
{ "_id" : 3, "group_id": 1, "amount" : 8 }
{ "_id" : 4, "group_id": 2, "amount" : -1 }
{ "_id" : 5, "group_id": 2, "amount" : 7 }
{ "_id" : 6, "group_id": 2, "amount" : -2 }
{ "_id" : 7, "group_id": 3, "amount" : 10 }
{ "_id" : 8, "group_id": 3, "amount" : 15 }
How can I create a running sum of the documents?
EXPECTED RESULT:
Note1 :
The documents must be filtered by _id giving a range for example between _id:2 and _id:7.
The result should be the following:
{ "_id" : 2, "group_id": 1, "total_amount" : -5 }
{ "_id" : 3, "group_id": 1, "total_amount" : 3 } // it was 8, but adding the previous -5 + 8 (current) = 3
{ "_id" : 4, "group_id": 2, "total_amount" : -1 }
{ "_id" : 5, "group_id": 2, "total_amount" : 6 } // it was 7, but adding the previous -1 + 7 = 6
{ "_id" : 6, "group_id": 2, "total_amount" : 4 } // it was -2, but adding the previous 6 - 2 = 4
{ "_id" : 7, "group_id": 3, "total_amount" : 10 }
Note2:
The running sum must be computed per group_id, with the documents sorted by _id (the order matters for the sum, of course).
Only documents with the same group_id should be summed together (ordered by _id).
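The requested result can be sketched in plain JavaScript, for example in application code after fetching the filtered documents with something like db.collection.find({_id: {$gte: 2, $lte: 7}}).sort({_id: 1}). This is only an illustration of the expected computation, not a server-side pipeline; the helper name runningSumByGroup is an assumption, and the sample array stands in for the collection. (The $setWindowFields stage that does this on the server only exists from MongoDB 5.0, so it is not available in 4.2.)

```javascript
// Sample documents copied from the question; a plain array stands in for the collection.
const sampleDocs = [
  { _id: 1, group_id: 1, amount: 10 },
  { _id: 2, group_id: 1, amount: -5 },
  { _id: 3, group_id: 1, amount: 8 },
  { _id: 4, group_id: 2, amount: -1 },
  { _id: 5, group_id: 2, amount: 7 },
  { _id: 6, group_id: 2, amount: -2 },
  { _id: 7, group_id: 3, amount: 10 },
  { _id: 8, group_id: 3, amount: 15 },
];

// Hypothetical helper: running sum of `amount` per group_id,
// restricted to documents with minId <= _id <= maxId.
function runningSumByGroup(documents, minId, maxId) {
  const totals = new Map(); // group_id -> running total so far
  return documents
    .filter((d) => d._id >= minId && d._id <= maxId) // filter() returns a new array
    .sort((a, b) => a._id - b._id)                   // order matters for a running sum
    .map((d) => {
      const total = (totals.get(d.group_id) || 0) + d.amount;
      totals.set(d.group_id, total);
      return { _id: d._id, group_id: d.group_id, total_amount: total };
    });
}

console.log(runningSumByGroup(sampleDocs, 2, 7));
```

Because the _id filter is applied before summing, the document with _id 1 does not contribute to group 1, which matches the expected result in the question.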

Related

Querying for arrays in MongoDB

Suppose in collection I have following documents:
[
{"title": "t1", "fingerprint":[1, 2, 3]},
{"title": "t2", "fingerprint":[4, 5, 6]}
]
I want to find documents in which at least one element of fingerprint equals the element at the same position in my query array.
For example:
query([1, 7, 9]) should return [{"title": "t1", "fingerprint":[1, 2, 3]}]
query([1, 5, 9]) should return [{"title": "t1", "fingerprint":[1, 2, 3]}, {"title": "t2", "fingerprint":[4, 5, 6]}]
but query([5,1,9]) should return no records, because neither record has the same value at any position of the fingerprint array.
How do I write such a query?
If you want to match only documents whose value array contains the sequence [ 1, 2, 3 ] in exactly that order, you can do it this way:
db.testcol.find()
{ "_id" : "first", "value" : [ 1, 2, 3 ] }
{ "_id" : "second", "value" : [ 4, 5, 6 ] }
{ "_id" : "third", "value" : [ 1, 12, 13 ] }
{ "_id" : "fourth", "value" : [ 3, 2, 1 ] }
{ "_id" : "fifth", "value" : [ 1, 12, 13, 2, 3 ] }
{ "_id" : "sixth", "value" : [ 3, 2, 1, 2, 3 ] }
> db.testcol.aggregate([{$addFields:{
cmp: {$in:[
{$literal:[1,2,3]},
{$map: {
input:{$range:[0, {$subtract:[{$size:"$value"},2]}]},
as:"l",
in: {$slice: [ "$value", "$$l", 3] }
}}
]}
}}])
{ "_id" : "first", "value" : [ 1, 2, 3 ], "cmp" : true }
{ "_id" : "second", "value" : [ 4, 5, 6 ], "cmp" : false }
{ "_id" : "third", "value" : [ 1, 12, 13 ], "cmp" : false }
{ "_id" : "fourth", "value" : [ 3, 2, 1 ], "cmp" : false }
{ "_id" : "fifth", "value" : [ 1, 12, 13, 2, 3 ], "cmp" : false }
{ "_id" : "sixth", "value" : [ 3, 2, 1, 2, 3 ], "cmp" : true }
The $addFields stage checks whether [1,2,3] appears in the list of three-element slices of the value array, starting at position 0 and moving forward until two positions before the end.
As you can see, it is now trivial to add a $match stage to filter out documents where cmp is not true.
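To make the sliding-window idea easier to follow, here is a rough plain-JavaScript equivalent of what the $range / $slice / $in combination computes for each document. The function name containsSequence is made up for illustration, and the sketch runs outside MongoDB.

```javascript
// Hypothetical helper mirroring the $range / $slice / $in logic of the pipeline:
// take every slice of `seq.length` consecutive elements and compare it to `seq`.
function containsSequence(value, seq) {
  for (let start = 0; start + seq.length <= value.length; start++) {
    const slice = value.slice(start, start + seq.length);
    if (slice.every((x, i) => x === seq[i])) {
      return true;
    }
  }
  return false;
}

// The same sample arrays as in the shell session.
const testcol = [
  { _id: "first", value: [1, 2, 3] },
  { _id: "second", value: [4, 5, 6] },
  { _id: "third", value: [1, 12, 13] },
  { _id: "fourth", value: [3, 2, 1] },
  { _id: "fifth", value: [1, 12, 13, 2, 3] },
  { _id: "sixth", value: [3, 2, 1, 2, 3] },
];

for (const doc of testcol) {
  console.log(doc._id, containsSequence(doc.value, [1, 2, 3]));
}
```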
You can use the .$index notation to perform such a search.
Example for your query([1, 7, 9])
db.coll.find({$or: [{"fingerprint.0": 1}, {"fingerprint.1": 7 }, {"fingerprint.2": 9}]})
{ "_id" : ObjectId("59170da907e34e73c0c93a9b"), "title" : "t1", "fingerprint" : [ 1, 2, 3 ] }
And query([1, 5, 9])
db.coll.find({$or: [{"fingerprint.0": 1}, {"fingerprint.1": 5 }, {"fingerprint.2": 9}]})
{ "_id" : ObjectId("59170da907e34e73c0c93a9b"), "title" : "t1", "fingerprint" : [ 1, 2, 3 ] }
{ "_id" : ObjectId("59170da907e34e73c0c93a9c"), "title" : "t2", "fingerprint" : [ 4, 5, 6 ] }
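Writing the $or clauses by hand gets tedious for longer arrays, so the filter could be generated from the query array. The following is a minimal sketch; buildPositionalQuery and matchesAtAnyPosition are hypothetical helper names, and the second one only imitates the match locally so the examples from the question can be checked without a database.

```javascript
// Hypothetical helper: build {$or: [{"field.0": v0}, {"field.1": v1}, ...]}.
function buildPositionalQuery(field, values) {
  return { $or: values.map((v, i) => ({ [`${field}.${i}`]: v })) };
}

// Local stand-in for the query: true if any position holds the same value.
function matchesAtAnyPosition(arr, values) {
  return values.some((v, i) => arr[i] === v);
}

const fingerprints = [
  { title: "t1", fingerprint: [1, 2, 3] },
  { title: "t2", fingerprint: [4, 5, 6] },
];

console.log(JSON.stringify(buildPositionalQuery("fingerprint", [1, 7, 9])));
// {"$or":[{"fingerprint.0":1},{"fingerprint.1":7},{"fingerprint.2":9}]}

for (const q of [[1, 7, 9], [1, 5, 9], [5, 1, 9]]) {
  const titles = fingerprints
    .filter((d) => matchesAtAnyPosition(d.fingerprint, q))
    .map((d) => d.title);
  console.log(JSON.stringify(q), "->", JSON.stringify(titles));
}
```

The generated object can be passed to db.coll.find() as in the examples above.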
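The $in operator matches a value against a list of values. Note that it ignores element position, so it also matches documents that contain one of the values at a different index.
If that is acceptable, try executing the following query in the MongoDB shell: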
db.collection.find({fingerprint:{$in:[1,7,9]}})

Mongo db sorting on multikey index fields

I was reading about MongoDB indexes and found that when I create an index on a multikey field and try to sort the results, the behavior is strange.
For example:
> db.testIndexes.find();
{ "_id" : ObjectId("584e6ca8d23d3b48f9cb819d"), "type" : "depart", "item" : "aaa", "ratings" : [ 5, 8, 9 ] }
{ "_id" : ObjectId("584e6cb2d23d3b48f9cb819e"), "type" : "depart", "item" : "aaa", "ratings" : [ 2, 3, 4 ] }
{ "_id" : ObjectId("584e6cbdd23d3b48f9cb819f"), "type" : "depart", "item" : "aaa", "ratings" : [ 10, 6, 1 ] }
db.testIndexes.createIndex({ratings:1});
Now if I use this query:
db.testIndexes.find().sort({ratings:1}).pretty();
Result is like this
{
"_id" : ObjectId("584e6cbdd23d3b48f9cb819f"),
"type" : "depart",
"item" : "aaa",
"ratings" : [
10,
6,
1
]
}
{
"_id" : ObjectId("584e6cb2d23d3b48f9cb819e"),
"type" : "depart",
"item" : "aaa",
"ratings" : [
2,
3,
4
]
}
{
"_id" : ObjectId("584e6ca8d23d3b48f9cb819d"),
"type" : "depart",
"item" : "aaa",
"ratings" : [
5,
8,
9
]
}
and for query
db.testIndexes.find().sort({ratings:-1}).pretty();
Results are:
{
"_id" : ObjectId("584e6cbdd23d3b48f9cb819f"),
"type" : "depart",
"item" : "aaa",
"ratings" : [
10,
6,
1
]
}
{
"_id" : ObjectId("584e6ca8d23d3b48f9cb819d"),
"type" : "depart",
"item" : "aaa",
"ratings" : [
5,
8,
9
]
}
{
"_id" : ObjectId("584e6cb2d23d3b48f9cb819e"),
"type" : "depart",
"item" : "aaa",
"ratings" : [
2,
3,
4
]
}
The results do not seem to follow any order, so can anyone explain how MongoDB is sorting them?
Thanks
Virendra
Well it does seem like the results are not following any order but actually they are. In your first sort {ratings:1}, what's happening here is the results are ordered by the smallest element in ratings. Since these are your lists:
[ 10, 6, 1 ] [ 2, 3, 4 ] [ 5, 8, 9 ]
The smallest element of [ 10, 6, 1 ] is 1, the smallest element of [ 2, 3, 4 ] is 2, and the smallest element of [ 5, 8, 9 ] is 5, so the results are ordered that way.
When you sort in descending order, the same thing happens, but the documents are ordered by the largest element in ratings.
Hope this helps.
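The rule described in this answer (smallest element for an ascending sort, largest element for a descending sort) can be reproduced outside the database. The sketch below is plain JavaScript on a copy of the three ratings arrays; it only imitates the ordering rule and does not call MongoDB.

```javascript
// The three ratings arrays from the question, in insertion order.
const ratingDocs = [
  { ratings: [5, 8, 9] },
  { ratings: [2, 3, 4] },
  { ratings: [10, 6, 1] },
];

// Ascending: compare documents by their smallest array element.
const ascending = [...ratingDocs].sort(
  (a, b) => Math.min(...a.ratings) - Math.min(...b.ratings)
);

// Descending: compare documents by their largest array element.
const descending = [...ratingDocs].sort(
  (a, b) => Math.max(...b.ratings) - Math.max(...a.ratings)
);

console.log(JSON.stringify(ascending.map((d) => d.ratings)));
// [[10,6,1],[2,3,4],[5,8,9]]
console.log(JSON.stringify(descending.map((d) => d.ratings)));
// [[10,6,1],[5,8,9],[2,3,4]]
```

Both orders agree with the shell output shown in the question.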

How do I calculate a field in all documents based on a value of a particular document in the same collection?

I am new to MongoDB and am trying to do some basic calculations in it. I have a collection, calc, as below:
{ "_id" : 1, "value" : 10}
{ "_id" : 2, "value" : 20}
{ "_id" : 3, "value" : 20}
{ "_id" : 4, "value" : 30}
{ "_id" : 5, "value" : 30}
{ "_id" : 6, "value" : 30}
I want to add the value of "_id":1 to the value field of every document in that collection and store the result in a new field. So the final result I am looking for is as below.
{ "_id" : 1, "value" : 10, "sumup":20 }
{ "_id" : 2, "value" : 20, "sumup":30 }
{ "_id" : 3, "value" : 20, "sumup":30 }
{ "_id" : 4, "value" : 30, "sumup":40 }
{ "_id" : 5, "value" : 30, "sumup":40 }
{ "_id" : 6, "value" : 30, "sumup":40 }
You could try this in mongo shell:
db.collection.aggregate([
{
"$project": {
"value": 1,
"sumup": {
"$add": [ "$value", (db.collection.findOne({"_id": 1})).value ]
}
}
}
])
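Note that in this approach the shell evaluates the findOne() call first and embeds the resulting number as a constant in the pipeline before it is sent to the server. The two steps can be sketched in plain JavaScript as follows; the calcDocs array is a stand-in for the collection and nothing here talks to MongoDB.

```javascript
// Stand-in for the calc collection from the question.
const calcDocs = [
  { _id: 1, value: 10 },
  { _id: 2, value: 20 },
  { _id: 3, value: 20 },
  { _id: 4, value: 30 },
  { _id: 5, value: 30 },
  { _id: 6, value: 30 },
];

// Step 1: look up the reference document once, like findOne({"_id": 1}).value.
const baseValue = calcDocs.find((d) => d._id === 1).value;

// Step 2: add that constant to every document's value, like $add inside $project.
const withSumup = calcDocs.map((d) => ({ ...d, sumup: d.value + baseValue }));

console.log(withSumup);
```

If the document with _id 1 changes later, the pipeline has to be rebuilt, because the constant was captured at the time of the lookup.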

How to aggregate to get every combination of two users per movie grouping key?

Here is my collection:
{ "user" : 1, "rate" : 1, "movie" : 1}
{ "user" : 1, "rate" : 3, "movie" : 3}
{ "user" : 1, "rate" : 2, "movie" : 4}
{ "user" : 1, "rate" : 3, "movie" : 5}
{ "user" : 2, "rate" : 4, "movie" : 1}
{ "user" : 2, "rate" : 2, "movie" : 3}
{ "user" : 2, "rate" : 5, "movie" : 6}
{ "user" : 3, "rate" : 1, "movie" : 3}
Here is the result I want get:
{ "user1" : 1, "rate1" : 1,"user2" : 2, "rate2" : 4, "movie" : 1}
{ "user1" : 1, "rate1" : 3,"user2" : 2, "rate2" : 2, "movie" : 3}
{ "user1" : 1, "rate1" : 3,"user2" : 3, "rate2" : 1, "movie" : 3}
{ "user1" : 2, "rate1" : 2,"user2" : 3, "rate2" : 1, "movie" : 3}
That is, for every "movie" that has more than one "user", I want every combination of the "user" and "rate" values as a pair.
You are looking for the "combinations" when grouping on "movie" for two or more "user" values. It's not the sort of thing you really get from a "database" query.
But if you $group on "movie" and $push the other data, you can then filter out any content with less than two entries in that array and work out your "possible combinations" from there. So you can ask the database to do the grouping and filtering, but the rest is for algorithms in "set theory".
So this is "part" aggregation statement and "part" processing of the result in code:
db.movies.aggregate([
{ "$group": {
"_id": "$movie",
"people": {
"$push": {
"user": "$user",
"rate": "$rate"
}
}
}},
{ "$redact": {
"$cond": {
"if": { "$gt": [{ "$size": "$people" }, 1] },
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
{ "$sort": { "_id": 1 } }
]).forEach(function(doc) {
var n = doc.people.length;
var i,j;
for (i = 0; i < n; i++) {
for (j = i + 1; j < n; j++) {
printjson({
"user1": doc.people[i].user,
"rate1": doc.people[i].rate,
"user2": doc.people[j].user,
"rate2": doc.people[j].rate,
"movie": doc._id
})
}
}
})
So the aggregation part itself first does a $group on the "movie" values as mentioned and creates the array with $push. Since not all results are going to have an array with more than one entry, you then remove those with $redact. This is a "logical filter" that uses $size to compare the generated array and see if it has "more than" ( $gt ) one entry.
The results at this stage look like this, after also applying a $sort:
{
"_id" : 1,
"people" : [
{
"user" : 1,
"rate" : 1
},
{
"user" : 2,
"rate" : 4
}
]
}
{
"_id" : 3,
"people" : [
{
"user" : 1,
"rate" : 3
},
{
"user" : 2,
"rate" : 2
},
{
"user" : 3,
"rate" : 1
}
]
}
The next part is really up to an "algorithm" to generate the possible "pair" combinations. It's a pretty common and well known approach, so you just run the loops on the arrays of each document returned in response to produce the result:
{ "user1" : 1, "rate1" : 1, "user2" : 2, "rate2" : 4, "movie" : 1 }
{ "user1" : 1, "rate1" : 3, "user2" : 2, "rate2" : 2, "movie" : 3 }
{ "user1" : 1, "rate1" : 3, "user2" : 3, "rate2" : 1, "movie" : 3 }
{ "user1" : 2, "rate1" : 2, "user2" : 3, "rate2" : 1, "movie" : 3 }

Mongo Query question $gt,$lt

I have the query below. I want to get items between 4 and 6, so only a:1 should match because it has the value 5 in b.
> db.test.find({ b : { $gt : 4 }, b: {$lt : 6}});
{ "_id" : ObjectId("4d54cff54364000000004331"), "a" : 1, "b" : [ 2, 3, 4, 5 ] }
{ "_id" : ObjectId("4d54d0074364000000004332"), "a" : 2, "b" : [ 2, 4, 6, 8 ] }
>
Can someone tell me why a:2 is matching this query? I can't really see why it is being returned.
I also tried what was specified in the tutorial, but it did not seem to work:
> db.test.find({ b : { $gt : 4, $lt : 6}});
{ "_id" : ObjectId("4d54cff54364000000004331"), "a" : 1, "b" : [ 2, 3, 4, 5 ] }
{ "_id" : ObjectId("4d54d0074364000000004332"), "a" : 2, "b" : [ 2, 4, 6, 8 ] }
>
And this one to avoid any confusion regarding GT/GTE
> db.test.find({b: {$gt: 4.5, $lt: 5.5}});
{ "_id" : ObjectId("4d54cff54364000000004331"), "a" : 1, "b" : [ 2, 3, 4, 5 ] }
{ "_id" : ObjectId("4d54d0074364000000004332"), "a" : 2, "b" : [ 2, 4, 6, 8 ] }
>
Only a:1 should be returned.
As suggested, I gave $elemMatch a try but it did not appear to work either (objectIds are different because I am on a different machine)
> db.test.find();
{ "_id" : ObjectId("4d5a24a5e82e00000000433f"), "a" : 1, "b" : [ 2, 3, 4, 5 ] }
{ "_id" : ObjectId("4d5a24bbe82e000000004340"), "a" : 2, "b" : [ 2, 4, 6, 8 ] }
> db.test.find({b: {$elemMatch: {$gt : 4, $lt: 6 }}});
>
No documents were returned.
This is a really confusing topic. I work at 10gen and I had to spend a while wrapping my head around it ;)
Let's walk through how the query engine processes this query.
Here's the query again:
> db.test.find({ b : { $gt : 4, $lt : 6}});
When it gets to the record that seems like it shouldn't match...
{ "_id" : ObjectId("4d54d0074364000000004332"), "a" : 2, "b" : [ 2, 4, 6, 8 ] }
The match is not performed against each element of the array, but rather against the array as a whole.
The comparison is performed in three steps:
Step 1: Find all documents where b has a value greater than 4
b: [2,4,6,8] matches because 6 & 8 are greater than 4
Step 2: Find all documents where b has a value less than 6
b: [2,4,6,8] matches because 2 & 4 are less than 6
Step 3: Find the set of documents that matched in both step 1 & 2.
The document with b: [2,4,6,8] matched both steps 1 & 2 so it is returned as a match. Note that results are also de-duplicated in this step, so the same document won't be returned twice.
If you want your query to apply to the individual elements of the array, rather than the array as a whole, you can use the $elemMatch operator. For example
> db.temp.find({b: {$elemMatch: {$gt: 4, $lt: 5}}})
> db.temp.find({b: {$elemMatch: {$gte: 4, $lt: 5}}})
{ "_id" : ObjectId("4d558b6f4f0b1e2141b66660"), "b" : [ 2, 3, 4, 5, 6 ] }
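The difference between the two behaviours can be imitated in plain JavaScript. This is a simplified model of the semantics described above, not the query engine itself; matchesPlainRange and matchesElemMatch are made-up names.

```javascript
// Model of {b: {$gt: lower, $lt: upper}} on an array field:
// each condition may be satisfied by a different element.
function matchesPlainRange(b, lower, upper) {
  return b.some((e) => e > lower) && b.some((e) => e < upper);
}

// Model of {b: {$elemMatch: {$gt: lower, $lt: upper}}}:
// a single element has to satisfy both conditions.
function matchesElemMatch(b, lower, upper) {
  return b.some((e) => e > lower && e < upper);
}

// The two arrays from the question.
const arrayA1 = [2, 3, 4, 5];
const arrayA2 = [2, 4, 6, 8];

console.log(matchesPlainRange(arrayA1, 4, 6), matchesPlainRange(arrayA2, 4, 6)); // true true
console.log(matchesElemMatch(arrayA1, 4, 6), matchesElemMatch(arrayA2, 4, 6));   // true false
```

In this model [2, 4, 6, 8] passes the plain range because 6 > 4 and 2 < 6, but fails the element-wise check because no single element lies strictly between 4 and 6.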
$gt
Syntax: {field: {$gt: value} }
eg:
db.inventory.find( { qty: { $gt: 20 } } )
$lt
Syntax: {field: {$lt: value} }
eg:
db.inventory.find( { qty: { $lt: 20 } } )
eg2:
db.inventory.find({ qty : { $gt : 20, $lt : 60}});
db.test.find( {$and:[ {b:{$gt:4}}, {b:{$lt:6}} ]} )
Below is a more detailed walkthrough to aid understanding. First, the test data:
db.test.insertMany([
{"_id":1, "x":11, "a":1, "b":[1]},
{"_id":2, "x":15, "a":4, "b":[1,2,3]},
{"_id":3, "x":19, "a":5, "b":[1,2,3,4,5]},
{"_id":4, "x":13, "a":6, "b":[6,8,10]},
{"_id":5, "x":16, "a":13, "b":[11]},
{"_id":6, "x":18, "a":11, "b":[5]},
{"_id":7, "x":15, "a":15, "b":[3,5,7]},
{"_id":8, "x":12, "a":18, "b":[3,7,9]},
{"_id":9, "x":14, "a":21, "b":[4,6]}
]);
The queries below are included to make the comparison clear:
Query-1: db.test.find({b: {$lt: 6}}); //(any element of b) < 6
{ "_id" : 1, "x" : 11, "a" : 1, "b" : [ 1 ] }
{ "_id" : 2, "x" : 15, "a" : 4, "b" : [ 1, 2, 3 ] }
{ "_id" : 3, "x" : 19, "a" : 5, "b" : [ 1, 2, 3, 4, 5 ] }
{ "_id" : 6, "x" : 18, "a" : 11, "b" : [ 5 ] }
{ "_id" : 7, "x" : 15, "a" : 15, "b" : [ 3, 5, 7 ] }
{ "_id" : 8, "x" : 12, "a" : 18, "b" : [ 3, 7, 9 ] }
{ "_id" : 9, "x" : 14, "a" : 21, "b" : [ 4, 6 ] }
Query-2: db.test.find({b: {$gt: 4}, b:{$lt : 6}});// it is translated to db.test.find({b:{$lt : 6}}); hence the outcome of Query-1 and Query-2 is the same.
{ "_id" : 1, "x" : 11, "a" : 1, "b" : [ 1 ] }
{ "_id" : 2, "x" : 15, "a" : 4, "b" : [ 1, 2, 3 ] }
{ "_id" : 3, "x" : 19, "a" : 5, "b" : [ 1, 2, 3, 4, 5 ] }
{ "_id" : 6, "x" : 18, "a" : 11, "b" : [ 5 ] }
{ "_id" : 7, "x" : 15, "a" : 15, "b" : [ 3, 5, 7 ] }
{ "_id" : 8, "x" : 12, "a" : 18, "b" : [ 3, 7, 9 ] }
{ "_id" : 9, "x" : 14, "a" : 21, "b" : [ 4, 6 ] }
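The reason Query-2 collapses into Query-1 is ordinary JavaScript object behaviour: when an object literal repeats a key, the last value wins, so the $gt condition is already gone before the shell sends the query. A small check, runnable in Node without MongoDB (the variable name is arbitrary):

```javascript
// An object literal with the key "b" written twice.
const duplicateKeyQuery = { b: { $gt: 4 }, b: { $lt: 6 } };

// Only the last "b" survives; the $gt condition is silently dropped.
console.log(JSON.stringify(duplicateKeyQuery)); // {"b":{"$lt":6}}
console.log(Object.keys(duplicateKeyQuery).length); // 1
```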
Query-3: db.test.find({b: {$gt: 4, $lt: 6}});
{ "_id" : 3, "x" : 19, "a" : 5, "b" : [ 1, 2, 3, 4, 5 ] } // (element 5) > 4 and (element 5) < 6 => the matching element is the same, element 5
{ "_id" : 6, "x" : 18, "a" : 11, "b" : [ 5 ] } // (element 5) > 4 and (element 5) < 6 => the matching element is the same, element 5
{ "_id" : 7, "x" : 15, "a" : 15, "b" : [ 3, 5, 7 ] } // (element 5) > 4 and (element 5) < 6 => the matching element is the same, element 5
{ "_id" : 8, "x" : 12, "a" : 18, "b" : [ 3, 7, 9 ] } // (element 7) > 4 and (element 3) < 6 => the matching elements are different, i.e. element 7 and element 3
{ "_id" : 9, "x" : 14, "a" : 21, "b" : [ 4, 6 ] } // (element 6) > 4 and (element 4) < 6 => the matching elements are different, i.e. element 6 and element 4
Query-4: db.test.find({b: {$elemMatch: {$gt : 4, $lt: 6 }}});
{ "_id" : 3, "x" : 19, "a" : 5, "b" : [ 1, 2, 3, 4, 5 ] } // (element 5) > 4 and (element 5) < 6
{ "_id" : 6, "x" : 18, "a" : 11, "b" : [ 5 ] } // (element 5) > 4 and (element 5) < 6
{ "_id" : 7, "x" : 15, "a" : 15, "b" : [ 3, 5, 7 ] } // (element 5) > 4 and (element 5) < 6
Query-3 and Query-4 are the interesting ones to compare.
Query-3: Lists documents whose array b has an element x > 4 and an element y < 6. The elements x and y may be the same or different.
Query-4: Lists documents whose array b has an element x > 4 and an element y < 6. The elements x and y must be the same element.
Because you did not check the documentation.
See
http://www.mongodb.org/display/DOCS/Advanced+Queries
and check for "ranges" on the page.
Neither is your query syntax correct (compare against the example)
nor does your "why a:2" part of the question make any sense since 'a' is not involved in your query. If you want to search for a:1 then you have to include it in your query.
Keep in mind that all query clauses are AND combined by default unless you use the $or operator.