MongoDB: find the first X documents whose rolling sum of a field reaches a certain value

I have a mongoDB collection of documents like this:
{
"_id": 1,
"size": 10,
"name": "ABCD"
}
I would like to:
Sort them by "name" in ascending order
Return however many of the first documents from the result are needed for their cumulative "size" to be greater than or equal to 100
I have briefly looked into the $redact stage of the aggregation framework, but I can't figure out whether I can store the cumulative sum outside the document. What would be the best approach to solve this problem?
EDIT:
An example collection:
{ "name": "AAAA", "size": 2}
{ "name": "BBBB", "size": 4}
{ "name": "CCCC", "size": 3}
So the query would be designed to return the first X documents, in order of their appearance, when their cumulative size reaches 6.
So output will be (because 2+4 is 6):
{ "name": "AAAA", "size": 2}
{ "name": "BBBB", "size": 4}
The only thing I can think of is to use a cursor at the application level and keep adding documents to the result set, incrementing a "size" counter by the value in each document. But is there a way to do that using the aggregation framework, for example?
EDIT2:
I also came across the 'rolling sum' terminology and using map-reduce. Sadly, in my case I would want the map-reduce operation to terminate when a global scope variable gets to or over a certain value, and I don't think it's possible (mapReduce will go over all documents fed to it at the outset).
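One way to express the rolling sum inside the aggregation framework is a window function. The sketch below assumes MongoDB 5.0 or later (where $setWindowFields is available); the collection name items, the field name runningSize and the helper takeUntilSum are made up for illustration. The plain JavaScript function applies the same cut-off rule to an in-memory array, which is also what an application-level cursor loop would do.

```javascript
// Threshold from the small example in the question (the real case uses 100).
const THRESHOLD = 6;

// Aggregation sketch (assumes MongoDB 5.0+ for $setWindowFields).
// "runningSize" is an illustrative field name, not part of the original data.
const pipeline = [
  {
    $setWindowFields: {
      sortBy: { name: 1 },
      output: {
        runningSize: {
          $sum: "$size",
          window: { documents: ["unbounded", "current"] }
        }
      }
    }
  },
  // Keep a document while the sum *before* it is still below the threshold,
  // so the document that crosses the threshold is included.
  {
    $match: {
      $expr: { $lt: [{ $subtract: ["$runningSize", "$size"] }, THRESHOLD] }
    }
  }
];
// In the shell this would be run as: db.items.aggregate(pipeline)

// The same rule in plain JavaScript, e.g. while iterating a sorted cursor.
function takeUntilSum(docs, threshold) {
  const result = [];
  let total = 0;
  for (const doc of docs) {
    if (total >= threshold) break; // threshold already reached: stop early
    result.push(doc);
    total += doc.size;
  }
  return result;
}

const sample = [
  { name: "AAAA", size: 2 },
  { name: "BBBB", size: 4 },
  { name: "CCCC", size: 3 }
];
console.log(takeUntilSum(sample, THRESHOLD));
```

Note that the pipeline still computes the running sum for every document it is given, so only the cursor loop actually stops reading early. The sketch also assumes non-negative sizes.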

Related

Cursor-based pagination without `skip()` based on frequently dynamically updated field without skipping documents

The context
I have a MongoDB collection, items, that looks like this:
{
"_id": ObjectId(...),
"score": 42,
"data": "some text"
},
{
"_id": ObjectId(...),
"score": 95,
"data": "some text"
},
{
"_id": ObjectId(...),
"score": 1841,
"data": "some text"
},
{
"_id": ObjectId(...),
"score": 11,
"data": "some text"
},
It has potentially 50,000+ documents inside it, where the score field changes dynamically very frequently (it's a vote tally that records users' upvotes and downvotes).
What I need to do
I'm trying to infinitely paginate through this collection, sorting documents by the highest score, loading them sequentially, highest score to lowest, likely in bunches of ~25 at a time.
The only current way I know how
Use skip to provide an offset based on the last document I've loaded each call to the database, and only load new documents that have a score less than the last document's. The downside to this is that if I have multiple documents with the same score as the last seen one, I'd skip them when I only load new ones with a score less than the last seen one.
Additionally, I've read using skip() is extremely inefficient.
Conclusion
Do I have to use this inefficient solution, that would also result in me skipping documents?
Is there a better way?
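A common alternative to skip() is to page by the sort key itself, using _id as a tie-breaker so that documents sharing a score are not dropped. The sketch below is only an illustration under assumptions: the helper names nextPageFilter and fetchPage are hypothetical, and the in-memory data uses numeric _id values instead of ObjectIds so it can run outside the database.

```javascript
// Sort order for every page: highest score first, _id as a tie-breaker.
const PAGE_SORT = { score: -1, _id: -1 };

// Build the filter for the page that follows `last` (the final document of
// the previous page). Pass null for the very first page.
function nextPageFilter(last) {
  if (!last) return {};
  return {
    $or: [
      { score: { $lt: last.score } },
      { score: last.score, _id: { $lt: last._id } }
    ]
  };
}
// In the shell: db.items.find(nextPageFilter(last)).sort(PAGE_SORT).limit(25)

// In-memory stand-in for the query above, to show that ties are not skipped.
function fetchPage(docs, last, pageSize) {
  return docs
    .filter(d => !last || d.score < last.score ||
                 (d.score === last.score && d._id < last._id))
    .sort((a, b) => b.score - a.score || b._id - a._id)
    .slice(0, pageSize);
}

const items = [
  { _id: 1, score: 42 }, { _id: 2, score: 95 }, { _id: 3, score: 42 },
  { _id: 4, score: 11 }, { _id: 5, score: 42 }
];
const page1 = fetchPage(items, null, 2);
const page2 = fetchPage(items, page1[page1.length - 1], 2);
console.log(page1, page2);
```

This does not address scores that change between requests: a document whose score moves across the last-seen position can still appear twice or be missed. A compound index on { score: -1, _id: -1 } would probably be needed for the query to stay cheap.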

MongoDB: returning documents in order until a condition match

In a MongoDB collection, I have documents with a "position" field for ordering and an optional "date" field, e.g.
[
{
"_id": "doc1",
"position": 1
},
{
"_id": "doc2",
"position": 2,
"date": "2021-05-20T08:00:00.000Z"
},
{
"_id": "doc3",
"position": 3
},
{
"_id": "doc4",
"position": 4,
"date": "2021-05-20T08:00:00.000Z"
}
]
I would like to query this collection to get the documents "before" a specified date, in position order. The algorithm would be:
find the first element whose date is "after" the specified date
return all the documents whose position is less than the position of the element found, sorted by "position"
I have implemented this algorithm naïvely with 2 independent queries. However, I suspect it can be done with a single call to the database, but I have no idea how to proceed. Maybe with an aggregation pipeline?
Can someone give me a clue how this can be done?
EDIT: Here are the current queries I use (roughly):
limit_element = db.getCollection('collection').find({
"date": { "$gte": ISODate("2021-05-20T08:00:00.000Z") }
}).sort({
"position": 1
}).limit(1).next() // next() returns the document itself instead of the cursor
position = limit_element['position']
elements = db.getCollection('collection').find({
"position": { "$lt": position }
}).sort({
"position": 1
})
You can use an aggregation pipeline with two match clauses. Essentially it's the same thing as you do now, but within one DB access, so a bit faster. With aggregation you can access results from the previous stage to use in the next stage. Whether that is worth it is for you to decide. I think your naive approach is sensible. In any case this is a conditional problem, so you will have to first find one and then do the other. The difference is just where you do the steps.
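A single-call variant can be sketched with a window function rather than two match clauses. This is not the pipeline the answer describes; it is one possible shape, assuming MongoDB 5.0+ and that "date" is stored as a BSON Date. The names beforeCutoffPipeline, takeBeforeCutoff and hits are hypothetical.

```javascript
// Cut-off date from the question.
const cutoff = new Date("2021-05-20T08:00:00.000Z");

// Aggregation sketch. "hits" is a running count, in position order, of
// documents whose date is on or after the cut-off.
function beforeCutoffPipeline(cutoffDate) {
  return [
    {
      $setWindowFields: {
        sortBy: { position: 1 },
        output: {
          hits: {
            $sum: { $cond: [{ $gte: ["$date", cutoffDate] }, 1, 0] },
            window: { documents: ["unbounded", "current"] }
          }
        }
      }
    },
    // Documents before the first hit still have a running count of 0.
    { $match: { hits: 0 } },
    { $sort: { position: 1 } },
    { $unset: "hits" }
  ];
}
// In the shell:
//   db.getCollection('collection').aggregate(beforeCutoffPipeline(cutoff))

// The same rule in plain JavaScript, for checking the logic without a server.
function takeBeforeCutoff(docs, cutoffDate) {
  const sorted = [...docs].sort((a, b) => a.position - b.position);
  const firstHit = sorted.findIndex(d => d.date && d.date >= cutoffDate);
  return firstHit === -1 ? sorted : sorted.slice(0, firstHit);
}

const docs = [
  { _id: "doc1", position: 1 },
  { _id: "doc2", position: 2, date: new Date("2021-05-20T08:00:00.000Z") },
  { _id: "doc3", position: 3 },
  { _id: "doc4", position: 4, date: new Date("2021-05-20T08:00:00.000Z") }
];
console.log(takeBeforeCutoff(docs, cutoff).map(d => d._id));
```

Unlike the two-query version, this returns every document when no date reaches the cut-off, which may or may not be the desired behaviour.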

Is there a way to find all items that fit a mathematical equation?

I want to create a query that will return every document that fits a given mathematical equation.
My goal is, given a document's id, to return every document whose value, bitwise ANDed with the given document's value, is greater than 0.
For example, if this is the DB:
[
{
"_id": 1,
"value": 24
},
{
"_id": 2,
"value": 32
},
{
"_id": 3,
"value": 56
},
]
Given the id 1, I want to return only 3.
If this is impossible in MongoDB, I would like recommendations for a database that supports this operation.
If you use map/reduce you could probably perform the calculations on the server side. You'll still be performing full collection scans, though; rephrasing your formula to produce the _id you want will give you faster queries.
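Reading the example as a bitwise AND (24 & 32 is 0, 24 & 56 is 24), the $bitsAnySet query operator looks like a fit without map/reduce: it matches when any bit of the given mask is set in the field. The sketch below assumes non-negative integer values; the collection name items and the helper names are placeholders.

```javascript
// Build the filter for "value AND given.value is non-zero", excluding the
// given document itself. Assumes non-negative integer values.
function overlappingBitsFilter(given) {
  return {
    _id: { $ne: given._id },
    value: { $bitsAnySet: given.value }
  };
}
// In the shell (collection name "items" is a placeholder):
//   const given = db.items.findOne({ _id: 1 });
//   db.items.find(overlappingBitsFilter(given));

// Plain JavaScript equivalent of the same test, for checking the example.
function overlappingBits(docs, id) {
  const given = docs.find(d => d._id === id);
  return docs.filter(d => d._id !== id && (d.value & given.value) !== 0);
}

const docs = [
  { _id: 1, value: 24 },
  { _id: 2, value: 32 },
  { _id: 3, value: 56 }
];
console.log(overlappingBits(docs, 1).map(d => d._id));
```

This still takes two round trips (one to read the given document, one to run the filter), and it only covers this particular formula.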

Query object with max field on MongoDB

I am new to MongoDB and I use Atlas & Charts in order to query and visualize the results.
I want to create a graph that shows the max amount of money every day, and indicate the person with the max amount of money.
For example, if my collection contains the following documents:
{"date": "15-12-2020", "name": "alice", "money": 7}
{"date": "15-12-2020", "name": "bob", "money": 9}
{"date": "16-12-2020", "name": "alice", "money": 39}
{"date": "16-12-2020", "name": "bob", "money": 25}
What query should I put in the query box (in "Charts") in order to create a graph with the following result?
date | max_money | the_person_with_max_money
15-12-2020 | 9 | bob
16-12-2020 | 39 | alice
You have to use an aggregation, and I think this should work.
First of all $sort values by money (I'll explain later why).
And then use $group to group values by date.
The query looks like this:
db.collection.aggregate([
{
"$sort": { "money": -1 }
},
{
"$group": {
"_id": "$date",
"max_money": { "$max": "$money" },
"the_person_with_max_money": { "$first": "$name" }
}
}
])
How does this work? The "problem" with $group is that you can't carry values through to the next stage unless you use an accumulator, so the best way seems to be to use $first to get the first name.
This is why the documents are sorted by money in descending order: the name with the greatest money value comes first.
Sorting therefore ensures that the first value is the one you want.
Then $group groups the documents with the same date and creates the fields max_money and the_person_with_max_money.
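The sort-then-take-first idea can be checked outside the database. The following is only a plain JavaScript illustration of what the two stages do to the sample documents; the function name maxPerDate is made up.

```javascript
// After sorting by money descending, the first document seen for each date
// is the one with the maximum money for that date.
function maxPerDate(docs) {
  const sorted = [...docs].sort((a, b) => b.money - a.money);
  const groups = new Map();
  for (const doc of sorted) {
    if (!groups.has(doc.date)) {
      groups.set(doc.date, {
        _id: doc.date,
        max_money: doc.money,
        the_person_with_max_money: doc.name
      });
    }
  }
  return [...groups.values()];
}

const docs = [
  { date: "15-12-2020", name: "alice", money: 7 },
  { date: "15-12-2020", name: "bob", money: 9 },
  { date: "16-12-2020", name: "alice", money: 39 },
  { date: "16-12-2020", name: "bob", money: 25 }
];
console.log(maxPerDate(docs));
```

In the real pipeline the order of the groups is not guaranteed, so a final $sort on _id may be wanted. With dates stored as "DD-MM-YYYY" strings, that sort would be lexicographic rather than chronological.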

Is updating Embedded Documents in MongoDB a Manual process?

I am not overly familiar with MongoDB yet, but I have a question about embedded documents.
I have seen a number of posts which show you how to update embedded documents through some update query.
My question is this: if I have a collection with embedded documents, which is denormalised for performance, and one of the embedded documents changes, do I need to manually update all the embedded documents, or is there some way of specifying the link in MongoDB to auto-update?
For Example:
An Order record might look like the structure below. Note there is a Product item in one of the rows.
Let's say the itemName field changed to "Product1a" in the product from a different collection and I want to update the product in every single order where this exists. Is that a manual process, or is there a way of setting it up in MongoDB to auto-update embedded documents?
{
"id": "ccc1beb1-e022-11e9-97f0-e7e789106ab2",
"type": "order",
"orderNumber": "ORD-100209857x",
"orderDate": "2019-09-26T17:42:31.000+12:00",
"orderItems": [
{
"discount": 0,
"price": 24.4944,
"product": {
"id": "ccc1beb1-e022-11e9-97f0-e7e789106ab2",
"itemNumber": "prd1",
"itemName": "Product1"
},
"qty": 4,
"rowTotal": 97.96,
"taxAmount": 9.8
},
{
"discount": 0,
"price": 3.21,
"itemName": "Shipping",
"qty": 1,
"rowTotal": 3.21,
"taxAmount": 0
}
]
}
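Not sure what you mean by manual process, but here is some sample code to update the product name in every order that contains it. Because orderItems is an array, the path needs a filtered positional operator (MongoDB 3.6+):
db.collection.updateMany({"orderItems.product.itemNumber": "prd1"}, {$set: {"orderItems.$[item].product.itemName": "updatedProductName"}}, {arrayFilters: [{"item.product.itemNumber": "prd1"}]})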
Let me know if this is not what you are looking for.
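As far as I know, MongoDB has no built-in link that keeps denormalised copies in sync, so the application has to issue the update itself whenever the product changes. A minimal sketch of a helper that builds such an update is below; it assumes MongoDB 3.6+ for arrayFilters, and the helper name renameEmbeddedProduct and the collection name orders are placeholders.

```javascript
// Build the arguments for updateMany() that rename one product inside every
// order that embeds it. "item" is an arbitrary arrayFilters identifier.
function renameEmbeddedProduct(productId, newName) {
  return {
    filter: { "orderItems.product.id": productId },
    update: { $set: { "orderItems.$[item].product.itemName": newName } },
    options: { arrayFilters: [{ "item.product.id": productId }] }
  };
}
// In the shell (collection name "orders" is a placeholder):
//   const u = renameEmbeddedProduct("ccc1beb1-e022-11e9-97f0-e7e789106ab2", "Product1a");
//   db.orders.updateMany(u.filter, u.update, u.options);

const u = renameEmbeddedProduct("ccc1beb1-e022-11e9-97f0-e7e789106ab2", "Product1a");
console.log(JSON.stringify(u, null, 2));
```

The array filter restricts the change to order rows that actually embed the product, so rows such as the "Shipping" line are left untouched.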