What pipeline can I use to select documents until I hit a sum of 180, and get those documents' _ids? Below is a sample of the data that I've filtered out already. In this case it should select the first two items.
[
{
"_id": "6048b2b190422d0066d90740",
"Code": "A0ABI61YH",
"Amount": 100
},
{
"_id": "6048b3cc7e4b350072424f4c",
"Code": "A0ABEAXX6",
"Amount": 100
},
{
"_id": "6048b5167e4b350072424f50",
"Code": "A0ABCENPD",
"Amount": 100
}
]
I don't think there is any straightforward way to achieve this. If you really want to, you can try the approach below, but it will only work when your data size is below 16MB, because we are going to group all your documents into an array inside a single document; second, this may cause performance issues.
$group by null to collect the required fields (_id, Amount) from all documents into a result array
$reduce to iterate over the result array
initialValue declares the initial value: Amount is 0 and _ids is []
in checks a condition: if the accumulated Amount is less than 180, it concatenates the current object's _id onto the accumulated _ids using $concatArrays and adds the current object's Amount to the accumulated Amount; otherwise it returns the accumulated value unchanged
db.collection.aggregate([
{
$group: {
_id: null,
result: {
$push: {
_id: "$_id",
Amount: "$Amount"
}
}
}
},
{
$project: {
_id: 0,
result: {
$reduce: {
input: "$result",
initialValue: { Amount: 0, _ids: [] },
in: {
$cond: [
{ $lt: ["$$value.Amount", 180] },
{
_ids: { $concatArrays: ["$$value._ids", ["$$this._id"]] },
Amount: { $sum: ["$$value.Amount", "$$this.Amount"] }
},
"$$value"
]
}
}
}
}
}
])
Playground
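For clarity, here is the same accumulation logic as plain JavaScript, using the sample documents from the question. This is only a sketch of what the $reduce stage computes, not something you would run against the database:

```javascript
// Mirror of the $reduce stage: collect _ids until the running Amount reaches 180.
const docs = [
  { _id: "6048b2b190422d0066d90740", Amount: 100 },
  { _id: "6048b3cc7e4b350072424f4c", Amount: 100 },
  { _id: "6048b5167e4b350072424f50", Amount: 100 },
];

const result = docs.reduce(
  (value, doc) =>
    value.Amount < 180
      ? // $concatArrays + $sum branch of the $cond
        { _ids: value._ids.concat(doc._id), Amount: value.Amount + doc.Amount }
      : // otherwise return the accumulator ($$value) unchanged
        value,
  { Amount: 0, _ids: [] } // initialValue
);

console.log(result._ids); // the first two _ids; Amount stops accumulating at 200
```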
I'm new to MongoDB, so I'm having some difficulties filtering my collections as I need.
I have this collection
[
{
"id": "sdfsdfsdf",
"key": "tryrtyrty",
"createdAt": "2017-01-28T01:22:14.398Z",
"counts": [
170
],
"value": "Something"
},
{
"id": "hjmhjhjm",
"key": "yuiyuiyui",
"createdAt": "2017-01-28T01:22:14.398Z",
"counts": [
150,
160
],
"value": "Something"
}
]
I want to filter by a range of dates (min-max date) and a range of counts, meaning I want to give a min and max value for the total sum of the counts field. For example, I would like to filter for results whose counts sum to at least 200 and at most 400. This would only return the second result (its sum is 310, while the first result's sum is 170).
Right now I have this:
db.collection.aggregate([
{
$project: {
totalCount: {
$sum: {
"$filter": {
"input": "$counts",
"as": "bla",
"cond": {
"$gte": [
"$sum", // I think the error is here, I dont know how to reference the sum of the list
300 //I want records whose count sum is more than this value
]
}
}
}
}
}
}
])
This returns all the records with a totalCount of 0, which is not what I want. I would like only the records matching the count condition, with the correct totalCount (and eventually matching the dates as well):
[
{
"_id": ObjectId("5a934e000102030405000000"),
"totalCount": 0
},
{
"_id": ObjectId("5a934e000102030405000001"),
"totalCount": 0
}
]
Desired output
[
{
"_id": ObjectId("5a934e000102030405000001"),
"totalCount": 310,
"key": "yuiyuiyui",
"createdAt": "2017-01-28T01:22:14.398Z"
}
]
Any help would be greatly appreciated. Even more if it comes with both dates and count filter.
You should not use $filter here, as it isn't suitable for this scenario.
Stages:
$set - Create totalCounts by applying $sum to all elements of the counts array.
$match - Filter the documents whose totalCounts is within the range.
$unset - Remove fields to tidy up the output document.
db.collection.aggregate([
{
$set: {
"totalCounts": {
$sum: "$counts"
}
}
},
{
$match: {
totalCounts: {
$gte: 200,
$lte: 400
}
}
},
{
$unset: [
"counts",
"id"
]
}
])
Sample Mongo Playground
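The $set + $match combination above is equivalent to this plain-JavaScript sketch over the sample documents (only an illustration of the logic, not the pipeline itself):

```javascript
// Compute totalCounts per document, then keep those whose sum is in [200, 400].
const docs = [
  { id: "sdfsdfsdf", counts: [170] },
  { id: "hjmhjhjm", counts: [150, 160] },
];

const filtered = docs
  // $set stage: totalCounts = $sum of the counts array
  .map((d) => ({ ...d, totalCounts: d.counts.reduce((a, b) => a + b, 0) }))
  // $match stage: $gte 200 and $lte 400
  .filter((d) => d.totalCounts >= 200 && d.totalCounts <= 400);

console.log(filtered); // only the second document survives (sum 310)
```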
For the date range filter, you need the $expr and $and operators, as below:
{
$match: {
totalCounts: {
$gte: 200,
$lte: 400
},
$expr: {
$and: [
{
$gte: [
{
"$toDate": "$createdAt"
},
/* Date from */
]
},
{
$lte: [
{
"$toDate": "$createdAt"
},
/* Date to */
]
}
]
}
}
}
Sample Mongo Playground (with date range filter)
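The date-range check itself boils down to two comparisons after converting the string with $toDate. A plain-JavaScript sketch, where the bounds 2017-01-01 and 2017-02-01 are assumed example values (the question does not specify them):

```javascript
// Equivalent of: $toDate createdAt, then $gte "Date from" and $lte "Date to".
const createdAt = "2017-01-28T01:22:14.398Z"; // sample value from the question
const from = new Date("2017-01-01T00:00:00Z"); // assumed "Date from"
const to = new Date("2017-02-01T00:00:00Z");   // assumed "Date to"

const inRange = new Date(createdAt) >= from && new Date(createdAt) <= to;
console.log(inRange); // true for this sample
```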
I have the collection below and I need to find the date-wise total cost as well as the sum of all costs in the collection. I can find the total cost of a day, but I have failed to get the sum of all costs from the collection.
[{
"date":"12-2-2015",
"cost":100
},
{
"date":"13-2-2015",
"cost":10
},
{
"date":"12-2-2015",
"cost":40
},
{
"date":"13-2-2015",
"cost":30
},
{
"date":"13-2-2015",
"cost":80
}]
I can get output like this:
[{
"day": "12-2-2015",
"cost": 140
},{
"day": "13-2-2015",
"cost": 120
}]
But I want output like this.
{
"day": "12-2-2015",
"cost": 140,
"total": 260
}
Use this aggregation. I didn't add a $match stage; you could add one to match the date.
db.collection.aggregate([
{
$group: {
_id: null,
orig: {
$push: "$$ROOT"
},
"total": {
$sum: "$cost"
},
}
},
{
$unwind: "$orig"
},
{
$project: {
date: "$orig.date",
cost: "$orig.cost",
total: "$total"
}
},
{
$group: {
_id: "$date",
cost: {
$sum: "$cost"
},
orig: {
$push: "$$ROOT.total"
}
},
},
{
"$unwind": "$orig"
},
{
$group: {
_id: {
_id: "$_id",
cost: "$cost",
total: "$orig"
},
},
},
{
$project: {
date: "$_id._id",
"cost": "$_id.cost",
total: "$_id.total",
_id: 0
}
}
])
https://mongoplayground.net/p/eN-pDg2Zz7u
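What the pipeline computes can be pictured in plain JavaScript: one pass for the grand total, one grouping pass per date, then the total attached to each group. A sketch over the sample data:

```javascript
// Per-date cost plus the grand total attached to every group.
const docs = [
  { date: "12-2-2015", cost: 100 },
  { date: "13-2-2015", cost: 10 },
  { date: "12-2-2015", cost: 40 },
  { date: "13-2-2015", cost: 30 },
  { date: "13-2-2015", cost: 80 },
];

// Grand total over the whole collection ($group _id: null)
const total = docs.reduce((sum, d) => sum + d.cost, 0);

// Per-date totals ($group _id: "$date")
const byDay = {};
for (const d of docs) byDay[d.date] = (byDay[d.date] || 0) + d.cost;

const result = Object.entries(byDay).map(([day, cost]) => ({ day, cost, total }));
console.log(result);
```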
It is like 2 queries.
There are 3 solutions that I can think of:
1. 2 queries (works no matter the collection size)
2. 1 query with $facet (the solution below): group, and pack each group into an array (limitation: the number of groups (distinct day dates) must be small enough for the array to fit in 16MB, which holds for something like 200,000 distinct days; see this)
3. 1 query without $facet: for example, group and pack the whole collection into 1 array (limitation: the whole collection must fit in 100MB of memory because of $push; see this)
*For the limits, I think they are like that, based on what I have understood.
Query
Test code here
db.collection.aggregate([
{
"$facet": {
"total": [
{
"$group": {
"_id": null,
"total": {
"$sum": "$cost"
}
}
}
],
"coll": [
{
"$group": {
"_id": "$date",
"cost": {
"$sum": "$cost"
}
}
}
]
}
},
{
"$unwind": {
"path": "$coll"
}
},
{
"$project": {
"total": {
"$let": {
"vars": {
"t": {
"$arrayElemAt": [
"$total",
0
]
}
},
"in": "$$t.total"
}
},
"date": "$coll._id",
"cost": "$coll.cost"
}
}
])
I would do one query to get a cursor, then iterate the cursor, summing the total cost and pushing the relevant docs at the same time, and finally add the total to each group. This way you perform only one query to MongoDB and let your server do the rest, while keeping the code simple.
// 1. Fetch the groups
const grouped = db.data.aggregate([
{ $group: {
_id: "$date",
cost: { $sum: "$cost" }
}}
]);
// 2. Iterate the cursor, push the results into an array while summing the total cost
let total = 0;
const result = [];
grouped.forEach(group => {
total += group.cost;
result.push(group); // push as much as your limit
});
// 3. Add total to each group
result.forEach(group => group.total = total);
Consider this mongo document, an order with an internal list of products with their counts:
{
ordernumber: "1234"
detail: [
{ "number": "987",
"count": 10 },
{ "number": "654",
"count": 5 }
]
}
How do we get the sum of all counts with the mongodb shell? I always get zero for the sum and don't know what to pass for _id.
db.preorders.aggregate([ { $match: {} }, { $group: { _id: "$_id", total: { $sum: "$detail.count" } } }])
You can do a $unwind first, then $group on null.
Here is the Mongo Playground for your reference.
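The reason the original query returns zero is that in a $group stage, $sum over "$detail.count" sees an array (a non-numeric value) per document and ignores it. Unwinding first turns each array element into its own document. The flattening can be pictured in plain JavaScript (a sketch over the sample order, not the pipeline itself):

```javascript
// Plain-JavaScript picture of $unwind: "$detail" followed by $group on null.
const order = {
  ordernumber: "1234",
  detail: [
    { number: "987", count: 10 },
    { number: "654", count: 5 },
  ],
};

const total = [order]
  .flatMap((o) => o.detail)              // $unwind: one element per detail entry
  .reduce((sum, d) => sum + d.count, 0); // $group _id: null, $sum: "$detail.count"

console.log(total); // 15
```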
db.getCollection('rien').aggregate([
{
$match: {
$and: [
{
"id": "10356"
},
{
$or: [
{
"sys_date": {
"$gte": newDate(ISODate().getTime()-90*24*60*60*1000)
}
},
{
"war_date": {
"$gte": newDate(ISODate().getTime()-90*24*60*60*1000)
}
}
]
}
]
}
},
{
$group: {
"_id": "$b_id",
count: {
$sum: 1
},
ads: {
$addToSet: {
"s": "$s",
"ca": "$ca"
}
},
files: {
$addToSet: {
"system": "$system",
"hostname": "$hostname"
}
}
}
},
{
$sort: {
"ads.s": -1
}
},
{
$group: {
"_id": "$b_id",
total_count: {
$sum: 1
},
"data": {
"$push": "$$ROOT"
}
}
},
{
$project: {
"_id": 0,
"total_count": 1,
results: {
$slice: [
"$data",
0,
50
]
}
}
}
])
When I execute this pipeline 5 times, it returns a different set of documents each time. It is a 3-node cluster with no sharding enabled and 10 million documents; the data is static.
Any ideas about the inconsistent results? I feel I am missing some fundamentals here.
I can see 2 problems:
"ads.s": -1 will not work because ads is an array field; $sort does not apply to fields inside an array.
$addToSet will not maintain sort order even if its input is ordered by a previous stage,
as mentioned in the $addToSet documentation => Order of the elements in the output array is unspecified.
and also mentioned in accumulators-group-addToSet => Order of the array elements is undefined
and also in the JIRA tickets SERVER-8512 and DOCS-1114
You can use the $setUnion operator for an ascending-order result, and $reduce on the result of $setUnion for a descending-order result.
As a workaround I am adding a solution below; I am not sure whether this is a good option, but you can use it if it does not affect the performance of your query.
I am adding only the updated stages here.
remain same
{ $match: {} }, // skipped
{ $group: {} }, // skipped
$sort is optional; it's up to your requirements whether you want to order by the main document.
{ $sort: { _id: -1 } },
$setUnion treats arrays as sets: if an array contains duplicate entries, $setUnion ignores them. It also happens to return the array in ascending order, based on the first field that we specified in the $group stage (s), but make sure every element in the array has s as its first field.
$reduce iterates over the array and concatenates the current element $$this in front of the accumulated value $$value, which turns the array into descending order.
{
$addFields: {
ads: {
$reduce: {
input: { $setUnion: "$ads" },
initialValue: [],
in: { $concatArrays: [["$$this"], "$$value"] }
}
},
files: {
$reduce: {
input: { $setUnion: "$files" },
initialValue: [],
in: { $concatArrays: [["$$this"], "$$value"] }
}
}
}
},
remain same
{ $group: {} }, // skipped
{ $project: {} } // skipped
Playground
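The dedupe-then-reverse trick can be pictured in plain JavaScript. Note this is only an illustration of the mechanics; the Map/JSON.stringify dedupe and the explicit sort stand in for $setUnion's observed (but unguaranteed) behavior:

```javascript
// Picture of $setUnion (dedupe + ascending by first field) followed by
// $reduce with $concatArrays [["$$this"], "$$value"] (prepend = reverse).
const ads = [
  { s: 1, ca: "x" },
  { s: 3, ca: "y" },
  { s: 1, ca: "x" }, // duplicate entry, dropped like $setUnion would
  { s: 2, ca: "z" },
];

// Unique elements, sorted ascending by s (the order $setUnion was observed to give)
const unique = [...new Map(ads.map((a) => [JSON.stringify(a), a])).values()]
  .sort((a, b) => a.s - b.s);

// Prepending each element to the accumulator reverses the array
const descending = unique.reduce((value, el) => [el, ...value], []);

console.log(descending.map((a) => a.s)); // [3, 2, 1]
```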
The documentation for $setUnion says: The order of the elements in the output array is unspecified. But every way I have tested it, it returns ascending order perfectly; why, I don't know.
I asked a question on the MongoDB Developer Forum, does-setunion-expression-operator-order-array-elements-in-ascending-order?, and they replied => it will not guarantee order!
I have a MongoDB database, which has a collection that contains all of the addresses from a country. Sometimes when I execute a query on that I have a chance that I receive about 200 results (house numbers within that street). I want to get the middle item of that result.
When I do that in my coding like this for example:
const result = Address.find({ street: "fooStreet" })
// results in an array with a length of let's say 200 (could also be 20, 49, 103, etc) items
I could split it in my coding like below:
const middleIndex = Math.round(result.length / 2);
const house = result[middleIndex];
But this means that the other records go to waste and use unnecessary bandwidth + computing power which should be handled by the database. Since the database OS is optimized for working with collections etc, I was wondering if I could achieve the same result in a mongodb query? See pseudo below:
db.getCollection("addresses")
.find({ street: "fooStreet" })
.helpMeHere()
// ^ do something to get the middle result from the N items
You can do it as below:
db.collection.aggregate([
{ //Any match condition
$match: {}
},
{
$group: {//get the total matching result
"_id": null,
data: {
$push: "$$ROOT"
},
count: {
$sum: 0.5
}
}
},
{
$project: {//get the second half
"result": {
"$slice": [
"$data",
{
"$toInt": {
"$multiply": [//Negating results records from the last
{
"$toInt": "$count"
},
-1
]
}
}
]
}
}
}
])
playground
To get one element:
playground
db.collection.aggregate([
{
$match: {}
},
{
$group: {
"_id": null,
data: {
$push: "$$ROOT"
},
count: {
$sum: 0.5
}
}
},
{
$project: {
"result": {
"$arrayElemAt": [//array access
"$data",
{
"$toInt": "$count"
}
]
}
}
}
])
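The counting-by-0.5 trick is what makes this work: every matching document adds 0.5 to count, so truncating count with $toInt lands on the middle index. A plain-JavaScript sketch of that arithmetic:

```javascript
// Each document contributes 0.5 to count; $toInt truncates to the middle index.
const docs = ["a", "b", "c", "d", "e"]; // stand-ins for 5 matching documents

const count = docs.length * 0.5;        // $sum: 0.5 per document => 2.5
const middle = docs[Math.trunc(count)]; // $toInt truncates => index 2

console.log(middle); // "c", the middle element
```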