How to group by multiple keys and values - mongodb

I have a collection of documents where each document has a nested field outside with two values:
_id: 9287645ztiu234jgk2j3g5jh,
outside: {
temperature: 'low', // 'low' or 'high'
humidity: 'high', // 'low' or 'high'
},
... some more fields
temperature and humidity can each have the value low or high.
I want to count how many times temperature: low, temperature: high, humidity: low and humidity: high occur across all documents of the collection, so the query result for e.g. 14 documents should look like this:
{
temperatureLow: 2,
temperatureHigh: 12,
humidityLow: 8,
humidityHigh: 6,
}
I tried a $group (as the only stage in the aggregation pipeline) like this:
$group: {
_id: { temperature: '$outside.temperature', humidity: '$outside.humidity' },
count: { $sum: 1 },
},
And this gives me these documents (EDITED, first post had wrong data):
{
"_id": {
"temperature": "high",
"humidity": "high"
},
"count": 6
},
{
"_id": {
"temperature": "high",
"humidity": "low"
},
"count": 6
},
{
"_id": {
"temperature": "low",
"humidity": "low"
},
"count": 2
}
How can this be combined into one document?

It's possible. You need to add a $project stage that uses the $cond operator before the $group stage:
{
$project: {
"temperatureLow": { $cond: { if: { $eq: ["$outside.temperature", "low"] }, then: 1, else: 0 }},
"temperatureHigh": { $cond: { if: { $eq: ["$outside.temperature", "high"] }, then: 1, else: 0 }},
"humidityLow": { $cond: { if: { $eq: ["$outside.humidity", "low"] }, then: 1, else: 0 }},
"humidityHigh": { $cond: { if: { $eq: ["$outside.humidity", "high"] }, then: 1, else: 0 }}
}
},
{
$group: {
_id: "result",
"temperatureLow": {$sum: "$temperatureLow"},
"temperatureHigh": {$sum: "$temperatureHigh"},
"humidityLow": {$sum: "$humidityLow"},
"humidityHigh": {$sum: "$humidityHigh"},
}
},
Update
Or, as Neil Lunn notes, you can use $cond inside the $sum operator without the $project stage:
{
$group: {
_id: "result",
"temperatureLow": {$sum: { $cond: { if: { $eq: ["$outside.temperature", "low"] }, then: 1, else: 0 }}},
"temperatureHigh": {$sum: { $cond: { if: { $eq: ["$outside.temperature", "high"] }, then: 1, else: 0 }}},
"humidityLow": {$sum:{ $cond: { if: { $eq: ["$outside.humidity", "low"] }, then: 1, else: 0 }}},
"humidityHigh": {$sum:{ $cond: { if: { $eq: ["$outside.humidity", "high"] }, then: 1, else: 0 }}}
}
},
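To sanity-check the expected numbers, the conditional counting that $sum with $cond performs can be mirrored in plain JavaScript. This is only a sketch of the logic, not a replacement for the pipeline: makeDocs, docs and countOutside are hypothetical names, and the 14 documents are rebuilt from the grouped counts shown in the question.

```javascript
// Hypothetical sample: 14 documents rebuilt from the grouped counts in the question.
const makeDocs = (n, temperature, humidity) =>
  Array.from({ length: n }, () => ({ outside: { temperature, humidity } }));

const docs = [
  ...makeDocs(6, 'high', 'high'),
  ...makeDocs(6, 'high', 'low'),
  ...makeDocs(2, 'low', 'low'),
];

// Each counter mirrors { $sum: { $cond: [{ $eq: [field, value] }, 1, 0] } }.
function countOutside(documents) {
  const result = { temperatureLow: 0, temperatureHigh: 0, humidityLow: 0, humidityHigh: 0 };
  for (const { outside } of documents) {
    result.temperatureLow += outside.temperature === 'low' ? 1 : 0;
    result.temperatureHigh += outside.temperature === 'high' ? 1 : 0;
    result.humidityLow += outside.humidity === 'low' ? 1 : 0;
    result.humidityHigh += outside.humidity === 'high' ? 1 : 0;
  }
  return result;
}

console.log(countOutside(docs));
// { temperatureLow: 2, temperatureHigh: 12, humidityLow: 8, humidityHigh: 6 }
```

Because every document contributes either 1 or 0 to each counter, a single pass (or a single $group with a constant _id) is enough.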

Related

MongoDB - Query calculation and group multiple items

Let's say I have this data:
{"Plane":"5546","Time":"55.0", City:"LA"}
{"Plane":"5548","Time":"25.0", City:"CA"}
{"Plane":"5546","Time":"6.0", City:"LA"}
{"Plane":"5548","Time":"5.0", City:"CA"}
{"Plane":"5555","Time":"15.0", City:"XA"}
{"Plane":"5555","Time":"8.0", City:"XA"}
There is more data, but this sample is enough to illustrate it.
I want to group by plane and city and calculate over the times. This is the expected output:
{"_id":["5546","LA"],"Sum":2,"LateRate":1,"Prob":0.5}
Sum is the sum over all the times, Late is the sum over the times with Time > "15", and Prob is Late/Sum.
This is the code I have tried, but it is still missing something:
db.Collection.aggregate([
{
$project: {
Sum: 1,
Late: {
$cond: [{ $gt: ["$Time", 15.0] }, 1, 0]
},
prob:1
}
},
{
$group:{
_id:{Plane:"$Plane", City:"$City"},
Sum: {$sum:1},
Late: {$sum: "$Late"}
}
},
{
$addFields: {
prob: {
"$divide": [
"$Late",
"$Sum"
]
}
}
},
])
db.collection.aggregate([
{
$project: {
Time: 1,
Late: {
$cond: [
{
$gt: [
{
$toDouble: "$Time"
},
15.0
]
},
"$Time",
0
]
},
prob: 1,
Plane: 1,
City: 1
}
},
{
$group: {
_id: {
Plane: "$Plane",
City: "$City"
},
Sum: {
$sum: {
"$toDouble": "$Time"
}
},
Late: {
$sum: {
$toDouble: "$Late"
}
}
}
},
{
$addFields: {
prob: {
"$divide": [
"$Late",
"$Sum"
]
}
}
}
])
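$project limits the fields passed to the next stage, so every field needed later (Time, Plane, City) has to be included explicitly.
Relational and arithmetic operations do not work as expected on strings, so Time is converted with $toDouble first.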
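If the goal is the counts from the expected output (Sum: 2, Late: 1, Prob: 0.5) rather than totals of the Time values, a single $group with $cond may be enough. The following is a sketch under that assumption: pipeline is just a plain object that could be passed to aggregate(), and summarize is a hypothetical plain-JavaScript helper that mirrors the same logic so the numbers can be checked without a server.

```javascript
// Pipeline sketch: count records and late records (Time > 15) per Plane/City.
const pipeline = [
  {
    $group: {
      _id: { Plane: '$Plane', City: '$City' },
      Sum: { $sum: 1 },
      // Time is stored as a string, so convert it before comparing.
      Late: { $sum: { $cond: [{ $gt: [{ $toDouble: '$Time' }, 15] }, 1, 0] } },
    },
  },
  { $addFields: { Prob: { $divide: ['$Late', '$Sum'] } } },
];

// Hypothetical helper mirroring the pipeline in plain JavaScript.
function summarize(rows) {
  const groups = new Map();
  for (const { Plane, City, Time } of rows) {
    const key = `${Plane}|${City}`;
    const g = groups.get(key) || { _id: { Plane, City }, Sum: 0, Late: 0 };
    g.Sum += 1;
    if (Number(Time) > 15) g.Late += 1;
    groups.set(key, g);
  }
  return [...groups.values()].map((g) => ({ ...g, Prob: g.Late / g.Sum }));
}

// Sample rows from the question.
const rows = [
  { Plane: '5546', Time: '55.0', City: 'LA' },
  { Plane: '5548', Time: '25.0', City: 'CA' },
  { Plane: '5546', Time: '6.0', City: 'LA' },
  { Plane: '5548', Time: '5.0', City: 'CA' },
  { Plane: '5555', Time: '15.0', City: 'XA' },
  { Plane: '5555', Time: '8.0', City: 'XA' },
];

console.log(summarize(rows));
```

For plane 5546 in LA this gives Sum 2, Late 1 and Prob 0.5; plane 5555 has no late records because 15.0 is not greater than 15.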

Get current state from snapshot documents - mongoDB

I'm trying to get a list of current holders at specific times from a collection. My collection looks like this:
[
{
"time": 1,
"holdings": [
{ "owner": "A", "tokens": 2 },
{ "owner": "B", "tokens": 1 }
]
},
{
"time": 2,
"holdings": [
{ "owner": "B", "tokens": 2 }
]
},
{
"time": 3,
"holdings": [
{ "owner": "A", "tokens": 3 },
{ "owner": "B", "tokens": 1 },
{ "owner": "C", "tokens": 1 }
]
},
{
"time": 4,
"holdings": [
{ "owner": "C", "tokens": 0 }
]
}
]
tokens shows the current holdings of an owner, but only if the holdings have changed since the previous document. I would like to change the collection so that holdings always includes the full current holdings for any point in time.
At time: 1, the holdings are: A: 2, B: 1.
At time: 2, the holdings are: A: 2, B: 2. The collection does not include A's holdings, however, because they haven't changed. So what I'd like to get is:
[
{
"time": 1,
"holdings": [
{ "owner": "A", "tokens": 2 },
{ "owner": "B", "tokens": 1 }
]
},
{
"time": 2,
"holdings": [
{ "owner": "A", "tokens": 2 }, // merged from prev doc.
{ "owner": "B", "tokens": 2 }
]
},
{
"time": 3,
"holdings": [
{ "owner": "A", "tokens": 3 },
{ "owner": "B", "tokens": 1 },
{ "owner": "C", "tokens": 1 }
]
},
{
"time": 4,
"holdings": [
{ "owner": "A", "tokens": 3 }, // merged from prev
{ "owner": "B", "tokens": 1 }, // merged from prev
{ "owner": "C", "tokens": 0 }
]
}
]
From what I understand, $mergeObjects does that, but I don't understand how I can merge all previous docs, in order, up to the current doc for each doc. So I think I'm looking for a way to combine $setWindowFields with $mergeObjects.
This is a nice challenge.
So far, I have this rather complicated solution:
Collect all the timestamps from all the documents. This is the purpose of the first 4 steps; $setWindowFields is used to accumulate this data.
$group by owner and calculate the missing timestamps as wantedTimes (next 5 steps).
$set the missing timestamps with tokens: null, to be filled with actual data, and $unwind to separate them (next 3 steps).
Use $setWindowFields to find the last known tokens for each owner at each timestamp.
Fill in this last known state for documents with unknown tokens (2 steps).
$group and format the answer:
db.collection.aggregate([
{
$setWindowFields: {
sortBy: {time: 1},
output: {
allTimes: {$addToSet: "$time", window: {documents: ["unbounded", "current"]}
}
}
}
},
{
$setWindowFields: {
sortBy: {time: -1},
output: {
allTimes: {$addToSet: "$allTimes", window: {documents: ["unbounded", "current"]}
}
}
}
},
{
$set: {
allTimes: {
$reduce: {
input: "$allTimes",
initialValue: [],
in: {"$concatArrays": ["$$value", "$$this"]}
}
}
}
},
{$set: {allTimes: {$setIntersection: "$allTimes"}}},
{$unwind: "$holdings"},
{$sort: {time: 1}},
{$group: { _id: "$holdings.owner",
tokens: {$push: {tokens: "$holdings.tokens", time: "$time"}},
times: {$push: "$time"}, firstTime: {$first: "$time"},
allTimes: {$first: "$allTimes"}}
},
{
$addFields: {
wantedTimes: {
$filter: {
input: "$allTimes",
as: "item",
cond: {$gte: ["$$item", "$firstTime"]}
}
}
}
},
{
$project: {
tokens: 1,
wantedTimes: {$setDifference: ["$wantedTimes", "$times"]}
}
},
{
$set: {
data: {
$map: {
input: "$wantedTimes",
as: "item",
in: {time: "$$item", tokens: null}
}
}
}
},
{$project: {tokens: {"$concatArrays": ["$tokens", "$data"]}}},
{$unwind: "$tokens"},
{
$setWindowFields: {
partitionBy: "$_id",
sortBy: {"tokens.time": 1},
output: {
lastTokens: {
$push: "$tokens.tokens",
window: {documents: ["unbounded", "current"]}
}
}
}
},
{
$set: {
lastTokens: {
$filter: {
input: "$lastTokens",
as: "item",
cond: {$ne: ["$$item", null]}
}
}
}
},
{
$set: {
"tokens.tokens": {$ifNull: ["$tokens.tokens", {$last: "$lastTokens"}]}
}
},
{
$group: {
_id: "$tokens.time",
holdings: {$push: {owner: "$_id", tokens: "$tokens.tokens" }}
}
},
{$project: {time: "$_id", holdings: 1, _id: 0}},
{$sort: {time: 1}}
])
From a performance perspective, I recommend splitting it into two calls. The first is a quick findOne just to get the maximum time value in the collection.
Once you have that value, the pipeline can be much leaner:
const maxItem = await db.collection.findOne({}).sort({ time: -1 });
db.collection.aggregate([
{
$unwind: "$holdings"
},
{
$group: {
_id: "$holdings.owner",
times: {
$push: {
time: "$time",
tokens: "$holdings.tokens"
}
},
minTime: {
$min: "$time"
}
}
},
{
$addFields: {
times: {
$reduce: {
input: {
$range: [
"$minTime",
maxItem.time + 1 // this is max time
]
},
initialValue: {
values: [],
lastIndex: 0
},
in: {
values: {
"$concatArrays": [
"$$value.values",
[
{
$cond: [
{
$in: [
"$$this",
"$times.time"
]
},
{
"$arrayElemAt": [
"$times",
"$$value.lastIndex"
]
},
{
"$mergeObjects": [
{
tokens: 0
},
{
"$arrayElemAt": [
"$times",
{
$subtract: [
"$$value.lastIndex",
1
]
}
]
},
{
time: "$$this"
}
]
}
]
}
]
]
},
lastIndex: {
$cond: [
{
$in: [
"$$this",
"$times.time"
]
},
{
$sum: [
"$$value.lastIndex",
1
]
},
"$$value.lastIndex"
]
}
}
}
}
}
},
{
$unwind: "$times.values"
},
{
$group: {
_id: "$times.values.time",
holdings: {
$push: {
owner: "$_id",
tokens: "$times.values.tokens"
}
}
}
},
{
$project: {
_id: 0,
time: "$_id",
holdings: 1
}
},
{
$sort: {
time: 1
}
}
])
This is still quite a heavy query, as it requires a $unwind and a $group over the entire collection; however, given the requirements, there is no way around this. If the collection is too big for this approach, I recommend iterating owner by owner, or time by time, and doing separate updates accordingly.
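As an illustration of the time-by-time idea, here is a minimal application-side sketch that carries the last known tokens of every owner forward. It assumes the snapshots fit in memory; snapshots and fillHoldings are hypothetical names, and writing the results back to the collection is left out.

```javascript
// Sample snapshots from the question: each lists only the holdings that changed.
const snapshots = [
  { time: 1, holdings: [{ owner: 'A', tokens: 2 }, { owner: 'B', tokens: 1 }] },
  { time: 2, holdings: [{ owner: 'B', tokens: 2 }] },
  {
    time: 3,
    holdings: [
      { owner: 'A', tokens: 3 },
      { owner: 'B', tokens: 1 },
      { owner: 'C', tokens: 1 },
    ],
  },
  { time: 4, holdings: [{ owner: 'C', tokens: 0 }] },
];

// Walk the snapshots in time order and carry every owner's latest tokens forward.
function fillHoldings(docs) {
  const current = new Map(); // owner -> latest known tokens
  return [...docs]
    .sort((a, b) => a.time - b.time)
    .map(({ time, holdings }) => {
      for (const { owner, tokens } of holdings) current.set(owner, tokens);
      return {
        time,
        holdings: [...current].map(([owner, tokens]) => ({ owner, tokens })),
      };
    });
}

console.log(JSON.stringify(fillHoldings(snapshots), null, 2));
```

For the sample data this reproduces the desired result from the question, for example A: 2, B: 2 at time 2 and A: 3, B: 1, C: 0 at time 4.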
If you don't care about performance at all and want it in a single query, you can still use the same pipeline. You will first have to extract the max time in the collection, which requires adding an initial $group stage, like so:
db.collection.aggregate([
{
$group: {
_id: null,
maxTime: {
$max: "$time"
},
roots: {
$push: "$$ROOT"
}
}
},
{
$unwind: "$roots"
},
{
$replaceRoot: {
newRoot: {
"$mergeObjects": [
"$roots",
{
maxTime: "$maxTime"
}
]
}
}
},
... same pipeline ...
])

Mongodb aggregation $or condition on $eq

I want to increment the value of the field 'XR' when modality equals XR, CR or DX, but unfortunately it is somehow not working. I read somewhere that $eq can take a regex. So my question is: is there any way to make an "or" comparison within $cond and $eq?
{
$group: {
_id: { week: { $week: '$createdAt' }, year: { $year: '$createdAt' } },
XR: { $sum: { $cond: [{ $eq: ['$modality', /(XR|CR|DX)/g] }, 1, 0] } },
CT: { $sum: { $cond: [{ $eq: ['$modality', 'CT'] }, 1, 0] } },
MR: { $sum: { $cond: [{ $eq: ['$modality', 'MR'] }, 1, 0] } },
MG: { $sum: { $cond: [{ $eq: ['$modality', /(NM|MM|MG)/g] }, 1, 0] } },
},
},
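In an aggregation expression, $eq compares values for exact equality, so a regular expression never equals a string there. One possible approach is the aggregation $in operator, which checks whether a value is contained in an array. The stage below is a sketch of that idea; xrGroup, mgGroup and countModalities are hypothetical names, and the helper only mirrors the rule in plain JavaScript so it can be checked without a server.

```javascript
// Modalities that should be counted together (taken from the question).
const xrGroup = ['XR', 'CR', 'DX'];
const mgGroup = ['NM', 'MM', 'MG'];

// $group stage sketch using $in instead of $eq with a regex.
const groupStage = {
  $group: {
    _id: { week: { $week: '$createdAt' }, year: { $year: '$createdAt' } },
    XR: { $sum: { $cond: [{ $in: ['$modality', xrGroup] }, 1, 0] } },
    CT: { $sum: { $cond: [{ $eq: ['$modality', 'CT'] }, 1, 0] } },
    MR: { $sum: { $cond: [{ $eq: ['$modality', 'MR'] }, 1, 0] } },
    MG: { $sum: { $cond: [{ $in: ['$modality', mgGroup] }, 1, 0] } },
  },
};

// Hypothetical plain-JavaScript mirror of the same counting rule.
function countModalities(modalities) {
  const counts = { XR: 0, CT: 0, MR: 0, MG: 0 };
  for (const m of modalities) {
    if (xrGroup.includes(m)) counts.XR += 1;
    if (m === 'CT') counts.CT += 1;
    if (m === 'MR') counts.MR += 1;
    if (mgGroup.includes(m)) counts.MG += 1;
  }
  return counts;
}

console.log(countModalities(['XR', 'CR', 'DX', 'CT', 'MR', 'NM', 'US']));
// { XR: 3, CT: 1, MR: 1, MG: 1 }
```

An $or of several $eq expressions would express the same condition; $in just keeps the list of values in one place.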

Trying to use $cond to $sum and $subtract

My documents:
_id:"DwNMQtHYopXKK3rXt"
client_id:"ZrqKavXX8ieGpx5ae"
client_name:"luana"
companyId:"z3ern2Q7rdvviYCGv"
is_active:true
client_searchable_name:"luana"
status:"paid"
items:Object
id:912602
gross_amount:1000
type:"service"
description:"Pedicure com Zé (pacote)"
item_id:"bjmmPPjqKdWfwJqtC"
user_id:"gWjCskpHF2a3xHYy9"
user_id_commission:50
user_id_amount:0
use_package:true
quantity:1
item_costs:Array
discount_cost:Object
type:"package"
value:100
charge_from:"company_only"
entity_id:"LLRirWu5DabkRna7X"
created_at:2019-10-29T10:35:39.493+00:00
updated_at:2019-10-29T10:36:42.983+00:00
version:"2"
created_by:"2QBRDN9MACQagSkJr"
amount:0
multiple_payment_methods:Array
closed_at:2019-10-29T10:36:52.781+00:00
So I made a $project:
{
_id: 0,
closed_at: 1,
serviceId: "$items.item_id",
serviceAmount: "$items.gross_amount",
discounts:"$items.discount_cost"
}
And then $group
_id: {
month: { $month: "$closed_at" },
serviceId: "$serviceId",
discountType: "$discounts.type",
discountValue: "$discounts.value"
},
totalServiceAmount: {
$sum: "$serviceAmount"
}
}
I'm trying to make a $sum of the values of the categories in my DB. I have already filtered all the data, so I have exactly what I need, like this:
_id:Object
"month":10
"serviceId":"MWBqhMyW8ataGxjBT"
"discountType":"courtesy"
"discountValue":100
"totalServiceAmount":5000
So, I have 5 types of discounts in my DB: Percentage (discount in percentage), courtesy (makes the service amount 0), package (makes the service amount 0), gross (gross discount of value) and null if there's no discount or value.
So, if the type of discount is:
Percentage: I need to subtract the discountValue from the totalServiceAmount (discountValue will be a percentage; how do I do that subtraction if totalServiceAmount is a gross value?)
Courtesy and package: I need to set the totalServiceAmount to 0.
Gross: I need to subtract the discountValue from the totalServiceAmount.
Null: just leave totalServiceAmount as it is.
I tried it like this as a test, but I really don't know if I'm on the right path; the result was null for every amountWithDiscount.
{
$project: {
{
amountWithDiscount: {
$cond: {
if: {
$eq: ["$_id.discountType", "null"]
},
then: "$serviceAmount", else: {
$cond: {
if: {
$eq: ["$_id.discountType", "gross"]
},
then: {
$subtract: ["$serviceAmount", "$_id.discountValue"]
},
else: "$serviceAmount"
}
}
}
}
}
Make sense?
I created a collection with your grouping result:
01) Example of Documents:
[
{
"_id": "5db9ca609a17899b8ba6650d",
"month": 10,
"serviceId": "MWBqhMyW8ataGxjBT",
"discountType": "courtesy",
"discountValue": 0,
"totalServiceAmount": 5000
},
{
"_id": "5db9d0859a17899b8ba66856",
"month": 10,
"serviceId": "MWBqhMyW8ataGxjBT",
"discountType": "gross",
"discountValue": 100,
"totalServiceAmount": 5000
},
{
"_id": "5db9d0ac9a17899b8ba66863",
"month": 10,
"serviceId": "MWBqhMyW8ataGxjBT",
"discountType": "percentage",
"discountValue": 10,
"totalServiceAmount": 5000
},
{
"_id": "5db9d0d89a17899b8ba6687f",
"month": 10,
"serviceId": "MWBqhMyW8ataGxjBT",
"discountType": null,
"discountValue": 10,
"totalServiceAmount": 6000
}
]
02) Query:
db.collection.aggregate([
{
$project: {
discountType: "$discountType",
amountWithDiscount: {
$cond: {
if: {
$eq: [
"$discountType",
null
]
},
then: "$totalServiceAmount",
else: {
$cond: {
if: {
$eq: [
"$discountType",
"gross"
]
},
then: {
$subtract: [
"$totalServiceAmount",
"$discountValue"
]
},
else: {
$cond: {
if: {
$eq: [
"$discountType",
"percentage"
]
},
then: {
$multiply: [
"$totalServiceAmount",
{
$subtract: [
1,
{
$divide: [
"$discountValue",
100
]
}
]
}
]
},
else: "$totalServiceAmount"
}
}
}
}
}
}
}
}
])
A working example at https://mongoplayground.net/p/nU7vhGN-uSp.
I don't know if I fully understand your problem, but
take a look and see if it solves your problem.
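The nested $cond in the query leaves courtesy and package amounts unchanged, while the question asks for them to become 0. A flatter variant with $switch could cover all five cases. This is a sketch, assuming the discount types are stored in lowercase as in the example documents; amountWithDiscount is a plain expression object for a $project stage, and applyDiscount is a hypothetical helper that mirrors it in plain JavaScript.

```javascript
// $switch expression sketch covering courtesy/package, gross, percentage and no discount.
const amountWithDiscount = {
  $switch: {
    branches: [
      { case: { $in: ['$discountType', ['courtesy', 'package']] }, then: 0 },
      {
        case: { $eq: ['$discountType', 'gross'] },
        then: { $subtract: ['$totalServiceAmount', '$discountValue'] },
      },
      {
        case: { $eq: ['$discountType', 'percentage'] },
        then: {
          $multiply: [
            '$totalServiceAmount',
            { $subtract: [1, { $divide: ['$discountValue', 100] }] },
          ],
        },
      },
    ],
    // null or unknown discount types keep the original amount.
    default: '$totalServiceAmount',
  },
};

// Hypothetical plain-JavaScript mirror of the same rules.
function applyDiscount({ discountType, discountValue, totalServiceAmount }) {
  switch (discountType) {
    case 'courtesy':
    case 'package':
      return 0;
    case 'gross':
      return totalServiceAmount - discountValue;
    case 'percentage':
      return totalServiceAmount - (totalServiceAmount * discountValue) / 100;
    default:
      return totalServiceAmount;
  }
}

console.log(applyDiscount({ discountType: 'percentage', discountValue: 10, totalServiceAmount: 5000 }));
// 4500
```

The expression can be used as { $project: { discountType: 1, amountWithDiscount } } in place of the nested conditions.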

MongoDB aggregation, group by value interval

MongoDB documents:
[{
_id: '123213',
elevation: 2300,
area: 25
},
{
_id: '343221',
elevation: 1600,
area: 35,
},
{
_id: '545322',
elevation: 500,
area: 12,
},
{
_id: '234234',
elevation: null,
area: 5
}]
I want to group these on a given interval on elevation and summarize the area property.
Group 1: < 0
Group 2: 0 - 1500
Group 3: 1501 - 3000
Group 4: > 3000
So the expected output would be:
[{
interval: '1501-3000',
count: 2,
summarizedArea: 60
},
{
interval: '0-1500',
count: 1,
summarizedArea: 12,
},
{
interval: 'N/A',
count: 1,
summarizedArea: 5
}]
If possible, I want to use the aggregation pipeline.
Maybe something with $range? Or a combination of $gte and $lte?
As Feliix suggested, $bucket should do the job, but the boundaries should be slightly different to play well with negative and N/A values:
db.collection.aggregate([
{
$bucket: {
groupBy: "$elevation",
boundaries: [ -Number.MAX_VALUE, 0, 1501, 3001, Number.POSITIVE_INFINITY ],
default: Number.NEGATIVE_INFINITY,
output: {
"count": { $sum: 1 },
"summarizedArea" : { $sum: "$area" }
}
}
}
])
The formatting stage below can be added to the pipeline to adjust shape of the response:
{ $group: {
_id: null,
documents: { $push: {
interval: { $let: {
vars: {
idx: { $switch: {
branches: [
{ case: { $eq: [ "$_id", -Number.MAX_VALUE ] }, then: 3 },
{ case: { $eq: [ "$_id", 0 ] }, then: 2 },
{ case: { $eq: [ "$_id", 1501 ] }, then: 1 },
{ case: { $eq: [ "$_id", 3001 ] }, then: 0 }
],
default: 4
} }
},
in: { $arrayElemAt: [ [ ">3000", "1501-3000", "0-1500", "<0", "N/A" ], "$$idx" ] }
} },
count: "$count",
summarizedArea: "$summarizedArea"
} }
} }
$group with _id: null $pushes all groups into an array in a single document.
$let maps $_id from the previous stage to the text labels of the intervals defined in the array [ ">3000", "1501-3000", "0-1500", "<0", "N/A" ]. To do that, it calculates the index idx of the label using $switch.
It would be much simpler to implement this logic at the application level, unless you absolutely need to do it in the pipeline.
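For comparison, an application-level version of the grouping might look like the sketch below. It assumes integer elevations (so nothing falls between 1500 and 1501) and treats any non-numeric elevation as N/A; intervalLabel and groupByInterval are hypothetical helper names.

```javascript
// Sample documents from the question.
const documents = [
  { _id: '123213', elevation: 2300, area: 25 },
  { _id: '343221', elevation: 1600, area: 35 },
  { _id: '545322', elevation: 500, area: 12 },
  { _id: '234234', elevation: null, area: 5 },
];

// Map an elevation to the label of its interval.
function intervalLabel(elevation) {
  if (typeof elevation !== 'number' || Number.isNaN(elevation)) return 'N/A';
  if (elevation < 0) return '<0';
  if (elevation <= 1500) return '0-1500';
  if (elevation <= 3000) return '1501-3000';
  return '>3000';
}

// Count documents and sum their area per interval.
function groupByInterval(docs) {
  const groups = new Map();
  for (const { elevation, area } of docs) {
    const interval = intervalLabel(elevation);
    const g = groups.get(interval) || { interval, count: 0, summarizedArea: 0 };
    g.count += 1;
    g.summarizedArea += area;
    groups.set(interval, g);
  }
  return [...groups.values()];
}

console.log(groupByInterval(documents));
```

For the sample documents this yields 1501-3000 (count 2, area 60), 0-1500 (count 1, area 12) and N/A (count 1, area 5), in the order the intervals are first encountered.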
You can use $bucket, introduced in MongoDB 3.4, to achieve this:
db.collection.aggregate([
{
$bucket: {
groupBy: "$elevation",
boundaries: [
0,
1500,
3000,
5000
],
default: 10000,
output: {
"count": {
$sum: 1
},
"summarizedArea": {
$sum: "$area"
}
}
}
}
])
Output:
[
{
"_id": 0,
"count": 1,
"summarizedArea": 12
},
{
"_id": 1500,
"count": 2,
"summarizedArea": 60
},
{
"_id": 10000,
"count": 1,
"summarizedArea": 5
}
]
You can try it here: mongoplayground.net/p/xFe7ZygMqaY