How to project a sum of a field in a sub-document based on a condition in MongoDB?

I have a Customer collection with the following document:
{
"_id": 1,
firstname: "John",
lastname: "Doe",
credits: [
{
cDate: "2020-01-16",
cAmount: 350
},
{
cDate: "2021-02-07",
cAmount: 180
},
{
cDate: "2021-06-25",
cAmount: 650
},
]
}
{
"_id": 2,
firstname: "Bob",
lastname: "Smith",
credits: [
{
cDate: "2020-03-19",
cAmount: 200
},
{
cDate: "2020-08-20",
cAmount: 90
},
{
cDate: "2021-11-11",
cAmount: 300
},
]
}
Now I would like to return the total spent in a specific year, e.g. 2021.
The data should look something like this:
{"firstname": "John", "lastname": "Doe", "total": 830},
{"firstname": "Bob", "lastname": "Smith", "total": 300}
First I tried to match the records that contain cDates within the expected year (2021) to reduce the number of records (the actual dataset has hundreds of customers) and then projected the wanted fields:
Customer.aggregate([
{
$match: {
credits: {
$elemMatch: {
cDate: {
$gte: ISODate("2021-01-01"),
$lte: ISODate("2021-12-31"),
},
},
},
},
},
{
$project: {
_id: 0,
firstname: 1,
lastname: 1,
total: {
$sum: "$credits.cAmount",
},
},
}
])
the result is:
{"firstname": "John", "lastname": "Doe", "total": 1180},
{"firstname": "Bob", "lastname": "Smith", "total": 590}
Almost there, now I'd like to skip the credit records that do not contain the expected year (2021), so that only the values with a cDate equal to 2021 are calculated.
I kept the $match the same and tried to add a $cond in the $project stage.
Customer.aggregate([
{
$match: {
credits: {
$elemMatch: {
cDate: {
$gte: ISODate("2021-01-01"),
$lte: ISODate("2021-12-31"),
},
},
},
},
},
{
$project: {
_id: 0,
firstname: 1,
lastname: 1,
total: {
$cond: {
if: { credits: { cDate: { regex: "2021-" } } }, // if cDate contains 2021-
then: { $sum: "$credits.cAmount" }, // add the cAmount
else: { $sum: 0 } // else add 0
},
},
},
}
])
The result is still the same: all totals are calculated from all years.
{"firstname": "John", "lastname": "Doe", "total": 1180},
{"firstname": "Bob", "lastname": "Smith", "total": 590}
What am I missing?
Thanks for your help.

The cDate property holds a string value, so you cannot match it against Date values such as ISODate(...).
$match cDate with $regex to keep only customers that have at least one credit in "2021"
$reduce to iterate over the credits array, with the initial value set to 0
$substr to take the first 4 characters of cDate (starting at index 0), which is the year
$cond to check whether that substring is "2021"; if so, add cAmount to the running value, otherwise return the running value unchanged
Customer.aggregate([
{
$match: {
"credits.cDate": {
$regex: "2021"
}
}
},
{
$project: {
_id: 0,
firstname: 1,
lastname: 1,
total: {
$reduce: {
input: "$credits",
initialValue: 0,
in: {
$cond: [
{
$eq: [
{ $substr: ["$$this.cDate", 0, 4] },
"2021"
]
},
{ $sum: ["$$value", "$$this.cAmount"] },
"$$value"
]
}
}
}
}
}
])
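To make the $reduce/$cond step easier to follow, here is a minimal sketch of the same logic in plain JavaScript, run against the question's sample data. It is not MongoDB API; totalForYear is a hypothetical helper name used only for illustration.

```javascript
// Plain-JavaScript illustration of the $reduce + $cond step (not MongoDB API).
const customers = [
  {
    firstname: "John", lastname: "Doe",
    credits: [
      { cDate: "2020-01-16", cAmount: 350 },
      { cDate: "2021-02-07", cAmount: 180 },
      { cDate: "2021-06-25", cAmount: 650 },
    ],
  },
  {
    firstname: "Bob", lastname: "Smith",
    credits: [
      { cDate: "2020-03-19", cAmount: 200 },
      { cDate: "2020-08-20", cAmount: 90 },
      { cDate: "2021-11-11", cAmount: 300 },
    ],
  },
];

// Hypothetical helper: sums cAmount for credits whose cDate starts with `year`.
function totalForYear(credits, year) {
  return credits.reduce(
    // `value` plays the role of $$value, `credit` the role of $$this.
    (value, credit) =>
      credit.cDate.slice(0, 4) === year ? value + credit.cAmount : value,
    0 // initialValue
  );
}

const totals = customers.map(({ firstname, lastname, credits }) => ({
  firstname,
  lastname,
  total: totalForYear(credits, "2021"),
}));

console.log(totals);
// [ { firstname: 'John', lastname: 'Doe', total: 830 },
//   { firstname: 'Bob', lastname: 'Smith', total: 300 } ]
```

The accumulator starts at 0 and only grows when the year prefix matches, which is why credits from 2020 no longer contribute to the total.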

Related

get rank in mongodb with date range

I have following stat data stored daily for users.
{
"_id": {
"$oid": "638df4e42332386e0e06d322"
},
"appointment_count": 1,
"item_id": 2,
"item_type": "user",
"company_id": 5,
"created_date": "2022-12-05",
"customer_count": 1,
"lead_count": 1,
"door_knocks": 10
}
{
"_id": {
"$oid": "638f59a9bf33442a57c3aa99"
},
"lead_count": 2,
"item_id": 2,
"item_type": "user",
"company_id": 5,
"created_date": "2022-12-06",
"video_viewed": 2,
"door_knocks": 9
}
And I'm using the following query to get the items by rank
user_stats_2022_12.aggregate([{"$match":{"company_id":5,"created_date":{"$gte":"2022-12-04","$lte":"2022-12-06"}}},{"$setWindowFields":{"partitionBy":"$company_id","sortBy":{"door_knocks":-1},"output":{"item_rank":{"$denseRank":{}},"stat_sum":{"$sum":"$door_knocks"}}}},{"$facet":{"metadata":[{"$count":"total"}],"data":[{"$skip":0},{"$limit":100},{"$sort":{"item_rank":1}}]}}])
It gives me a rank, but with the above data the records with item_id: 2 get different ranks even though they share the same item_id. So I want to group them by item_id first and then apply the rank.
It's a little messy, but here's a playground - https://mongoplayground.net/p/JrJOo4cl9X1.
If you're going to sort by knocks after grouping, I'm assuming that you'll want the sum of door_knocks for a given item_id for this sort.
db.collection.aggregate([
{
$match: {
company_id: 5,
created_date: {
"$gte": "2022-12-04",
"$lte": "2022-12-06"
}
}
},
{
$group: {
_id: {
item_id: "$item_id",
company_id: "$company_id"
},
docs: {
$push: "$$ROOT"
},
total_door_knocks: {
$sum: "$door_knocks"
}
}
},
{
$setWindowFields: {
partitionBy: "$_id.company_id", // after the $group, company_id lives under _id
sortBy: {
total_door_knocks: -1
},
output: {
item_rank: {
"$denseRank": {}
},
stat_sum: {
"$sum": "$total_door_knocks"
}
}
}
},
{
$unwind: "$docs"
},
{
$project: {
_id: "$docs._id",
appointment_count: "$docs.appointment_count",
company_id: "$docs.company_id",
created_date: "$docs.created_date",
customer_count: "$docs.customer_count",
door_knocks: "$docs.door_knocks",
item_id: "$docs.item_id",
item_type: "$docs.item_type",
lead_count: "$docs.lead_count",
item_rank: 1,
stat_sum: 1,
total_door_knocks: 1
}
},
{
$facet: {
metadata: [
{
"$count": "total"
}
],
data: [
{
"$skip": 0
},
{
"$limit": 100
},
{
"$sort": {
"item_rank": 1
}
}
]
}
}
])
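The group-then-rank idea can be sketched in plain JavaScript as follows. This is only an illustration of what $group followed by $denseRank computes, not MongoDB API; rankByTotalKnocks is a hypothetical helper, and the rows for item_id 7 and 8 are made-up data added to show how ties are ranked.

```javascript
// Plain-JavaScript illustration of "$group by item_id, then $denseRank".
const stats = [
  { item_id: 2, company_id: 5, door_knocks: 10 },
  { item_id: 2, company_id: 5, door_knocks: 9 },
  { item_id: 7, company_id: 5, door_knocks: 12 }, // hypothetical extra user
  { item_id: 8, company_id: 5, door_knocks: 12 }, // hypothetical extra user
];

// Hypothetical helper: one row per item_id with its total and dense rank.
function rankByTotalKnocks(rows) {
  // Step 1: sum door_knocks per item_id (the $group stage).
  const totalsById = new Map();
  for (const { item_id, door_knocks } of rows) {
    totalsById.set(item_id, (totalsById.get(item_id) ?? 0) + door_knocks);
  }

  // Step 2: sort by the total, descending (the sortBy of $setWindowFields).
  const sorted = [...totalsById.entries()]
    .map(([item_id, total_door_knocks]) => ({ item_id, total_door_knocks }))
    .sort((a, b) => b.total_door_knocks - a.total_door_knocks);

  // Step 3: dense rank, where equal totals share a rank and no rank is skipped.
  let rank = 0;
  let previousTotal = null;
  return sorted.map((row) => {
    if (row.total_door_knocks !== previousTotal) {
      rank += 1;
      previousTotal = row.total_door_knocks;
    }
    return { ...row, item_rank: rank };
  });
}

const ranked = rankByTotalKnocks(stats);
console.log(ranked);
// [ { item_id: 2, total_door_knocks: 19, item_rank: 1 },
//   { item_id: 7, total_door_knocks: 12, item_rank: 2 },
//   { item_id: 8, total_door_knocks: 12, item_rank: 2 } ]
```

Because the rank is computed on one row per item_id, the two daily records for item_id 2 can no longer end up with different ranks.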

MongoDB get count of field per season from MM/DD/YYYY date field

I am facing a problem in MongoDB. Suppose, I have the following collection.
{ id: 1, issueDate: "07/05/2021", code: "31" },
{ id: 2, issueDate: "12/11/2020", code: "14" },
{ id: 3, issueDate: "02/11/2021", code: "98" },
{ id: 4, issueDate: "01/02/2021", code: "14" },
{ id: 5, issueDate: "06/23/2020", code: "14" },
{ id: 6, issueDate: "07/01/2020", code: "31" },
{ id: 7, issueDate: "07/05/2022", code: "14" },
{ id: 8, issueDate: "07/02/2022", code: "20" },
{ id: 9, issueDate: "07/02/2022", code: "14" }
The date field is in the format MM/DD/YYYY. My goal is to get the count of items for each season: spring (March-May), summer (June-August), autumn (September-November) and winter (December-February).
The result I'm expecting is:
count of fields for each season:
{ "_id" : "Summer", "count" : 6 }
{ "_id" : "Winter", "count" : 3 }
top 2 codes (first and second most recurring) per season:
{ "_id" : "Summer", "codes" : {14, 31} }
{ "_id" : "Winter", "codes" : {14, 98} }
How can this be done?
You should never store date/time values as strings; always store proper Date objects.
You can use the $setWindowFields operator for that:
db.collection.aggregate([
// Convert string into Date
{ $set: { issueDate: { $dateFromString: { dateString: "$issueDate", format: "%m/%d/%Y" } } } },
// Determine the season (0..3)
{
$set: {
season: { $mod: [{ $toInt: { $divide: [{ $add: [{ $subtract: [{ $month: "$issueDate" }, 1] }, 1] }, 3] } }, 4] }
}
},
// Count codes per season
{
$group: {
_id: { season: "$season", code: "$code" },
count: { $count: {} },
}
},
// Rank occurrence of codes per season
{
$setWindowFields: {
partitionBy: "$_id.season",
sortBy: { count: -1 },
output: {
rank: { $denseRank: {} },
count: { $sum: "$count" }
}
}
},
// Get only top 2 ranks
{ $match: { rank: { $lte: 2 } } },
// Final grouping
{
$group: {
_id: "$_id.season",
count: { $first: "$count" },
codes: { $push: "$_id.code" }
}
},
// Some cosmetic for output
{
$set: {
season: {
$switch: {
branches: [
{ case: { $eq: ["$_id", 0] }, then: 'Winter' },
{ case: { $eq: ["$_id", 1] }, then: 'Spring' },
{ case: { $eq: ["$_id", 2] }, then: 'Summer' },
{ case: { $eq: ["$_id", 3] }, then: 'Autumn' },
]
}
}
}
}
])
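The season arithmetic in the second stage reduces to trunc(month / 3) mod 4. A small sketch in plain JavaScript (not MongoDB API) checks that mapping against the question's sample dates; seasonOf and SEASONS are hypothetical names introduced for the illustration.

```javascript
// Plain-JavaScript illustration of the season index: trunc(month / 3) % 4.
// Index 0 = Winter (12, 1, 2), 1 = Spring, 2 = Summer, 3 = Autumn.
const SEASONS = ["Winter", "Spring", "Summer", "Autumn"];

// Hypothetical helper: expects a string in MM/DD/YYYY format.
function seasonOf(issueDate) {
  const month = Number(issueDate.slice(0, 2));
  return SEASONS[Math.trunc(month / 3) % 4];
}

const issueDates = [
  "07/05/2021", "12/11/2020", "02/11/2021",
  "01/02/2021", "06/23/2020", "07/01/2020",
  "07/05/2022", "07/02/2022", "07/02/2022",
];

// Count documents per season.
const seasonCounts = {};
for (const issueDate of issueDates) {
  const season = seasonOf(issueDate);
  seasonCounts[season] = (seasonCounts[season] ?? 0) + 1;
}

console.log(seasonCounts);
// { Summer: 6, Winter: 3 }
```

December maps to index 4, which the modulo wraps back to 0, so it lands in the same bucket as January and February.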
Here are some hints:
Use $group with _id set to the $month of issueDate, and the $sum accumulator to get a count per month.
To map a month to a season, divide the month by 3 (using $divide and $toInt), take the result modulo 4, and then assign the category using $cond.
Another option:
db.collection.aggregate([
{
$addFields: {
"season": {
$switch: {
branches: [
{
case: {
$in: [
{
$substr: [
"$issueDate",
0,
2
]
},
[
"06",
"07",
"08"
]
]
},
then: "Summer"
},
{
case: {
$in: [
{
$substr: [
"$issueDate",
0,
2
]
},
[
"03",
"04",
"05"
]
]
},
then: "Spring"
},
{
case: {
$in: [
{
$substr: [
"$issueDate",
0,
2
]
},
[
"12",
"01",
"02"
]
]
},
then: "Winter"
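},
{
case: {
$in: [
{
$substr: [
"$issueDate",
0,
2
]
},
[
"09",
"10",
"11"
]
]
},
then: "Autumn" // added: autumn months were missing and fell through to the default
}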
],
default: "No date found."
}
}
}
},
{
$group: {
_id: {
s: "$season",
c: "$code"
},
cnt1: {
$sum: 1
}
}
},
{
$sort: {
cnt1: -1
}
},
{
$group: {
_id: "$_id.s",
codes: {
$push: "$_id.c"
},
cnt: {
$sum: "$cnt1"
}
}
},
{
$project: {
_id: 0,
season: "$_id",
count: "$cnt",
codes: {
"$slice": [
"$codes",
2
]
}
}
}
])
Explained:
Add a season field using $switch on the month (extracted from the issueDate string).
Group by season and code to count the occurrences.
$sort by that count in descending order.
Group per season to build an array of codes ordered from most to least recurring.
Project the fields into the desired output and $slice the codes to keep only the first two most recurring.
Comment:
Indeed, keeping dates as strings is generally not a good idea.

MongoDB: How to find objects and return a sum of values

I saw similar questions here, but can't really figure this problem out. I've got the following orders collection:
{
orders: [
{
userId: "abc",
orderId: "123",
balance: 2,
},
{
userId: "abc",
orderId: "123",
balance: 5,
},
{
userId: "def",
orderId: "456",
balance: 1,
},
{
userId: "abc",
orderId: "456",
balance: 3,
},
]
}
I need an aggregation query that returns the sum of balances for a given userId AND orderId. For the example above, given userId = "abc" and orderId = "123", the result of the query would be 7. So far I have tried $map and $sum, but I can't put together the structure of the query.
How can I get the sum of the balances given the userId AND orderId ?
I managed to work it out with this:
db.collection.aggregate([
{
$match: {
"userId": "abc",
"orderId": "123"
}
},
{
$group: {
_id: "$userId",
total: {
$sum: "$balance" // the sample documents store the value in balance, not amount
}
}
},
{
$sort: {
total: -1
}
}
])
orders.aggregate([
{
"$group": {
"_id": {
"user": "$userId",
"order": "$orderId"
},
"amount": {
"$sum": "$balance"
}
}
}
])
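As a sanity check on the expected result, the filter-then-sum logic can be written in plain JavaScript over the question's sample orders. This is an illustration only, not MongoDB API, and sumBalances is a hypothetical helper name.

```javascript
// Plain-JavaScript illustration of "$match on userId and orderId, then $sum".
const orders = [
  { userId: "abc", orderId: "123", balance: 2 },
  { userId: "abc", orderId: "123", balance: 5 },
  { userId: "def", orderId: "456", balance: 1 },
  { userId: "abc", orderId: "456", balance: 3 },
];

// Hypothetical helper: total balance for one userId/orderId pair.
function sumBalances(allOrders, userId, orderId) {
  return allOrders
    .filter((order) => order.userId === userId && order.orderId === orderId)
    .reduce((total, order) => total + order.balance, 0);
}

console.log(sumBalances(orders, "abc", "123")); // 7
console.log(sumBalances(orders, "abc", "456")); // 3
```

Both conditions have to hold at once, which is why the fourth order (same user, different orderId) is excluded from the first total.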

MongoDB - Calculate time difference between documents based on the existence of a value inside an array?

I'm trying to calculate the time difference between the two documents but I'm not sure how to do this based on the existence of a value inside an array.
Let me explain in a little more detail. Say I have five documents: A, B, C, D, E in a collection.
Each document has referenceKey, timestamp and persons fields.
And each element inside the persons array has a personType field along with other fields:
A: { referenceKey: 1, timestamp: ISODate, persons: [ { personType: "ALICE", ... }, { personType: "BOB", ... } ] }
B: { referenceKey: 1, timestamp: ISODate, persons: [ { personType: "ALICE", ... }, { personType: "BOB", ... } ] }
C: { referenceKey: 1, timestamp: ISODate, persons: [ { personType: "BOB", ... } ] }
D: { referenceKey: 1, timestamp: ISODate, persons: [ { personType: "ALICE", ... }, { personType: "BOB", ... } ] }
E: { referenceKey: 1, timestamp: ISODate, persons: [ { personType: "BOB", ... } ] }
What I want to achieve is to calculate how much time the person with type ALICE has spent for each visit.
In other words, this should calculate and return an array of time differences:
[{ timeSpent: C.timestamp - A.timestamp }, { timeSpent: E.timestamp - D.timestamp }]
Here is an example collection to test:
[
{
timestamp: ISODate("2019-04-12T20:00:00.000Z"),
referenceKey: 1,
persons: [
{
personType: "BOB"
},
{
personType: "ALICE"
}
]
},
{
timestamp: ISODate("2019-04-12T20:10:00.000Z"),
referenceKey: 1,
persons: [
{
personType: "BOB"
}
]
},
{
timestamp: ISODate("2019-04-12T21:00:00.000Z"),
referenceKey: 1,
persons: [
{
personType: "BOB"
},
{
personType: "ALICE"
}
]
},
{
timestamp: ISODate("2019-04-12T21:15:00.000Z"),
referenceKey: 1,
persons: [
{
personType: "BOB"
}
]
},
{
timestamp: ISODate("2019-04-12T21:20:00.000Z"),
referenceKey: 1,
persons: [
{
personType: "BOB"
}
]
},
{
timestamp: ISODate("2019-04-12T21:45:00.000Z"),
referenceKey: 1,
persons: [
{
personType: "BOB"
},
{
personType: "ALICE"
}
]
},
{
timestamp: ISODate("2019-04-12T22:05:00.000Z"),
referenceKey: 1,
persons: [
{
personType: "BOB"
},
{
personType: "ALICE"
}
]
},
{
timestamp: ISODate("2019-04-12T23:00:00.000Z"),
referenceKey: 1,
persons: [
{
personType: "BOB"
}
]
},
{
timestamp: ISODate("2019-04-12T18:30:00.000Z"),
referenceKey: 2,
persons: [
{
personType: "BOB"
},
{
personType: "JOHN"
}
]
}
]
I thought I could add a new boolean field hasAlice using $in, based on the existence of person type ALICE. But the problem is that the time spent should be calculated for each visit, so I cannot just use $reduce to calculate the total time. Can I somehow $group by changes in the hasAlice field and then use $reduce?
What I've tried (and failed) so far:
db.collection.aggregate([
{
"$match": { // I also filter by timestamp and referenceKey but it is not relevant to the problem
timestamp: {
"$gte": ISODate("2019-04-12T00:00:00.000Z"),
"$lt": ISODate("2019-04-12T23:59:00.000Z")
},
referenceKey: 1
}
},
{
"$project": {
_id: 0,
timestamp: 1,
hasAlice: {
"$in": [
"ALICE",
"$persons.personType"
]
}
}
},
{
"$sort": {
timestamp: 1
}
}
])
What I want to get:
[
{ timeSpent: 10 }, // in minutes
{ timeSpent: 15 },
{ timeSpent: 75 },
]
What I actually get when I run the aggregation:
[
{
"hasAlice": true, // 1. visit starts
"timestamp": ISODate("2019-04-12T20:00:00Z")
},
{
"hasAlice": false, // 1. visit ends
"timestamp": ISODate("2019-04-12T20:10:00Z")
},
{
"hasAlice": true, // 2. visit starts
"timestamp": ISODate("2019-04-12T21:00:00Z")
},
{
"hasAlice": false, // 2. visit ends
"timestamp": ISODate("2019-04-12T21:15:00Z")
},
{
"hasAlice": false,
"timestamp": ISODate("2019-04-12T21:20:00Z")
},
{
"hasAlice": true, // 3. visit starts
"timestamp": ISODate("2019-04-12T21:45:00Z")
},
{
"hasAlice": true, // NOTE: there are some misleading documents such as these (e.g. document B)
"timestamp": ISODate("2019-04-12T22:05:00Z")
},
{
"hasAlice": false, // 3. visit ends
"timestamp": ISODate("2019-04-12T23:00:00Z")
}
]
I don't know whether my logic is correct, or whether I can somehow reduce these documents to calculate the time spent for each visit. Any help is appreciated.
Thanks in advance.
I finally figured it out, thanks to this somewhat similar post.
The trick is to use $lookup to left-join the collection with itself and then take the first element of the joined collection that does not contain the person type ALICE. This element gives us the end of each visit (i.e. leaveTimestamp).
From there, we can further $group by end of each visit and select only the first timestamp of matching documents so that we can eliminate any misleading documents (e.g. document B).
Here is the full aggregate pipeline:
db.collection.aggregate([
{
"$match": {
timestamp: {
"$gte": ISODate("2019-04-12T00:00:00.000Z"),
"$lt": ISODate("2019-04-12T23:59:00.000Z")
},
referenceKey: 1,
persons: {
"$elemMatch": {
"personType": "ALICE"
}
}
}
},
{
"$project": {
timestamp: 1,
hasAlice: {
"$in": [
"ALICE",
"$persons.personType"
]
}
}
},
{
$lookup: {
from: "collection",
let: {
root_id: "$_id"
},
pipeline: [
{
"$match": {
timestamp: {
"$gte": ISODate("2019-04-12T00:00:00.000Z"),
"$lt": ISODate("2019-04-12T23:59:00.000Z")
},
referenceKey: 1
}
},
{
"$project": {
timestamp: 1,
hasAlice: {
"$in": [
"ALICE",
"$persons.personType"
]
}
}
},
{
$match: {
$expr: {
$gt: [
"$_id",
"$$root_id"
]
}
}
}
],
as: "tmp"
}
},
{
"$project": {
_id: 1,
timestamp: 1,
tmp: {
$filter: {
input: "$tmp",
as: "item",
cond: {
$eq: [
"$$item.hasAlice",
false
]
}
}
}
}
},
{
"$project": {
timestamp: 1,
leaveTimestamp: {
$first: "$tmp.timestamp"
}
}
},
{
"$group": {
"_id": "$leaveTimestamp",
"timestamp": {
"$min": "$timestamp"
},
leaveTimestamp: {
$first: "$leaveTimestamp"
}
}
},
{
$addFields: {
"visitingTime": {
$dateToString: {
date: {
$toDate: {
$subtract: [
"$leaveTimestamp",
"$timestamp"
]
}
},
format: "%H-%M-%S"
}
}
}
},
{
"$sort": {
"timestamp": 1
}
}
])
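For comparison, the same visit logic can be sketched outside the database as a single pass over the time-ordered documents. This is plain JavaScript, not a MongoDB pipeline; visitMinutes and the doc helper are hypothetical names, and the data is the question's sample for referenceKey 1.

```javascript
// Plain-JavaScript illustration: a visit starts at the first document that
// contains the person and ends at the next document that does not.

// Hypothetical helper to build the sample documents compactly.
const doc = (time, ...types) => ({
  timestamp: new Date(`2019-04-12T${time}:00.000Z`),
  referenceKey: 1,
  persons: types.map((personType) => ({ personType })),
});

const docs = [
  doc("20:00", "BOB", "ALICE"),
  doc("20:10", "BOB"),
  doc("21:00", "BOB", "ALICE"),
  doc("21:15", "BOB"),
  doc("21:20", "BOB"),
  doc("21:45", "BOB", "ALICE"),
  doc("22:05", "BOB", "ALICE"),
  doc("23:00", "BOB"),
];

// Hypothetical helper: minutes spent per visit for the given personType.
function visitMinutes(documents, personType) {
  const sorted = [...documents].sort((a, b) => a.timestamp - b.timestamp);
  const visits = [];
  let visitStart = null; // timestamp of the currently open visit, if any

  for (const current of sorted) {
    const present = current.persons.some((p) => p.personType === personType);
    if (present && visitStart === null) {
      visitStart = current.timestamp; // visit starts
    } else if (!present && visitStart !== null) {
      visits.push({ timeSpent: (current.timestamp - visitStart) / 60000 });
      visitStart = null; // visit ends
    }
    // Repeated "present" documents inside an open visit are ignored.
  }
  return visits;
}

console.log(visitMinutes(docs, "ALICE"));
// [ { timeSpent: 10 }, { timeSpent: 15 }, { timeSpent: 75 } ]
```

Keeping only the first timestamp of an open visit is what removes the misleading intermediate documents, the same role the $min in the $group stage plays in the pipeline.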

Can we push object value into $project using mongodb

db.setting.aggregate([
{
$match: {
status: true,
deleted_at: 0,
_id: {
$in: [
ObjectId("5c4ee7eea4affa32face874b"),
ObjectId("5ebf891245aa27c290672325")
]
}
}
},
{
$lookup: {
from: "site",
localField: "_id",
foreignField: "admin_id",
as: "data"
}
},
{
$project: {
name: 1,
status: 1,
price: 1,
currency: 1,
numberOfRecord: {
$size: "$data"
}
}
},
{
$sort: {
numberOfRecord: 1
}
}
])
How can I push the currency into the price object using $project? I'd also like to know the difference between $addToSet and $push, and whether it is better to do this in $project or in $addFields.
https://mongoplayground.net/p/RiWnnRtksb4
Output should be like this:
[
{
"_id": ObjectId("5ebf891245aa27c290672325"),
"currency": "USD",
"name": "Menz",
"numberOfRecord": 0,
"price": {
"numberDecimal": "20",
"currency": "USD",
},
"status": true
},
{
"_id": ObjectId("5c4ee7eea4affa32face874b"),
"currency": "USD",
"name": "Dave",
"numberOfRecord": 2,
"price": {
"numberDecimal": "10",
"currency": "USD"
},
"status": true
}
]
You can insert a field into an object with project directly, like this (field price):
$project: {
name: 1,
status: 1,
price: {
numberDecimal: "$price.numberDecimal",
currency: "$currency"
},
numberOfRecord: {
$size: "$data"
}
}
By doing it with $project, there is no need to use $addFields.
For the difference between $addToSet and $push, read this great answer.
You can just set the object structure while projecting, so in this case there's no need for either $push or $addToSet.
{
$project: {
name: 1,
status: 1,
price: {
currency: "$currency",
numberDecimal: "$price.numberDecimal"
},
currency: 1,
numberOfRecord: {
$size: "$data",
}
}
}
The difference between $push and $addToSet follows from their names: $push keeps every item, while $addToSet builds a set of them, so duplicates are dropped. For example:
input:
[
//doc1
{
item: 1
},
//doc2
{
item: 2
},
//doc3
{
item: 1
}
]
Now this:
{
$group: {
_id: null,
items: {$push: "$item"}
}
}
Will result in:
{_id: null, items: [1, 2, 1]}
While:
{
$group: {
_id: null,
items: {$addToSet: "$item"}
}
}
Will result in:
{_id: null, items: [1, 2]}
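The same contrast can be reproduced in plain JavaScript with an array and a Set. This is only an analogy for the two accumulators, not MongoDB API. One difference to keep in mind: MongoDB does not guarantee the order of elements in an $addToSet result, whereas a JavaScript Set preserves insertion order.

```javascript
// Plain-JavaScript analogy for $push versus $addToSet.
const inputDocs = [{ item: 1 }, { item: 2 }, { item: 1 }];

// Like $push: keep every value, duplicates included.
const pushed = inputDocs.map((d) => d.item);

// Like $addToSet: keep each distinct value once.
const addedToSet = [...new Set(inputDocs.map((d) => d.item))];

console.log(pushed);     // [ 1, 2, 1 ]
console.log(addedToSet); // [ 1, 2 ]
```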