Creating objects from array of objects in a grouped mongo aggregation - mongodb

I have been writing an aggregation pipeline to show a summarized version of data from a collection.
Sample Structure of Document:
{
_id: 'abcxyz',
eventCode: 'EVENTCODE01',
eventName: 'SOMEEVENT',
units: 1,
rate: 2,
cost: 2,
distribution: [
{
startDate: 2021-05-31T04:00:00.000+00:00,
units: 1
}
]
}
I grouped the documents and merged the distribution arrays into a single list, with an $unwind stage before the $group:
[
  {
    $unwind: {
      path: '$distribution',
      preserveNullAndEmptyArrays: false
    }
  },
  {
    $group: {
      _id: {
        eventName: '$eventName',
        eventCode: '$eventCode'
      },
      totalUnits: {
        $sum: '$units'
      },
      distributionList: {
        $push: '$distribution'
      },
      perUnitRate: {
        $avg: '$rate'
      },
      perUnitCost: {
        $avg: '$cost'
      }
    }
  }
]
Sample Output:
{
_id: {
eventName: 'EVENTNAME101',
eventCode: 'QQQ'
},
totalUnits: 7,
perUnitRate: 2,
perUnitCost: 2,
distributionList: [
{
startDate: 2021-05-31T04:00:00.000+00:00,
units: 1
},
{
startDate: 2021-05-31T04:00:00.000+00:00,
units: 1
},
{
startDate: 2021-06-07T04:00:00.000+00:00,
units: 1
}
]
}
I'm stuck at the next step: I want to consolidate distributionList into a new list with no repeated startDate.
For example, since the first two objects of distributionList have the same startDate, they should become a single object in the output, with their units summed:
Expected:
{
_id: {
eventName: 'EVENTNAME101',
eventCode: 'QQQ'
},
totalUnits: 7,
perUnitRate: 2,
perUnitCost: 2,
newDistributionList: [
{
startDate: 2021-05-31T04:00:00.000+00:00,
units: 2 //units summed for first 2 objects
},
{
startDate: 2021-06-07T04:00:00.000+00:00,
units: 1
}
]
}
I couldn't use $unwind or $bucket because I want to keep the grouping from the previous $group stage.
Can anyone suggest how to do this, or a different approach if this one doesn't seem right?

You may want to do the first $group at the eventName, eventCode, distribution.startDate level. Then you can $group again at the eventName, eventCode level and use $first to keep your original $group fields.
Here is the Mongo Playground to show the idea for your reference.
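A minimal sketch of that two-stage idea, written as a plain pipeline array. This is my reading of the suggestion, not necessarily the answerer's exact pipeline. Instead of $first it carries sums and a count through the first $group so the averages can be rebuilt afterwards; rateSum, costSum and docCount are hypothetical helper names, and the sketch assumes every document has a numeric rate and cost.

```javascript
// Sketch only: pass `pipeline` to db.collection.aggregate(pipeline) in mongosh.
// rateSum, costSum and docCount are hypothetical helper fields, not from the question.
const pipeline = [
  { $unwind: { path: "$distribution", preserveNullAndEmptyArrays: false } },
  {
    // First $group: one bucket per event and startDate.
    $group: {
      _id: {
        eventName: "$eventName",
        eventCode: "$eventCode",
        startDate: "$distribution.startDate"
      },
      units: { $sum: "$distribution.units" },
      totalUnits: { $sum: "$units" },
      rateSum: { $sum: "$rate" },
      costSum: { $sum: "$cost" },
      docCount: { $sum: 1 }
    }
  },
  // Sort so the pushed list normally comes out in date order.
  { $sort: { "_id.startDate": 1 } },
  {
    // Second $group: back to one document per event.
    $group: {
      _id: { eventName: "$_id.eventName", eventCode: "$_id.eventCode" },
      totalUnits: { $sum: "$totalUnits" },
      rateSum: { $sum: "$rateSum" },
      costSum: { $sum: "$costSum" },
      docCount: { $sum: "$docCount" },
      newDistributionList: {
        $push: { startDate: "$_id.startDate", units: "$units" }
      }
    }
  },
  {
    // Rebuild the averages from the carried sums and count.
    $project: {
      totalUnits: 1,
      newDistributionList: 1,
      perUnitRate: { $divide: ["$rateSum", "$docCount"] },
      perUnitCost: { $divide: ["$costSum", "$docCount"] }
    }
  }
];
```

Carrying sums and a count avoids taking an average of averages, which would weight each startDate bucket equally regardless of how many documents fell into it.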

Related

How to map on array fields with a dynamic variable in MongoDB, while projection (aggregation)

I want to serve data from multiple collections, let's say product1 and product2.
The schema of both can be summarized as:
{ amount: Number } // other fields might be there but not useful in this case.
Now, after multiple stages of the aggregation pipeline, I get the data in the following format:
items: [
{
amount: 10,
type: "product1",
date: "2022-10-05"
},
{
amount: 15,
type: "product2",
date: "2022-10-07"
},
{
amount: 100,
type: "product1",
date: "2022-10-10"
}
]
However, I want one more field added to each element of items: the running total of all amounts up to and including that element.
Desired result:
items: [
{
amount: 10,
type: "product1",
date: "2022-10-05",
totalAmount: 10
},
{
amount: 15,
type: "product2",
date: "2022-10-07",
totalAmount: 25
},
{
amount: 100,
type: "product1",
date: "2022-10-10",
totalAmount: 125
}
]
I tried adding another $project stage, as follows:
{
items: {
$map: {
input: "$items",
in: {
$mergeObjects: [
"$$this",
{ totalAmount: {$add : ["$$this.amount", 0] } },
]
}
}
}
}
This just adds a totalAmount field equal to the item's own amount plus 0.
I couldn't find a way to turn the second argument (currently 0) in {$add : ["$$this.amount", 0] } into a variable that starts at 0 and accumulates.
How can this be done in a MongoDB aggregation pipeline?
PS: I could easily do this with a later mapping in application code, but I need to apply a limit (for pagination) at a later stage of the pipeline.
You can use $reduce instead of $map for this:
db.collection.aggregate([
{$project: {
items: {
$reduce: {
input: "$items",
initialValue: [],
in: {
$concatArrays: [
"$$value",
[{$mergeObjects: [
"$$this",
{totalAmount: {$add: ["$$this.amount", {$sum: "$$value.amount"}]}}
]}]
]
}
}
}
}}
])
See how it works on the playground example
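To see why the $reduce accumulates, here is a plain-JavaScript model of the same fold. It is an illustration only, nothing MongoDB runs; withRunningTotal is a hypothetical helper name, and the data is the sample from the question.

```javascript
// Plain-JavaScript model of the $reduce stage (illustration only).
function withRunningTotal(items) {
  return items.reduce((value, item) => {
    // Mirrors {$sum: "$$value.amount"}: total of the amounts already accumulated.
    const previous = value.reduce((sum, v) => sum + v.amount, 0);
    // Mirrors $concatArrays + $mergeObjects: append a copy with totalAmount added.
    return value.concat([{ ...item, totalAmount: item.amount + previous }]);
  }, []);
}

const items = [
  { amount: 10, type: "product1", date: "2022-10-05" },
  { amount: 15, type: "product2", date: "2022-10-07" },
  { amount: 100, type: "product1", date: "2022-10-10" }
];

const result = withRunningTotal(items);
console.log(result.map((i) => i.totalAmount)); // [ 10, 25, 125 ]
```

Each step re-sums everything accumulated so far, so the work grows quadratically with the array length. That is probably fine for page-sized arrays but worth keeping in mind for very long ones.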

Categorize data using MongoDB aggregation

The payload is an Excel sheet with four columns: Date, status, amount and orderId. The data needs to be grouped by month, and within each month the orders need to be categorized by status.
Umbrella Status:
INTRANSIT - ‘intransit’, ‘at hub’, ‘out for delivery’
RTO - ‘RTO Intransit’, ‘RTO Delivered’
PROCESSING - ‘processing’
For example:
Response should look like: -
May:
1. INTRANSIT
2. RTO
3. PROCESSING
June:
1. INTRANSIT
2. RTO
3. PROCESSING
You can use any of the aggregation operators MongoDB provides, for example $group, $facet, $match, $unwind, $bucket, $project, $lookup, etc.
I tried it with this:
const pipeline = [{
  $facet: {
    "INTRANSIT": [
      { $match: { Status: { $in: ['INTRANSIT', 'AT HUB', 'OUT FOR DELIVERY'] } } },
      { $group: { _id: "$Date", numberofbookings: { $sum: 1 } } }
    ],
    "RTO": [
      { $match: { Status: { $in: ['RTO INTRANSIT', 'RTO DELIVERED'] } } },
      { $group: { _id: "$Date", numberofbookings: { $sum: 1 } } }
    ],
    "PROCESSING": [
      { $match: { Status: { $in: ['PROCESSING'] } } },
      { $group: { _id: date.getMonth("$Date"), numberofbookings: { $sum: 1 } } }
    ]
  }
}];
const aggCursor = coll.aggregate(pipeline);
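One possible way to get the month-then-status shape, sketched under assumptions: Date is stored as a BSON date (not a string), and Status uses the upper-case values from the attempt above. Note that date.getMonth("$Date") is a client-side JavaScript call, not an aggregation expression; the $month operator is the server-side equivalent. The field name umbrella and the "OTHER" fallback label are hypothetical names introduced here.

```javascript
// Sketch only: pass `pipeline` to coll.aggregate(pipeline).
// `umbrella` and "OTHER" are hypothetical names, not from the question.
const pipeline = [
  {
    // Map each raw status onto its umbrella status.
    $addFields: {
      umbrella: {
        $switch: {
          branches: [
            {
              case: { $in: ["$Status", ["INTRANSIT", "AT HUB", "OUT FOR DELIVERY"]] },
              then: "INTRANSIT"
            },
            {
              case: { $in: ["$Status", ["RTO INTRANSIT", "RTO DELIVERED"]] },
              then: "RTO"
            },
            { case: { $eq: ["$Status", "PROCESSING"] }, then: "PROCESSING" }
          ],
          default: "OTHER"
        }
      }
    }
  },
  {
    // Count orders per (month, umbrella status); $month returns 1-12.
    $group: {
      _id: { month: { $month: "$Date" }, umbrella: "$umbrella" },
      numberofbookings: { $sum: 1 }
    }
  },
  {
    // Collapse to one document per month with its statuses listed.
    $group: {
      _id: "$_id.month",
      statuses: {
        $push: { status: "$_id.umbrella", numberofbookings: "$numberofbookings" }
      }
    }
  },
  { $sort: { _id: 1 } }
];
```

If the sheet spans more than one year, adding { $year: "$Date" } to the first group key would keep May of different years apart.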

MongoDB: Aggregating hourly data into daily aggregates

I am new to MongoDB and have not been able to find a solution to my problem.
I am collecting hourly crypto data. Each document contains an array of objects (positions), and each of these objects contains another nested array of objects. It looks as follows:
{
  timestamp: "2022-05-11T12:38:01.537Z",
  positions: [
    {
      detail: 1,
      name: "name",
      importantNumber: 0,
      arrayOfTokens: [
        {
          tokenName: "name",
          tokenSymbol: "symbol",
          tokenPrice: 1,
          tokensEarned: 10,
          baseAssetValueOfTokensEarned: 10,
        },
        {
          tokenName: "name2",
          tokenSymbol: "symbol2",
          tokenPrice: 2,
          tokensEarned: 10,
          baseAssetValueOfTokensEarned: 20,
        },
      ],
    },
  ],
}
My goal is to aggregate the hourly data into daily groups, where the timestamp becomes the day's date and the positions array still holds the primary details of each position. The importantNumber should be summed (I believe I have achieved these parts). Each hour's token details should be merged into one object per token, with the average token price, the total tokens earned, etc.
What I have so far is:
const res = await Name.aggregate([
{
$unwind: {
path: "$positions",
},
},
{
$project: {
_id: 0,
timestamp: "$timestamp",
detail: "$positions.detail",
name: "$positions.name",
importantNumber: "$positions.importantNumber",
arrayOfTokens: "$positions.arrayOfTokens",
},
},
{
$group: {
_id: {
date: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } },
name: "$name",
},
importantNumber: { $sum: "$importantNumber" },
arrayOfTokens: { $push: "$arrayOfTokens" }, // it is here that I am stuck
},
},
]);
return res;
};
With two hours recorded, the above query returns the following result, with the arrayOfTokens housing multiple arrays:
{
_id: {
date: '2022-05-11',
name: 'name',
},
importantNumber: //sum of important number,
arrayOfTokens: [
[ [Object], [Object] ], // hour 1: token1, token2.
[ [Object], [Object] ] // hour 2: token1, token2
]
}
I would like the arrayOfTokens to house only one instance of each token object. Something similar to the following:
...
arrayOfTokens: [
{allToken1}, {allToken2} // with the name and symbol preserved, and average of the token price, sum of tokens earned and sum of base asset value.
]
Any help would be greatly appreciated, thank you.
Should be this one:
db.collection.aggregate([
{ $unwind: { path: "$positions" } },
{
$group: {
_id: {
date: { $dateTrunc: { date: "$timestamp", unit: "day" } },
name: "$positions.name"
},
importantNumber: { $sum: "$positions.importantNumber" },
arrayOfTokens: { $push: "$positions.arrayOfTokens" }
}
}
])
I prefer $dateTrunc over grouping by a formatted date string.
Mongo Playground
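The pipeline above still leaves arrayOfTokens as an array of arrays. One way to collapse it to one object per token, sketched under assumptions: timestamp is stored as a BSON date, every position has an arrayOfTokens array, tokenSymbol identifies a token, and the server supports $dateTrunc (MongoDB 5.0+). The field allTokens and the variables symbol, t and matches are hypothetical names introduced for this sketch.

```javascript
// Sketch only: pass `pipeline` to db.collection.aggregate(pipeline).
const pipeline = [
  { $unwind: { path: "$positions" } },
  {
    $group: {
      _id: {
        date: { $dateTrunc: { date: "$timestamp", unit: "day" } },
        name: "$positions.name"
      },
      importantNumber: { $sum: "$positions.importantNumber" },
      arrayOfTokens: { $push: "$positions.arrayOfTokens" }
    }
  },
  {
    // Flatten the per-hour arrays into one list of token objects.
    $set: {
      allTokens: {
        $reduce: {
          input: "$arrayOfTokens",
          initialValue: [],
          in: { $concatArrays: ["$$value", "$$this"] }
        }
      }
    }
  },
  {
    // Build one summary object per distinct tokenSymbol.
    $set: {
      arrayOfTokens: {
        $map: {
          input: { $setUnion: ["$allTokens.tokenSymbol"] },
          as: "symbol",
          in: {
            $let: {
              vars: {
                matches: {
                  $filter: {
                    input: "$allTokens",
                    as: "t",
                    cond: { $eq: ["$$t.tokenSymbol", "$$symbol"] }
                  }
                }
              },
              in: {
                tokenName: { $arrayElemAt: ["$$matches.tokenName", 0] },
                tokenSymbol: "$$symbol",
                tokenPrice: { $avg: "$$matches.tokenPrice" },
                tokensEarned: { $sum: "$$matches.tokensEarned" },
                baseAssetValueOfTokensEarned: {
                  $sum: "$$matches.baseAssetValueOfTokensEarned"
                }
              }
            }
          }
        }
      }
    }
  },
  { $unset: "allTokens" }
];
```

Doing the token roll-up with array expressions, rather than a second $unwind on arrayOfTokens, keeps importantNumber from being counted once per token.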

MongoDB Unwind In Parallel Stages or Combine

Given the following data:
let tasks = [
{
_id: 1,
task_number: 1,
description: 'Clean Bathroom',
customer: 'Walmart',
users_worked: [
{user: 'Jonny', hours: 1},
{user: 'Cindy', hours: 1}
],
supplies_used: [
{item_code: 'LD4949', description: 'Liquid Detergent', quantity: 1}
]
},
{
_id: 2,
task_number: 2,
description: 'Stock Cheeses',
customer: 'Walmart',
users_worked: [
{user: 'Mark', hours: 3.0},
{user: 'Shelby', hours: 2.0}
],
supplies_used: []
}
];
Suppose I want to show a table with each one of these in a list format:
task_number | description | customer | users | users_worked.hours (sum) | supplies_used.quantity (sum)
----------------------------------------------------------------------------------------------
1 | 'Clean Bathroom' | 'Walmart' | 'Jonny, Cindy' | 2 | 1
2 | 'Stock Cheeses' | 'Walmart' | 'Mark, Shelby' | 5 | 0
The aggregate:
[
  {
    $unwind: {
      path: "$users_worked",
      preserveNullAndEmptyArrays: true
    }
  },
  {
    $unwind: {
      path: "$supplies_used",
      preserveNullAndEmptyArrays: true
    }
  },
  {
    $group: {
      _id: "$_id",
      task_number: {
        $first: "$task_number"
      },
      description: {
        $first: "$description"
      },
      customer: {
        $first: "$customer"
      },
      users: {
        $push: "$users_worked.user"
      },
      users_worked: {
        $sum: "$users_worked.hours"
      },
      supplies_used: {
        $sum: "$supplies_used.quantity"
      }
    }
  }
]
The problem is that I need to $unwind both arrays (users_worked and supplies_used), which skews my results because it produces a cartesian product. Since task #1 has 2 elements in its users_worked array, my supplies_used count goes up to 2.
This is a simple example; there could be many arrays, and the more elements each has, the more the data is skewed.
Is there a way to unwind multiple arrays separately in an aggregation so they don't skew each other? I have seen an example that creates one combined object, so that there is only one source of unwinding, but I don't understand how to apply that to what I want.
* EDIT *
I see that you can use the $zip mongo aggregate command to combine multiple arrays into 1 array. This is a good start:
arrays: {
$map: {
input: {
$zip: {
inputs: [
'$users_worked',
'$supplies_used'
],
}
},
as: 'zipped',
in: {
users_worked: {
$arrayElemAt: [
'$$zipped',
0
]
},
supplies_used: {
$arrayElemAt: [
'$$zipped',
1
]
}
}
}
}
How can I use $zip if I have an array of arrays? For example:
let tasks = [
{
_id: 1,
task_number: 1,
description: 'Clean Bathroom',
customer: 'Walmart',
users_worked: [
{user: 'Jonny', hours: 1},
{user: 'Cindy', hours: 1}
],
supplies_used: [
{item_code: 'LD4949', description: 'Liquid Detergent', quantity: 1}
],
invoices: [
{
invoicable: true,
items: [
{item_code: 'LD4949', price: 39.99, quantity: 1, total: 39.99},
{item_code: 'Hours', price: 50.00, quantity: 2, total: 100.00}
]
}
]
},
{
_id: 2,
task_number: 2,
description: 'Stock Cheeses',
customer: 'Walmart',
users_worked: [
{user: 'Mark', hours: 3.0},
{user: 'Shelby', hours: 2.0}
],
supplies_used: [],
invoices: []
}
];
And I want to include the sum of invoices.items.total in my list.
Instead of using $unwind and $group you can use $reduce to aggregate names and $sum to sum up the numbers:
db.collection.aggregate([
{
$project: {
task_number: 1,
description: 1,
customer: 1,
users: {
$reduce: {
input: "$users_worked",
initialValue: "",
in: {
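// Skip the separator for the first name to avoid a leading ", "
$cond: [ { $eq: [ "$$value", "" ] }, "$$this.user", { $concat: [ "$$value", ", ", "$$this.user" ] } ]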
}
}
},
users_worked: { $sum: "$users_worked.hours" },
quantity: { $sum: "$supplies_used.quantity" }
}
}
])
Mongo Playground
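For the nested case from the question (the sum of invoices.items.total), the same no-$unwind approach can be extended with a $map inside $sum. This is a sketch, not part of the original answer; invoice_total and the variable invoice are hypothetical names.

```javascript
// Sketch only: pass `pipeline` to db.collection.aggregate(pipeline).
const pipeline = [
  {
    $project: {
      task_number: 1,
      description: 1,
      customer: 1,
      invoice_total: {
        // Outer $sum adds up the per-invoice subtotals.
        $sum: {
          $map: {
            input: "$invoices",
            as: "invoice",
            // Inner $sum adds the totals of one invoice's items.
            in: { $sum: "$$invoice.items.total" }
          }
        }
      }
    }
  }
];
```

With the sample data, task 1 would come out at 39.99 + 100.00 and task 2, whose invoices array is empty, at 0.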
I was able to solve this using multiple layers of $zip. Start at the root layer: zip all arrays on the root into one array and unwind it, then $sum any fields on that layer. Then move to the next layer: $zip the new arrays, unwind, sum that layer, and so on.
The issue is that unwinding multiple arrays gives you the cartesian product. Combine them into one array first, then unwind, and everything comes out correct.
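A rough sketch of the root layer of that zip-then-unwind approach, as I understand it. It assumes useLongestLength: true so the shorter array is padded with null rather than the longer one being cut off; rows and zipped are hypothetical names, and the sketch has not been run against a server. Nested arrays such as invoices.items would need a further layer of the same pattern.

```javascript
// Sketch only: pass `pipeline` to db.collection.aggregate(pipeline).
const pipeline = [
  {
    $project: {
      task_number: 1,
      description: 1,
      customer: 1,
      // Pair up the two root arrays position by position.
      rows: {
        $map: {
          input: {
            $zip: {
              inputs: ["$users_worked", "$supplies_used"],
              useLongestLength: true
            }
          },
          as: "zipped",
          in: {
            users_worked: { $arrayElemAt: ["$$zipped", 0] },
            supplies_used: { $arrayElemAt: ["$$zipped", 1] }
          }
        }
      }
    }
  },
  // One unwind over the combined array, so there is no cartesian product.
  { $unwind: { path: "$rows", preserveNullAndEmptyArrays: true } },
  {
    $group: {
      _id: "$_id",
      task_number: { $first: "$task_number" },
      description: { $first: "$description" },
      customer: { $first: "$customer" },
      users: { $push: "$rows.users_worked.user" },
      users_worked: { $sum: "$rows.users_worked.hours" },
      supplies_used: { $sum: "$rows.supplies_used.quantity" }
    }
  }
];
```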

MongoDB aggregation: $unwind after grouping by date

I have this model for purchases:
{
purchase_date: 2018-03-11 00:00:00.000,
total_cost: 400,
items: [
{
title: 'Pringles',
price: 200,
quantity: 2,
category: 'Snacks'
}
]
}
What I'm trying to do, first of all, is group the purchases by date, like so:
{$group: {
_id: {
date: '$purchase_date',
items: '$items'
}
}}
What I want to do next is group each day's purchases by items[].category and calculate how much was spent on each category that day. I was able to do that for a single day, but once I grouped the purchases by date I was no longer able to $unwind the items.
I tried passing the path $items, but it isn't found at all. If I try $_id.$items or _id.$items, in both cases I get an error stating that it is not a valid path for $unwind.
You can use purchase_date and items.category as the grouping _id, but you need to $unwind items first. Then you can add another $group to collect all category groups per day:
db.col.aggregate([
{ $unwind: "$items" },
{
$group: {
_id: {
purchase_date: "$purchase_date",
category: "$items.category",
},
total: { $sum: { $multiply: [ "$items.price", "$items.quantity" ] } }
}
},
{
$group: {
_id: "$_id.purchase_date",
categories: { $push: { name: "$_id.category", total: "$total" } }
}
}
])
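To check what the two $group stages compute, here is a plain-JavaScript model of the same logic. It is an illustration only, nothing MongoDB runs; spendByDayAndCategory is a hypothetical helper, and the date is kept as a plain string for simplicity.

```javascript
// Plain-JavaScript model of $unwind + two $group stages (illustration only).
function spendByDayAndCategory(purchases) {
  const byDay = new Map(); // purchase_date -> Map(category -> total)
  for (const purchase of purchases) {
    for (const item of purchase.items) { // mirrors $unwind: "$items"
      const day = byDay.get(purchase.purchase_date) ?? new Map();
      // Mirrors the first $group: sum price * quantity per (date, category).
      const soFar = day.get(item.category) ?? 0;
      day.set(item.category, soFar + item.price * item.quantity);
      byDay.set(purchase.purchase_date, day);
    }
  }
  // Mirrors the second $group: one document per day with its categories pushed.
  return [...byDay].map(([date, categories]) => ({
    _id: date,
    categories: [...categories].map(([name, total]) => ({ name, total }))
  }));
}

// The sample purchase from the question.
const purchases = [
  {
    purchase_date: "2018-03-11 00:00:00.000",
    total_cost: 400,
    items: [{ title: "Pringles", price: 200, quantity: 2, category: "Snacks" }]
  }
];

console.log(JSON.stringify(spendByDayAndCategory(purchases)));
```

For the sample purchase this gives a single day with one category, Snacks, totalling 400, which matches the document's total_cost.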