I have inserted the following two documents into my collection. I am trying to show, for every university, the total number of domestic students and the total number of international students across the recorded years. I tried using $sum in a $project stage, but I just get 0. I am also not sure whether it adds up the domestic students from each year and the international students from each year.
db.universities.insertMany([
{country: "Australia", city: "Melbourne", name: "SUT",
domestic_students : [
{ year: 2014, number: 24774 },
{ year: 2015, number: 23166 },
{ year: 2016, number: 21913 },
{ year: 2017, number: 21715}],
international_students : [
{ year: 2014, number: 32178 },
{ year: 2015, number: 36780 },
{ year: 2016, number: 67899 },
{ year: 2017, number: 65321 }]
},
{country: "Australia", city: "Sydney", name: "UTS",
domestic_students : [
{ year: 2014, number: 67891 },
{ year: 2015, number: 56312 },
{ year: 2016, number: 45679 },
{ year: 2017, number: 71235}]
}
]);
You can sum over an array of numbers, but you can't sum over an array of subdocuments. For that you need $unwind.
$unwind "explodes" the array into different documents (mid-aggregation). So if you do:
$unwind: {
path: '$domestic_students',
preserveNullAndEmptyArrays: false
}
You'll end up with several documents, each holding a single domestic_students subdocument (not an array of subdocuments).
I think this does what you want:
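db.universities.aggregate([{
  // One document per domestic_students entry; keep universities that lack the array.
  $unwind: {
    path: '$domestic_students',
    preserveNullAndEmptyArrays: true
  }
}, {
  // Sum the domestic numbers per university before unwinding the second array,
  // so the two unwinds never multiply each other's rows.
  $group: {
    _id: '$_id',
    name: { $first: '$name' },
    international_students: { $first: '$international_students' },
    dtotal: { $sum: '$domestic_students.number' }
  }
}, {
  $unwind: {
    path: '$international_students',
    preserveNullAndEmptyArrays: true
  }
}, {
  $group: {
    _id: '$_id',
    name: { $first: '$name' },
    dtotal: { $first: '$dtotal' },
    itotal: { $sum: '$international_students.number' }
  }
}])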
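As a sanity check on whichever pipeline you settle on, the expected per-university totals can be worked out in plain JavaScript from the two inserted documents. The sketch below needs no MongoDB; the helper names (sumNumbers, crossProductDomesticTotal) are hypothetical. It also illustrates why unwinding both arrays back to back before a single $group would overcount: every domestic row gets paired with every international row.

```javascript
// Data copied from the insertMany call above (country and city omitted).
const universities = [
  { name: "SUT",
    domestic_students: [
      { year: 2014, number: 24774 }, { year: 2015, number: 23166 },
      { year: 2016, number: 21913 }, { year: 2017, number: 21715 }],
    international_students: [
      { year: 2014, number: 32178 }, { year: 2015, number: 36780 },
      { year: 2016, number: 67899 }, { year: 2017, number: 65321 }] },
  { name: "UTS",
    domestic_students: [
      { year: 2014, number: 67891 }, { year: 2015, number: 56312 },
      { year: 2016, number: 45679 }, { year: 2017, number: 71235 }] }
];

// Sum the `number` field of an array of subdocuments; a missing array counts as 0.
const sumNumbers = (rows = []) => rows.reduce((acc, row) => acc + row.number, 0);

const expectedTotals = universities.map((u) => ({
  name: u.name,
  dtotal: sumNumbers(u.domestic_students),
  itotal: sumNumbers(u.international_students)
}));

// Mimic two consecutive unwinds followed by one $group: each domestic number
// appears once per international row, so it is counted several times.
const crossProductDomesticTotal = (u) =>
  (u.domestic_students || [])
    .flatMap((d) => (u.international_students || []).map(() => d.number))
    .reduce((acc, n) => acc + n, 0);

console.log(expectedTotals);
console.log(crossProductDomesticTotal(universities[0]));
```

For SUT the cross-product figure comes out at four times the real domestic total, because SUT has four international rows.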
I like using MongoDB Compass to help with aggregations, because I can see each stage and its output on a sample of documents.
I want to serve data from multiple collections, let's say product1 and product2.
Both collections share the following schema:
{ amount: Number } // other fields might be there but not useful in this case.
After multiple stages of the aggregation pipeline, I'm able to get the data in the following format:
items: [
{
amount: 10,
type: "product1",
date: "2022-10-05"
},
{
amount: 15,
type: "product2",
date: "2022-10-07"
},
{
amount: 100,
type: "product1",
date: "2022-10-10"
}
]
However, I want one more field added to each element of items: the running sum of its own amount and all the previous amounts.
Desired result:
items: [
{
amount: 10,
type: "product1",
date: "2022-10-05",
totalAmount: 10
},
{
amount: 15,
type: "product2",
date: "2022-10-07",
totalAmount: 25
},
{
amount: 100,
type: "product1",
date: "2022-10-10",
totalAmount: 125
}
]
I tried adding another $project stage, which goes as follows:
{
items: {
$map: {
input: "$items",
in: {
$mergeObjects: [
"$$this",
{ totalAmount: {$add : ["$$this.amount", 0] } },
]
}
}
}
}
This just appends another field, totalAmount, equal to the sum of 0 and the amount of that item itself.
I couldn't find a way to turn the second argument of {$add : ["$$this.amount", 0] } (currently 0) into a variable with an initial value of 0.
How can this be done in a MongoDB aggregation pipeline?
PS: I could easily do this with a later mapping in the application code, but I need to add a limit (for pagination) at a later stage.
You can use $reduce instead of $map for this:
db.collection.aggregate([
{$project: {
items: {
$reduce: {
input: "$items",
initialValue: [],
in: {
$concatArrays: [
"$$value",
[{$mergeObjects: [
"$$this",
{totalAmount: {$add: ["$$this.amount", {$sum: "$$value.amount"}]}}
]}]
]
}
}
}
}}
])
See how it works on the playground example
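To make the $reduce logic easier to follow, here is a rough plain-JavaScript equivalent run on the three sample items from the question. It is only a sketch of the same idea (the variable names are mine, not part of the pipeline): the accumulator is the array built so far, and each step appends the current item with its running total.

```javascript
const items = [
  { amount: 10, type: "product1", date: "2022-10-05" },
  { amount: 15, type: "product2", date: "2022-10-07" },
  { amount: 100, type: "product1", date: "2022-10-10" }
];

const withTotals = items.reduce((acc, item) => {
  // Counterpart of {$sum: "$$value.amount"}: add up the amounts already collected.
  const previous = acc.reduce((sum, seen) => sum + seen.amount, 0);
  // Counterpart of $concatArrays + $mergeObjects: append the item plus totalAmount.
  return acc.concat([{ ...item, totalAmount: item.amount + previous }]);
}, []);

console.log(withTotals.map((item) => item.totalAmount));
```

Note that the earlier amounts are summed again at every step, so the work grows quadratically with the array length; for page-sized arrays that is unlikely to matter.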
I am new to MongoDB and have not been able to find a solution to my problem.
I am collecting hourly crypto data. Each document contains an array of objects (positions). Within each of these objects there is another nested array of objects. It looks as follows:
{
  timestamp: "2022-05-11T12:38:01.537Z",
  positions: [
    {
      detail: 1,
      name: "name",
      importantNumber: 0,
      arrayOfTokens: [
        {
          tokenName: "name",
          tokenSymbol: "symbol",
          tokenPrice: 1,
          tokensEarned: 10,
          baseAssetValueOfTokensEarned: 10,
        },
        {
          tokenName: "name2",
          tokenSymbol: "symbol2",
          tokenPrice: 2,
          tokensEarned: 10,
          baseAssetValueOfTokensEarned: 20,
        },
      ],
    },
  ],
}
My goal is to be able to aggregate the hourly data into daily groups, where the timestamp becomes the day's date, the position array still houses the primary details of each position, sums the importantNumber (these I believe I have been able to achieve), and aggregates each hour's token details into one object for each token, calculating the average token price, the total tokens earned etc.
What I have so far is:
const res = await Name.aggregate([
{
$unwind: {
path: "$positions",
},
},
{
$project: {
_id: 0,
timestamp: "$timestamp",
detail: "$positions.detail",
name: "$positions.name",
importantNumber: "$positions.importantNumber",
arrayOfTokens: "$positions.arrayOfTokens",
},
},
{
$group: {
_id: {
date: { $dateToString: { format: "%Y-%m-%d", date: "$timestamp" } },
name: "$name",
},
importantNumber: { $sum: "$importantNumber" },
arrayOfTokens: { $push: "$arrayOfTokens" }, // it is here that I am stuck
},
},
]);
return res;
};
With two hours recorded, the above query returns the following result, with the arrayOfTokens housing multiple arrays:
{
_id: {
date: '2022-05-11',
name: 'name',
},
importantNumber: //sum of important number,
arrayOfTokens: [
[ [Object], [Object] ], // hour 1: token1, token2.
[ [Object], [Object] ] // hour 2: token1, token2
]
}
I would like the arrayOfTokens to house only one instance of each token object. Something similar to the following:
...
arrayOfTokens: [
{allToken1}, {allToken2} // with the name and symbol preserved, and average of the token price, sum of tokens earned and sum of base asset value.
]
Any help would be greatly appreciated, thank you.
Should be this one:
db.collection.aggregate([
{ $unwind: { path: "$positions" } },
{
$group: {
_id: {
date: { $dateTrunc: { date: "$timestamp", unit: "day" } },
name: "$positions.name"
},
importantNumber: { $sum: "$positions.importantNumber" },
arrayOfTokens: { $push: "$positions.arrayOfTokens" }
}
}
])
I prefer $dateTrunc (available from MongoDB 5.0) over grouping by a formatted string, because the group key stays a Date.
Mongo Playground
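The $push in that pipeline still leaves one inner array per hour, so the tokens have to be merged somewhere. One option, offered as an assumption rather than the answerer's method, is to merge them after the query. The sketch below does that in plain JavaScript; the second hour's values are invented for illustration, and mergeTokens and averageTokenPrice are hypothetical names.

```javascript
// What $push yields for one day and position: one inner array per recorded hour.
const arrayOfTokens = [
  [ { tokenName: "name", tokenSymbol: "symbol", tokenPrice: 1, tokensEarned: 10, baseAssetValueOfTokensEarned: 10 },
    { tokenName: "name2", tokenSymbol: "symbol2", tokenPrice: 2, tokensEarned: 10, baseAssetValueOfTokensEarned: 20 } ],
  // Hypothetical second hour.
  [ { tokenName: "name", tokenSymbol: "symbol", tokenPrice: 3, tokensEarned: 5, baseAssetValueOfTokensEarned: 15 },
    { tokenName: "name2", tokenSymbol: "symbol2", tokenPrice: 4, tokensEarned: 10, baseAssetValueOfTokensEarned: 40 } ]
];

function mergeTokens(hourlyArrays) {
  const bySymbol = new Map();
  for (const token of hourlyArrays.flat()) {
    const entry = bySymbol.get(token.tokenSymbol) || {
      tokenName: token.tokenName,
      tokenSymbol: token.tokenSymbol,
      priceSum: 0,
      samples: 0,
      tokensEarned: 0,
      baseAssetValueOfTokensEarned: 0
    };
    entry.priceSum += token.tokenPrice;
    entry.samples += 1;
    entry.tokensEarned += token.tokensEarned;
    entry.baseAssetValueOfTokensEarned += token.baseAssetValueOfTokensEarned;
    bySymbol.set(token.tokenSymbol, entry);
  }
  // Replace the helper counters with the average price.
  return [...bySymbol.values()].map(({ priceSum, samples, ...rest }) => ({
    ...rest,
    averageTokenPrice: priceSum / samples
  }));
}

const merged = mergeTokens(arrayOfTokens);
console.log(merged);
```

Doing the same inside the pipeline would probably mean a second $unwind on positions.arrayOfTokens and grouping by token as well as by day and position name, but I have not verified that variant here.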
I have this model for purchases:
{
purchase_date: 2018-03-11 00:00:00.000,
total_cost: 400,
items: [
{
title: 'Pringles',
price: 200,
quantity: 2,
category: 'Snacks'
}
]
}
What I'm trying to do, first of all, is group the purchases by date, like so:
{$group: {
_id: {
date: '$purchase_date',
items: '$items'
}
}}
However, what I now want to do is group the purchases of each day by items[].category and calculate how much was spent on each category that day. I was able to do that for one day, but once I grouped the purchases by date I was no longer able to $unwind the items.
I tried passing the path $items and it doesn't find it at all. If I try to use $_id.$items or _id.$items in both cases I get an error stating that it is not a valid path for $unwind.
You can use purchase_date and items.category as the grouping _id, but you need to $unwind items first. Then you can add another $group to collect all the category totals per day:
db.col.aggregate([
{ $unwind: "$items" },
{
$group: {
_id: {
purchase_date: "$purchase_date",
category: "$items.category",
},
total: { $sum: { $multiply: [ "$items.price", "$items.quantity" ] } }
}
},
{
$group: {
_id: "$_id.purchase_date",
categories: { $push: { name: "$_id.category", total: "$total" } }
}
}
])
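To see what the two $group stages compute, here is a small plain-JavaScript sketch of the same two-level grouping. Only the Pringles item comes from the question; the other items and the string dates are made up for illustration.

```javascript
const purchases = [
  { purchase_date: "2018-03-11", items: [
      { title: "Pringles", price: 200, quantity: 2, category: "Snacks" },
      { title: "Cola", price: 100, quantity: 1, category: "Drinks" } ] },  // hypothetical
  { purchase_date: "2018-03-11", items: [
      { title: "Chips", price: 50, quantity: 3, category: "Snacks" } ] }   // hypothetical
];

// First level: total per (purchase_date, category), visiting items one by one
// the way $unwind followed by the first $group does.
const totals = new Map(); // purchase_date -> Map(category -> total)
for (const purchase of purchases) {
  for (const item of purchase.items) {
    if (!totals.has(purchase.purchase_date)) totals.set(purchase.purchase_date, new Map());
    const perCategory = totals.get(purchase.purchase_date);
    perCategory.set(item.category, (perCategory.get(item.category) || 0) + item.price * item.quantity);
  }
}

// Second level: one result per day with its categories pushed into an array.
const perDay = [...totals].map(([date, perCategory]) => ({
  _id: date,
  categories: [...perCategory].map(([name, total]) => ({ name, total }))
}));

console.log(JSON.stringify(perDay));
```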
From a product stock log I have created a MongoDB collection. The relevant fields are: sku, stock and date. Every time a product's stock is updated there is a new entry with the total stock.
The skus are made up of two parts. A parent part, say 'A' and a variant or child part, say '1', '2', '3', etc. So a sku might look like this: 'A2'.
I can query for a single product's stock, grouped by day, with this query:
[{
$match: {
sku: 'A2'
}
},
{
$group: {
_id: {
year: {$year: '$date'},
day: {$dayOfYear: '$date'}
},
stock: {
$min: '$stock'
},
date: {
$first: '$date'
}
}
},
{
$sort: {
date: 1
}
}]
Note: I want the minimum stock for each day.
But I need the (minimum) stocks of all variations added up. I can change the $match object to:
[{
$match: {
sku: /^A/
}
},
// ... remaining stages unchanged
]
How do I create a 'sub' group in the $group stage?
EDIT:
The data looks like this:
{
sku: 'A1',
date: '2015-01-01',
stock: 15
}
{
sku: 'A1',
date: '2015-01-01',
stock: 14
}
{
sku: 'A2',
date: '2015-01-01',
stock: 20
}
Two stocks for 'A1' and one for 'A2' on a single day. My query (all skus grouped by day) would give me stock 14 as a result ($min of the 3 values). But I want the result to be 34: 20 (min for A2) plus 14 (min for A1).
If you add the sku to the _id field in the group phase it will aggregate on that as well, i.e. group per sku, year & day.
db.stocks.aggregate(
[
{
$group: {
_id: {
sku: '$sku',
year: {$year: '$date'},
day: {$dayOfYear: '$date'}
},
stock: {
$min: '$stock'
},
date: {
$first: '$date'
}
}
},
{
$sort: {
date: 1
}
}]
)
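That pipeline yields one minimum per sku per day. To arrive at the 34 from the question's edit, those minima still have to be added up per day, for instance with a further $group keyed on _id.year and _id.day that uses $sum on stock (my reading of the question, not part of the answer above). A plain-JavaScript sketch of the two steps, using the three sample entries:

```javascript
const entries = [
  { sku: "A1", date: "2015-01-01", stock: 15 },
  { sku: "A1", date: "2015-01-01", stock: 14 },
  { sku: "A2", date: "2015-01-01", stock: 20 }
];

// Step 1: minimum stock per (sku, day), as in the $group shown above.
const minPerSkuDay = new Map();
for (const entry of entries) {
  const key = `${entry.sku}|${entry.date}`;
  const current = minPerSkuDay.get(key);
  minPerSkuDay.set(key, current === undefined ? entry.stock : Math.min(current, entry.stock));
}

// Step 2: add the per-sku minima together for each day.
const totalPerDay = {};
for (const [key, min] of minPerSkuDay) {
  const date = key.split("|")[1];
  totalPerDay[date] = (totalPerDay[date] || 0) + min;
}

console.log(totalPerDay);
```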
I've got a series of docs in MongoDB. An example doc would be
{
createdAt: Mon Oct 12 2015 09:45:20 GMT-0700 (PDT),
year: 2015,
week: 41
}
Imagine these span all weeks of the year and there can be many in the same week. I want to aggregate them in such a way that the resulting values are a sum of each week and all its prior weeks counting the total docs.
So if there were something like 10 in the first week of the year and 20 in the second, the result could be something like
[{ week: 1, total: 10, weekTotal: 10},
{ week: 2, total: 30, weekTotal: 20}]
Creating an aggregation to find the weekTotal is easy enough. Here it is, including a projection to show the first part:
db.collection.aggregate([
{
$project: {
"createdAt": 1,
year: {$year: "$createdAt"},
week: {$week: "$createdAt"},
_id: 0
}
},
{
$group: {
_id: {year: "$year", week: "$week"},
weekTotal : { $sum : 1 }
}
},
]);
But getting past this to sum based on that week and those weeks preceding is proving tricky.
Before MongoDB 5.0, which added the $setWindowFields stage for running totals, the aggregation framework cannot do this, because every operation can effectively look at only one document or grouping boundary at a time. To do it on the "server" you need something with access to a global variable that keeps the "running total", and that means mapReduce instead:
db.collection.mapReduce(
function() {
Date.prototype.getWeekNumber = function(){
var d = new Date(+this);
d.setHours(0,0,0);
d.setDate(d.getDate()+4-(d.getDay()||7));
return Math.ceil((((d-new Date(d.getFullYear(),0,1))/8.64e7)+1)/7);
};
emit({ year: this.createdAt.getFullYear(), week: this.createdAt.getWeekNumber() }, 1);
},
function(key, values) {
return Array.sum(values);
},
{
out: { inline: 1 },
scope: { total: 0 },
finalize: function(key, value) {
total += value;
return { total: total, weekTotal: value }
}
}
)
If you can live with the operation occurring on the "client", then you need to loop through the aggregation result and sum up the totals in the same way:
var total = 0;
db.collection.aggregate([
{ "$group": {
"_id": {
"year": { "$year": "$createdAt" },
"week": { "$week": "$createdAt" }
},
"weekTotal": { "$sum": 1 }
}},
{ "$sort": { "_id": 1 } }
]).map(function(doc) {
total += doc.weekTotal;
doc.total = total;
return doc;
});
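The client-side step can be tried without a database. The sketch below uses the two weekly counts from the question as a stand-in for the aggregation output, deliberately out of order to show why the $sort matters; the variable names are hypothetical.

```javascript
// Stand-in for the aggregation result (week 2 listed first on purpose).
const weekly = [
  { _id: { year: 2015, week: 2 }, weekTotal: 20 },
  { _id: { year: 2015, week: 1 }, weekTotal: 10 }
];

// Counterpart of { "$sort": { "_id": 1 } }: order by year, then week.
const sorted = [...weekly].sort(
  (a, b) => a._id.year - b._id.year || a._id.week - b._id.week
);

// Carry the running total across the sorted documents.
let runningTotal = 0;
const result = sorted.map((doc) => {
  runningTotal += doc.weekTotal;
  return { week: doc._id.week, total: runningTotal, weekTotal: doc.weekTotal };
});

console.log(result);
```

Without the sort, the running total would be accumulated in whatever order the groups happen to come back.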
Which one you choose depends on whether it makes more sense for this to happen on the server or on the client. But since the aggregation pipeline has no such "globals", you probably should not rely on it for any further processing without outputting to another collection anyway.