MongoDB aggregation: get all fields, sum, group by date

Collection
{
"_id" : ObjectId("5a143a79ca78479b1dc90161"),
"createdAt" : ISODate("2017-11-21T14:38:49.375Z"),
"amount" : 227.93359186,
"pair" : "ant_eth"
}
Expected output
{
"12-12-2012": [
{
"pair": "ant_eth",
"sum": "sum of amounts in 12-12-2012"
},
{
"pair": "new_pair",
"sum": "sum of amounts in 12-12-2012"
},
],
"13-12-2012": [{
"pair": "ant_eth",
"sum": "sum of amounts in 13-12-2012"
}]
}
What I have achieved so far is:
const criteria = [
{ $group: {
_id: '$pair',
totalAmount: { $sum: '$amount' } } }
]
Any help to achieve the expected output is much appreciated.

OK, so you want to sum amount by pair and by just the date portion of createdAt, and then organize the pair+sum entries by date. You can do this by regrouping, as follows. The first $group computes the sums but leaves one document per date and pair, so the dates repeat. The second $group collects those documents per date, which is almost the output you want, except that the dates remain values of _id (rvals) instead of becoming field names (lvals) themselves.
db.foo.aggregate([
{
$group: {
_id: {d: {$dateToString: { format: "%Y-%m-%d", date: "$createdAt"}}, pair: "$pair"},
n: {$sum: "$amount"}
}
},
{
$group: {
_id: "$_id.d",
items: {$push: {pair: "$_id.pair", sum: "$n"}}
}
}
]);
If you REALLY want to have field names, then add these two stages after the second $group:
,{$project: {x: [["$_id","$items"]] }}
,{$replaceRoot: { newRoot: {$arrayToObject: "$x"} }}
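Putting the four stages together might look like the sketch below. It assumes the day-month-year keys from the question's expected output are wanted, so the "%d-%m-%Y" format string is my substitution for the "%Y-%m-%d" used above; the collection name foo comes from the answer. $arrayToObject needs MongoDB 3.4.4 or later.

```javascript
// Sketch only: the pipeline is built as plain data.
// In the mongo shell it would be run with db.foo.aggregate(pipeline).
const pipeline = [
  // 1. Sum amount per (day, pair).
  { $group: {
      _id: {
        d: { $dateToString: { format: "%d-%m-%Y", date: "$createdAt" } },
        pair: "$pair"
      },
      n: { $sum: "$amount" }
  } },
  // 2. Collect the per-pair sums under each day.
  { $group: {
      _id: "$_id.d",
      items: { $push: { pair: "$_id.pair", sum: "$n" } }
  } },
  // 3. Wrap each day into a one-element [[key, value]] array ...
  { $project: { x: [["$_id", "$items"]] } },
  // 4. ... and turn that array into a document whose field name is the date.
  { $replaceRoot: { newRoot: { $arrayToObject: "$x" } } }
];
```

Note that this yields one document per date rather than a single combined object; merging them into one object would still be a separate step.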

This is what I could get to:
db.collection.aggregate([{
$group: {
_id: {
year: {
"$year": "$createdAt"
},
month: {
"$month": "$createdAt"
},
day: {
"$dayOfMonth": "$createdAt"
},
pair: "$pair"
},
sum: {
$sum: "$amount"
}
}
}])
For the rest, you will probably need to reshape the results on the application side to generate the output you want.
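The application-side step could be as small as the sketch below. The helper name groupByDate and the sample rows are made up for illustration; the row shape matches what the $group stage above emits.

```javascript
// Hypothetical helper: reshapes rows like
//   { _id: { year, month, day, pair }, sum }
// into { "DD-MM-YYYY": [ { pair, sum }, ... ] }.
function groupByDate(rows) {
  const pad = (n) => String(n).padStart(2, "0");
  const out = {};
  for (const { _id, sum } of rows) {
    const key = `${pad(_id.day)}-${pad(_id.month)}-${_id.year}`;
    if (!out[key]) out[key] = [];
    out[key].push({ pair: _id.pair, sum });
  }
  return out;
}

// Made-up rows, standing in for the aggregation result.
const rows = [
  { _id: { year: 2012, month: 12, day: 12, pair: "ant_eth" }, sum: 10 },
  { _id: { year: 2012, month: 12, day: 12, pair: "new_pair" }, sum: 4 },
  { _id: { year: 2012, month: 12, day: 13, pair: "ant_eth" }, sum: 7 }
];
const byDate = groupByDate(rows);
```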

Related

Mongodb aggregate group by array elements

I have a MongoDB collection whose documents contain a customer id, a status (activate, deactivate) and a date.
[
{
id:1,
date:ISODate('2022-12-01'),
status:'activate'
},
{
id:2,
date:ISODate('2022-12-01'),
status:'activate'
},
{
id:1,
date:ISODate('2022-12-02'),
status:'deactivate'
},
{
id:2,
date:ISODate('2022-12-21'),
status:'deactivate'
}
]
I need to get the day-wise customer status count.
I came up with the aggregation below.
db.collection.aggregate([
{
$addFields: {
"day": {
"$dateToString": {
"format": "%Y-%m-%d",
"date": "$date"
}
}
}
},
{
$group: {
_id: "$day",
type: {
$push: "$status"
}
}
}
])
This way I get the statuses in an array, like below.
[
{
_id:"2022-12-01",
type:[
"activate",
"activate"
]
},
{
_id:"2022-12-02",
type:[
"deactivate"
]
},
{
_id:"2022-12-21",
type:[
"deactivate"
]
}
]
It works as intended, but I need the output to look like this:
[
{
_id:"2022-12-01",
type:{
"activate":2,
}
},
{
_id:"2022-12-02",
type:{
"deactivate":1
}
},
{
_id:"2022-12-21",
type:{
"deactivate":1
}
}
]
This collection has around 100,000 documents, and doing this programmatically takes about 10 seconds. That's why I'm looking for a way to do it in an aggregation.
One option is to group twice and then use $arrayToObject:
db.collection.aggregate([
// count per calendar day and status; formatting the date here keeps
// same-day documents with different times in one bucket
{$group: {
_id: {day: {$dateToString: {format: "%Y-%m-%d", date: "$date"}}, status: "$status"},
count: {$sum: 1}
}},
{$group: {
_id: "$_id.day",
data: {$push: {k: "$_id.status", v: "$count"}}
}},
{$project: {type: {$arrayToObject: "$data"}}}
])
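A caveat for the two-stage approach, assuming the date values can carry a time of day: as documented for MongoDB 4.0.5 and later, $arrayToObject keeps only the last value when a key repeats. The first $group therefore has to key on the formatted day string rather than on the raw date, otherwise two rows for the same day and status would collapse instead of adding up. The plain-JavaScript sketch below uses Object.fromEntries as a stand-in, since it treats repeated keys the same way; the times in the comments are invented.

```javascript
// Two k/v entries that would be pushed for the same day if the first
// $group had produced separate rows for 09:00 and 17:30.
const pushed = [
  { k: "activate", v: 1 }, // 2022-12-01 09:00 (invented time)
  { k: "activate", v: 1 }  // 2022-12-01 17:30 (invented time)
];

// Stand-in for $arrayToObject: the later entry overwrites the earlier one.
const collapsed = Object.fromEntries(pushed.map(({ k, v }) => [k, v]));

// What is actually wanted: one entry per status with the counts added up.
const summed = {};
for (const { k, v } of pushed) summed[k] = (summed[k] || 0) + v;
```

Here collapsed.activate ends up as 1 while summed.activate is 2, which is the difference the day-level grouping avoids.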

MongoDB how to $project aggregate with multiple $sum?

In MongoDB, I need to group on an event type and also return several attribute fields.
My goal is to show, for each group, the associated field attributes together with the sum totals of the numeric fields.
The aggregation framework has $group and $project stages; $project can show the fields listed within the $group.
My $group works fine by itself, i.e. without the $project. When I add $project before or after the $group, I receive the following error:
query failed: (Location40323) A pipeline stage specification object must contain exactly one field.
My code is as follows:
db.collection.aggregate([
{
$project: {
eventName: "$EVENT_TYPE",
registeredDate: "$BEGIN_DATE_TIME",
stateName: "$STATE",
damageCosts: "$DAMAGE_PROPERTY",
peopleCosts: "$DEATHS_DIRECT",
injuriyCosts: "$INJURIES_DIRECT",
cropCosts: "$DAMAGE_CROPS"
},
$group: {
_id: "$EVENT_TYPE",
totalPropCost: {
$sum: "$DAMAGE_PROPERTY"
},
totalDeaths: {
$sum: "$DEATHS_DIRECT"
},
totalInjury: {
$sum: "$INJURIES_DIRECT"
},
totalCropCost: {
$sum: "$DAMAGE_CROPS"
}
}
}
])
Alternatively, I tried $push inside the $group, and this pipeline runs without error:
db.collection.aggregate([
{
$group: {
_id: "$EVENT_TYPE",
totalPropCost: {
$sum: "$DAMAGE_PROPERTY"
},
totalDeaths: {
$sum: "$DEATHS_DIRECT"
},
totalInjury: {
$sum: "$INJURIES_DIRECT"
},
totalCropCost: {
$sum: "$DAMAGE_CROPS"
},
events: {
$push: {
name: "$EVENT_TYPE",
date: "$BEGIN_DATE_TIME",
state: "$STATE"
}
}
}
}
])
I use the following sample data for this query; without the $project, the $sum fields work correctly.
[
{
"BEGIN_YEARMONTH": 201007,
"BEGIN_DAY": 7,
"END_YEARMONTH": 201007,
"END_DAY": 7,
"END_TIME": 1630,
"STATE": "NEW HAMPSHIRE",
"YEAR": 2010,
"EVENT_TYPE": "Heat",
"BEGIN_DATE_TIME": "07-JUL-10 12:51:00",
"END_DATE_TIME": "07-JUL-10 16:30:00",
"INJURIES_DIRECT": 0,
"DEATHS_DIRECT": 0,
"DAMAGE_PROPERTY": "0.00K",
"DAMAGE_CROPS": "0.00K"
},
{
"BEGIN_YEARMONTH": 201001,
"BEGIN_DAY": 17,
"END_YEARMONTH": 201001,
"END_DAY": 18,
"END_TIME": 1500,
"STATE": "NEW HAMPSHIRE",
"YEAR": 2010,
"MONTH_NAME": "January",
"EVENT_TYPE": "Heavy Snow",
"BEGIN_DATE_TIME": "17-JAN-10 23:00:00",
"END_DATE_TIME": "18-JAN-10 15:00:00",
"INJURIES_DIRECT": 0,
"DEATHS_DIRECT": 0,
"DAMAGE_PROPERTY": "0.00K",
"DAMAGE_CROPS": "0.00K"
}
]
After many attempts, the following gives me adequate results for this investigation. Other ideas are welcome, especially alternative solutions that use $project. So far $push appears to work best: it returns each group together with a list of its attribute fields. If anyone knows how to get $project working, please advise and explain.
db.collection.aggregate([
{
$group: {
_id: "$STATE",
totalPropCost: {
$sum: "$DAMAGE_PROPERTY"
},
totalDeaths: {
$sum: "$DEATHS_DIRECT"
},
totalInjury: {
$sum: "$INJURIES_DIRECT"
},
totalCropCost: {
$sum: "$DAMAGE_CROPS"
},
events: {
$push: {
eventName: "$EVENT_TYPE",
CzName: "$CZ_NAME",
cZTimeZone: "$CZ_TIMEZONE",
eventDate: "$BEGIN_DATE_TIME",
eventMonth: "$MONTH_NAME",
state: "$STATE",
observer: "$SOURCE"
}
}
}
}
])
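On the $project question: the Location40323 error quoted above is what the server reports when a single stage object holds more than one operator, and in the failing pipeline $project and $group sit inside the same { }. A sketch of the split version follows. Placing $project after $group and the output name eventName are my assumptions, not something the thread settles. Note also that $sum ignores non-numeric values, so strings such as "0.00K" would contribute nothing unless they are converted to numbers first.

```javascript
// Sketch only: each stage is its own object with exactly one operator.
// In the mongo shell it would be run with db.collection.aggregate(pipeline).
const pipeline = [
  { $group: {
      _id: "$EVENT_TYPE",
      totalPropCost: { $sum: "$DAMAGE_PROPERTY" },
      totalDeaths: { $sum: "$DEATHS_DIRECT" },
      totalInjury: { $sum: "$INJURIES_DIRECT" },
      totalCropCost: { $sum: "$DAMAGE_CROPS" }
  } },
  // $project only sees the fields that $group emitted, so it refers to
  // _id and the totals, not to the original document fields.
  { $project: {
      _id: 0,
      eventName: "$_id",
      totalPropCost: 1,
      totalDeaths: 1,
      totalInjury: 1,
      totalCropCost: 1
  } }
];

// One operator name per stage.
const stageKeys = pipeline.map((stage) => Object.keys(stage));
```

If $project is placed before $group instead, the $group expressions would have to use the renamed fields (for example "$eventName" rather than "$EVENT_TYPE").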

How to group documents by month?

I have a collection of transaction data in mongodb, like this:
[
{timestamp: ISODate("2015-11-10T11:33:41.075Z"), nominal: 25.121},
{timestamp: ISODate("2015-11-22T11:33:41.075Z"), nominal: 25.121},
{timestamp: ISODate("2015-11-23T11:33:41.075Z"), nominal: 26.121},
{timestamp: ISODate("2015-12-03T11:33:41.075Z"), nominal: 30.121},
]
How can I use mongodb's aggregate to calculate my total transaction each month?
I tried:
db.getCollection('transaction').aggregate([
{ $group: {_id: "$timestamp", total: {$sum: "$nominal"} } }
])
But this does not work, because it groups on the full timestamp instead of the month. I don't want to add a separate month field to the transaction data. I am thinking of a custom function for the $group stage that returns the month value.
You need a preliminary $project stage where you use the $month operator to return the "month".
db.transaction.aggregate([
{ "$project": {
"nominal": 1,
"month": { "$month": "$timestamp" }
}},
{ "$group": {
"_id": "$month",
"total": { "$sum": "$nominal" }
}}
])
Which returns:
{ "_id" : 12, "total" : 30.121 }
{ "_id" : 11, "total" : 76.363 }
If you want to group per year-month (so that the same month in different years is not grouped together), you can use $dateToString:
// { timestamp: ISODate("2015-11-10T11:33:41.075Z"), nominal: 25.121 }
// { timestamp: ISODate("2015-11-22T11:33:41.075Z"), nominal: 25.121 }
// { timestamp: ISODate("2015-11-23T11:33:41.075Z"), nominal: 26.121 }
// { timestamp: ISODate("2015-12-03T11:33:41.075Z"), nominal: 30.121 }
db.collection.aggregate([
{ $group: {
_id: { $dateToString: { date: "$timestamp", format: "%Y-%m" } },
total: { $sum: "$nominal" }
}}
])
// { _id: "2015-12", total: 30.121 }
// { _id: "2015-11", total: 76.363 }

Mongodb aggregate for timeseries data

I have a timeseries dataset with a few hundred thousand records in it. I am trying to create an aggregate query in mongo to group this data in intervals all while averaging the price.
Ideally I would want 10-minute intervals (600000 ms) and the average price for each. I'm not too sure how to carry on from where I am.
Data ~a few hundred thousand records:
{
"time" : 1391485215000,
"price" : "0.00133355",
}
query = [
{
"$project": {
"_id":"$_id",
"price":"$price",
"time": {
xxxx
}
}
},
{
"$group": {xxxx}
}
]
So it would appear that I had a fundamental flaw in my schema. I was storing an epoch timestamp instead of MongoDB's Date type, and storing the other numbers as strings instead of doubles. I tried a few workarounds, but it doesn't look like the built-in aggregation operators can be used unless the fields have the correct types.
// db.collection is a placeholder for the actual collection name
db.collection.aggregate([
{
// split the Date in "time" into its parts
$project: {
year: { $year: '$time'},
month: { $month: '$time'},
day: { $dayOfMonth: '$time'},
hour: { $hour: '$time'},
price: 1,
total: 1,
amount: 1
}
},
{
// one bucket per hour
$group : {
_id: { year: '$year', month: '$month', day: '$day', hour: '$hour' },
price:{
$avg: "$price"
},
high:{
$max: "$price"
},
low:{
$min: "$price"
},
amount:{
$sum: "$amount"
},
total:{
$sum: "$total"
}
}
}
])
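The stages above bucket by hour, while the question asked for 10-minute intervals. One way to get those, sketched here under the assumption that time is still stored as epoch milliseconds and price as a string, is to round each timestamp down to a multiple of 600000 and convert the price with $toDouble (available from MongoDB 4.0). The names INTERVAL_MS, bucketStart and avgPrice are mine.

```javascript
const INTERVAL_MS = 600000; // 10 minutes, as stated in the question

// Plain-JS mirror of the $subtract/$mod arithmetic used in the pipeline.
const bucketStart = (ms) => ms - (ms % INTERVAL_MS);

// Sketch only: in the mongo shell this would be passed to aggregate().
const pipeline = [
  { $group: {
      // start of the 10-minute bucket, in epoch milliseconds
      _id: { $subtract: ["$time", { $mod: ["$time", INTERVAL_MS] }] },
      // prices are strings in the sample document, so convert before averaging
      avgPrice: { $avg: { $toDouble: "$price" } }
  } },
  { $sort: { _id: 1 } }
];
```

For the sample document, bucketStart(1391485215000) is 1391485200000, i.e. the record falls 15 seconds into its bucket.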

Mongodb Aggregation count array/set size

Here's my problem:
Model:
{ application: "abc", date: Time.now, status: "1", user_id: [id1, id2, id4] }
{ application: "abc", date: Time.yesterday, status: "1", user_id: [id1, id3, id5] }
{ application: "abc", date: Time.yesterday-1, status: "1", user_id: [id1, id3, id5] }
I need to count the unique number of user_ids in a period of time.
Expected result:
{ application: "abc", status: "1", unique_id_count: 5 }
I'm currently using the aggregation framework and counting the ids outside mongodb.
{ $match: { application: "abc" } },
{ $unwind: "$user_id" },
{ $group: { _id: { status: "$status" }, users: { $addToSet: "$user_id" } } }
My arrays of user ids are very large, so I have to iterate over the dates or I hit the maximum document size (16 MB).
I could also $group by
{ year: { $year: "$date" }, month: { $month: "$date" }, day: { $dayOfMonth: "$date" } }
but then I also hit the document size limit.
Is it possible to count the set size in MongoDB?
Thanks
The following returns the number of unique users for the application, per status. It applies a second $group to the result of the first one, which the aggregation pipeline allows.
{ $match: { application: "abc" } },
{ $unwind: "$user_id" },
{ $group: { _id: "$status", users: { $addToSet: "$user_id" } } },
{ $unwind: "$users" },
{ $group: { _id: "$_id", count: { $sum: 1 } } }
Hopefully a future release of MongoDB will make this easier with an operator that returns the size of an array in a projection, e.g. {$project: {id: "$_id", count: {$size: "$uniqueUsers"}}}
https://jira.mongodb.org/browse/SERVER-4899
Cheers
Sorry I'm a little late to the party. Simply grouping on the 'user_id' and counting the result with a trivial group works just fine and doesn't run into doc size limits.
[
{$match: {application: 'abc', date: {$gte: startDate, $lte: endDate}}},
{$unwind: '$user_id'},
{$group: {_id: '$user_id'}},
{$group: {_id: 'singleton', count: {$sum: 1}}}
];
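On MongoDB 3.4 or later, the final $group in this approach can be written with the $count stage instead. The sketch below takes the output name unique_id_count from the question's expected result; the date range values are placeholders I made up, and the small Set check at the end only cross-checks the expected count against the ids listed in the question.

```javascript
// Placeholder range, not from the thread.
const startDate = new Date("2012-01-01");
const endDate = new Date("2012-12-31");

// Sketch only: in the mongo shell this would be passed to aggregate().
const pipeline = [
  { $match: { application: "abc", date: { $gte: startDate, $lte: endDate } } },
  { $unwind: "$user_id" },
  { $group: { _id: "$user_id" } }, // one small document per distinct id
  { $count: "unique_id_count" }    // counts the documents from the previous stage
];

// Plain-JS cross-check on the three user_id arrays from the question.
const sample = [
  ["id1", "id2", "id4"],
  ["id1", "id3", "id5"],
  ["id1", "id3", "id5"]
];
const uniqueCount = new Set(sample.flat()).size;
```

Because each distinct id becomes its own document, no single document grows with the number of users, which is why this shape stays clear of the 16 MB limit.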
Use $size to get the size of the set.
[
{
$match: {"application": "abc"}
},
{
$unwind: "$user_id"
},
{
$group: {
"_id": "$status",
// every non-_id field in $group must use an accumulator
"application": {$first: "$application"},
"unique_user_id": {$addToSet: "$user_id"}
}
},
{
$project:{
"_id": "$_id",
"application": "$application",
"count": {$size: "$unique_user_id"}
}
}
]