Does anyone know how to calculate the difference in values between documents? I want to subtract the previous document's value from the current document's value to get the movement. Each document represents the balance for one month and year.
I have a set of account_balances in each document, dated at the end of each month. They represent the general ledger from an accounting app whose integration provides only the balances, not the month-by-month movement.
How would I calculate the difference between a balance value in one document's array and the same balance in the previous month's document?
The fields to group on are _id.company, _id.connection, and _id.object_snapshot_date, and then account_balances.account_id and account_balances.value_type.
For each account, I want to subtract the total_value in the May 2018 document from the total_value in the June 2018 document. There may be multiple documents covering an entire year.
What I want back is the same document, but with the June total_value replaced by the movement instead of the balance.
Thanks, Matt
Example of two documents with different months:
{
"_id" : {
"company" : " a8aa7d3f-cef8-4895-a83e-3087b4cf529c ",
"connection" : "a4b52d3a-0c00-406f-9163-4b1d52df0271",
"object_snapshot_date" : 20180603135959,
"object_schema" : "timeline-balance",
"object_class" : "trial-balance",
"object_category" : "balance",
"object_type" : "month",
"object_origin_category" : "bookkeeping",
"object_origin_type" : "accounting",
"object_origin" : "Xero"
},
"account_balances" : [
{
"account_id" : "47cf9c6e-4ec7-4853-9efa-9e180636c96f",
"account_name" : "Sales",
"account_code" : "200",
"account_class" : "revenue",
"account_category" : "sales",
"account_group" : "",
"value_type" : "credit",
"total_value" : 29928.96,
"value_currency" : "NZD"
},
{
"account_id" : "47cf9c6e-4ec7-4853-9efa-9e180636aa43",
"account_name" : "Cost of Goods Sold",
"account_code" : "300",
"account_class" : "expense",
"account_category" : "sales",
"account_group" : "",
"value_type" : "debit",
"total_value" : 12452.50,
"value_currency" : "NZD"
}
]
},
{
"_id" : {
"company" : " a8aa7d3f-cef8-4895-a83e-3087b4cf529c ",
"connection" : "a4b52d3a-0c00-406f-9163-4b1d52df0271",
"object_snapshot_date" : 20180503035959,
"object_schema" : "timeline-balance",
"object_class" : "trial-balance",
"object_category" : "balance",
"object_type" : "month",
"object_origin_category" : "bookkeeping",
"object_origin_type" : "accounting",
"object_origin" : "Xero"
},
"account_balances" : [
{
"account_id" : "47cf9c6e-4ec7-4853-9efa-9e180636c96f",
"account_name" : "Sales",
"account_code" : "200",
"account_class" : "revenue",
"account_category" : "sales",
"account_group" : "",
"value_type" : "credit",
"total_value" : 24231.12,
"value_currency" : "NZD"
},
{
"account_id" : "47cf9c6e-4ec7-4853-9efa-9e180636aa43",
"account_name" : "Cost of Goods Sold",
"account_code" : "300",
"account_class" : "expense",
"account_category" : "sales",
"account_group" : "",
"value_type" : "debit",
"total_value" : 6875.10,
"value_currency" : "NZD"
}
]
}
Expected Output would be like this:
{
"_id" : {
"company" : " a8aa7d3f-cef8-4895-a83e-3087b4cf529c ",
"connection" : "a4b52d3a-0c00-406f-9163-4b1d52df0271",
"object_snapshot_date" : 20180603135959,
"object_schema" : "timeline-balance",
"object_type" : "month",
"object_origin_category" : "bookkeeping",
"object_origin" : "Xero"
},
"account_movements" : [
{
"account_id" : "47cf9c6e-4ec7-4853-9efa-9e180636c96f",
"account_name" : "Sales",
"account_code" : "200",
"account_class" : "revenue",
"movement" : 5697.84
},
{
"account_id" : "47cf9c6e-4ec7-4853-9efa-9e180636aa43",
"account_name" : "Cost of Goods Sold",
"account_code" : "300",
"account_class" : "expense",
"movement" : 5577.4
}
]
}
I'm assuming you can always add a filter that guarantees only two documents remain after the $match stage (as below). You can then use $unwind to get a single account_balance per document, $sort by object_snapshot_date, and $group by account_name with $push to collect the balances in date order. Since only two elements are assumed, $subtract with $arrayElemAt gives the movement.
db.col.aggregate([
{
$match: {
"_id.object_snapshot_date": {
$gte: 20180500000000,
$lte: 20180630000000
}
}
},
{
$unwind: "$account_balances"
},
{
$sort: { "_id.object_snapshot_date": 1 }
},
{
$group: {
_id: "$account_balances.account_name",
balances: { $push: "$account_balances.total_value" }
}
},
{
$project: {
_id: 0,
account_name: "$_id",
movement: { $subtract: [ { $arrayElemAt: [ "$balances", 1 ] }, { $arrayElemAt: [ "$balances", 0 ] } ] }
}
}
])
Outputs:
{ "account_name" : "Cost of Goods Sold", "movement" : 5577.4 }
{ "account_name" : "Sales", "movement" : 5697.84 }
If you need a more generic solution (for more than two months), you can replace the last pipeline stage with the one below:
{
$project: {
_id: 0,
account_name: "$_id",
movement: {
$map: {
input: { $range: [ 1, { $size: "$balances" } ] },
as: "index",
in: {
$subtract: [
{ $arrayElemAt: [ "$balances", "$$index" ] },
{ $arrayElemAt: [ "$balances", { $subtract: [ "$$index", 1 ] } ] }
]
}
}
}
}
}
This calculates the difference between each pair of consecutive values in the balances array (you'll get n-1 results, where n is the size of balances).
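As a cross-check of the index arithmetic in that stage, here is a minimal sketch in plain JavaScript (no MongoDB needed). The function name `consecutiveDifferences` is hypothetical, and the sketch assumes the balances are already sorted by snapshot date, as the $sort stage arranges.

```javascript
// Hypothetical helper mirroring the $map/$range stage: for balances sorted
// by snapshot date, return balances[i] - balances[i - 1] for i = 1..n-1.
function consecutiveDifferences(balances) {
  const movements = [];
  for (let i = 1; i < balances.length; i++) {
    movements.push(balances[i] - balances[i - 1]);
  }
  return movements;
}

// Sales balances for May and June, taken from the sample documents.
const salesMovements = consecutiveDifferences([24231.12, 29928.96]);

// Floating-point subtraction is not exact, so round to cents for display.
console.log(salesMovements.map(m => Math.round(m * 100) / 100));
```

With a full year of monthly balances the same function returns eleven movements, one for each month after the first.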
Related
The only thing I am trying to do is get the average Emision_CO2 at 10pm across all days for location_id 1. The collection, db.datos_sensores2, contains documents like:
{
"_id" : ObjectId("609c2c2d420a73728827e87f"),
"timestamp" : ISODate("2020-07-01T02:15:00Z"),
"sensor_id" : 1,
"location_id" : 1,
"medidas" : [
{
"tipo_medida" : "Temperatura",
"valor" : 14.03,
"unidad" : "ºC"
},
{
"tipo_medida" : "Humedad_relativa",
"valor" : 84.32,
"unidad" : "%"
}
]
}
{
"_id" : ObjectId("609c2c2d420a73728827e880"),
"timestamp" : ISODate("2020-07-01T02:15:00Z"),
"sensor_id" : 2,
"location_id" : 1,
"medidas" : [
{
"tipo_medida" : "Emision_CO2",
"valor" : 1.67,
"unidad" : "gCO2/m2"
},
{
"tipo_medida" : "Consumo_electrico",
"valor" : 0.00155,
"unidad" : "kWh/m2"
}
]
}
I wrote this:
db.datos_sensores2.aggregate([
{$project:{timestamp:{$dateFromString:{dateString:'$timestamp'}},"_id":0, "me-didas":{$slice:["$medidas",-1]},"location_id":1}},
{$addFields:{Hora:{$hour:"$timestamp"}}},
{$match:{'Hora':{$in:[10]},'medidas.tipo_medida':"Emision_CO2", "location_id":1}},
{$group:{ _id: null, Avg_Emision_CO2:{$avg: "$medidas.valores"}}}])
But nothing happens.
Please refer to https://mongoplayground.net/p/-LqswomHWsY
I noticed a few things. First, the hour in the example above is 2, not 10. Second, the field names are not correct, so I have updated them.
[{$unwind: {
path: '$medidas',
}}, {$addFields: {
Hora: {
$hour: "$timestamp"
}
} }, {$match: {
"Hora": {
$in: [2]
},
"medidas.tipo_medida": "Emision_CO2",
"location_id": 1
} }, {$group: {
_id: null,
Avg_Emision_CO2: {
$avg: "$medidas.valor"
}
}}]
Pipeline stages:
$unwind: since medidas is an array, unwinding it makes it easy to filter for only "Emision_CO2"
$addFields: adds the hour taken from the timestamp
$match: keeps the documents where "medidas.tipo_medida" is "Emision_CO2" and the hour and location_id match
$group: computes the average
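The same four steps can be sketched in plain JavaScript to check the logic without a database. This is only an illustration: the helper name `averageAtHour` is hypothetical, and the third sample document is invented so that the average covers more than one value. Note that $hour works in UTC unless a timezone is given, which is why the sample timestamp yields 2; the sketch uses `getUTCHours()` to match.

```javascript
// Documents shaped like the ones in the question. The third one is made up
// for illustration so that there are two Emision_CO2 readings to average.
const docs = [
  { timestamp: new Date("2020-07-01T02:15:00Z"), location_id: 1,
    medidas: [{ tipo_medida: "Temperatura", valor: 14.03 },
              { tipo_medida: "Humedad_relativa", valor: 84.32 }] },
  { timestamp: new Date("2020-07-01T02:15:00Z"), location_id: 1,
    medidas: [{ tipo_medida: "Emision_CO2", valor: 1.67 },
              { tipo_medida: "Consumo_electrico", valor: 0.00155 }] },
  { timestamp: new Date("2020-07-02T02:30:00Z"), location_id: 1,
    medidas: [{ tipo_medida: "Emision_CO2", valor: 2.33 }] },
];

// Hypothetical helper: average of one measurement type at a given UTC hour.
function averageAtHour(docs, locationId, hour, tipo) {
  const values = docs
    .filter(d => d.location_id === locationId &&
                 d.timestamp.getUTCHours() === hour) // like $addFields + $match
    .flatMap(d => d.medidas)                         // like $unwind
    .filter(m => m.tipo_medida === tipo)             // like $match on tipo_medida
    .map(m => m.valor);
  if (values.length === 0) return null;              // nothing to average
  return values.reduce((a, b) => a + b, 0) / values.length; // like $avg
}

console.log(averageAtHour(docs, 1, 2, "Emision_CO2"));
```

Calling it with hour 10 on this data returns null, which matches the original symptom: no sample document falls in that hour.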
I have a database that contains information about flights. I'm trying to find the category that has the least minutes of delays. I managed to find and show the number of the minimum minutes of the category but not the category itself.
I've tried to put ":true" after each field to show it
db.delayData.aggregate([{
$group: {
"_id": "$carrier",
"arr_sum": {
$sum: "$arr_delay"
},
"carrier_sum": {
$sum: "$carrier_delay"
},
"weather_sum": {
$sum: "$weather_delay"
},
"nas_sum": {
$sum: "$nas_delay"
},
"sec_sum": {
$sum: "$security_delay"
},
"late_air_sum": {
$sum: "$late_aircraft_delay"
}
}
},
{
$project: {
"min_delay_category": {
$min: ["$arr_sum", "$carrier_sum", "$weather_sum", "$nas_sum", "$sec_sum", "$late_air_sum"]
}
}
}
]).pretty()
I want to have something like that:
{ "_id" : "VX", "min_delay_category" : 1449, "sec_sum"... }
I've tried to write:
..."$sec_sum":1,"$late_air_sum":1]
but the error message is:
"missing ] after element list"
when I wrote:
...{"sec_sum":1},{"late_air_sum":1}]
I get no error message, but it gives me the second-smallest result, not the smallest.
for example:
{ "_id" : "VX", "min_delay_category" : 69081 }
but the true result for "VX" is 1449
The following query can get us the expected output:
db.collection.aggregate([
{
$project:{
"carrier":1,
"category.arr_delay":"$arr_delay",
"category.carrier_delay":"$carrier_delay",
"category.weather_delay":"$weather_delay",
"category.nas_delay":"$nas_delay",
"category.security_delay":"$security_delay",
"category.late_aircraft_delay":"$late_aircraft_delay"
}
},
{
$project:{
"carrier":1,
"categories":{
$objectToArray:"$category"
}
}
},
{
$unwind:"$categories"
},
{
$group:{
"_id":{
"carrier":"$carrier",
"category":"$categories.k"
},
"carrier":{
$first:"$carrier"
},
"category":{
$first:"$categories.k"
},
"total_delay":{
$sum:"$categories.v"
}
}
},
{
$sort:{
"total_delay":1
}
},
{
$group:{
"_id": "$carrier",
"carrier":{
$first:"$carrier"
},
"category":{
$first:"$category"
},
"minimum_delay":{
$first:"$total_delay"
}
}
},
{
$project:{
"_id":0
}
}
]).pretty();
Data set:
{
"_id" : ObjectId("5d5b5058435c7584459b7bae"),
"year" : 2003,
"month" : 6,
"carrier" : "AA",
"carrier_name" : "American Airlines Inc.",
"airport" : "ABQ",
"airport_name" : "Albuquerque, NM: Albuquerque International Sunport",
"arr_flights" : 307,
"arr_del15" : 56,
"carrier_ct" : 14.68,
"weather_ct" : 10.79,
"nas_ct" : 19.09,
"security_ct" : 1.48,
"late_aircraft_ct" : 9.96,
"arr_cancelled" : 1,
"arr_diverted" : 1,
"arr_delay" : 2530,
"carrier_delay" : 510,
"weather_delay" : 621,
"nas_delay" : 676,
"security_delay" : 25,
"late_aircraft_delay" : 698,
"" : ""
},
{
"_id" : ObjectId("5d5b5058435c7584459b7bbe"),
"year" : 2003,
"month" : 6,
"carrier" : "AA",
"carrier_name" : "American Airlines Inc.",
"airport" : "ABQ",
"airport_name" : "Albuquerque, NM: Albuquerque International Sunport",
"arr_flights" : 307,
"arr_del15" : 56,
"carrier_ct" : 14.68,
"weather_ct" : 10.79,
"nas_ct" : 19.09,
"security_ct" : 1.48,
"late_aircraft_ct" : 9.96,
"arr_cancelled" : 1,
"arr_diverted" : 1,
"arr_delay" : 2530,
"carrier_delay" : 510,
"weather_delay" : 621,
"nas_delay" : 676,
"security_delay" : 2512,
"late_aircraft_delay" : 698,
"" : ""
}
Output:
{ "carrier" : "AA", "category" : "carrier_delay", "minimum_delay" : 1020 }
Aggregation stage details:
STAGE I: Projecting all delays as a part of category document
STAGE II: Converting category into an array of key-value pair
where 'k' is delay type and 'v' is a delay
STAGE III: Unwinding the prepared array
STAGE IV: Grouping on the basis of carrier and delay type(k) and summing up delay for each type
STAGE V: Sorting on total calculated delay in ascending order
STAGE VI: Grouping on carrier and fetching the first document
which holds the minimum delay
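To make the stages concrete, here is a minimal plain-JavaScript sketch of the same idea: total each delay category per carrier, then keep the category with the smallest total. The names `DELAY_FIELDS` and `minDelayCategory` are hypothetical, and the rows are the delay fields from the data set above.

```javascript
// Delay fields treated as categories, as in the first $project stage.
const DELAY_FIELDS = ["arr_delay", "carrier_delay", "weather_delay",
                      "nas_delay", "security_delay", "late_aircraft_delay"];

// Hypothetical helper: per carrier, find the category with the lowest total.
function minDelayCategory(rows) {
  const totals = new Map(); // carrier -> { category: summed delay }
  for (const row of rows) {
    if (!totals.has(row.carrier)) totals.set(row.carrier, {});
    const t = totals.get(row.carrier);
    for (const f of DELAY_FIELDS) t[f] = (t[f] || 0) + (row[f] || 0);
  }
  return [...totals].map(([carrier, t]) => {
    // Equivalent to sorting ascending by total and taking the first entry.
    const [category, minimum_delay] =
      Object.entries(t).reduce((best, cur) => (cur[1] < best[1] ? cur : best));
    return { carrier, category, minimum_delay };
  });
}

const rows = [
  { carrier: "AA", arr_delay: 2530, carrier_delay: 510, weather_delay: 621,
    nas_delay: 676, security_delay: 25, late_aircraft_delay: 698 },
  { carrier: "AA", arr_delay: 2530, carrier_delay: 510, weather_delay: 621,
    nas_delay: 676, security_delay: 2512, late_aircraft_delay: 698 },
];

console.log(minDelayCategory(rows));
// [ { carrier: 'AA', category: 'carrier_delay', minimum_delay: 1020 } ]
```

The point of carrying the key alongside the value is that $min alone returns only the number, which is why the original $project lost the category name.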
In Db I have some sample data:
Object 1
"_id" : ObjectId("5b5934bb49b")
"payment" : {
"paid_total" : 500,
"name" : "havi",
"payment_mode" : "cash",
"pd_no" : "PD20725001",
"invoices" : [
{
"invoice_number" : "IN11803831583"
}
],
"type" : "Payment"
}
Object 2
"_id" : ObjectId("5b5934ee31e"),
"patient" : {
"invoice_date" : "2018-07-26",
"invoiceTotal" : 2000,
"pd_no" : "PD20725001",
"type" : "Invoice",
"invoice_number" : "IN11803831583"
}
Note: all the data is in the same collection.
As shown above, I have many such objects in my database. How can I get the sum of invoiceTotal and the sum of paid_total, subtract paid_total from invoiceTotal, and show the balance for each matching pd_no and invoice_number?
The output I expect looks like
invoiceTotal : 2000
paid_total : 500
Balance : 1500
Sample Input :
{
"_id" : ObjectId("5b596969a88e07f00d6dac17"),
"payment" : {
"paid_total" : 500,
"name" : "havi",
"payment_mode" : "cash",
"pd_no" : "PD20725001",
"invoices" : [
{
"invoice_number" : "IN11803831583"
}
],
"type" : "Payment"
}
}
{
"_id" : ObjectId("5b596986a88e07f00d6dac18"),
"patient" : {
"invoice_date" : "2018-07-26",
"invoiceTotal" : 2000,
"pd_no" : "PD20725001",
"type" : "Invoice",
"invoice_number" : "IN11803831583"
}
}
Use this aggregate query :
db.test.aggregate([
{
$project : {
_id : 0,
pd_no : { $ifNull: ["$payment.pd_no", "$patient.pd_no" ] },
invoice_no : { $ifNull: [ { $arrayElemAt : ["$payment.invoices.invoice_number", 0] },"$patient.invoice_number" ] },
type : { $ifNull: [ "$payment.type", "$patient.type" ] },
paid_total : { $ifNull: [ "$payment.paid_total", 0 ] },
invoice_total : { $ifNull: [ "$patient.invoiceTotal", 0 ] },
}
},
{
$group : {
_id : {
pd_no : "$pd_no",
invoice_no : "$invoice_no"
},
paid_total : {$sum : "$paid_total"},
invoice_total : {$sum : "$invoice_total"}
}
},
{
$project : {
_id : 0,
pd_no : "$_id.pd_no",
invoice_no : "$_id.invoice_no",
invoice_total : "$invoice_total",
paid_total : "$paid_total",
balance : {$subtract : ["$invoice_total" , "$paid_total"]}
}
}
])
In this query we are first finding the pd_no and invoice_no, which we are then using to group the documents. Next, we are getting the invoice_total and paid_total and then subtracting them to get the balance.
Output :
{
"pd_no" : "PD20725001",
"invoice_no" : "IN11803831583",
"invoice_total" : 2000,
"paid_total" : 500,
"balance" : 1500
}
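For readers who want to trace the grouping by hand, the following plain-JavaScript sketch mirrors the three stages. The function name `invoiceBalances` is hypothetical; the `??` operator plays the role of $ifNull, and, like the pipeline, the sketch only looks at the first entry of payment.invoices.

```javascript
// Hypothetical helper mirroring the $project / $group / $project stages.
function invoiceBalances(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Take each key from whichever sub-document is present (like $ifNull).
    const pd_no = doc.payment?.pd_no ?? doc.patient?.pd_no;
    const invoice_no = doc.payment?.invoices?.[0]?.invoice_number
                       ?? doc.patient?.invoice_number;
    const key = JSON.stringify([pd_no, invoice_no]);
    if (!groups.has(key)) {
      groups.set(key, { pd_no, invoice_no, invoice_total: 0, paid_total: 0 });
    }
    const g = groups.get(key);
    g.paid_total += doc.payment?.paid_total ?? 0;       // 0 for invoices
    g.invoice_total += doc.patient?.invoiceTotal ?? 0;  // 0 for payments
  }
  return [...groups.values()].map(g => ({
    ...g,
    balance: g.invoice_total - g.paid_total,
  }));
}

const sampleDocs = [
  { payment: { paid_total: 500, name: "havi", payment_mode: "cash",
               pd_no: "PD20725001",
               invoices: [{ invoice_number: "IN11803831583" }],
               type: "Payment" } },
  { patient: { invoice_date: "2018-07-26", invoiceTotal: 2000,
               pd_no: "PD20725001", type: "Invoice",
               invoice_number: "IN11803831583" } },
];

console.log(invoiceBalances(sampleDocs));
```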
I assume that you will only have documents with invoiceTotal or paid_total and never both at the same time.
To get the balance you first need a signed amount: negative for a paid_total and positive for an invoiceTotal. You can compute this with a $project stage at the start of the pipeline.
collection.aggregate([
{
$project : {
'patient.invoiceTotal': 1,
'payment.paid_total': 1,
ammount: {
$ifNull: ['$patient.invoiceTotal', { $multiply: [-1, '$payment.paid_total']}]
}
}
},
{
$group: {
_id: 'myGroup',
invoiceTotal: { $sum: '$patient.invoiceTotal' },
paid_total: { $sum: '$payment.paid_total' },
balance: { $sum: '$ammount' }
}
}
])
I have been trying to group and count registration collection data for a stats page, as well as to make for dynamic registration, but I can't get it to count for more than one grouping.
Sample registration collection data:
{
"_id" : ObjectId("58ec60078cc818505fb75ace"),
"event" : "Women's BB",
"day" : "Saturday",
"group" : "nonpro",
"division" : "Women's",
"level" : "BB"
}
{
"_id" : ObjectId("58ec60078cc818505fb75acf"),
"event" : "Coed BB",
"day" : "Sunday",
"group" : "nonpro",
"division" : "Coed",
"level" : "BB"
}
{
"_id" : ObjectId("58ec60098cc818505fb75ad0"),
"event" : "Men's BB",
"day" : "Saturday",
"group" : "nonpro",
"division" : "Men's",
"level" : "BB"
}
{
"_id" : ObjectId("58ec60168cc818505fb75ad1"),
"event" : "Men's B",
"day" : "Saturday",
"group" : "nonpro",
"division" : "Men's",
"level" : "B"
}
{
"_id" : ObjectId("58ec60178cc818505fb75ad2"),
"event" : "Women's Open",
"day" : "Saturday",
"group" : "pro",
"division" : "Women's",
"level" : "Pro"
}
{
"_id" : ObjectId("58ec60188cc818505fb75ad3"),
"event" : "Men's Open",
"day" : "Saturday",
"group" : "pro",
"division" : "Men's",
"level" : "Pro"
}
I'd like to reorganize it and do counts returning something like this:
[ {_id: { day: "Saturday", group: "nonpro" },
count: 3,
divisions: [
{ division: "Men's",
count: 2,
levels: [
{ level: "BB", count: 1 },
{ level: "B", count: 1 }]
},
{ division: "Women's",
count: 1,
levels: [
{ level: "BB", count: 1 }]
}]
},
{_id: { day: "Saturday", group: "pro" },
count: 2,
divisions: [
{ division: "Men's",
count: 1,
levels: [
{ level: "Pro", count: 1 }]
},
{ division: "Women's",
count: 1,
levels: [
{ level: "Pro", count: 1 }]
}]
},
{_id: { day: "Sunday", group: "nonpro" },
count: 1,
divisions: [
{ division: "Coed",
count: 1,
levels: [
{ level: "BB", count: 1 }]
}]
}]
I know I should be using the aggregate() function, but am having a hard time making it work with the count. Here is what my aggregate looks like so far:
Registration
.aggregate(
{ $group: {
_id: { day: "$day", group: "$group" },
events: { $addToSet: { division: "$division", level: "$level"} },
total: { $sum: 1}
}
})
This returns the total registrations per day/group combination, but if I try adding total: {$sum: 1} to the events set, I just get 1 (which makes sense). Is there a way to make this work in one database call, or do I need to do it separately for each level of grouping I need counts for?
You essentially need three levels of $group pipeline stages. The first groups the documents by all four keys (day, group, division, and level) and counts the documents in each group, which gives the count per level.
The next group takes three keys (day, group, and division); its count sums the counts from the previous group, and it also builds the levels array.
The last group uses the day and group keys and builds the divisions list, embedding the results from the previous group.
Consider running the following pipeline for the expected results:
Registration.aggregate([
{
"$group": {
"_id": {
"day": "$day",
"group": "$group",
"division": "$division",
"level": "$level"
},
"count": { "$sum": 1 }
}
},
{
"$group": {
"_id": {
"day": "$_id.day",
"group": "$_id.group",
"division": "$_id.division"
},
"count": { "$sum": "$count" },
"levels": {
"$push": {
"level": "$_id.level",
"count": "$count"
}
}
}
},
{
"$group": {
"_id": {
"day": "$_id.day",
"group": "$_id.group"
},
"count": { "$sum": "$count" },
"divisions": {
"$push": {
"division": "$_id.division",
"count": "$count",
"levels": "$levels"
}
}
}
}
], (err, results) => {
if (err) throw err;
console.log(JSON.stringify(results, null, 4));
})
Sample Output
/* 1 */
{
"_id" : {
"day" : "Saturday",
"group" : "nonpro"
},
"count" : 3,
"divisions" : [
{
"division" : "Women's",
"count" : 1,
"levels" : [
{
"level" : "BB",
"count" : 1
}
]
},
{
"division" : "Men's",
"count" : 2,
"levels" : [
{
"level" : "BB",
"count" : 1
},
{
"level" : "B",
"count" : 1
}
]
}
]
}
/* 2 */
{
"_id" : {
"day" : "Saturday",
"group" : "pro"
},
"count" : 2,
"divisions" : [
{
"division" : "Women's",
"count" : 1,
"levels" : [
{
"level" : "Pro",
"count" : 1
}
]
},
{
"division" : "Men's",
"count" : 1,
"levels" : [
{
"level" : "Pro",
"count" : 1
}
]
}
]
}
/* 3 */
{
"_id" : {
"day" : "Sunday",
"group" : "nonpro"
},
"count" : 1,
"divisions" : [
{
"division" : "Coed",
"count" : 1,
"levels" : [
{
"level" : "BB",
"count" : 1
}
]
}
]
}
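The nesting produced by the three $group stages can also be reproduced in plain JavaScript, which may help when checking the counts by hand. Note that this sketch is a single-pass alternative using nested Maps, not a translation of the pipeline stage by stage, and the function name `nestedCounts` is hypothetical.

```javascript
// The six sample registrations from the question (only the grouped fields).
const registrations = [
  { day: "Saturday", group: "nonpro", division: "Women's", level: "BB" },
  { day: "Sunday",   group: "nonpro", division: "Coed",    level: "BB" },
  { day: "Saturday", group: "nonpro", division: "Men's",   level: "BB" },
  { day: "Saturday", group: "nonpro", division: "Men's",   level: "B" },
  { day: "Saturday", group: "pro",    division: "Women's", level: "Pro" },
  { day: "Saturday", group: "pro",    division: "Men's",   level: "Pro" },
];

// Hypothetical helper: count at every level of day/group > division > level.
function nestedCounts(rows) {
  const top = new Map();
  for (const { day, group, division, level } of rows) {
    const topKey = JSON.stringify([day, group]);
    if (!top.has(topKey)) {
      top.set(topKey, { _id: { day, group }, count: 0, divisions: new Map() });
    }
    const g = top.get(topKey);
    g.count += 1;
    if (!g.divisions.has(division)) {
      g.divisions.set(division, { division, count: 0, levels: new Map() });
    }
    const d = g.divisions.get(division);
    d.count += 1;
    d.levels.set(level, (d.levels.get(level) || 0) + 1);
  }
  // Convert the Maps into the array shape shown in the sample output.
  return [...top.values()].map(g => ({
    _id: g._id,
    count: g.count,
    divisions: [...g.divisions.values()].map(d => ({
      division: d.division,
      count: d.count,
      levels: [...d.levels].map(([level, count]) => ({ level, count })),
    })),
  }));
}

console.log(JSON.stringify(nestedCounts(registrations), null, 4));
```

Doing this client-side means fetching every registration, so the aggregation pipeline remains the better fit when the collection is large.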
I have the following collection:
{
"_id" : ObjectId("58503934034b512b419a6eab"),
"website" : "https://www.google.com",
"name" : "Google",
"keywords" : [
"Search",
"Websites",
],
"tracking" : [
{
"_id" : ObjectId("5874aa1df63258286528598d"),
"position" : 0,
"created_at" : ISODate("2017-01-1T09:32:13.831Z"),
"real_url" : "https://www.google.com",
"keyword" : "Search"
},
{
"_id" : ObjectId("5874aa1ff63258286528598e"),
"keyword" : "Page",
"real_url" : "https://www.google.com",
"created_at" : ISODate("2017-01-1T09:32:15.832Z"),
"found_url" : "https://google.com/",
"position" : 3
},
{
"_id" : ObjectId("5874aa21f63258286528598f"),
"keyword" : "Search",
"real_url" : "https://www.foamymedia.com",
"created_at" : ISODate("2017-01-2T09:32:17.017Z"),
"found_url" : "https://google.com/",
"position" : 2
},
{
"_id" : ObjectId("5874aa21f63258286528532f"),
"keyword" : "Search",
"real_url" : "https://www.google.com",
"created_at" : ISODate("2017-01-2T09:32:17.017Z"),
"found_url" : "https://google.com/",
"position" : 1
},
]
}
What I want to do is group all of the keywords together and calculate the average for that particular day, over a certain period.
So let's say for example:
Between 2017-01-01 and 2017-01-31, the following keywords were tracked:
2017-01-01:
'Search' => 1,
'Page' => 3,
Average = 2
2017-01-02:
'Search' => 4,
'Page' => 6,
Average = 5
....
So the end result would be (in this case):
{
"_id" : ObjectId("5874dccb9cd90425e41b7c54"),
"website" : "www.google.com",
"averages" : [
"2",
"5"
]
}
You can try something like this.
$unwind the tracking array, followed by $sort on tracking.keyword and tracking.created_at. $group by day to get the average for each day across all keywords. A final $group pushes all the daily averages into an array for the website.
db.website.aggregate([{
$match: {
"_id": ObjectId("58503934034b512b419a6eab")
}
}, {
$lookup: {
from: "seo_tracking",
localField: "website",
foreignField: "real_url",
as: "tracking"
}
}, {
$unwind: "$tracking"
}, {
$sort: {
"tracking.keyword": 1,
"tracking.created_at": -1
}
}, {
$group: {
_id: {
$dayOfMonth: "$tracking.created_at"
},
"website": {
$first: "$website"
},
"website_id": {
$first: "$_id"
},
"averageByDay": {
$avg: "$tracking.position"
}
}
}, {
$group: {
"_id": "$website_id",
"website": {
$first: "$website"
},
"average": {
$push: "$averageByDay"
}
}
}]);
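A minimal plain-JavaScript sketch of the per-day average is given below for comparison. It differs from the pipeline in two labeled ways: it groups by the full UTC date rather than by $dayOfMonth (which would merge, for example, January 2 and February 2), and it applies the date range from the question, which the pipeline does not filter on. The function name `dailyAverages` is hypothetical, and the timestamps are the sample ones written with zero-padded days.

```javascript
// Tracking entries from the sample document (only the fields used here).
const tracking = [
  { keyword: "Search", position: 0, created_at: new Date("2017-01-01T09:32:13.831Z") },
  { keyword: "Page",   position: 3, created_at: new Date("2017-01-01T09:32:15.832Z") },
  { keyword: "Search", position: 2, created_at: new Date("2017-01-02T09:32:17.017Z") },
  { keyword: "Search", position: 1, created_at: new Date("2017-01-02T09:32:17.017Z") },
];

// Hypothetical helper: average position per UTC calendar day within a range.
function dailyAverages(entries, from, to) {
  const byDay = new Map(); // "YYYY-MM-DD" -> positions recorded that day
  for (const e of entries) {
    if (e.created_at < from || e.created_at > to) continue;
    const day = e.created_at.toISOString().slice(0, 10);
    if (!byDay.has(day)) byDay.set(day, []);
    byDay.get(day).push(e.position);
  }
  // Sort the day keys so the averages come out in chronological order.
  return [...byDay.keys()].sort().map(day => {
    const positions = byDay.get(day);
    const average = positions.reduce((a, b) => a + b, 0) / positions.length;
    return { day, average };
  });
}

console.log(dailyAverages(tracking,
  new Date("2017-01-01T00:00:00Z"), new Date("2017-01-31T23:59:59Z")));
```

Sorting by day also addresses a detail the pipeline leaves open: $group does not guarantee output order, so the pushed averages may not be chronological without an extra $sort.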