Getting data from outside of group - mongodb

I have a lot of devices inserting data into MongoDB at irregular intervals.
I need statistics from this data (value per day/month/year). Currently I do this by adding fields that split the date into day, month and year using $month, $year and $dayOfMonth, and then grouping by those values. The problem arises when a day has no documents, or only one: I can't compute the value for that day because I need two values to subtract.
Is there a way to get the closest document (by day) from outside the group, in one query?
Let's say I have this data:
{id : 1, ts : "2017-12-15T10:00:00.000Z", value : 10}
{id : 2, ts : "2017-12-15T17:00:00.000Z", value : 10}
{id : 2, ts : "2017-12-14T12:00:00.000Z", value : 6}
{id : 1, ts : "2017-12-14T15:00:00.000Z", value : 10}
{id : 1, ts : "2017-12-14T10:00:00.000Z", value : 10}
{id : 2, ts : "2017-12-14T09:00:00.000Z", value : 3}
Explanation of problem:
The value is the actual reading from the meter, for example consumed energy. If a device consumes 4W/min, the reading will be 4 after 1 minute and 8 after 2 minutes, so the delta between the 1st and 2nd minute is 4. If I have a record from 2017-12-14T23:58:00.000Z with, say, 10W, then at 23:59 it will be 14W, so dValue should be 4. At 00:00 the next day I am not able to calculate dValue, because that is the first and only record in its group.
If I group this data by day, I can calculate the value difference only for 2017-12-14.
For now I am using this query:
{
$addFields : {
month : {$month : "$ts"},
year : {$year : "$ts"},
day : {$dayOfMonth : "$ts"}
}
},
{
$group : {
_id : {
year : "$year",
month : "$month",
day : "$day",
id : "$id"
},
first : {$min : "$$ROOT"},
last : {$max : "$$ROOT"},
}
},
{
$addFields : {
dValue: {$subtract : ["$last.value", "$first.value"]} //delta value
}
},
This query works, but only if there is more than one document in a day. If there is only one document, I can't get accurate data. I want to do this in one query, because I have a lot of these devices and their number is only going to increase; if I have to run a query for every device, I end up with an insane number of queries against the database. Is there a way to solve this?
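To make the desired result concrete, here is a minimal in-memory sketch in plain JavaScript (not a MongoDB pipeline). It assumes each delta should be measured against the device's previous reading, even when that reading belongs to an earlier day, and that days are UTC days. The helper name dailyDeltas is hypothetical.

```javascript
// Sample readings from the question (cumulative meter values).
const readings = [
  { id: 1, ts: "2017-12-15T10:00:00.000Z", value: 10 },
  { id: 2, ts: "2017-12-15T17:00:00.000Z", value: 10 },
  { id: 2, ts: "2017-12-14T12:00:00.000Z", value: 6 },
  { id: 1, ts: "2017-12-14T15:00:00.000Z", value: 10 },
  { id: 1, ts: "2017-12-14T10:00:00.000Z", value: 10 },
  { id: 2, ts: "2017-12-14T09:00:00.000Z", value: 3 },
];

// Hypothetical helper: sums, per device and UTC day, the difference between
// each reading and the device's previous reading (which may be from an
// earlier day). The very first reading of a device contributes nothing.
function dailyDeltas(docs) {
  const sorted = [...docs].sort(
    (a, b) => a.id - b.id || new Date(a.ts) - new Date(b.ts)
  );
  const totals = new Map();   // "id|YYYY-MM-DD" -> summed delta
  const previous = new Map(); // id -> last value seen for that device
  for (const doc of sorted) {
    const day = new Date(doc.ts).toISOString().slice(0, 10);
    const key = `${doc.id}|${day}`;
    if (previous.has(doc.id)) {
      const delta = doc.value - previous.get(doc.id);
      totals.set(key, (totals.get(key) || 0) + delta);
    }
    previous.set(doc.id, doc.value);
  }
  return totals;
}

const deltas = dailyDeltas(readings);
```

With this approach, device 2 gets a value for 2017-12-15 even though that day holds a single document, because the delta is taken against the last reading from 2017-12-14.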

Related

how get the last 4 months average value

I am trying to aggregate the last 4 months of records to get the average OZONE value for each month, but the average comes back as null. How do I get the average value?
db.Air_pollution.aggregate([
{$match: {CREATE_DATE: {$lte: new Date(), $gte: new Date(new Date().setDate(new Date().getDate() - 120))}}},
{$group: {_id: {month: {$month: "$CREATE_DATE"}, year: {$year: "$CREATE_DATE"}}, avgofozone: {$avg: "$OZONE"}}},
{$sort: {"year": -1}},
{$project: {year: '$_id.year', month: '$_id.month', _id: 0, avgofozone: 1}}
])
output:
{ "avgofozone" : null, "year" : 2018, "month" : 2 }
{ "avgofozone" : null, "year" : 2018, "month" : 3 }
{ "avgofozone" : null, "year" : 2018, "month" : 1 }
It's not working because the OZONE field is a string, and you can't compute $avg on a string. Plus, it's not a valid number: "8:84" should be 8.84
from mongodb documentation:
$avg
Returns the average value of the numeric values that result from
applying a specified expression to each document in a group of
documents that share the same group by key. $avg ignores non-numeric
values.
Otherwise the aggregation query is correct; here is a link showing it: mongoplayground.net/p/VaL-Nn8e21E
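To illustrate why the average comes out as null, here is a small in-memory sketch in plain JavaScript (it does not talk to MongoDB). avgNumeric is a hypothetical helper that mimics how $avg skips non-numeric values, and parseOzone is a hypothetical cleanup step that assumes a value like "8:84" was meant as 8.84. The second sample string is made up for the example.

```javascript
// Hypothetical helper mimicking $avg: non-numeric values are skipped,
// and the result is null when no numeric value remains.
function avgNumeric(values) {
  const nums = values.filter((v) => typeof v === "number" && !Number.isNaN(v));
  if (nums.length === 0) return null;
  return nums.reduce((sum, v) => sum + v, 0) / nums.length;
}

// Hypothetical cleanup: assumes ":" was typed where a decimal point was meant.
function parseOzone(raw) {
  const n = Number.parseFloat(String(raw).replace(":", "."));
  return Number.isNaN(n) ? null : n;
}

// Only "8:84" comes from the question; "7.50" is an invented sample.
const stored = ["8:84", "7.50"];

const avgOfStrings = avgNumeric(stored);                 // strings are skipped
const avgOfNumbers = avgNumeric(stored.map(parseOzone)); // numbers are averaged
```

The first call has nothing numeric to average, which matches the null values in the output. Once the values are stored as numbers, the same query has something to work with.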

Save each month, each week etc. MongoDB logic

I'm having a hard time wrapping my head around how to logically solve an issue of mine regarding data that is being read from an API and inserted into MongoDB.
Let's say I have a field called "apples" whose amount changes from month to month due to seasonal effects, and I want to record these changes up to 6 months back. What do I do? Obviously I can't save new values for months that have already passed, but looking forward, what can I do to save November's value for November and then December's value for December?
I would like to use NodeJS for this btw.
Sorry if I am unclear, it was even hard to explain!
Kind regards,
Erik
It sounds like you want to group things together. MongoDB has an aggregation framework for this.
There are a lot of things you can do with it, and one of them is grouping.
You can read more about that in the $group documentation.
You can insert each apple entry (document) separately with its date.
So for example:
In "2017-11-26T16:00:00Z" we have 6 apples and price 15
In "2017-11-25T16:00:00Z" we have 4 apples and price 16
In "2017-10-25T16:00:00Z" we have 9 apples and price 30
1
Let's say we have these three entries:
/* 1 */
{
"_id" : ObjectId("5a1adc774d8a2fe38bec83e4"),
"date" : ISODate("2017-11-26T16:00:00.000Z"),
"apples" : 6,
"price" : 15
}
/* 2 */
{
"_id" : ObjectId("5a1adc924d8a2fe38bec83e8"),
"date" : ISODate("2017-11-25T16:00:00.000Z"),
"apples" : 4,
"price" : 16
}
/* 3 */
{
"date" : ISODate("2017-10-25T16:00:00.000Z"),
"apples" : 9,
"price" : 30
}
If we now want to group them by month and sum the apples per month, we could do the following:
db.yourCollection.aggregate([
{
$project:
{
month: { $month: "$date" },
apples: 1, // here we just assign the value of apples. There is no change here
price: 1 // also just assigning the value to price. Nothing is happening here.
}
},
{
$group: // grouping phase
{
_id: "$month", // This is what we group by
monthApples: {$sum: "$apples"}, // here we sum the apples per month
monthPrice: {$sum: "$price"} // here we sum the price for each month
}
}
])
In the $project stage we can make use of the date aggregation operators.
The above aggregation pipeline would produce this result:
/* 1 */
{
"_id" : 10, // month (October)
"monthApples" : 9, // sum of apples
"monthPrice" : 30 // sum of price for month 10
}
/* 2 */
{
"_id" : 11, // month (November)
"monthApples" : 10, // sum of apples
"monthPrice" : 31 // sum of price for month 11
}
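One thing to keep in mind: grouping on the month number alone puts, say, November 2017 and November 2018 into the same bucket. The sketch below reproduces the same sums in plain JavaScript, but keyed by year and month. It runs in memory on the three sample entries and is not the pipeline itself; sumByYearMonth is a hypothetical helper name, and the "YYYY-MM" key (taken in UTC) is an assumption made for the example.

```javascript
// The three sample entries from above (without _id, which is not needed here).
const entries = [
  { date: new Date("2017-11-26T16:00:00Z"), apples: 6, price: 15 },
  { date: new Date("2017-11-25T16:00:00Z"), apples: 4, price: 16 },
  { date: new Date("2017-10-25T16:00:00Z"), apples: 9, price: 30 },
];

// Hypothetical helper: sums apples and price per "YYYY-MM" (UTC) bucket.
function sumByYearMonth(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc.date.toISOString().slice(0, 7); // e.g. "2017-11"
    const group = groups.get(key) || { monthApples: 0, monthPrice: 0 };
    group.monthApples += doc.apples;
    group.monthPrice += doc.price;
    groups.set(key, group);
  }
  return groups;
}

const byMonth = sumByYearMonth(entries);
```

In the pipeline, the equivalent change would be to project the year with $year next to the month and group on both.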
2
Now imagine we have the apple type also saved in the database.
/* 1 */
{
"_id" : ObjectId("5a1adc774d8a2fe38bec83e4"),
"date" : ISODate("2017-11-26T16:00:00.000Z"),
"apples" : 6,
"price" : 15,
"appleType" : "Goldrush"
}
/* 2 */
{
"_id" : ObjectId("5a1adc924d8a2fe38bec83e8"),
"date" : ISODate("2017-11-25T16:00:00.000Z"),
"apples" : 4,
"price" : 16,
"appleType" : "Pink Lady"
}
/* 3 */
{
"_id" : ObjectId("5a1b1c144d8a2fe38bec8a56"),
"date" : ISODate("2017-10-25T16:00:00.000Z"),
"apples" : 9,
"price" : 30,
"appleType" : "Pink Lady"
}
We could, for example, group by apple type like this:
db.yourCollection.aggregate([
{
$project:
{
apples: 1, // here we just assign the value of apples. There is no change here
price: 1, // also just assigning the value to price. Nothing is happening here.
appleType: 1
}
},
{
$group: // grouping phase
{
_id: "$appleType", // group by appletype
monthApples: {$sum: "$apples"}, // here we sum the apples per apple type
monthPrice: {$sum: "$price"} // here we sum the price for each apple type
}
}
])
One possible way to model this data is to create a document for each product that stores its pricing history for a month:
{
product: "apple",
amount:[
{day: ISODate("2017-11-01T00:00:00.000Z"), price: 24},
{day: ISODate("2017-11-02T00:00:00.000Z"), price: 20},
{day: ISODate("2017-11-03T00:00:00.000Z"), price: 19},
{day: ISODate("2017-11-03T00:00:00.000Z"), price: 25}
],
quality: "best"
}

Group and sum day by day

This is what my collection structure looks like:
{
"_id" : ObjectId("57589d2a9108dace306602b8"),
"IDproject" : NumberLong(53),
"email" : "john.doe#gmail.com",
"dc" : ISODate("2016-06-06T22:33:13.000Z")
}
{
"_id" : ObjectId("57589d2a9108dace306602b8"),
"IDproject" : NumberLong(53),
"email" : "david.doe#gmail.com",
"dc" : ISODate("2016-06-07T22:33:13.000Z")
}
{
"_id" : ObjectId("57589d2a9108dace306602b8"),
"IDproject" : NumberLong(53),
"email" : "elizabeth.doe#gmail.com",
"dc" : ISODate("2016-06-07T22:33:13.000Z")
}
As you can see, there are two customers added on June 7th and one on June 6th. I would like to group and sum these results for the last 30 days.
It should look something like this:
{
"dc" : "2016-06-05",
"total" : 0
}
{
"dc" : "2016-06-06",
"total" : 1
}
{
"dc" : "2016-06-07",
"total" : 2
}
As you can see, there are no records on June 5th, so its total is zero, and the same should hold for any other day without records.
That is case #1. Case #2 is the same result as a running total:
{
"dc" : "2016-06-05",
"total" : 0
}
{
"dc" : "2016-06-06",
"total" : 1
}
{
"dc" : "2016-06-07",
"total" : 3
}
I've tried this:
db.getCollection('customer').aggregate([
{$match : { IDproject : 53}},
{ $group: { _id: "$dc", total: { $sum: "$dc" } } }, ]);
But it seems complicated. This is my first time working with a NoSQL database.
Thanks.
Here's how to get daily counts (the common idiom for a row count is {$sum: 1}).
However, you cannot obtain zeros for days that have no data, because there are no documents that would produce the grouping key for those days. You must handle these cases in PHP by generating a list of the desired dates and then checking whether there is data for each date.
db.getCollection('customer').aggregate([
{$match : { IDproject : 53}},
{$group: {
_id: {year: {$year: "$dc"}, month: {$month: "$dc"}, day: {$dayOfMonth: "$dc"}},
total: {$sum: 1}
}},
]);
Note that MongoDB stores dates in UTC, and the $year, $month and $dayOfMonth operators return the date parts in UTC by default, which may not match the date in the local timezone. Since MongoDB 3.6 these operators accept an optional timezone argument; on older versions there is no reliable way to convert timestamps to a local timezone inside the pipeline, and workarounds include:
saving timestamps in the local timezone (= lying to MongoDB that they are in UTC),
saving the timezone offset with the timestamp,
saving the local year, month and dayOfMonth with the timestamp.
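The zero-filling step described above (generate the list of desired dates, then look up each one) is language-independent. Here is a minimal sketch in JavaScript rather than PHP, to match the shell snippets. It assumes the aggregation rows have the { _id: {year, month, day}, total } shape produced by the pipeline above and that days are UTC days; fillDailyCounts is a hypothetical helper. It also keeps a running total, which corresponds to case #2 in the question.

```javascript
// Hypothetical helper: expands sparse daily counts into one row per day.
// rows     - aggregation output, e.g. { _id: {year, month, day}, total }
// startDay - first day to emit, as "YYYY-MM-DD" (interpreted as UTC)
// numDays  - how many consecutive days to emit
function fillDailyCounts(rows, startDay, numDays) {
  const pad = (n) => String(n).padStart(2, "0");
  const byDay = new Map();
  for (const row of rows) {
    const { year, month, day } = row._id;
    byDay.set(`${year}-${pad(month)}-${pad(day)}`, row.total);
  }

  const result = [];
  let running = 0;
  const cursor = new Date(`${startDay}T00:00:00Z`);
  for (let i = 0; i < numDays; i++) {
    const dc = cursor.toISOString().slice(0, 10);
    const total = byDay.get(dc) || 0; // days without data become zero
    running += total;
    result.push({ dc, total, running });
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return result;
}

// Rows as the pipeline would return them for the sample documents.
const rows = [
  { _id: { year: 2016, month: 6, day: 6 }, total: 1 },
  { _id: { year: 2016, month: 6, day: 7 }, total: 2 },
];
const filled = fillDailyCounts(rows, "2016-06-05", 3);
```

The total field gives the per-day counts of case #1, and the running field gives the cumulative counts of case #2.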

mongo query select only first of month

Is it possible to query only the first (or last, or any single) day of the month on a Mongo date field?
I use the date aggregation operators regularly, but only within a $group clause.
Basically, I have a field that is already aggregated (averaged) for each day of the month. I want to select only one of these days, with its value as a representative of the entire month.
The following is a sample of a record set from Jan 1, 2014 to Feb 1, 2015, with price as the daily price and 28day_avg as the trailing monthly average over 28 days.
{ "date" : ISODate("2014-01-01T00:00:00Z"), "_id" : ObjectId("533b3697574e2fd08f431cff"), "price": 59.23, "28day_avg": 54.21}
{ "date" : ISODate("2014-01-02T00:00:00Z"), "_id" : ObjectId("533b3697574e2fd08f431cff"), "price": 58.75, "28day_avg": 54.15}
...
{ "date" : ISODate("2015-02-01T00:00:00Z"), "_id" : ObjectId("533b3697574e2fd08f431cff"), "price": 123.50, "28day_avg": 122.25}
method 1.
I'm currently running an aggregation that groups on $month (and sums the price), but I want to retrieve the underlying date value ISODate("2015-02-01T00:00:00Z") rather than the 0, 1, 2, ... values that the date operators return (which restart at the first of the week, month or year). Perhaps mod(28) on a date?
method 2
I'd like to simply pluck out a single record of the 28day_avg as representative of the period. The 1st of the month would be adequate.
the desired output is...
_id: ISODate("2015-02-01T00:00:00Z"), value: 122.25,
_id: ISODate("2015-01-01T00:00:00Z"), value: 120.78,
_id: ISODate("2014-12-01T00:00:00Z"), value: 118.71,
...
_id: ISODate("2014-01-01T00:00:00Z"), value: 53.21,
Of course, the value will differ between method 1 and method 2, but that is fine: one is a 28-day trailing average, while the other accounts for 28-, 30- and 31-day months. I don't care about that so much.
A non-aggregation query would be OK too, but this one doesn't work either: {"date": { "$mod": [ 28, 0 ]} }
To pick the first of the month for each month (method 2), use the following aggregation:
db.test.aggregate([
{ "$project" : { "_id" : "$date", "day" : { "$dayOfMonth" : "$date" }, "28day_avg" : 1 } },
{ "$match" : { "day" : 1 } }
])
You can't use an index for the match, so this is not efficient. I'd suggest adding another field to each document that holds the $dayOfMonth value, so you can index it and do a simple find:
{
"date" : ISODate("2014-01-01T00:00:00Z"),
"price" : 59.23,
"28day_avg" : 54.21,
"dayOfMonth" : 1
}
db.test.createIndex({ "dayOfMonth" : 1 })
db.test.find({ "dayOfMonth" : 1 }, { "_id" : 0, "date" : 1, "28day_avg" : 1 })
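If the extra field is filled in by application code, it should be computed the same way MongoDB computes $dayOfMonth, which is in UTC by default. A minimal sketch, assuming documents are plain objects with a JavaScript Date in date; withDayOfMonth is a hypothetical helper:

```javascript
// Hypothetical helper: returns a copy of the document with a dayOfMonth field.
// getUTCDate() is used so the value agrees with $dayOfMonth, which works in
// UTC by default; getDate() would use the machine's local timezone instead.
function withDayOfMonth(doc) {
  return { ...doc, dayOfMonth: doc.date.getUTCDate() };
}

const prepared = withDayOfMonth({
  date: new Date("2014-01-01T00:00:00Z"),
  price: 59.23,
  "28day_avg": 54.21,
});
```

The prepared object can then be inserted in place of the original document.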

Need some help completing this aggregation pipeline

I have an analytics collection where I store queries as individual documents. I want to count the number of queries taking place over the past day (24 hours). Here's the aggregation command as it is:
db.analytics.aggregate([
{$group:{_id:{"day":{$dayOfMonth:"$datetime"},"hour":{$hour:"$datetime"}},"count":{$sum:1}}},
{$sort:{"_id.day":1,"_id.hour":1}}
])
The result looks like:
.
.
.
{
"_id" : {
"day" : 17,
"hour" : 19
},
"count" : 8
},
{
"_id" : {
"day" : 17,
"hour" : 22
},
"count" : 1
},
{
"_id" : {
"day" : 18,
"hour" : 0
},
"count" : 1
}
.
.
.
Originally, my plan was to add a $limit operation to simply take the last 24 results. That's a great plan until you realize that there are some hours without any queries at all. So the last 24 documents could go back more than a single day. I thought of using $match, but I'm just not sure how to go about constructing it. Any ideas?
First, determine the day you are interested in, either from the current date or from the most recent document in the collection. Then query for that day, for example:
db.analytics.aggregate([
{$project:{datetime:"$datetime",day:{$dayOfMonth:"$datetime"}}},
{$match:{day:3}},
{$group:{_id:{"hour":{$hour:"$datetime"}},"count":{$sum:1}}},
{$sort:{"_id.hour":1}}
]);
where 3 in {$match:{day:3}} is the day of the month.
The idea is to add a day field so that we can filter by it, then group that day's documents by hour and sort.
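Note that matching on the day of the month selects a calendar day (and the same day number in every month), not a rolling 24-hour window. If the rolling window from the question is what you need, one alternative is to $match on a datetime range before grouping. The sketch below only builds the pipeline as a plain JavaScript array; last24hPipeline is a hypothetical helper, and the field name datetime is taken from the question. Grouping on a $dateToString key is an assumption made here so that the buckets sort correctly across day and month boundaries.

```javascript
// Hypothetical helper: builds a pipeline that counts documents per hour
// over the 24 hours before `now`.
function last24hPipeline(now = new Date()) {
  const since = new Date(now.getTime() - 24 * 60 * 60 * 1000);
  return [
    // Keep only documents from the last 24 hours.
    { $match: { datetime: { $gte: since, $lt: now } } },
    // One bucket per UTC hour, keyed by a sortable string like "2016-06-07T22".
    {
      $group: {
        _id: { $dateToString: { format: "%Y-%m-%dT%H", date: "$datetime" } },
        count: { $sum: 1 },
      },
    },
    { $sort: { _id: 1 } },
  ];
}

const pipeline = last24hPipeline(new Date("2016-06-07T22:00:00Z"));
```

In the shell this would be run as db.analytics.aggregate(last24hPipeline()). Hours without any queries still produce no bucket, so missing hours have to be filled in on the client if zeros are needed.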