Get last inserted item per day - mongodb

Is there a feature in mongodb that I can use to get the last inserted item per day ? I have a collection where I need to get the last inserted item per day, the data is grouped on an hourly basis like in the structure below.
{
timestamp: 2017-05-04T09:00:00.000+0000,
data: {}
},
{
timestamp: 2017-05-04T10:00:00.000+0000,
data: {}
}
I thought about using a projection but I am not quite sure how I could do this.
Edit: Also, since mongodb stores data in UTC, I would like to account for the offset as well.

You can $sort and then use $last to pick the item, rounding the grouping key down to each day:
db.collection.aggregate([
{ "$sort": { "timestamp": 1 } },
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$timestamp", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$timestamp", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]},
new Date(0)
]
},
"lastDoc": { "$last": "$$ROOT" }
}}
])
So the sort puts the documents in order, and then the grouping _id is rounded down to each day with some date math. You subtract the epoch date from the document's date to make it a number, use the modulus to round to a day, then add the epoch date to the number to return a Date.
Stepping through the math, we first get the timestamp value from the date with the $subtract line. We do this a couple of times:
{ "$subtract": [ "$timestamp", new Date(0) ] }
// Is roughly internally like
ISODate("2017-06-06T10:44:37.627Z") - ISODate("1970-01-01T00:00:00Z")
1496745877627
Then there is the modulo with $mod, which when applied to the numeric value returns the remainder past the start of the day. The other argument is the number of milliseconds in a day: 1000 milliseconds * 60 seconds * 60 minutes * 24 hours:
{ "$mod": [
{ "$subtract": [ "$timestamp", new Date(0) ] },
1000 * 60 * 60 * 24
]}
// Equivalent to
1496745877627 % (1000 * 60 * 60 * 24)
38677627
Then there is the wrapping $subtract of the two numbers:
{ "$subtract": [
{ "$subtract": [ "$timestamp", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$timestamp", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]}
// Subtract the modulo "remainder" past the start of the day
// from the milliseconds value of the current date
1496745877627 - 38677627
1496707200000
Then add back to the epoch date value to create a date rounded to the current day, which to the aggregation pipeline basically looks like providing the millisecond value to the constructor:
new Date(1496707200000)
ISODate("2017-06-06T00:00:00Z")
This takes the timestamp value, subtracts the remainder left over after dividing by "one day", and ends up at the time at the "start of day".
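The same arithmetic can be checked outside the pipeline. The following is a minimal sketch in plain JavaScript, not part of the original answer: floorToDay is a hypothetical helper name, and the optional offsetMs argument is an assumption about how a fixed UTC offset (as asked about in the question's edit) could be applied before rounding.

```javascript
// Plain JavaScript version of the $subtract / $mod / $add steps above.
const DAY_MS = 1000 * 60 * 60 * 24;

// floorToDay is a hypothetical helper; offsetMs is an assumed fixed UTC offset
// in milliseconds (it does not account for daylight saving changes).
function floorToDay(date, offsetMs = 0) {
  const millis = date.getTime() + offsetMs; // like { $subtract: [ date, new Date(0) ] }
  const remainder = millis % DAY_MS;        // like { $mod: [ millis, DAY_MS ] }
  return new Date(millis - remainder);      // like { $add: [ difference, new Date(0) ] }
}

// The sample value used in the walk-through above.
const sample = new Date("2017-06-06T10:44:37.627Z");
const utcDay = floorToDay(sample);

// With a +10 hour offset, 20:00 UTC already belongs to the next "local" day.
// The returned Date carries the local calendar day in its UTC fields.
const shiftedDay = floorToDay(new Date("2017-06-06T20:00:00Z"), 10 * 60 * 60 * 1000);

console.log(utcDay.toISOString());
console.log(shiftedDay.toISOString());
```

In a pipeline the equivalent of offsetMs would be an extra $add around the date field before the subtraction, which is only a fixed shift and not full timezone handling.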
The pipeline uses $$ROOT with $last to return the whole document, but any document path given to $last works the same way.

Related

mongodb find oldest date of three keys in each document

I have a document schema that looks like this:
{
status: String,
estimateDate: Date,
lostDate: Date,
soldDate: Date,
assignedDate: Date
}
With this schema, all three dates could exist or none of them could exist. I need to check all three: if at least one exists, use the oldest date; if none exists, use today's date. With the "returned" date, I then need the difference in days from another key (assignedDate). I have figured out how to do what I want with one date, but cannot figure out how to scale this up to include all three keys. Below is the working code I have for one key.
Within my aggregate pipeline $project stage I do the following:
days: {
$cond: {
if: {
$not: ["$date1"]
},
then: {
$floor: {
$divide: [
{
$subtract: [new Date(), "$assignedDate"]
},
1000 * 60 * 60 * 24
]
}
},
else: {
$floor: {
$divide: [
{
$subtract: [
"$estimateDate",
"$assignedDate"
]
},
1000 * 60 * 60 * 24
]
}
}
}
}
You can use the $min and $ifNull operators to get the oldest date, specifying new Date() as the default value for any of those dates that does not exist:
db.col.aggregate([
{
$project: {
oldest: {
$min: [
{ $ifNull: [ "$estimateDate", new Date() ] },
{ $ifNull: [ "$lostDate", new Date() ] },
{ $ifNull: [ "$soldDate", new Date() ] }
]
}
}
}
])
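The answer stops at the oldest date, while the question also wants the difference in days from assignedDate. As a rough sketch of the combined logic in plain JavaScript (the helper names oldestDateOrNow and daysSinceAssigned and the sample documents are hypothetical, and the three candidate dates are taken from the question as estimateDate, lostDate and soldDate):

```javascript
const DAY_MS = 1000 * 60 * 60 * 24;

// Emulates { $min: [ { $ifNull: [ field, now ] }, ... ] } over the three candidate dates.
function oldestDateOrNow(doc, now) {
  const candidates = [doc.estimateDate, doc.lostDate, doc.soldDate]
    .map((d) => (d == null ? now : d)); // $ifNull: fall back to "now"
  return new Date(Math.min(...candidates.map((d) => d.getTime()))); // $min
}

// Emulates the $floor / $divide / $subtract step from the question.
function daysSinceAssigned(doc, now = new Date()) {
  const oldest = oldestDateOrNow(doc, now);
  return Math.floor((oldest.getTime() - doc.assignedDate.getTime()) / DAY_MS);
}

// Hypothetical sample documents.
const now = new Date("2018-03-10T00:00:00Z");
const withDates = {
  assignedDate: new Date("2018-03-01T00:00:00Z"),
  lostDate: new Date("2018-03-08T12:00:00Z"),
  soldDate: new Date("2018-03-05T06:00:00Z"),
};
const withoutDates = { assignedDate: new Date("2018-03-01T00:00:00Z") };

console.log(daysSinceAssigned(withDates, now));    // soldDate is the oldest
console.log(daysSinceAssigned(withoutDates, now)); // falls back to "now"
```

In the pipeline itself, the same result should be obtainable by placing the $min expression where "$estimateDate" appears in the question's $subtract, which would also make the $cond unnecessary.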

How to use aggregate to group by half hour with rounding?

I'm using $group to group my post by hour like:
"$group" : {
"_id" : {
"$hour" : {
$add : ["$createdAt", 10*60*60*1000]
}
},
...
}
But now I also want to group by half hour, which means:
2:30 => 3:00
2:29 => 2:00
How can I do this with a mongo aggregate?
Sorry for my bad English. :)
I gather the +10 here is for a timezone adjustment. The same basic principles apply to producing the date with 30 minute rounding, except you want to first just convert to a numeric value and work back the intervals via a modulo ( $mod ):
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [
{ "$add": [ "$createdAt", 1000 * 60 * 60 * 10 ] },
new Date(0)
]},
{ "$mod": [
{ "$subtract": [
{ "$add": [ "$createdAt", 1000 * 60 * 60 * 10 ] },
new Date(0)
]},
1000 * 60 * 30
]}
]},
new Date(0)
]
},
"count": { "$sum": 1 } // or whatever accumulation required
}}
Using the epoch date ( Date(0) ) in a $subtract operation from the stored date ( adjusted ) returns the milliseconds since epoch as a numeric value. The modulo of that value by the milliseconds in 30 minutes returns the remainder past the interval start, and you then $subtract that remainder to get a rounded interval.
The same applies in reverse with the $add operation, where adding the epoch date object to a numeric value returns a Date again.
So the start of every 30 minute interval is now the grouping key.
You can alternatively use the date aggregation operators, but the date math approach returns a BSON Date object, which your driver's API translates to a native date type, rather than just a numeric value for the "minutes" interval.
It's just standard "date math", so all the same operations apply.
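Because the nested expression is the same for every interval, it can be generated rather than typed out. The following is a small sketch, not part of the original answer: roundedDateExpr is a hypothetical helper name, and its field, intervalMs and offsetMs parameters are assumptions for illustration.

```javascript
// Builds the $add / $subtract / $mod expression used as the grouping key above.
// field:      a field path such as "$createdAt"
// intervalMs: interval size in milliseconds
// offsetMs:   assumed fixed offset in milliseconds, applied before rounding
function roundedDateExpr(field, intervalMs, offsetMs = 0) {
  const epoch = new Date(0);
  // Milliseconds since epoch of the (offset-adjusted) date.
  const millis = { $subtract: [{ $add: [field, offsetMs] }, epoch] };
  // Subtract the remainder past the interval start, then convert back to a Date.
  return { $add: [{ $subtract: [millis, { $mod: [millis, intervalMs] }] }, epoch] };
}

// The $group stage from the answer: 30 minute intervals with a +10 hour adjustment.
const halfHourGroup = {
  $group: {
    _id: roundedDateExpr("$createdAt", 1000 * 60 * 30, 1000 * 60 * 60 * 10),
    count: { $sum: 1 },
  },
};

console.log(JSON.stringify(halfHourGroup, null, 2));
```

The resulting object can be placed in the pipeline array as is, for example db.collection.aggregate([ halfHourGroup ]) in the shell.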

How to Get the Max Daily Value per Week by Grouping

I am working on a project which requires me to first calculate how much distance was traveled per day, and then from that data show the maximum, minimum and average distance traveled in that particular week.
This is a mongoDB script I have written.
db = connect("localhost:27017/mydb");
var result = db.trips.aggregate([
{
"$unwind" : "$trips"
},
{
"$match" : {
"trips.startTime" : {"$lte" : ISODate("2015-10-31T23:59:59Z"), "$gte" : ISODate("2015-10-25T00:00:00Z")}
}
},
{
"$group" :
{
"_id" : {
"date" : {"$dayOfMonth" : "$trips.startTime"}
},
"distance" :{"$sum" : "$trips.distance"}
}
}
]);
while(result.hasNext())
{
print(tojson(result.next()));
}
Which when replaced by dynamic dates gives me correct values.
Now it leaves me with two options, either I modify the current group query or write a double group query. Double group query seems a more valid approach. My attempt at writing such a query.
{
"$group" :
{
"_id" : {
"week" : "$_id.date"
},
"max-distance" : {
"$max" : "$distance"
}
}
}
Adding these lines didn't make a difference. Clearly I am doing something wrong, but I don't know how to correct it and would appreciate some help.
Thanks
You seem to want the $week operator, but of course you need a valid Date as input in order to extract the "week" from that.
What you may not know is that you can instead use "date math" to round out the date to a "day", where the result is still a Date object. Then you can use the $week operator to obtain your $max values:
db.trips.aggregate([
{ "$unwind" : "$trips" },
{ "$match": {
"trips.startTime" : {
"$lte": ISODate("2015-10-31T23:59:59Z"),
"$gte": ISODate("2015-10-25T00:00:00Z")
}
}},
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$trips.startTime", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$trips.startTime", new Date(0) ] },
1000 * 60 * 60 * 24
]}
]},
new Date(0)
]
},
"distance": { "$sum": "$trips.distance" }
}},
{ "$group": {
"_id": { "$week": "$_id" },
"max-distance": { "$max": "$distance" }
}}
]);
The basic trick in the first part is that when you $subtract one Date object from another, the result is the difference in milliseconds. So by using the epoch date, the date is converted to its milliseconds equivalent, and then you can use the math to round that number to a day.
(1000 * 60 * 60 * 24) is the number of milliseconds in a day, so finding the modulo ( $mod ) of that returns the remainder of milliseconds past the start of the day, which you can subtract from the date value in the document to round to a day.
The reverse is true of $add: when adding a Date object to a number, the result is a Date. So this handles the conversion, and then the $week can be extracted from there.
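The question also asks for the minimum and average per week, which the pipeline above does not compute. A possible sketch of an extended second $group stage follows; the "min-distance" and "avg-distance" field names are hypothetical, and the summarize helper and its sample values exist only to show what the three accumulators compute over the daily totals.

```javascript
// Second $group stage extended with $min and $avg next to $max.
// "$_id" is the rounded day and "$distance" the daily total from the first $group.
const weeklyStats = {
  $group: {
    _id: { $week: "$_id" },
    "max-distance": { $max: "$distance" },
    "min-distance": { $min: "$distance" }, // hypothetical field name
    "avg-distance": { $avg: "$distance" }, // hypothetical field name
  },
};

// Plain JavaScript equivalent of the three accumulators for one week of daily totals.
function summarize(dailyTotals) {
  const sum = dailyTotals.reduce((acc, d) => acc + d, 0);
  return {
    max: Math.max(...dailyTotals),
    min: Math.min(...dailyTotals),
    avg: sum / dailyTotals.length,
  };
}

// Hypothetical daily totals for a single week.
const stats = summarize([12, 30, 18]);
console.log(stats);
```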

MongoDB split values into 5 minute intervals and return most recent within interval group

My Mongo database has documents as so:
{
"timestamp": ISODate("2015-09-27T15:28:06.0Z"),
"value": '123'
},
{
"timestamp": ISODate("2015-09-27T15:31:06.0Z"),
"value": '737'
},
{
"timestamp": ISODate("2015-09-27T15:35:00.0Z"),
"value": '456'
},
{
"timestamp": ISODate("2015-09-27T15:40:20.0Z"),
"value": '789'
}
...etc...
What I want to do is aggregate these in 5 minute intervals and then get the most recent (with the latest timestamp) value per 'group of 5 minutes'.
So basically the steps are:
1) split into groups of 5 minutes
2) return the 5-minute timestamp and the value of the document that has the newest timestamp within this 5 minute group
Based on that and my documents above the documents returned should be:
{
"timestamp": ISODate("2015-09-27T15:25:00.0Z"),
"value": '123'
},
{
"timestamp": ISODate("2015-09-27T15:35:00.0Z"),
"value": '456' // 456 has a newer timestamp than 737, which are in the same 5 minute range
},
{
"timestamp": ISODate("2015-09-27T15:40:00.0Z"),
"value": '789'
}
I have tried grouping into 5 minute intervals as described here: https://stackoverflow.com/a/26814496/1007236
Starting from there I can't find out how to return the value of the most recent within each 5 minute group.
How can I do that?
You solve this by a very simple application of Date math:
db.collection.aggregate([
{ "$sort": { "timestamp": 1 } },
{ "$group": {
"_id": {
"$add": [
{ "$subtract": [
{ "$subtract": [ "$timestamp", new Date(0) ] },
{ "$mod": [
{ "$subtract": [ "$timestamp", new Date(0) ] },
1000 * 60 * 5
]}
]},
new Date(0)
]
},
"value": { "$last": "$value" }
}}
])
Where the basic principle is finding the modulo ( $mod ), or "remainder", of the time by a five minute interval and subtracting that from the base time. This rounds down to "five minutes".
Of course the other part is that you $sort by "timestamp" in ascending order, so that $last picks the "value" from the document with the latest "timestamp" in each interval.
The other part is that when you $subtract the "epoch" date as a BSON Date from another date, you receive an "integer" as the result. Similarly, adding ( $add ) an "integer" to a BSON Date gives you another BSON Date.
The result is BSON Date objects rounded down to the interval you use with the math.
1000 milliseconds X 60 seconds X 5 minutes.
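To see which value ends up in each interval, the grouping can be emulated in plain JavaScript. This is only a sketch under assumptions: latestPerInterval is a hypothetical helper, and the document at 15:33:30 is an invented extra sample added so that one interval contains two documents.

```javascript
const FIVE_MIN_MS = 1000 * 60 * 5;

const docs = [
  { timestamp: new Date("2015-09-27T15:28:06Z"), value: "123" },
  { timestamp: new Date("2015-09-27T15:31:06Z"), value: "737" },
  { timestamp: new Date("2015-09-27T15:33:30Z"), value: "800" }, // hypothetical extra document
  { timestamp: new Date("2015-09-27T15:35:00Z"), value: "456" },
  { timestamp: new Date("2015-09-27T15:40:20Z"), value: "789" },
];

// Keeps the value of the document with the latest timestamp in each interval.
function latestPerInterval(input, intervalMs) {
  // Ascending sort, like { $sort: { timestamp: 1 } }.
  const sorted = [...input].sort((a, b) => a.timestamp - b.timestamp);
  const groups = new Map();
  for (const doc of sorted) {
    const millis = doc.timestamp.getTime();
    const key = millis - (millis % intervalMs); // rounded interval start
    groups.set(key, doc.value);                 // later documents overwrite earlier ones
  }
  return [...groups].map(([key, value]) => ({ _id: new Date(key), value }));
}

const result = latestPerInterval(docs, FIVE_MIN_MS);
console.log(result);
```

Note that with this rounding 15:31:06 falls in the 15:30 interval and 15:35:00 starts its own interval, so the sample data produces four groups rather than the three listed in the question.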

Getting unix timestamp in seconds out of MongoDB ISODate during aggregation

I was searching for this one but I couldn't find anything useful to solve my case. What I want is to get the unix timestamp in seconds out of MongoDB ISODate during aggregation. The problem is that I can get the timestamp out of ISODate but it's in milliseconds. So I would need to cut out those milliseconds. What I've tried is:
> db.data.aggregate([
{$match: {dt:2}},
{$project: {timestamp: {$concat: [{$substr: ["$md", 0, -1]}, '01', {$substr: ["$id", 0, -1]}]}}}
])
As you can see I'm trying to get the timestamp out of 'md' var and also concatenate this timestamp with '01' and the 'id' number. The above code gives:
{
"_id" : ObjectId("52f8fc693890fc270d8b456b"),
"timestamp" : "2014-02-10T16:20:56011141"
}
Then I improved the command with:
> db.data.aggregate([
{$match: {dt:2}},
{$project: {timestamp: {$concat: [{$substr: [{$subtract: ["$md", new Date('1970-01-01')]}, 0, -1]}, '01', {$substr: ["$id", 0, -1]}]}}}
])
Now I get:
{
"_id" : ObjectId("52f8fc693890fc270d8b456b"),
"timestamp" : "1392049256000011141"
}
What I really need is 1392049256011141, so without the 3 extra 000. I tried with $divide:
> db.data.aggregate([
{$match: {dt:2}},
{$project: {timestamp: {$concat: [{$substr: [{$divide: [{$subtract: ["$md", new Date('1970-01-01')]}, 1000]}, 0, -1]}, '01', {$substr: ["$id", 0, -1]}]}}}
])
What I get is:
{
"_id" : ObjectId("52f8fc693890fc270d8b456b"),
"timestamp" : "1.39205e+009011141"
}
Not exactly what I would expect from the command. Unfortunately the $substr operator doesn't allow negative length. Does anyone have any other solution?
I'm not sure why you think you need the value in seconds rather than milliseconds, as both forms are valid and most language implementations actually prefer milliseconds. But generally speaking, trying to coerce this into a string is the wrong way to go about it; you just do the math:
db.data.aggregate([
{ "$project": {
"timestamp": {
"$subtract": [
{ "$divide": [
{ "$subtract": [ "$md", new Date("1970-01-01") ] },
1000
]},
{ "$mod": [
{ "$divide": [
{ "$subtract": [ "$md", new Date("1970-01-01") ] },
1000
]},
1
]}
]
}
}}
])
This returns an epoch timestamp in seconds. It relies on the fact that subtracting one BSON Date from another yields the interval between them in milliseconds, so subtracting the epoch date of "1970-01-01" essentially extracts the milliseconds value from the date. The $divide operator converts milliseconds to (fractional) seconds, and subtracting the $mod remainder of 1 drops the fractional part, which rounds down to a whole second.
Really though, you are better off doing the work in the native language of your application, as all BSON dates are returned there as a native "date/datetime" type from which you can extract the timestamp value. Consider the JavaScript basics in the shell:
var date = new Date()
( date.valueOf() / 1000 ) - ( ( date.valueOf() / 1000 ) % 1 )
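Applied to the values from the question, the application-side conversion could look like the sketch below. It assumes md holds the date shown in the question's output (1392049256000 milliseconds) and that id is 1141, which is inferred from the "011141" suffix; the variable names are illustrative only.

```javascript
// Values taken or inferred from the question's sample output.
const md = new Date(1392049256000); // 2014-02-10T16:20:56Z
const id = 1141;                    // assumed from the "01" + id suffix "011141"

// Whole seconds since epoch; equivalent to the divide-and-subtract-remainder form above.
const millis = md.valueOf();
const seconds = Math.floor(millis / 1000);
const viaModulo = millis / 1000 - ((millis / 1000) % 1);

// Build the combined string in application code instead of with $substr / $concat.
const combined = `${seconds}01${id}`;

console.log(seconds, viaModulo, combined);
```

Math.floor and the modulo form agree for dates on or after the epoch, which is the only case considered here.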
Typically with aggregation you want to do this sort of "math" to a timestamp value for use in something like aggregating values within a time period such as a day. There are date operators available to the aggregation framework, but you can also do it the date math way:
db.data.aggregate([
{ "$group": {
"_id": {
"$subtract": [
{ "$subtract": [ "$md", new Date("1970-01-01") ] },
{ "$mod": [
{ "$subtract": [ "$md", new Date("1970-01-01") ] },
1000 * 60 * 60 * 24
]}
]
},
"count": { "$sum": 1 }
}}
])
That form is more typical: it emits a timestamp rounded to a day and aggregates the results within those intervals.
So using the aggregation framework just to extract a timestamp does not seem to be the best usage, and it should not be necessary to convert this to seconds rather than milliseconds. Your application code is where I think you should be doing that, unless of course you actually want results for intervals of time, where you can apply the date math as shown.
The methods are there, but unless you are actually aggregating, this is the worst performing option for your application. Do the conversion in code instead.