Query datetime by time of day in MongoDB [duplicate]

This question already has answers here:
Group result by 15 minutes time interval in MongoDb
(7 answers)
Closed 5 years ago.
I have a collection of objects in my MongoDB database containing datetimes among other values.
How would I go about querying for objects by datetime, where the time of day is 9 o'clock?
So if I have the following collection...
{ id : 1, date : ISODate("2017-07-16T09:00:00.000+0000") }
{ id : 2, date : ISODate("2017-01-17T07:00:00.000+0000") }
{ id : 3, date : ISODate("2017-07-27T09:00:00.000+0000") }
{ id : 4, date : ISODate("2017-03-20T09:00:00.000+0000") }
{ id : 5, date : ISODate("2017-03-07T10:00:00.000+0000") }
{ id : 6, date : ISODate("2017-07-04T11:00:00.000+0000") }
The return value should be...
{ id : 1, date : ISODate("2017-07-16T09:00:00.000+0000") }
{ id : 3, date : ISODate("2017-07-27T09:00:00.000+0000") }
{ id : 4, date : ISODate("2017-03-20T09:00:00.000+0000") }
I'm fairly new to MongoDB and not very experienced with JS, so please keep it as simple as you can. On that note, Neil Lunn marked this question as a duplicate of
This Question, which I feel is partially correct, but it's also more complex than I need.
I don't need grouping or anything of that nature, I just want a query that tells me which documents exist containing this timestamp.

You could use an aggregation pipeline to convert each date to its time part and then match on that converted value. For example:
db.collection.aggregate([
  {
    $project: {
      timePart: { $dateToString: { format: "%H:%M:%S.%L", date: "$date" } },
      date: 1
    }
  },
  {
    $match: {
      timePart: "09:00:00.000"
    }
  },
  {
    $project: {
      date: 1
    }
  }
])
You can think of this as a pipeline: the output of the first $project stage becomes the input to the $match stage. For every document in the underlying collection, the $project stage outputs a document containing the _id, the date, and a new attribute named timePart, populated with the time part of the date attribute. The $match stage then matches these documents against your filter criterion (in your example 09:00:00.000, i.e. 9am), and the matching documents are forwarded to the final stage, which uses the $project operator again to discard the timePart attribute since, I assume, it is only relevant for searching and should not be included in the end result.
Breaking it down, the output of the first step looks like this:
{
"_id" : 1,
date : ISODate("2017-07-16T09:00:00.000+0000"),
timePart: "09:00:00.000"
},
{
"_id" : 2,
date : ISODate("2017-01-17T07:00:00.000+0000"),
timePart: "07:00:00.000"
},
...
The second step excludes the document with id: 2 because its timePart does not match 09:00:00.000, and forwards the documents with ids 1, 3 and 4 to the third stage, which then selects, from those forwarded documents, the fields _id and date, thereby giving you:
{
"_id" : 1,
date : ISODate("2017-07-16T09:00:00.000+0000")
},
{
"_id" : 3,
date : ISODate("2017-07-27T09:00:00.000+0000")
},
{
"_id" : 4,
date : ISODate("2017-03-20T09:00:00.000+0000")
}
Note: this approach has to transform the date attribute of every document before applying the match stage, so it cannot take advantage of an index on date. If that's worryingly inefficient for you, then you might want to reconsider how you are persisting this data, for example by storing the time part in its own indexed field.
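If you want to see the pipeline's effect without a server, here is a plain-JavaScript sketch (not a MongoDB query; the documents are the sample ones from the question) that extracts the same UTC time part and keeps only the 9 o'clock documents:

```javascript
// Plain-JavaScript sketch of the pipeline's effect (not a MongoDB query):
// extract the UTC time part of each date and keep only 09:00:00.000.
const docs = [
  { id: 1, date: new Date("2017-07-16T09:00:00.000Z") },
  { id: 2, date: new Date("2017-01-17T07:00:00.000Z") },
  { id: 3, date: new Date("2017-07-27T09:00:00.000Z") },
  { id: 4, date: new Date("2017-03-20T09:00:00.000Z") },
  { id: 5, date: new Date("2017-03-07T10:00:00.000Z") },
  { id: 6, date: new Date("2017-07-04T11:00:00.000Z") },
];

// Equivalent of $dateToString with format "%H:%M:%S.%L"
const timePart = (d) => d.toISOString().slice(11, 23);

const matched = docs.filter((doc) => timePart(doc.date) === "09:00:00.000");
console.log(matched.map((doc) => doc.id)); // [ 1, 3, 4 ]
```

As a side note, on MongoDB 3.6 or newer you could also skip the string conversion and match directly in a find() using $expr with the $hour, $minute and $second operators.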

Related

How to get the last 4 months' average value

I am trying this aggregation to get each month's average OZONE value over the last 4 months of records, but the average value is null. How do I get the average value?
db.Air_pollution.aggregate([
  { $match: { CREATE_DATE: { $lte: new Date(), $gte: new Date(new Date().setDate(new Date().getDate() - 120)) } } },
  { $group: { _id: { month: { $month: "$CREATE_DATE" }, year: { $year: "$CREATE_DATE" } }, avgofozone: { $avg: "$OZONE" } } },
  { $sort: { "year": -1 } },
  { $project: { year: "$_id.year", month: "$_id.month", _id: 0, avgofozone: 1 } }
])
output:
{ "avgofozone" : null, "year" : 2018, "month" : 2 }
{ "avgofozone" : null, "year" : 2018, "month" : 3 }
{ "avgofozone" : null, "year" : 2018, "month" : 1 }
It's not working because the OZONE field is a string, and $avg cannot compute an average of strings. On top of that, the values are not even valid numbers: "8:84" should be 8.84.
From the MongoDB documentation:
$avg
Returns the average value of the numeric values that result from
applying a specified expression to each document in a group of
documents that share the same group by key. $avg ignores non-numeric
values.
Otherwise the aggregation query is correct; here is a link showing it: mongoplayground.net/p/VaL-Nn8e21E
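To illustrate the behaviour, here is a small plain-JavaScript sketch (avgNumeric is a made-up helper, and the second sample value is invented for illustration) showing why $avg yields null over strings, and how converting "8:84"-style values to numbers fixes it:

```javascript
// avgNumeric is a made-up helper that mimics $avg: it ignores non-numeric
// values and returns null when a group has no numeric values at all.
function avgNumeric(values) {
  const nums = values.filter((v) => typeof v === "number" && !Number.isNaN(v));
  if (nums.length === 0) return null;
  return nums.reduce((a, b) => a + b, 0) / nums.length;
}

const raw = ["8:84", "7:12"]; // strings, as stored -> all ignored -> null
const fixed = raw.map((s) => parseFloat(s.replace(":", "."))); // [8.84, 7.12]

console.log(avgNumeric(raw));   // null, just like avgofozone in the output above
console.log(avgNumeric(fixed)); // 7.98
```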

Add object to object array if an object property is not given yet

Use Case
I've got a collection band_profiles and a collection band_profiles_history. The history collection is supposed to store a band_profile snapshot every 24 hours, and therefore I am using MongoDB's recommended format for historical tracking: each month+year is its own document, and in an object array I store the bandProfile snapshot along with the current day of the month.
My models:
A document in band_profiles_history looks like this:
{
"_id" : ObjectId("599e3bc406955db4cbffe0a8"),
"month" : 7,
"tag_lowercased" : "9yq88gg",
"year" : 2017,
"values" : [
{
"_id" : ObjectId("599e3bc41c073a7418fead91"),
"profile" : {
"_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
"tag" : "9YQ88GG",
"name_normalized" : "example name1"
},
"day" : 1
},
{
"_id" : ObjectId("599e3bc41c073a7418fead91"),
"profile" : {
"_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
"tag" : "9YQ88GG",
"name_normalized" : "new name"
},
"day" : 2
}
]
}
And a document in band_profiles:
{
"_id" : ObjectId("5989a6190f39d9fd70cddeb1"),
"tag" : "9V9LRGU",
"name_normalized" : "example name",
"tag_lowercased" : "9v9lrgu"
}
This is how I upsert my documents into band_profiles_history at the moment:
BandProfileHistory.update(
{ tag_lowercased: tag, year, month},
{ $push: {
values: { day, profile }
}
},
{ upsert: true }
)
My problem:
I only want to insert ONE snapshot per day. Right now it always pushes a new object into the values array, whether or not I already have an object for that day. How can I make it push the object only if there is no object for the current day yet?
Putting mongoose aside for a moment:
There is an operator, $addToSet, that adds an element to an array only if it doesn't already exist.
Caveat:
If the value is a document, MongoDB determines that the document is a duplicate if an existing document in the array matches the to-be-added document exactly; i.e. the existing document has the exact same fields and values and the fields are in the same order. As such, field order matters and you cannot specify that MongoDB compare only a subset of the fields in the document to determine whether the document is a duplicate of an existing array element.
Since you are trying to add an entire document, you are subject to this restriction.
So I see the following solutions for you:
Solution 1:
Read in the array, check whether it contains the element you want and, if not, push it to the values array with $push.
This has the disadvantage of NOT being an atomic operation, meaning that you could end up with duplicates anyway. That could be acceptable if you ran a periodic clean-up job to remove duplicates from this field on each document.
It's up to you to decide if this is acceptable.
Solution 2:
Assuming you are putting the field _id in the subdocuments of your values field, stop doing it. Assuming mongoose is doing this for you (because, from what I understand, it does), stop it as described here: Stop mongoose from creating _id for subdocument in arrays.
Next you need to ensure that the fields in the document always have the same order, because order matters when comparing documents in the addToSet operation as stated in the citation above.
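A quick plain-JavaScript sketch of why order matters (JSON.stringify preserves insertion order, so it mirrors the exact-match comparison $addToSet performs):

```javascript
// Two subdocuments with identical fields and values, but in a different order.
// JSON.stringify preserves insertion order, so it mirrors the exact-match
// comparison $addToSet performs.
const a = { day: 1, tag: "9YQ88GG" };
const b = { tag: "9YQ88GG", day: 1 };

console.log(JSON.stringify(a)); // {"day":1,"tag":"9YQ88GG"}
console.log(JSON.stringify(b)); // {"tag":"9YQ88GG","day":1}
console.log(JSON.stringify(a) === JSON.stringify(b)); // false: $addToSet would add both
```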
Solution 3
Change the schema of your band_profiles_history to something like:
{
"_id" : ObjectId("599e3bc406955db4cbffe0a8"),
"month" : 7,
"tag_lowercased" : "9yq88gg",
"year" : 2017,
"values" : {
"1": { "_id" : ObjectId("599e3bc41c073a7418fead91"),
"profile" : {
"_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
"tag" : "9YQ88GG",
"name_normalized" : "example name1"
}
},
"2": {
"_id" : ObjectId("599e3bc41c073a7418fead91"),
"profile" : {
"_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
"tag" : "9YQ88GG",
"name_normalized" : "new name"
}
}
}
Notice that the day field became the key for the subdocuments on the values. Notice also that values is now an Object instead of an Array.
Now you can run an update query that sets values.<day> only if values.<day> doesn't exist yet.
Personally I don't like this, as it leans on the fact that JSON doesn't allow duplicate keys to enforce the schema.
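The "only if values.<day> doesn't exist" logic can be sketched in plain JavaScript (setDayOnce is a made-up helper that mirrors an update whose filter includes { ["values." + day]: { $exists: false } }):

```javascript
// setDayOnce is a made-up helper: it writes the snapshot only when no entry
// for that day exists yet, mirroring an update filtered on
// { ["values." + day]: { $exists: false } }.
function setDayOnce(doc, day, snapshot) {
  if (Object.prototype.hasOwnProperty.call(doc.values, day)) {
    return false; // a snapshot for this day already exists -> no write
  }
  doc.values[day] = snapshot;
  return true;
}

const historyDoc = { month: 7, year: 2017, values: { 1: { name_normalized: "example name1" } } };

console.log(setDayOnce(historyDoc, 1, { name_normalized: "duplicate" })); // false: day 1 kept as-is
console.log(setDayOnce(historyDoc, 2, { name_normalized: "new name" }));  // true: day 2 added
```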
First of all, sadly MongoDB does not support uniqueness of a field within an array of a collection. You can see there is a major issue that has been open for 7 years and still isn't closed (which is a shame, in my opinion).
What you can do from here is limited, and it all happens at the application level. I had the same problem and solved it at the application level. Do something like this:
First, read your document by document _id and values.day.
If your read in step 1 returns null, that means there is no record in the values array for the given day, so you can push the new value (I assume band_profile_history already has a record with that _id value).
If your read in step 1 returns a document, that means the values array has a record for the given day. In that case you can use a $set operation with the positional $ operator.
Like others said, this will not be atomic, but since you are handling the problem at the application level, you can synchronize the whole block of code. Of the 3 queries below, only 2 will actually run against MongoDB:
db.getCollection('band_profiles_history').find({"_id": "1", "values.day": 3})
if returns null:
db.getCollection('band_profiles_history').update({"_id": "1"}, {$push: {"values": {<your new band profile history for given day>}}})
if returns not null:
db.getCollection('band_profiles_history').update({"_id": "1", "values.day": 3}, {$set: {"values.$": {<your new band profile history for given day>}}})
To check if the object is empty:
{ field: { $exists: false } }
or, if it is an array:
{ field: { $eq: [] } }
Mongoose also supports field: { type: Date }, so you can use that instead of counting days and run updates only for the current date.

Can sorting before grouping improve query performance in Mongo using the aggregate framework?

I'm trying to aggregate data for 100 accounts for a 14-15 month period, grouping by year and month.
However, the query performance is horrible as it takes 22-27 seconds. There are currently over 15 million records in the collection and I've got an index on the match criteria and can see using explain() that the optimizer uses it.
I tried adding another index on the sort criteria in the query below, and after adding it the query now takes over 50 seconds! This happens even after I remove the sort from the query.
I'm extremely confused. I thought that because grouping can't utilize an index, the grouping could be much faster if the collection was sorted beforehand. Is this assumption correct? If not, what other options do I have? I can tolerate query times of up to 5 seconds, but nothing more than that.
//Document Structure
{
  Acc: 1,
  UIC: true,
  date: ISODate("2015-12-01T05:00:00Z"),
  y: 2015,
  mm: 12,
  value: 22.3
}
//Query
db.MyCollection.aggregate([
{ "$match" : { "UIC" : true, "Acc" : { "$in" : [1, 2, 3, ..., 99, 100] }, "date" : { "$gte" : ISODate("2015-12-01T05:00:00Z"), "$lt" : ISODate("2017-02-01T05:00:00Z") } } },
//{ "$sort" : { "UIC" : 1, "Acc" : 1, "y" : -1, "mm" : 1 } },
{ "$group" : { "_id" : { "Acc" : "$Acc", "Year" : "$y", "Month" : "$mm" }, "Sum" : { "$sum" : "$value" } } }
])
What I would suggest is to write a script (it can be in Node.js) that aggregates the data into a different collection. When you have these long-running queries, it's advisable to maintain a separate collection containing the aggregated data and query that instead.
My second piece of advice would be to create a composite key in this aggregated collection and search it by regular expression. In your case I would make the key contain accountId:period. For example, for account 1 and February 2016, the key would be 1:201602.
Then you would be able to query by account and period using regular expressions. For instance, if you wanted the records for account 1 in 2016, you could do something like:
db.aggregatedCollection.find({ _id: /^1:2016/ })
Hope my answer was helpful.
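As a sketch of the idea (periodKey is a made-up helper name), building and prefix-matching such composite keys looks like this in plain JavaScript:

```javascript
// periodKey is a made-up helper that builds the suggested composite key,
// "<accountId>:<yyyymm>", which a left-anchored regex can then prefix-match.
function periodKey(accountId, year, month) {
  return `${accountId}:${year}${String(month).padStart(2, "0")}`;
}

const keys = [periodKey(1, 2016, 2), periodKey(1, 2016, 11), periodKey(2, 2016, 2)];

// Equivalent of find({ _id: /^1:2016/ }): account 1, any month of 2016
const account1In2016 = keys.filter((k) => /^1:2016/.test(k));
console.log(account1In2016); // [ '1:201602', '1:201611' ]
```

Note that only a left-anchored pattern such as /^1:2016/ can use the index on _id; an unanchored pattern forces a full scan.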

Mongodb: Get documents sorted by a dynamic ranking

I have these documents:
{ "_id" : ObjectId("52abac78f8b13c1e6d05aeed"), "score" : 125494, "updated" : ISODate("2013-12-14T00:55:20.339Z"), "url" : "http://pictwittrer.com/1crfS1t" }
{ "_id" : ObjectId("52abac86f8b13c1e6d05af0f"), "score" : 123166, "updated" : ISODate("2013-12-14T00:55:34.354Z"), "url" : "http://bit.ly/JghJ1N" }
Now, I would like to get all documents sorted by this dynamic ranking:
ranking = score / (NOW - updated).abs
ranking is a float value where:
- score is the value of the score property of my document
- the denominator is the difference between NOW (when I execute the query) and the updated field of my document
I want this because old documents should be sorted last.
I'm new to Mongodb and aggregation frameworks but considering the answer Tim B gave I came up with this:
db.coll.aggregate([
  { $project : {
      "ranking" : {
        "$divide" : ["$score", { "$subtract" : [new Date(), "$updated"] }]
      }
  }},
  { $sort : { "ranking" : -1 } }  // -1 = descending, so older documents sort last
])
Using $project you can reshape documents to insert precomputed values, in your case the ranking field. After that using $sort you can sort the documents by rank in the order you like by specifying 1 for ascending or -1 for descending.
I'm sorry for the terrible code formatting, I tried to make it as readable as possible.
Look at the MongoDB aggregation framework, you can do a project to create the score you want and then a sort to sort by that created score.
http://docs.mongodb.org/manual/core/aggregation-pipeline/
http://docs.mongodb.org/manual/reference/command/aggregate/#dbcmd.aggregate
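For clarity, here is an in-memory plain-JavaScript sketch of the same ranking (the fixed now value is just there to make the example deterministic; the documents are the two from the question):

```javascript
// In-memory sketch of ranking = score / (now - updated), sorted descending
// so that older documents (larger denominator, smaller ranking) land last.
const now = Date.parse("2013-12-15T00:00:00Z"); // fixed "now" for a deterministic example

const docs = [
  { url: "http://pictwittrer.com/1crfS1t", score: 125494, updated: Date.parse("2013-12-14T00:55:20.339Z") },
  { url: "http://bit.ly/JghJ1N",           score: 123166, updated: Date.parse("2013-12-14T00:55:34.354Z") },
];

const ranked = docs
  .map((d) => ({ ...d, ranking: d.score / Math.abs(now - d.updated) }))
  .sort((a, b) => b.ranking - a.ranking); // descending, like { $sort: { ranking: -1 } }

console.log(ranked.map((d) => d.url));
```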

Aggregating by a field and selecting the document with the max value of another field as a collection

Using the aggregation framework, what is the best way to get the documents with the maximum value of a field per grouping? Using the collection below, I would like to return, for each group_id, the one document having the latest date. The second listing shows the desired result.
group_id date
1 11/1/12
1 11/2/12
1 11/3/12
2 11/1/12
3 11/2/12
3 11/3/12
DESIRED RESULT
group_id date
1 11/3/12
2 11/1/12
3 11/3/12
You can use the $max accumulator in the Aggregation Framework to find the latest date for each group_id. You will then need additional queries to retrieve the full documents matching the grouped criteria.
var results = [];
db.groups.aggregate([
  // Find the latest date for each group_id
  { $group: {
      _id: '$group_id',
      date: { $max: '$date' }
  }},
  // Rename _id to group_id, so the result can be used as find criteria
  { $project: {
      _id: 0,
      group_id: '$_id',
      date: '$date'
  }}
]).forEach(function(match) {
  // Find the matching document for each group and push it onto the results array
  results.push(db.groups.findOne(match));
});
Example results:
{
"_id" : ObjectId("5096cfb8c24a6fd1a8b68551"),
"group_id" : 1,
"date" : ISODate("2012-11-03T00:00:00Z"),
"foo" : "bar"
}
{
"_id" : ObjectId("5096cfccc24a6fd1a8b68552"),
"group_id" : 2,
"date" : ISODate("2012-11-01T00:00:00Z"),
"foo" : "baz"
}
{
"_id" : ObjectId("5096cfddc24a6fd1a8b68553"),
"group_id" : 3,
"date" : ISODate("2012-11-03T00:00:00Z"),
"foo" : "bat"
}
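As a side note: on MongoDB 2.6 and newer you can get the whole winning document in a single pipeline by sorting on date descending and then grouping with { $first: "$$ROOT" }, avoiding the extra findOne() round-trips. An in-memory plain-JavaScript sketch of that sort-then-take-first idea (sample documents mirror the question's listing):

```javascript
// In-memory sketch of "sort by date descending, take the first per group":
// the effect of { $sort: { date: -1 } } followed by
// { $group: { _id: "$group_id", doc: { $first: "$$ROOT" } } }.
const docs = [
  { group_id: 1, date: new Date("2012-11-01") },
  { group_id: 1, date: new Date("2012-11-02") },
  { group_id: 1, date: new Date("2012-11-03") },
  { group_id: 2, date: new Date("2012-11-01") },
  { group_id: 3, date: new Date("2012-11-02") },
  { group_id: 3, date: new Date("2012-11-03") },
];

const latestPerGroup = {};
for (const doc of [...docs].sort((a, b) => b.date - a.date)) {
  if (!(doc.group_id in latestPerGroup)) {
    latestPerGroup[doc.group_id] = doc; // $first wins after the descending sort
  }
}

console.log(Object.values(latestPerGroup).map((d) => [d.group_id, d.date.toISOString().slice(0, 10)]));
// [ [ 1, '2012-11-03' ], [ 2, '2012-11-01' ], [ 3, '2012-11-03' ] ]
```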