Get the day of the week from a MongoDB datetime query

I am constructing a database where I might want to query by day of the week. Is it possible to use MongoDB to query for days of the week in a datetime (or UTC timestamp) field?
Something like: get every object that has a datetime that falls on a Monday.
If that is not possible, the alternative seems to be to create dummy variables in the collection that record what day of the week it was. Preferably I would like to query only the datetime object, as this would keep the database smaller.

There are three solutions that I can think of:
1. Your solution: create an extra "day_of_week" field, either an int or string, and then query against this field rather than the datetime field.
2. Query for everything in your collection, and then filter the results by day of the week on the client side.
3. Use $where, passing a javascript function which calls date.getDay(). For example, {$where: function () { return this.date.getDay() == 5; }} for getting every date on a Friday.
Solution #2 would call datetime.date.weekday() in pymongo on the client side. The downside of this method is that every document in the collection will end up being sent over the wire, which could add unnecessary network load. It's better than #1, however, in that it's more space efficient and you don't have duplicated information to keep in sync. Solution #3 has neither of these problems, but $where is slow because it requires the server to create a JavaScript execution context and cannot make use of indexes.
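To make the client-side option concrete, here is a minimal sketch of the filtering step in Python. The documents are hypothetical stand-ins for what a driver would return, and the field name "date" and the helper on_weekday are assumptions for illustration. Note that Python's weekday() numbers Monday as 0 and Friday as 4, whereas JavaScript's getDay() numbers Sunday as 0 and Friday as 5.

```python
from datetime import datetime

# Hypothetical documents, shaped like what a driver might hand back.
docs = [
    {"_id": 1, "date": datetime(2013, 9, 16, 10, 0)},   # a Monday
    {"_id": 2, "date": datetime(2013, 9, 20, 9, 30)},   # a Friday
    {"_id": 3, "date": datetime(2013, 9, 23, 18, 45)},  # a Monday
]

MONDAY = 0  # datetime.weekday(): Monday is 0, Sunday is 6
FRIDAY = 4


def on_weekday(documents, weekday, field="date"):
    """Keep only the documents whose datetime field falls on the given weekday."""
    return [d for d in documents if d[field].weekday() == weekday]


mondays = on_weekday(docs, MONDAY)
fridays = on_weekday(docs, FRIDAY)
```

Every document still has to reach the client before this filter runs, which is exactly the network cost described above.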

Pymongo can return Mongo BSON timestamp fields as python datetimes: http://api.mongodb.org/python/current/api/bson/timestamp.html
From there you can call datetime.date.weekday()
http://docs.python.org/2/library/datetime.html#datetime.date

Related

Should I use the timestamp in "_id"?

I need to track when records were created, for later querying and modification.
The first thing that came to mind was to give the document a "createDateTime" field with a default value of "new Date()". But apparently the document's _id has a timestamp embedded in it, and the id is generated when the document is created, so it seems redundant to add a new field for that.
I've seen many people set a "createDateTime" on their data, and I don't know whether they are aware of the details of MongoDB's _id.
Should I use the _id as a "createDateTime" field? What is the best practice, and what are the pros and cons?
Thanks for any tips.
I'd actually say it depends on how you want to use the date.
For example, the timestamp inside _id can't be used with the aggregation framework's date operators.
This will fail for example:
db.test.aggregate( { $group : { _id: { $year: "$_id" } } })
The following error occurs:
"errmsg" : "exception: can't convert from BSON type OID to Date"
(The date cannot be extracted from the ObjectId.)
So, operations that are normally simple date operations become much more complex if you want to do any sort of date math in an aggregation. It would be far easier to have a createDateTime stamp. Counting the number of documents created in a particular year and month would be simple using aggregation with a distinct createDateTime field.
You can sort on an ObjectId, to some degree: the first 4 bytes are a timestamp in seconds, but the remaining 8 bytes aren't sortable in a meaningful way. Most MongoDB drivers default to creating the ObjectId within the driver and not on the database. So, if you've got multiple clients (like web servers, for example) creating new documents (and new ObjectIds), the timestamps will only be as accurate as the clocks on the various servers.
Also, depending on the precision you need, note that an ISODate value is stored using 8 bytes (millisecond precision), rather than the 4 bytes (second precision) used for the timestamp in an ObjectId.
Yes, you should. There is no reason not to, besides human readability when looking directly into the database. See also here and here.
If you want to use the aggregation framework to group by the date within _id, this is not possible yet, as WiredPrairie correctly said. There is an open JIRA ticket for that, which you might watch. But of course you can do this with Map-Reduce and ObjectID.getTimestamp(). An example for that can be found here.
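For reference, the timestamp that both answers talk about can be read straight out of the id: the first 4 bytes (8 hex characters) of an ObjectId are the creation time in seconds since the Unix epoch. Below is a minimal sketch in plain Python, without any driver. The function name objectid_created_at and the example id are made up for illustration.

```python
from datetime import datetime, timezone


def objectid_created_at(oid_hex):
    """Return the creation time embedded in a 24-character hex ObjectId string."""
    if len(oid_hex) != 24:
        raise ValueError("expected a 24-character hex string")
    # The first 4 bytes are a big-endian count of seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Made-up id for illustration: 4 timestamp bytes followed by 8 arbitrary bytes.
example_id = "523b1234" + "0123456789abcdef"
created = objectid_created_at(example_id)
```

The result has one-second resolution only, which is the precision trade-off against a separate 8-byte date field mentioned above.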

Sort collection by insertion datetime using only id field

I have a collection of data and I want to get it sorted by insertion time. I don't have any additional fields that store the insert time, but I found out that I can get this time from the id.
I have tried this code:
return bookmarks.find({}, {sort: {_id.getTimestamp(): 1}, limit: 10});
or
return bookmarks.find({}, {sort: {ObjectId(_id).getTimestamp(): 1}, limit: 10});
but get the error message:
=> Your application has errors. Waiting for file change.
Is there any way to sort a collection by insertion datetime using only the id field?
At the moment this isn't possible with Meteor, even if it is with MongoDB. The ObjectIDs created with Meteor don't bear a timestamp. See http://docs.meteor.com/#collection_object_id
The reason for this is that client-side code can insert documents, and they can arrive late on the server, hence there is no guarantee that the timestamp portion of the ObjectID will be accurate. In addition to the latency, the client's date is used, meaning that if its clock is off you are going to get incorrect data. I think this is the reason they use an ObjectID that is completely random.
If you want to sort by date you have to store the time/date separately.
The part that I struck out is not accurate. Meteor uses its own id generation, which is based on a random string, so the doc I linked before does not apply. See sasha.sochka's comment below.
Sorting on the _id field is nearly, but not 100%, accurate. By construction, the first 4 bytes of the id are the timestamp in seconds (so sorting on the getTimestamp() value is no better). Below one-second resolution you cannot get the exact order, as mentioned in the documentation: http://docs.mongodb.org/manual/reference/object-id/#objectid
It is still true that you can check the exact order of the insert/update operations against your collection in the oplog, if you have one, but as it is a capped collection you will see only the recent operations. http://docs.mongodb.org/manual/core/replica-set-oplog/
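A small sketch of why sorting on _id only approximates insertion order. The ids below are made-up hex strings standing in for timestamp-bearing ObjectIds (not Meteor's random ids). Sorting them orders by the leading timestamp bytes first, and ids created within the same second fall back to their remaining bytes, which say nothing reliable about which was inserted first.

```python
# Made-up ids: 8 hex characters of timestamp followed by 16 arbitrary ones.
ids = [
    "523b1236" + "aaaaaaaaaaaaaaaa",
    "523b1234" + "ffffffffffffffff",
    "523b1234" + "0000000000000000",
]

# Lowercase hex strings of equal length sort in the same order as their bytes.
by_id = sorted(ids)

# The embedded seconds never decrease along the sorted list...
seconds = [int(i[:8], 16) for i in by_id]

# ...but the first two ids share a second, so their relative order comes only
# from the trailing bytes, not from the actual order of insertion.
same_second = seconds[0] == seconds[1]
```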

Querying on Date in Mongo

I'm inserting a Mongo doc with the following time-stamp:
val format = new java.text.SimpleDateFormat("yyyyMMddHHmmss")
format.format(new Date()).toLong
Here's what the section looks like from Mongo's shell:
{ "Timestamp" : NumberLong("20130919161948") }
Based on a few tests, it appears to me that I can simply compare 2 documents by Timestamp by simply checking > or < for the yyyyMMddHHmmss format.
Please let me know if this time-stamp is OK for Mongo. Will I be able to query with it?
Mongo will not understand this as a timestamp, but as a number. Because your format runs from year down to seconds, you will be able to query Mongo using > or < to find out whether one value is before or after another.
However, if you want Mongo to treat the data as a date, you will need to use the appropriate BSON date type. By having Mongo treat it as a date, you will have all of Mongo's date operations available, like extracting the year, day of week, etc.
If you are using casbah, and Joda, you can enable serialization and deserialization by an explicit call:
import com.mongodb.casbah.conversions.scala._
RegisterJodaTimeConversionHelpers()
Read more here.
@Kevin, I think you are right. java.util.Date is supported in BSON objects.
Using NumberLong to represent the timestamp allows you to do range queries, but with the BSON date type, date operations in the aggregation framework become possible, which is more powerful.
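To illustrate both points, here is a minimal Python sketch (not tied to Casbah or any driver). It checks that yyyyMMddHHmmss numbers compare in chronological order, and shows what you gain by parsing one into a real date. The helper name parse_numeric_stamp and the second sample value are assumptions for illustration.

```python
from datetime import datetime


def parse_numeric_stamp(value):
    """Turn a yyyyMMddHHmmss number into a naive datetime."""
    return datetime.strptime(str(value), "%Y%m%d%H%M%S")


earlier = 20130919161948  # the value from the question
later = 20131001080000    # hypothetical second document

# Numeric order matches chronological order for this fixed-width layout.
numeric_order_ok = (earlier < later) == (
    parse_numeric_stamp(earlier) < parse_numeric_stamp(later)
)

# Once it is a real date, calendar questions become straightforward.
parsed = parse_numeric_stamp(earlier)
weekday = parsed.weekday()  # Monday is 0, so Thursday is 3
```

The number alone supports range comparisons but not questions like "which weekday was this", which is the case for storing a proper date type.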

How to query date saved as text in bad date format in mongoDB

I am very new to MongoDB.
I have a database with a sale_date field whose value is saved as text in the format "dd:mm:yyyy". Now I want to query based on the date; for example, I want to query last month's entries.
I also have a sale_time field, also saved as text, in the format "hh:mm", and I want to query the last hour's entries.
I want to query from Java and also from the mongo console.
One row of my collection:
{
"_id":112350,
"sale_date":"21.07.2011",
"sale_time":"18:50",
"store_id":"OK3889-45",
"region_code":45,
"product_id":"QKDGLHX5061",
"product_catagorie":53,
"no_of_product":1,
"price":1211.37,
"total_price":1211.37
}
I have millions of entries. Now I want to find the entries for the month of July 2011, or for the hour from 18:00 to 19:00 on 21.07.2013.
You can query with a regex matching your results. You said the format is dd:mm:yyyy, but the example looks like dd.mm.yyyy, so I used that in the examples.
For example:
db.sales.find({sale_date: /..\.07\.2011/})
This will be inefficient since it can't use an index, but it will get the job done.
If you stick with dates as strings, it would be better to reverse the order to yyyy.mm.dd; then you could use an anchored regex, which will hit an index, like:
db.sales.find({sale_date: /^2011\.07/})
For the hour query:
db.sales.find({sale_date: "21.07.2013", sale_time: {$gte: "18:00", $lt: "19:00"}})
There is no efficient and reliable way to query for the date range you want given the date structure you have used. A regex-style query would scan through the entire collection, for example, and if you have millions of documents, that's not acceptable.
You could theoretically create a MapReduce job to restructure the data into a new collection. But that will be more work to maintain (MapReduce results aren't automatically updated), and it may make other queries and data fetching involve more steps than you'd like.
If you're willing to do that, though, I'd strongly suggest you instead just fix your data, as I mentioned in my comment, to use a standard YYYYMMDD format. Even better, you may want to consider merging the date and time into a single timestamp field:
2013-07-21T14:30
If you don't though, you can still do the single date query reasonably (although you'd want to index both the date and time as a compound index):
db.sales.ensureIndex({ sale_date: 1, sale_time: 1})
Regarding the code, it's basically going to look like this:
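import com.mongodb.BasicDBObject;
import com.mongodb.DBCursor;
import java.util.ArrayList;
import java.util.List;

// Legacy MongoDB Java driver; "coll" is assumed to be an existing DBCollection.
BasicDBObject date = new BasicDBObject("sale_date", "21.07.2013");
BasicDBObject time = new BasicDBObject("sale_time",
        new BasicDBObject("$gte", "18:00")
                .append("$lte", "19:00"));
// Combine both conditions with $and.
BasicDBObject andQuery = new BasicDBObject();
List<BasicDBObject> obj = new ArrayList<BasicDBObject>();
obj.add(date);
obj.add(time);
andQuery.put("$and", obj);
DBCursor cursor = coll.find(andQuery);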
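As for fixing the data itself, here is a rough Python sketch of the conversion step, assuming the dd.mm.yyyy and hh:mm layouts from the sample row. The second document and the helper name parse_sale are hypothetical, and how you write the converted value back to the collection is left out.

```python
from datetime import datetime


def parse_sale(doc):
    """Combine the text sale_date and sale_time fields into one datetime."""
    text = doc["sale_date"] + " " + doc["sale_time"]
    return datetime.strptime(text, "%d.%m.%Y %H:%M")


sales = [
    {"_id": 112350, "sale_date": "21.07.2011", "sale_time": "18:50"},
    {"_id": 112351, "sale_date": "02.08.2011", "sale_time": "09:15"},  # hypothetical
]

# With real datetimes, "July 2011" is a plain half-open range check.
start, end = datetime(2011, 7, 1), datetime(2011, 8, 1)
july_ids = [s["_id"] for s in sales if start <= parse_sale(s) < end]

# A sortable text form, if the value has to stay a string.
iso = parse_sale(sales[0]).isoformat(timespec="minutes")
```

Stored either as a real date or as a year-first string like this, the combined field can be indexed and range-queried without a regex scan.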

What is the fastest way to see when the last update to MongoDB was made

I'm writing a long-polling script to check for new documents in my mongo collection and the only way I know of checking to see whether or not changes have been made in the past iteration of my loop is to do a query getting the last document's ID and parsing out the timestamp in that ID and seeing if it's greater than the timestamp left since I last made a request.
Is there some kind of approach that doesn't involve making a query every time or something that makes that query the fastest?
I was thinking a query like this:
db.chat.find().sort({_id:-1}).limit(1);
But it would be using the PHP driver.
The fastest way will be to create an index on the timestamp field.
Creating the index:
db.posts.ensureIndex( { timestamp : 1 } )
This optimizes the following query:
db.posts.find().sort( { timestamp : -1 } )
findOne gives you only one document, the one with the last timestamp.
Glad to help.
@Zagorulkin Your suggestion is surely going to help in the required scenario. However, I don't think sort() works with findOne().