Querying on Date in Mongo - mongodb

I'm inserting a Mongo doc with the following time-stamp:
val format = new java.text.SimpleDateFormat("yyyyMMddHHmmss")
format.format(new Date()).toLong
Here's what the section looks like from Mongo's shell:
{ "Timestamp" : NumberLong("20130919161948") }
Based on a few tests, it appears that I can compare two documents by Timestamp simply by checking > or < on the yyyyMMddHHmmss value.
Please let me know if this timestamp format is OK for Mongo. Will I be able to query with it?

Mongo will not understand this as a timestamp, only as a number. Because your format runs from year down to seconds, you can still query with > or < to find out whether one value is before or after another.
However, if you want Mongo to treat the data as a date, you need to use the BSON date type. Then all of Mongo's date operations become available, such as extracting the year, the day of the week, etc.
If you are using Casbah and Joda, you can enable serialization and deserialization with an explicit call:
import com.mongodb.casbah.conversions.scala._
RegisterJodaTimeConversionHelpers()

@Kevin, I think you are right. java.util.Date is supported as a BSON type.
Using NumberLong to represent the timestamp allows range queries, but with the BSON date type, date operations in the aggregation framework become possible, which is more powerful.
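To make the trade-off in the answers above concrete, here is a minimal Python sketch (standard library only, no database needed). The helper name packed_to_datetime and the second sample value are made up for illustration. It shows that the packed yyyyMMddHHmmss number orders the same way as the real date, while only the date object exposes parts such as the year or weekday.

```python
from datetime import datetime

def packed_to_datetime(packed: int) -> datetime:
    """Turn a yyyyMMddHHmmss number (hypothetical helper) into a datetime."""
    return datetime.strptime(str(packed), "%Y%m%d%H%M%S")

earlier = 20130919161948   # value from the question
later = 20131001090000     # made-up second value for comparison

# Numeric order and chronological order agree for this fixed-width format.
same_order = (earlier < later) == (
    packed_to_datetime(earlier) < packed_to_datetime(later)
)

# Date parts are only available once the value is a real date.
as_date = packed_to_datetime(earlier)
year, weekday_index = as_date.year, as_date.weekday()
```

A datetime like as_date is what a driver such as PyMongo stores as a BSON date.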

Related

Converting date string (YYYY-MM-DD) to datetime object pymongo

Before mass inserting using insert_many(), I need to change the "Date" field of each document from a string in the format 'YYYY-MM-DD' (for example '2020-02-28') to a datetime object that Mongo can use later.
Is there a way of doing this using pymongo?
My idea would look something like this:
dict["Date"] = Mongo_Date(dict["Date"]) #converting the original string to a date object
outputList.append(dict)
#Later on in code
mycol.insert_many(outputList)
Is there an easy way of doing this with pymongo?
A couple of possibilities come to mind:
use the python map function to modify all of the objects at once
insert the objects into MongoDB, and then use update with $dateFromString to modify them
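The first possibility can be sketched as follows, assuming every document carries a "Date" string in 'YYYY-MM-DD' form. The helper name with_parsed_dates and the sample documents are invented for illustration; mycol is the collection from the question and is only referenced in a comment so that the snippet runs without a database.

```python
from datetime import datetime

def with_parsed_dates(docs, field="Date", fmt="%Y-%m-%d"):
    """Return copies of docs with the string in `field` parsed to a datetime.

    Hypothetical helper: raises ValueError if a string does not match fmt.
    """
    return [{**doc, field: datetime.strptime(doc[field], fmt)} for doc in docs]

raw_docs = [
    {"Date": "2020-02-28", "value": 1},   # made-up sample documents
    {"Date": "2020-03-01", "value": 2},
]
output_list = with_parsed_dates(raw_docs)

# With a live collection, PyMongo stores these datetimes as BSON dates:
# mycol.insert_many(output_list)
```

Parsing before the insert avoids a second pass over the collection, which the $dateFromString route would need.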

Query mongo ISODate using play reactive mongo

I am trying to query dates in mongodb.
The dates are stored as ISODate("2015-10-08T05:48:55.778+0000").
Now how should I write a query with $gte or $lte?
I have been using the Play plugin for ReactiveMongo.
To query from the mongo shell, I would use:
{"endDateTime": {"$eq": new Date("2017-10-08T05:48:55.778+0000")}}
or
{"endDateTime": {"$eq": ISODate("2017-10-08T05:48:55.778+0000")}}
So, what should I do to run this query using Play ReactiveMongo? I have been using JodaTime. I am generating the JSON object of the query and feeding it straight to the find() API.
*Yes, there are a lot of suggestions on SO about this topic, but none of them seems to help in this case. I can give more info if needed.
Update Answer:
It seems I was confused when converting the dates.
When I converted the date string to a Joda DateTime and printed it to the console, it was shown as a timestamp, but when I sent it to ReactiveMongo's find it was converted to a date string of the form "2015-10-08T05:48:55.778+0000".
So I had to retrieve the value in milliseconds and send that to the API, and Mongo processed it without any issues.
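The millisecond conversion described in the update can be illustrated in Python (not the Scala/Joda code the asker used). This is a sketch only: the helper name to_epoch_millis is invented, and whether a given JSON layer accepts the {"$date": millis} form inside a query is an assumption to verify against the driver in use.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_millis(iso_text: str) -> int:
    """Parse an ISO string with a numeric UTC offset into epoch milliseconds."""
    moment = datetime.strptime(iso_text, "%Y-%m-%dT%H:%M:%S.%f%z")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)

millis = to_epoch_millis("2015-10-08T05:48:55.778+0000")

# Assumed query shape: the date is sent as milliseconds, not as a string.
query = {"endDateTime": {"$gte": {"$date": millis}}}
```

Sending the number rather than the formatted string is what keeps the comparison a date comparison on the server side.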

querying documents in a date range using Mongo built-in 'timestamp'

I'm aware that the Mongo "ObjectId" has the method "getTimestamp()" , which works like
ObjectId("507f191e810c19729de860ea").getTimestamp()
And also I'm aware that it can be sorted based on built-in 'timestamp'
db.collection.find().sort({'timestamp': -1})
I know I can create a new field "created_time" in each document by converting the ObjectId to a creation time, then query based on this new field.
I've also read this post, which converts the date range to ObjectIds and then compares the ObjectIds directly, but with this method I'm worried about the other bytes, which encode not the time but the machine and process.
My question is: is there a way to directly query documents in a date range using Mongo's built-in 'timestamp', without an extra field or extra effort?
Something like the command below (which I tried, but it does not work), querying Mongo directly on its built-in timestamp:
db.collection.find({'timestamp':{$gt: new Date(ISODate("2015-08-14T14:00:00Z"))}})
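The ObjectId approach mentioned in the question can be sketched in Python. An ObjectId starts with a 4-byte big-endian count of seconds since the Unix epoch, so padding that prefix with zeros gives the smallest possible id for a given second, and the remaining bytes cannot push a document outside the range. The helper names below are invented; with PyMongo installed, bson.ObjectId.from_datetime builds a comparable boundary id.

```python
from datetime import datetime, timezone

def object_id_floor(moment: datetime) -> str:
    """Smallest ObjectId hex string for the second containing `moment`.

    Hypothetical helper; `moment` must be timezone-aware.
    """
    seconds = int(moment.timestamp())
    return format(seconds, "08x") + "0" * 16   # 8 hex digits of time, rest zero

def object_id_time(hex_id: str) -> datetime:
    """Read the creation time back from the first 4 bytes of an ObjectId."""
    return datetime.fromtimestamp(int(hex_id[:8], 16), tz=timezone.utc)

start = datetime(2015, 8, 14, 14, 0, tzinfo=timezone.utc)
end = datetime(2015, 8, 15, 14, 0, tzinfo=timezone.utc)
start_hex, end_hex = object_id_floor(start), object_id_floor(end)

# With PyMongo, the range query on _id would then look like:
# collection.find({"_id": {"$gte": ObjectId(start_hex), "$lt": ObjectId(end_hex)}})
```

Because _id is always indexed, this range scan needs no extra field. It only works when _id values are driver-generated ObjectIds.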

How to query date saved as text in bad date format in mongoDB

I am very new to MongoDB.
I have a database with a sale_date field whose value is saved as text in the format "dd:mm:yyyy". Now I want to query based on the date, for example to get last month's entries.
I also have a sale_time field, also saved as text, in the format "hh:mm", and I want to query the last hour's entries.
**I want to query from Java and also from the mongo console.
One row of my collection:
{
"_id":112350,
"sale_date":"21.07.2011",
"sale_time":"18:50",
"store_id":"OK3889-45",
"region_code":45,
"product_id":"QKDGLHX5061",
"product_catagorie":53,
"no_of_product":1,
"price":1211.37,
"total_price":1211.37
}
I have millions of entries. Now I want to find the entries for the month of July 2011, or for the hour from 18:00 to 19:00 on 21.07.2013.
You can query with a regex matching your results. You said the format is dd:mm:yyyy, but the example looks like dd.mm.yyyy, so I used that in the examples.
For example:
db.sales.find({sale_date: /..\.07\.2011/})
This will be inefficient since it can't use an index, but it will get the job done.
If you stick with dates as strings, it would be better to reverse the order to yyyy.mm.dd; then you could use an anchored regex, which can use an index:
db.sales.find({sale_date: /^2011\.07/})
For the hour query:
db.sales.find({sale_date: "21.07.2013", sale_time: {$gte: "18:00", $lt: "19:00"}})
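Why the reversed order helps can be shown with a small Python sketch (standard library only). The helper name to_sortable and the sample dates are invented. Strings in yyyy.mm.dd order sort chronologically and share a common prefix per month, which is what lets an anchored regex walk an index.

```python
def to_sortable(sale_date: str) -> str:
    """Rewrite 'dd.mm.yyyy' as 'yyyy.mm.dd' (hypothetical helper)."""
    day, month, year = sale_date.split(".")
    return f"{year}.{month}.{day}"

dates = ["21.07.2011", "03.08.2011", "15.01.2012"]   # already chronological
sortable = [to_sortable(d) for d in dates]

# String order matches chronological order only for the reversed form.
reversed_form_sorts_correctly = sorted(sortable) == sortable
original_form_sorts_correctly = sorted(dates) == dates

# A month is now a simple prefix, the equivalent of the anchored regex.
july_2011 = [s for s in sortable if s.startswith("2011.07")]
```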
There is no efficient and reliable way to query for the date range you want given the date structure you have used. A regex-style query would scan the entire collection, for example, and if you have millions of documents, that's not acceptable.
You could theoretically create a MapReduce job to restructure the data into a new collection. But that will be more work to maintain (MapReduce results aren't automatically updated, and other queries and data fetching may involve more steps than you'd like).
If you're willing to do that, though, I'd strongly suggest you instead just fix your data, as I mentioned in my comment, to use a standard YYYYMMDD format. Even better, you may want to consider merging the date and time into a single timestamp field:
2013-07-21T14:30
If you don't, though, you can still do the single-date query reasonably (although you'd want a compound index on both the date and time):
db.sales.createIndex({ sale_date: 1, sale_time: 1})
Regarding the code, it's basically going to look like this:
// Match the exact date string and a lexicographic range on the time string.
BasicDBObject date = new BasicDBObject("sale_date", "21.07.2013");
BasicDBObject time = new BasicDBObject("sale_time",
        new BasicDBObject("$gte", "18:00")
                .append("$lte", "19:00"));
// Combine both conditions with $and.
List<BasicDBObject> obj = new ArrayList<BasicDBObject>();
obj.add(date);
obj.add(time);
BasicDBObject andQuery = new BasicDBObject("$and", obj);
// coll is the DBCollection holding the sales documents.
DBCursor cursor = coll.find(andQuery);
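The suggestion to merge date and time into one field can be sketched in Python. The field name sold_at and the helper sale_datetime are invented for illustration, and the filter is only built, not run, so no database is needed. Once the value is a real datetime, both the month query and the hour query become plain range filters.

```python
from datetime import datetime, timedelta

def sale_datetime(doc: dict) -> datetime:
    """Combine 'dd.mm.yyyy' and 'hh:mm' strings into one datetime (hypothetical helper)."""
    return datetime.strptime(
        f'{doc["sale_date"]} {doc["sale_time"]}', "%d.%m.%Y %H:%M"
    )

doc = {"_id": 112350, "sale_date": "21.07.2011", "sale_time": "18:50"}
sold_at = sale_datetime(doc)

# Range filters on the merged field (assumed name: sold_at).
hour_start = datetime(2011, 7, 21, 18, 0)
hour_filter = {"sold_at": {"$gte": hour_start, "$lt": hour_start + timedelta(hours=1)}}
month_filter = {"sold_at": {"$gte": datetime(2011, 7, 1), "$lt": datetime(2011, 8, 1)}}

in_hour = hour_filter["sold_at"]["$gte"] <= sold_at < hour_filter["sold_at"]["$lt"]
```

Using a half-open range ($gte with $lt) keeps 19:00 itself out of the 18:00 hour.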

Convert a ISODate string to mongoDB native ISODate data type

My application generates logs in JSON format. The logs look something like this:
{"LogLevel":"error","Datetime":"2013-06-21T11:20:17Z","Module":"DB","Method":"ExecuteSelect","Request":"WS_VALIDATE","Error":"Procedure or function 'WS_VALIDATE' expects parameter '#LOGIN_ID', which was not supplied."}
Currently, I'm pushing in the aforementioned log line as it is into mongoDB. But mongoDB stores the Datetime as a string (which is expected). Now that I want to run some data crunching job on these logs, I'd prefer to store the Datetime as mongoDB's native ISODate data type.
There are three ways I can think of for doing this:
i) Parse every JSON log line, convert the string to the ISODate type in the application code, and then insert it. Cons: I'll have to parse each and every line before pushing it to mongoDB, which is going to be a little expensive.
ii) After every insert, run a query to convert the last inserted document's date-time string to ISODate using
element.Datetime = ISODate(element.Datetime);
Cons: again expensive, as I'd be running one extra query per insert.
iii) Modify my logs at the point of generation so that I don't have to do any parsing at the application code level or run an update query after every insert.
Also, just curious: is there a way I can configure mongoDB to auto-convert datetime strings to its native ISODate format?
TIA
EDIT:
I'm using pymongo for inserting the json logs
My file looks something like this :
{"LogLevel":"error","Datetime":"2013-06-21T11:20:17Z","Module":"DB","Method":"ExecuteSelect","Request":"WS_VALIDATE","Error":"Procedure or function 'WS_VALIDATE' expects parameter '#LOGIN_ID', which was not supplied."}
There are hundreds of lines like the one mentioned above.
And this is how I'm inserting them into mongodb:
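for line in logfile:
    collection.insert_one(json.loads(line))
The following will fix my problem:
import json
from datetime import datetime

for line in logfile:
    data = json.loads(line)
    # Parse the ISO 8601 string into a datetime so it is stored as a BSON date.
    data["Datetime"] = datetime.strptime(data["Datetime"], "%Y-%m-%dT%H:%M:%SZ")
    collection.insert_one(data)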
What I want to do is get rid of the extra manipulation of datetime I'm having to do above. Hope this clarifies the problem.
Looks like you already have the answer... I would stick with:
for line in logfile:
    data = json.loads(line)
    data["Datetime"] = datetime.strptime(data["Datetime"], "%Y-%m-%dT%H:%M:%SZ")
    collection.insert_one(data)
I had a similar problem, but I didn't know beforehand where I should replace a value with a datetime object. So I changed my JSON to something like:
{"LogLevel":"error","Datetime":{"__timestamp__": "2013-06-21T11:20:17Z"},"Module":"DB","Method":"ExecuteSelect","Request":"WS_VALIDATE","Error":"Procedure or function 'WS_VALIDATE' expects parameter '#LOGIN_ID', which was not supplied."}
and parsed json with:
json.loads(data, object_hook=logHook)
with 'logHook' defined as:
def logHook(d):
    # Replace any {"__timestamp__": "..."} object with a parsed datetime.
    if '__timestamp__' in d:
        return datetime.strptime(d['__timestamp__'], "%Y-%m-%dT%H:%M:%SZ")
    return d
This logHook function could also be extended to replace many other 'variables' with elif, elif, ...
Hope this helps!
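For reference, here is a self-contained sketch of the object_hook idea that runs without a database. It uses the strptime directives %m (month), %d (day), %H (hour), %M (minute) and %S (second); the shortened log line and the name log_hook are made up for the example.

```python
import json
from datetime import datetime

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def log_hook(d: dict):
    """Swap a {"__timestamp__": "..."} wrapper object for a datetime."""
    if "__timestamp__" in d:
        return datetime.strptime(d["__timestamp__"], ISO_FORMAT)
    return d

# Shortened, made-up version of the log line from the question.
line = (
    '{"LogLevel": "error", '
    '"Datetime": {"__timestamp__": "2013-06-21T11:20:17Z"}, '
    '"Module": "DB"}'
)
record = json.loads(line, object_hook=log_hook)
```

json.loads calls the hook for every decoded object, innermost first, so the wrapper is replaced before the outer document is built.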
Also, just curious, is there a way I can configure mongoDB to auto convert datetime strings to its native isodate format ?
You probably want to create a Python datetime object for the timestamp, and insert that using PyMongo. This is stored under the hood as the native date object in MongoDB.
So, for example in Python:
from datetime import datetime
object_with_timestamp = { "timestamp": datetime.now() }
your_collection.insert_one(object_with_timestamp)
When this object gets queried from the Mongo shell, an ISODate object is present:
"timestamp" : ISODate("2013-06-24T09:29:58.615Z")
It depends on which language/driver/utility you're using to push the logs. I am assuming you're using mongoimport.
mongoimport doesn't support ISODate(). Refer to this issue: https://jira.mongodb.org/browse/SERVER-5543. ISODate() is not valid JSON, hence it is not supported in mongoimport.
Approach i) seems more efficient; ii) performs two actions on Mongo: an insert and an update. I had the same issue while importing some log data into Mongo. I ended up converting the ISO 8601 dates to epoch format.
{"LogLevel":"error","Datetime":{"$date" : 1371813617000},"Module":"DB","Method":"ExecuteSelect","Request":"WS_VALIDATE","Error":"Procedure or function 'WS_VALIDATE' expects parameter '#LOGIN_ID', which was not supplied."}
The above JSON should work. Note that the value is a 64-bit epoch in milliseconds, not a 32-bit epoch in seconds.
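The conversion to the {"$date": ...} form shown above could be scripted roughly as follows. This is a sketch under the assumption that every Datetime value is a UTC string ending in "Z"; the helper name to_extended_json_date and the shortened log line are invented.

```python
import json
from datetime import datetime, timezone

def to_extended_json_date(iso_text: str) -> dict:
    """Convert 'YYYY-MM-DDTHH:MM:SSZ' to {"$date": <epoch milliseconds>}."""
    moment = datetime.strptime(iso_text, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return {"$date": int(moment.timestamp()) * 1000}

# Shortened, made-up version of the log line from the question.
record = json.loads('{"LogLevel": "error", "Datetime": "2013-06-21T11:20:17Z"}')
record["Datetime"] = to_extended_json_date(record["Datetime"])

# One JSON document per line, ready to be written to the import file.
line_for_import = json.dumps(record)
```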