Convert a ISODate string to mongoDB native ISODate data type - mongodb

My application generates logs in JSON format. The logs looks something like this :
{"LogLevel":"error","Datetime":"2013-06-21T11:20:17Z","Module":"DB","Method":"ExecuteSelect","Request":"WS_VALIDATE","Error":"Procedure or function 'WS_VALIDATE' expects parameter '#LOGIN_ID', which was not supplied."}
Currently, I'm pushing in the aforementioned log line as it is into mongoDB. But mongoDB stores the Datetime as a string (which is expected). Now that I want to run some data crunching job on these logs, I'd prefer to store the Datetime as mongoDB's native ISODate data type.
There are 3 ways I can think of for doing this :
i) parse every JSON log line and convert the string to ISODate type in the application code and then insert it. Cons : I'll have to parse each and every line before pushing it to mongoDB, which is going to be a little expensive
ii) After every insert run a query to convert the last inserted document's string date time to ISODate using
element.Datetime = ISODate(element.Datetime);
Cons : Again expensive, as I'm gonna be running one extra query per insert
iii) Modify my logs at generation point so that I don't have to do any parsing at application code level, or run an update query after every insert
Also, just curious, is there a way I can configure mongoDB to auto convert datetime strings to its native isodate format ?
TIA
EDIT:
I'm using pymongo for inserting the json logs
My file looks something like this :
{"LogLevel":"error","Datetime":"2013-06-21T11:20:17Z","Module":"DB","Method":"ExecuteSelect","Request":"WS_VALIDATE","Error":"Procedure or function 'WS_VALIDATE' expects parameter '#LOGIN_ID', which was not supplied."}
There are hundreds of lines like the one mentioned above.
And this is how I'm inserting them into mongodb:
for line in logfile:
collection.insert(json.loads(line))
The following will fix my problem:
for line in logfile:
data = json.loads(line)
data["Datetime"] = datetime.strptime(data["Datetime"], "%Y-%M-%DTHH:mmZ")
collection.insert(data)
What I want to do is get rid of the extra manipulation of datetime I'm having to do above. Hope this clarifies the problem.

Looks like you already have the answer... I would stick with:
for line in logfile:
data = json.loads(line)
data["Datetime"] = datetime.strptime(data["Datetime"], "%Y-%M-%DTHH:mmZ")
collection.insert(data)
I had a similar problem, but I didn't known beforehand where I should replace it by a datetime object. So I changed my json information to something like:
{"LogLevel":"error","Datetime":{"__timestamp__": "2013-06-21T11:20:17Z"},"Module":"DB","Method":"ExecuteSelect","Request":"WS_VALIDATE","Error":"Procedure or function 'WS_VALIDATE' expects parameter '#LOGIN_ID', which was not supplied."}
and parsed json with:
json.loads(data, object_hook=logHook)
with 'logHook' defined as:
def logHook(d):
if '__timestamp__' in d:
return datetime.strptime(d['__timestamp__'], "%Y-%M-%DTHH:mmZ")
return d
This logHook function could also be extended to replace many other 'variables' with elif, elif, ...
Hope this helps!

Also, just curious, is there a way I can configure mongoDB to auto convert datetime strings to its native isodate format ?
You probably want to create a Python datetime object for the timestamp, and insert that using PyMongo. This is stored under the hood as the native date object in MongoDB.
So, for example in Python:
from datetime import datetime
object_with_timestamp = { "timestamp": datetime.now() }
your_collection.insert(object_with_timestamp)
When this object gets queried from the Mongo shell, an ISODate object is present:
"timestamp" : ISODate("2013-06-24T09:29:58.615Z")

It depends on with what language/driver/utility you're pushing the log. I am assuming you're using mongoimport.
mongoimport doesn't support ISODate(). Refer to this issue https://jira.mongodb.org/browse/SERVER-5543 ISODate() is not a JSON format, hence not supported in mongoimport.
i) approach seems more efficient. ii) does two actions on mongo: insert & update. I had same issue while importing some log data into mongo. I ended up converting ISO 8601 format date to epoch format.
{"LogLevel":"error","Datetime":{"$date" : 1371813617000},"Module":"DB","Method":"ExecuteSelect","Request":"WS_VALIDATE","Error":"Procedure or function 'WS_VALIDATE' expects parameter '#LOGIN_ID', which was not supplied."}
Above JSON should work. Note that it is 64-bit not 32-bit epoch.

Related

Converting date string (YYYY-MM-DD) to datetime object pymongo

Before mass inserting using insertmany(), I need to change the "Date" field of each document from a string which is in the format of 'YYYY-MM-DD' (for example '2020-02-28) to a datetime object which can be used in mongo for later purposes...
Is there a possible way of doing this using pymongo
So my idea would look something like this
dict["Date"] = Mongo_Date(dict["Date"]) #converting the original string to a date object
outputList.append(dict)
#Later on in code
mycol.insert_many(outputList)
is there any easy way of doing this with pymongo??
A couple of possibilities come to mind:
use the python map function to modify all of the objects at once
insert the objects into MongoDB, and then use update with $dateFromString to modify them

How to query mongodb without knowing dynamic field type in Java?

I want to create some APIs, which let user pass in just strings for types like number, boolean. And automatically convert them before querying mongodb. Is it possible?
Yes, It is possible for MongoDB. You could write your own utility to convert string in mongo specific query or you could use some open source utilities like enter link description here.
Eventually, MongoDB accepts JSON string to execute the same client does also convert each query in the same JSON format. MongoDB clients or MongoDB doesn't need any predefined mapping or POJO.
This utility will convert string as shown below -
User string -
"select * from users where firstName='Vijay' AND lastName='Rajput'"
Then this utility will convert it into -
db.users.find({$and: [{firstName: 'Vijay'}, {lastName: 'Rajput'}]})

Query mongo ISODate using play reactive mongo

I am trying to query dates in mongodb.
The dates are stored as ISODate("2015-10-08T05:48:55.778+0000").
Now how should i do query like $gte or $lte.
I have been using Play plugin for reactive mongo
To query from the mongo shell, I would need to query with=>
{"endDateTime":{"$eq": new Date("2017-10-08T05:48:55.778+0000")}
OR,
{"endDateTime":{"$eq": ISODate("2017-10-08T05:48:55.778+0000")}
So, what should I do to query it using play reactive mongo. I have been using JodaTime. I am generating the Json Object of the query, and feeding to the find() api straightaway.
*Yes there a lot suggestion in SO, about the topic, but none of them seem to help me in this case. I could give more info if needed.
Update Answer:
Seems like I had some confusion, when converting the dates.
When I tried converting the String Date to Joda DateTime , the result when I print it in console, it would be shown as timestamp,but when I sent it to reactive mongo find it would convert to some form of string date "2015-10-08T05:48:55.778+0000".
So, I had to retrieve the millisecond conversion and send it to the respective api, and mongo would process without any issues.

querying documents in a date range using Mongo built-in 'timestamp'

I'm aware that the Mongo "ObjectId" has the method "getTimestamp()" , which works like
ObjectId("507f191e810c19729de860ea").getTimestamp()
And also I'm aware that it can be sorted based on built-in 'timestamp'
db.collection.find().sort({'timestamp': -1})
I know I can create a new field "created_time" in each document by converting ObjectId to created_time, then query based on this new field.
I've also read this post which converts the date range to ObjectId and then directly compare the ObjectId, but this method I'm worried about the other bytes which is not for time but for machine and process.
My question is, is there a way to directly query documents in a date range using Mongo built-in 'timestamp'? without extra field or extra effort.
something like below (but I tried below command and not working), which can directly query Mongo using its built-in timestamp.
db.collection.find({'timestamp':{$gt: new Date(ISODate("2015-08-14T14:00:00Z"))}})

Querying on Date in Mongo

I'm inserting a Mongo doc with the following time-stamp:
val format = new java.text.SimpleDateFormat("yyyyMMddHHmmss")
format.format(new Date()).toLong
Here's what the section looks like from Mongo's shell:
"{Timestamp" : NumberLong("20130919161948")}"
Based on a few tests, it appears to me that I can simply compare 2 documents by Timestamp by simply checking > or < for the yyyyMMddHHmmss format.
Please let me know if this time-stamp is OK for Mongo. Will I be able to query with it?
Mongo will not understand this as a timestamp, but as a number. As you set your date with a format going from year to seconds, you will be able to query mongo using > or < to know if it is before or after.
However if you want to mongo to treat the data as a date, you will need to use the appropriate bson date format. By having mongo treat it as a date, you will have all Mongo date operations available, like extracting year, day of week, etc.. read more
If you are using casbah, and Joda, you can enable serialization and deserialization by an explicit call:
import com.mongodb.casbah.conversions.scala._
RegisterJodaTimeConversionHelpers()
Read more here.
#Kevin, I think you are right. java.util.Date is supported in BSON object.
Using NumberLong to represent timestamp allows you to do range queries, but with BSON date type, date operation in aggregation framework becomes possible, which is more powerful.