I am trying to use mongodb's ObjectID to do a range query on the insertion time of a given collection. I can't really find any documentation that this is possible, except for this blog entry: http://mongotips.com/b/a-few-objectid-tricks/ .
I want to fetch all documents created after a given timestamp. Using the nodejs driver, this is what I have:
var timeId = ObjectId.createFromTime(timestamp);
var query = {
    localUser: userId,
    _id: {$gte: timeId}
};
var cursor = collection.find(query).sort({_id: 1});
I always get the same number of records (19 in a collection of 27), regardless of the timestamp. I noticed that createFromTime only fills the time-related bytes of the ObjectID; the other bytes are left at 0 (like this: 4f6198be0000000000000000).
The reason I am trying to use an ObjectID for this is that I need the timestamp taken when the document is inserted on the MongoDB server, not when the document is passed to the MongoDB driver in Node.
Does anyone know how to make this work, or have another idea for how to generate and query insertion times that were generated on the MongoDB server?
I'm not sure about the Node.js driver, but in Ruby you can simply apply a range query like this:
jan_id = BSON::ObjectId.from_time(Time.utc(2012, 1, 1))
feb_id = BSON::ObjectId.from_time(Time.utc(2012, 2, 1))
#users.find({'_id' => {'$gte' => jan_id, '$lt' => feb_id}})
Make sure that var timeId = ObjectId.createFromTime(timestamp) is actually creating an ObjectId.
Also try the query without localUser.
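One thing worth checking, as an assumption about the cause rather than a confirmed diagnosis: createFromTime in the Node.js driver takes seconds since the epoch, while JavaScript dates are in milliseconds. The sketch below builds the same kind of zero-padded boundary id by hand so the value can be inspected; objectIdHexFromDate is a hypothetical helper name, not part of the driver.

```javascript
// Hypothetical helper: smallest possible ObjectId (24 hex chars) for an instant.
// The first 4 bytes hold the time in SECONDS; the remaining 8 bytes are zeroed,
// which makes the value a valid lower bound for a $gte comparison.
function objectIdHexFromDate(date) {
  const seconds = Math.floor(date.getTime() / 1000); // milliseconds -> seconds
  const timeHex = seconds.toString(16).padStart(8, '0');
  return timeHex + '0'.repeat(16);
}

// Using the instant encoded in the id quoted in the question.
const lowerBound = objectIdHexFromDate(new Date(0x4f6198be * 1000));
console.log(lowerBound); // 4f6198be0000000000000000

// With the driver, the hex string could then be wrapped and used in the query,
// e.g. { _id: { $gte: new ObjectId(lowerBound) } }.
```

If the boundary printed for your timestamp does not start with a plausible 8-digit time prefix, the timestamp is probably in the wrong unit.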
Context
I have a big collection with millions of documents which is constantly updated by a production workload. When performing a query, I have noticed that a document can be returned multiple times. My workload migrates the documents to a SQL system that enforces unique row IDs, so it crashes.
Problem
Because the collection is so big and lots of users are updating it after the query is started, iterating over the cursor's result may give me documents with the same id (old and updated version).
What I've tried
const cursor = db.collection.find(query, {snapshot: true});
while (cursor.hasNext()) {
    const doc = cursor.next();
    // do some stuff
}
Based on old documentation for the mongo driver (I'm using Node.js, but this applies to any official MongoDB driver), there is an option called snapshot which is said to prevent what is happening to me. Sadly, the driver returns an error indicating that this option does not exist (it was deprecated).
Question
Is there a safe way to iterate through the documents of a collection so that I don't get the same document twice?
The only viable option I see is an aggregation pipeline, but I want to explore other options with standard queries.
Finally I got the answer from a mongo changelog page:
MongoDB 3.6.1 deprecates the snapshot query option.
For MMAPv1, use hint() on the { _id: 1} index instead to prevent a cursor from returning a document more than once if an intervening write operation results in a move of the document.
For other storage engines, use hint() with { $natural : 1 } instead.
So, from my code example:
const cursor = db.collection.find(query).hint({$natural: 1});
while (cursor.hasNext()) {
    const doc = cursor.next();
    // do some stuff
}
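As an extra safety net on the migration side, and independent of the hint, duplicates can also be filtered by _id while iterating. This is only a sketch: uniqueById is a hypothetical helper, it keeps the first version of each document it sees, and it holds every id in memory, which may be too costly for a collection with millions of documents.

```javascript
// Hypothetical helper: yield each document at most once, keyed by its _id.
// Works on any iterable of documents; the first occurrence wins.
function* uniqueById(docs) {
  const seen = new Set();
  for (const doc of docs) {
    const key = String(doc._id); // stringify so ids compare by value
    if (seen.has(key)) continue; // already migrated, skip the repeat
    seen.add(key);
    yield doc;
  }
}

// Example with plain objects standing in for cursor results.
const sample = [
  { _id: 'a', version: 1 },
  { _id: 'b', version: 1 },
  { _id: 'a', version: 2 }, // same id returned a second time
];
const unique = [...uniqueById(sample)];
console.log(unique.length);
```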
I am implementing a process in rust where I read a large number of documents from a mongodb collection, perform some calculations on the values of each document and then have to update the documents in mongodb.
In my initial implementation, after the calculations are performed, I go through each of the documents and call db.collection.replace_one.
// Serialize the updated item and replace the stored document with the same _id.
let document = bson::to_document(&item).unwrap();
let filter = doc! { "_id": item.id.as_ref().unwrap() };
let result = my_collection.replace_one(filter, document, None).await?;
Since this is quite time consuming for large record sets, I want to implement it using db.collection.bulkWrite. In version 1.1.1 of the official rust mongodb driver, bulkWrite does not seem to be supported, so I want to use db.run_command. However, I am not sure how to call db.collection.bulkWrite(...) using run_command as I cannot figure out how to pass the command name as well as the set of documents to replace the values in mongodb.
What I have attempted is to build a String representing the command document, with all the document records to be updated joined into it. To create a bson::Document from that string, I convert the string to bytes and then try to create the document with Document::from_reader, but that doesn't work, nor is it a good solution.
Is there a proper or better way to call bulkWrite using version 1.1.1 of the mongodb crate?
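One possible direction, offered as a sketch rather than a confirmed answer: in the shell and drivers of that period, bulkWrite is a client-side helper, and replacements are sent to the server as the update database command, which accepts an updates array. The example below shows the shape of that command document in JavaScript for readability; buildReplaceCommand is a hypothetical name, and in Rust the same structure would be built as a bson Document and passed to run_command.

```javascript
// Hypothetical helper: build an `update` command that replaces many documents
// in one round trip. The command name must be the first key of the document.
function buildReplaceCommand(collectionName, docs) {
  return {
    update: collectionName,
    updates: docs.map((doc) => ({
      q: { _id: doc._id }, // filter: match the existing document by _id
      u: doc,              // replacement document (no update operators)
      upsert: false,
      multi: false,
    })),
    ordered: false, // keep going if one replacement fails
  };
}

const command = buildReplaceCommand('my_collection', [
  { _id: 1, total: 10 },
  { _id: 2, total: 20 },
]);
console.log(JSON.stringify(command, null, 2));
```

The server caps how many operations and how many bytes a single command may carry, so a large record set would still need to be split into batches.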
I am trying to query a binary field in mongo db. The data looks like this:
{"_id":"WE8fSixi8EuWnUiThhZdlw=="}
I've tried a lot of things for example:
{ '_id': new Binary( 'WE8fSixi8EuWnUiThhZdlw==', Binary.SUBTYPE_DEFAULT) }
{ '_id': Binary( 'WE8fSixi8EuWnUiThhZdlw==', 0) }
etc
Nothing seems to be working. I have exhausted Google and the MongoDB documentation, so any help would be amazing.
UPDATE:
Now you should be able to query UUID and BinData from MongoDB Compass v1.20+ (COMPASS-1083). For example: {"field": BinData(0, "valid_base64")}.
PREVIOUS:
I see that you're using MongoDB Compass to query the field. Unfortunately, the current version of MongoDB Compass (v1.16.x) does not support querying binary data.
You can utilise mongo shell to query the data instead. For example:
db.collection.find({'_id':BinData(0, "WE8fSixi8EuWnUiThhZdlw==")});
Please note that the field name _id is reserved for use as a primary key; its value must be unique in the collection and is immutable. Depending on the binary value that you're storing in _id, I would suggest storing the binary in another field and letting _id contain an ObjectId.
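For code running in Node.js rather than Compass, a likely reason the attempts in the question fail is that the base64 text is passed as if it were the raw bytes. This is an assumption, not something confirmed by the question. The sketch below decodes the string first; base64ToBytes is a hypothetical helper, and subtype 0 is taken from the BinData(0, ...) call above.

```javascript
// Hypothetical helper: turn the base64 text shown by the tools into raw bytes.
function base64ToBytes(encoded) {
  return Buffer.from(encoded, 'base64');
}

const encoded = 'WE8fSixi8EuWnUiThhZdlw==';
const bytes = base64ToBytes(encoded);
console.log(bytes.length); // number of raw bytes stored in the field

// With the Node.js driver, these bytes (not the base64 text) would then be
// wrapped for the query, e.g. { _id: new Binary(bytes, 0) }.
```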
I'm aware that the Mongo "ObjectId" has the method "getTimestamp()" , which works like
ObjectId("507f191e810c19729de860ea").getTimestamp()
And also I'm aware that it can be sorted based on built-in 'timestamp'
db.collection.find().sort({'timestamp': -1})
I know I can create a new field "created_time" in each document by converting ObjectId to created_time, then query based on this new field.
I've also read this post, which converts the date range to ObjectIds and then compares the ObjectIds directly, but with this method I'm worried about the other bytes, which are not for the time but for the machine and process.
My question is: is there a way to directly query documents in a date range using Mongo's built-in 'timestamp', without an extra field or extra effort?
Something like the command below, which would query Mongo directly using its built-in timestamp (I tried it and it does not work):
db.collection.find({'timestamp':{$gt: new Date(ISODate("2015-08-14T14:00:00Z"))}})
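For context on why a 'timestamp' field cannot be queried: the creation time is not stored as a separate field, it is the first 4 bytes of the _id itself. The sketch below shows roughly what getTimestamp() reads; timestampFromObjectIdHex is a hypothetical helper written for illustration, not the driver's implementation.

```javascript
// Hypothetical helper: recover the creation time embedded in an ObjectId.
// The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
function timestampFromObjectIdHex(hex) {
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000); // seconds -> milliseconds for Date
}

const created = timestampFromObjectIdHex('507f191e810c19729de860ea');
console.log(created.toISOString()); // 2012-10-17T20:46:22.000Z
```

Because the time prefix is the most significant part of the id, a range comparison on _id orders documents by creation second first, and the machine and process bytes only break ties within the same second.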
How do I write a mongodb shell query that will return the documents (or just document ids) for all objects created after a specific date? I see examples like the following...
return query based on date
But they just return the timestamp, I want to query based on a timestamp. I don't think the logic is as easy as looking for objects higher than a specific objectid, because if mongodb is sharded, then there are multiple servers creating objects.