MongoDB - query based on _id timestamp

How do I write a mongodb shell query that will return the documents (or just document ids) for all objects created after a specific date? I see examples like the following...
return query based on date
But they just return the timestamp; I want to query based on a timestamp. I don't think the logic is as simple as looking for objects with an _id higher than a specific ObjectId, because if MongoDB is sharded, there are multiple servers creating objects.
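One common approach is to turn the date into a boundary ObjectId and compare `_id` against it. The sketch below relies on the fact that the first 4 bytes of an ObjectId hold the creation time in seconds since the Unix epoch; `objectIdHexFromDate` is a hypothetical helper name, and the collection name in the usage comment is an assumption. Because the timestamp sits in the leading bytes, the comparison should order ids by creation second no matter which machine generated them, although the accuracy still depends on the clocks of the machines that created the ids.

```javascript
// Hypothetical helper (not a MongoDB API): build the smallest possible
// ObjectId hex string for a given instant.
// Assumes a date between 1970 and 2106, the range a 4-byte timestamp can hold.
function objectIdHexFromDate(date) {
  // The first 4 bytes of an ObjectId are seconds since the Unix epoch.
  const seconds = Math.floor(date.getTime() / 1000);
  // 8 hex digits of timestamp, then 16 zero digits for the remaining 8 bytes.
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

const boundary = objectIdHexFromDate(new Date("2015-08-14T14:00:00Z"));
console.log(boundary);

// In the mongo shell the boundary could then be used roughly like this
// (assumes a collection named "collection" with default ObjectId _id values):
//   db.collection.find({ _id: { $gte: ObjectId(boundary) } })
```

Adding a projection such as `{ _id: 1 }` as the second argument to `find` would return only the document ids.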

Related

Remove obsolete collection in mongodb

I want to delete all the collections from my db which have not been used for a long time. Is there any way I can check when a particular collection was last used?
It depends what you mean by 'last used'. If you mean the last time a document was inserted into the collection then you could do this by converting the ObjectId of the last inserted document into a date. The following query should return the date the last document was inserted:
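db.<collection_name>.find({}, {_id: 1}).sort({_id: -1}).limit(1).next()._id.getTimestamp()
A findOne with no query criteria ('{}') returns the first document in natural order, which is usually the oldest one rather than the newest, so the query above sorts on _id in descending order and takes the first result instead. You can then get the _id field and call the getTimestamp() function. This assumes the collection is not empty and uses default ObjectId values for _id.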
I'm not sure if there is any way to reliably tell when a collection was last queried. If you're running your database with profiling enabled then there might be entries in the db.system.profile collection, or in the oplog.
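For reference, the ObjectId-to-date conversion that getTimestamp() performs can be reproduced in plain JavaScript. This is only a sketch: `dateFromObjectIdHex` is a hypothetical helper name, and it assumes the id is given as its 24-character hex string.

```javascript
// Hypothetical helper: decode the creation time embedded in an ObjectId.
function dateFromObjectIdHex(idHex) {
  if (!/^[0-9a-fA-F]{24}$/.test(idHex)) {
    throw new TypeError("expected a 24-character hex string");
  }
  // The first 8 hex digits (4 bytes) are seconds since the Unix epoch.
  const seconds = parseInt(idHex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

console.log(dateFromObjectIdHex("507f191e810c19729de860ea").toISOString());
// prints "2012-10-17T20:46:22.000Z"
```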

querying documents in a date range using Mongo built-in 'timestamp'

I'm aware that the Mongo "ObjectId" has the method "getTimestamp()", which works like
ObjectId("507f191e810c19729de860ea").getTimestamp()
I'm also aware that documents can be sorted based on the built-in 'timestamp'
db.collection.find().sort({'timestamp': -1})
I know I can create a new field "created_time" in each document by converting ObjectId to created_time, then query based on this new field.
I've also read this post, which converts the date range to ObjectIds and then compares the ObjectIds directly, but with this method I'm worried about the other bytes, which do not encode the time but the machine and process.
My question is: is there a way to directly query documents in a date range using Mongo's built-in 'timestamp', without an extra field or extra effort?
Something like the command below (which I tried, but it does not work), which queries Mongo directly using its built-in timestamp.
db.collection.find({'timestamp':{$gt: new Date(ISODate("2015-08-14T14:00:00Z"))}})
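On the worry about the non-time bytes: ObjectIds compare byte by byte, and the timestamp occupies the leading bytes, so a boundary id whose trailing bytes are all zero sorts at or before every id generated in that second. The sketch below illustrates this with hex strings, which compare the same way when they have equal length and case. `boundaryHex` and `isIdInRange` are hypothetical helper names, and the collection name in the usage comment is an assumption.

```javascript
// Hypothetical helper: lowest possible ObjectId hex for a given instant.
function boundaryHex(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

// Hypothetical helper: true if the id was created in [start, end).
// Lowercase, equal-length hex strings compare in the same order as the bytes.
function isIdInRange(idHex, start, end) {
  const id = idHex.toLowerCase();
  return id >= boundaryHex(start) && id < boundaryHex(end);
}

const start = new Date("2012-10-17T00:00:00Z");
const end = new Date("2012-10-18T00:00:00Z");

// This id has non-zero machine/process/counter bytes, yet still falls in range.
console.log(isIdInRange("507f191e810c19729de860ea", start, end));

// The equivalent server-side filter in the mongo shell would look roughly like:
//   db.collection.find({ _id: { $gte: ObjectId(boundaryHex(start)),
//                               $lt:  ObjectId(boundaryHex(end)) } })
```

Using an exclusive upper bound built from the next boundary avoids having to guess the largest possible trailing bytes for the last second of the range.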

Using MongoDB to query selected field

I am trying to query data from my MongoDB database, but there are some fields I would like to omit, because MongoDB returns the whole document, including the _id and n fields.
I did this to limit the query, but unfortunately I could only omit one field and not the other, which is the 'n' field. How can I omit two fields?
data = collection.find_one({"files_id": file_id},{"_id":0,"data":1})
I also realized that the result still includes the field name (u'data'). How can I query so that it returns only the value? In this case it is binary data.
Example:
{u'data': Binary('\x00\x00\xed\x00\n\x00\x00\xd5\xa9\x00\x000\x00\x00\x00#\x00\x00\x0f\xff\xf0\x00\x0b\x80\x00\x00\x00
Kindly assist thanks!

Aggregate and Sum Data from multiple MongoDB Collections filtered by date range

I have data across three collections and need to produce a data set which aggregates data from these collections, and filters by a date range.
The collections are:
db.games
{
_id : ObjectId,
startTime : MongoDateTime
}
db.entries
{
player_id : ObjectId, // refers to db.players['_id']
game_id : ObjectId // refers to db.games['_id']
}
db.players
{
_id : ObjectId,
screen_name,
email
}
I want to return a collection containing the number of entries per player for games within a specified date range. The output should look like:
output
{
player_id,
screen_name,
email,
sum_entries
}
I think I need to start by creating a collection of games within the date range, combine it with all the entries, aggregate over the count of entries, and finally output a collection with the player data. It seems like a lot of steps and I'm not sure how to go about it.
The reason you have these problems is that you are trying to use MongoDB like a relational database, not like a document-oriented database. Normalizing your data over many collections is often counter-productive, because a regular MongoDB query cannot perform JOIN operations. MongoDB works much better when you have nested documents which embed other objects in arrays instead of referencing them. A better way to organize that data in MongoDB would be either to give each game an array of the players who took part in it, or to give each player an array of the games they took part in. It's also not necessarily a mistake to have some redundant additional data in these arrays, like the names and not just the IDs.
But now you have the problem, so let's see how we can deal with it.
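As I said, MongoDB doesn't do JOINs in ordinary queries: a find() reads from exactly one collection at a time. (Since version 3.2 the aggregation pipeline offers the $lookup stage, which performs a left outer join, so this limitation applies mainly to older versions.)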
One thing you can do is solve the problem programmatically. Create a program which fetches all players, then all entries for each player, and then the games referenced by the entries where startTime matches.
Another thing you could try is MapReduce. MapReduce can be used to append results to another collection. You could run one MapReduce job per relevant collection, have each job write into the same output collection, and then query the resulting collection.
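The programmatic option can be sketched as a client-side join. The version below is a variation that starts from the games in the date range rather than from the players, which keeps the work to one pass per collection. The arrays are hypothetical in-memory stand-ins for the results of three separate find() calls, and `countEntriesByPlayer` is an illustrative name, not an existing API.

```javascript
// Sketch of a client-side "join" over three result sets.
function countEntriesByPlayer(games, entries, players, from, to) {
  // 1. Ids of the games whose startTime falls in [from, to).
  const gameIds = new Set(
    games
      .filter((g) => g.startTime >= from && g.startTime < to)
      .map((g) => String(g._id))
  );

  // 2. Count the entries per player, keeping only entries for those games.
  const counts = new Map();
  for (const e of entries) {
    if (!gameIds.has(String(e.game_id))) continue;
    const key = String(e.player_id);
    counts.set(key, (counts.get(key) || 0) + 1);
  }

  // 3. Attach the player data to each count.
  return players
    .filter((p) => counts.has(String(p._id)))
    .map((p) => ({
      player_id: p._id,
      screen_name: p.screen_name,
      email: p.email,
      sum_entries: counts.get(String(p._id)),
    }));
}

// Hypothetical sample data standing in for db.games, db.entries, db.players.
const games = [
  { _id: "g1", startTime: new Date("2013-01-10T00:00:00Z") },
  { _id: "g2", startTime: new Date("2013-02-10T00:00:00Z") },
];
const entries = [
  { player_id: "p1", game_id: "g1" },
  { player_id: "p1", game_id: "g2" },
  { player_id: "p2", game_id: "g2" },
];
const players = [
  { _id: "p1", screen_name: "alice", email: "alice@example.com" },
  { _id: "p2", screen_name: "bob", email: "bob@example.com" },
];

const report = countEntriesByPlayer(
  games, entries, players,
  new Date("2013-01-01T00:00:00Z"), new Date("2013-02-01T00:00:00Z")
);
console.log(report);
```

With real data, the first step would normally be pushed to the server as a date-range query on db.games, and the second as a query on db.entries restricted to the matching game ids, so that only the relevant documents cross the network.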

MongoDB: range queries on insertion time with _id and ObjectID

I am trying to use mongodb's ObjectID to do a range query on the insertion time of a given collection. I can't really find any documentation that this is possible, except for this blog entry: http://mongotips.com/b/a-few-objectid-tricks/ .
I want to fetch all documents created after a given timestamp. Using the nodejs driver, this is what I have:
var timeId = ObjectId.createFromTime(timestamp);
var query = {
localUser: userId,
_id: {$gte: timeId}
};
var cursor = collection.find(query).sort({_id: 1});
I always get the same amount of records (19 in a collection of 27), independent of the timestamp. I noticed that createFromTime only fills the bytes in the objectid related to time, the other ones are left at 0 (like this: 4f6198be0000000000000000).
The reason I am trying to use an ObjectID for this is that I need the timestamp from when the document is inserted on the mongodb server, not from when the document is passed to the mongodb driver in node.
Does anyone know how to make this work, or have another idea for how to generate and query insertion times that were generated on the mongodb server?
Not sure about the nodejs driver, but in Ruby you can simply apply range queries like this:
jan_id = BSON::ObjectId.from_time(Time.utc(2012, 1, 1))
feb_id = BSON::ObjectId.from_time(Time.utc(2012, 2, 1))
@users.find({'_id' => {'$gte' => jan_id, '$lt' => feb_id}})
Make sure that var timeId = ObjectId.createFromTime(timestamp) is actually creating an ObjectId.
Also try the query without localUser.
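One more thing worth checking when building the boundary id in Node: createFromTime takes a number of seconds, while Date.now() and Date.prototype.getTime() return milliseconds. The id quoted in the question (4f6198be0000000000000000) decodes to mid-March 2012, so the units look right there, but a small normalising helper avoids the mix-up in general. `toEpochSeconds` is a hypothetical name, and the driver call is shown only as a comment because it needs the mongodb package.

```javascript
// Hypothetical helper: accept a Date or a millisecond timestamp and
// return whole seconds since the Unix epoch.
function toEpochSeconds(value) {
  const ms = value instanceof Date ? value.getTime() : value;
  if (typeof ms !== "number" || !Number.isFinite(ms)) {
    throw new TypeError("expected a Date or a millisecond timestamp");
  }
  return Math.floor(ms / 1000);
}

const seconds = toEpochSeconds(new Date(1331796158000));
console.log(seconds.toString(16)); // prints "4f6198be"

// With the Node.js driver (not executed here; userId and collection
// are the names used in the question):
//   const timeId = ObjectId.createFromTime(seconds);
//   collection.find({ localUser: userId, _id: { $gte: timeId } }).sort({ _id: 1 });
```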