Remove obsolete collection in mongodb - mongodb

I want to delete all the collections in my db that have not been used for a long time. Is there any way I can check when a particular collection was last used?

It depends what you mean by 'last used'. If you mean the last time a document was inserted into the collection then you could do this by converting the ObjectId of the last inserted document into a date. The following query should return the date the last document was inserted:
db.<collection_name>.find({}, {_id: 1}).sort({_id: -1}).limit(1).next()._id.getTimestamp()
Sorting on _id in descending order puts the most recently inserted document first, provided the collection uses default ObjectId values. (A plain findOne({}) returns the first document in natural order, which is usually the oldest one, not the newest.) You can then take the _id field and call its getTimestamp() function.
I'm not sure there is any way to reliably tell when a collection was last queried. If you're running your database with profiling enabled, there might be entries in the db.system.profile collection. The oplog records writes only, so it can show the last insert, update or delete but not reads.
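For reference, the timestamp that getTimestamp() reports is the first four bytes of the ObjectId, read as seconds since the Unix epoch. Below is a minimal sketch in plain JavaScript, assuming the _id is a default ObjectId available as a 24-character hex string. objectIdToDate is a hypothetical helper name, not a driver or shell function.

```javascript
// Hypothetical helper: decode the creation time embedded in an ObjectId.
// Assumes `hex` is the 24-character hex form of a default ObjectId.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("not a 24-character hex ObjectId: " + hex);
  }
  // The first 4 bytes (8 hex digits) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

console.log(objectIdToDate("57d8fd62f2a9d913ba0d006d").toISOString());
```

This only tells you something useful when the collection uses default ObjectId values for _id. With custom _id values there is no embedded timestamp to read.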

Related

How does the limit() option work in mongodb?

Let's say you have a collection of 10,000 documents and I make a find query with the option limit(50). How will MongoDB choose which 50 documents to return?
Will it auto-sort them (maybe by their creation date) or not?
Will the query return the same documents every time it is called? How does the limit option work in MongoDB?
Does MongoDB limit the documents after they are returned or as it queries them? That is, will MongoDB query all documents and then limit the results to 50 documents, or will it query only the 50 documents?
The first 50 documents of the result set will be returned.
If you do not sort the documents (or if the order is not well-defined, such as sorting by a field with values that occur multiple times in the result set), the order may change from one execution to the next.
Will it auto-sort them (maybe by their creation date) or not?
No.
Will the query return the same documents every time it is called?
The query may produce the same results for a while and then start producing different results if, for example, another document is inserted into the collection.
That is, will MongoDB query all documents and then limit the results to 50 documents, or will it query only the 50 documents?
Depends on the query. If an index is used, only the needed documents will be read from the storage engine. If a sort stage is used in the query execution, all documents will be read from storage, sorted, then the required number will be returned and the rest discarded.
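The difference can be pictured with a small in-memory model. This is only an illustration in plain JavaScript, not how MongoDB's query executor is implemented. scanWithLimit and sortThenLimit are made-up names, the sample documents are generated for the example, and the `examined` counter stands in for documents read from storage.

```javascript
// Toy model: a limit applied while scanning stops as soon as n documents are found.
function scanWithLimit(docs, n) {
  const result = [];
  let examined = 0;
  for (const doc of docs) {
    if (result.length >= n) break;
    examined++;
    result.push(doc);
  }
  return { result, examined };
}

// Toy model: an in-memory sort has to see every document before it can
// return the first n of the sorted order.
function sortThenLimit(docs, key, n) {
  const sorted = [...docs].sort((a, b) => a[key] - b[key]);
  return { result: sorted.slice(0, n), examined: docs.length };
}

// Generated sample data, for illustration only.
const docs = Array.from({ length: 10000 }, (_, i) => ({ _id: i, score: (i * 7919) % 10007 }));
console.log(scanWithLimit(docs, 50).examined);          // 50
console.log(sortThenLimit(docs, "score", 50).examined); // 10000
```

In the real server, an index on the sort key lets the sorted case behave like the first function, because the index already provides the order.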

Best way to get the first inserted record in a collection of MongoDB

I need to fetch the first inserted record in a collection in MongoDB for which I am currently using the below query:
db.users.find({}).sort({"created_at":1}).limit(1);
But, this takes up a lot of memory. The collection has about 100K records.
What is the efficient way to do this?
The MongoDB _id field is a unique identifier that is generated automatically when a document is inserted into a collection.
By default the _id field stores an ObjectId value, and it is automatically indexed.
According to MongoDB documentation,
The 12-byte ObjectId value consists of:
4-byte value representing the seconds since the Unix epoch,
3-byte machine identifier,
2-byte process id, and
3-byte counter, starting with a random value.
To fetch the first inserted record, as described in the question, try running the following find operation in the MongoDB shell:
db.users.find({}).sort({"_id":1}).limit(1);
This query sorts the results by the _id field, because an ObjectId value begins with a Unix epoch timestamp.
You can also add filters to the query to get the first record inserted that matches some criteria.
For example, suppose your collection stores employees from the IT, ADMIN and FINANCE departments and you want the first document inserted for IT (i.e. the first IT employee). Then you can execute:
db.users.find({"Dept" : "IT"}).sort({"_id":1}).limit(1);
and similarly, to find the last employee:
db.users.find({"Dept" : "IT"}).sort({"_id":-1}).limit(1);
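Note: without a filter, the sort is served by the _id index and returns quickly. With a filter on a field that is not indexed (such as Dept above), the query may have to walk a large part of the _id index before it finds a match, which can take considerable time on bigger or sharded collections.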

MongoDB: How to get the last updated timestamp of the last updated document in a collection

Is there a simple OR elegant method (or query that I can write) to retrieve the last updated timestamp (of the last updated document) in a collection. I can write a query like this to find the last inserted document
db.collection.find().limit(1).sort({$natural:-1})
but I need information about the last updated document (it could be an insert or an update).
I know that one way is to query the oplog collection for the last record from a collection. But it seems like an expensive operation given the fact that oplog could be of very large size (also not trustworthy as it is a capped collection). Is there a better way to do this?
Thanks!
You can get the last insert time the same way you mentioned in the question:
db.collection.find().sort({'_id': -1}).limit(1)
However, there isn't any good way to see the last update or delete time. If you are using replica sets, you could get that from the oplog.
Alternatively, you could add a new field such as 'lastModified' to each document.
You can also check out collection-hooks. I hope this helps.
One way to go about it is to have a field that holds the time of last update. You can name it updatedAt. Every time you make an update to the document, you'll just update the value to the current time. If you use the ISO format to store the time, you'll be able to sort without issues (that's what I use).
The other way is the _id field.
Method 1
db.collection.find().limit(1).sort({updatedAt: -1})
Method 2
db.collection.find().limit(1).sort({_id: -1})
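For Method 1 to work, every write has to refresh updatedAt. One way to avoid forgetting it is to route updates through a small wrapper. The sketch below only builds the update document. withUpdatedAt is a hypothetical helper, the `name` field is an assumed example, and the sketch relies on MongoDB's $currentDate update operator so that the server sets the time.

```javascript
// Hypothetical helper: return a copy of an update document that also
// asks the server to set `updatedAt` to the current date.
function withUpdatedAt(update) {
  return {
    ...update,
    $currentDate: { ...(update.$currentDate || {}), updatedAt: true },
  };
}

// Intended use in the shell (assumes documents with a `name` field):
// db.collection.updateOne({ _id: id }, withUpdatedAt({ $set: { name: "new" } }))
console.log(JSON.stringify(withUpdatedAt({ $set: { name: "new" } })));
```

Inserts are not covered by this wrapper, so new documents still need updatedAt set explicitly, for example with new Date().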
You can try:
db.collection.find().sort({$natural:-1}).limit(1);

Query by position in mongodb collection

I need to fetch the document in a mongodb collection using its position. I know the position of the document inside the collection exactly but could not figure out a way to pull those documents from collection. Is there any way to achieve this?
db.daily.find({'_id': {'$in': [0, 5, 8]}})
This is what I tried, but _id is not inserted as 1, 2, 3...; it holds a seemingly random value, e.g. 57d8fd62f2a9d913ba0d006d. Thanks in advance.
You can use skip and limit to query based on the position in the natural order:
db.collection.find().skip(10).limit(1) // skips 10 documents and returns the 11th in natural order
As the documentation on natural order points out, the document order need not match the order in which documents were inserted (with an exception for capped collections). If you use the default ObjectId as the _id field for your documents, you can sort by _id to order based on insertion in the collection (up to the resolution of the timestamp in the ObjectId):
db.collection.find().sort({_id: 1}).skip(10).limit(1) // skips 10 documents and returns the 11th in insertion order
You may also consider using your own _id or adding a field to be able to sort on in order to query based on the position you define.

How to find last update/insert/delete operation time on mongodb collection without objectid field

I have some unused collections in a MongoDB database. I have to find out when CRUD operations were last performed against the collections in the database. We have our own _id field instead of MongoDB's default ObjectId, and we don't have any time field in the collections to find out the modification time. Is there any way to find out the modification time of collections in MongoDB from metadata? Is there any data dictionary information, like in Oracle, to find this out? Please give some ideas or workarounds.
To make a long story short: MongoDB has a flexible schema. Simply add a date field. Since older entries don't have it, they can not be the last entry.
Let's call that field mtime.
So after adding a date field to your schema definition, we generate an index in descending order on the new field:
db.yourCollection.createIndex({mtime:-1})
Finding the last mtime for a collection now is easy:
db.yourCollection.find({"mtime":{"$exists":true}}).sort({"mtime":-1}).limit(1)
Do this for every collection. When the above query does not return a value within the timeframe you defined for purging a collection, simply drop it, since it has not been modified since you introduced the mtime field.
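The decision step can be sketched in plain JavaScript. It assumes you have already collected, for each collection, the newest mtime returned by the query above (or null when no document has the field). collectionsToDrop, the collection names and the dates are all hypothetical sample values.

```javascript
// Hypothetical helper: pick collections whose newest mtime is missing
// or older than the cutoff date.
function collectionsToDrop(lastModifiedByName, cutoff) {
  return Object.entries(lastModifiedByName)
    .filter(([, mtime]) => mtime === null || mtime < cutoff)
    .map(([name]) => name);
}

// Made-up sample input, for illustration only.
const lastModified = {
  orders: new Date("2024-05-01T00:00:00Z"),
  legacy_import: null, // no document carries mtime
  audit_2019: new Date("2023-01-15T00:00:00Z"),
};
const cutoff = new Date("2024-01-01T00:00:00Z");
console.log(collectionsToDrop(lastModified, cutoff)); // [ 'legacy_import', 'audit_2019' ]
```

A null entry only indicates an unused collection if the mtime field was introduced before the cutoff, so it is worth reviewing the list before dropping anything.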
After your collections are cleaned up, you may remove the mtime field from your schema definition. To remove it from the documents, you can run a simple query:
db.yourCollection.update(
{ "mtime":{ $exists:true} },
{ "$unset":{ "mtime":""} },
{ multi: true}
)
There is no "data dictionary" to get this information in MongoDB.
If you've enabled the profiling level in advance to log all operations (db.setProfilingLevel(2)) and you haven't had many operations to log, so that the system.profile capped collection hasn't overwritten whatever logs you are interested in, you can get the information you need there—but otherwise it's gone.