Duplicate a Mongodb Collection but with a `limit` applied - mongodb

Is it possible to have a second collection that is exactly the same as the first collection, but with a limit and sort operator applied to it?
If the first collection has 1000+ records and new records are being added, the second collection should receive all the new records but be limited to the N newest records (sorted by a timestamp field).
The reason for doing this is to work around a limitation in my database driver, which does not implement sort and limit yet.

It sounds like what you want is a capped collection.
A capped collection is automatically limited to a fixed size in bytes and, optionally, to a maximum number of documents. When it is full and you add a new document, the oldest document is deleted. All records are guaranteed to be in insertion order when you query the collection without an explicit sort. The biggest limitation is that documents in a capped collection cannot be updated when that would increase their size.
Capped collections need to be created explicitly with the createCollection function. The size option is the maximum size in bytes and is required; the max option limits the number of documents. This shell command would create a capped collection limited to 1000 documents (and to 1 MiB of data, whichever limit is reached first):
db.createCollection( "your_collection_name", { capped: true, size: 1048576, max: 1000 } );
When you want to convert an already existing collection to a capped collection, you can use the convertToCapped database command, where size is again given in bytes:
db.runCommand({"convertToCapped": "your_existing_collection_name", size: 1048576});
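As a rough sketch of how the two limits fit together, the helper below builds the options object for createCollection. The helper name cappedOptions, the avgDocBytes estimate, and the headroom factor are assumptions made for illustration; only capped, size, and max are actual createCollection options.

```javascript
// Hypothetical helper: builds the options object for db.createCollection().
// `size` is the required byte limit, `max` the optional document limit.
function cappedOptions(maxDocs, avgDocBytes) {
  if (!Number.isInteger(maxDocs) || maxDocs <= 0) {
    throw new RangeError("maxDocs must be a positive integer");
  }
  // Leave headroom so the byte limit is not reached before the document limit.
  // The factor of 2 is an arbitrary assumption, not a MongoDB rule.
  const size = maxDocs * avgDocBytes * 2;
  return { capped: true, size: size, max: maxDocs };
}

// In mongosh (assuming documents of roughly 512 bytes):
//   db.createCollection("your_collection_name", cappedOptions(1000, 512));
```

If the byte limit is too small, the collection starts dropping old documents before it ever holds N of them, so it is worth overestimating the document size.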

Related

How does the limit() option work in mongodb?

Let's say you have a collection of 10,000 documents and I make a find query with the option limit(50). How will MongoDB choose which 50 documents to return?
Will it auto-sort them (maybe by their creation date) or not?
Will the query return the same documents every time it is called? How does the limit option work in MongoDB?
Does MongoDB limit the documents after they are returned or as it queries them? Meaning, will MongoDB query all documents and then limit the results to 50 documents, or will it query the 50 documents only?
The first 50 documents of the result set will be returned.
If you do not sort the documents (or if the order is not well-defined, such as sorting by a field with values that occur multiple times in the result set), the order may change from one execution to the next.
Will it auto-sort them (maybe by their creation date) or not?
No.
Will the query return the same documents every time it is called?
The query may produce the same results for a while and then start producing different results if, for example, another document is inserted into the collection.
Meaning, will MongoDB query all documents and then limit the results to 50 documents, or will it query the 50 documents only?
It depends on the query. If an index can supply the documents in the requested order, only the needed documents will be read from the storage engine. If a sort stage is used in the query execution, all matching documents will be read from storage and sorted, then the required number will be returned and the rest discarded.
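To illustrate why a well-defined sort matters, here is a small in-memory model (plain JavaScript, not MongoDB itself) of "sort, then limit". The documents and the field name ts are made up; using _id as a tie-breaker makes the first N results unambiguous even when several documents share the same ts.

```javascript
// In-memory model of sort({ ts: -1, _id: 1 }).limit(n); this is not a MongoDB API.
function newestN(docs, n) {
  return [...docs] // copy so the input array is not reordered
    .sort((a, b) => (b.ts - a.ts) || (a._id - b._id)) // ts descending, _id ascending on ties
    .slice(0, n);
}

const docs = [
  { _id: 1, ts: 100 },
  { _id: 2, ts: 200 },
  { _id: 3, ts: 200 }, // same ts as _id 2: only the tie-breaker fixes their order
  { _id: 4, ts: 50 },
];

console.log(newestN(docs, 2).map((d) => d._id)); // [ 2, 3 ]
```

The corresponding shell query would be db.collection.find().sort({ ts: -1, _id: 1 }).limit(2). With an index on the same keys, the server should be able to stop after reading the first N entries instead of sorting everything.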

What is a default limit of Firestore query?

Let's say I have a collection mycollection that has 1,000,000 records.
How many records will this query return?
const query = firestore.collection('mycollection').get()
I couldn't find that in docs.
There is no default limit. The query you're showing is asking for all of the documents in mycollection. For large collections, you will need to impose a limit in order to avoid excessive costs and running out of memory.
From firebase.google.com documentation:
By default, a query retrieves all documents that satisfy the query in
ascending order by document ID. You can specify the sort order for
your data using orderBy(), and you can limit the number of documents
retrieved using limit().
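One way to impose such a limit is to read the collection in pages. The sketch below only builds the query; the helper name pageQuery, the page size, and the createdAt field are assumptions. orderBy() and limit() are the methods named in the quoted documentation, and startAfter() is Firestore's query cursor method.

```javascript
// Hypothetical helper: builds the query for one page of a collection.
// `collectionRef` is e.g. firestore.collection('mycollection').
// `lastDoc` is the last document snapshot of the previous page, if any.
function pageQuery(collectionRef, orderField, pageSize, lastDoc) {
  let query = collectionRef.orderBy(orderField).limit(pageSize);
  if (lastDoc) {
    query = query.startAfter(lastDoc); // continue right after the previous page
  }
  return query;
}

// Usage (inside an async function; 'createdAt' is an assumed field name):
//   const col = firestore.collection('mycollection');
//   const first = await pageQuery(col, 'createdAt', 500).get();
//   const last = first.docs[first.docs.length - 1];
//   const second = await pageQuery(col, 'createdAt', 500, last).get();
```

Note that ordering by a field skips documents that do not have that field, so pick a field that every document has.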

How to delete N documents based on any given field of that document?

I intend to delete a large number of documents from a collection. My collection has more than a million documents. My idea is to query for, say, 10k documents based on a common field and delete all of them. I'm not sure how to get this done. Any help?
You can use deleteMany with a filter on the common field; it deletes every document that matches the filter. See the MongoDB documentation for deleteMany.
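Because deleteMany removes every match and has no limit option, deleting only N documents takes two steps. Here is a minimal mongosh-style sketch, assuming a collection handle coll and a made-up filter; the helper name deleteBatch is hypothetical. It collects up to N _id values and then deletes exactly those.

```javascript
// Hypothetical mongosh-style helper: delete at most `n` documents matching `filter`.
// `coll` is a collection handle such as db.mycollection.
function deleteBatch(coll, filter, n) {
  // Step 1: fetch only the _id of up to n matching documents.
  const ids = coll
    .find(filter, { _id: 1 })
    .limit(n)
    .toArray()
    .map((doc) => doc._id);
  if (ids.length === 0) {
    return 0; // nothing matched, nothing to delete
  }
  // Step 2: delete exactly those documents and report how many were removed.
  return coll.deleteMany({ _id: { $in: ids } }).deletedCount;
}

// Usage in mongosh (the filter is only an example):
//   deleteBatch(db.mycollection, { status: "expired" }, 10000);
```

This is written for the shell, where the calls are synchronous. With the Node.js driver the same calls return promises and need await.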

How does MongoDB order their docs in one collection? [duplicate]

In my User collection, MongoDB usually keeps each new doc in the same order I create them: the last one created is the last one in the collection. But I have found another collection where the last doc I created is in 6th position among 27 docs.
Why is that?
What order do docs follow in a MongoDB collection?
It's called natural order:
natural order
The order in which the database refers to documents on disk. This is the default sort order. See $natural and Return in Natural Order.
This confirms that in general you get them in the same order you inserted them, but that's not guaranteed, as you noticed.
Return in Natural Order
The $natural parameter returns items according to their natural order within the database. This ordering is an internal implementation feature, and you should not rely on any particular structure within it.
Index Use
Queries that include a sort by $natural order do not use indexes to fulfill the query predicate with the following exception: If the query predicate is an equality condition on the _id field { _id: <value> }, then the query with the sort by $natural order can use the _id index.
MMAPv1
Typically, the natural order reflects insertion order with the following exception for the MMAPv1 storage engine. For the MMAPv1 storage engine, the natural order does not reflect insertion order if the documents relocate because of document growth or remove operations free up space which are then taken up by newly inserted documents.
As the docs say, you should not rely on this default order ("This ordering is an internal implementation feature, and you should not rely on any particular structure within it.").
If you need a specific order, use an explicit sort.
Basically, the following two calls should return documents in the same order (since the default order is $natural):
db.mycollection.find().sort({ "$natural": 1 })
db.mycollection.find()
If you want to sort by another field (e.g. name) you can do that:
db.mycollection.find().sort({ "name": 1 })
For performance reasons, MongoDB never splits a document on the hard drive.
When you start with an empty collection and insert document after document into it, MongoDB will place them consecutively on the disk.
But what happens when you update a document and it now takes more space and doesn't fit into its old position anymore without overlapping the next? In that case MongoDB will delete it and re-append it as a new one at the end of the collection file.
Your collection file now has a hole of unused space. This is quite a waste, isn't it? That's why the next document which is inserted and small enough to fit into that hole will be inserted in that hole. That's likely what happened in the case of your second collection.
Bottom line: Never rely on documents being returned in insertion order. When you care about the order, always sort your results.
MongoDB does not "order" the documents at all, unless you ask it to.
The basic insertion will create an ObjectId in the _id primary key value unless you tell it to do otherwise. This ObjectId value starts with a creation timestamp, so it has roughly "monotonic" or "ever increasing" properties: an ObjectId generated later usually sorts after one generated earlier, although this is not strictly guaranteed for values generated within the same second by different clients.
If you want "sorted" then do an explicit "sort":
db.collection.find().sort({ "_id": 1 })
Or use a "natural" sort, which means the order in which documents are stored on disk:
db.collection.find().sort({ "$natural": 1 })
This is pretty much the default unless you state otherwise, or unless the query criteria select an "index", in which case the index determines the order. You can use $natural to "force" the on-disk order when the query criteria would otherwise select an index that orders the results differently.
MongoDB documents "move" when they grow, and therefore the _id order is not always the same as the order in which documents are retrieved.
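The link between _id and creation time comes from the ObjectId layout: its first 4 bytes are a timestamp in seconds since the Unix epoch. Here is a small plain-JavaScript sketch that extracts it from the 24-character hex string. The function name is made up and the ObjectId is just a sample value.

```javascript
// Hypothetical helper: read the creation time embedded in an ObjectId hex string.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError("expected a 24-character hex string");
  }
  // The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

console.log(objectIdToDate("507f1f77bcf86cd799439011").toISOString());
```

In mongosh, an ObjectId has a getTimestamp() method that returns the same value. The timestamp only has one-second resolution, which is one reason sorting by _id approximates creation order rather than guaranteeing it.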
I could find out more about it thanks to the link Return in Natural Order provided by Ionică Bizău.
"The $natural parameter returns items according to their natural order within the database. This ordering is an internal implementation feature, and you should not rely on any particular structure within it.
Typically, the natural order reflects insertion order with the following exception for the MMAPv1 storage engine. For the MMAPv1 storage engine, the natural order does not reflect insertion order if the documents relocate because of document growth or remove operations free up space which are then taken up by newly inserted documents."

Get a list of records from a collection sorted by count and uniqueness of a field

So I have a bunch of documents in a MongoDB collection and it seems that the collection is growing a little faster than we thought.
Is there a way to get a list from a collection that counts the number of documents that have X as a value in a field?
For example (I'll just make the data up),
there are 4 possible values for the field (reference).
/content/public
/content/private
/resource/something
/much/wow
Is there a way to get a list from mongo that says:
1231 Records have /content/public as the value for reference.
21312312 have /content/private
34 have /resource/something
34242 have /much/wow
Use the aggregation tools for this. You haven't listed a language in your question, so here's the mongodb command directly. This assumes your collection is named 'urls'.
db.urls.aggregate([{$group: {_id:'$reference', total:{$sum:1} } }]);
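Since the title also asks for the list to be sorted by count, a $sort stage can be appended after the $group. Here is a small sketch that builds such a pipeline; the helper name countByField is an assumption, and the collection name urls is the one assumed in the answer above.

```javascript
// Hypothetical helper: pipeline that counts documents per distinct value of `field`,
// with the most frequent value first.
function countByField(field) {
  return [
    { $group: { _id: "$" + field, total: { $sum: 1 } } },
    { $sort: { total: -1 } },
  ];
}

// In mongosh:
//   db.urls.aggregate(countByField("reference"));
```

Each result document then has the field value in _id and the number of matching documents in total.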