Can I create an index after loading documents into a sharded MongoDB collection? - mongodb

We are migrating documents from one MongoDB collection (the old collection) to another (the new collection) in Azure Cosmos DB. We did not create any indexes on the new collection before the migration, and all documents have now been migrated. Can we create the index on the new collection now, or should it have been created before loading the documents? Both the old and the new collection are sharded. The new collection is 211 GB, and the index will add its own storage and memory footprint (the index size). Is there any impact from creating the index after loading the documents, and is it safe to do so?

Yes, you can create indexes after the migration as well without any issues.
Please check this documentation for the commands to create various index types.
The commands to track the progress of the indexing operation are also provided in the same document.
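As a rough sketch of what that looks like in the mongo shell (mongosh): the two helper functions, the collection name, and the key pattern below are hypothetical, and `db` is the shell's database handle. The `currentOp` filter is the one I recall from the Azure Cosmos DB for MongoDB indexing documentation, so treat it as an assumption and check it against your server version.

```javascript
// Sketch for mongosh. The helper names are made up for illustration.

// Build an index on a collection that already holds documents.
// createIndex scans the existing documents, so it can run after the load.
function startIndexBuild(db, collName, keys) {
  return db.getCollection(collName).createIndex(keys);
}

// List in-flight createIndexes commands. Meant to be run from a second
// shell session while the build above is still in progress.
function indexBuildProgress(db) {
  return db.currentOp({ "command.createIndexes": { $exists: true } });
}
```

In a shell this would be called as `startIndexBuild(db, "newCollection", { userId: 1 })` in one session and `indexBuildProgress(db)` in another, where `newCollection` and `userId` are placeholder names.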

Related

Mongo dynamic collection creation and locking

I am working on an app where I am looking into creating MongoDB collections on the fly as they are needed.
The app consumes data from a data source and maps the data to a collection. If the collection does not exist, the app:
creates the collection
kicks off appropriate indexes in the background
shards the collection
and then inserts the data into the collection.
While this is going on, other processes will be reading and writing from the database.
Looking at this MongoDB locking FAQ, it appears that reads and writes in other collections of the database should not be affected by the dynamic collection creation and setup, i.e. they won't end up waiting on a lock we acquired to create the collection.
Question: Is the above assumption correct?
Thank you in advance!
No, when you insert into a collection which does not exist, then a new collection is created automatically. Of course, this new collection does not have any index (apart from _id) and is not sharded.
So, you must ensure the collection is created before the application starts any inserts.
However, it is no problem to create indexes and enable sharding on a collection that already contains some data.
Note that when you enable sharding on an empty collection or on a collection with only a small amount of data, all data is initially written to the primary shard. You may use sh.splitAt() to pre-split the chunks for the upcoming data.
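The order described above could be sketched like this in mongosh. The function name and its parameters are hypothetical; `db` and `sh` are the shell's own helpers, and the sketch assumes sharding is already enabled for the database where your server version requires that.

```javascript
// Sketch for mongosh: set the collection up fully before any insert.
function setUpShardedCollection(db, sh, collName, shardKey, splitPoint) {
  const namespace = db.getName() + "." + collName;

  // Create the collection explicitly instead of relying on the first insert.
  db.createCollection(collName);

  // Index that supports the shard key.
  db.getCollection(collName).createIndex(shardKey);

  // Shard the collection, then pre-split at a chosen shard key value so
  // the upcoming data does not all land in a single chunk.
  sh.shardCollection(namespace, shardKey);
  sh.splitAt(namespace, splitPoint);

  return namespace;
}
```

A call might look like `setUpShardedCollection(db, sh, "events", { deviceId: 1 }, { deviceId: 5000 })`, with `events`, `deviceId`, and the split value being placeholders. Only after this returns should the application start inserting.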

MongoDB : How to exclude a collection from generating oplog

I am processing the MongoDB oplog and writing the processed data to a collection in MongoDB. I don't want writes to this collection to generate oplog entries in turn.
I want all other collections to generate oplog entries, but I need to exclude this one collection. How can I achieve this? Is there a setting that tells MongoDB not to generate oplog entries for a collection?
Any help is appreciated.
Collections in the local database are not part of replication. So if you create a collection in the local database and insert records into it, no oplog entries are created.
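A minimal mongosh sketch of that idea follows. The function and collection names are hypothetical. Keep in mind that, because the local database is not replicated, data written this way exists only on the node you are connected to.

```javascript
// Sketch for mongosh: write processed data into the `local` database,
// which is not replicated and therefore produces no oplog entries.
function saveProcessedEntry(db, collName, doc) {
  const localDb = db.getSiblingDB("local");
  return localDb.getCollection(collName).insertOne(doc);
}
```

For example: `saveProcessedEntry(db, "processedOplog", { op: "i", handledAt: new Date() })`, where `processedOplog` and the document fields are placeholders.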

Does createIndex() build a new index over the whole collection?

What happens when I add a new document to my collection and then run createIndex()? Does MongoDB build a new index over the whole collection (costly in time and resources), or does it only update the index for the single document (cheap in time and resources)? I am not sure, because I found this in the documentation (3.0):
By default, MongoDB builds indexes in the foreground, which prevents all read and write operations to the database while the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can occur during a foreground index build.
I am asking because I need a dynamic 2dsphere index that will be updated continuously. If the index had to be rebuilt over the whole collection every time, it would take too much time.
Building an index indexes all existing documents and also causes all future documents to be indexed. So if you insert new documents into an already indexed collection (or update the indexed fields of existing documents), the indexes are updated automatically.
When you call createIndex and an index of the same type over the same fields already exists, nothing happens.
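In mongosh terms, this means the index is created once and plain inserts then keep it current. A small sketch, with hypothetical helper names and an assumed GeoJSON field called `location`:

```javascript
// Sketch for mongosh. LOCATION_FIELD is an assumed field name.
const LOCATION_FIELD = "location";

// Safe to call repeatedly: if the same index already exists, nothing is rebuilt.
function ensureGeoIndex(db, collName) {
  const keys = {};
  keys[LOCATION_FIELD] = "2dsphere";
  return db.getCollection(collName).createIndex(keys);
}

// A normal insert; the existing 2dsphere index is updated for this one
// document as part of the write, with no separate createIndex call.
function addPlace(db, collName, name, longitude, latitude) {
  const doc = { name: name };
  doc[LOCATION_FIELD] = { type: "Point", coordinates: [longitude, latitude] };
  return db.getCollection(collName).insertOne(doc);
}
```

So there is no need to call createIndex after each insert. Calling `ensureGeoIndex(db, "places")` once at startup and `addPlace(...)` afterwards is enough (`places` is a placeholder name).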

How to disable MongoDB cache for specific collection?

I'm running a MongoDB service in which some of the collections are used for data storage only, and I don't want MongoDB to load these collections' data into memory. Is there any configuration for that?
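MongoDB does not load a whole collection into memory up front. It caches the documents and index pages that are actually accessed, so a collection that is rarely read takes up little of the cache. As far as I know, there is no setting to turn the cache off for a single collection.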
By default, the only index is the one on the _id field. You can't remove the _id index.

What is the effect of adding a new field to an existing index on a MongoDB collection?

We use MongoDB as the database for a busy web application: about 800 concurrent users and 50-70 or more updates per second on MongoDB.
The application was already in production (with the indexes on its MongoDB collections already created) when I took it over.
We recently developed new functionality and added some new fields/attributes to one of our MongoDB collections.
My question is: can I update an existing index on that collection by adding a new field to it?
I am asking because I read that rebuilding an index can be costly and slow down the application on some databases, such as Oracle.
No, you cannot modify existing MongoDB indexes. For a case like this it's probably best to create the new index in the background (the {background: true} option), and then drop the existing index when the new one has finished building.
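A sketch of that create-then-drop sequence in mongosh is below. The function name, index name, and key pattern are hypothetical. As far as I know, on MongoDB 4.2 and later the `background` option is ignored because all index builds use a single optimized process, so it only matters on older servers.

```javascript
// Sketch for mongosh: replace an index by building the new one first.
function replaceIndex(db, collName, oldIndexName, newKeys) {
  const coll = db.getCollection(collName);

  // createIndex returns once the build has finished, so queries can keep
  // using the old index for the whole duration of the build.
  const newIndexName = coll.createIndex(newKeys, { background: true });

  // Drop the old index only after the new one exists.
  coll.dropIndex(oldIndexName);
  return newIndexName;
}
```

For example: `replaceIndex(db, "orders", "customerId_1", { customerId: 1, status: 1 })`, where `orders`, `customerId`, and `status` are placeholder names. Existing index names can be listed with `db.orders.getIndexes()`.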