Does MongoDB ensureIndex perform a rebuild? - mongodb

I'm reading the MongoDB documentation.
At this URL
http://docs.mongodb.org/manual/tutorial/build-indexes-on-replica-sets/
I read
"Create the new index using the ensureIndex() in the mongo shell, or comparable method in your driver.
This operation will create or rebuild the index on this mongod instance"
Do I understand this correctly?
If the index is already present, does MongoDB perform a rebuild?
So the difference from reIndex() is that reIndex() performs a rebuild of all the indexes of a collection.
Is that correct?

I believe that is either an English mistake or it refers to the case where you are rebuilding indexes on replica sets. Either way, the documentation for ensureIndex() specifically states:
http://docs.mongodb.org/manual/reference/method/db.collection.ensureIndex/#behaviors (the one linked to from here: http://docs.mongodb.org/manual/tutorial/build-indexes-on-replica-sets/#build-the-index )
If you call multiple ensureIndex() methods with the same index specification at the same time, only the first operation will succeed, all other operations will have no effect.
So calling it again should not result in a rebuild unless you are rebuilding your indexes.
If you want to rebuild the index you must drop it first and then rerun ensureIndex():
To add or change index options you must drop the index using the dropIndex() method and issue another ensureIndex() operation with the new options.
If you create an index with one set of options, and then issue the ensureIndex() method with the same index fields and different options without first dropping the index, ensureIndex() will not rebuild the existing index with the new options.
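The drop-then-recreate sequence quoted above can be sketched as a small helper. This is a minimal sketch, assuming a mongo shell-style collection object whose dropIndex() and createIndex() methods run synchronously; the helper name changeIndexOptions and the collection and field names in the usage comment are hypothetical. It calls createIndex(), which replaced ensureIndex() from version 3.0 onward.

```javascript
// Hypothetical helper: replace an existing index with one that has new options.
// `collection` is assumed to be a mongo shell collection object (e.g. db.users).
function changeIndexOptions(collection, keys, newOptions) {
  // Re-issuing the create call with different options does not rebuild the
  // existing index, so the old index has to be dropped first.
  collection.dropIndex(keys);
  // Build the index again, this time with the new options.
  return collection.createIndex(keys, newOptions);
}

// Usage in the mongo shell (collection and field names are placeholders):
// changeIndexOptions(db.users, { email: 1 }, { unique: true });
```

Note that the collection has no index on those fields between the two calls, so queries relying on it may be slow until the new build finishes.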

Related

How to manually create empty MongoDB index on a new field?

I have a huge collection, more than 2 TiB of data. When releasing a new feature I add an index on a new field that I am 100% sure doesn't exist in any document, yet MongoDB will still perform a full scan for this field, which may take a long time.
Is there any hack to manually create an empty index file with a valid structure and notify the MongoDB node about it, so that it loads the index into memory and does everything else MongoDB does when an index is created?
Unlike relational database systems, MongoDB also creates indexes on non-existing fields, i.e. it scans the entire collection.
Index creation runs in the background, so it should not do much harm.
See createIndexes
Changed in version 4.2.
For feature compatibility version (fcv) "4.2", all index builds use an optimized build process that holds the exclusive lock only at the beginning and end of the build process. The rest of the build process yields to interleaving read and write operations. MongoDB ignores the background option if specified.
If you run a MongoDB version earlier than 4.2, you may specify the option { background: true }.
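As an illustration of the version rule above, the following sketch picks the createIndex() options from a server version string. It is a simplification: the helper name backgroundOptionFor is hypothetical, and it uses the server version as a stand-in for the feature compatibility version, which is what the quoted documentation actually refers to.

```javascript
// Hypothetical helper: choose createIndex() options from a version string
// such as "4.0.12". Before 4.2 the background option is meaningful; from 4.2
// onward it is ignored, so it is left out.
function backgroundOptionFor(version) {
  const [major, minor] = version.split('.').map(Number);
  const before42 = major < 4 || (major === 4 && minor < 2);
  return before42 ? { background: true } : {};
}

// Usage in the mongo shell (collection and field names are placeholders):
// db.events.createIndex({ newField: 1 }, backgroundOptionFor(db.version()));
```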

What happens when Index creation in MongoDB which is running in background fails

There are existing collections in MongoDB that need to be programmatically updated with new indexes.
So there is an admin web API in my ASP.NET application which, when invoked, calls the create index API in MongoDB. To avoid the impact of the index building process, the build is performed in the background.
It is not known whether the existing data is valid for the index definition, because MongoDB imposes an index key size limit of 1024 bytes, and the values of the indexed fields in some existing documents may add up to more than 1024 bytes.
So the question is: what happens when the index build fails because of this?
Also, how can I programmatically (C# driver) find the status of the index build operation at a later point in time?
According to the MongoDB documentation:
MongoDB will not create an index on a collection if the index entry for an existing document exceeds the index key limit. Previous versions of MongoDB would create the index but not index such documents.
So this means that, background or foreground, an index key that is too long will cause the creation to fail. However, no matter how you create the index, the session issuing the create index command will block. This means that if the index build fails, you should be notified by an exception thrown while awaiting the task returned by the Indexes.CreateManyAsync() method.
Since you are unsure whether the data will be affected by the maximum key length, I strongly suggest you test this in a pre-production environment before attempting it in production. Since production is (I assume) active, the pre-production environment won't match the data exactly (writes are still happening), but it will reduce the possibility of hitting a failed index build in production.
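One rough way to pre-screen data during such a test is to estimate the size of the indexed values for each document. The sketch below is plain Node.js JavaScript working on an in-memory array of documents; the function names and the sample field are hypothetical. It only adds up the UTF-8 byte length of the indexed values, so it underestimates the real index entry, which also carries BSON overhead. Treat it as a way to spot likely offenders against the 1024-byte limit, not as an exact check.

```javascript
// Index key limit discussed above, in bytes.
const INDEX_KEY_LIMIT_BYTES = 1024;

// Rough lower bound for the index key size of one document:
// sums the UTF-8 byte length of each indexed field value.
function estimateKeyBytes(doc, fields) {
  let total = 0;
  for (const field of fields) {
    const value = doc[field];
    if (typeof value === 'string') {
      total += Buffer.byteLength(value, 'utf8');
    } else if (value !== undefined && value !== null) {
      // Non-string values are approximated by their JSON text.
      total += Buffer.byteLength(JSON.stringify(value), 'utf8');
    }
  }
  return total;
}

// Returns the documents whose estimated key size reaches the limit.
function findOversizedCandidates(docs, fields, limit = INDEX_KEY_LIMIT_BYTES) {
  return docs.filter((doc) => estimateKeyBytes(doc, fields) >= limit);
}

// Usage (sample data is made up):
// findOversizedCandidates([{ _id: 1, title: 'x'.repeat(2000) }], ['title']);
```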
Additionally, even if the index can be built, future writes that exceed the key length limit will be rejected. This can be avoided by setting the failIndexKeyTooLong server parameter to false. However, this has its own set of caveats. Specifically:
Setting failIndexKeyTooLong to false is a temporary workaround, not a permanent solution to the problem of oversized index keys. With failIndexKeyTooLong set to false, queries can return incomplete results if they use indexes that skip over documents whose indexed fields exceed the Index Key Length Limit.
I strongly suggest you read and understand those docs before implementing that particular parameter.
In general, many consider it bad practice to build an index at run-time. If the collection is empty, this is not a big deal; however, on a collection with a large amount of data, this can cause the create command to block for quite some time. This is especially true on a busy mongod when creating the index in the background.
If you are building this index on a Replica Set or Sharded Cluster, I strongly recommend you take a look at the documentation specific to those use cases before implementing the build in code.

Is it mandatory to restart MongoDB after adding new index on collection?

A MongoDB collection is slow to return data as it has grown huge over time.
I need to add an index on a few fields and have it take effect in searches immediately. So I am seeking clarification on the following:
Is it mandatory to restart MongoDB after indexing?
If yes, is there any way to add an index without restarting the server? I don't want any downtime...
MongoDB does not need to be restarted after indexing.
However, by default, the createIndex operation blocks reads and writes on the affected database (note that this is the whole database, not only the collection). You may change this behaviour using background mode like this:
db.collectionName.createIndex( { collectionKey: 1 }, { background: true } )
It might seem that your client is blocked when creating the index. The mongo shell session or connection where you are creating the index will block, but if there are more connections to the database, these will still be able to query and operate on the database.
Docs: https://docs.mongodb.com/manual/core/index-creation/
There is no need to restart MongoDB after you add an index!
However, by default an index is created in the foreground.
What does that mean? The MongoDB documentation states: "By default, creating an index on a populated collection blocks all other operations on a database. When building an index on a populated collection, the database that holds the collection is unavailable for read or write operations until the index build completes. Any operation that requires a read or write lock on all databases will wait for the foreground index build to complete".
For potentially long-running index building operations on standalone deployments, the background option should be used. In that case, the MongoDB database remains available during the index building operation.
To create an index in the background, pass { background: true } as the options argument to createIndex().

Does createIndex() build a new index over the whole collection?

What happens when I add a new document to my collection and run the createIndex() function? Does MongoDB build a new index over the whole collection (time and resource consuming)? Or does MongoDB just update the index for the single document (time and resource saving)? I am not sure because I found this in the documentation (3.0):
By default, MongoDB builds indexes in the foreground, which prevents all read and write operations to the database while the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can occur during a foreground index build.
I am asking because I need a dynamic 2dsphere index which will be updated continuously. If the index had to be rebuilt over the whole collection every time, it would take too much time.
Building an index indexes all existing documents and also causes all future documents to be indexed. So if you insert new documents into an already indexed collection (or update the indexed fields of existing documents), the indexes will get updated.
When you call createIndex and an index of the same type over the same fields already exists, nothing happens.
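The behaviour described in this answer can be pictured with a toy in-memory model. This is only an illustration in plain JavaScript, not how MongoDB is implemented; the class ToyCollection and all of its members are hypothetical. It shows one full scan when the index is first created, incremental updates on later inserts, and no work when the same index is requested again.

```javascript
// Toy in-memory model of a collection with single-field indexes.
class ToyCollection {
  constructor() {
    this.docs = [];
    this.indexes = new Map(); // field name -> Map(value -> array of documents)
    this.fullScans = 0;       // counts how many full index builds have run
  }

  // Builds an index over all existing documents, unless it already exists.
  createIndex(field) {
    if (this.indexes.has(field)) return false; // same index exists: nothing happens
    const index = new Map();
    this.indexes.set(field, index);
    this.fullScans += 1;
    for (const doc of this.docs) this.addToIndex(index, field, doc);
    return true;
  }

  // Inserting touches each existing index for the new document only.
  insert(doc) {
    this.docs.push(doc);
    for (const [field, index] of this.indexes) this.addToIndex(index, field, doc);
  }

  addToIndex(index, field, doc) {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }

  // Uses the index when one exists, otherwise scans every document.
  findBy(field, value) {
    const index = this.indexes.get(field);
    return index ? (index.get(value) || []) : this.docs.filter((d) => d[field] === value);
  }
}
```

In this model, inserting a document after createIndex('city') has run adds one entry to the existing index, and a second createIndex('city') call returns without scanning anything.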

Is it OK to call ensureIndex on non-existent collections?

I read somewhere that calling ensureIndex() actually creates a collection if it does not exist. But the index is always on some fields, not all of them, so if I ensure an index on say { name:1 } and then add documents to that collection that have many more fields, the index will work? I know we don't have a schema, coming from RDBMS world I just want to make sure. :) I'd like to create indexes when my website starts, but initially the database is empty. I do not need to have any data prior to ensuring indexes, is that correct?
ensureIndex will create the collection if it does not yet exist. It does not matter if you add documents that don't have the property that the index covers; you just can't use that index to find those documents. The way I understand it is that in versions before 1.7.4 a document that is missing a property for which there is an index will be indexed as though it had that property, but with a null value. In versions after 1.7.4 you can create sparse indexes that don't include these objects at all. The difference is slight but may be significant in some situations.
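For reference, requesting a sparse index looks roughly like the sketch below. It assumes a mongo shell-style collection object; the helper name createSparseIndex and the collection name in the usage comment are hypothetical. On shells older than 3.0, ensureIndex() takes the same two arguments.

```javascript
// Hypothetical helper: create a sparse index, so that documents lacking the
// indexed field are left out of the index entirely.
function createSparseIndex(collection, keys) {
  return collection.createIndex(keys, { sparse: true });
}

// Usage in the mongo shell (collection name is a placeholder):
// createSparseIndex(db.people, { name: 1 });
```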
Depending on the circumstances it may not be a good idea to create indexes when the app starts. Consider the situation where you deploy a new version which adds new indexes when it starts up. In development you will not notice this, as you only have a small database, but in production you may have a huge database and adding the index will take a lot of time. During the index creation your app will hang and you can't serve requests. You can create indexes with the background flag set to true (the syntax depends on which driver you're using), but in most cases it's better to add indexes manually, or as part of a setup script. That way you will have to think before you update indexes.
Deprecated since version 3.0: db.collection.ensureIndex() has been replaced by db.collection.createIndex().
Ref: https://docs.mongodb.com/manual/reference/method/db.collection.ensureIndex/