Does createIndex() build a new index over the whole collection? - mongodb

What happens when I add a new document to my collection and run the createIndex() function? Does MongoDB build a new index over the whole collection (time- and resource-consuming)? Or does MongoDB just update the index for the single document (time- and resource-saving)? I am not sure because I found this in the documentation (3.0):
By default, MongoDB builds indexes in the foreground, which prevents all read and write operations to the database while the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can occur during a foreground index build.
I am asking because I need a dynamic 2dsphere index that will be updated continuously. If the index had to be rebuilt over the whole collection every time, it would take too much time.

Building an index indexes all existing documents and also causes all future documents to be indexed. So if you insert new documents into an already indexed collection (or update the indexed fields of existing documents), the indexes are updated automatically.
When you call createIndex and an index of the same type over the same fields already exists, nothing happens.
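A minimal PyMongo-style sketch of that behaviour, assuming a GeoJSON field named `loc`; the field name and both helper names are made up for illustration. The index is requested once, and later inserts go through a plain `insert_one`, with the server updating the index entry for that single document as part of the write.

```python
def ensure_geo_index(collection):
    """Request a 2dsphere index on the (assumed) `loc` field.

    Calling this again with the same specification is a no-op on the
    server side; it does not rebuild the existing index.
    """
    return collection.create_index([("loc", "2dsphere")])


def add_place(collection, name, lon, lat):
    """Insert one document; no explicit re-indexing step is needed."""
    doc = {
        "name": name,
        # GeoJSON points list longitude first, then latitude.
        "loc": {"type": "Point", "coordinates": [lon, lat]},
    }
    return collection.insert_one(doc)
```

With a real server, `collection` would be a PyMongo collection such as `MongoClient()["test"]["places"]` (database and collection names are placeholders). Call `ensure_geo_index` once at startup and `add_place` for every new document.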

Related

How to manually create empty MongoDB index on a new field?

I have a huge collection, more than 2 TiB of data. When releasing a new feature I add an index on a new field that I am 100% sure doesn't exist in any document. MongoDB will still perform a full scan for this field, which may take a long time.
Is there any hack to manually create an empty index file with a valid structure and notify the MongoDB node about it, so that it loads the index into memory and does everything else MongoDB does when an index is created?
Unlike a relational DBMS, MongoDB also creates indexes on non-existing fields, i.e. it scans the entire collection.
In current versions the index build does not block reads and writes for most of its duration, so it should not do much harm.
See createIndexes
Changed in version 4.2.
For feature compatibility version (fcv) "4.2", all index builds use an optimized build process that holds the exclusive lock only at the beginning and end of the build process. The rest of the build process yields to interleaving read and write operations. MongoDB ignores the background option if specified.
If you run a MongoDB version earlier than 4.2, you can specify the option { background: true }
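As a rough sketch of that version rule in PyMongo terms: the helpers below (hypothetical names) pass `background=True` only when the server is older than 4.2. Strictly speaking the documentation quoted above speaks of the feature compatibility version, so comparing the server version is a simplifying assumption.

```python
def build_index_kwargs(server_version):
    """Return extra create_index options for a (major, minor, ...) version."""
    # Before 4.2 the default was a blocking foreground build, so ask for a
    # background build. From 4.2 on the option is ignored, so leave it out.
    if tuple(server_version[:2]) < (4, 2):
        return {"background": True}
    return {}


def create_field_index(collection, field, server_version):
    """Create an ascending index on `field` with version-appropriate options."""
    options = build_index_kwargs(server_version)
    return collection.create_index([(field, 1)], **options)
```

The version tuple could, for instance, be taken from the `versionArray` entry of `client.server_info()`. Either way the build still scans the whole collection; the option only changes how much it blocks other operations.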

Pymongo stop index updating while inserting new documents

Is there a way to prevent index updating when inserting new documents (in a for loop) ?
I have a multikey index, and the collection is about 2 million documents, so removing the index and recreating it is not practical, since I'm inserting documents in a loop and I do not want an index for the newly inserted ones.
No, updates to indexes are done synchronously as part of the write operation itself.
What is your goal here though, to not index those new documents at all? If so, perhaps creating an appropriate Partial Index would be the correct approach here?
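A hedged sketch of the partial-index idea with PyMongo: the field names `tags` and `indexed` are invented for illustration and do not come from the question. Only documents matching the filter get entries in the index, so documents inserted with `indexed` set to `False` (or without the field) are skipped.

```python
def create_partial_multikey_index(collection):
    """Index the (assumed) array field `tags`, but only for flagged documents."""
    return collection.create_index(
        [("tags", 1)],
        # Documents that do not match this filter get no index entries.
        partialFilterExpression={"indexed": True},
    )


def insert_unindexed(collection, docs):
    """Insert documents marked so that the partial index ignores them."""
    marked = [dict(doc, indexed=False) for doc in docs]
    return collection.insert_many(marked)
```

Two caveats: a query has to include the filter condition (here `indexed: True`) for the planner to use a partial index, and turning an existing full index into a partial one still means dropping it and building the new one.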

Does MongoDB not update index entries upon document deletion?

We're using MongoDB 4.0 with Spring Data MongoDB, and we noticed that when doing some housekeeping by batch-deleting millions of documents with the external tool Studio 3T, all index entries on all indexes stayed untouched. I read lots of MongoDB documentation regarding this but couldn't find any reference to this behavior.
If this code does not trigger an index update, then which code does?
Query query = new Query();
query.addCriteria(Criteria.where("modifiedAt").lte(LocalDateTime.now()));
// Does not remove index entries
mongoTemplate.findAllAndRemove(query, MyModel.class);
// Does not either
mongoTemplate.remove(query, MyModel.class);
// Does not either
mongoTemplate.findAll(MyModel.class).forEach(mongoTemplate::remove);
Having an effective mechanism for removing documents for housekeeping purposes, with their index entries removed at the same time, is important to us because the index size is growing and no longer fits in memory. We would therefore have to scale up our hardware, which is unnecessarily expensive.
I know there are ways to trigger this manually, e.g. dropping indexes and recreating them, or using the compact administrative command. However, in a 24/7 online shop use case this seems rather impractical.

Does MongoDB ensureIndex perform a rebuild?

I'm reading the MongoDB documentation.
At this url
http://docs.mongodb.org/manual/tutorial/build-indexes-on-replica-sets/
I read
"Create the new index using the ensureIndex() in the mongo shell, or comparable method in your driver.
This operation will create or rebuild the index on this mongod instance"
Do I understand this correctly?
If the index is already present, does MongoDB perform a rebuild?
So the difference from "reIndex()" is that reIndex() performs a rebuild of all the indexes of a collection.
Is that correct?
I believe that is either an English mistake or refers to the case where you are rebuilding indexes on replica sets. Either way, the documentation for ensureIndex() specifically states:
http://docs.mongodb.org/manual/reference/method/db.collection.ensureIndex/#behaviors (the one linked to from here: http://docs.mongodb.org/manual/tutorial/build-indexes-on-replica-sets/#build-the-index )
If you call multiple ensureIndex() methods with the same index specification at the same time, only the first operation will succeed, all other operations will have no effect.
So calling it again should not result in a rebuild unless you are rebuilding your indexes.
If you want to rebuild the index you must drop it first and then rerun ensureIndex():
To add or change index options you must drop the index using the dropIndex() method and issue another ensureIndex() operation with the new options.
If you create an index with one set of options, and then issue the ensureIndex() method with the same index fields and different options without first dropping the index, ensureIndex() will not rebuild the existing index with the new options.
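The drop-then-recreate sequence described above could look roughly like this in PyMongo, using `create_index`, the current counterpart of `ensureIndex()`. The helper name is hypothetical, and the rebuild scans the whole collection, so this is something to schedule deliberately rather than call casually.

```python
def recreate_index(collection, keys, **new_options):
    """Drop the index on `keys` and build it again with `new_options`.

    `keys` is a list of (field, direction) pairs, e.g. [("email", 1)].
    """
    # Re-issuing create_index with different options does not change an
    # existing index, so the old one has to be dropped first.
    collection.drop_index(keys)
    return collection.create_index(keys, **new_options)
```

For example, `recreate_index(coll, [("email", 1)], unique=True)` would replace a plain index on an assumed `email` field with a unique one.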

How often shall we reindex the geospatial data in mongodb?

I wonder whether geospatial data in MongoDB must be reindexed after new geo-data has been inserted, so that the new data can be searched. Say we have a document which looks like:
{user:'a',loc:[363.236,-45.365]}, and it is indexed. Later on, I insert document b, which looks like: {user:'b',loc:[42.3654,-56.3]}. In order to search, do I have to reindex the collection (using ensureIndex()) every time a new document is inserted? Will frequent reindexing affect the overall application performance?
Thanks.
You only need to call ensureIndex() once; after that MongoDB maintains the index on every insert. I'm not 100% sure the index is also maintained for deletes, though I imagine it must be.
You can defragment an index and rebuild it to make it smaller, hence the existence of the functionality. A useful post:
http://jasonwilder.com/blog/2012/02/08/optimizing-mongodb-indexes/