MongoDB indexes issue - mongodb

MongoDB can store documents with different fields in one collection.
How then indexes will work? If I create index on field that presents not in all documents, the documents which don't have that will not be indexed?

Documents without the field in an index will be indexed as having no value for that field. You probably want to review this: http://docs.mongodb.org/manual/core/indexes/
If you want to not include documents that don't have the key in the index, you can use a sparse index: http://docs.mongodb.org/manual/administration/indexes/#sparse-indexes

Related

Get Number of Documents in MongoDB Index

As per the title, I would like to know if there is a way to get the number of documents in a MongoDB index.
To be clear, I am not looking for either of the following:
How to get the number of documents in a collection -- .count().
How to get the size of an index -- .stats().
An index references all of the documents in its collection unless the index is a sparse index or a partial index. From the docs:
Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null value. The index skips over any document that is missing the indexed field. The index is “sparse” because it does not include all documents of a collection. By contrast, non-sparse indexes contain all documents in a collection, storing null values for those documents that do not contain the indexed field.
Partial indexes only index the documents in a collection that meet a specified filter expression
So ...
The answer for non sparse and non partial indexes is db.collection.count()
The answer for sparse and partial indexes could be inferred by running a query with no criteria, hinting on that index and then counting the results. For example:
db.collection.find().hint('index_name_here').count()

Making two fields compound index in mongoDb and update the document

I'm using mongoDb in our project and got stuck at a point.
I'm using bulkOperation of mongoDb to save a list of Objects.
I need to make two fields (mac and gatewaytime ) as a compound index on a collection that make up an unique combination for documents in collection.
I wanted to update the whole document if any combination of mac and gatewaytime found with a the new document.
This scenario is similer to the _id index created with each document. Just the difference is that here I need to make a compound index of these two fields.
I found that we can make an compund index as Unique but this just reject any document if found to be duplicate. I need this duplicate document to be updated with the new document in my case.
If question is not undestandable, please let me know freely.

Does Mongodb automatically updates indexed items? [duplicate]

Lets say you have a collection with a field called "primary_key",
{"primary_key":"1234", "name":"jimmy", "lastname":"page"}
and I have an index on "primary_key".
This collection has millions of rows, I want to see how expensive is to change primary_key for one of the records. Does it trigger a reindex of the entire table? or does it just reindex the changed record? in either case is that expensive to do?
Updating an indexed field in mongodb causes an update of the index (or indices if you have more than one) that use it. It does not "reindex". Shouldn't be all that expensive - effectively you will delete the old entry and insert a new one.
This document has a fair amount of detail on mongodb indexes:
http://docs.mongodb.org/master/MongoDB-indexes-guide.pdf
BTW, keep in mind that there is one special field, _id, that mongodb uses as it's primary key
_id
A field required in every MongoDB document. The _id field must have a unique value. You can think of the _id field as the document’s
primary key. If you create a new document without an _id field,
MongoDB automatically creates the field and assigns a unique BSON
ObjectId.
You cannot update the _id field.

Does mongodb reindex if you change the field that it is used in index?

Lets say you have a collection with a field called "primary_key",
{"primary_key":"1234", "name":"jimmy", "lastname":"page"}
and I have an index on "primary_key".
This collection has millions of rows, I want to see how expensive is to change primary_key for one of the records. Does it trigger a reindex of the entire table? or does it just reindex the changed record? in either case is that expensive to do?
Updating an indexed field in mongodb causes an update of the index (or indices if you have more than one) that use it. It does not "reindex". Shouldn't be all that expensive - effectively you will delete the old entry and insert a new one.
This document has a fair amount of detail on mongodb indexes:
http://docs.mongodb.org/master/MongoDB-indexes-guide.pdf
BTW, keep in mind that there is one special field, _id, that mongodb uses as it's primary key
_id
A field required in every MongoDB document. The _id field must have a unique value. You can think of the _id field as the document’s
primary key. If you create a new document without an _id field,
MongoDB automatically creates the field and assigns a unique BSON
ObjectId.
You cannot update the _id field.

How does mongodb index lists

For example: If I had a db collection called Stores, and each store document has a list of the items they sell, and stores generally share items, then how would mongodb build an index on that?
Would it build a btree index on all possible items and then on each leaf of that tree (each item) will reference the documents which contain it?
Background:
I'm trying to perform queries like this using an index:
db.store.find({merchandise:{$exists:true}}) // where 'merchandise' is a list
db.store.find()[merchandise].count()
would an index on 'merchandise' help me?
If not, is my only option creating a separate meta field on 'merchandise' size, and index that?
Schema:
{ _id: 123456,
name: Macys
merchandise: [ 248651234564, 54862101248, 12450184, 1256001456 ]
}
From your document sample if you build your index on merchandise it will be multikey index and that index will be on every item on the array. See Multikey Indexes section in here.
If merchandise is an array of subdocuments, indexing over merchandise will put the index on all field of subdocument in the array. With index you can make queries like
db.store.find("merchandise":248651234564) and it will retrieve all document having merchandise 248651234564
For getting count of merchandise, you can get only get the size of merchandise field of one document like db.store.find()[index].merchandise.length. So creating a seperate field on merchandise size and indexing is a feasible option, if you want to run queries based on merchandise size.
Hope this helps
If you index a field that contains an array, MongoDB indexes each value in the array separately, in a multikey index. When you have 4 documents inside an array, each will act as a key in the index and point to the mentioned document(s).
You can use multikey indexes to index fields within objects embedded in arrays. That means, in your array, you can index a specific field in each document. For example: stuffs.thing : 1.
Read more about Multikey Indexes
Whether you need these indexes would depend on:
How many queries rely on that specific field?
How many updates, inserts hit that specific field (array)?
How many items will that array contain?
...
Remember that indexes slow writes as they need to be updated as well. I'd consider an explain on my queries to measure performance.