Does Key order matter in a MongoDB BSON doc? - mongodb

I know certain commands need the hashmap / dictionary to be ordered, but does the key order in the actual BSON document in MongoDB matter, and would the index still work?
E.g.
db.people.ensureIndex({LName:1, FName:1});
Would it work on both:
{LName:"abc", FName:"def"},
{FName:"ghi", LName:"jkl"}
?
Thanks

The order of a document's properties does not affect indexing.
You can see this for yourself by running this query:
db.people.find({LName: "abc"}).explain()
and then this query:
db.people.find({LName: "jkl"}).explain()
You should see that MongoDB uses the index in both cases (the cursor property should be something like "BtreeCursor LName_1_FName_1").
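As a rough illustration of why field order is irrelevant here: an index entry is built by looking fields up by name, in the order given by the index definition, not in the order they appear in the document. The sketch below is a toy model in plain JavaScript, not MongoDB's actual implementation, and `extractIndexKey` is a hypothetical helper name.

```javascript
// Toy model (not MongoDB internals): build an index key by reading
// fields by name, in the order the index definition lists them.
function extractIndexKey(doc, keyPattern) {
  return Object.keys(keyPattern).map(function (field) {
    // A field missing from the document is treated as null here.
    return Object.prototype.hasOwnProperty.call(doc, field) ? doc[field] : null;
  });
}

var keyPattern = { LName: 1, FName: 1 };
var first = extractIndexKey({ LName: "abc", FName: "def" }, keyPattern);
var second = extractIndexKey({ FName: "ghi", LName: "jkl" }, keyPattern);

console.log(first);  // [ 'abc', 'def' ]
console.log(second); // [ 'jkl', 'ghi' ]
```

Both documents yield a key of the form [LName, FName], whichever way round the fields were stored.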

Related

Referring to MongoDB indexes with find() / aggregate()

Is there any way to force a find() or aggregate() query to use a particular existing index in MongoDB? I am asking about the scenario where a collection has more than one compound index.
Yes, $hint is there for that. As mentioned in the documentation, you can use it like this:
db.users.find().hint( { age: 1 } )
The argument is the definition (key pattern) of the index; hint() also accepts the index name as a string. This query would force the use of the index on the age field. I'm not sure, though, whether it works for aggregate() calls as well.
Aggregation does not support $hint. There is an open item in MongoDB's issue tracker:
https://jira.mongodb.org/browse/SERVER-7944
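When several compound indexes exist, it can help to keep the forced index in one place. The following is a minimal sketch, assuming a collection object that exposes find(...).hint(...) as the shell's collections do; `findWithHint` is a hypothetical helper, not part of MongoDB.

```javascript
// Hypothetical helper: run a find() and force a specific index.
// `collection` is assumed to expose find(filter).hint(keyPattern),
// as a shell collection such as db.users does.
function findWithHint(collection, filter, indexKeyPattern) {
  return collection.find(filter).hint(indexKeyPattern);
}

// Example call in the shell, assuming an index { age: 1 } exists on users:
//   findWithHint(db.users, { age: { $gt: 21 } }, { age: 1 }).explain()
```

Calling explain() on the returned cursor shows whether the hinted index was actually used.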

Mongodb id on bulk insert performance

I have a class/object that has a GUID, and I want to use that field as the _id when it is saved to MongoDB. Is it possible to use a value other than an ObjectId?
Are there any performance considerations when doing a bulk insert with a custom _id field? Is _id indexed? If I set _id to a different field, would it slow down the bulk insert? I'm inserting about 10 million records.
1) Yes, you can use that field as the _id. You don't mention which API (if any) you are using to insert the documents, so if you were doing the insertion from the command line, the command would be:
db.collection.insert({_id : <BSONString_version_of_your_guid_value>, field1 : value1, ...});
It doesn't have to be a BSON string. Change it to whichever BSON type most closely matches your GUID's original type (except an array; arrays aren't allowed as the value of the _id field).
2) As far as I know, providing your own ids does affect the performance of db.collection.insert, especially in bulk, but if the ids are sorted there shouldn't be a performance loss. The reason, quoting:
The structure of index is a B-tree. ObjectIds have an excellent
insertion order as far as the index tree is concerned: they are always
increasing, meaning they are always inserted at the right edge of
B-tree. This, in turn, means that MongoDB only has to keep the right
edge of the B-Tree in memory.
Conversely, a random value in the _id field means that _ids will be
inserted all over the tree. Then the machine must move a page of the
index into memory, update a tiny piece of it, then probably ignore it
until it slides out of memory again. This is less efficient.
:from the book `50 Tips and Tricks for MongoDB Developers`
The tip's title says - "Override _id when you have your own simple, unique id." Clearly it is better to use your id if you have one and you don't need the properties of an ObjectId. And it is best if your ids are increasing for the reason stated above.
3) There is a default index on _id field by MongoDB.
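The right-edge argument quoted in point 2 can be illustrated with a toy model. The sketch below is not how MongoDB stores its index: it assumes the index is a single sorted array cut into pages of 100 keys, and that only the most recently touched page is in memory. All names in it are hypothetical. It counts how often an insert lands on a different page than the previous insert did.

```javascript
// Toy model: the index is one sorted array cut into fixed-size pages,
// and only the most recently touched page counts as "in memory".
var PAGE_SIZE = 100;

// First position in `sorted` whose value is >= key (binary search).
function lowerBound(sorted, key) {
  var lo = 0, hi = sorted.length;
  while (lo < hi) {
    var mid = (lo + hi) >> 1;
    if (sorted[mid] < key) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// Count how often an insert lands on a different page than the previous one.
function countPageSwitches(keys) {
  var index = [], lastPage = -1, switches = 0;
  keys.forEach(function (key) {
    var pos = lowerBound(index, key);
    index.splice(pos, 0, key);
    var page = Math.floor(pos / PAGE_SIZE);
    if (page !== lastPage) { switches += 1; lastPage = page; }
  });
  return switches;
}

// Deterministic Fisher-Yates shuffle driven by a small linear
// congruential generator, so repeated runs give the same order.
function shuffled(keys, seed) {
  var out = keys.slice(), state = seed >>> 0;
  for (var i = out.length - 1; i > 0; i--) {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    var j = Math.floor((state / 4294967296) * (i + 1));
    var tmp = out[i]; out[i] = out[j]; out[j] = tmp;
  }
  return out;
}

var increasing = [];
for (var k = 0; k < 5000; k++) increasing.push(k);

var sequentialSwitches = countPageSwitches(increasing);
var randomSwitches = countPageSwitches(shuffled(increasing, 42));
console.log("increasing ids:", sequentialSwitches, "page switches"); // 50
console.log("random ids:", randomSwitches, "page switches");
```

With increasing keys every insert goes to the last page, so a new page is touched only once per 100 inserts. With shuffled keys most inserts land on a different page than the one before.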
So...
Yes. It is possible to use other types than ObjectId, including GUID that will be saved as BinData.
Yes, there are considerations. It's better if your _id is always increasing (like a growing number, or an ObjectId); otherwise inserts land all over the index instead of at its right edge. If you plan on using sharding, the _id values should also be evenly distributed.
_id indeed has an index automatically.
It depends on the type you choose. See section 2.
Conclusion: It's better to keep using ObjectId unless you have a good reason not to.

mongodb: why indexOnly=false when collection is empty

Let's say I have an empty db without any collections. Then I run db.qqq.ensureIndex({a:1}).
In the output of db.qqq.find().explain() I see BasicCursor and "indexOnly" : false. That seems OK.
db.qqq.find({a:"somevalue"}).explain() outputs BtreeCursor a_1, but it also reports "indexOnly" : false. Why does this happen?
Why isn't the given index enough for MongoDB to fulfill my query?
UPD: OK, so I need to use a projection, since not all fields are in my index. But what I don't understand is this: if Mongo can see from the index that no documents match the query, why should it scan the actual documents?
You need to add a projection to that query; index-only means it gets ALL data from the index. MongoDB cannot use an index-only cursor if you want to get the full document back. For example:
db.qqq.find({a:"somevalue"},{a:1,_id:0}).explain()
Should work.
MongoDB doesn't know that there are no documents until it searches for them, so it will have to at least check in the index if it can. A "BasicCursor" with "n=0" is not really a bad thing of course as no actual documents are read (or index elements, as there are none).
Also, if you want to use a covered index, you need to use a projection so that only fields are returned that are actually part of the index. You do that with:
db.qqq.find({a:"somevalue"},{a:1,_id:0}).explain()
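The rule behind indexOnly can be written down as a small check. This is a simplified sketch, assuming a single index on top-level fields, plain equality filters, and no arrays or embedded documents; `canBeCovered` is a hypothetical helper, not a MongoDB function.

```javascript
// Simplified check: can this query be answered from the index alone?
// Assumes top-level fields only and plain { field: value } filters.
function canBeCovered(indexKeyPattern, filter, projection) {
  var indexed = Object.keys(indexKeyPattern);
  var isIndexed = function (field) { return indexed.indexOf(field) !== -1; };

  // Without a projection the whole document is returned, so it must be fetched.
  if (!projection) return false;

  // Every field used in the filter must be part of the index.
  if (!Object.keys(filter).every(isIndexed)) return false;

  // Fields the projection returns: those set to 1, plus _id unless excluded.
  var returned = Object.keys(projection).filter(function (f) {
    return projection[f] === 1;
  });
  var hasInclusion = returned.length > 0;
  if (projection._id !== 0 && returned.indexOf("_id") === -1) returned.push("_id");

  // An exclusion-only projection still returns fields outside the index.
  return hasInclusion && returned.every(isIndexed);
}

var withProjection = canBeCovered({ a: 1 }, { a: "somevalue" }, { a: 1, _id: 0 });
var withoutProjection = canBeCovered({ a: 1 }, { a: "somevalue" });
var idNotExcluded = canBeCovered({ a: 1 }, { a: "somevalue" }, { a: 1 });

console.log(withProjection);    // true
console.log(withoutProjection); // false
console.log(idNotExcluded);     // false
```

The third case shows why _id:0 matters: _id is returned by default, and it is not part of the {a:1} index.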

Add _id when ensuring index?

I am building a webapp using Codeigniter (PHP) and MongoDB.
I am creating indexes and have one question.
If I am querying on three fields (_id, status, type) and want to
create an index do I need to include _id when ensuring the index like this:
db.comments.ensureIndex({_id: 1, status : 1, type : 1});
or will this do?
db.comments.ensureIndex({status : 1, type : 1});
You would need to explicitly include _id in your ensureIndex call if you wanted to include it in your compound index. But because filtering by _id already provides selectivity of a single document that's very rarely the right thing to do. I think it would only make sense if your documents are very large and you're trying to use covered indexes.
MongoDB will currently only use one index per query with the exception of $or queries. If your common query will always be searching on those three fields (_id, status, type) then a compound index would be helpful.
From within the DB shell you can use the explain() command on your query to get information on the indexes used.
You don't need to explicitly create an index on the _id field; it's done automatically. See the MongoDB documentation:
The _id Index
For all collections except capped collections, an index is automatically created for the _id field. This index is special and cannot be deleted. The _id index enforces uniqueness for its keys (except for some situations with sharding).

Dot Notation in Node JS MongoDb queries

Is it possible to use Dot Notation when dealing with nested documents?
http://www.mongodb.org/display/DOCS/Dot+Notation+(Reaching+into+Objects)
I'm trying to query the results of a map/reduce and therefore need to
run a query like this:
find({'_id.page' : 'ThisPage', '_id.user' : 'AUser'})
Trying this in Node code returns no rows but the same query works as
expected in mongodb shell.
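Dot notation isn't the only way to reach inside documents in a query; you can match the whole embedded document instead, although that form requires an exact match of the sub-document, including the order of its fields.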
find({'_id.page' : 'ThisPage', '_id.user' : 'AUser'})
could instead be
find({_id: {page: 'ThisPage', user: 'AUser'}})
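The two query forms are not interchangeable, because matching a whole embedded document is sensitive to field order while dot notation is not. The sketch below uses toy matchers in plain JavaScript to show the difference. It is not the server's or the driver's code, and `matchesDotted` and `matchesEmbedded` are hypothetical names.

```javascript
// Toy matchers (not the server's code) for the two query styles.

// Dot notation: follow each dotted path and compare the value found there.
function matchesDotted(doc, query) {
  return Object.keys(query).every(function (path) {
    var value = path.split(".").reduce(function (current, part) {
      return current == null ? undefined : current[part];
    }, doc);
    return value === query[path];
  });
}

// Embedded-document equality: the whole sub-document must be identical,
// including field order (JSON.stringify keeps insertion order for these keys).
function matchesEmbedded(doc, field, subDocument) {
  return JSON.stringify(doc[field]) === JSON.stringify(subDocument);
}

// A map/reduce style result whose _id was emitted as { user, page }.
var stored = { _id: { user: "AUser", page: "ThisPage" }, value: 3 };

var dotted = matchesDotted(stored, { "_id.page": "ThisPage", "_id.user": "AUser" });
var wrongOrder = matchesEmbedded(stored, "_id", { page: "ThisPage", user: "AUser" });
var sameOrder = matchesEmbedded(stored, "_id", { user: "AUser", page: "ThisPage" });

console.log(dotted);     // true
console.log(wrongOrder); // false
console.log(sameOrder);  // true
```

If the embedded form returns nothing, it is worth checking the field order in which the sub-document was stored.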
It is very possible, I've done it before.
Why do you have nested documents under your _id property? I'm not sure what your use case is, but that seems a bit strange. _id is a special property that is always the unique id of the document, so it might be getting treated specially by the driver (i.e. it doesn't expect there to be sub-documents). Maybe try putting your sub-documents under a different property name.