I am experimenting with creating a text index in MongoDB for several fields in a sub-document. These fields are not the same from document to document. According to the documentation, I would create a normal text index like so:
db.collection.ensureIndex({
subject: "text",
content: "text"
});
In my case, I want the index on all fields in the fs.files collection at db.fs.files.metadata. I've tried this:
db.fs.files.ensureIndex({'metadata.$**': 'text'});
I don't believe this has worked, as searching with db.fs.files.runCommand('text'... returns zero results, and db.fs.files.stats() shows the index as a very small size (and I have ~35k documents in this collection).
How can I create a text index on field values of a subdocument where the keys are not known ahead of time?
If you create a {'$**': 'text'} index on the parent document, it will index the subdocument fields too. According to the docs, a text index only covers string content, so it will skip the file data but will include the name and contentType.
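To make it clearer what "only affects text" means here, below is a rough sketch in plain JavaScript of which values a {'$**': 'text'} index would consider: every string anywhere in the document, nested or not. This is only a toy model, not MongoDB's actual implementation; collectStrings and the sample document (including its metadata keys) are made up for illustration.

```javascript
// Toy model only: walks a document and gathers every string value,
// roughly what a wildcard text index ({'$**': 'text'}) would look at.
function collectStrings(value, out = []) {
  if (typeof value === 'string') {
    out.push(value);
  } else if (Array.isArray(value)) {
    value.forEach((item) => collectStrings(item, out));
  } else if (value !== null && typeof value === 'object') {
    Object.values(value).forEach((item) => collectStrings(item, out));
  }
  // Numbers, booleans and null are ignored, just as non-string data is skipped.
  return out;
}

// Hypothetical fs.files document; the metadata keys vary per document.
const sampleFile = {
  filename: 'report.pdf',
  contentType: 'application/pdf',
  length: 48213,
  metadata: { author: 'Ada', tags: ['finance', 'q3'], pages: 12 },
};

console.log(collectStrings(sampleFile));
// prints [ 'report.pdf', 'application/pdf', 'Ada', 'finance', 'q3' ]
```

The metadata keys never have to be named up front, which is the point of indexing on '$**' at the top level rather than on 'metadata.$**'.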
Related
I'm planning to add a Collection to a mongodb database that will have a text field that should be unique for each Document. Lookups from this Collection will almost always be based on this field. This field can contain as many as 100+ chars.
My question is, should this field be the _id field, or should I just add an index for it? What would the performance impact for either approach be?
I suggest you use your unique text as the _id.
It will reduce data size and eliminate an index. Here is the reference. The 9th page will guide you.
For example: If I had a db collection called Stores, and each store document has a list of the items they sell, and stores generally share items, then how would mongodb build an index on that?
Would it build a btree index on all possible items, with each leaf of that tree (each item) referencing the documents which contain it?
Background:
I'm trying to perform queries like this using an index:
db.store.find({merchandise:{$exists:true}}) // where 'merchandise' is a list
db.store.find()[merchandise].count()
Would an index on 'merchandise' help me?
If not, is my only option to create a separate meta field holding the size of 'merchandise' and index that?
Schema:
{ _id: 123456,
name: "Macys",
merchandise: [ 248651234564, 54862101248, 12450184, 1256001456 ]
}
Going by your document sample, if you build your index on merchandise it will be a multikey index, and that index will cover every item in the array. See the Multikey Indexes section in here.
If merchandise is an array of subdocuments, indexing over merchandise will put the index on all fields of the subdocuments in the array. With the index you can make queries like
db.store.find({"merchandise": 248651234564}) and it will retrieve all documents having merchandise 248651234564.
As for the count of merchandise, you can only get the size of the merchandise field of one document at a time, like db.store.find()[index].merchandise.length. So creating a separate field for the merchandise size and indexing it is a feasible option if you want to run queries based on merchandise size.
Hope this helps
If you index a field that contains an array, MongoDB indexes each value in the array separately, in a multikey index. When you have 4 values inside an array, each will act as a key in the index and point to the document(s) that contain it.
You can use multikey indexes to index fields within objects embedded in arrays. That means, in your array, you can index a specific field in each document. For example: stuffs.thing : 1.
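As a rough mental model of the above, here is a small plain-JavaScript sketch of a multikey-style index: one entry per distinct array element, each pointing at the _id of every document containing it. It is only an illustration (a real index is a B-tree, and this sketch simply skips documents that lack the field); buildMultikeyIndex and the second store are hypothetical.

```javascript
// Toy model of a multikey index: array element -> list of document _ids.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const raw = doc[field];
    if (raw === undefined) continue; // simplification: skip documents without the field
    const values = Array.isArray(raw) ? raw : [raw];
    // A Set avoids listing the same document twice for a repeated element.
    for (const value of new Set(values)) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const stores = [
  { _id: 123456, name: 'Macys', merchandise: [248651234564, 54862101248, 12450184, 1256001456] },
  { _id: 123457, name: 'Hypothetical Store', merchandise: [12450184, 999] }, // made-up second store
];

const merchandiseIndex = buildMultikeyIndex(stores, 'merchandise');
console.log(merchandiseIndex.get(12450184)); // prints [ 123456, 123457 ]
console.log(merchandiseIndex.size);          // prints 5
```

Note that nothing in this structure records how many items each store holds, which is why a query on array size does not benefit from it.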
Read more about Multikey Indexes
Whether you need these indexes would depend on:
How many queries rely on that specific field?
How many updates, inserts hit that specific field (array)?
How many items will that array contain?
...
Remember that indexes slow writes as they need to be updated as well. I'd consider an explain on my queries to measure performance.
MongoDB can store documents with different fields in one collection.
How will indexes work then? If I create an index on a field that is not present in all documents, will the documents that lack it not be indexed?
Documents without the field in an index will be indexed as having no value for that field. You probably want to review this: http://docs.mongodb.org/manual/core/indexes/
If you want to not include documents that don't have the key in the index, you can use a sparse index: http://docs.mongodb.org/manual/administration/indexes/#sparse-indexes
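The difference between the two behaviours can be pictured with a toy model in plain JavaScript. This is a sketch of the idea only, not how MongoDB stores index entries; buildIndex, the nickname field and the sample documents are made up.

```javascript
// Toy model: a regular index records a null key for documents lacking the field,
// while a sparse index leaves those documents out entirely.
function buildIndex(docs, field, { sparse = false } = {}) {
  const entries = [];
  for (const doc of docs) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) continue; // sparse: no entry for this document
    entries.push({ key: hasField ? doc[field] : null, id: doc._id });
  }
  return entries;
}

const people = [
  { _id: 1, nickname: 'ann' },
  { _id: 2 },                  // no nickname field at all
  { _id: 3, nickname: 'cy' },
];

console.log(buildIndex(people, 'nickname').length);                   // prints 3
console.log(buildIndex(people, 'nickname', { sparse: true }).length); // prints 2
```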
I currently have a collection of small documents. Each document has an indexed geospatial field and *the default _id is never used in any query*. There will never be more than one document related to a particular geo location. I think it makes sense to override the default _id, and use the geospatial data for this somehow.
Question is, how do you use geospatial data as the unique id? Is it a case of creating a flat string from the geo field? E.g. 'x123456y123456'?
The _id field is the unique identifier for each document and thus is a required field. The _id field is generated automatically on document creation if one is not provided. If you can provide this geospatial value when creating the document, you should be able to use the string as you suggested; you cannot use an array as the _id value. However, please be aware that once a document is created the _id becomes unchangeable. This means that using the _id field as a meaningful index of geospatial data may not be of much value.
Have a look here for more info on the _id field and here for some information about creating geospatial indexes in Mongo.
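A minimal sketch of the flat-string idea from the question, assuming the coordinates have already been scaled to integers (as in 'x123456y123456'). The helper name geoId and the integer-only rule are assumptions made for this example, not anything MongoDB prescribes.

```javascript
// Hypothetical helper: builds a flat string _id from integer coordinates.
// Assumes x and y are already integers so the same location always yields
// the same string.
function geoId(x, y) {
  if (!Number.isInteger(x) || !Number.isInteger(y)) {
    throw new TypeError('coordinates must be integers');
  }
  return 'x' + x + 'y' + y;
}

// The document keeps its normal geo field; only the _id is derived from it.
const place = { _id: geoId(123456, 123456), loc: [123456, 123456] };
console.log(place._id); // prints x123456y123456
```

The loc array is still what a geospatial index would be built on; the string _id only enforces one document per location.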
I have a collection "companies" and the documents have this structure:
id, name, address, branch, city
I want to add a keyword field that will have an index, so I can do a full-text search, but how can I add a field to each document?
Thanks for the help
There's no schema in MongoDB, so you don't have to add a field to every document.
Just start writing new documents with this field, or update old documents when you have this value for them.
As for indexing, you can now leverage sparse indexes. They will be more efficient if most of your documents don't have this field.
Also, you might want this keyword field to be an array. It can be handled more efficiently than a string.
If you want to add a field with the same value to all documents in a collection, you can use this:
db.collection.update({}, // update all documents
{$set : {keyword : []}}, // or another value
false, // is upsert?
true) // is multi-update?
When you do a $set, you can't use values from other fields. So if this new value is going to be a function of other fields, you have no other option, but to loop through the documents and update them one by one.
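For the one-by-one case, the per-document work could look something like the sketch below. buildKeywords is a hypothetical helper, the sample company is made up, and the tokenising rule (lowercase, split on anything that is not a-z or 0-9) is just one possible choice; the shell loop at the end is an untested outline.

```javascript
// Hypothetical helper: derive a keyword array from the existing company fields.
function buildKeywords(company) {
  const text = [company.name, company.address, company.branch, company.city]
    .filter((value) => typeof value === 'string') // ignore missing or non-string fields
    .join(' ');
  // Lowercase, split on non-alphanumeric ASCII, drop empty pieces.
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  return [...new Set(words)]; // de-duplicate, keeping first-seen order
}

const sampleCompany = { name: 'Acme Tools', address: '12 Main St', branch: 'Hardware', city: 'Springfield' };
console.log(buildKeywords(sampleCompany));
// prints [ 'acme', 'tools', '12', 'main', 'st', 'hardware', 'springfield' ]

// In the mongo shell the loop might then look like this (untested sketch):
// db.companies.find().forEach(function (doc) {
//   db.companies.update({ _id: doc._id }, { $set: { keyword: buildKeywords(doc) } });
// });
```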