Unique compound text indexes on MongoDB

I've created a unique text index on two fields, f1 and f2, with db.test.createIndex({"f1":"text","f2":"text"},{unique:true}), which gives this index definition:
{
"v" : 2,
"unique" : true,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "f1_text_f2_text",
"ns" : "test.test",
"weights" : {
"f1" : 1,
"f2" : 1
},
"default_language" : "english",
"language_override" : "language",
"textIndexVersion" : 3
}
When I insert the two documents
db.test.insert({f1:"hello",f2:"there"})
db.test.insert({f1:"hello",f2:"there2"})
I get a duplicate key error
"E11000 duplicate key error collection: test.test index: f1_text_f2_text dup key: { : \"hello\", : 1.1 }"
However, db.test.insert({f1:"hello2",f2:"there"}) works.
Are compound text indexes not supposed to work like regular compound indexes?

Are you sure that you want a unique text index?
If you create a standard compound index:
db.test.createIndex({"f1": 1, "f2": 1}, {unique: true})
Then the following inserts will all be successful:
db.test.insert({f1:"hello",f2:"there"})
db.test.insert({f1:"hello",f2:"there1"})
db.test.insert({f1:"hello",f2:"there2"})
And this insert will then fail with E11000 duplicate key error collection:
db.test.insert({f1:"hello",f2:"there"})
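To make the rule behind those four inserts concrete, here is a minimal plain-JavaScript sketch. It is not MongoDB code, and `makeUniqueCompoundIndex` is a hypothetical helper: it only mimics the idea that a unique compound index allows one entry per (f1, f2) pair.

```javascript
// Simplified model of a unique compound index on { f1: 1, f2: 1 }.
// Hypothetical helper, not part of any MongoDB API.
function makeUniqueCompoundIndex(fields) {
  const seen = new Set();
  return function insert(doc) {
    // A missing field is treated as null, as in a non-sparse index.
    const key = JSON.stringify(fields.map((f) => (f in doc ? doc[f] : null)));
    if (seen.has(key)) {
      throw new Error("E11000 duplicate key (simulated): " + key);
    }
    seen.add(key);
    return true;
  };
}

const insertDoc = makeUniqueCompoundIndex(["f1", "f2"]);
insertDoc({ f1: "hello", f2: "there" });  // accepted
insertDoc({ f1: "hello", f2: "there1" }); // accepted: f2 differs
insertDoc({ f1: "hello", f2: "there2" }); // accepted: f2 differs

let rejected = false;
try {
  insertDoc({ f1: "hello", f2: "there" }); // same pair as the first insert
} catch (e) {
  rejected = true;
}
console.log(rejected); // true
```

Only the whole pair has to be unique; repeating "hello" in f1 is fine as long as f2 differs.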
You don't have to create a text index in order to index string fields. A text index has a very specific role in supporting text searches but not all string searches require a text index. So, if you must ...
Facilitate 'quick' text matches covering both f1 and f2
Enforce uniqueness across f1 and f2
... then I suspect you will need to create two indexes:
db.test.createIndex({"f1":"text", "f2":"text"})
db.test.createIndex({"f1": 1, "f2": 1}, {unique: true})
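As for why the unique text index rejected the second insert: the dup key in the error, { : "hello", : 1.1 }, suggests the text index stores one key per indexed term rather than one key per (f1, f2) pair, so two documents sharing a term collide. The sketch below is a rough plain-JavaScript model of that reading, not MongoDB's actual tokenizer. It ignores stemming and the score part of the key, and it assumes "there" is dropped as an English stop word (which would explain why the third insert in the question succeeds). `textTerms` and `firstSharedTerm` are hypothetical names.

```javascript
// Assumption: "there" is removed as an English stop word before indexing.
const ASSUMED_STOP_WORDS = new Set(["there"]);

// Collect the terms a text index over the given fields might hold for a document.
function textTerms(doc, fields) {
  const terms = new Set();
  for (const f of fields) {
    if (typeof doc[f] !== "string") continue;
    for (const word of doc[f].toLowerCase().split(/\s+/)) {
      if (word && !ASSUMED_STOP_WORDS.has(word)) terms.add(word);
    }
  }
  return terms;
}

// Return the first term both documents share, or null if there is none.
function firstSharedTerm(a, b, fields) {
  const termsA = textTerms(a, fields);
  for (const t of textTerms(b, fields)) {
    if (termsA.has(t)) return t;
  }
  return null;
}

const textFields = ["f1", "f2"];
const firstDoc = { f1: "hello", f2: "there" };
console.log(firstSharedTerm(firstDoc, { f1: "hello", f2: "there2" }, textFields)); // hello
console.log(firstSharedTerm(firstDoc, { f1: "hello2", f2: "there" }, textFields)); // null
```

Under these assumptions, a unique text index behaves like "no two documents may share a term", which is rarely what you want.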

Related

Mongodb query does not use prefix on compound index with text field

I've created the following index on my collection:
db.myCollection.createIndex({
user_id: 1,
name: 'text'
})
If I try to see the execution plan of a query containing both fields, like this:
db.getCollection('campaigns').find({
user_id: ObjectId('xxx')
,$text: { $search: 'bla' }
}).explain('executionStats')
I get the following results:
...
"winningPlan" : {
"stage" : "TEXT",
"indexPrefix" : {
"user_id" : ObjectId("xxx")
},
"indexName" : "user_id_1_name_text",
"parsedTextQuery" : {
"terms" : [
"e"
],
"negatedTerms" : [],
"phrases" : [],
"negatedPhrases" : []
},
"inputStage" : {
"stage" : "TEXT_MATCH",
"inputStage" : {
"stage" : "TEXT_OR",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"user_id" : 1.0,
"_fts" : "text",
"_ftsx" : 1
},
"indexName" : "user_id_1_name_text",
"isMultiKey" : true,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "backward",
"indexBounds" : {}
}
}
}
}
...
As stated in the documentation, MongoDB can use index prefixes to perform indexed queries.
Since user_id is a prefix for the index above, I'd expect that a query only by user_id would use the index, but if I try the following:
db.myCollection.find({
user_id: ObjectId('xxx')
}).explain('executionStats')
I get:
...
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"user_id" : {
"$eq" : ObjectId("xxx")
}
},
"direction" : "forward"
},
...
So it is not using the index at all and is performing a full collection scan instead.
In general, MongoDB can use index prefixes to support queries; however, compound indexes that include geospatial or text fields are a special case of sparse compound indexes. If a document does not include a value for any of the text index field(s) in a compound index, it will not be included in the index.
In order to ensure correct results for a prefix search, an alternative query plan will be chosen over the sparse compound index:
If a sparse index would result in an incomplete result set for queries and sort operations, MongoDB will not use that index unless a hint() explicitly specifies the index.
Setting up some test data in MongoDB 3.4.5 to demonstrate the potential problem:
db.myCollection.createIndex({ user_id:1, name: 'text' }, { name: 'myIndex'})
// `name` is a string; this document will be included in a text index
db.myCollection.insert({ user_id:123, name:'Banana' })
// `name` is a number; this document will NOT be included in a text index
db.myCollection.insert({ user_id:123, name: 456 })
// `name` is missing; this document will NOT be included in a text index
db.myCollection.insert({ user_id:123 })
Then, forcing the compound text index to be used:
db.myCollection.find({user_id:123}).hint('myIndex')
The result only includes the single document with the indexed text field name, rather than the three documents that would be expected:
{
"_id": ObjectId("595ab19e799060aee88cb035"),
"user_id": 123,
"name": "Banana"
}
This exception should be more clearly highlighted in the MongoDB documentation; watch/upvote DOCS-10322 in the MongoDB issue tracker for updates.
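The effect can be sketched without a server. The plain-JavaScript model below reuses the three test documents from above and assumes, as described, that only documents with a string `name` get an entry in the compound text index. The function names are hypothetical, not MongoDB APIs.

```javascript
const docs = [
  { user_id: 123, name: "Banana" }, // string: indexed
  { user_id: 123, name: 456 },      // number: not indexed
  { user_id: 123 },                 // missing: not indexed
];

// Only documents whose text field holds a string get an index entry.
function buildTextIndexEntries(collection, textField) {
  return collection.filter((d) => typeof d[textField] === "string");
}

// What a COLLSCAN with a user_id filter returns.
function collectionScan(collection, userId) {
  return collection.filter((d) => d.user_id === userId);
}

// What a scan forced through the sparse compound text index returns.
function hintedIndexScan(collection, userId) {
  return buildTextIndexEntries(collection, "name").filter(
    (d) => d.user_id === userId
  );
}

console.log(collectionScan(docs, 123).length);  // 3
console.log(hintedIndexScan(docs, 123).length); // 1
```

Because the index-backed result can be smaller than the true result, the planner falls back to a plan that cannot miss documents.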
This behavior is due to text indexes being sparse by default:
For a compound index that includes a text index key along with keys of
other types, only the text index field determines whether the index
references a document. The other keys do not determine whether the
index references the documents or not.
The query filter is not referencing the text index field, so the query planner won't consider this index as it can't be certain that the full result set of documents will be returned by the index.

Add field that is unique index to collection in MongoDB

I'm trying to add a username field to documents in a 'users' collection, and I'd like it to be a unique index. (So far, we've been using email addresses for login but we'd like to add a username field as well.) However, running db.users.ensureIndex({username:1},{unique:true}) fails because mongo considers all the unset usernames to be duplicates and therefore not unique. Anybody know how to get around this?
Show the current users and username if they have one:
> db.users.find({},{_id:0,display_name:1,username:1})
{ "display_name" : "james" }
{ "display_name" : "sammy", "username" : "sammy" }
{ "display_name" : "patrick" }
Attempt to make the 'username' field a unique index:
> db.users.ensureIndex({username:1},{unique:true})
{
"err" : "E11000 duplicate key error index: blend-db1.users.$username_1 dup key: { : null }",
"code" : 11000,
"n" : 0,
"connectionId" : 272,
"ok" : 1
}
It doesn't work because both james and patrick have no username, which the index treats as username:null.
Let's set patrick's username to 'patrick' to eliminate the duplicate null value.
> db.users.update({display_name: 'patrick'}, { $set: {username: 'patrick'}});
> db.users.ensureIndex({username:1},{unique:true})
> db.users.getIndexes()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "blend-db1.users",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"username" : 1
},
"unique" : true,
"ns" : "blend-db1.users",
"name" : "username_1"
}
]
Now it works!
To clarify the question, what I'd like is to be able to make username a unique index without having to worry about all the documents that have username still set to null.
Try creating a unique sparse index:
db.users.ensureIndex({username:1},{unique:true,sparse:true})
As per the docs:
You can combine the sparse index option with the unique indexes option
so that mongod will reject documents that have duplicate values for a
field, but that ignore documents that do not have the key.
Note, however, that this only helps for documents that lack the field entirely. Documents that have the field with an explicit null value are still indexed, so more than one of them will violate the unique constraint.
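The difference between a missing field and an explicit null can be modelled in a few lines of plain JavaScript. This is only a sketch of the rule, not MongoDB code, and `makeSparseUniqueIndex` is a hypothetical helper.

```javascript
// Simplified model of a { unique: true, sparse: true } index on one field.
function makeSparseUniqueIndex(field) {
  const seen = new Set();
  return function insert(doc) {
    // Sparse: a document without the field gets no index entry at all.
    if (!Object.prototype.hasOwnProperty.call(doc, field)) return "skipped";
    // A field that is present, even with value null, is indexed.
    const key = JSON.stringify(doc[field]);
    if (seen.has(key)) return "duplicate";
    seen.add(key);
    return "indexed";
  };
}

const insertUser = makeSparseUniqueIndex("username");
console.log(insertUser({ display_name: "james" }));                    // skipped
console.log(insertUser({ display_name: "patrick" }));                  // skipped
console.log(insertUser({ display_name: "sammy", username: "sammy" })); // indexed
console.log(insertUser({ display_name: "a", username: null }));        // indexed
console.log(insertUser({ display_name: "b", username: null }));        // duplicate
```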

mongo _id field duplicate key error

I have a collection whose _id field is an IP address stored as a String.
I'm using mongoose, but here's the error on the console:
$ db.servers.remove()
$ db.servers.insert({"_id":"1.2.3.4"})
$ db.servers.insert({"_id":"1.2.3.5"}) <-- Throws dup key: { : null }
Likely, it's because you have an index that requires a unique value for one of the fields as shown below:
> db.servers.remove()
> db.servers.ensureIndex({"name": 1}, { unique: 1})
> db.servers.insert({"_id": "1.2.3"})
> db.servers.insert({"_id": "1.2.4"})
E11000 duplicate key error index: test.servers.$name_1 dup key: { : null }
You can see your indexes using getIndexes() on the collection:
> db.servers.getIndexes()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "test.servers",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"name" : 1
},
"unique" : true,
"ns" : "test.servers",
"name" : "name_1"
}
]
I was confused by exactly the same error today, and later figured it out. It was because I removed an indexed property from a mongoose schema, but did not drop the corresponding index from MongoDB. The error message is in fact saying that the new document has an indexed property whose value is null (because it is not in the JSON).
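One way to spot such a leftover index is to compare the getIndexes() output with the fields the schema still declares. The sketch below does that in plain JavaScript on the index list shown above. `findStaleIndexes` is a hypothetical helper, and the empty schema field list is an assumption based on the question, where documents only carry _id.

```javascript
// Hypothetical helper: names of indexes that reference fields
// which are no longer part of the schema.
function findStaleIndexes(indexes, schemaFields) {
  const known = new Set(["_id", ...schemaFields]);
  return indexes
    .filter((ix) => Object.keys(ix.key).some((field) => !known.has(field)))
    .map((ix) => ix.name);
}

// Shape copied from the getIndexes() output above.
const serverIndexes = [
  { v: 1, key: { _id: 1 }, ns: "test.servers", name: "_id_" },
  { v: 1, key: { name: 1 }, unique: true, ns: "test.servers", name: "name_1" },
];

console.log(findStaleIndexes(serverIndexes, []));       // [ 'name_1' ]
console.log(findStaleIndexes(serverIndexes, ["name"])); // []
```

A stale index found this way can then be dropped in the shell with db.servers.dropIndex("name_1").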

Adding unique index in MongoDB ignoring nulls

I'm trying to add a unique index on a group of fields in MongoDB. Not all of those fields are present in all of the documents, and I'd like to index only the documents that have all of the fields.
So, I'm trying to run this:
db.mycollection.ensureIndex({date:1, type:1, reference:1}, {sparse: true, unique: true})
But I get an E11000 duplicate key error for documents that are missing the 'type' field (there are many of them and they duplicate each other, but I just want the index to ignore them).
Is this possible in MongoDB, or is there some workaround?
There are multiple people who want this feature and because there is no workaround for this, I would recommend voting up feature request Jira tickets in jira.mongodb.org:
SERVER-785 - support filtered (partial) indexes
SERVER-2193 - sparse indexes only support single field
Note that because 785 would provide a way to enforce this feature, 2193 is marked "won't fix" so it may be more productive to vote up and add your comments to 785.
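For readers on newer servers: partial indexes, the feature tracked by SERVER-785, are available from MongoDB 3.2 onward through the partialFilterExpression index option. The sketch below only builds the options object in plain JavaScript; `partialUniqueOptions` is a hypothetical helper, and the field names come from the question. Keep in mind that { $exists: true } also matches fields set explicitly to null, so this is a starting point under that assumption rather than a complete answer.

```javascript
// Build index options so that only documents containing every listed
// field are indexed, and therefore checked for uniqueness.
function partialUniqueOptions(fields) {
  const filter = {};
  for (const f of fields) {
    filter[f] = { $exists: true };
  }
  return { unique: true, partialFilterExpression: filter };
}

const indexKeys = { date: 1, type: 1, reference: 1 };
const indexOptions = partialUniqueOptions(Object.keys(indexKeys));
console.log(JSON.stringify(indexOptions, null, 2));

// In the mongo shell (MongoDB 3.2+) these would be passed as:
// db.mycollection.createIndex(indexKeys, indexOptions)
```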
You can guarantee uniqueness by using an upsert operation instead of an insert. If a matching document already exists it is updated; otherwise a new document is inserted:
test:Mongo > db.test4.ensureIndex({ a : 1, b : 1, c : 1}, {sparse : 1})
test:Mongo > db.test4.update({a : 1, b : 1}, {$set : { d : 1}}, true, false)
test:Mongo > db.test4.find()
{ "_id" : ObjectId("51ae978960d5a3436edbaf7d"), "a" : 1, "b" : 1, "d" : 1 }
test:Mongo > db.test4.update({a : 1, b : 1, c : 1}, {$set : { d : 1}}, true, false)
test:Mongo > db.test4.find()
{ "_id" : ObjectId("51ae978960d5a3436edbaf7d"), "a" : 1, "b" : 1, "d" : 1 }
{ "_id" : ObjectId("51ae97b960d5a3436edbaf7e"), "a" : 1, "b" : 1, "c" : 1, "d" : 1 }
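In the legacy update(query, update, upsert, multi) form used above, the third argument (true) turns on upsert and the fourth (false) turns off multi-document updates. The plain-JavaScript sketch below models that update-or-insert behaviour for simple equality queries only. `upsertSet` is a hypothetical helper, not a MongoDB API, and it says nothing about concurrent writers, where as far as I know only a unique index can reliably prevent duplicates.

```javascript
// Simplified model of update(query, { $set: setFields }, true, false):
// update the first matching document, or insert one built from the query.
function upsertSet(collection, query, setFields) {
  const matches = (doc) =>
    Object.keys(query).every((k) => doc[k] === query[k]);
  const existing = collection.find(matches);
  if (existing) {
    Object.assign(existing, setFields);
    return "updated";
  }
  collection.push({ ...query, ...setFields });
  return "inserted";
}

const test4 = [];
console.log(upsertSet(test4, { a: 1, b: 1 }, { d: 1 }));       // inserted
console.log(upsertSet(test4, { a: 1, b: 1, c: 1 }, { d: 1 })); // inserted: no doc has c
console.log(upsertSet(test4, { a: 1, b: 1 }, { d: 2 }));       // updated
console.log(test4.length);                                     // 2
```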

MongoDB: how to add a field to the _id index to make it a compound index

I can't remove the _id index, why?
When I try running the dropIndexes command, it removes all indexes but not the _id index.
Doing 'db.runCommand' doesn't work either:
> db.runCommand({dropIndexes:'fs_files',index:{_id:1}})
{ "nIndexesWas" : 2, "errmsg" : "may not delete _id index", "ok" : 0 }
That fails.
Can I use _id as one of the fields in a composite index?
I couldn't find anything online, and the ensureIndex command can't do it:
db.fs_files.ensureIndex({'_id':1, 'created':1});
The above command just created a new composite index. I haven't found a similar 'create index' command.
Is the default _id index a unique index?
According to getIndexes, it is not a unique index:
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "gridfs.fs_files",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"created" : 1
},
"unique" : true,
"ns" : "gridfs.fs_files",
"name" : "created_1"
}
There is also a createIndex command, in addition to ensureIndex.
For example:
db.<coll>.createIndex({foo:1})
You cannot delete the index on "_id" in MongoDB.
Please see the documentation here
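On the uniqueness question: the default _id index always enforces uniqueness, even though getIndexes() does not print "unique" : true for it. An index on { _id: 1, created: 1 } is simply a second, separate index and does not replace the default one. The helper below is a small plain-JavaScript sketch for reading getIndexes() output with that rule in mind; `enforcesUniqueness` is a hypothetical name.

```javascript
// Hypothetical helper: decide whether an index spec from getIndexes()
// enforces uniqueness, treating the default { _id: 1 } index as unique.
function enforcesUniqueness(indexSpec) {
  const keyFields = Object.keys(indexSpec.key);
  const isDefaultIdIndex = keyFields.length === 1 && keyFields[0] === "_id";
  return isDefaultIdIndex || indexSpec.unique === true;
}

// Shapes copied from the getIndexes() output in the question.
const idIndex = { v: 1, key: { _id: 1 }, ns: "gridfs.fs_files", name: "_id_" };
const createdIndex = {
  v: 1, key: { created: 1 }, unique: true,
  ns: "gridfs.fs_files", name: "created_1",
};
// The extra compound index created by ensureIndex({'_id':1, 'created':1}).
const compoundIndex = { v: 1, key: { _id: 1, created: 1 }, name: "_id_1_created_1" };

console.log(enforcesUniqueness(idIndex));       // true
console.log(enforcesUniqueness(createdIndex));  // true
console.log(enforcesUniqueness(compoundIndex)); // false
```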