Mongodb setting unique field - mongodb

TENANT
{ "_ID" : 11, NAME : "ruben", OPERATION :[{OPERATION_ID: 100, NAME : "Check"}] }
How can I make OPERATION_ID unique, so that duplicate values and null values are rejected, like a primary key?

If you want the OPERATION_IDs to be unique across all tenants, you can create a unique index (the dotted field name must be quoted):
db.tenants.ensureIndex( { "OPERATION.OPERATION_ID" : 1 }, { unique:true, sparse:true } );
If you want the OPERATION_IDs to be unique per tenant, so that two tenants can both have OPERATION_ID 100 but no tenant can have OPERATION_ID 100 twice, you have to add the _id of the tenant to the index so that any given combination of _id and OPERATION_ID is unique.
db.tenants.ensureIndex( { "_id": 1, "OPERATION.OPERATION_ID" : 1 }, { unique:true, sparse:true } );

Adding a unique index on OPERATION.OPERATION_ID will ensure that no two distinct documents will contain an element in OPERATION with the same OPERATION_ID.
If you want to prevent a single document from having two elements in OPERATION with the same OPERATION_ID, you can't use unique indexes; you will have to use set update operators (such as $set and $addToSet). You could turn OPERATION into a subdocument keyed by OPERATION_ID, like so:
{ "_ID" : 11, NAME : "ruben", OPERATION : {"100" : {NAME : "Check"} }}
Then you can enforce uniqueness by issuing updates with $set; for example:
db.<collection>.update({NAME: "ruben"}, {$set: {"OPERATION.100.NAME": "Uncheck"}})
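Moving existing documents from the array form to the keyed form could be sketched as below. This is plain JavaScript with no database involved; the helper name keyOperationsById is hypothetical, and it assumes every array element carries an OPERATION_ID.

```javascript
// Hypothetical helper: turns the OPERATION array into a subdocument keyed
// by OPERATION_ID. If two elements share an id, the later one overwrites
// the earlier one, which is exactly the uniqueness the keyed form gives.
function keyOperationsById(tenant) {
  const keyed = {};
  for (const op of tenant.OPERATION) {
    const { OPERATION_ID, ...rest } = op;
    keyed[String(OPERATION_ID)] = rest;
  }
  return { ...tenant, OPERATION: keyed };
}

const converted = keyOperationsById({
  _ID: 11,
  NAME: "ruben",
  OPERATION: [{ OPERATION_ID: 100, NAME: "Check" }],
});
console.log(JSON.stringify(converted));
```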
Regarding null values: MongoDB doesn't feature non-null constraints on fields (it doesn't even force a given field to have a single type), so you will have to ensure in your application that null values aren't inserted.
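A minimal sketch of such an application-side check, assuming the array form of OPERATION shown in the question. The function name validateOperations is made up for illustration; it runs on the document before it is sent to MongoDB and is not part of any driver.

```javascript
// Hypothetical helper: returns a list of problems found in the OPERATION
// array of one tenant document (an empty list means the document is fine).
function validateOperations(tenant) {
  const problems = [];
  const seen = new Set();
  for (const op of tenant.OPERATION || []) {
    const id = op.OPERATION_ID;
    if (id === null || id === undefined) {
      problems.push("missing OPERATION_ID");
    } else if (seen.has(id)) {
      problems.push("duplicate OPERATION_ID " + id);
    } else {
      seen.add(id);
    }
  }
  return problems;
}

const tenant = {
  _id: 11,
  NAME: "ruben",
  OPERATION: [
    { OPERATION_ID: 100, NAME: "Check" },
    { OPERATION_ID: 100, NAME: "Uncheck" },
    { NAME: "NoId" },
  ],
};
console.log(validateOperations(tenant));
```

The caller would refuse to insert or update when the returned list is not empty.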

Related

Join 2 Mongo DB tables with Id and ObjectId

My scenario: I need to join the two tables (collections) below in MongoDB, on the condition
testScenarioId(table 1) = _id (table 2)
Table 1:
{
"_id" : ObjectId("58516a6838fdb54d744ba070"),
"_class" : "com.TestResults",
"testScenarioId" : "581cef861892ad1eb7d124dd",
"runId" : 314,
"status" : "passed"
}
Table 2:
{
"_id" : ObjectId("57f41cb9319ed34079df8a2d"),
"environment" : "STAGE",
"component" : "platform",
"scenarioName" : "ABC-1234",
}
I am able to do this when the local field and the foreign field have the same type, but not in the case above.
MongoDB does not support type coercion in $lookup, so a field of type ObjectId cannot be looked up against a foreign field of type string.
What you need to do is store testScenarioId as an ObjectId when saving it.
I tried using $type in the aggregation, but it is not supported. So currently there is no way to do it directly in the aggregation pipeline.
If you want to join the two collections, you will have to insert "testScenarioId" in ObjectId form.
At the moment you insert the id in string form, and the $lookup aggregation stage does not match that form against an ObjectId.
Why ObjectId: the _id on one side of the join is an ObjectId, while "testScenarioId" on the other side was stored as a string. The two values never compare equal, so the query returns no joined data.
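Until the stored type is fixed, one possible workaround (a different technique from $lookup, not something the answers above propose) is to join in application code by comparing the ids in string form. The sketch below is plain JavaScript with made-up in-memory data; the _id values are written as hex strings standing in for ObjectIds, and joinByScenario is a hypothetical helper.

```javascript
// Hypothetical in-memory data. `_id` is shown as a hex string standing in
// for an ObjectId; the ids are chosen so that the first result matches.
const testResults = [
  { testScenarioId: "57f41cb9319ed34079df8a2d", runId: 314, status: "passed" },
  { testScenarioId: "000000000000000000000000", runId: 315, status: "failed" },
];
const testScenarios = [
  { _id: "57f41cb9319ed34079df8a2d", environment: "STAGE", scenarioName: "ABC-1234" },
];

// Client-side join: index the scenarios by the string form of _id, then
// attach the matching scenario (or null) to every result.
function joinByScenario(results, scenarios) {
  const byId = new Map(scenarios.map((s) => [String(s._id), s]));
  return results.map((r) => ({ ...r, scenario: byId.get(r.testScenarioId) || null }));
}

const joined = joinByScenario(testResults, testScenarios);
console.log(joined.map((r) => [r.runId, r.scenario && r.scenario.scenarioName]));
```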

MongoDB update with matching shard key and multiple=true

MongoDB requires that every update() operation on a sharded collection that specifies the 'multi:false' option include the shard key in the query condition, so that the query hits only one specific shard. If no shard key is found and 'multi:false' is set, it returns this error (see http://docs.mongodb.org/manual/core/sharded-cluster-query-router/):
update does not contain _id or shard key for pattern
I am switching my code to use a sharded collection. My code uses update() with 'multi:true' by default, and I don't want to change that default in order to avoid the error above. My question is: if I include the shard key in an update() with 'multi:true', will mongos be smart enough to route the query to the specific shard using the shard key and ignore 'multi:true'?
EDIT:
Check out this code, which confirms what @wdberkeley said.
Version 2.4:
https://github.com/mongodb/mongo/blob/v2.4/src/mongo/s/strategy_shard.cpp#L941
Version 2.6:
https://github.com/mongodb/mongo/blob/v2.6/src/mongo/s/chunk_manager_targeter.cpp#L250
Yes. If you have the shard key in the query, like
> db.myShardedCollection.update({ "shardKey" : 22, "category" : "frogs" }, { "$set" : { "category" : "amphibians" } }, { "multi" : true })
then mongos can use the shard key to direct the update to just the shard whose key range includes the value 22. Whether updating 1 document or 1000, all the affected documents have shardKey = 22, so all of them will be found on the shard whose range contains 22. This would also work in the case of a range query like
> db.myShardedCollection.update({ "shardKey" : { "$gte" : 22 }, "category" : "frogs" }, { "$set" : { "category" : "amphibians" } }, { "multi" : true })
except for hashed shard keys.
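The routing idea can be illustrated with a toy model. This is not how mongos is implemented; the chunk table, the shard names, and the targetShards function are all invented for the sketch, and it only covers a ranged (non-hashed) shard key with either an equality or a $gte condition.

```javascript
// Hypothetical chunk table for a ranged shard key: each chunk covers [min, max).
const chunks = [
  { min: -Infinity, max: 10, shard: "shardA" },
  { min: 10, max: 50, shard: "shardB" },
  { min: 50, max: Infinity, shard: "shardC" },
];

// condition is either a plain value (equality) or { $gte: value }.
// Returns the shards owning at least one chunk that overlaps the condition.
function targetShards(condition) {
  const isRange = typeof condition === "object" && condition !== null;
  const lower = isRange ? condition.$gte : condition;
  const upper = isRange ? Infinity : condition;
  const shards = chunks
    .filter((c) => c.max > lower && c.min <= upper)
    .map((c) => c.shard);
  return [...new Set(shards)];
}

console.log(targetShards(22));
console.log(targetShards({ $gte: 22 }));
```

In this model an equality on the shard key selects one shard regardless of how many documents match, while a range selects every shard whose chunks overlap the range.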

using an Object (subdocument) with varying fields as _id

Our (edX) original Mongo persistence representation uses a bson dictionary (aka object or subdocument) as the _id value (see, mongo/base.py). This id is missing a field.
Can some documents' _id values have more subfields than others without totally screwing up indexing?
What's the best way to handle existing documents without the additional field? Remove and replace them? Try to query w/ new _id format and if fails fall over to query w/o the new field? Try to query with both new and old _id format in one query?
To be more specific, the current format is
{'_id': {
'tag': 'i4x', // yeah, it's always this fixed value
'org': your_school_x,
'course': a_catalog_number,
'category': the_xblock_type,
'name': uniquifier_within_course
}}
I need to add 'run': the_session_or_term_for_course_run or 'course_id': org/course/run.
Documents within a collection need not have values for _id that are of the same structure. Hence, it is perfectly acceptable to have the following documents within a collection:
> db.foo.find()
{ "_id" : { "a" : 1 } }
{ "_id" : { "a" : 1, "b" : 2 } }
{ "_id" : { "c" : 1, "b" : 2 } }
Note that because the index is on only _id, only queries that specify a value for _id will use the index:
db.foo.find({_id:1}) // will use the index on _id
db.foo.find({_id:{state:"Alaska"}}) // will use the index on _id
db.foo.find({"_id.a":1}) // will NOT use the index on _id
Note also that only a complete match of the "value" of _id will return a document. So this returns no documents for the collection above:
db.foo.find({_id:{c:1}})
Hence, for your case, you are welcome to add fields to the object that is the value for the _id key, and it does not matter if documents have different structures. But if you are hoping to query the collection by _id and have it be efficient, you are going to need to add indexes for all relevant subparts that might be used in isolation. That is not very efficient.
_id is no different than any other key in this regard.
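The complete-match rule can be mimicked in plain JavaScript. The sketch below uses JSON.stringify as a stand-in for the server's comparison of embedded documents, which is an assumption made for illustration only; like an exact match on an embedded document, it is sensitive to both the set of fields and their order.

```javascript
// Stand-in for an exact embedded-document comparison: two values match only
// if they have the same fields, in the same order, with the same values.
function exactMatch(a, b) {
  return JSON.stringify(a) === JSON.stringify(b);
}

// The three documents from the example collection above.
const docs = [
  { _id: { a: 1 } },
  { _id: { a: 1, b: 2 } },
  { _id: { c: 1, b: 2 } },
];

// Hypothetical helper mimicking find({_id: id}) on that collection.
function findById(id) {
  return docs.filter((d) => exactMatch(d._id, id));
}

console.log(findById({ c: 1 }).length);       // partial _id
console.log(findById({ c: 1, b: 2 }).length); // complete _id
console.log(findById({ b: 2, c: 1 }).length); // same fields, different order
```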

Get position of selected document in collection [mongoDB]

How to get position (index) of selected document in mongo collection?
E.g.
this document: db.myCollection.find({"id":12345})
has index 3 in myCollection
myCollection:
id: 12340, name: 'G'
id: 12343, name: 'V'
id: 12345, name: 'A'
id: 12348, name: 'N'
If your requirement is to find the position of the document irrespective of any order, that is not possible, as MongoDB does not store the documents in a specific order.
However, if you want to know the index based on some field, say _id, you can use this method.
If you are strictly following auto-increments in your _id field, you can count all the documents that have a value less than that _id, say n; then n + 1 is the index of the document based on _id.
n = db.myCollection.find({"id": { "$lt" : 12345}}).count() ;
This would also be valid if documents are deleted from the collection.
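The counting approach can be tried out without a database. The sketch below applies the same n + 1 rule to an in-memory array holding the sample data from the question; positionById is a hypothetical helper, and returning -1 for an unknown id is a choice made for this sketch.

```javascript
// Sample data from the question, held in a plain array.
const myCollection = [
  { id: 12340, name: "G" },
  { id: 12343, name: "V" },
  { id: 12345, name: "A" },
  { id: 12348, name: "N" },
];

// Position of a document = (number of documents with a smaller id) + 1.
// Returns -1 when no document has the given id.
function positionById(docs, id) {
  if (!docs.some((d) => d.id === id)) return -1;
  const n = docs.filter((d) => d.id < id).length;
  return n + 1;
}

console.log(positionById(myCollection, 12345));
```

Because only smaller ids are counted, removing a document with a larger id does not change the result, and removing one with a smaller id lowers the position by one.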
As far as I know, there is no single command to do this, and it is impossible in the general case (see Derick's answer). However, using count() on a query against an ordered id field seems to work. Warning: this assumes that there is a reliably ordered field, which is difficult to achieve with concurrent writers. In this example _id is used; however, this will only work in the single-writer case:
MongoDB shell version: 2.0.1
connecting to: test
> use so_test
switched to db so_test
> db.example.insert({name: 'A'})
> db.example.insert({name: 'B'})
> db.example.insert({name: 'C'})
> db.example.insert({name: 'D'})
> db.example.insert({name: 'E'})
> db.example.insert({name: 'F'})
> db.example.find()
{ "_id" : ObjectId("4fc5f040fb359c680edf1a7b"), "name" : "A" }
{ "_id" : ObjectId("4fc5f046fb359c680edf1a7c"), "name" : "B" }
{ "_id" : ObjectId("4fc5f04afb359c680edf1a7d"), "name" : "C" }
{ "_id" : ObjectId("4fc5f04dfb359c680edf1a7e"), "name" : "D" }
{ "_id" : ObjectId("4fc5f050fb359c680edf1a7f"), "name" : "E" }
{ "_id" : ObjectId("4fc5f053fb359c680edf1a80"), "name" : "F" }
> db.example.find({_id: ObjectId("4fc5f050fb359c680edf1a7f")})
{ "_id" : ObjectId("4fc5f050fb359c680edf1a7f"), "name" : "E" }
> db.example.find({_id: {$lte: ObjectId("4fc5f050fb359c680edf1a7f")}}).count()
5
>
This should also be fairly fast if the queried field is indexed. The example is in mongo shell, but count() should be available in all driver libs as well.
This method might be very slow, but it is straightforward. You can pass your usual query to find(). I simply loop over all the documents and check each one against a condition to find the matching record. Here I am checking the _id field, but you can check any other field, or multiple fields.
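var docIndex = 0;
db.url_list.find({}, {"_id": 1}).forEach(function(doc) {
    docIndex++;
    if ("5801ed58a8242ba30e8b46fa" == doc["_id"]) {
        print('document position is...' + docIndex);
        // Note: this only exits the callback for the current document;
        // forEach still visits the remaining documents.
        return false;
    }
});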
There is no way for MongoDB to return this, as it does not keep documents in order in the database, just as MySQL, for example, does not number its rows.
The ObjectID trick from jhonkola will only work if only one client creates new elements, as the ObjectIDs are generated on the client side, with the first part being a timestamp. There is no guaranteed order if different clients talk to the same server. Still, I would not rely on this.
I also don't quite understand what you are trying to do though, so perhaps mention that in your question? I can then update the answer.
Restructure your collection to include the position of each entry, i.e. {'id': 12340, 'name': 'G', 'position': 1}; then search the collection (myCollection) using the desired position as the query.
The queries I use that return the entire collection all use sort to get a reproducible order; find.sort.forEach works with the script above to get the correct index.

How do you get around missing values in a unique index using mongo db?

The mongo documentation states that "When a document is saved to a collection with unique indexes, any missing indexed keys will be inserted with null values. Thus, it won't be possible to insert multiple documents missing the same indexed key."
So is it impossible to create a unique index on an optional field? Should I create a compound index with say a userId as well to solve this? In my specific case I have a user collection that has an optional embedded oauth object.
e.g.
>db.users.ensureIndex( { "name":1, "oauthConnections.provider" : 1, "oauthConnections.providerId" : 1 } );
My sample user
{ name: "Bob"
,pwd: "myPwd"
,oauthConnections: [
{
"provider":"Facebook",
"providerId" : "12345",
"key":"blah"
}
,{
"provider":"Twitter",
"providerId" : "67890",
"key":"foo"
}
]
}
I believe that this is possible: you can have an index that is both sparse and unique. This way, nonexistent values never make it into the index, hence they can't be duplicates.
Caveat: this is not possible with compound indexes. I'm not quite sure about your question. You're citing a part of the documentation that concerns compound indexes -- there, missing values will be inserted, but from your question I guess you're not looking for a solution with compound indexes?
Here's a sample:
> db.Test.insert({"myId" : "1234", "string": "foo"});
> show collections
Test
system.indexes
>
> db.Test.find();
{ "_id" : ObjectId("4e56e5260c191958ad9c7cb1"), "myId" : "1234", "string" : "foo" }
>
> db.Test.ensureIndex({"myId" : 1}, {sparse: true, unique: true});
>
> db.Test.insert({"myId" : "1234", "string": "Bla"});
E11000 duplicate key error index: test.Test.$myId_1 dup key: { : "1234" }
>
> db.Test.insert({"string": "Foo"});
> db.Test.insert({"string": "Bar"});
> db.Test.find();
{ "_id" : ObjectId("4e56e5260c191958ad9c7cb1"), "myId" : "1234", "string" : "foo" }
{ "_id" : ObjectId("4e56e5c30c191958ad9c7cb4"), "string" : "Foo" }
{ "_id" : ObjectId("4e56e5c70c191958ad9c7cb5"), "string" : "Bar" }
Also note that compound indexes can't be sparse
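The behaviour in the sample can be reproduced with a toy model of a unique, sparse, single-field index. This is only a sketch in plain JavaScript, not MongoDB's implementation; the class name SparseUniqueIndex is invented, and insert() returning false stands in for the E11000 duplicate key error.

```javascript
// Toy model of a unique + sparse index on a single field.
class SparseUniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Set();
  }

  // Returns true if the document is accepted, false on a duplicate key.
  insert(doc) {
    // Sparse: a document that lacks the field gets no index entry at all.
    if (!(this.field in doc)) return true;
    const key = doc[this.field];
    // Unique: a second document with the same key is rejected.
    if (this.keys.has(key)) return false;
    this.keys.add(key);
    return true;
  }
}

const index = new SparseUniqueIndex("myId");
const results = [
  index.insert({ myId: "1234", string: "foo" }),
  index.insert({ myId: "1234", string: "Bla" }),
  index.insert({ string: "Foo" }),
  index.insert({ string: "Bar" }),
];
console.log(results);
```

Only the second insert is rejected; the two documents without myId are both accepted because neither of them produces an index entry.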
It is not impossible to index an optional field. The docs are talking about a unique index. Once you've specified a unique index, you can only insert one document per value for that field, even if that value is null.
If you want a unique index on an optional field but still allow multiple nulls, you could try making the index both unique and sparse, although I have no idea if that's possible. I couldn't find an answer in the documentation.
There's no good way to uniquely index an optional field. You can either fill it with a default (the _id on the user would work), let your access layer enforce uniqueness, or change your "schema" a bit.
We have a separate collection for oauth login tokens, partially for this reason. We never really need to access those in a context where having them as embedded docs is an obvious win. If this is a relatively easy change to make, it's probably your best bet.
----edit----
As the other answers point out, you can achieve this with a sparse index. It's even a documented use. You should probably accept one of those answers instead of mine.
http://www.mongodb.org/display/DOCS/Indexes#Indexes-SparseIndexes