I have the following JSON schema in MongoDB:
{"email": "example#gmail.com", "second_email": "example222#gmil.com"}
How can I enforce that both fields are unique separately AND also unique between them?
i.e. the following document would not be valid:
{"email":"anotherone#gmail.com", "second_email":"example#gmail.com"}
because example#gmail.com already exists in another document, in the other field.
Off the top of my head, no database can do this (use another column/field as the source data for a uniqueness constraint). You will need to do some reshaping of the data to achieve this. The easiest way is a unique constraint on an array field.
> db.foo.createIndex({ emails: 1 }, { unique: true } )
> db.foo.insert({ emails: ['example#gmail.com', 'example222#gmail.com'] })
WriteResult({ "nInserted" : 1 })
> db.foo.insert({ emails: ['anotherone#gmail.com', 'example#gmail.com'] })
WriteResult({
    "nInserted" : 0,
    "writeError" : {
        "code" : 11000,
        "errmsg" : "E11000 duplicate key error index: test.foo.$emails_1 dup key: { : \"example#gmail.com\" }"
    }
})
Now, depending on your app logic, this emails array can even replace your original two fields. Or not; up to you. If not, you'll need to insert both the original fields and duplicate their values in this array for the uniqueness check.
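A minimal sketch of that second shape, keeping the original fields and mirroring their values into the indexed array:
> db.foo.insert({
    email: 'example#gmail.com',
    second_email: 'example222#gmail.com',
    // duplicated purely so the unique index on the emails array sees both values
    emails: ['example#gmail.com', 'example222#gmail.com']
})
WriteResult({ "nInserted" : 1 })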
You need to create a unique index on each field to enforce uniqueness for the fields separately.
db.collection.createIndex( { "email": 1 }, { "unique": true } )
db.collection.createIndex( { "second_email": 1 }, { "unique": true } )
That being said, MongoDB doesn't provide a way to enforce uniqueness between two fields in the same document. This is something you will need to check in your application using an if/else statement.
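A rough sketch of such a check before an insert (doc here stands for a hypothetical candidate document; note that this read-then-write is not atomic):
// reject the write if either address already appears in the opposite field
var conflict = db.collection.findOne({
    $or: [
        { "email": doc.second_email },
        { "second_email": doc.email }
    ]
});
if (conflict === null && doc.email !== doc.second_email) {
    db.collection.insert(doc);  // the two unique indexes still guard same-field duplicates
}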
Another option, as shown in this answer here, is to use an indexed array field if you do not want to call the createIndex() method multiple times. But you still need some logical condition processing if you don't want duplicate values in the array.
db.collection.createIndex( { "mails.email": 1, "mails.second_email": 1 }, { unique: true } )
db.collection.insert( { _id: 3, mails: [ { email: "example#gmail.com", second_email: "example222#gmil.com" } ] } )
Now you have created an "email / second_email" pair combination to enforce uniqueness on these two fields together.
Also, if you use the bulk option, you can set ordered to false to continue with the remaining inserts when one fails: insertMany([...], { ordered: false }).
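For example, a sketch of an unordered bulk insert that keeps going past a duplicate (the extra addresses here are made up; _id 3 duplicates the document inserted above):
try {
    db.collection.insertMany(
        [
            { _id: 4, mails: [ { email: "a#x.com", second_email: "b#x.com" } ] },
            { _id: 3, mails: [ { email: "example#gmail.com", second_email: "example222#gmil.com" } ] },
            { _id: 5, mails: [ { email: "c#x.com", second_email: "d#x.com" } ] }
        ],
        { ordered: false }  // the duplicate fails, but _id 4 and _id 5 are still inserted
    );
} catch (e) {
    printjson(e);  // inspect which inserts failed
}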
Related
I need an index that will provide me uniqueness of the field among all fields. For example, I have the document:
{
    _id: ObjectId("123"),
    fieldA: "a",
    fieldB: "b"
}
and I want to forbid inserting the document
{
    _id: ObjectId("456"),
    fieldA: "new value for field a",
    fieldB: "a"
}
because a document already exists with the value "a" set on field "fieldA". Is this possible?
It seems you need a multikey index with a unique constraint.
Take into account that a compound multikey index cannot span two separate arrays; for this reason you have to include all the fields you would like uniqueness on inside a single array:
{
    _id: ObjectId("123"),
    multikey: [
        { fieldA: "a" },
        { fieldB: "b" }
    ]
}
Give this code a try:
db.collection.createIndex( { "multikey": 1}, { unique: true } )
To query, you would write:
db.collection.findOne({"multikey.fieldA": "a"}, // Query
{"multikey.fieldA": 1, "multikey.fieldB": 1}) // Projection
For more info you can take a look at embedded multikey documents.
Hope this helps.
Another option is to create a document for each unique key, indexed by this unique key, and perform a loop over the fields of each candidate document, cancelling the write if any key is found.
IMO this solution is more resource-consuming; in exchange it gets you a list of all keys consumed by written documents.
db.collection.createIndex( { "unikey": 1}, { unique: true } )
db.collection.insertMany( {[{"unikey": "$FieldA"},{"unikey": "$FieldB"}]}
db.collection.find({"unikey": 1})
I am trying to store key-value data in MongoDB.
The key could be any string (I don't know anything more about it before storing), and the value could be any type (int, string, array). And I would like to have an index on such a key & value.
I was looking at a multikey index over an array of my key-vals, but it looks like it can't cover queries over array fields.
Is it possible to have an index on a custom key & value in MongoDB and make queries with operations such as $exists, $eq, $gte, $lte, $and, $or, and $in without a COLLSCAN, i.e. through an IXSCAN stage?
Or maybe I need another DB for that?
I may have misunderstood your question, but I think this is precisely where MongoDB's strengths are - dealing with different shapes of documents and data types.
So let's say you have the following two documents:
db.test.insertMany([
    {
        key: "test",
        value: [ "some array", 1 ]
    },
    {
        key: 12.7,
        value: "foo"
    }
])
and you create a compound index like this:
db.test.createIndex({
"key": 1,
"value": 1
})
then the following query will use that index:
db.test.find({ "key": "test", "value": 1 })
and also more complicated queries will do the same:
db.test.find({ "key": { $exists: true }, "value": { gt: 0 } })
You can verify this by adding a .explain() to the end of the above queries.
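For example:
// the winning plan should contain an IXSCAN stage when the index is used
db.test.find({ "key": "test", "value": 1 }).explain()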
UPDATE based on your comment:
You don't need the aggregation framework for that. You can simply do something like this:
db.test.distinct("user_id", { "key": { $exists: true } })
This query is going to use the above index. Moreover it can be made even faster by changing the index definition to include the "user_id" field like this:
db.test.createIndex({
    "key": 1,
    "value": 1,
    "user_id": 1
})
This, again, can be verified by running the following query:
db.test.explain().distinct("user_id", { "key": { $exists: true } })
If your key can be any arbitrary value, then this is impossible. Your best bet is to create an index on some other known field to limit the initial results so that the inevitable collection scan's impact is reduced to a minimum.
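For instance (user_id here is a hypothetical known field):
db.test.createIndex({ "user_id": 1 })
// the index narrows the candidate set; the arbitrary-key predicate is
// then evaluated only against those documents
db.test.find({ "user_id": 42, "someArbitraryKey": { $gte: 10 } })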
I'm trying to create a case-insensitive unique index with MongoDB version 3.4. I'm using the following query to create the index, but it still allows me to insert data with a different case.
db.Test.createIndex( { "type" : 1 },{ unique: true , collation: { locale: 'en' ,caseLevel:true ,strength: 3 } } )
In the above query I'm making type unique. First I inserted "apple" into the database, and when I try to insert "apple" again it throws a duplicate error. But when I try to insert "Apple" it allows the insert. For me, inserting "Apple" should also throw a duplicate error.
Strength 2 will work: at strength 2 the collation compares base characters and diacritics but ignores case, so "apple" and "Apple" map to the same index key.
db.Test.createIndex({
    type: 1
},
{
    collation: {
        locale: "en",
        strength: 2
    },
    unique: true
})
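With that index in place, the second insert below should fail with a duplicate key error (a sketch of the expected shell session, errmsg abbreviated):
> db.Test.insert({ type: "apple" })
WriteResult({ "nInserted" : 1 })
> db.Test.insert({ type: "Apple" })
WriteResult({ "nInserted" : 0, "writeError" : { "code" : 11000, ... } })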
I have documents like this:
[{
    "_id" : ObjectId("aaa"),
    "host" : "host1",
    "artData" : [
        { "aid" : "56004721", "accessMin" : NumberLong(1481862180) },
        { "aid" : "56010082", "accessMin" : NumberLong(1481861880) },
        { "aid" : "55998802", "accessMin" : NumberLong(1481861880) }
    ]
},
{
    "_id" : ObjectId("bbb"),
    "host" : "host2",
    "artData" : [
        { "aid" : "55922560", "accessMin" : NumberLong(1481862000) },
        { "aid" : "55922558", "accessMin" : NumberLong(1481861880) },
        { "aid" : "55940094", "accessMin" : NumberLong(1481861760) }
    ]
}]
While updating any document, a duplicate "aid" should not be added to the array again.
One option I found is using a unique index on the artData.aid field, but building indexes is not preferred since I won't need the index for anything else.
Is there any way to solve this?
Option 1: While designing the schema for that document, use unique: true.
for example:
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var newSchema = new Schema({
    artData: [
        {
            aid: { type: String, unique: true },
            accessMin: Number
        }]
});
module.exports = mongoose.model('newSchema', newSchema);
Option 2: refer to the linked answer on how to avoid duplicates.
As per this doc, you may use a multikey index as follows:
{ "artData.aid": 1 }
That being said, since you don't want to use a multikey index, another option for insertion is to:
1. Query the document to find artData entries that match the aid
2. Difference that result set with the set you are about to insert
3. Remove the items that match your query
4. Insert the remaining items from step 2
Ideally your query from step 1 won't return a set that is too large -- making this a surprisingly fast operation. That said, it's really based on the number of duplicates you assume you will be trying to insert. If the number is really high, the query from step 1 could return a large set of items, in which case this solution may not be appropriate, but it's all I've got for you =(.
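For what it's worth, steps 1-4 can often be collapsed into a single guarded update (a sketch; newItem stands for a hypothetical candidate entry):
// $push fires only when the matched document does not already contain the aid
db.collection.update(
    { "_id": ObjectId("aaa"), "artData.aid": { $ne: newItem.aid } },
    { $push: { "artData": newItem } }
)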
My suggestion is to really re-evaluate the reason for not using a multikey index.
How can I iterate over all documents matching each value of a specified key in a MongoDB collection?
E.g. for a collection containing:
{ _id: ObjectId, keyA: 1 },
{ _id: ObjectId, keyA: 2 },
{ _id: ObjectId, keyA: 2 },
...with an index of { keyA: 1 }, how can I run an operation on all documents where keyA:1, then keyA:2, and so on?
Specifically, I want to run a count() of the documents for each keyA value. So for this collection, the equivalent of find({keyA:1}).count(), find({keyA:2}).count(), etc.
UPDATE: whether or not the keys are indexed is irrelevant to how they're iterated, so I edited the title and description to make this Q/A easier to reference in the future.
A simpler approach to get the grouped count of unique values for keyA would be to use the new Aggregation Framework in MongoDB 2.2:
e.g.:
db.coll.aggregate(
{ $group : {
_id: "$keyA",
count: { $sum : 1 }
}}
)
... returns a result set where each _id is a unique value for keyA, with the count of how many times that value appears:
{
    "result" : [
        { "_id" : 2, "count" : 2 },
        { "_id" : 1, "count" : 1 }
    ],
    "ok" : 1
}
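As an aside, modern shells expect the pipeline wrapped in an array (and return a cursor rather than a result document); the equivalent call would be:
db.coll.aggregate([
    { $group: { _id: "$keyA", count: { $sum: 1 } } }
])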
I am not sure I get you here, but is this what you are looking for:
db.mycollection.find({ keyA: 1 }).count()
This will count all documents where keyA is 1.
If that does not answer the question, do you think you can be a little more specific?
Do you mean to do an aggregation for all unique key values for keyA?
It may be implemented with multiple queries:
var i = 0;
var f = [];
while (i != db.col.count()) {
    // grab one document whose keyA value we have not collected yet
    var k = db.col.findOne({ keyA: { $nin: f } }).keyA;
    // advance by the number of documents sharing that value
    i += db.col.find({ keyA: k }).count();
    f.push(k);
}
The idea of this code is to collect the unique values of the keyA field of the objects in the col collection into the array f, which will be the result of the operation. Unfortunately, while this operation is running you should block any other operations that would change the col collection.
UPDATE:
It can all be done much more easily using distinct:
db.col.distinct("keyA")
Thanks to #Aleksey for pointing me to db.collection.distinct.
Looks like this does it:
db.ships.distinct("keyA").forEach(function(v){
db.ships.find({keyA:v}).count();
});
Of course calling count() within a loop doesn't do much; in my case I was looking for key-values with more than one document, so I did this:
db.ships.distinct("keyA").forEach(function(v){
print(db.ships.find({keyA:v}).count() > 1);
});
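The same filter can also be expressed with the aggregation framework mentioned in the earlier answer (a sketch):
db.ships.aggregate([
    { $group: { _id: "$keyA", count: { $sum: 1 } } },
    // keep only key values that appear in more than one document
    { $match: { count: { $gt: 1 } } }
])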