I have a MongoDB schema for users that looks something like this:
{
userId: "some-string",
anonymousId: "some-other-string",
project: {"$oid": "56d06bb6d9f75035956fa7ba"}
}
Users must have either a userId or an anonymousId. As users belong to a project, the model also has a reference called project, which links to the project collection.
Any userId or anonymousId value has to be unique per project, so I created two compound indexes as follows:
db.users.createIndex({ "userId": 1, "project": 1 }, { unique: true })
db.users.createIndex({ "anonymousId": 1, "project": 1 }, { unique: true })
However, since only one of userId and anonymousId has to be provided rather than both, MongoDB throws a duplicate key error for null values (for example, when a second user has an anonymousId but no userId).
I therefore tried to add a sparse: true flag to the compound indexes, but this obviously only works if both fields are empty. I also tried adding the sparse flag only to the fields and not the compound indexes, but this doesn't work either.
To give an example, let's say I have the following three users in the collection:
{ userId: "user1", anonymousId: null, project: {"$oid": "56d06bb6d9f75035956fa7ba"}}
{ userId: "user2", anonymousId: "anonym", project: {"$oid": "56d06bb6d9f75035956fa7ba"}}
{ userId: "user3", anonymousId: "random", project: {"$oid": "56d06bb6d9f75035956fa7ba"}}
The following should be possible:
I want to be able to insert another user {userId: "user4", anonymousId: null} for the same project (without getting a duplicate key error)
However if I try to insert another user with {userId: "user3"} or another user with {anonymousId: "random"} there should be a duplicate key error
How else can I achieve this?
If you are using MongoDB 3.2 or later, you can use a unique partial index instead of a sparse index.
Partial indexes are actually recommended over sparse indexes.
Example
// Index a document only when the id field is a string, so null and missing values are skipped
db.users.createIndex({ "userId": 1, "project": 1 },
{ unique: true, partialFilterExpression: {
userId: { $type: "string" } } })
db.users.createIndex({ "anonymousId": 1, "project": 1 },
{ unique: true, partialFilterExpression: {
anonymousId: { $type: "string" } } })
In the example above, a document is added to the unique index only when userId is present and is not null. The same holds for anonymousId.
Please see https://docs.mongodb.org/manual/core/index-unique/#unique-partial-indexes
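To make the effect of the two partial indexes easier to follow, here is a minimal plain-JavaScript sketch. It is only a model and does not talk to MongoDB: the createUsers helper is hypothetical, and it assumes the filter is "the id field holds a string". It replays the three users from the question and then the three inserts the question asks about.

```javascript
// Plain-JavaScript model of two unique partial indexes; it does not talk to MongoDB.
// Assumption: a document is covered by an index only when its id field is a string.
function createUsers() {
  const indexes = { userId: new Set(), anonymousId: new Set() };
  // Returns true when the document is accepted, false on a duplicate key.
  return function insert(doc) {
    const pending = [];
    for (const field of Object.keys(indexes)) {
      if (typeof doc[field] !== "string") continue; // null or missing: not indexed
      const key = JSON.stringify([doc[field], doc.project]);
      if (indexes[field].has(key)) return false; // duplicate (id, project) pair
      pending.push([field, key]);
    }
    // Record the keys only after every index has accepted the document.
    for (const [field, key] of pending) indexes[field].add(key);
    return true;
  };
}

const insert = createUsers();
const project = "56d06bb6d9f75035956fa7ba";
const results = [
  insert({ userId: "user1", anonymousId: null, project }),
  insert({ userId: "user2", anonymousId: "anonym", project }),
  insert({ userId: "user3", anonymousId: "random", project }),
  insert({ userId: "user4", anonymousId: null, project }), // second null anonymousId
  insert({ userId: "user3", project }),                    // duplicate userId
  insert({ anonymousId: "random", project }),              // duplicate anonymousId
];
console.log(results);
```

In this model the first four inserts are accepted and the last two are rejected, which is the behaviour the question asks for.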
index a,c - cannot be sparse as is unique.....
index b,c - cannot be sparse as is unique.....
what about index a,b,c ?
db.benjiman.insert({ userId: "some-string", anonymousId: "some-other-string", project: {"_oid": "56d06bb6d9f75035956fa7ba"} })
db.benjiman.insert({ userId: "some-string2", project: {"_oid": "56d06bb6d9f75035956fa7ba"} })
db.benjiman.insert({ anonymousId: "some-other-string2", project: {"_oid": "56d06bb6d9f75035956fa7ba"} })
db.benjiman.createIndex({ "userId": 1, "anonymousId": 1, "project": 1 }, { unique: true })
Related
I have a 'Users' collection which has two columns, '_id' and 'userName', both of type string.
I want to add a third column, 'UserId', which will be a UUID wrapping the id from the _id column.
I tried a few ways but without any success.
For example:
{
_id: "fe83f869-154e-4c26-a5db-fb147728820f",
userName: "alex"
}
I want it to be:
{
_id: "fe83f869-154e-4c26-a5db-fb147728820f",
userName: "alex",
UserId: UUID("fe83f869-154e-4c26-a5db-fb147728820f")
}
I tried something like:
db.Users_temp.update(
{},
{ $set: {"UserId": UUID("$_id") } },
false,
true
)
But it results in columns with value UUID("----")
Will appreciate any help.
Ok,
Found a solution to my problem.
db.Users_temp.find().forEach(function(user) {
db.Users_temp.update(
{"_id" : user._id},
{ "$set": {"UserId": UUID(user._id)} }
)
})
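For a larger collection, one update per document can be slow. A possible variation, sketched below under assumptions, is to build all the updates first and send them in one batch. The buildUuidBackfillOps helper and the fakeUuid stand-in are hypothetical names introduced for illustration; the conversion function is passed in as a parameter so the snippet can be tried outside the shell.

```javascript
// Builds one updateOne operation per user, in the shape that bulkWrite accepts.
// toUuid is a parameter so the function can be exercised without a database;
// inside the shell it would be the built-in UUID helper.
function buildUuidBackfillOps(users, toUuid) {
  return users.map((user) => ({
    updateOne: {
      filter: { _id: user._id },
      update: { $set: { UserId: toUuid(user._id) } },
    },
  }));
}

// Stand-in for UUID(), an assumption for this demo only: it just tags the string.
const fakeUuid = (s) => ({ uuid: s });

const ops = buildUuidBackfillOps(
  [{ _id: "fe83f869-154e-4c26-a5db-fb147728820f", userName: "alex" }],
  fakeUuid
);
console.log(JSON.stringify(ops, null, 2));
```

In the shell, the resulting array could then be passed to db.Users_temp.bulkWrite(ops), with UUID in place of fakeUuid.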
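This will work.
It only works with the $set operation passed as an array rather than as an object, because the array form is treated as an aggregation pipeline update (MongoDB 4.2 and later), where '$_id' is resolved as a field reference instead of a literal string: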
db.Users_temp.update({},[{$set: {'UserId': '$_id'}}])
I have a MongoDB index:
Reservation.index(
{
source: 1,
accountID: 1, // <-- This is the only required field
confirmationCode_1: 1,
confirmationCode_2: 1,
confirmationCode_3: 1
},
{name: "Unique_reservation_index_1", unique: true}
);
Here are some sample entries I have in the database and I want to make sure that duplicates can't be made:
[
{
source: "A",
accountID: "AAA",
confirmationCode_1: "ABC"
},
{
source: "B",
accountID: "BBB",
confirmationCode_1: "ABC",
confirmationCode_2: "DEF"
},
{
source: "C",
accountID: "CCC",
confirmationCode_3: "GHI"
}
]
Sometimes I have confirmationCode_1 set and not confirmationCode_2; other times I have both confirmationCode_1 and confirmationCode_2 set. Other times I have confirmationCode_3 set.
I want MongoDB to allow me to have the following doc (missing the confirmationCode_2 and confirmationCode_3 fields). Will the above index allow it?
{
source: "A",
accountID: "123",
confirmationCode_1: "ABC"
}
Will it prevent me from adding two similar docs with confirmationCode_2 not defined or will that be considered the same? For example, if it does allow the above doc, will this be prevented?
{
source: "A",
accountID: "AAA",
confirmationCode_1: "ABC_2"
}
If I don't supply the confirmationCode_2 field, does it set the confirmationCode_2 field to null?
If I change the unique index to include sparse: true, how will it act differently?
Reservation.index(
{
source: 1,
accountID: 1, // <-- This is the only required field
confirmationCode_1: 1,
confirmationCode_2: 1
},
{name: "Unique_reservation_index_1", unique: true, sparse: true}
);
From the MongoDB documentation on unique indexes:
A unique index ensures that the indexed fields do not store duplicate values
An undefined, empty, or null field is allowed as long as no other document has the same tuple of values for the fields in the compound index.
Below is my actual testing result:
You can observe that the document is successfully added under the unique index.
Will unique indexes ignore fields that don't exist?
No. The index stores a null value for the missing field, and MongoDB enforces uniqueness on the combination of the index key values.
//You have this document in your MongoDB collection
{
source: "A",
accountID: "123",
confirmationCode_1: "ABC"
}
//You try to insert the next document; note the missing "accountID" field.
//Even though "source" and "confirmationCode_1" match the existing document,
//this operation SUCCEEDS because
//MongoDB enforces uniqueness on the "combination" of the index key values
{
source: "A",
confirmationCode_1: "ABC"
}
//You try to insert the next document.
//The operation FAILS to insert the document
//because it violates the unique constraint
//on the combination of key values
{
source: "A",
accountID: "123",
confirmationCode_1: "ABC"
}
What if you change unique: true to unique: true, sparse: true ?
An index that is both sparse and unique prevents collection from
having documents with duplicate values for a field but allows multiple
documents that omit the key.
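The rules above can be sketched in plain JavaScript. This is only a model and does not talk to MongoDB; the createUniqueIndex helper is a hypothetical name. It assumes that a missing field is stored as null in the index key, and that a sparse compound index skips a document only when the document has none of the indexed fields.

```javascript
// Plain-JavaScript model of a compound unique index; it does not talk to MongoDB.
const FIELDS = [
  "source", "accountID",
  "confirmationCode_1", "confirmationCode_2", "confirmationCode_3",
];
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

function createUniqueIndex(fields, { sparse = false } = {}) {
  const keys = new Set();
  // Returns true when the document is accepted, false on a duplicate key.
  return function insert(doc) {
    const present = fields.filter((f) => has(doc, f));
    // Sparse: a document with none of the indexed fields is left out of the index.
    if (sparse && present.length === 0) return true;
    // Missing fields are stored as null in the index key.
    const key = JSON.stringify(fields.map((f) => (has(doc, f) ? doc[f] : null)));
    if (keys.has(key)) return false;
    keys.add(key);
    return true;
  };
}

const insert = createUniqueIndex(FIELDS);
const outcomes = [
  insert({ source: "A", accountID: "AAA", confirmationCode_1: "ABC" }),   // accepted
  insert({ source: "A", accountID: "123", confirmationCode_1: "ABC" }),   // accepted: accountID differs
  insert({ source: "A", accountID: "AAA", confirmationCode_1: "ABC_2" }), // accepted: code differs
  insert({ source: "A", accountID: "AAA", confirmationCode_1: "ABC" }),   // rejected: same tuple
];

const sparseInsert = createUniqueIndex(FIELDS, { sparse: true });
const sparseOutcomes = [
  sparseInsert({ note: "no indexed fields" }), // accepted: not indexed
  sparseInsert({ note: "still none" }),        // accepted: not indexed
  sparseInsert({ source: "A" }),               // accepted
  sparseInsert({ source: "A" }),               // rejected: a partly filled key is still indexed
];
console.log(outcomes, sparseOutcomes);
```

Under these assumptions, both documents from the question are accepted, and sparse: true only changes the outcome for documents that omit every indexed field.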
I am learning MongoDB and I've encountered a thing that mildly annoys me.
Let's say I got this collection:
[
{
_id: ObjectId("XXXXXXXXXXXXXX"),
name: "Tom",
followers: 10,
active: true
},
{
_id: ObjectId("XXXXXXXXXXXXXX"),
name: "Rob",
followers: 109,
active: true
},
{
_id: ObjectId("XXXXXXXXXXXXXX"),
name: "Jacob",
followers: 2,
active: false
}
]
and I rename the name column to username with the command:
db.getCollection('users').update({}, { $rename: { "name" : "username" }}, false, true)
now the username property is at the end of the record, example:
[
// ... rest of collection has the same structure
{
_id: ObjectId("XXXXXXXXXXXXXX"),
followers: 109,
active: true,
username: "Rob"
}
// ... rest of collection has the same structure
]
How do I prevent this from happening, or how do I place the fields in a specific order? This is infuriating to work with in Robo/Studio 3T. I've got a collection with about 15 columns which are now out of order in the GUI because of this.
The $rename operator logically performs an $unset of both the old name and the new name, and then performs a $set operation with the new name. As such, the operation may not preserve the order of the fields in the document; i.e. the renamed field may move within the document.
Documentation
It is the behaviour from version 2.6
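If the order matters for display, one possible workaround is to rebuild each document with its keys in the wanted order and write it back. The sketch below is plain JavaScript and only shows the rebuilding step; reorderFields is a hypothetical helper, not part of MongoDB.

```javascript
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Returns a copy of doc whose keys follow `order`. Keys not listed in `order`
// keep their relative order and are appended at the end.
function reorderFields(doc, order) {
  const result = {};
  for (const key of order) {
    if (has(doc, key)) result[key] = doc[key];
  }
  for (const key of Object.keys(doc)) {
    if (!has(result, key)) result[key] = doc[key];
  }
  return result;
}

// Document shape after the $rename from the question.
const renamed = { _id: 1, followers: 109, active: true, username: "Rob" };
const fixed = reorderFields(renamed, ["_id", "username"]);
console.log(Object.keys(fixed));
```

Here the keys of fixed come out as _id, username, followers, active. In the shell, such a reordered document could then be written back with replaceOne, filtering on its _id.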
Since it is JSON-based, you can still get any field easily, and you have very few columns.
Keys in JSON objects are by their very nature unordered. See RFC 4627, which defines JSON, section 1 "Introduction":
An object is an unordered collection of zero or more name/value
pairs, where a name is a string and a value is a string, number,
boolean, null, object, or array.
(Emphasis mine)
Therefore, it would even be correct, if you wrote
{
"name": "Joe",
"city": "New York"
}
and got back
{
"city": "New York",
"name": "Joe"
}
I want to create a partial index on one of the indexed fields,
but I am failing miserably:
db.Comment.createIndex(
{ "siteId": 1,
{ { "parent": 1} ,{partialFilterExpression:{parent:{$exists: true}}}},
"updatedDate": 1,
"label": 1 }
);
how to do that?
the field "parent" is the one I want to index partially
In roboMongo I get the error
Error: Line 3: Unexpected token {
You pass the partialFilterExpression object as a second parameter to createIndex. See the documentation.
db.Comment.createIndex(
{ "siteId": 1, "parent": 1, "updatedDate": 1, "label": 1 },
{ partialFilterExpression: { parent: { $exists: true } } }
);
So don't think of it as partially indexing a field; your partial filter expression defines which documents to include in your index.
It seems to me that when you are creating a Mongo document and have a field {key: value} which is sometimes not going to have a value, you have two options:
Write {key: null} i.e. write null value in the field
Don't store the key in that document at all
Both options are easily queryable, in one you query for {key : null} and the other you query for {key : {$exists : false}}.
I can't really think of any differences between the two options that would have any impact in an application scenario (except that option 2 has slightly less storage).
Can anyone tell me if there are any reasons one would prefer either of the two approaches over the other, and why?
EDIT
After asking the question it also occurred to me that indexes may behave differently in the two cases i.e. a sparse index can be created for option 2.
Indeed, you also have a third possibility:
key: "" (empty value)
And you are forgetting a specific behaviour of the null value.
A query on
key: null will retrieve all documents where key is null or where key doesn't exist.
Whereas a query on $exists: false will retrieve only documents where the field key doesn't exist.
To go back to your exact question, it depends on your queries and on what the data represents.
If you need to record that, for example, a user set a value and then unset it, you should keep the field as null or empty. If you don't need that, you may remove the field.
Note that, since MongoDB doesn't use field name dictionary compression, field: null consumes disk space and RAM, while storing no key at all doesn't consume resources.
It really comes down to:
Your scenario
Your querying manner
Your index needs
Your language
I personally have chosen to store null keys. It makes it much easier to integrate into my app. I use PHP with Active Record, and using null values makes my life a lot easier since I don't have to put the stress of field dependency upon the app. Also, I do not need to write any complex code to deal with magic methods for setting non-existent variables.
I personally would not store an empty value like "" since, if you're not careful, you could end up with two empty values, null and "", and then you'll have a haphazard time querying specifically. So I personally prefer null for empty values.
As for space and index: it depends on how many rows might not have this column, but I doubt you will really notice the index size increase due to a few extra docs with null in them. I mean, the difference in storage is minute, especially if the corresponding key name is small as well. That goes for large setups too.
I am quite frankly unsure of the index usage between $exists and null; however, null could be a more standardised method by which to query for existence, since MongoDB is schemaless, which means you have no requirement to have that field in the doc, which again produces two empty values: non-existent and null. So it is better to choose one or the other.
I choose null.
Another point you might want to consider is when you use OGM tools like Hibernate OGM.
If you are using Java, Hibernate OGM supports the JPA standard. So if you can write a JPQL query, it would theoretically be easy to switch to an alternate NoSQL datastore that is supported by the OGM tool.
JPA does not define an equivalent for $exists in Mongo. So if you have optional attributes in your collection, you cannot write a proper JPQL query for them. In such a case, if the attribute's value is stored as NULL, it is still possible to write a valid JPQL query like the one below.
SELECT p FROM pppoe p where p.logout IS null;
I think in terms of disk space the difference is negligible. If you need to create an index on this field, then consider a partial index.
An index with { partialFilterExpression: { key: { $exists: true } } } can be much smaller than a normal index.
It should also be noted that queries behave differently. Consider values like these:
db.collection.insertMany([
{ _id: 1, a: 1 },
{ _id: 2, a: '' },
{ _id: 3, a: undefined },
{ _id: 4, a: null },
{ _id: 5 }
])
db.collection.aggregate([
{
$set: {
type: { $type: "$a" },
ifNull: { $ifNull: ["$a", true] },
defined: { $ne: ["$a", undefined] },
existing: { $ne: [{ $type: "$a" }, "missing"] }
}
}
])
{ _id: 1, a: 1, type: double, ifNull: 1, defined: true, existing: true }
{ _id: 2, a: "", type: string, ifNull: "", defined: true, existing: true }
{ _id: 3, a: undefined, type: undefined, ifNull: true, defined: false, existing: true }
{ _id: 4, a: null, type: null, ifNull: true, defined: true, existing: true }
{ _id: 5, type: missing, ifNull: true, defined: false, existing: false }
Or with db.collection.find():
db.collection.find({ a: { $exists: false } })
{ _id: 5 }
db.collection.find({ a: { $exists: true} })
{ _id: 1, a: 1 },
{ _id: 2, a: '' },
{ _id: 3, a: undefined },
{ _id: 4, a: null }
db.collection.find({ a: null })
{ _id: 3, a: undefined },
{ _id: 4, a: null },
{ _id: 5 }
db.collection.find({ a: {$ne: null} })
{ _id: 1, a: 1 },
{ _id: 2, a: '' },
db.collection.find({ a: {$type: "null"} })
{ _id: 4, a: null }