Keeping default mongo _id and unique index in MongoDB - mongodb

Is it good or bad practice to keep the standard "_id" generated by Mongo in a document as well as my own unique identifier such as "name", or should I just replace the generated _id with the actual name, so my documents will look like this:
{
_id: "782yb238b2327b3",
name: "my_name"
}
or just like this:
{
_id: "my_name"
}

This depends on the scenario. There is nothing wrong with having your own unique ID; it may be a string or a number, and the right choice depends entirely on your situation as long as it's unique. The important thing is that you are in charge of it. You would want to add an index to it, of course.
For example, I have an additional ID field, a number called 'ID', because I required a sequential number as an identifier. Another use case may be that you're migrating an application, so you have to conform to a particular sequence pattern.
The sequences for the unique identifiers could easily be stored in a separate document/collection.
There is no issue with using the built-in _id if you have no requirement for a custom one. An interesting fact is that you can get the created date out of the _id. Always useful.
db.col.insert( { name: "test" } );
var doc = db.col.findOne( { name: "test" } );
// The ObjectId embeds the creation time, so no separate created-at field is needed:
var timestamp = doc._id.getTimestamp();

Related

Is it better to save id of a document in another document as ObjectId or String

Let's take a simple "bad" example: let's assume I have 2 collections, 'person' and 'address', and that in 'address' I want to store the '_id' of the person the address is associated with. Is there any benefit to storing this "referential key" as ObjectId vs string in the 'address' collection?
I feel like storing them as strings should not hurt, but I have not worked with Mongo for very long and do not know if it will hurt down the road if I follow this pattern.
I read the post here: Store _Id as object or string in MongoDB?
It's said there that ObjectId is faster, and I assume that's true if you are fetching/updating using the ObjectId in the parent collection (e.g. fetching/updating the 'person' collection using person._id as ObjectId), but I couldn't find anything that suggests the same is true when searching by the string representation of the id in the other collection (in our example, searching the 'address' collection by person._id as a string).
Your feedback is much appreciated.
Regardless of performance, you should store the "referential key" in the same format as the _id field that you are referring to. That means that if your referred document is:
{ _id: ObjectID("68746287..."), value: 'foo' }
then you'd refer to it as:
{ _id: ObjectID(…parent document id…), subDoc: ObjectID("68746287...") }
If the document that you're pointing to has a string as an ID, then it'd look like:
{ _id: "derick-address-1", value: 'foo' }
then you'd refer to it as:
{ _id: ObjectID(…parent document id…), subDoc: "derick-address-1" }
Besides that, because you're talking about persons and addresses, it might make more sense not to keep them in two documents at all, but instead to embed the addresses:
{ _id: ObjectID(…parent document id…),
'name' : 'Derick',
'addresses' : [
{ 'type' : 'Home', 'street' : 'Victoria Road' },
{ 'type' : 'Work', 'street' : 'King William Street' },
]
}
As for using a string as the id of a document: in a Meteor collection, you can generate the document id either with Random.id() (a string) or with Meteor.Collection.ObjectID() (an ObjectId).
In this discussion thread, Mongodb string id vs ObjectId, here is one good summary:
ObjectId Pros
it has an embedded timestamp in it.
it's the default Mongo _id type; ubiquitous
interoperability with other apps and drivers
ObjectId Cons
it's an object, and a little more difficult to manipulate in practice.
there will be times when you forget to wrap your string in new ObjectId() (see the sketch below)
it requires server-side object creation to maintain _id uniqueness, which makes generating them client-side with minimongo problematic
String Pros
developers can create domain-specific _id topologies
String Cons
developer has to ensure uniqueness of _ids
findAndModify() and getNextSequence() queries may be invalidated
All the information above is based on the Meteor framework. For MongoDB itself, it is better to use ObjectId; the reasons are in the post linked in your question.
Storing it as ObjectId is beneficial. It is faster, as an ObjectId is 12 bytes, compared to its string representation, which takes 24 bytes.
Also, you should try to denormalize your collections so that you don't need to make 2 collections (the opposite of an RDBMS).
Something like this might be better in general:
{ _id : "1",
person : {
Name : "abc",
age: 20
},
address : {
street : "1st main",
city: "Bangalore",
country: "India"
}
}
But again, it depends on your use case. This might be not suitable sometimes.
Hope that helps! :)

Unique index in MongoDB 3.2 ignoring null values

I want to add a unique index to a field, ignoring null values in the uniquely indexed field and ignoring the documents that are filtered out based on the partialFilterExpression.
The problem is that sparse indexes can't be combined with partial indexes.
Also, adding a unique index adds the null value to the index for documents missing the key field, and hence those documents can't be ignored based on an $exists criterion in the partialFilterExpression.
Is it possible in MongoDB 3.2 to get around this situation?
I am adding this answer because I was looking for a solution and didn't find one. This may not answer this exact question, but it will help a lot of others out there like me.
Example: if the field that can be null is houseName and it is of type string, the solution can be like this:
db.collectionName.createIndex(
{name: 1, houseName: 1},
{unique: true, partialFilterExpression: {houseName: {$type: "string"}}}
);
This will ignore the null values in the field houseName and still be unique.
Yes, you can create a partial index in MongoDB 3.2.
Please see https://docs.mongodb.org/manual/core/index-partial/#index-type-partial
MongoDB recommends partial indexes over sparse indexes. I'd suggest you drop your sparse index in favor of a partial index.
You can create a partial index in mongo:3.2.
For example, if ipaddress can be "" but "127.0.0.1" should be unique, the solution can be like this:
db.collectionName.createIndex(
{"ipaddress":1},
{"unique":true, "partialIndexExpression":{"ipaddress":{"$gt":""}}})
This will ignore "" values in ipaddress filed and still be unique
To create this in MongoDB Compass you must write the filter expression as JSON:
{
"YourField" : {
"$exists" : true,
"$gt" : "0",
"$type" : "string"
}
}
To find the other types it supports, see this link.
Yes, it can be a problem that the partial filter expression cannot contain any 'not' filters.
For those who may be interested in a C# solution for an index like this, here is an example.
We have a 'User' entity, which has a one-to-one 'relation' to a 'Doctor' entity.
This relation is represented by the optional, nullable field 'DoctorId' on the 'User' entity. In other words, the requirement is that a given 'Doctor' can be linked to only a single 'User' at a time.
So we need a unique index which fires an exception when something attempts to set DoctorId to a Guid that is already set on any other 'User' entity. At the same time, multiple 'null' entries must be allowed for the 'DoctorId' field, since many users do not have any doctor attached to them.
The solution to build this kind of an index looks like:
// Index key: the nullable DoctorId field, ascending.
var uniqueDoctorIdIndexDefinition = new IndexKeysDefinitionBuilder<User>()
    .Ascending(o => o.DoctorId);

// Only index documents where DoctorId exists and is stored as a string,
// so the many users without a doctor stay out of the unique index.
var existsFilter = Builders<User>.Filter.Exists(o => o.DoctorId);
var notNullFilter = Builders<User>.Filter.Type(o => o.DoctorId, BsonType.String);
var andFilter = Builders<User>.Filter.And(existsFilter, notNullFilter);

var createIndexOptions = new CreateIndexOptions<User>
{
    Unique = true,
    Name = UniqueDoctorIdIndexName,
    PartialFilterExpression = andFilter,
};
var uniqueDoctorIdIndex = new CreateIndexModel<User>(
    uniqueDoctorIdIndexDefinition,
    createIndexOptions);
users.Indexes.CreateOne(uniqueDoctorIdIndex);
You will probably need to specify the BsonType of the 'DoctorId' field directly in your 'User' entity mapping, using an attribute; in our case it was:
[BsonRepresentation(BsonType.String)]
public Guid? DoctorId { get; set; }
I am quite sure that there is a more proficient and compact solution for this problem, so I would be happy if somebody suggested one here.
Here is an example that I modified from the MongoDB partial index documentation:
db.contacts.createIndex(
{ email: 1 },
{ unique: true, partialFilterExpression: { email: { $exists: true } } }
)
IMPORTANT
To use the partial index, a query must contain the filter expression (or a modified filter expression that specifies a subset of the filter expression) as part of its query condition.
You can see that queries such as:
db.contacts.find({'email':'name@email.com'}).explain()
will indicate that they are doing an index scan, even if you don't specify {$exists: true}, because you're implicitly specifying a subset of the partialFilterExpression by specifying an email in your filter.
On the other hand, the following query will do a collection scan:
db.contacts.find({email: {$exists: false}})
WARNING
mythicalcoder's answer (currently the highest-voted answer) is very misleading, because it successfully creates a unique index, but the query planner will not generally be able to use the index you've created unless you add houseName: {$type: "string"} to your filter expression. This can have performance costs which you might not be aware of and can cause problems down the road.

Unique Values in NoSQL

Consider MongoDB or Couchbase. What if I need a certain value to be unique (maybe incremental) within the range of UINT32?
Well, I guess I could add a field like another_id and use something like this to increment it (Mongo):
function getNextSequence(name) {
var ret = db.counters.findAndModify(
{
query: { _id: name },
update: { $inc: { seq: 1 } },
new: true
}
);
return ret.seq;
}
db.users.insert(
{
another_id : getNextSequence("userid"),
name : "Stack O. Flow"
}
)
But really the question is:
Is this approach safe?
Should I even use NoSQL for this? (Consider that I only have around 50M rows of data, but I really need fast reads and writes, because these 50M rows get updated a few times per second.)
If I should stick with SQL, which one should I use? I've used MySQL and it was too slow (though lack of optimization might be at fault; I was joining quite a few tables).
Thank you for any suggestions.
There is a specific counter object in Couchbase that should do what you want; here is an example of it with Node.js.
You could relate it to the main object you are using with an object ID convention such as:
original_objectID::counter.
Then when you go to get the original object, you just do another get for the counter object by ID, and you're done. You can increment it easily as well. So if you needed to get the object and the original objectID was
user::kirk
then that user's counter object would be:
user::kirk::counter
And you can get and set it by that ID. It works very well in Couchbase.

MongoDB order by "number" ascending

I'm trying to create a registration form with Mongoose and MongoDB. I have a unique key, UserId, and every time I create a new entry I would like to take the greatest UserId in the database and increase it by one.
I tried db.user.find({}).sort({userId: 1});, but it doesn't seem to work.
Thanks
Masiar
What you want to do sounds more like a schema for relational databases with an auto-increment. I would recommend another solution.
First, you already have a unique id: it gets created automatically and lives in the "_id" field. It seems you want a UserID for building relations, but you can already use the value in _id.
The other reason you might want incremented ids could be that you are building a web application and probably want "nicer" URLs? For example, /user/1 instead of /user/abc48df...?
If that is the case, I would prefer to create a unique constraint on a username, and instead of an id use the username in the URL: "/user/john".
With this your URLs are much nicer, and for building relations you can use _id. And you don't run into problems with fetching the highest number first.
To create a unique index:
db.collection.ensureIndex({username: 1}, {unique: true})
You can do this to get the user with the current highest UserId:
db.user.insert( { UserId: 1 } )
db.user.insert( { UserId: 2 } )
db.user.insert( { UserId: 3 } )
db.user.find().sort( { UserId: -1 } ).limit(1)
It's worth noting that there isn't a way in MongoDB to fetch this value and insert a new user in a single atomic transaction; it only supports atomic operations on single documents. You'd need to take care that another operation didn't insert another user at the same time, or you could end up with two users with the same UserId.
To iterate over the cursor and put the returned docs in an array:
var myArray = [];
User.find().sort('UserId', 'descending').limit(1).each(function(err, doc) {
  if (doc) { myArray.push(doc); }  // each() signals the end of the cursor with a null doc
});

MongoDB: unique index on a repeated value

So I'm pretty new to MongoDB, so I figure this could be a misunderstanding of general usage, so bear with me.
I have a document schema I'm working with, as such:
{
name: "bob",
email: "bob#gmail.com",
logins: [
{ u: 'a', p: 'b', public_id: '123' },
{ u: 'x', p: 'y', public_id: 'abc' }
]
}
My problem is that I need to ensure that the public ids are unique within a document and across the collection.
Furthermore, there are some existing records being migrated from a MySQL DB that don't have a public_id, and these will therefore all end up with null values in Mongo.
I figure it's either an index
db.users.ensureIndex({'logins.public_id': 1}, {unique: true});
which isn't working because of the missing keys and is throwing an E11000 duplicate key error,
or this is a more fundamental schema problem, in that I shouldn't be nesting objects in an array structure like that. In which case, what? A separate collection for the user_logins??? Which seems to go against the idea of an embedded document.
If you expect u and p to always have the same values on each insert (as in your example snippet), you might want to use the $addToSet operator on inserts to ensure the uniqueness of your public_id field within a document (see the sketch below). Otherwise I think it's quite difficult to make them unique across a whole collection without resorting to external maintenance or JS functions.
If not, I would store them in their own collection and use the public_id as the _id field to ensure their cross-document uniqueness inside a collection. Maybe that contradicts the idea of embedded docs in a document database, but given these requirements I think that's negligible.
Furthermore, there are some existing records being migrated from a MySQL DB that don't have a public_id, and these will therefore all end up with null values in Mongo.
So you want to apply a unique index to a data set that's not truly unique. I think this is just a modeling problem.
If a null logins.public_id is going to violate your uniqueness constraint, then just don't write the field at all:
{
logins: [
{ u: 'a', p: 'b' },
{ u: 'x', p: 'y' }
]
}
Thanks, all.
In the end I opted to separate this into 2 collections, one for users and one for logins.
For users this looked a little like:
userDocument = {
...
logins: [
new DBRef('loginsCollection', loginDocument._id),
new DBRef('loginsCollection', loginDocument2._id),
]
}
loginDocument = {
...
user: new DBRef('userCollection', userDocument._id)
}
Although not what I was originally after (a single collection), it is working nicely, and by utilising the built-in _id uniqueness the constraint is now enforced at the database level rather than implemented at the application level.