Mongo _id for subdocument array - mongodb

I wish to add an _id as property for objects in a mongo array.
Is this good practice ?
Are there any problems with indexing ?

I wish to add an _id as property for objects in a mongo array.
I assume:
{
g: [
{ _id: ObjectId(), property: '' },
// next
]
}
Type of structure for this question.
Is this good practice ?
Not normally. _ids are unique identifiers for entities. As such if you are looking to add _id within a sub-document object then you might not have normalised your data very well and it could be a sign of a fundamental flaw within your schema design.
Sub-documents are designed to contain repeating data for that document, i.e. the addresses or a user or something.
That being said _id is not always a bad thing to add. Take the example I just stated with addresses. Imagine you were to have a shopping cart system and (for some reason) you didn't replicate the address to the order document then you would use an _id or some other identifier to get that sub-document out.
Also you have to take into consideration linking documents. If that _id describes another document and the properties are custom attributes for that document in relation to that linked document then that's okay too.
Are there any problems with indexing ?
An ObjectId is still quite sizeable so that is something to take into consideration over a smaller, less unique id or not using an _id at all for sub-documents.
For indexes it doesn't really work any different to the standard _id field on the document itself and a unique index across the field should work across the collection (scenario dependant, test your queries).
NB: MongoDB will not add an _id to sub-documents for you.

Related

NoSQL/MongoDB: When do I need _id in nested objects?

I don't understand the purpose of the _id field when it refers to a simple nested object, eg it is not used as a root entity in a dedicated collection.
Let's say, we have a simple value-object like Money({currency: String, amount: Number }). Obviously, you are never going to store such a value object by itself in a dedicated collection. It will rather be attached to a Transaction or a Product, which will have their own collection.
But you still might query for specific currencies or amounts that are higher/lower than a given value. And you might programmatically (not on DB level) set the value of a transaction to the same as of a purchased product.
Would you need an _id in this case? What are the cases, where a nested object needs an _id?
Assuming the _id you are asking about is an ObjectId, the only purpose of such field is to uniquely identify an object. Either a document in a collection or a subdocument in an array.
From the description of your money case it doesn't seem necessary to have such _id field for your value object. A valid use case might be an array of otherwise non-unique/non-identifiable subdocuments e.g. log entries, encrypted data, etc.
ODMs might add such field by default to have a unified way to address subdocuments regardless of application needs.

Mongoid: retrieving documents whose _id exists in another collection

I am trying to fetch the documents from a collection based on the existence of a reference to these documents in another collection.
Let's say I have two collections Users and Courses and the models look like this:
User: {_id, name}
Course: {_id, name, user_id}
Note: this just a hypothetical example and not actual use case. So let's assume that duplicates are fine in the name field of Course. Let's thin Course as CourseRegistrations.
Here, I am maintaining a reference to User in the Course with the user_id holding the _Id of User. And note that its stored as a string.
Now I want to retrieve all users who are registered to a particular set of courses.
I know that it can be done with two queries. That is first run a query and get the users_id field from the Course collection for the set of courses. Then query the User collection by using $in and the user ids retrieved in the previous query. But this may not be good if the number of documents are in tens of thousands or more.
Is there a better way to do this in just one query?
What you are saying is a typical sql join. But thats not possible in mongodb. As you suggested already you can do that in 2 different queries.
There is one more way to handle it. Its not exactly a solution, but the valid workaround in NonSql databases. That is to store most frequently accessed fields inside the same collection.
You can store the some of the user collection fields, inside the course collection as embedded field.
Course : {
_id : 'xx',
name: 'yy'
user:{
fname : 'r',
lname :'v',
pic: 's'
}
}
This is a good approach if the subset of fields you intend to retrieve from user collection is less. You might be wondering the redundant user data stored in course collection, but that's exactly what makes mongodb powerful. Its a one time insert but your queries will be lot faster.

Mongodb - geospacial _id?

I currently have a collection of small documents. Each document has an indexed geospacial field and *the default _id is never used in any query*. There will never be more than one document related to a particular geo location. I think it makes sense to override the default _id, and use the geospacial data for this somehow.
Question is, how do you use geospacial data as the unique id? Is it a case of creating a flat string from the geo field? E.g. 'x123456y123456'?
The _id field is the unique identifier for each document and thus is a needed field. The _id field is generated on document creation automatically if one is not provided. If you can provide this geospaital value when creating the document you should be able to use the string as you suggested, you cannot use an array as the _id value. However please be aware that once a document is created the _id becomes unchangeable. This means that using the _id field as a meaningful index of geospatial data may not be of much value.
Have a look here for more info on the _id field and here for some information about creating geospatial indexes in Mongo

In MongoDB, do document _id's need to be unique across a collection or the entire DB?

I'm building a database with several collections. I have unique strings that I plan on using for all the documents in the main collection. Documents in other collections will reference documents in the main collection, which means I'll have to save said id's in the other collections. However, if _id's only need to be unique across a collection and not across an entire database, then I would just make the _id's in the other collections also use the aforementioned unique strings.
Also, I assume that in order to set my own _id's, all I have to do is have an "_id":"unique_string" property as part of the document that I insert, correct? I wouldn't need to convert the "unique_string" into another format, right?
Also, hypothetically speaking, would I be able to have a variable save the string "_id" and use that instead? Just to be clear, something as follows: var id = "_id" and then later on in the code (during an insert or a query for example) have id:"unique_string".
Best, and thanks,Sami
_ids have to be unique in a collection. You can quickly verify this by inserting two documents with the same _id in two different collections.
Your other assumptions are correct, just try them and see whether they work (they will). The proof of the pudding is in the eating.
Note: use _id directly, var id = "_id" just compilcates the code.

mongoDB - URL as document ID

Considering I want to create mongoDB documents for a bunch of distinct URLs: what would be the pros and cons (if any) of using the actual URL as the documents _id value instead of the default BSON ObjectId. Thanks in advance!
Cheers,
Greg
An overview of the subject here: http://www.mongodb.org/display/DOCS/Object+IDs
It has to be unique, you could potentially put yourself in the position of having to resolve collisions yourself. Better to leave the default _id alone and simply query against a field you're storing in the document, just how God (10gen) intended.
From http://www.mongodb.org/display/DOCS/BSON
The element name "_id" is reserved for use as a primary key id, but
you can store anything that is unique in that field. The database
expects that drivers will prevent users from creating documents that
violate these constraints.
From #mongodb
stupid _id values will probably make querying slow, but that's about it
And another user from #mongodb
Tell him the collisions will result in garbage data