MongoDB - Many-to-Many Relationship (Special Case)

Should I store an embedded document multiple times in MongoDB, or should I store it only once and link to it by its ID?
I want to model a "many-to-many relationship", and I only have to update these embedded documents once a year.
Which of the two options fits better?
Thanks for your help!

In your case the embedded documents only have to be updated once a year, which means read operations will be far more frequent than write operations.
So, to optimize reads, references should be avoided.
The only remaining concern is whether the embedded documents are large and whether they are duplicated frequently. If neither is the case, feel free to use embedded documents; embedding is what MongoDB's document model is designed for.
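The trade-off can be sketched with plain Python dicts standing in for MongoDB documents. This is only an illustration: the students/courses data and every function name below are hypothetical and not taken from the question, and no database is involved.

```python
# Hypothetical many-to-many data: students take courses.
# Plain dicts stand in for MongoDB documents.

# Option 1: embed a copy of each course in every student that takes it.
students_embedded = [
    {"_id": 1, "name": "Ann", "courses": [{"_id": 10, "title": "Algebra"},
                                          {"_id": 11, "title": "Biology"}]},
    {"_id": 2, "name": "Bob", "courses": [{"_id": 10, "title": "Algebra"}]},
]

# Option 2: store each course once and keep only its _id in the student.
courses = {10: {"_id": 10, "title": "Algebra"},
           11: {"_id": 11, "title": "Biology"}}
students_referenced = [
    {"_id": 1, "name": "Ann", "course_ids": [10, 11]},
    {"_id": 2, "name": "Bob", "course_ids": [10]},
]


def titles_embedded(student):
    """Read path for option 1: everything is already in the student."""
    return [c["title"] for c in student["courses"]]


def titles_referenced(student, course_lookup):
    """Read path for option 2: each course needs an extra lookup."""
    return [course_lookup[cid]["title"] for cid in student["course_ids"]]


def rename_course_embedded(students, course_id, new_title):
    """Write path for option 1: every embedded copy must be updated.

    Returns the number of copies touched.
    """
    touched = 0
    for student in students:
        for course in student["courses"]:
            if course["_id"] == course_id:
                course["title"] = new_title
                touched += 1
    return touched


def rename_course_referenced(course_lookup, course_id, new_title):
    """Write path for option 2: a single document is updated."""
    course_lookup[course_id]["title"] = new_title
    return 1
```

With a yearly update, the extra work in `rename_course_embedded` is paid rarely, while the cheaper read path in `titles_embedded` is used on every read.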

Related

All vs All comparisons on MongoDB

We are planning to use MongoDB for a general purpose system and it seems well suited to the particular data and use cases we have.
However, we have one use case where we will need to compare every document (of which there could be tens of millions) with every other document. The "distance measure" could be precomputed offline by another system, but we are concerned about the online performance of MongoDB when we want to query, e.g. when we want to see the top 10 closest documents in the entire collection to a list of specific documents.
Is this likely to be slow? Also, can this be done across collections (e.g. query for the top 10 closest documents in one collection to a document in another collection)?
Thanks in advance,
FK

Will I have problems using MongoDb regarding the size of documents?

I'm trying to develop a professional social network and I use MongoDB as the database. I wanted to ask whether I will have a problem with the database regarding the size of documents, given that we plan to have a large number of users in the social network. I hope to get some useful feedback from you.
"Large number of users" is somewhat vague; having a rough estimate helps. Anyway, the document size limit in MongoDB is 16MB, which looks like enough for storing a user's profile details. However, in your "networking" use case you might be planning to keep followers/friends. Whether to store them in the same document as the user profile or not is a different question in itself. You might want to check these out:
What is a good MongoDB document structure for most efficient querying of user followers/followees?
http://www.10gen.com/events/common-mongodb-use-cases
http://docs.mongodb.org/manual/use-cases/
http://nosql.mypopescu.com/post/316345119/mongodb-usecases
One issue you might run into is that MongoDB stores the text of a field name for each field in each document. So if you have a field called "Name" or "Address" in a set of documents, that text will appear in every single document, taking up space. This is different from a relational database, which has a schema and stores the name of a column only once.
A few years ago I worked on a project where the engineers had a bit of a surprise at the size of their data set when they simulated millions of users, because they had not taken this into consideration. They had optimized the data for size (i.e. "loc1" instead of "Location 1") but had not done the same for the field names. This is a common problem when developers used to RDBMS development make assumptions about NoSQL solutions: they counted only the size of their data, not field name plus field value.
They were glad they found this out in a test before they went live, otherwise they would have had to migrate every live document in order to implement the changes they wanted.
It isn't a big deal, and certainly not a reason to avoid MongoDB (being schemaless and treating each document as a unique item is, after all, a feature rather than a bug or design flaw). It's just something to keep in mind.
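A rough way to see the effect of field-name length is to compare serialized sizes. The sketch below uses compact JSON as a stand-in for BSON, which is an assumption: real BSON sizes differ, but field names are repeated in every document in both formats. The field names and the one-million document count are made up for illustration.

```python
import json


def serialized_size(doc):
    """Size in bytes of a compact JSON encoding (a rough proxy for BSON)."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


# Same values, different field names (both hypothetical).
long_names = {"Location 1 Description": "loc1", "Customer Full Name": "Ann"}
short_names = {"l1": "loc1", "n": "Ann"}


def total_size(doc, count):
    """Estimated total size of `count` documents shaped like `doc`."""
    return serialized_size(doc) * count


# Bytes saved across one million documents purely by shortening names.
saving = total_size(long_names, 1_000_000) - total_size(short_names, 1_000_000)
```

Printing `saving` shows how much of the total is field-name text rather than data; running the same estimate on a realistic sample document before going live is the kind of test described above.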

MongoDB -- large number of documents

This is related to my last question.
We have an app where we are storing large amounts of data per user. Because of the nature of the data, we previously decided to create a new database for each user. This would have required a large number of databases (probably millions), and as someone pointed out in a comment, this indicated a wrong design.
So we changed the design and now we are thinking about storing each user's entire information in one collection. This means one collection maps to exactly one user. Since there are 12,000 collections available per database, we can store 12,000 users per DB (and this limit could be increased).
But now my question is: is there any limit on the number of documents a collection can have? Because of the way we need to store data per user, we expect to have a huge number of documents per collection (tens of millions in extreme cases). Is that OK for MongoDB, and design-wise?
EDIT
Thanks for the answers. I guess then it's OK to have a large number of documents per collection.
The app is a specialized inventory control system. Each user has a large number of little pieces of information related to them. Each piece of information has a category and some related stuff under that category. Moreover, no two collections need to see each other's data, hence an index that touches more than one collection is not needed.
The limit is about 24k namespaces. About 12k is the figure usually quoted for collections because each collection has the _id index by default; keep in mind that any additional indexes on the collections use up namespaces as well. To adjust the number of collections and indexes you can have, use the --nssize option when you start mongod.
There are plenty of implementations around with billions of documents in a collection (and I'm sure there are several with trillions), so "tens of millions" should be fine. Some numbers, such as returned counts, are constrained to 64 bits, so you might run into issues once you hit 2^64 documents.
What sort of query and update load are you going to be looking at?
Your design still doesn't make much sense. Why store each user in a separate collection?
What indexes do you have on the data? If you are indexing by some field whose content is common across all the users, you'll get a significant saving in total index size by having a single collection with one index.
Index size, not total database size, is often the limiting factor when it comes to performance.
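As a minimal sketch of the single-collection alternative: every user's items live in one list, each carrying a `user_id` field, and one index keyed on (user, category) serves all users. The field names and the in-memory `build_index` helper are hypothetical; a dict is used here only to illustrate the idea and does not model how MongoDB stores its indexes.

```python
from collections import defaultdict

# Hypothetical inventory items from several users in ONE collection.
items = [
    {"user_id": "u1", "category": "bolts", "qty": 40},
    {"user_id": "u1", "category": "nuts", "qty": 15},
    {"user_id": "u2", "category": "bolts", "qty": 7},
]


def build_index(docs, fields):
    """Map a tuple of field values to positions of matching documents."""
    index = defaultdict(list)
    for pos, doc in enumerate(docs):
        index[tuple(doc[f] for f in fields)].append(pos)
    return index


# One compound index shared by all users, instead of one index per
# per-user collection.
by_user_category = build_index(items, ("user_id", "category"))


def find(docs, index, *key):
    """Return the documents whose indexed fields equal `key`."""
    return [docs[pos] for pos in index.get(tuple(key), [])]
```

For example, `find(items, by_user_category, "u1", "bolts")` returns only that user's matching items, so users still never see each other's data even though they share a collection.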
Why do you have so many documents per user? How large are they?
Craigslist put 2+ billion documents in MongoDB so that shouldn't be an issue if you have the hardware to support it and aren't being inefficient with your indexes.
If you posted more of your schema here you'd probably get better advice.

MongoDB - embedded vs indexes

My question is pretty simple. I am building my first application with MongoDB. Up until now, I have always used SQL. I have read a lot of information about embedding documents versus linking documents.
My question to the MongoDB veterans is: is there a huge difference in speed/performance if I use indexed links/queries as opposed to embedded docs? If there is a huge difference, can you please explain why? Thank you.
Again, I am new to MongoDB and just don't want to get off on the wrong foot. Thank you.
Yes, there is an enormous difference between references and embedded docs.
An embedded document is stored in the same disk location as the rest of the document's fields, so no additional network round trips or disk seeks are needed to retrieve the embedded document when you query the document as a whole.
DBRefs, on the other hand, are simply a pointer (collection name plus _id) to a document in another collection. It takes an additional round trip and additional disk seeks to get the "linked" document. See the spec for DBRefs here:
http://www.mongodb.org/display/DOCS/Database+References#DatabaseReferences-DBRef
You should try to optimize your most common query by including in a single document all the info needed to satisfy that query.
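The difference in round trips can be illustrated with a toy in-memory store that counts fetches. `CountingStore` and its `find_one` method are hypothetical stand-ins written for this sketch, not a real driver API, and the posts/authors data is made up.

```python
class CountingStore:
    """Toy document store that counts how many fetches were made."""

    def __init__(self, collections):
        self.collections = collections
        self.fetches = 0

    def find_one(self, collection, _id):
        # Each call stands for one round trip to the database.
        self.fetches += 1
        return self.collections[collection][_id]


store = CountingStore({
    # Embedded: the author lives inside the post document.
    "posts_embedded": {1: {"_id": 1, "title": "Hello",
                           "author": {"name": "Ann"}}},
    # Linked: the post only holds the author's _id.
    "posts_linked": {1: {"_id": 1, "title": "Hello", "author_id": 7}},
    "authors": {7: {"_id": 7, "name": "Ann"}},
})


def author_name_embedded(store, post_id):
    """One fetch: the author is part of the post."""
    post = store.find_one("posts_embedded", post_id)
    return post["author"]["name"]


def author_name_linked(store, post_id):
    """Two fetches: first the post, then the referenced author."""
    post = store.find_one("posts_linked", post_id)
    author = store.find_one("authors", post["author_id"])
    return author["name"]
```

Both functions return the same name, but the linked version increments `store.fetches` twice per call, which is the extra cost the answer describes.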

mongodb document structure

My database has a users collection:
each user has multiple documents,
each document has multiple sections,
each section has multiple works.
Users work with the works collection very often (adding, updating, and deleting works). So my question is: how should I structure the collections? There are 100-200 works per section.
Should I make one works collection for all users, keyed by user _id, or is there a better solution?
Depends on what kind of queries you have. The guideline is to arrange documents so that you can fetch all you need in ideally one query.
On the other hand, what you probably want to avoid is having mongo reallocate documents because there's not enough space for an in-place update. You can do that by preallocating enough space, or by extracting the frequently changing part into its own collection.
As you can read in MongoDB docs,
"Generally, for "contains" relationships between entities, embedding should be chosen. Use linking when not using linking would result in duplication of data."
So if each user only has access to his own documents, I think you're good. Just keep in mind there's a limit on document size (16MB, I think), which you should be careful about since you're embedding lots of stuff.
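If the frequently changing works are extracted into their own collection, each work needs to carry the ids of its parents. A minimal sketch, assuming hypothetical field names (`documents`, `sections`, `works`, `user_id`, `document_id`, `section_id`) and using plain dicts in place of MongoDB documents:

```python
def flatten_works(user):
    """Turn a fully embedded user document into flat work documents.

    Each resulting work keeps the ids of its user, document and section
    so it can be queried on its own.
    """
    works = []
    for doc in user["documents"]:
        for section in doc["sections"]:
            for work in section["works"]:
                works.append({
                    "user_id": user["_id"],
                    "document_id": doc["_id"],
                    "section_id": section["_id"],
                    **work,
                })
    return works


# Hypothetical embedded layout: user -> documents -> sections -> works.
sample_user = {
    "_id": "u1",
    "documents": [{
        "_id": "d1",
        "sections": [
            {"_id": "s1", "works": [{"_id": "w1", "title": "Draft"},
                                    {"_id": "w2", "title": "Review"}]},
            {"_id": "s2", "works": [{"_id": "w3", "title": "Publish"}]},
        ],
    }],
}

flat_works = flatten_works(sample_user)
```

With this shape, adding, updating, or deleting a work touches one small document rather than rewriting a large user document, at the cost of a separate query to load the works for a section.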