I'm new to MongoDB, and I just want to create a simple project to test MongoDB's performance.
The project is like a simple CMS:
it has users, blogs and comments, and users can have friends.
So I designed my database like this:
user
{
  _id:
  name:
  birth_day:
  sex:
  friends: [id_1, id_2]
}
blogs
{
  title:
  owner:
  tags:
  comments: [
    { "_id": "", "content": "", "date_created": "" },
    { "_id": "", "content": "", "date_created": "" }
  ],
  likes: [_id, _id]
}
How many collections are needed for this database? Can I use one collection for both users and blogs? Thanks in advance.
Because MongoDB is a schemaless (schema-free) database, you can build almost any kind of structure within a document. Supported structures include:
individual elements
nested arrays
nested documents
There are a couple of things to consider during schema design that make it useful to keep users and blogs in separate collections. For example, if you store something in a nested array, you can create a multikey index to speed up searches within that array, but a single compound index can include at most one array field. So if you store friends, blogs, posts, and tags all as arrays, no single compound index can cover more than one of them.
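A minimal sketch of that limit in the mongo shell (the collection and field names here are only illustrative):
// a multikey index on one array field is fine
db.users.createIndex({ friends: 1 })
// a separate multikey index on another array field is also allowed
db.users.createIndex({ tags: 1 })
// but a compound index over two array fields is rejected whenever a document
// has arrays in both fields ("cannot index parallel arrays")
db.users.createIndex({ friends: 1, tags: 1 })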
It is also important to know that there is a size limit for each document, which is currently 16 MB.
In your scenario, I would make Users a collection and reference it by _id from the blog collection.
In practice, you could make the blogs an attribute of User, the only constraint being the max document size of 16 MB - but that's a lot of blog text.
To get around that (assuming you need to), a separate Blog collection referencing the user _id would be fine. You may need to denormalise the user name too if that's not your _id. This would mean you can get all the blogs for a user in a single query.
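A minimal sketch of that layout (the field names are illustrative, not prescriptive):
// users collection
{ _id: ObjectId("..."), name: "alice", birth_day: ISODate("1990-01-01"), sex: "f", friends: [ /* user _ids */ ] }
// blogs collection, referencing the author and denormalising the name
{ _id: ObjectId("..."), title: "First post", owner_id: ObjectId("..."), owner_name: "alice", tags: ["intro"], comments: [], likes: [] }
// all blogs for one user in a single query
db.blogs.find({ owner_id: ObjectId("...") })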
I am using MongoDB for the first time, and I have some experience with NoSQL databases.
I am attempting to replicate behaviour that I have managed to achieve on Google's Cloud Firestore:
I want to create a collection within a document (a sub-collection). I have not been able to replicate this behaviour using MongoDB, as I cannot find anything about it in the documentation. Is this behaviour even possible?
Thanks in advance.
Edit:
Here is a screenshot of a sample document in biometric_data:
MongoDB has embedded documents, which can be used to store the same data. You can try creating an array of sub-documents (each having a name and a data property):
{
name: "",
email: "",
...otherFields,
biometric_data: [
{
name: "glucose",
data: {
preferred_unit: "mg/dL"
// Add new properties as required
}
},
{
name: "weight",
data: {
preferred_unit: "KG"
}
}
],
...templateData
}
However, a document's size in MongoDB cannot exceed 16 MB. If the number of entries in biometric_data is limited, then you can use sub-documents; otherwise you might have to create another collection and store them as top-level documents (generally preferred for chat apps, or anywhere the number of sub-documents can be really high).
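A rough sketch of that alternative (the collection and field names here are assumptions, not part of the answer above): each biometric entry becomes its own document that holds a reference back to the user.
// biometric_data collection, one document per entry, referencing the user
{ _id: ObjectId("..."), user_id: ObjectId("..."), name: "glucose", data: { preferred_unit: "mg/dL" } }
{ _id: ObjectId("..."), user_id: ObjectId("..."), name: "weight", data: { preferred_unit: "KG" } }
// fetch all biometric entries for one user
db.biometric_data.find({ user_id: ObjectId("...") })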
Sub-collections (in Firestore) allow you to structure data hierarchically, making data easier to access. For example, users and posts collections can be structured in either of the ways below:
With sub-collection
users -> {userId} -> posts -> {postId}
Root level collections
users -> {userId}
posts -> {postId}
Though if you use root-level collections, you must add a userId to each posts document to identify the owner of a post.
If you use the nested-documents approach in MongoDB, you are likely to hit the 16 MB document limit if any user decides to add many posts. Similarly, if the biometric_data array can contain many entries, it's best to create another collection.
Firestore's sub-collections and their documents do not count towards the 1 MB max size of the parent document, but nested documents in MongoDB do.
Also check out:
Firestore - proper NoSQL structure for user-specific data
Is mongodb sub documents equivalent to Firestore subcollections?
I'm developing a web application with NodeJS, MongoDB and Mongoose. It is intended to act as an interface between the user and a big data environment. The idea is that the users can execute the big data processes in a separated cluster, and the results are stored in a MongoDB collection Results. This collection may store more than 1 million of documents per user.
The document schema of this collection can be completely different between users. For instance, say we have user1 and user2. Example documents in the Results collection for user1 and user2:
{
  user: ObjectId(user1), // reference to user1 in the Users collection
  inputFields: { variable1: 3, ... },
  outputFields: { result1: 504.75, ... }
}
{
  user: ObjectId(user2),
  inputFields: { country: "US", ... },
  outputFields: { cost: 14354.45, ... }
}
I'm implementing a search engine in the web application so that each user can filter on the fields according to the schema of their documents (for example, user1 must be able to filter by inputFields.variable1, and user2 by outputFields.cost). Of course I know that I must use indexes, otherwise the queries are too slow.
My first attempt was to create an index for each distinct field in the Results collection, but it's quite inefficient, since the database server becomes unstable because of the size of the indexes. So my second attempt was to reduce the number of indexes by using partial indexes, creating indexes that specify the user id in the partialFilterExpression option.
The problem is that if another user has the same schema in the Results collection as an existing user and I try to create the indexes for this new user, MongoDB throws this exception:
Index with pattern: { inputFields.country: 1 } already exists with different options
It happens because partial indexes with the same key pattern conflict, even though the partialFilterExpression is different.
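Roughly what the conflicting index creation looks like (the ObjectIds below are just placeholders for real user ids):
// partial index scoped to user1's documents
db.Results.createIndex(
  { "inputFields.country": 1 },
  { partialFilterExpression: { user: ObjectId("<user1 id>") } }
)
// same key pattern scoped to another user: rejected with
// "Index with pattern: { inputFields.country: 1 } already exists with different options"
db.Results.createIndex(
  { "inputFields.country": 1 },
  { partialFilterExpression: { user: ObjectId("<user2 id>") } }
)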
So my questions are: How could I allow the users to query their results efficiently in this environment? Is MongoDB really suitable for this use case?
Thanks
I am trying to fetch the documents from a collection based on the existence of a reference to these documents in another collection.
Let's say I have two collections Users and Courses and the models look like this:
User: {_id, name}
Course: {_id, name, user_id}
Note: this is just a hypothetical example and not the actual use case. So let's assume that duplicates are fine in the name field of Course. Think of Course as CourseRegistrations.
Here, I am maintaining a reference to User in Course, with user_id holding the _id of the User. Note that it is stored as a string.
Now I want to retrieve all users who are registered to a particular set of courses.
I know that it can be done with two queries: first run a query to get the user_id values from the Course collection for the set of courses, then query the User collection using $in with the user ids retrieved in the previous query (sketched below). But this may not be good if the number of documents is in the tens of thousands or more.
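For clarity, a minimal sketch of that two-query approach in the shell (the course names are hypothetical, and converting the string ids back to ObjectId is an assumption about the User schema):
// 1) collect the user ids for the courses of interest
var userIds = db.Course.distinct("user_id", { name: { $in: ["math", "physics"] } })
// 2) fetch the corresponding users
// (user_id is stored as a string, so convert if User._id is an ObjectId)
db.User.find({ _id: { $in: userIds.map(function (id) { return ObjectId(id); }) } })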
Is there a better way to do this in just one query?
What you are describing is a typical SQL join, which MongoDB does not support in the same way. As you already suggested, you can do it in two different queries.
There is one more way to handle it. It's not exactly a solution, but a valid workaround that is common in NoSQL databases: store the most frequently accessed fields inside the same collection.
You can store some of the user fields inside the course collection as an embedded document.
Course : {
  _id : 'xx',
  name: 'yy',
  user: {
    fname : 'r',
    lname : 'v',
    pic: 's'
  }
}
This is a good approach if the subset of fields you intend to retrieve from the user collection is small. You might be worried about the redundant user data stored in the course collection, but that's exactly what makes MongoDB powerful: it's a one-time insert, but your queries will be a lot faster.
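With that embedding in place, a single query returns both the course and the denormalised user fields (a sketch; the course names in the filter are placeholders):
// one query, no second lookup against the User collection
db.Course.find(
  { name: { $in: ["math", "physics"] } }, // hypothetical course names
  { name: 1, user: 1 }                    // project the course name plus the embedded user
)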
I wish to add an _id as a property for objects in a mongo array.
Is this good practice ?
Are there any problems with indexing ?
I wish to add an _id as a property for objects in a mongo array.
I assume:
{
g: [
{ _id: ObjectId(), property: '' },
// next
]
}
is the type of structure this question is about.
Is this good practice ?
Not normally. _ids are unique identifiers for entities. As such, if you are looking to add an _id within a sub-document object, then you might not have normalised your data very well, and it could be a sign of a fundamental flaw in your schema design.
Sub-documents are designed to contain repeating data for that document, i.e. the addresses of a user or something similar.
That being said, an _id is not always a bad thing to add. Take the address example I just gave: imagine you had a shopping cart system and (for some reason) you didn't replicate the address into the order document; then you would use an _id or some other identifier to get that sub-document out.
You also have to take linked documents into consideration. If that _id refers to another document and the properties are custom attributes for that document in relation to the linked one, then that's okay too.
Are there any problems with indexing ?
An ObjectId is still quite sizeable (12 bytes), so that is something to weigh against using a smaller, less unique id, or no _id at all, for sub-documents.
For indexes it doesn't really work any differently from the standard _id field on the document itself, and a unique index on the field should work across the collection (scenario dependent, so test your queries).
NB: MongoDB will not add an _id to sub-documents for you.
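So if you want sub-documents to carry one, you generate it yourself at insert time. A minimal sketch (the collection and field names are only illustrative):
// generate the sub-document _id client-side, since the server won't do it
db.things.insertOne({
  g: [
    { _id: ObjectId(), property: "first" },
    { _id: ObjectId(), property: "second" }
  ]
})
// index the sub-document _id path like any other field (this becomes a multikey index)
db.things.createIndex({ "g._id": 1 })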
Well, I'm planning a collection schema and I'm unsure how to embed data inside the tags field.
I have a collection named products:
products
->_id
->product_name
->tags
Then I have a tags collection:
tags
->_id
->tag_name
And I have a tags_childs collection:
tags_childs
->_id
-> id_tag
->tag_child_name
Now I need to save tags and tags_childs into the products collection, so I thought it would be good to save a field like this in the products collection:
tags: [[_id_tag]:[_id_tag_child,_id_tag_child, ... etc],[_id_tag]:[_id_tag_child,_id_tag_child, ... etc]]
But I don't think this is the right way, because I need to be able to query the products collection, filtering by tags and child tags.
So, for example, I need to filter products by:
+ product_name: 'roastbeef'
+ tag: 'hot'
+ child tag: 'sauce'
or filter by:
+ product_name: 'roastbeef'
+ tag: 'hot'
+ child tag: 'sauce'
+ child tag: 'knife'
+ tag: 'dinner'
Parent tags are always required when filtering, while child tags are optional.
How would you design the collections to support this type of query?
Reading over this, I described a schema for your case in a bit more detail here:
Schema Advice
There really isn't a reason to keep this pivot table, as in the end they are all just tags. You can still search for multiple tags within your database by going with the schema suggested above, while still being able to show all your available tags and their relationships; this way it cleans up your entire DB rather than muddying it with an extra collection that MongoDB just doesn't need.
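As a rough sketch of what that could look like (the exact field layout is an assumption, not taken from the linked answer), parent and child tags can simply live in arrays on the product and be filtered with $all:
// product document with tags embedded directly
{
  _id: ObjectId("..."),
  product_name: "roastbeef",
  tags: ["hot", "dinner"],
  child_tags: ["sauce", "knife"]
}
// required parent tags plus optional child tags in one query
db.products.find({
  product_name: "roastbeef",
  tags: { $all: ["hot", "dinner"] },
  child_tags: { $all: ["sauce", "knife"] }
})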
I'd highly recommend going through this e-book to strengthen your understanding of the NoSQL model Mongo gives you:
Mongo Free eBook PDF