After a few months of working with MongoDB, I'm trying to understand whether using sub-documents for nested data is a good idea or not, especially in this example:
Assume a users collection in which each document has the following:
{
"_id" : ObjectId("some valid Object ID"),
"userName" : "xxxxx",
"email" : "xxxx#xxxx.xx"
}
Now, in my system there are also rooms (another collection), and I want to save scores per room for each user.
In my mind, I have 2 major options to do that: (1) create a new collection called userScores that holds userId, roomId and scores fields, as I did previously in MySQL and other relational DBs, or (2) create a sub-document inside the above user document:
{
"_id" : ObjectId("sdfdfdfdfdf"),
"userName" : "xxxxx",
"email" : "xxxx#xxxx.xx",
"scores": {
"roomIdX": 50,
"roomIdY": 50,
"roomIdZ": 50
}
}
Which do you think is the better way, so that later I can handle searches, aggregations and other data queries in code (Mongoose in my case)?
Thanks.
First, let me explain the schema of my collections.
I have 3 collections:
company, deal, price
I want to use information from all three collections to build a single reactive, responsive table. Here is the image
Now the schema for the price collection is like this
{
"_id" : "kSqH7QydFnPFHQmQH",
"timestamp" : ISODate("2015-10-11T11:49:50.241Z"),
"dealId" : "X5zTJ2y675PjmaLMx",
"deal" : "Games",
"price" : [{
"type" : "worth",
"value" : "Bat"
}, {
"type" : "Persons",
"value" : 4
}, {
"type" : "Cost",
"value" : 5
}],
"company" : "Company1"
}
Schema for the company collection is
{
"_id" : "da2da",
"name" : "Company1"
}
Schema for deal collection is
{
"_id" : "X5zTJ2y675PjmaLMx",
"name" : "Games"
}
For each company there will be 3 columns added to the table (worth, persons, cost).
For each deal there will be a new row in the table.
The information comes from 3 collections into a single table. First I want to ask: is it wise to build a table from 3 different collections? If yes, how could I do that in Blaze?
If not, then I will have to build the table from the price collection only. What would be the best schema for this collection?
P.S. In both cases I want the table to be reactive.
Firstly, I recommend reywood:publish-composite for publishing related collections.
Secondly, there is no intrinsic problem in setting up a table like this. You'll first figure out which collection to loop over with your {{#each}} in Spacebars, and then you'll define helpers that return the values from the related collections to your templates.
As far as your schema design goes, the choice of whether to nest within a collection or use an entirely separate collection is typically driven by size. If the related object is overall "small" then nesting can work well: you automatically get the nested object when you publish and query that collection. If, on the other hand, it's going to be "large" and/or you want to avoid having to update every document when something in the related object changes, then a separate collection can be better.
If you do separate your collections then you'll want to refer to objects from the other collection by _id and not by name, since names can easily change. For example, in your price collection you'd want to use companyId: "da2da" instead of company: "Company1".
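A minimal sketch of the helper logic in plain JavaScript, assuming the price documents carry a companyId field as suggested above. The in-memory arrays and the function name buildRows are hypothetical stand-ins for the published collections; in Blaze the same lookups would live in template helpers.

```javascript
// Hypothetical in-memory stand-ins for the three collections.
const companies = [{ _id: "da2da", name: "Company1" }];
const deals = [{ _id: "X5zTJ2y675PjmaLMx", name: "Games" }];
const prices = [
  {
    _id: "kSqH7QydFnPFHQmQH",
    dealId: "X5zTJ2y675PjmaLMx",
    companyId: "da2da", // reference by _id, not by name
    price: [
      { type: "worth", value: "Bat" },
      { type: "Persons", value: 4 },
      { type: "Cost", value: 5 },
    ],
  },
];

// One row per deal; each company contributes its price entries as cells.
function buildRows(deals, companies, prices) {
  return deals.map((deal) => ({
    deal: deal.name,
    cells: companies.map((company) => {
      const doc = prices.find(
        (p) => p.dealId === deal._id && p.companyId === company._id
      );
      // Turn [{type, value}, ...] into {type: value, ...}; empty if no price yet.
      const values = {};
      for (const entry of doc ? doc.price : []) {
        values[entry.type] = entry.value;
      }
      return { company: company.name, values };
    }),
  }));
}

const rows = buildRows(deals, companies, prices);
console.log(JSON.stringify(rows, null, 2));
```

Looping over deals for the rows and over companies for the column groups keeps the template simple; a company with no price document for a deal just yields empty cells.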
There is an existing person collection in the system which is like:
{
"_id" : ObjectId("536378bcc9ecd7046700001f"),
"engagements":{
"5407357013875b9727000111" : {
"role" : "ADMINISTRATOR",
},
"5407357013875b9727000222" : {
"role" : "DEVELOPER",
}
}
}
Multiple person objects can have the same engagement with a specific role. I need to run a query against this hierarchy that returns all the persons which have a specific engagement in the engagements property of the person collection.
I want to get all the persons which have 5407357013875b9727000222 in the engagements.
I know the $in operator could be used, but the problem is that I need to compare the keys of the engagements sub-document.
I think it's as simple as this:
db.users.find({'engagements.5407357013875b9727000222': {$exists: true}})
If you want to match against multiple engagement ids, then you'll have to use $or. Sorry, no $in for you here.
Note, however, that you need to restructure your data, because this structure can't be indexed to help this particular query. Here I assume you care about performance and this query is used often enough to have an impact on the database.
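One possible restructuring, sketched in plain JavaScript: turn the keyed engagements object into an array of sub-documents so the engagement id becomes a value rather than a key. The field name engagementId and the function name restructureEngagements are assumptions for illustration, not something from the original schema.

```javascript
// Convert the keyed engagements object into an array of sub-documents.
// The field name engagementId is an assumption made for this sketch.
function restructureEngagements(person) {
  const engagements = Object.entries(person.engagements || {}).map(
    ([engagementId, details]) => ({ engagementId, ...details })
  );
  return { ...person, engagements };
}

const person = {
  _id: "536378bcc9ecd7046700001f",
  engagements: {
    "5407357013875b9727000111": { role: "ADMINISTRATOR" },
    "5407357013875b9727000222": { role: "DEVELOPER" },
  },
};

const restructured = restructureEngagements(person);
console.log(restructured.engagements);
```

With documents in this shape, a query on "engagements.engagementId" can use $in for several ids at once, and an index on that field should be able to support it.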
I have 2 collections.
Collection "users"
{
"_id" : ObjectId("54b00098e0fdb6634b1f54e6"),
"state" : "active",
"backends" : [
DBRef("backends", ObjectId("54b001ebe0fd853df1c93419")),
DBRef("backends", ObjectId("54b00284e0fd853df1c9341b"))
]
}
Collection "backends"
{
"_id" : ObjectId("54b001ebe0fd853df1c93419"),
"state" : "running"
}
I want to get a list of a user's backends where the backend's state is "running".
How can MongoDB do this, like joining two tables?
Is there any way to search backward from the backend, or a function to filter with?
I can search like this:
db.users.find({"backends.$id" : "distring"})
But what if I want to search on the state inside the backend object? Like this:
db.users.find({"backends.$state" : "running"})
But of course it is not working.
MongoDB queries don't support joins (the $lookup aggregation stage only arrived later, in version 3.2), so you need to do this in two steps. In the shell:
var ids = db.backends.find({state: 'running'}, {_id: 1}).map(function(backend) {
    return backend._id;
});
var users = db.users.find({'backends.$id': {$in: ids}}).toArray();
On a side note, you're probably better off using a plain ObjectId instead of a DBRef for the backends array elements unless the ids in that array can actually refer to docs in multiple collections.
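To show the two-step idea on the plain-id layout, here is a small in-memory sketch in JavaScript. The arrays stand in for the two collections, the ids are stored as plain strings instead of DBRefs, and the "stopped" state of the second backend is invented for the example.

```javascript
// Hypothetical in-memory data; backends holds plain ids instead of DBRefs.
const backends = [
  { _id: "54b001ebe0fd853df1c93419", state: "running" },
  { _id: "54b00284e0fd853df1c9341b", state: "stopped" }, // invented state
];
const users = [
  {
    _id: "54b00098e0fdb6634b1f54e6",
    state: "active",
    backends: ["54b001ebe0fd853df1c93419", "54b00284e0fd853df1c9341b"],
  },
];

// Step 1: collect the ids of all running backends.
const runningIds = new Set(
  backends.filter((b) => b.state === "running").map((b) => b._id)
);

// Step 2: for each user, keep only the referenced backends that are running.
const runningByUser = users.map((user) => ({
  userId: user._id,
  running: user.backends.filter((id) => runningIds.has(id)),
}));

console.log(runningByUser);
```

Against a real database, step 1 corresponds to the find on backends and step 2 to the $in query on users; the filtering per user then happens in application code as above.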
I'm using passport.js to store my users into my mongodb. A user object looks like this
{
"_id" : ObjectId("54893faf0907a100006341ee"),
"local" : {
"password" : [encrypted password],
"email" : "johnsmith#domain.com"
},
"__v" : 0
}
In a MongoDB shell, how would I go about listing all the emails? I'm finding it difficult to do this as my data sits two levels deep within the object. Cheers!
You can use distinct to get a list of a field's distinct values in the collection, using dot notation to reference the embedded field:
db.users.distinct('local.email')
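For intuition, dot notation and distinct can be imitated on plain objects. This is only a rough sketch in JavaScript: getPath and distinctValues are hypothetical helper names, the sample documents are made up, and the real distinct command does more (for example, it also looks inside array fields).

```javascript
// Resolve a dotted path such as "local.email" against a plain object.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Collect the distinct defined values of a dotted field across documents.
function distinctValues(docs, path) {
  const seen = new Set();
  for (const doc of docs) {
    const value = getPath(doc, path);
    if (value !== undefined) seen.add(value);
  }
  return [...seen];
}

// Made-up sample documents for the sketch.
const sampleUsers = [
  { local: { email: "johnsmith#domain.com" } },
  { local: { email: "janedoe#domain.com" } },
  { local: { email: "johnsmith#domain.com" } }, // duplicate is dropped
  { facebook: { id: "123" } }, // no local.email, so it is skipped
];

console.log(distinctValues(sampleUsers, "local.email"));
```

If duplicates matter and you want one entry per user, a find with a projection on 'local.email' is probably the closer fit than distinct.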
I have two MongoDB collections user and customer which are in one-to-one relationship. I'm new to MongoDB and I'm trying to insert documents manually although I have Mongoose installed. I'm not sure which is the correct way of storing document reference in MongoDB.
I'm using normalized data model and here is my Mongoose schema snapshot for customer:
/** Parent user object */
user: {
type: Schema.Types.ObjectId,
ref: "User",
required: true
}
user
{
"_id" : ObjectId("547d5c1b1e42bd0423a75781"),
"name" : "john",
"email" : "test#localhost.com",
"phone" : "01022223333",
}
I want to make a reference to this user document from the customer document. Which of the following is correct - (A) or (B)?
customer (A)
{
"_id" : ObjectId("547d916a660729dd531f145d"),
"birthday" : "1983-06-28",
"zipcode" : "12345",
"address" : "1, Main Street",
"user" : ObjectId("547d5c1b1e42bd0423a75781")
}
customer (B)
{
"_id" : ObjectId("547d916a660729dd531f145d"),
"birthday" : "1983-06-28",
"zipcode" : "12345",
"address" : "1, Main Street",
"user" : {
"_id" : ObjectId("547d5c1b1e42bd0423a75781")
}
}
Remember these things
Embedding is better for...
Small subdocuments
Data that does not change regularly
When eventual consistency is acceptable
Documents that grow by a small amount
Data that you’ll often need to perform a second query to fetch
Fast reads
References are better for...
Large subdocuments
Volatile data
When immediate consistency is necessary
Documents that grow a large amount
Data that you’ll often exclude from the results
Fast writes
Variant A is better.
You can also use populate with Mongoose.
Use variant A. As long as you don't want to denormalize any other data (like the user's name), there's no need to create a child object.
This also avoids unexpected complexities with the index, because indexing an object might not behave like you expect.
Even if you were to embed an object, _id would be a weird name - _id is only a reserved name for a first-class database document.
One to one relations
1-to-1 relations are relations where each item corresponds to exactly one other item, e.g.:
an employee has a resume and vice versa
a building has a floor plan and vice versa
a patient has a medical history and vice versa
//employee
{
_id : '25',
name: 'john doe',
resume: 30
}
//resume
{
_id : '30',
jobs: [....],
education: [...],
employee: 25
}
We can model the employee-resume relation by having a collection of employees and a collection of resumes and having the employee point to the resume through linking, where we have an ID that corresponds to an ID in the resume collection. Or if we prefer, we can link in the other direction, where we have an employee key inside the resume collection that points to the employee itself. Or if we want, we can embed. So we could take this entire resume document and embed it right inside the employee collection, or vice versa.
This embedding depends upon how the data is being accessed by the application and how frequently the data is being accessed. We need to consider:
frequency of access
the size of the items - what is growing all the time and what is not growing. Every time we add something to a document, there is a point beyond which the document needs to be moved in the collection. A document also cannot grow beyond 16MB, although reaching that limit is unlikely.
atomicity of data - MongoDB has no transactions spanning multiple documents (multi-document transactions only arrived later, in version 4.0); there are atomic operations on individual documents. So if we knew that we couldn't withstand any inconsistency and that we wanted to be able to update the employee and the resume together every time, we may decide to put them into the same document and embed them one way or the other so that we can update it all at once.
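The difference between the linked and the embedded shape can be sketched in plain JavaScript. The function name embedResume is hypothetical, and the elided jobs and education arrays from the example are left empty here.

```javascript
// Linked: two documents that the application must keep consistent itself.
const employeeLinked = { _id: 25, name: "john doe", resume: 30 };
const resumeLinked = { _id: 30, jobs: [], education: [], employee: 25 };

// Embedded: the resume lives inside the employee document, so the link
// fields (employee.resume, resume._id, resume.employee) are no longer needed.
function embedResume(employee, resume) {
  const { _id, employee: _backRef, ...resumeFields } = resume;
  const { resume: _link, ...employeeFields } = employee;
  return { ...employeeFields, resume: resumeFields };
}

const employeeEmbedded = embedResume(employeeLinked, resumeLinked);
console.log(employeeEmbedded);
```

In the embedded shape, a change to the name and to the resume touches a single document, so it can be done in one atomic update; in the linked shape the same change needs two separate writes.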
In MongoDB it is strongly recommended to embed documents wherever you can, especially in your case, where you have 1-to-1 relations.
Why? You can't use atomic join operations in your queries, although that is not the main reason. The main reason is that each join-like operation (theoretically) needs a disk seek, which takes about 20 ms, whereas an embedded sub-document needs just 1 seek.
I believe the best db schema for you is to use just one id for all of your entities:
{
_id : ObjectId("547d5c1b1e42bd0423a75781"),
userInfo :
{
"name" : "john",
"email" : "test#localhost.com",
"phone" : "01022223333",
},
customerInfo :
{
"birthday" : "1983-06-28",
"zipcode" : "12345",
"address" : "1, Main Street",
},
staffInfo :
{
........
}
}
Now if you just want the userinfo you can use
db.users.findOne({_id : ObjectId("547d5c1b1e42bd0423a75781")},{userInfo : 1}).userInfo;
it will give you just the userInfo:
/* 0 */
{
"name" : "john",
"email" : "test#localhost.com",
"phone" : "01022223333"
}
And if you just want the customerInfo you can use
db.users.findOne({_id : ObjectId("547d5c1b1e42bd0423a75781")},{customerInfo : 1}).customerInfo;
it will give you just the customerInfo :
/* 0 */
{
"birthday" : "1983-06-28",
"zipcode" : "12345",
"address" : "1, Main Street"
}
and so on.
This schema needs the minimum number of round trips, and you are actually using MongoDB's document-based design with the best performance you can achieve.