How to store data (documents) in MongoDB - mongodb

I'm studying MongoDB and want to build a small database for a web blog page.
In Mongo we use collections and documents instead of tables and records.
I have 2 documents (entities): User (id, nikname) and Publication (id, title ...). In a relational database we would have user_id as a column inside "Publication", which would mean that a user can have many publications.
Example1
User
{
id: "123456",
nikname: "cool guy",
publications: [
{
id: "some id1",
title: "some title111",
text: "bla bla bla",
// any fields
},
{
id: "some id2",
title: "some title222",
text: "bla bla bla",
// any fields
},
....
]
}
Publication
{
id: "some id",
title: "some title",
text: "bla bla bla",
// any fields
}
In the example above, each user has their own array of publications.
My question is: is this a good way to do it? What if one user has 1000 publications?
Moreover, if each user has their own publications, why would we need to store a publications table (in Mongo it is called a collection)
outside the user as a separate entity?
I was also thinking about storing the user's id inside each publication instead.
Example2
User
{
id: "123456",
nikname: "cool guy"
}
Publication 1
{
id: "some id",
title: "some title",
text: "bla bla bla",
// any fields
USER_ID: "123456"
}
Publication 2
{
id: "some id",
title: "some title",
text: "bla bla bla",
// any fields
USER_ID: "123456"
}
But Example2 does not differ from the relational approach...
So which way is better?
In short, I would like to hear opinions from people who have worked with Mongo.

In Mongo there are 3 ways you can model relationships:
One to One
One to Many (Embedded Docs): your example 1
One to Many (Document References): your example 2
The rule of thumb is to consider the data retrieval pattern
of your application.
For example, if your application frequently needs to fetch the publications of a particular user, you can go for example 1, and you don't need to maintain publications in a separate collection (unless the application requires it). Having a lot of subdocuments is not a problem as long as a single document does not exceed the hard limits (notably the 16 MB maximum BSON document size).
Example 2 is good if your application needs to query by publication as well as by user (similar to a relational model). However, I don't see it as an optimized solution.
Further reading:
https://docs.mongodb.com/manual/applications/data-models-relationships/
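To make the trade-off concrete, here is a minimal in-memory sketch in plain JavaScript (no MongoDB driver involved) of the two shapes from the question. The function names `publicationsEmbedded` and `publicationsReferenced` and the ids `p1`, `p2`, `p3` are made up for illustration; against a real database the referenced variant would correspond roughly to a query on the publications collection filtered by `USER_ID`.

```javascript
// Example 1 shape: publications live inside the user document.
const embeddedUser = {
  id: "123456",
  nikname: "cool guy",
  publications: [
    { id: "p1", title: "some title111" },
    { id: "p2", title: "some title222" },
  ],
};

// Example 2 shape: publications are separate documents pointing at the user.
const publications = [
  { id: "p1", title: "some title111", USER_ID: "123456" },
  { id: "p2", title: "some title222", USER_ID: "123456" },
  { id: "p3", title: "someone else's post", USER_ID: "999" },
];

// Embedded: reading the user document already yields all publications.
function publicationsEmbedded(user) {
  return user.publications;
}

// Referenced: a second lookup, filtered by the user's id, is required.
function publicationsReferenced(allPublications, userId) {
  return allPublications.filter((p) => p.USER_ID === userId);
}

console.log(publicationsEmbedded(embeddedUser).map((p) => p.title));
console.log(publicationsReferenced(publications, "123456").map((p) => p.title));
```

The embedded shape answers "all publications of this user" in one read, while the referenced shape can also answer "all publications, regardless of user" without touching user documents.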

Related

populate or aggregate 2 collections with sorting and pagination

I started using mongoose recently and am a bit confused about how to sort and paginate.
Let's say I make a project like Twitter and I have 3 schemas: first is user, second is post and third is post_detail. The user schema contains the user's data, a post is like a Facebook status or a tweet that we can reply to, and post_detail holds the replies to a post.
user
var userSchema = mongoose.Schema({
username: {
type: String
},
full_name: {
type: String
},
age: {
type: Number
}
});
post
var postSchema = mongoose.Schema({
message: {
type: String
},
created_by: {
type: String
},
total_reply: {
type: Number
}
});
post_detail
var postDetailSchema = mongoose.Schema({
post_id: {
type: String
},
message: {
type: String
},
created_by: {
type: String
}
});
The relations are user._id = post.created_by, user._id = post_detail.created_by, and post_detail.post_id = post._id.
Say user A makes 1 post and 1000 other users comment on that post: how can we sort the comments by the username of the commenter? A user can change their data (full_name and age in this case), so I can't put that data on the post_detail because it can change. Or should I put it on the post_detail anyway, and if a user changes their data, change the post_detail too? If I do that I need to update many rows, because if the same user commented on 100 posts, that data needs to be changed there too.
The problem is how to sort it; I think if I can sort it I can paginate it too. Or should I just use an RDBMS instead of NoSQL in this case?
Thanks anyway, I really appreciate the help and guidance :))
Welcome to MongoDB.
If you want to do it the way you describe, just don't go for Mongo.
You are designing the schema based on relations and not on documents.
Your design requires joins, and this does not work well in Mongo because there is no easy/fast way of doing them.
First, I would not create a separate entity for the post details; instead I would embed them in the Post document as a list.
Regarding your question:
or I just put it on the post_detail and if user change data I just
change the post_detail too?
Yes, that is what you should do. If you want to be able to sort the documents by the username, you should denormalize it and include it in the post_details.
If I had to design the schema, it would be something like this:
{
"message": "blabl",
"authorId" : "userId12",
"total_reply" : 100,
"replies" : [
{
"message" : "okk",
"authorId" : "66234",
"authorName" : "Alberto Rodriguez"
},
{
"message" : "test",
"authorId" : "1231",
"authorName" : "Fina Lopez"
}
]
}
With this schema and the aggregation framework, you can sort the comments by username.
If you don't like this approach, I would rather go for an RDBMS, as you mentioned.
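As a rough sketch of the sorting and pagination idea with the embedded schema: the function `repliesPagePipeline` below only builds the stage list one might pass to an aggregate call (`$match`, `$unwind`, `$sort`, `$skip` and `$limit` are standard aggregation stages), and `repliesPage` is an approximate in-memory stand-in so the example runs without a database. Both function names, the extra third reply and the `page`/`pageSize` parameters are assumptions for illustration.

```javascript
// Hypothetical post document in the embedded shape proposed above.
const post = {
  message: "blabl",
  authorId: "userId12",
  total_reply: 3,
  replies: [
    { message: "okk", authorId: "66234", authorName: "Fina Lopez" },
    { message: "test", authorId: "1231", authorName: "Alberto Rodriguez" },
    { message: "hi", authorId: "777", authorName: "Carla Ruiz" },
  ],
};

// Stage list for an aggregation: one output document per reply,
// ordered by the denormalized author name, then cut into pages.
function repliesPagePipeline(postId, page, pageSize) {
  return [
    { $match: { _id: postId } },
    { $unwind: "$replies" },
    { $sort: { "replies.authorName": 1 } },
    { $skip: page * pageSize },
    { $limit: pageSize },
  ];
}

// In-memory approximation of the same steps, returning only the replies.
function repliesPage(doc, page, pageSize) {
  const byName = (a, b) =>
    a.authorName < b.authorName ? -1 : a.authorName > b.authorName ? 1 : 0;
  return [...doc.replies]
    .sort(byName)
    .slice(page * pageSize, (page + 1) * pageSize);
}

console.log(repliesPage(post, 0, 2).map((r) => r.authorName));
console.log(JSON.stringify(repliesPagePipeline("somePostId", 1, 2)));
```

The price of this design is the one discussed above: when a user renames themselves, every embedded `authorName` copy has to be updated.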

Mongo DB Storing References

In a Mongo environment it is beneficial to embed data in documents.
So, for example, the Employees documents:
[
{
userid: 'someid',
username: 'user1',
isManager: true,
subordinates: [
{
userid: 'anotherid',
username: 'user2',
isManager: false
}
],
officeLocation: {
officeId: 'someofficeid',
officeName: 'Some Office'
}
},
{
userid: 'anotherid',
username: 'user2',
isManager: false,
officeLocation: {
officeId: 'someotherofficeid',
officeName: 'Some Other Office'
}
}
]
And the office documents:
[
{
officeid: 'someofficeid',
officeName: 'Some Office'
},
{
officeid: 'someotherofficeid',
officeName: 'Some Other Office'
}
]
So let's assume that someone in the company decides that they don't like the name Some Other Office and want to change it to Some Cool Office.
When they make the change in the office document, how do we know to update every embedded Some Other Office in the employee documents as well?
It seems that every time you take a piece of data from one document and embed it into an object in another document, the link between the two gets broken, and you then have to write separate queries to update the data in all the different spots where you embedded that object.
I like the idea of embedded documents rather than storing references, but without some kind of two-way data binding it seems impractical when it comes to updating information.
Is there any way to bind the data two ways, or is there an easier way to go about modeling my data?
Thanks
It reminds me of traditional RDBMS systems, where you decide whether to normalize or denormalize information. I'm not sure about the binding, but if you need a single source of truth for a piece of information, the better way is to never store it in two different places. So, in your case, it may be better to store the Office information in a separate document and just link to it by id.
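A minimal in-memory sketch of the link-by-id idea, in plain JavaScript without any driver: employees keep only an `officeId`, and the office name is resolved at read time, so a rename happens in exactly one place. The `withOffice` helper and the use of a `Map` as a stand-in for the office collection are assumptions for illustration.

```javascript
// Stand-in for the office collection, keyed by office id.
const offices = new Map([
  ["someofficeid", { officeName: "Some Office" }],
  ["someotherofficeid", { officeName: "Some Other Office" }],
]);

// Employees store a reference (the id) instead of a copy of the office.
const employees = [
  { userid: "someid", username: "user1", officeId: "someofficeid" },
  { userid: "anotherid", username: "user2", officeId: "someotherofficeid" },
];

// Resolve the reference when reading; null if the office is missing.
function withOffice(employee, officeIndex) {
  const office = officeIndex.get(employee.officeId);
  return { ...employee, officeName: office ? office.officeName : null };
}

// The rename touches one office document and no employee documents.
offices.get("someotherofficeid").officeName = "Some Cool Office";

console.log(employees.map((e) => withOffice(e, offices)));
```

The cost is an extra lookup per read, which is the usual trade against the update fan-out that embedding causes.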

MongoDB Data-Modelling: a pattern for text search in referenced documents

I'm working on a project that uses MongoDB, and I would like to hear your opinion about a feature I'd like to implement.
In particular, there are "Users" that reside in "Cities" where they offer "Services".
I have created three collections representing the three entities mentioned above:
the User collection has a one-to-one reference to City and a one-to-many reference to Service.
I would like to write a search function that searches the User collection and the referenced collections for a given string.
Therefore, given the following two users, two cities and three services ...
User
{
_id:"u1",
name:"Jhon",
City: ObjectId("c1"),
Services: [
ObjectId("s1"),
ObjectId("s2")
]
}
{
_id:"u2",
name:"Jack",
City: ObjectId("c2"),
Services: [
ObjectId("s2"),
ObjectId("s3")
]
}
City
{
_id:"c1",
name: "Rome"
}
{
_id:"c2",
name: "London"
}
Services
{
_id:"s1",
name: "Repair"
}
{
_id:"s2",
name: "Sell"
}
{
_id:"s3",
name: "Buy"
}
...and searching for the string "R", the result should be the u1 user (due to the R in "Rome" and "Repair").
Given that I cannot do joins, I was thinking of writing a mongo shell script that adds an additional field to the User collection with all the searchable referenced strings,
as in the following example:
{
_id:"u1",
name:"Jhon",
City: ObjectId("c1"),
Services: [
ObjectId("s1"),
ObjectId("s2")
],
"idx":{
city: "Rome",
services:["Repair","Sell"]
}
}
Finally, the question(s)...
Do you think this is a good choice? Can you propose an alternative solution (or share a link about it; I didn't find anything useful)?
And how would you keep that field up to date, for instance if the referenced city name or the services offered by a user change?
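To illustrate the idea of the extra searchable field, here is a small in-memory sketch in plain JavaScript. It is only an assumption of how such a script could look: ids are plain strings rather than ObjectId values, `Map` objects stand in for the City and Services collections, and `buildIdx`, `withIdx` and `search` are hypothetical helper names. The search is a case-insensitive prefix match, which is one possible reading of the "R" example.

```javascript
// Stand-ins for the referenced collections.
const cities = new Map([
  ["c1", { name: "Rome" }],
  ["c2", { name: "London" }],
]);
const services = new Map([
  ["s1", { name: "Repair" }],
  ["s2", { name: "Sell" }],
  ["s3", { name: "Buy" }],
]);
const usersWithRefs = [
  { _id: "u1", name: "Jhon", City: "c1", Services: ["s1", "s2"] },
  { _id: "u2", name: "Jack", City: "c2", Services: ["s2", "s3"] },
];

// Collect the referenced strings for one user (the "idx" field).
function buildIdx(user) {
  return {
    city: cities.get(user.City).name,
    services: user.Services.map((id) => services.get(id).name),
  };
}

// Return a copy of the user carrying the denormalized idx field.
function withIdx(user) {
  return { ...user, idx: buildIdx(user) };
}

// Case-insensitive prefix search over the user's name and idx strings.
function search(indexedUsers, term) {
  const needle = term.toLowerCase();
  return indexedUsers.filter((u) =>
    [u.name, u.idx.city, ...u.idx.services].some((s) =>
      s.toLowerCase().startsWith(needle)
    )
  );
}

const indexed = usersWithRefs.map(withIdx);
console.log(search(indexed, "R").map((u) => u._id));
```

Keeping the field fresh then comes down to calling something like `withIdx` again for every affected user whenever a city or service is renamed, or whenever a user's references change.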

Best way to structure relationships in a no-SQL database?

I'm using MongoDB. I know that MongoDB isn't relational but information sometimes is. So what's the most efficient way to reference these kinds of relationships to lessen database load and maximize query speed?
Example:
* Tinder-style "matches" *
There are many users in a Users collection. They get matched to each other.
So I'm thinking:
Document 1:
{
_id: "d3fg45wr4f343",
firstName: "Bob",
lastName: "Lee",
matches: [
"ferh823u9WURF",
"8Y283DUFH3FI2",
"KJSDH298U2F8",
"shdfy2988U2Ywf"
]
}
Document 2:
{
_id: "ferh823u9WURF",
firstName: "Cindy",
lastName: "Doe",
matches: [
"d3fg45wr4f343"
]
}
Would this work OK if there were, say, 10,000 users and you were on Bob's profile page and you wanted to display the firstName of all of his matches?
Any alternative structures that would work better?
* Online Forum *
I suppose you could have the following collections:
Users
Topics
Users Collection:
{
_id: "d3fg45wr4f343",
userName: "aircon",
avatar: "234232.jpg"
}
{
_id: "23qdf3a3fq3fq3",
userName: "spider",
avatar: "986754.jpg"
}
Topics Collection Version 1
One example document in the Topics Collection:
{
title: "A spider just popped out of the AC",
dateTimeSubmitted: 201408201200,
category: 5,
posts: [
{
message: "I'm going to use a gun.",
dateTimeSubmitted: 201408201200,
author: "d3fg45wr4f343"
},
{
message: "I don't think this would work.",
dateTimeSubmitted: 201408201201,
author: "23qdf3a3fq3fq3"
},
{
message: "It will totally work.",
dateTimeSubmitted: 201408201202,
author: "d3fg45wr4f343"
},
{
message: "ur dumb",
dateTimeSubmitted: 201408201203,
author: "23qdf3a3fq3fq3"
}
]
}
Topics Collection Version 2
One example document in the Topics Collection. The author's avatar and userName are now embedded in the document. I know that:
This is not DRY.
If the author changes their avatar or userName, those changes would need to be propagated to the Topics Collection and to all of the post documents inside it.
BUT it saves the system from querying for all the avatars and userNames via the authors ID every single time this thread is viewed on the client.
{
title: "A spider just popped out of the AC",
dateTimeSubmitted: 201408201200,
category: 5,
posts: [
{
message: "I'm going to use a gun.",
dateTimeSubmitted: 201408201200,
author: "d3fg45wr4f343",
userName: "aircon",
avatar: "234232.jpg"
},
{
message: "I don't think this would work.",
dateTimeSubmitted: 201408201201,
author: "23qdf3a3fq3fq3",
userName: "spider",
avatar: "986754.jpg"
},
{
message: "It will totally work.",
dateTimeSubmitted: 201408201202,
author: "d3fg45wr4f343",
userName: "aircon",
avatar: "234232.jpg"
},
{
message: "ur dumb",
dateTimeSubmitted: 201408201203,
author: "23qdf3a3fq3fq3",
userName: "spider",
avatar: "986754.jpg"
}
]
}
So yeah, I'm not sure which is best...
If the data is really many-to-many, i.e. one user can have many matches and can be matched by many others as in your first example, it is usually best to go with relations (references between documents).
The main arguments against relations stem from MongoDB not being a relational database, so there are no such things as foreign key constraints or join statements.
The trade-off you have to consider in those many-to-many cases (many being much more than two) is to either enforce the key constraints yourself or manage the possible data inconsistencies across multiple documents (your last example). And in most of those cases the relational approach is much more practical than the embedding approach.
Exceptions could be read-often, write-seldom cases. For a (very constructed) example: if the matches in your first example were recalculated once a day or so by wiping all previous matches and calculating a list of new ones, the data inconsistencies you would introduce could be acceptable, and the read time you save by embedding the first names of the matches could be an advantage.
But usually for many-to-many relations it is best to use a relational approach and make use of array query features such as {_id: {$in: matches}}, where matches is the array of ids.
But in the end it all comes down to how many inconsistencies you can live with and how fast you really need to access the data (is it OK for some topics to show the old avatar for a few days if I save half a second of page load time?).
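A small sketch of the `$in` lookup for the matches example, runnable without a database: `matchesFilter` builds the filter object one might pass to a find call, and `findByIdIn` is a hypothetical in-memory stand-in that handles only this `_id`/`$in` case. The users Dana and Eve and the id "zzz" are made up for the illustration.

```javascript
// Hypothetical Users collection; each user references matches by id.
const allUsers = [
  { _id: "d3fg45wr4f343", firstName: "Bob", matches: ["ferh823u9WURF", "8Y283DUFH3FI2"] },
  { _id: "ferh823u9WURF", firstName: "Cindy", matches: ["d3fg45wr4f343"] },
  { _id: "8Y283DUFH3FI2", firstName: "Dana", matches: [] },
  { _id: "zzz", firstName: "Eve", matches: [] },
];

// Filter document: $in takes the array of ids itself, not a nested array.
function matchesFilter(user) {
  return { _id: { $in: user.matches } };
}

// In-memory stand-in for a find() restricted to the _id/$in case.
function findByIdIn(collection, filter) {
  const wanted = new Set(filter._id.$in);
  return collection.filter((doc) => wanted.has(doc._id));
}

const bob = allUsers[0];
const bobsMatches = findByIdIn(allUsers, matchesFilter(bob));
console.log(bobsMatches.map((u) => u.firstName));
```

Displaying Bob's profile then costs two reads (Bob, then his matches in one query), independent of how many users exist in total.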
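For the embedded variant (Topics Collection Version 2), the inconsistency cost can be sketched as a fan-out update. The plain JavaScript below works on an in-memory array; `propagateAuthorChange` is a hypothetical helper, and it is only an approximation of what a multi-document update (for instance an `updateMany` using the `arrayFilters` option) would have to do on a real collection.

```javascript
// Hypothetical Topics collection with author data embedded in each post.
const topics = [
  {
    title: "A spider just popped out of the AC",
    posts: [
      { message: "I'm going to use a gun.", author: "d3fg45wr4f343", userName: "aircon", avatar: "234232.jpg" },
      { message: "I don't think this would work.", author: "23qdf3a3fq3fq3", userName: "spider", avatar: "986754.jpg" },
      { message: "It will totally work.", author: "d3fg45wr4f343", userName: "aircon", avatar: "234232.jpg" },
    ],
  },
];

// Apply changed author fields to every embedded copy; returns how many
// posts were touched, which is the write cost of the denormalization.
function propagateAuthorChange(topicList, authorId, changes) {
  let touched = 0;
  for (const topic of topicList) {
    for (const post of topic.posts) {
      if (post.author === authorId) {
        Object.assign(post, changes);
        touched += 1;
      }
    }
  }
  return touched;
}

const updatedPosts = propagateAuthorChange(topics, "d3fg45wr4f343", { avatar: "new.jpg" });
console.log(updatedPosts);
```

Until such an update has run, readers see the old avatar, which is exactly the inconsistency window the question above asks you to weigh.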
Edit
The schema design series on the mongodb blog might be a good read for you: part1, part2 and part3

In WCF Data Services, can I enclose a list of references to another entity while creating a new object?

I have a table Article, a table Tag and a join table to associate a tag with an article.
When creating a new Article by sending a POST request to /Service.svc/Articles, is it possible to include in the JSON object a list of Tag ids to be associated?
Something like:
{
title: "My article title",
text: "The content:",
Tags: [ { id: 1 }, { id: 2 }, { id: 3 } ]
}
If not, can I send the list of tags in one request? For example:
/Service.svc/Articles(1)/Tags
[ { id: 1 }, { id: 2 }, { id: 3 } ]
Or do I have to make as many requests as there are tags?
Thank you very much in advance.
You can modify just the links by POST/PUT/DELETE to the $links URL as described here: http://www.odata.org/developers/protocols/operations#CreatingLinksbetweenEntries
The samples there use ATOM/XML, but the respective JSON format is also possible.
To send multiple operations to the server in one request (to save the roundtrips) you can create a batch request as described here:
http://www.odata.org/developers/protocols/batch