One to Many Relationship mongoDB - mongodb

I have a quick question regarding one-to-many relationships in MongoDB. I have mainly used SQL before this, so I'm getting confused about how to approach relationships. I have looked through the documentation online, and it does not give a good example of how to set up and query a one-to-many relationship.
Say I have a table of Users and each user has many products. In SQL, this means that multiple rows in the products table would have the same user foreign key. In MongoDB I have tried to replicate this by placing each user's object id into the corresponding product that they are selling, much like a foreign key.
I'm getting confused about how I would query it. For example, how would I do SELECT * FROM USERS, PRODUCTS WHERE USER_ID = USERFK_ID;?
I've read about document references and embedded documents, but it's just confusing me more. Does anyone have a straightforward explanation, please?

Assuming I understood your question, I would have a users collection and a products collection.
The users collection will contain users and their details, e.g.
{id: '007', name: 'john'}
{id: '010', name: 'paul'}
The products collection will contain products linked to given users, e.g.
{id: '432738', name: 'apple', price: '100', owner: '007'} (i.e. the owner is john)
As for the query, I would do something like this:
db.collection('products').find({owner: user_id_here})
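To tie this back to the SQL in the question, here is a minimal sketch of the referencing pattern using plain JavaScript arrays in place of real collections, so it runs without a database. The helper names productsOwnedBy and usersWithProducts and the extra product rows are made up for illustration. The $lookup pipeline in the closing comment is the server-side counterpart and assumes MongoDB 3.2 or later.

```javascript
// In-memory stand-ins for the two collections described above.
const users = [
  { id: '007', name: 'john' },
  { id: '010', name: 'paul' },
];
const products = [
  { id: '432738', name: 'apple', price: '100', owner: '007' },
  { id: '432739', name: 'pear', price: '80', owner: '007' }, // hypothetical row
  { id: '432740', name: 'plum', price: '60', owner: '010' }, // hypothetical row
];

// Same selection as db.collection('products').find({owner: userId}).
function productsOwnedBy(allProducts, userId) {
  return allProducts.filter((p) => p.owner === userId);
}

// Rough equivalent of the SQL join: each user plus an array of their products.
function usersWithProducts(allUsers, allProducts) {
  return allUsers.map((u) => ({
    ...u,
    products: productsOwnedBy(allProducts, u.id),
  }));
}

console.log(productsOwnedBy(products, '007').map((p) => p.name));
console.log(usersWithProducts(users, products));

// Against a live server the join can be done by the database instead:
// db.users.aggregate([
//   { $lookup: { from: 'products', localField: 'id',
//                foreignField: 'owner', as: 'products' } }
// ])
```

The point of the sketch is that the "foreign key" lives on the many side (the product), and the join is either done in application code or delegated to $lookup.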

A one-to-many relationship is where the parent document can have many child documents, but each child document can have only one parent document. One way to model it is to embed the children (here, albums) inside the parent (the artist):
db.artists.insert(
{
_id : 3,
artistname : "Moby",
albums : [
{
album : "Play",
year : 1999,
genre : "Electronica"
},
{
album : "Long Ambients 1: Calm. Sleep.",
year : 2016,
genre : "Ambient"
}
]
}
)
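With the albums embedded like this, reading the artist returns its albums in the same document, and a dotted path such as db.artists.find({"albums.genre": "Ambient"}) matches on fields inside the array. Below is a minimal in-memory sketch of that matching rule. The helper artistsWithAlbumGenre and the second artist are hypothetical, and a plain array stands in for the collection.

```javascript
// Plain array standing in for the artists collection.
const artists = [
  {
    _id: 3,
    artistname: 'Moby',
    albums: [
      { album: 'Play', year: 1999, genre: 'Electronica' },
      { album: 'Long Ambients 1: Calm. Sleep.', year: 2016, genre: 'Ambient' },
    ],
  },
  {
    _id: 4, // hypothetical second document, only here to show a non-match
    artistname: 'Example Artist',
    albums: [{ album: 'Example Album', year: 2001, genre: 'Rock' }],
  },
];

// Mirrors find({"albums.genre": genre}): an artist matches when at least
// one embedded album has the requested genre.
function artistsWithAlbumGenre(allArtists, genre) {
  return allArtists.filter((a) => a.albums.some((al) => al.genre === genre));
}

console.log(artistsWithAlbumGenre(artists, 'Ambient').map((a) => a.artistname));
```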

Related

NoSQL db schema design

I'm trying to find a way to create the db schema. Most operations to the database will be Read.
Say I'm selling books on the app so the schema might look like this
[
{ title : "Adventures of Huckleberry Finn",
author : ["Mark Twain", "Thomas Becker", "Colin Barling"],
pageCount : 366,
genre: ["satire"],
release: "1884"
},
{ title : "The Great Gatsby",
author : ["F.Scott Fitzgerald"],
pageCount : 443,
genre: ["Novel", "Historical drama"],
release: "1924"
},
{ title : "This Side of Paradise",
author : ["F.Scott Fitzgerald"],
pageCount : 233,
genre: ["Novel"],
release: "1920"
}
]
So most operations would be something like:
1) Grab all books by "F.Scott Fitzgerald"
2) Grab books under genre "Novel"
3) Grab all books with page count less than 400
4) Grab books with page count more than 100 released no later than 1930
Should I create separate collections just for authors and genres and then reference them like in a relational database, or embed them like above? It seems that if I embed them, I have to type in the author name manually every time I store a document, so I could misspell F.Scott Fitzgerald in one document and that document would not come back in the results.
First of all, I would say that's a nice DB choice.
As far as Mongo is concerned, the schema should be defined so that it serves your access patterns best. While designing the schema we must also bear in mind that Mongo does not handle joins and transactions the way SQL does (joins are limited to the $lookup aggregation stage added in 3.2, and multi-document transactions only arrived in 4.0). Considering all this, I would suggest that your choice of schema is the best one, as it serves your access patterns. Usually, whenever we pull any book detail, we need all of its information: author, pages, genre, year, price, etc. It is just like object-oriented programming, where a class holds all of its own properties and anything that is not a property of that class is kept in another class.
Moving authors into a separate collection just adds an extra collection, and then you need to take care of joins and transactions in your own code. As for your concern about manually typing the author name, I don't quite follow. Say a user wants to see books by author "xyz": they click on the author name "xyz" (like a tag) and you run a query that brings back all books having that name as one of the authors. If the user types the author name manually, it is still just finding documents by the entered string. I don't see anything manual here.
Just to add, a price key should also be included in every document.
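To make the access patterns concrete, here is a rough sketch of the four operations from the question against the embedded schema. It filters a plain array so it runs without a server. The comments show the find filters I would expect to use in the mongo shell, assuming a collection named books (that name is my assumption). Because release is stored as a four-digit string, a string comparison is enough for the year check.

```javascript
// Plain array standing in for the (assumed) books collection.
const books = [
  { title: 'Adventures of Huckleberry Finn',
    author: ['Mark Twain', 'Thomas Becker', 'Colin Barling'],
    pageCount: 366, genre: ['satire'], release: '1884' },
  { title: 'The Great Gatsby',
    author: ['F.Scott Fitzgerald'],
    pageCount: 443, genre: ['Novel', 'Historical drama'], release: '1924' },
  { title: 'This Side of Paradise',
    author: ['F.Scott Fitzgerald'],
    pageCount: 233, genre: ['Novel'], release: '1920' },
];

const titles = (docs) => docs.map((b) => b.title);

// 1) db.books.find({author: 'F.Scott Fitzgerald'})
//    An equality filter on an array field matches if any element is equal.
const byAuthor = books.filter((b) => b.author.includes('F.Scott Fitzgerald'));

// 2) db.books.find({genre: 'Novel'})
const novels = books.filter((b) => b.genre.includes('Novel'));

// 3) db.books.find({pageCount: {$lt: 400}})
const shortBooks = books.filter((b) => b.pageCount < 400);

// 4) db.books.find({pageCount: {$gt: 100}, release: {$lte: '1930'}})
const oldAndLongEnough = books.filter(
  (b) => b.pageCount > 100 && b.release <= '1930'
);

console.log(titles(byAuthor));
console.log(titles(novels));
console.log(titles(shortBooks));
console.log(titles(oldAndLongEnough));
```

All four reads touch a single collection, which is the argument for keeping authors and genres embedded here.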

MongoDB collections design

I've got these four tables:
The point is that users who have joined a particular group have access to a survey for a time interval, from one date to another. How should I organize the collection structure of such a database in MongoDB?
For surveys and questions this will be a simple collection of surveys, each with an array of questions. But it is not clear to me how to store the start and end dates of a survey.
What about something like this?
Groups
{
_id : "group1",
"members" : [{"name":"A"...},{"name":"B"...}],
"surveys" : [{"surveyId":"survey1", "startDate": ISODate(),"endDate":ISODate()},{"surveyId":"survey2", "startDate": ISODate(),"endDate":ISODate()}]
}
Surveys
{
_id : "survey1",
questions : [{"text":"Atheist??"...},{....}]
}
Honestly, it depends on which pattern you want to use; you could also embed the groups inside the survey, along with the registration details.
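As a rough illustration of how the start/end dates in the Groups document would be used, the sketch below checks which surveys are open for a group at a given moment. The concrete dates and the helper openSurveyIds are hypothetical, and the group is a plain object rather than a document read from the database. The commented query uses $elemMatch so that both date conditions apply to the same array element.

```javascript
// Plain object shaped like a Groups document; the dates are made up.
const group = {
  _id: 'group1',
  members: [{ name: 'A' }, { name: 'B' }],
  surveys: [
    { surveyId: 'survey1',
      startDate: new Date('2020-01-01T00:00:00Z'),
      endDate: new Date('2020-02-01T00:00:00Z') },
    { surveyId: 'survey2',
      startDate: new Date('2020-03-01T00:00:00Z'),
      endDate: new Date('2020-04-01T00:00:00Z') },
  ],
};

// Returns the ids of surveys whose [startDate, endDate] window contains `now`.
function openSurveyIds(g, now) {
  return g.surveys
    .filter((s) => s.startDate <= now && now <= s.endDate)
    .map((s) => s.surveyId);
}

console.log(openSurveyIds(group, new Date('2020-01-15T00:00:00Z')));

// Server-side, groups with at least one survey open at `now` could be found with:
// db.groups.find({ surveys: { $elemMatch: {
//   startDate: { $lte: now }, endDate: { $gte: now } } } })
```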

mongodb: Embedded only id or both id and name

I'm new to MongoDB. Please suggest how to design the schema correctly for the situation below:
I have a User collection and a Product collection. A Product contains info like id, title, description, price... A User can bookmark or like a Product. Currently, in the User collection, I store one array of liked products and one array of bookmarked products. So when I need to view info about a user, I have to read these two arrays, then search the Product collection to get the titles of the liked and bookmarked products.
//User collection
{
_id : 12345,
name: "John",
liked: [123, 456, 789],
bkmark: [123, 125]
}
//Product collection
{
_id : 123,
title: "computer",
desc: "awesome computer",
price: 12
}
Now I think I can speed up this process by embedding both the product id and the title in the User collection, so that I don't have to search the Product collection and can just read them out and display them. But if I choose this way, whenever a Product's title gets updated, I have to find and update it in the User collection too. I can't estimate the update cost of the second approach, so I don't know which way is correct. Please help me choose between them.
Thanks & Regards.
You should consider what happens more often: A product gets renamed or the information of a user is requested.
You should also consider what's a bigger problem: Some time lag in which users see an outdated product name (we are talking about seconds, maybe minutes when you have a really large number of users) or always a longer response time when requesting a user profile.
Without knowing your actual usage patterns and requirements, I would guess that it's the latter in both cases, so you should rather optimize for this situation.
In general it is not recommended to normalize a MongoDB database as radically as you would normalize a relational database. The reason is that MongoDB cannot perform JOINs the way a relational database does (the $lookup aggregation stage, added in 3.2, is a more limited substitute). So it's usually not such a bad idea to duplicate some relevant information in multiple documents, while accepting a higher cost for updates and a potential risk of inconsistencies.
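To give a feel for the update cost being traded off, here is a small in-memory sketch of the denormalized variant, where each user keeps {_id, title} pairs instead of bare ids. The helper renameProduct and the titles for products 456 and 125 are hypothetical. Against a real database the fan-out would be an updateMany per array, as in the trailing comment.

```javascript
// Plain arrays standing in for the Product and User collections.
const productDocs = [
  { _id: 123, title: 'computer', desc: 'awesome computer', price: 12 },
];
const userDocs = [
  {
    _id: 12345,
    name: 'John',
    liked: [{ _id: 123, title: 'computer' }, { _id: 456, title: 'keyboard' }],
    bkmark: [{ _id: 123, title: 'computer' }, { _id: 125, title: 'mouse' }],
  },
];

// Renames a product and every embedded copy of its title.
// Returns how many embedded copies had to be touched (the extra update cost).
function renameProduct(allProducts, allUsers, productId, newTitle) {
  for (const p of allProducts) {
    if (p._id === productId) p.title = newTitle;
  }
  let touched = 0;
  for (const u of allUsers) {
    for (const entry of [...u.liked, ...u.bkmark]) {
      if (entry._id === productId) {
        entry.title = newTitle;
        touched += 1;
      }
    }
  }
  return touched;
}

console.log(renameProduct(productDocs, userDocs, 123, 'laptop'));
console.log(userDocs[0].liked, userDocs[0].bkmark);

// Server-side sketch of the same fan-out for the liked array:
// db.users.updateMany({ 'liked._id': 123 },
//                     { $set: { 'liked.$.title': 'laptop' } })
```

Reading a user profile now needs no second lookup, and the price is this extra write whenever a title changes.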

Reading categories and number of articles in a single query

I have a database in MongoDb that contains two collections: 'categories' and 'articles'.
I'm using Mongoose on NodeJs to connect to the database and read the categories. I want to calculate the number of articles for a category without making an additional request/query, so it would be perfect if I could solve this at the database level.
An item from the 'categories' collection looks like:
{
'_id' : ObjectId("..."),
'feed_id' : 1,
'name': 'Blog posts'
}
An item from the 'articles' collection looks like:
{
'_id' : ObjectId("..."),
'feed_id' : 1,
'title': 'Article title',
'published' : '12/09/2012',
...
}
so the categories and articles are linked using the 'feed_id' field.
I would like to export all categories together with a corresponding number of articles:
{
'_id' : ObjectId("..."),
'feed_id' : 1,
'name': 'Blog posts',
'no_articles': 4
}
I'm not sure how exactly I should do this:
1) Create a 'no_articles' field in the categories collection? If yes, I would like this to be updated automatically when a document is inserted or deleted from the articles collection.
2) Sum up the articles into 'no_articles' when categories are read?
I read something about MapReduce and group, but didn't quite understand if it's possible to use them for this particular task.
This is one of the use cases where traditional relational databases really shine.
It is not possible to do that with one plain find query in MongoDB. The "no_articles" field you mentioned is the way to go. The common name (among Rails people, anyway) for this approach is a counter cache column. I am not very familiar with Mongoose, so I don't know whether it will maintain that field for you or not. MongoDB itself certainly won't do it. But maintaining it yourself isn't a lot of work; you just need to be careful to update it on every insert and delete.
I advise against counting articles when you read categories. This is a classic example of the N+1 query problem, and the counter cache column is there to prevent it.
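Here is a minimal sketch of what maintaining the counter yourself could look like, with plain arrays standing in for the two collections. The helpers insertArticle and deleteArticle are hypothetical. With a real database, each one would pair the write to articles with an $inc on the matching category, as noted in the comments.

```javascript
// Plain arrays standing in for the categories and articles collections.
const categories = [{ feed_id: 1, name: 'Blog posts', no_articles: 0 }];
const articles = [];

// Adds an article and bumps the counter on its category.
// Real-database counterpart of the counter update:
// db.categories.updateOne({ feed_id: article.feed_id }, { $inc: { no_articles: 1 } })
function insertArticle(article) {
  articles.push(article);
  const cat = categories.find((c) => c.feed_id === article.feed_id);
  if (cat) cat.no_articles += 1;
}

// Removes an article by title and decrements the counter ($inc with -1).
function deleteArticle(title) {
  const index = articles.findIndex((a) => a.title === title);
  if (index === -1) return false;
  const [removed] = articles.splice(index, 1);
  const cat = categories.find((c) => c.feed_id === removed.feed_id);
  if (cat) cat.no_articles -= 1;
  return true;
}

insertArticle({ feed_id: 1, title: 'First article' });
insertArticle({ feed_id: 1, title: 'Second article' });
deleteArticle('First article');
console.log(categories[0].no_articles);
```

Reading the categories then returns the count with no extra query, which is the whole point of the counter cache.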
Why not just store the category directly in the post document? Since it appears that you're creating a new category document for each post that uses the category (as evidenced by the 1-to-many linkage using feed_id), it might make sense to store an array of categories within the post document.
{
'_id' : ObjectId("..."),
'feed_id' : 1,
'title': 'Article title',
'published' : '12/09/2012',
...
categories : [ 'Blog Posts', 'Category 2' ]
}
Then you can do a
db.articles.find({categories : 'Blog Posts' })
to find all the articles in a certain category, and you can add .count() to get the count.
Using those feed_ids to join is anathema to MongoDB. You can't join across collections so you either have to denormalize or put everything in one big collection. Mongo is designed so that you'll denormalize everything.
If this doesn't seem like the right way to solve your problem then you might be better suited to use a RDBMS.

Query a Many-to-Many relation in MongoDB

I am reading MongoDB in Action and, in the part about querying many-to-many relationships in a document, I'm having difficulty understanding how the author wrote his example query (using the Ruby driver).
The query finds all products in a specific category, where there is a products collection and a categories collection. The author says: "To query for all products in the Gardening Tool category, the code is simple:"
db.products.find({category_ids => category['id']})
A PRODUCT doc is like this:
doc =
{ _id: new ObjectId("4c4b1476238d3b4dd5003981"),
slug: "wheel-barrow-9092",
sku: "9092",
name: "Extra Large Wheel Barrow",
description: "Heavy duty wheel barrow...",
details: {
weight: 47,
weight_units: "lbs",
model_num: 4039283402,
manufacturer: "Acme",
color: "Green"
},
category_ids: [new ObjectId("6a5b1476238d3b4dd5000048"),
new ObjectId("6a5b1476238d3b4dd5000049")],
main_cat_id: new ObjectId("6a5b1476238d3b4dd5000048"),
tags: ["tools", "gardening", "soil"],
}
And a CATEGORY doc is like this:
doc =
{ _id: new ObjectId("6a5b1476238d3b4dd5000048"),
slug: "gardening-tools",
ancestors: [{ name: "Home",
_id: new ObjectId("8b87fb1476238d3b4dd500003"),
slug: "home"
},
{ name: "Outdoors",
_id: new ObjectId("9a9fb1476238d3b4dd5000001"),
slug: "outdoors"
}
],
parent_id: new ObjectId("9a9fb1476238d3b4dd5000001"),
name: "Gardening Tools",
description: "Gardening gadgets galore!",
}
Can someone please explain it a little more to me? I still can't understand how he wrote that query :(
Thanks all.
The query is searching the products collection for all products with a value of category['id'] in the field category_ids.
When you search a field that contains an array for a specific value, MongoDB automatically enumerates each value in that array searching for matches.
To construct the query, you must first notice that the category collection defines your category hierarchy, and that each category has a unique ID (stored, as is usual in MongoDB, in the _id field).
You must also notice that the product collection has a field that stores a list of category ids, category_ids, that reference the unique ids of the category collection.
Therefore, to find all products in a particular category, you search the category_ids field of the product collection for the unique ID of the category you're interested in, which you get from the category collection.
If I were to write a query for mongo, the JavaScript-based shell interpreter, that finds products in the Gardening Tools category, I would do the following:
1) Look up the ID of the Gardening Tools category (which, as noted before, is stored in the _id field of the category collection). In this case, the value in your example is ObjectId("6a5b1476238d3b4dd5000048").
2) Insert the value into a query that searches through the category_ids field of the product collection. This is the query that you give in your question, which for the mongo shell I would write as: db.products.find({category_ids : new ObjectId("6a5b1476238d3b4dd5000048")})
I hope that's clearer than the original explanation!
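If it helps, the array-matching rule can be imitated in plain JavaScript. This is only an in-memory sketch: strings stand in for ObjectId values, the second product is invented, and productsInCategory is a hypothetical helper, not part of any driver.

```javascript
// The _id of the Gardening Tools category, written as a plain string here.
const gardeningToolsId = '6a5b1476238d3b4dd5000048';

// Plain array standing in for the products collection.
const productList = [
  {
    name: 'Extra Large Wheel Barrow',
    category_ids: ['6a5b1476238d3b4dd5000048', '6a5b1476238d3b4dd5000049'],
  },
  {
    name: 'Hypothetical Hammer', // invented product that is NOT in Gardening Tools
    category_ids: ['6a5b1476238d3b4dd5000049'],
  },
];

// Mirrors find({category_ids: categoryId}): a product matches when the
// requested id appears anywhere in its category_ids array.
function productsInCategory(docs, categoryId) {
  return docs.filter((d) => d.category_ids.includes(categoryId));
}

console.log(productsInCategory(productList, gardeningToolsId).map((d) => d.name));
```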
(As an aside: I'm not quite sure what language your query is written in. Is it perhaps PHP? In any case, JavaScript seems to be the language of choice for examples in the MongoDB docs, because MongoDB installs the mongo command-line interpreter alongside the server itself, so everyone has access to it.)