MongoDB collections design

I have these four tables:
The point is that users who have joined a particular group have access to a survey for a time interval, from a start date to an end date. How should I organize the collection structure for such a database in MongoDB?
For surveys and questions, this will be a simple collection of surveys, each with an array of questions. But it is not clear to me how to store the start and end dates of a survey.

What about something like this?
Groups
{
    _id : "group1",
    "members" : [{"name":"A"...},{"name":"B"...}],
    "surveys" : [
        {"surveyId":"survey1", "startDate": ISODate(), "endDate": ISODate()},
        {"surveyId":"survey2", "startDate": ISODate(), "endDate": ISODate()}
    ]
}
Surveys
{
    _id : "survey1",
    questions : [{"text":"Atheist??"...},{....}]
}
Honestly, it depends on which access pattern you want to optimize for. You could also embed the groups, along with their registration details, inside each survey.
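To make the start/end behavior concrete, here is a minimal in-memory sketch of the access check that the Groups layout supports. The dates, the function name openSurveysFor, and the collection name in the comment are assumptions for illustration, not part of the original answer.

```javascript
// In-memory stand-ins for Groups documents (dates are made-up sample values).
const groups = [
  {
    _id: "group1",
    members: [{ name: "A" }, { name: "B" }],
    surveys: [
      { surveyId: "survey1", startDate: new Date("2024-01-01T00:00:00Z"), endDate: new Date("2024-01-31T23:59:59Z") },
      { surveyId: "survey2", startDate: new Date("2024-03-01T00:00:00Z"), endDate: new Date("2024-03-15T23:59:59Z") },
    ],
  },
];

// Return the ids of the surveys that `memberName` may open at time `now`.
function openSurveysFor(memberName, now) {
  const ids = new Set();
  for (const group of groups) {
    // Skip groups the user has not joined.
    if (!group.members.some((m) => m.name === memberName)) continue;
    for (const s of group.surveys) {
      // A survey is open when `now` falls inside [startDate, endDate].
      if (s.startDate <= now && now <= s.endDate) ids.add(s.surveyId);
    }
  }
  return [...ids];
}

// A roughly equivalent shell query, assuming the collection is named "groups":
// db.groups.find({
//   "members.name": "A",
//   surveys: { $elemMatch: { startDate: { $lte: now }, endDate: { $gte: now } } }
// })
```

The $elemMatch form matters because it requires both date conditions to hold for the same array element, rather than for any two elements of the surveys array.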

Related

One to Many Relationship mongoDB

I have a quick question regarding one-to-many relationships in MongoDB. I have mainly used SQL before this, so I'm getting confused about how to approach relationships. I have read the documentation online, and it does not give a good example of how to set up and query a one-to-many relationship.
Say I have a table of users and each user has many products. In SQL, this means that multiple products in the table would have the same user foreign key. In MongoDB I have tried to replicate this by placing each user's object id into the corresponding product that they are selling, much like a foreign key.
I'm getting confused about how I would query it. For example, how would I do SELECT * FROM USERS, PRODUCTS WHERE USER_ID = USERFK_ID;?
I've read about document references and embedded documents, but it's just confusing me more. Does anyone have a straightforward explanation, please?

Assuming I understood your question, I would have a users collection and a products collection.
The users collection will contain users and their details, e.g.
{id: '007', name: 'john'}
{id: '010', name: 'paul'}
The products collection will contain products linked to given users, e.g.
{id: '432738', name: 'apple', price: '100', owner: '007'} // i.e. the owner is john
As for the query, I would do something like this:
db.collection('products').find({owner: user_id_here})
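The SQL join from the question can be reproduced either in application code or on the server. The sketch below shows the application-side version on in-memory copies of the sample documents, plus a $lookup pipeline (available since MongoDB 3.2) as plain data. The function name joinUsersWithProducts and the variable lookupPipeline are made up for illustration.

```javascript
// Sample documents copied from the answer above.
const users = [
  { id: "007", name: "john" },
  { id: "010", name: "paul" },
];
const products = [
  { id: "432738", name: "apple", price: "100", owner: "007" },
];

// Application-side equivalent of
//   SELECT * FROM USERS, PRODUCTS WHERE USER_ID = USERFK_ID
function joinUsersWithProducts(users, products) {
  // Index users by id so each product resolves its owner in O(1).
  const byId = new Map(users.map((u) => [u.id, u]));
  return products
    .filter((p) => byId.has(p.owner))
    .map((p) => ({ user: byId.get(p.owner), product: p }));
}

// Server-side alternative, to be passed to db.users.aggregate(...).
// Each user document gains a "products" array with the matching products.
const lookupPipeline = [
  { $lookup: { from: "products", localField: "id", foreignField: "owner", as: "products" } },
];
```

Note that $lookup behaves like a left outer join: a user with no products (paul here) would still be returned, with an empty products array, whereas the in-memory function above returns only matched pairs.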

A one-to-many relationship is where the parent document can have many child documents, but the child documents can only have one parent document. One way to model it is to embed the child documents in the parent:
db.artists.insertOne(
    {
        _id : 3,
        artistname : "Moby",
        albums : [
            {
                album : "Play",
                year : 1999,
                genre : "Electronica"
            },
            {
                album : "Long Ambients 1: Calm. Sleep.",
                year : 2016,
                genre : "Ambient"
            }
        ]
    }
)

MongoDB: mapping collection names for user

I have the following corner case for MongoDB that I hope you can help me solve.
My MongoDB database is used by multiple independent users, and a technical limitation prevents me from creating a database per user. Users are untrusted. Users will create collections with arbitrary names.
Is there any way to "namespace" the collections of one user from the collections of another user? For example, when user "jim" makes a collection "orders", it should not clash with user "bob" creating a collection "orders".
Users are authenticated and connected through an SSL-protected channel.

Don't create that many collections in the database. Instead, keep a single collection in which each document describes one user-defined collection. You can then access any of them by the unique "_id" field, and the fields and values of each document can follow the user's choice.
For example, you could have one collection, "user_collection", which stores the details of all the users' collections:
user_collection
{
    "_id" : "0921092109227812",
    "collectionName": "orders",
    "user": ObjectId(Ref_Id1),
    "fields": []
}
{
    "_id" : "5686565681232344",
    "collectionName": "orders",
    "user": ObjectId(Ref_Id2),
    "fields": []
}
I have only given you the basic schema. You can elaborate on it according to your requirements.
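If separate physical collections are still wanted, a different approach from the registry above is to derive the real collection name from the authenticated user id, so that "orders" for jim and "orders" for bob never collide. The helper below is a hypothetical sketch: the name namespacedCollectionName, the "u_" prefix, and the user id format are assumptions. It rejects names that MongoDB itself disallows (empty names, names containing "$" or the null character).

```javascript
// Hypothetical helper: map (user, requested name) to a physical collection name.
function namespacedCollectionName(userId, requestedName) {
  // Assumption: user ids are plain alphanumeric/underscore tokens, so they
  // cannot be used to escape into another user's prefix.
  if (typeof userId !== "string" || !/^[A-Za-z0-9_]+$/.test(userId)) {
    throw new Error("invalid user id");
  }
  // MongoDB rejects empty names and names containing "$" or the null character.
  if (
    typeof requestedName !== "string" ||
    requestedName.length === 0 ||
    requestedName.includes("$") ||
    requestedName.includes("\0")
  ) {
    throw new Error("invalid collection name");
  }
  // The fixed prefix also keeps user names out of the reserved "system." space.
  return `u_${userId}.${requestedName}`;
}
```

With this mapping, namespacedCollectionName("jim", "orders") and namespacedCollectionName("bob", "orders") yield different collection names. Length limits on the full namespace also apply and depend on the server version, so a real implementation would need a length check as well.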

How can I effectively design and fetch associated documents in MongoDB?

Currently I have two models in my application - for users and comments. The simplified structure is as follows:
User
{
id : "01",
username : "john"
}
Comment
{
id : "001",
body : "this is the comment"
}
Now I would like to associate users with their comments. Coming from the SQL world, the first thing that comes to mind is simply adding a user_id field to the comment document and then using a JOIN, but I guess that is not an optimal solution in terms of efficiency.
The other solution could be to embed comments in user's document:
{
id : "01",
username : "john",
comments : [
{
id : "001",
body : "this is the comment"
}
]
}
But I'm going to query for comments very often, e.g. when showing all comments from the past 24 or 48 hours. And alongside each comment, I want to display the username.
I could of course add a username field to the comment document. But then the username would be stored in two places: in the users collection and in the comments collection.
What is the best approach here?

If you have a very large number of comments per user, embedding them as sub-documents of the user document is not a good option: the document keeps growing (MongoDB caps a single document at 16 MB), and it will not improve efficiency. The best option is to put the user_id in each document of the comments collection and create an index on that field.
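A minimal in-memory sketch of the referenced layout, showing how "comments from the past 24 or 48 hours, with usernames" can be served with two lookups. The created_at field, the second user, the sample timestamps, and the function name are assumptions added for illustration.

```javascript
const HOUR = 60 * 60 * 1000;
const now = new Date("2024-05-02T12:00:00Z"); // fixed "current time" for the example

const users = [
  { id: "01", username: "john" },
  { id: "02", username: "paul" }, // hypothetical second user
];
const comments = [
  { id: "001", user_id: "01", body: "this is the comment", created_at: new Date(now.getTime() - 2 * HOUR) },
  { id: "002", user_id: "02", body: "another comment", created_at: new Date(now.getTime() - 20 * HOUR) },
  { id: "003", user_id: "01", body: "older comment", created_at: new Date(now.getTime() - 40 * HOUR) },
];

// Step 1: select recent comments (in MongoDB: a range query on created_at).
// Step 2: resolve usernames for the referenced user ids (in MongoDB: one
//         query on users with $in over the collected ids).
function recentCommentsWithUsernames(comments, users, now, hours) {
  const cutoff = now.getTime() - hours * HOUR;
  const recent = comments.filter((c) => c.created_at.getTime() >= cutoff);
  const wanted = new Set(recent.map((c) => c.user_id));
  const names = new Map(users.filter((u) => wanted.has(u.id)).map((u) => [u.id, u.username]));
  return recent.map((c) => ({ ...c, username: names.get(c.user_id) }));
}
```

This keeps the username stored in one place at the cost of a second query per page of comments. Copying the username into each comment trades that query for an update fan-out whenever a user renames.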

How to query two collections at the same time?

I am using MongoDB and I ended up with two Collections (unintentionally).
The first Collection (sample) has 100 million records (Tweets) with the following structure:
{
"_id" : ObjectId("515af34297c2f607b822a54b"),
"text" : "bla bla ",
"id" : NumberLong("314965680476803072"),
"user" :
{
"screen_name" : "TheFroooggie",
"time_zone" : "Amsterdam",
},
}
The second collection (users) has 30 million records of unique users from the tweet collection, and it looks like this:
{ "_id" : "000000_n", "target" : 1, "value" : { "count" : 5 } }
where the _id in the users collection is the user.screen_name from the tweets collection, target is their status (spammer or not), and value.count is the number of times a user appeared in the first collection (sample), i.e. the number of captured tweets.
Now I'd like to make the following query:
I'd like to return all the documents from the sample collection (tweets) where the user has the target value = 1
In other words, I want to return all the tweets of all the spammers for example.

As you receive the tweets, you could upsert them into a collection, using the author information as the key in the "query" document portion of the update. The update document could use the $addToSet operator to put the tweet into a tweets array. You'll end up with a collection that has one document per author, each with an array of that author's tweets. You can then do your spammer classification for each author and have their associated tweets at hand.
So, you would end up doing something like this:
db.samples.update({"author":"joe"},{$addToSet:{"tweets":{"tweet_id":2}}},{upsert:true})
This approach does have the likely drawback of growing the document past its initially allocated size on disk, which means it would have to be moved to a larger allocation. You would likely incur some penalty for index updating as well.
You could also store a spam rating with each tweet document and later pull tweets based on user id.
As others have pointed out, there is nothing wrong with setting up the appropriate indexes and using a cursor to loop through your users pulling their tweets.
The approach you choose should be based on your intended access pattern. It sounds like you are in a good place where you can experiment with several different possible solutions.
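The "loop through your users pulling their tweets" option can be sketched as a two-step lookup. The code below runs it on small in-memory arrays shaped like the two collections. The screen names and tweet texts are made-up sample data, and tweetsOfSpammers is a hypothetical name.

```javascript
// Shaped like the "users" collection: _id is the screen name, target 1 = spammer.
const users = [
  { _id: "spam_bot", target: 1, value: { count: 2 } },
  { _id: "regular_user", target: 0, value: { count: 1 } },
];
// Shaped like the "sample" (tweets) collection.
const sample = [
  { id: 1, text: "buy now", user: { screen_name: "spam_bot" } },
  { id: 2, text: "hello", user: { screen_name: "regular_user" } },
  { id: 3, text: "buy more", user: { screen_name: "spam_bot" } },
];

function tweetsOfSpammers(sample, users) {
  // Step 1: collect the screen names of all users flagged as spammers.
  // Shell counterpart: db.users.find({ target: 1 }, { _id: 1 })
  const spammerNames = new Set(users.filter((u) => u.target === 1).map((u) => u._id));
  // Step 2: keep the tweets whose author is in that set.
  // Shell counterpart, per batch of names:
  //   db.sample.find({ "user.screen_name": { $in: batchOfNames } })
  return sample.filter((t) => spammerNames.has(t.user.screen_name));
}
```

At the sizes mentioned in the question, step 2 would need an index on user.screen_name and would have to be issued in batches rather than as one enormous $in list.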

Sorting hybrid bucketed schema in MongoDB

Our application allows users to create posts and comments. Data is growing fast, and we have already reviewed MongoDB scaling strategies. We like the approach presented in http://www.10gen.com/presentations/mongosf2011/schemascale , which uses a hybrid schema between embedded and non-embedded documents, bucketing comments so that they are saved in groups of 100 or 200 comments per document.
{
    "_id" : '/post/2323423/1--1',
    "comments" : [
        {
            "author" : "peter",
            "text" : "comment!",
            "when" : "June 24 2012",
            "votes": 43
        },
        {
            "author" : "joe",
            "text" : "hi!",
            "when" : "June 25 2012",
            "votes": 102
        },
        ...
    ]
}
By bucketing comments, fewer disk reads are necessary to display thousands of comments, while at the same time documents are kept small so writes are fast. It's perfect for paginating comments sorted by date.
We are very interested in this approach, but our application requires comments to be sorted by votes and to support subcomments.
Currently we use a non-embedded approach with a separate collection for comments. It allows us to retrieve data sorted by any field, and subcommenting is easy (by reference), but performance is becoming an issue. We would like to use bucketing, but sorting by votes does not seem to fit into buckets.
Sorting by date is trivial: just fetch the next bucket as the user clicks 'next page', querying one document. But how do we manage this if we want to sort by votes? We would have to retrieve all buckets and then sort the comments, which is obviously inefficient...
Any ideas about a proper schema design to accomplish this?

You should be able to sort by votes:
db.collection.find({},{_id:0}).sort({'comments.votes':1})
Just note that there is a bug where you can only sort in ascending order, which is why the example uses 1 rather than -1.
See this bug ticket

Have you tried an aggregation query?
db.commentbuckets.aggregate([
    { $match: { discussion_id: <discussion_id> } },
    { $unwind: "$comments" },
    { $sort: { "comments.votes": -1 } }
]);
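What unwinding and sorting by votes does to bucketed comments can be simulated in memory. This is only a sketch on made-up data: the second bucket, its comment, and the function name topComments are assumptions, and the paging step stands in for $skip and $limit stages that the original answer does not include.

```javascript
// Two comment buckets for one post; the second bucket is hypothetical.
const buckets = [
  {
    _id: "/post/2323423/1--1",
    comments: [
      { author: "peter", text: "comment!", votes: 43 },
      { author: "joe", text: "hi!", votes: 102 },
    ],
  },
  {
    _id: "/post/2323423/1--2",
    comments: [{ author: "ann", text: "hello", votes: 77 }],
  },
];

// Return one page of comments ordered by votes, highest first.
function topComments(buckets, page, pageSize) {
  // Like $unwind: flatten every bucket into one list of comments.
  const all = buckets.flatMap((b) => b.comments);
  // Like $sort on "comments.votes" descending.
  all.sort((a, b) => b.votes - a.votes);
  // Like $skip followed by $limit.
  return all.slice(page * pageSize, (page + 1) * pageSize);
}
```

The simulation makes the cost visible: every bucket of the post is read and flattened before the first page can be returned, which is the inefficiency the question describes.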
