MongoDB design decision for Documents

I'm building an API for a small social network and have run into a design decision. I'm using Express and MongoDB, with Mongoose as the database layer.
I have two kinds of documents: Users and Posts. I want Users to be able to mark Posts as their favorites. I came up with two ways to implement this:
Option A: Save the favorites in the User document. That makes it easy to show all favorite posts of a user. But how would I query the users who have favorited a specific Post?
UserSchema:
favorite_posts: [
  {
    type: mongoose.Schema.Types.ObjectId,
    ref: "posts"
  }
]
Option B: Save the Users who hit the favorite button in the Post document. The benefit is that you can easily display all Users who have favorited a Post. But how do I list all Posts that one specific User has marked as favorites?
PostSchema:
users_favorited: [
  {
    type: mongoose.Schema.Types.ObjectId,
    ref: "users"
  }
]
Can somebody explain to me how to query such things? The documentation isn't making me any wiser... :(

As already mentioned in the comments, your best bet is a join collection that models the n:m relation. Mongoose emulates SQL-style joins through populate() in regular queries, and MongoDB offers the $lookup stage in aggregations (which behaves like a left outer join). So create a collection called "likes" that holds only references to the user and the post. Using the aggregation framework, you can then query for all likes of a user or all likes on a post: first filter with $match, then $group by either the user or the post and use $push to build an array of the likes, and finally join the needed data onto it with $lookup.
However, you could, as you've described, put all the favorites in an array on either the user or the post documents. Unless you know for sure that these arrays won't grow large, I'd recommend against it: MongoDB is not designed for this kind of usage and you can quickly run into performance problems. See http://www.askasya.com/post/largeembeddedarrays/ for more.
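A minimal sketch of the join-collection approach described above. It assumes a hypothetical "likes" collection whose documents look like { user: ObjectId, post: ObjectId }, plus collections named "posts" and "users"; none of these names come from the question's code beyond the two array fields. The helpers only build plain pipeline and filter objects, so nothing here talks to a database.

```javascript
// Sketch only: "likes", "posts", "users" and the like-document fields
// (user, post) are assumptions made for illustration.

// Pipeline to run on the likes collection: all posts one user has favorited.
function favoritePostsOfUser(userId) {
  return [
    { $match: { user: userId } },                              // this user's likes
    { $group: { _id: "$user", postIds: { $push: "$post" } } }, // collect post ids
    {
      $lookup: {                                               // join the post data
        from: "posts",
        localField: "postIds",
        foreignField: "_id",
        as: "posts",
      },
    },
  ];
}

// Mirror image: all users who have favorited one post.
function usersWhoFavoritedPost(postId) {
  return [
    { $match: { post: postId } },
    { $group: { _id: "$post", userIds: { $push: "$user" } } },
    {
      $lookup: {
        from: "users",
        localField: "userIds",
        foreignField: "_id",
        as: "users",
      },
    },
  ];
}

// With the embedded arrays from the question, an equality filter on an
// array field matches documents whose array contains that value.
const usersWithFavorite = (postId) => ({ favorite_posts: postId });  // Option A
const postsFavoritedBy = (userId) => ({ users_favorited: userId });  // Option B
```

With Mongoose this would be used roughly as Like.aggregate(favoritePostsOfUser(userId)) or User.find(usersWithFavorite(postId)), where Like and User are your own models. As far as I know, Mongoose does not cast values inside aggregation pipelines, so pass real ObjectId values there rather than strings.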

If you are going to query by user id a lot, you can just add a userid field to the favorites document. This would save queries, joins, or aggregations.

Related

MongoDB: Looking for advice on designing schema for improving query efficiency

I am fairly new to MongoDB and I’m looking for advice on designing the schema before I commit to going down this route. I’m developing a collaborative documentation system, where the user creates a document and invites other users to collaborate, much like Google docs.
There are two collections. The first one stores documents and the second one stores lists of collaborators. When the user creates a new document, they assign a list of collaborators to this document. In its simplest form, the schema would look something like this:
The Document schema contains some data, but it also maintains a reference to a document in the Collaborators collection.
Document model
{
....
collaborators: ObjectId; // e.g. 0x507f1f77bcf86cd799439011
}
Collaborators collection contains documents that contain an array of roles for the collaborators.
Collaborators model
{
_id: 0x507f1f77bcf86cd799439011; // referenced by Document model
collaborators: [
{userId: 1, role: "editor"},
{userId: 2, role: "commenter"}
]
}
I will have an API that fetches all those documents where the logged-in user’s userId is in the list of collaborators referenced by the document. Without much experience with writing efficient queries, I think a two-step lookup will work but it won’t be very efficient.
Step 1 → Find all the collaborators lists which contain userId, and obtain their _id field
Step 2 → Find all documents that have collaborators field containing one of the values found in Step 1
Is there a more efficient way to construct this query particularly if the users fetch this list frequently?
If I should redesign the schema in some way so that the lookup can be efficient, I’d like to know.
I'm using mongoose client if that's relevant.

I realized the MongoDB aggregation framework is what I needed. I was able to use the $lookup and $match stages to achieve what I wanted. I'm still not sure how expensive this is, given that $lookup performs a left outer join.
Here’s an example if anybody wants to look.
https://mongoplayground.net/p/RPheBZESC0H
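For readers who cannot open the playground, here is one possible shape of such a pipeline. It is not a copy of the linked example. It assumes the documents live in a collection named "documents" (a hypothetical name) and starts from the collaborators collection, so that $match runs first and $lookup only joins the lists that already contain the user.

```javascript
// Sketch only: the collection name "documents" is an assumption.
// Run this pipeline on the collaborators collection.
function documentsSharedWithUser(userId) {
  return [
    // Step 1: keep only collaborator lists that contain this user.
    { $match: { "collaborators.userId": userId } },
    // Step 2: join the documents that reference each remaining list.
    {
      $lookup: {
        from: "documents",
        localField: "_id",
        foreignField: "collaborators",
        as: "documents",
      },
    },
    // Emit one result per joined document, shaped like the document itself.
    { $unwind: "$documents" },
    { $replaceRoot: { newRoot: "$documents" } },
  ];
}
```

Because the first stage is a plain $match, an index on collaborators.userId in the collaborators collection and one on the collaborators field of the documents collection should keep both steps cheap, though the actual cost depends on your data.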

Mongodb Storing Friends Relationship

I am using MongoDB for a mobile app that we are developing. It has a contact sync feature.
I want to know the ideal way of storing relationships (friend relationships, not RDBMS-style relations) in MongoDB, and what architecture to use for it.
I have thought of the following user collection structure:
{
_id: ObjectID(abc),
name: "abc",
contacts: ["def", "ghi"]
}
In the above collection, "def" and "ghi" stand for the object ids of user abc's friends. Is this the correct way of doing it, or can someone suggest a better approach that they have implemented?
My main concern is that I should not get stuck or hit performance problems later when retrieving data specific to a user's friends.
For example, consider that I want to get all the activities in the Activities collection done by my friends.

I think you could take advantage of the NoSQL structure and store some more information about each friend:
{
_id: ObjectID(abc),
name: "abc",
contacts: [{id: "def", name: "John"}, {id: "ghi", name: "Sari"}]
}
To display a basic list you need just one get query, and since you already have the name (or other important related details), you can then check for activities.
The extra overhead of this structure is the need to update the name (and other details) in every copy whenever a user changes it, but this is not a big problem: who changes their name frequently?
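A small sketch of the two operations this structure implies: finding the activities of a user's friends, and refreshing the cached name after a friend renames themselves. The userId field on an activity is a hypothetical name, since the question does not show the Activities schema. The helpers only build the arguments you would hand to find() and updateMany().

```javascript
// Sketch only: `userId` on an activity document is an assumed field name.

// Filter for the Activities collection: everything done by a user's friends.
function friendActivitiesFilter(user) {
  const friendIds = user.contacts.map((contact) => contact.id);
  return { userId: { $in: friendIds } };
}

// Arguments for updateMany() on the users collection that rewrite the
// cached name in every contact list that contains the renamed friend.
function renameContactUpdate(friendId, newName) {
  return {
    filter: { "contacts.id": friendId },
    update: { $set: { "contacts.$[c].name": newName } },
    options: { arrayFilters: [{ "c.id": friendId }] },
  };
}

const abc = {
  _id: "abc",
  name: "abc",
  contacts: [{ id: "def", name: "John" }, { id: "ghi", name: "Sari" }],
};
console.log(JSON.stringify(friendActivitiesFilter(abc)));
console.log(JSON.stringify(renameContactUpdate("def", "Johnny")));
```

The filtered positional operator $[c] with arrayFilters needs MongoDB 3.6 or later. The rename update touches every user who has that friend in their contacts, which is the overhead mentioned above.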

Mongodb schema design for swipe card style application

What would be a good approach to designing the following swipe-card style app with skip functionality?
The core functionality of the app I'm working on is as follows.
On the main page, a user first queries for a list of posts.
The list should be sorted by date in reverse chronological order, or by some kind of internal score that identifies active posts (those with a large number of votes, comments, etc.).
Each post is shown to the user one at a time in the form of a card, like a Tinder or Jelly style feed.
For each card, the user can either skip it or vote for it.
When the user has consumed all fetched cards and queries again for the next items, cards the current user has already skipped or voted on should not appear again.
The point here is that a user could have a huge number of skipped or voted posts, since skipping or voting is all a user can do with a post on the main page. (Users can browse these already processed items on their profile.)
The approaches I have thought of are:
1. Store the list of skipped or voted post ids for each user somewhere and use them in the query with the $nin operator.
db.posts.find({ _id: {$nin: [postid1,...,postid999]} }).sort({ date: -1 })
2. Embed the userIds of all users who voted on or skipped the post in an array, and query using the $ne operator.
{
_id: 'postid',
skipOrVoteUser: ['user1', 'user2' ...... 'user999'],
date: 1429286816366
}
db.posts.find({ skipOrVoteUser: {$ne: 'user1'} }).sort({ date: -1 })
3. Maintain a feedCache for each user and fan out on write.
FeedCache
{
userId: 'user1',
posts: [{id:1, data: {..}}, {id:2, data: {...}},.... {id:3, data: {...}}]
}
Operations:
- When a user creates a post, write a copy of the post to the feed cache of every user in the system.
- Fetch posts from the user's feed cache.
- When the user votes on or skips a post, delete the post from his/her feed cache.
But the list of posts that a user has skipped or voted on is ever growing and could become really large over time. I'm concerned that the query in approach 1 would be too slow with a very long list for $nin.
With approach 2, since every user in the system (or many, depending on the filtering) could vote on or skip a post, the embedded user array of each post could become really large (at most the number of all users), and the performance of the $ne query would be poor.
With approach 3, every created post triggers too many write operations, so it won't be efficient.
What would be a good approach to designing a schema that supports this kind of functionality? I've tried to come up with a good solution and could not think of a better one. Please help me solve this problem. Thanks!

On a relational database I would use approach 1. It's an obvious choice, as you have good SQL operators for the task and you can easily optimize the query.
With a document database I would choose approach 2. In this case there is a good chance that the vote/skip list stays relatively small as the system grows.
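A minimal sketch of approach 2, using the skipOrVoteUser and date fields from the question. The two builder functions return the filter and update objects you would pass to find() and updateOne(); the third function is only an in-memory imitation of what the $ne filter plus sort returns, added so the behaviour can be checked without a database.

```javascript
// Filter: posts whose skipOrVoteUser array does not contain this user.
function unseenPostsFilter(userId) {
  return { skipOrVoteUser: { $ne: userId } };
}

// Update to apply to a post when the user skips or votes on it.
// $addToSet avoids storing the same user twice.
function markSeenUpdate(userId) {
  return { $addToSet: { skipOrVoteUser: userId } };
}

// In-memory imitation of find(unseenPostsFilter(userId)).sort({ date: -1 }).
function unseenPosts(posts, userId) {
  return posts
    .filter((post) => !post.skipOrVoteUser.includes(userId))
    .sort((a, b) => b.date - a.date);
}

const samplePosts = [
  { _id: "p1", skipOrVoteUser: ["user1", "user2"], date: 1429286816366 },
  { _id: "p2", skipOrVoteUser: ["user2"], date: 1429286816400 },
  { _id: "p3", skipOrVoteUser: [], date: 1429286816500 },
];
console.log(unseenPosts(samplePosts, "user1").map((post) => post._id));
```

Whether the arrays really stay small depends on how many users ever see a given post, so it is worth measuring the array sizes on real data before committing to this design.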

MongoDb - Modeling storage of users & post in a webapp

I'm quite new to the NoSQL world.
If I have a very simple web app where users authenticate and publish posts, what's the MongoDB (NoSQL) way to store users and posts in the database?
Do I have to store users and posts each in their own collection (as in relational databases)? Or store them in the same collection, as different documents? Or, finally, store redundant user info (credentials) on each post the user has published?

One way to do it is to use two collections, a posts collection and an authors collection. They could look like the following:
Posts
{
title: "Post title",
body: "Content of the post",
author: "author_id",
date: "...",
comments: [
{
name: "name of the commenter",
email: "...",
comment: "..."
}],
tags: [
"tag1", "tag2", "tag3"
]
}
Authors
{
"_id": "author_id",
"password": "..."
}
Of course, you can put it all in a single collection, but @jcrade mentioned a reason why you would/should use two collections. Remember, this is NoSQL. You should design your database from the application's point of view, which means asking yourself what data is consumed and how.

This post says it all:
https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-1
It really depends on your application and on how many posts you expect your users to have. If it's a one-to-few relationship, then embedded documents (inside your users model) are probably the way to go. If it's one-to-many (up to a couple of thousand), then just embed an array of IDs in your users model. If it's more than that, then use the answer provided by Horizon_Net.
Read the post and you will get a pretty good idea of what you have to do. Good luck!

When you are modeling a NoSQL database, you should think in terms of three basic ideas:
Denormalization
Copy the same data into multiple documents in order to simplify or optimize query processing, or to fit the user's data into a particular data model.
Aggregation
Embed data into documents (for example, a blog post and its comments). This affects updates in both performance and consistency, because MongoDB guarantees consistency only one document at a time.
Application-level joins
Create application-level joins when it is not a good idea to aggregate the information (for example, each post as an independent document will be really bad because we need to access the same resource).
To answer your question:
Create two kinds of documents: one is blogPost, with all the comments and tags on it plus the user id; the second is User, with all the user information.
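To illustrate the application-level join mentioned above, here is a rough sketch that works on plain arrays standing in for query results. In a real application the two arrays would come from two separate queries (posts first, then the authors whose ids appear in those posts). The field names follow the Posts and Authors examples earlier in this thread; the displayName field and the function names are made up for illustration.

```javascript
// Filter for the second query: fetch only the authors referenced by the posts.
function authorsFilterFor(posts) {
  const uniqueIds = [...new Set(posts.map((post) => post.author))];
  return { _id: { $in: uniqueIds } };
}

// The join itself: replace each post's author id with the author document.
function attachAuthors(posts, authors) {
  const authorsById = new Map(authors.map((author) => [author._id, author]));
  return posts.map((post) => ({
    ...post,
    author: authorsById.get(post.author) || null, // null if the author is missing
  }));
}

// Stand-ins for the results of the two queries.
const postDocs = [
  { title: "First", author: "a1", tags: ["tag1"] },
  { title: "Second", author: "a2", tags: [] },
  { title: "Third", author: "a1", tags: ["tag2"] },
];
const authorDocs = [
  { _id: "a1", displayName: "Ann" }, // displayName is a hypothetical field
  { _id: "a2", displayName: "Bob" },
];

console.log(JSON.stringify(authorsFilterFor(postDocs)));
console.log(attachAuthors(postDocs, authorDocs).map((p) => p.author.displayName));
```

The second query uses one $in filter for all posts instead of one query per post, which keeps the number of round trips at two regardless of how many posts are shown.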

How to balance quickness and redundancy in MongoDB data structures?

I am creating a MongoDB database with a users collection (with UserFiles in it) and a posts collection. Each post has tags and sharedFrom fields in it. I eventually plan to have users' search results influenced by what tags they normally post about and from which other users they often share posts. Would it be better to:
make a field in each user's UserFile document that lists the IDs of the posts made by the user?
make a field in the UserFile document that lists all the tags they have used and the other users they have sharedFrom?
make the search function look up the searcher's activity, which then influences the search results?
something I haven't thought of?
