MongoDB schema design for a feed - mongodb

I'm using a MongoDB database for a bunch of users. Each document in the collection holds the user ID and a nested array of the things that the user has liked or that should otherwise show up in their feed. My idea is that when something they like changes, their feed updates (I'll remove the old entry from days ago and insert a new entry for today).
OK, here's the question/problem. This concept works well when one user likes something and a content element they liked is later updated. BUT what happens if 5 million users all like one content element (say, an article) and that element is updated? How, using MongoDB, do I insert and delete entries in 5 million documents all at once? Perhaps someone can suggest a better schema.

In this particular case, I would suggest a separate collection for that purpose:
col events/likes {
_id,
userId,
action // add fields as needed
}
Then, if user id:1 is subscribed to events from user id:2, we need to retrieve the documents for user id:2 from the events/likes collection.
Does this make sense in your case?
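To make the suggestion concrete, here is a minimal sketch of building the feed at read time from a separate events collection. It uses plain in-memory arrays as a stand-in for MongoDB, and the createdAt field, the subscriptions map, and the buildFeed helper are assumptions made for illustration, not part of the answer above. The point is that an update writes one event document instead of touching every user's document.

```javascript
// In-memory stand-in for the proposed events/likes collection.
// createdAt is an assumed extra field used only for ordering.
const events = [
  { _id: 1, userId: 2, action: "liked article 10", createdAt: 3 },
  { _id: 2, userId: 3, action: "liked article 11", createdAt: 5 },
  { _id: 3, userId: 2, action: "updated article 10", createdAt: 7 },
  { _id: 4, userId: 9, action: "liked article 12", createdAt: 8 },
];

// Hypothetical subscription map: user 1 follows users 2 and 3.
const subscriptions = { 1: [2, 3] };

// Build the feed when it is read. Against MongoDB this would roughly be:
// db.events.find({ userId: { $in: followed } }).sort({ createdAt: -1 }).limit(limit)
function buildFeed(readerId, limit) {
  const followed = new Set(subscriptions[readerId] || []);
  return events
    .filter((e) => followed.has(e.userId)) // only events from followed users
    .sort((a, b) => b.createdAt - a.createdAt) // newest first
    .slice(0, limit);
}

const feed = buildFeed(1, 10);
console.log(feed.map((e) => e._id));
```

With this layout, an update to a popular article is a single insert into the events collection, and each reader picks it up the next time their feed is queried.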

Related

Is there a method in Firestore that could fetch a document and its adjacent documents?

In Firestore, we need to use something like startAt or endAt with limit to get the data either before or after a given point in the list of documents within a collection.
We need to fetch twice to get adjacent documents (once for the data before and once for the data after), then combine the results into one list, which can be error-prone.
I was wondering if there is a way to get a document together with its adjacent documents in Flutter Firestore, something like .getDocumentWithAdjacent:
Firestore.instance.collection('restaurants').document(docId)
.getDocumentWithAdjacent(previousCount: 3, afterCount: 6); // will fetch 10 documents
The command above would fetch 10 documents: the 3 previous documents, the document itself, and the 6 following documents.
Some further explanation and use cases:
In the Instagram app, for example, the user can tap on a post to see its details. The user can then scroll up and down to see the adjacent posts' details easily, because when the user taps on a post, some of the data before and after that post is also loaded.
Another example: a restaurant search app that shows a list of restaurants in an area.
When a user taps somewhere in the middle of the list, that restaurant's details are displayed, but the user may want to scroll (either horizontally or vertically) to see the adjacent restaurants in the list, so we need to load the data for those as well.
Thank you
Firestore does not have a concept of "adjacent" documents in the way that you describe.
What you can do instead is make two queries: one for documents greater than the given document, and another for documents less than it.
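The two-query approach can be sketched as follows. This is an in-memory illustration rather than real Firestore calls: queryAfter and queryBefore are hypothetical stand-ins for an ascending and a descending query that each start after the pivot value, and the sketch assumes the sort field (name here) is unique across documents.

```javascript
// In-memory stand-in for a collection ordered by a unique "name" field.
const restaurants = "abcdefghijkl".split("").map((name) => ({ name }));

const byName = (a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0);

// Stand-in for: orderBy("name").startAfter(pivot).limit(count)
function queryAfter(docs, pivot, count) {
  return docs.filter((d) => d.name > pivot).sort(byName).slice(0, count);
}

// Stand-in for: orderBy("name", "desc").startAfter(pivot).limit(count)
function queryBefore(docs, pivot, count) {
  return docs
    .filter((d) => d.name < pivot)
    .sort((a, b) => byName(b, a)) // descending, so the nearest documents come first
    .slice(0, count);
}

// Combine both results around the pivot document.
function getWithAdjacent(docs, pivotDoc, previousCount, afterCount) {
  // The "before" query returns nearest-first, so reverse it for display order.
  const before = queryBefore(docs, pivotDoc.name, previousCount).reverse();
  const after = queryAfter(docs, pivotDoc.name, afterCount);
  return [...before, pivotDoc, ...after];
}

const page = getWithAdjacent(restaurants, { name: "e" }, 3, 6);
console.log(page.map((d) => d.name).join(""));
```

The step that is easy to get wrong is the reversal: the "before" query has to run in descending order to return the nearest documents, so its results must be flipped before they are merged.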

Firestore: Order by sub-collection field

First of all, this is not a regular question. It's a little complicated.
App summary
A recipes app where users can search recipes by selected ingredients (an ingredients collection exists in the Firestore db). For every ingredient, I want to store statistics on how often users searched with that ingredient selected, so that I can later show each user, at the top, the ingredients they used most when searching for recipes.
This is what my collection looks like:
http://prntscr.com/nlz062
And now I would like to order recipes by the statistics created by the logged-in user.
first = firebaseHelper
.getDb()
.collection(Constants.INGREDIENTS_COLLECTION)
.orderBy("statistics." + firebaseHelper.getCurrentUser().getUid() + ".count")
.limit(25);
If the logged-in user hasn't yet searched recipes with ingredients, then it should use the normal order. In any case, the query above is not working. Is this use case possible with Firestore?
Note: Statistics may or may not exist for the logged-in user; it all depends on their searches.
You can't query or order documents by fields that don't exist directly within the document. In other words, you can't use fields of documents within subcollections that are not in the named collection being queried.
As of today (using the latest Firestore client libraries), you could instead perform a collection group query to query all of the subcollections called "statistics" for their count field. However, that will still only get you the statistics documents. You would have to iterate those documents, parse the ingredient document ID out of each reference, and individually get() each one of those documents in order to display a UI.
The collection group query would look something like this in JavaScript:
firestore
.collectionGroup("statistics")
.where(FieldPath.documentId())
.orderBy("count")
.limit(25)
You should be able to iterate those results and get the related documents with no problem.
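As a rough sketch of that iteration step, the example below uses plain objects in place of Firestore snapshots. The "ingredients" collection name, the document IDs, and the ingredientIdFromPath helper are assumptions for illustration. With the real client SDK, the parent document of a subcollection document is reachable through its reference (ref.parent.parent), so no string parsing is needed there.

```javascript
// Hypothetical results of the collection group query, already in query order.
// Each entry carries the full document path and the count field.
const statsResults = [
  { path: "ingredients/tomato/statistics/user1", count: 2 },
  { path: "ingredients/basil/statistics/user1", count: 5 },
  { path: "ingredients/garlic/statistics/user1", count: 9 },
];

// In-memory stand-in for the ingredients collection, keyed by document ID.
const ingredients = new Map([
  ["tomato", { name: "Tomato" }],
  ["basil", { name: "Basil" }],
  ["garlic", { name: "Garlic" }],
]);

// Path layout: <collection>/<ingredientId>/statistics/<statDocId>,
// so the ingredient ID is the third segment from the end.
function ingredientIdFromPath(path) {
  const segments = path.split("/");
  return segments[segments.length - 3];
}

// Look up each parent ingredient, keeping the order of the statistics query.
// With Firestore, each lookup would be a separate get() on the parent document.
const ordered = statsResults.map((stat) => {
  const id = ingredientIdFromPath(stat.path);
  return { id, count: stat.count, ...ingredients.get(id) };
});

console.log(ordered.map((i) => i.name));
```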

Schema on mongodb for reducing API calls with two collections

Not quite sure what the best practice is if I have two collections, a user collection and a picture collection - I do not want to embed all my pictures into my user collection.
My client searches for pictures under a certain criteria. Let's say he gets 50 pictures back from the search (i.e. one single mongodb query). Each picture is associated to one user. I want the user name displayed as well. I assume there is no way to do a single search performance wise on the user collection returning the names of each user for each picture, i.e. I would have to do 50 searches. Which means, I could only avoid this extra performance load by duplicating data (next to the user_id, also the user_name) in my pictures collection?
Same question the other way around. If my client searches for users and say 50 users are returned from the search through one single query. If I want the last associated picture + title also displayed next to the user data, I would again have to add that to the users collection, otherwise I assume I need to do 50 queries to return the picture data?
Let's say the schema for your picture collection is as follows:
Picture Document
{
_id: ObjectId(123),
url: 'img1.jpg',
title: 'img_one',
userId: ObjectId(342)
}
1) Your picture query will return documents that look like the above. You don't have to make 50 calls to get the users associated with the images. You can simply make one more query to the Users Collection, using the user ids taken from the picture documents, like this:
db.users.find({_id: {$in: [userid_1, userid_2, userid_3, ..., userid_n]}})
You will receive an array of user documents with the user information. You'll have to handle their display on the client afterwards. At most you'll need 2 calls.
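The two-call pattern, including the client-side step of matching users back to pictures, might look like the sketch below. It runs against in-memory arrays. findUsersByIds is a hypothetical stand-in for the $in query, and the sample documents are invented for illustration.

```javascript
// Result of the first query: the pictures matching the search.
const pictures = [
  { _id: 123, url: "img1.jpg", title: "img_one", userId: 342 },
  { _id: 124, url: "img2.jpg", title: "img_two", userId: 343 },
  { _id: 125, url: "img3.jpg", title: "img_three", userId: 342 },
];

// In-memory stand-in for the users collection.
const users = [
  { _id: 342, name: "Steve jobs" },
  { _id: 343, name: "Another user" },
  { _id: 344, name: "Unrelated user" },
];

// Stand-in for: db.users.find({ _id: { $in: ids } })
function findUsersByIds(ids) {
  const wanted = new Set(ids);
  return users.filter((u) => wanted.has(u._id));
}

// Second query: collect the distinct user ids and fetch them in one call.
const userIds = [...new Set(pictures.map((p) => p.userId))];
const usersById = new Map(findUsersByIds(userIds).map((u) => [u._id, u]));

// Client-side merge: attach the user name to each picture.
const picturesWithNames = pictures.map((p) => {
  const user = usersById.get(p.userId);
  return { ...p, userName: user ? user.name : null };
});

console.log(picturesWithNames.map((p) => `${p.title}: ${p.userName}`));
```

Deduplicating the ids first keeps the $in list short when many pictures belong to the same user.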
Alternatively
You could design the schema as follows:
Picture Document
{
_id: ObjectId(123),
url: 'img1.jpg',
title: 'img_one',
userId: ObjectId(342),
user_name: "user associated"
}
If you design it this way, you would only require 1 call, but the user name won't stay in sync with the user collection documents. For example, let's say a user changes their name. A picture that was saved before may still have the old user name.
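One possible way to limit that staleness, sketched here purely as an assumption and not something the answer prescribes, is to rewrite the duplicated name in every affected picture whenever a user is renamed. The example uses an in-memory array, and renameUserInPictures is a hypothetical helper standing in for a multi-document update.

```javascript
// In-memory stand-in for the pictures collection with the duplicated user_name.
const pictures = [
  { _id: 123, title: "img_one", userId: 342, user_name: "old name" },
  { _id: 124, title: "img_two", userId: 343, user_name: "someone else" },
  { _id: 125, title: "img_three", userId: 342, user_name: "old name" },
];

// Stand-in for:
// db.pictures.updateMany({ userId: userId }, { $set: { user_name: newName } })
// Returns the number of documents that were changed.
function renameUserInPictures(userId, newName) {
  let modified = 0;
  for (const picture of pictures) {
    if (picture.userId === userId && picture.user_name !== newName) {
      picture.user_name = newName;
      modified += 1;
    }
  }
  return modified;
}

const changed = renameUserInPictures(342, "new name");
console.log(changed, pictures.map((p) => p.user_name));
```

The trade-off is the one described above: reads need a single call, while a rename has to touch every picture that belongs to the user.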
2) You could design your User Collection as follows:
User Document
{
_id: ObjectId(342),
name: "Steve jobs",
last_assoc_img: {
img_id: ObjectId(342),
url: 'img_one',
title: 'last image title'
}
}
The same principles as mentioned above apply.
Assuming that you have a user id associated with every user and you're also storing that id in the picture document, then your user <=> picture is a loosely coupled relationship.
In order to not have to make 50 separate calls, you can use the $in operator given that you are able to pull out those ids and put them into a list to run the second query. Your query will basically be in English: "Look at the collection, if it's in the list of ids, give it back to me."
If you intend on doing this a lot and intend for it to scale, I'd either recommend using a relational database or a NoSQL database that can handle joins to not force you into an embedded document schema.

how to join a collection and sort it, while limiting results in MongoDB

Let's say I have 2 collections in which each document may look like this:
Collection 1:
target:
_id,
comments:
[
{ _id,
message,
full_name
},
...
]
Collection 2:
user:
_id,
full_name,
username
I am paging through comments via $slice; let's say I take the first 25 entries.
From these entries I need the corresponding usernames, which I get from the second collection. What I want is to get the comments sorted by their referenced username. The problem is that I can't add the username to the comments, because usernames may change often, and if one does, I would need to update all target documents that contain the old username.
I can only imagine one way to solve this: read out all of the full_names and query them in the user collection. The result would be sortable, but it is not paged, so it takes a lot of resources with large documents.
Is there anything I am missing with this problem?
Thanks in advance
If comments are an embedded array, you will have to do work on the client side to sort the comments array unless you store it in sorted order. Your application requirements for username force you to either read out all of the usernames of the users who commented to do the sort, or to store the username in the comments and have (much) more difficult and expensive updates.
Sorting and pagination don't work unless you can return the documents in sorted order. You should consider a different schema where comments form a separate collection so that you can return them in sorted order and paginate them. Store the username in each comment to facilitate the sort on the MongoDB side. Depending on your application's usage pattern this might work better for you.
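A minimal sketch of that alternative schema is shown below, with comments held in their own collection and the username stored on each comment. It uses an in-memory array. The targetId field, the sample data, and the pageOfComments helper are assumptions for illustration, and ties on username are broken by _id so that pages stay stable.

```javascript
// In-memory stand-in for a separate comments collection.
const comments = [
  { _id: 1, targetId: "t1", username: "dave", message: "first" },
  { _id: 2, targetId: "t1", username: "alice", message: "second" },
  { _id: 3, targetId: "t2", username: "bob", message: "elsewhere" },
  { _id: 4, targetId: "t1", username: "carol", message: "third" },
  { _id: 5, targetId: "t1", username: "alice", message: "fourth" },
];

// Stand-in for:
// db.comments.find({ targetId: targetId })
//   .sort({ username: 1, _id: 1 })
//   .skip(page * pageSize)
//   .limit(pageSize)
function pageOfComments(targetId, page, pageSize) {
  return comments
    .filter((c) => c.targetId === targetId)
    .sort((a, b) => {
      if (a.username < b.username) return -1;
      if (a.username > b.username) return 1;
      return a._id - b._id; // stable tie-breaker
    })
    .slice(page * pageSize, (page + 1) * pageSize);
}

console.log(pageOfComments("t1", 0, 2).map((c) => c._id));
console.log(pageOfComments("t1", 1, 2).map((c) => c._id));
```

Because the sort happens where the data lives, only one page of comments has to be returned per request.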
It also seems strange to sort on usernames and expect/allow usernames to change frequently. If you could drop these requirements it'd make your life easier :D

mongodb - add column to one collection find based on value in another collection

I have a posts collection which stores posts related info and author information. This is a nested tree.
Then I have a postrating collection which stores which user has rated a particular post up or down.
When a request is made to get a nested tree for a particular post, I also need to return whether the current user has voted and, if so, whether the vote was up or down, for each of the posts being returned.
In SQL this would be something like "posts.*, postrating.vote from posts join postrating on postID and postrating.memberID=currentUser".
I know MongoDB does not support joins. What are my options with MongoDB?
use map reduce - performance for a simple query?
in the post document store the ratings - BSON size limit?
Get list of all required posts. Get list of all votes by current user. Loop on posts and if user has voted add that to output?
Is there any other way? Can this be done using aggregation?
NOTE: I started on MongoDB last week.
In MongoDB, the simplest way is probably to handle this with application-side logic and not to try this in a single query. There are many ways to structure your data, but here's one possibility:
user_document = {
name : "User1",
postsIhaveLiked : [ "post1", "post2" ... ]
}
post_document = {
postID : "post1",
content : "my awesome blog post"
}
With this structure, you would first query for the user's user_document. Then, for each post returned, you could check if the post's postID is in that user's "postsIhaveLiked" list.
The main idea with this is that you get your data in two steps, not one. This is different from a join, but based on the same underlying idea of using one key (in this case, the postID) to relate two different pieces of data.
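The two-step lookup can be sketched like this, with in-memory objects standing in for the results of the two queries. The annotateWithLikes helper and the likedByCurrentUser field name are hypothetical, chosen only for illustration.

```javascript
// Step 1: the current user's document (result of the first query).
const userDocument = {
  name: "User1",
  postsIhaveLiked: ["post1", "post2"],
};

// Step 2: the posts to display (result of the second query).
const posts = [
  { postID: "post1", content: "my awesome blog post" },
  { postID: "post3", content: "another post" },
];

// Application-side "join": flag each post the user has liked.
function annotateWithLikes(user, postList) {
  const liked = new Set(user.postsIhaveLiked); // fast membership checks
  return postList.map((post) => ({
    ...post,
    likedByCurrentUser: liked.has(post.postID),
  }));
}

const annotated = annotateWithLikes(userDocument, posts);
console.log(annotated.map((p) => `${p.postID}: ${p.likedByCurrentUser}`));
```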
In general, try to avoid using map-reduce for performance reasons. And for this simple use case, aggregation is not what you want.