Paginating a collection from DB receiving updates - mongodb

I need to serve elements sorted on a particular field (score) to a client in a paginated fashion.
The elements are stored in MongoDB as part of a collection. One document looks like this:
{
"id": ObjectId("<>"),
"score": 10
}
To serve the elements, I reverse sort the documents on the score field, and serve 10 elements to the client.
Also, the value in the score field is continuously getting updates from another consumer in an async fashion.
How can I perform pagination of such documents? I was thinking about the following approaches that I usually use, but cannot find a way to fit them in the above design:
Return the last served score as an offset, and use that offset in the next request to fetch elements.
Issue: This would return duplicates, because many elements can have the same score as the offset.
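One common way around the duplicate problem is to add a unique tie-breaker to the offset, so the cursor is the pair (score, _id) rather than the score alone. The sketch below is not from the original post: it assumes the unique field is `_id` and that results are sorted with `{ score: -1, _id: 1 }`. `buildNextPageFilter` and `nextPage` are hypothetical helper names, and `nextPage` is only an in-memory stand-in for the real query.

```javascript
// Hypothetical helper: builds the MongoDB filter for the page after `cursor`,
// intended for use with .sort({ score: -1, _id: 1 }).limit(pageSize).
function buildNextPageFilter(cursor) {
  if (!cursor) return {}; // first page: no bound
  return {
    $or: [
      { score: { $lt: cursor.score } },                 // strictly lower score
      { score: cursor.score, _id: { $gt: cursor.id } }, // same score, later _id
    ],
  };
}

// In-memory stand-in for the same query, so the logic can be checked without a database.
function nextPage(docs, cursor, pageSize) {
  const sorted = [...docs].sort(
    (a, b) => b.score - a.score || (a._id < b._id ? -1 : a._id > b._id ? 1 : 0)
  );
  const rest = cursor
    ? sorted.filter(
        (d) => d.score < cursor.score || (d.score === cursor.score && d._id > cursor.id)
      )
    : sorted;
  const page = rest.slice(0, pageSize);
  const last = page[page.length - 1];
  return { page, cursor: last ? { score: last.score, id: last._id } : null };
}

// Usage: three documents share score 10, yet no page repeats one.
const scoredDocs = [
  { _id: "a", score: 10 },
  { _id: "b", score: 10 },
  { _id: "c", score: 10 },
  { _id: "d", score: 7 },
];
const firstPage = nextPage(scoredDocs, null, 2);              // a, b
const secondPage = nextPage(scoredDocs, firstPage.cursor, 2); // c, d
```

This only removes duplicates caused by ties. Because scores keep changing, a document whose score moves across the cursor between two requests can still be skipped or served twice.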

Related

Query documents in one collection that aren't referenced in another collection with Firestore

I have a firestore DB where I'm storing polls in one collection and responses to polls in another collection. I want to get a document from the poll collection that isn't referenced in the responses collection for a particular user.
The naive approach would be to get all of the poll documents and all of the responses filtered by user ID then filter the polls on the client side. The problem is that there may be quite a few polls and responses so those queries would have to pull down a lot of data.
So my question is, is there a way to structure my data so that I can query for polls that haven't been completed by a user without having to pull down the collections in their entirety? Or more generally, is there some pattern to use when you need to query for documents in one collection that aren't referenced by another?
The documents in each of the collections look something like this:
Polls:
{
question: string;
answers: Answer[];
}
Responses:
{
userId: string;
pollId: string;
answerId: string;
}
Any help would be much appreciated!
Queries in Firestore can only return documents from one collection (or from all collections with the same name) and can only contain conditions on the data that they actually return.
Since there's no way to filter based on a condition in some other documents, you'll need to include the information that you want to filter on in the polls documents.
For example, you could include a completionCount field in each poll document, that you initially set to 0, and then update on every poll completion. With that in place, the query becomes a simple query on the completionCount field of the polls collection.
For a specific user I'd actually add all polls to their profile document, and remove each poll from there once the user completes it. Duplicating data is usually the easiest (and sometimes only) way to implement use-cases such as this.
If you're worried about having to add each new poll to each new user profile when it is created, you can also query all polls on their creation timestamp when you next load a user profile and perform that sync at that moment.
load user profile,
check when they were last active,
query for new polls,
add them to user profile.
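The four steps above can be sketched on plain objects. Everything named here is an assumption for illustration: the fields `lastActive`, `pendingPollIds`, and `createdAt`, and the helpers `syncNewPolls` and `completePoll`. In a real app the filter step would be a Firestore query on the polls' creation timestamp rather than an array filter.

```javascript
// Hypothetical field names: lastActive, pendingPollIds, createdAt.
function syncNewPolls(profile, polls, now) {
  // "query for new polls": everything created since the user was last active
  const newIds = polls
    .filter((p) => p.createdAt > profile.lastActive)
    .map((p) => p.id);
  // "add them to user profile", without duplicating ids already present
  return {
    ...profile,
    pendingPollIds: [...new Set([...profile.pendingPollIds, ...newIds])],
    lastActive: now,
  };
}

// Remove a poll from the profile once the user has answered it.
function completePoll(profile, pollId) {
  return {
    ...profile,
    pendingPollIds: profile.pendingPollIds.filter((id) => id !== pollId),
  };
}

// Usage with made-up data.
const allPolls = [
  { id: "p1", createdAt: 100 },
  { id: "p2", createdAt: 200 },
  { id: "p3", createdAt: 300 },
];
let userProfile = { userId: "u1", lastActive: 150, pendingPollIds: ["p1"] };
userProfile = syncNewPolls(userProfile, allPolls, 400); // pending: p1, p2, p3
userProfile = completePoll(userProfile, "p2");          // pending: p1, p3
```

With this layout, finding the polls a user has not completed is a single read of their profile document.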

Firestore pagination by offset

I would like to create two queries with pagination. With the first one I would like to get the first ten records, and with the second one all the remaining records:
.startAt(0)
.limit(10)
.startAt(9)
.limit(null)
Can anyone confirm that the above code is correct for both conditions?
Firestore does not support index or offset based pagination. Your query will not work with these values.
Please read the documentation on pagination carefully. Pagination requires that you provide a document reference (or field values in that document) that defines the next page to query. This means that your pagination will typically start at the beginning of the query results, then progress through them using the last document you see in the prior page.
From CollectionReference:
offset(offset) → {Query}
Specifies the offset of the returned results.
As Doug mentioned, Firestore does not support Index/offset - BUT you can get similar effects using combinations of what it does support.
Firestore has its own internal sort order (usually the document.id), but any query can be sorted with .orderBy(), and the first document will be relative to that sorting - only an orderBy() query has a real concept of a "0" position.
Firestore also allows you to limit the number of documents returned with .limit(n)
.endAt(), .endBefore(), .startAt(), .startAfter() all need either the field values matching the orderBy fields, or a DocumentSnapshot - NOT an index
what I would do is create a Query:
const MyOrderedQuery = FirebaseInstance.collection().orderBy()
Then first execute
MyOrderedQuery.limit(n).get()
or
MyOrderedQuery.limit(n).onSnapshot()
Either way you end up with a QuerySnapshot (get() resolves to one, onSnapshot() passes one to its listener), which contains an array of the DocumentSnapshots. Let's save that array
let ArrayOfDocumentSnapshots = QuerySnapshot.docs;
Warning, Will Robinson! JavaScript assignment of objects and arrays is by reference,
and even the spread operator copies only shallowly - make sure your code actually
copies the full deep structure or that the reference is kept around!
Then to get the "rest" of the documents as you ask above, I would do:
MyOrderedQuery.startAfter(ArrayOfDocumentSnapshots[n-1]).get()
or
MyOrderedQuery.startAfter(ArrayOfDocumentSnapshots[n-1]).onSnapshot()
which will start AFTER the last returned document snapshot of the FIRST query. Note the re-use of the MyOrderedQuery
You can get something like a "pagination" by saving the ordered Query as above, then repeatedly use the returned Snapshot and the original query
MyOrderedQuery.startAfter(ArrayOfDocumentSnapshots[n-1]).limit(n).get() // page forward
MyOrderedQuery.endBefore(ArrayOfDocumentSnapshots[0]).limitToLast(n).get() // page back
This does make your state management more complex - you have to hold onto the ordered Query, and the last returned QuerySnapshot - but hey, now you're paginating.
BIG NOTE
This is not terribly efficient - setting up a listener is fairly "expensive" for Firestore, so you don't want to do it often. Depending on your document size(s), you may want to "listen" to larger sections of your collections, and handle more of the paging locally (Redux or whatever) - Firestore Documentation indicates you want your listeners around at least 30 seconds for efficiency. For some applications, even pages of 10 can be efficient; for others you may need 500 or more stored locally and paged in smaller chunks.
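To make the cursor semantics concrete, here is a small sketch that mimics them on a plain array instead of calling Firestore. `pageForward` and `pageBack` are hypothetical names, and `orderedDocs` stands in for the full result of the ordered query. Paging back is modelled as "the last n documents before the cursor", which is what endBefore() combined with limitToLast(n) gives; endBefore() with a plain limit(n) would return the first n documents of the whole query instead.

```javascript
// Plain-array stand-ins for the Firestore cursor calls.
// Documents are identified by a hypothetical `id` field.

// Behaves like startAfter(lastDoc).limit(n); pass null for the first page.
function pageForward(ordered, lastDoc, n) {
  const start = lastDoc ? ordered.findIndex((d) => d.id === lastDoc.id) + 1 : 0;
  return ordered.slice(start, start + n);
}

// Behaves like endBefore(firstDoc).limitToLast(n).
function pageBack(ordered, firstDoc, n) {
  const end = ordered.findIndex((d) => d.id === firstDoc.id);
  if (end < 0) return []; // cursor document not found in this sketch
  return ordered.slice(Math.max(0, end - n), end);
}

// Usage: seven documents, pages of three.
const orderedDocs = [1, 2, 3, 4, 5, 6, 7].map((id) => ({ id }));
const pageOne = pageForward(orderedDocs, null, 3);         // ids 1, 2, 3
const pageTwo = pageForward(orderedDocs, pageOne[2], 3);   // ids 4, 5, 6
const pageThree = pageForward(orderedDocs, pageTwo[2], 3); // id 7
const backToTwo = pageBack(orderedDocs, pageThree[0], 3);  // ids 4, 5, 6
```

The state to keep between pages is just the first and last document of the page currently shown, plus the ordered query itself.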

MongoDB, sort an array that is the value of a field of a document, with a slice on that array

I have a Mongo collection for profile comments, which is structured like this:
{
"_id": "",
"comments": []
}
The id refers to the ID of the user from the profiles collection. To reduce server load, I only want to show 10 comments per page. To do this, I use Mongo's $slice operator. Here is my code for the query.
mongoCols.profileComments.findOne({"_id":doc._id},{comments:{$slice:[(comPage-1)*10,10]}},function(err,doc) {
Here's the problem. I want to show the comments in reverse order, meaning that the newest comments are on the first page, and the oldest comments are on the last. I thought of a few solutions to this but they aren't very efficient.
1) I could retrieve the entire array, then use JavaScript (I'm using nodejs for this) to sort that array, then only take the 10 elements that I want. This seems inefficient because I'm asking Mongo to retrieve what is potentially a ton of elements from an array, when I only need 10.
2) I could make each comment a separate document, with a field saying what user the comment is for. I could then only find documents where the comment was sent to the requested user, and use the skip and limit options to only retrieve the 10 documents I want. My problem with this is that Mongo will have to go through almost every single comment every time you request for a user's comments. This seems inefficient, but it is my best solution so far.
I would prefer to keep the structure I currently have, but if I need to change for it to work, then I will comply.
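One possibility that keeps the current structure, offered as a sketch and not as a tested answer: since $slice accepts a [skip, limit] pair, the page can be counted from the end of the array instead of the start. The version below assumes comments are appended in chronological order and that the total number of comments is known (for example from a hypothetical counter field kept next to the array). `reverseSliceArgs` and `commentsPage` are made-up names, and Array.prototype.slice stands in for MongoDB's $slice so the arithmetic can be checked locally.

```javascript
// Computes [skip, limit] for { comments: { $slice: [skip, limit] } } so that
// page 1 holds the newest comments. Returns null when the page is past the oldest one.
function reverseSliceArgs(total, page, pageSize) {
  const end = total - (page - 1) * pageSize; // exclusive end index of this page
  if (end <= 0) return null;
  const start = Math.max(0, end - pageSize); // the last page may be shorter
  return [start, end - start];
}

// In-memory check, with Array.prototype.slice in place of $slice.
function commentsPage(comments, page, pageSize) {
  const args = reverseSliceArgs(comments.length, page, pageSize);
  if (!args) return [];
  const [skip, limit] = args;
  return comments.slice(skip, skip + limit).reverse(); // newest first within the page
}

// Usage: 25 comments, c1 is the oldest and c25 the newest.
const sampleComments = Array.from({ length: 25 }, (_, i) => `c${i + 1}`);
const newestPage = commentsPage(sampleComments, 1, 10); // c25 down to c16
const oldestPage = commentsPage(sampleComments, 3, 10); // c5 down to c1
```

Only the 10 requested elements leave the database; reversing them on the Node.js side is cheap.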

How to implement unread/new posts/comments in NoSQL document store like Mongodb?

I've searched and didn't find any exact answer to this common problem.
I would like to show users new/unread posts, for example as a list of topics with unread posts.
If a user opens any of those topics, it is automatically marked as read and no longer shows in that list when they click on unread posts. There should also be a way to mark all as read.
I was thinking about maybe showing unread posts only from last 30 days, so data would not be so big.
The obvious and best solution would be to embed an array of objects in each topic document: every element would hold a userid and the timestamp of that user's last view of the topic, and I would just compare the timestamp of the last post in the thread to the timestamp of the user's last view of that thread.
Only 2-3 queries then would be needed to show results to the user.
So for example it would look like this:
{
_id: uniqueObjectid,
id: topic_id,
topic: topic of the thread,
last_update: timestamp of last reply to that topic,
reads: [
{id: userid, last_view: timestamp of last view on this topic by user},
...
]
}
I would delete all threads from this collection that have last_update field older than 30 days.
Showing then unread posts for users would be very easy, just compare last_update with last_view for certain userid.
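As a sketch of that comparison on plain objects (field names follow the document shape above, the data and the helper names `unreadTopics` and `markAsRead` are made up):

```javascript
// A topic is unread for a user if they never viewed it,
// or if their last view is older than the last reply.
function unreadTopics(topics, userId) {
  return topics.filter((t) => {
    const read = t.reads.find((r) => r.id === userId);
    return !read || read.last_view < t.last_update;
  });
}

// Opening a topic records (or replaces) the user's last_view timestamp.
function markAsRead(topic, userId, now) {
  const others = topic.reads.filter((r) => r.id !== userId);
  return { ...topic, reads: [...others, { id: userId, last_view: now }] };
}

// Usage with made-up timestamps.
let forumTopics = [
  { id: "t1", last_update: 50, reads: [{ id: "u1", last_view: 60 }] },
  { id: "t2", last_update: 80, reads: [{ id: "u1", last_view: 60 }] },
  { id: "t3", last_update: 30, reads: [] },
];
const unreadBefore = unreadTopics(forumTopics, "u1"); // t2, t3
forumTopics = forumTopics.map((t) => (t.id === "t2" ? markAsRead(t, "u1", 90) : t));
const unreadAfter = unreadTopics(forumTopics, "u1");  // t3
```

Note that the `find` call scans the whole reads array for each topic, which is where the cost of large embedded arrays comes from.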
But it's not a good solution: from what I've read, the way arrays are implemented in MongoDB makes that solution very slow. Imagine storing the last view of some topic for 1000 users; that means 1000 indexed array elements.
So it can't be done in arrays.
Here Asya from MongoDB describes why big embedded arrays should not be used link
I am having difficulties to think of any other efficient way of solving this issue.

Returning custom fields in MongoDB

I have a mongoDB collection with an array field that represents the lists the user is member of.
user {
screen_name: string
listed_in: ['list1', 'list2', 'list3', ...] //Could be more than 10000 elements (I'm aware of the BSON 16MB limits)
}
I am using the *listed_in* field to get the members list
db.user.find({'listed_in': 'list2'});
I also need to query for a specific user and know if he is member of certain lists
var user1 = db.user.findOne({'screen_name': 'user1'});
In this case I will get the *listed_in* field with all its members.
My question is:
Is there a way to pre-compute custom fields in mongoDB?
I would need to be able to get fields like these, user1.isInList1, user1.isInList2
Right now I have to do it in the client side by iterating through the *listed_in* array to know if the user is member of "list1" but *listed_in* could have thousand elements.
My question is: Is there a way to pre-compute custom fields in mongoDB?
Not really. MongoDB does not have any notion of "computed columns". So the query you're looking for doesn't exist.
Right now I have to do it in the client side by iterating through the *listed_in* array to know if the user is member of "list1" but *listed_in* could have thousand elements
In your case you're basically trying to push a client-side for loop onto the server. However, some process still has to do the for loop. And frankly, looping through 10k items is not really that much work for either client or server.
The only real savings here is preventing extra data on the network.
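For reference, the client-side loop can be kept to a single pass over the array. This is a minimal sketch on a plain object; `membershipFlags` is a hypothetical helper, and the result is keyed by list name instead of isInList1-style properties.

```javascript
// Builds a { listName: boolean } map from the listed_in array.
function membershipFlags(user, listNames) {
  const lists = new Set(user.listed_in); // one pass over the array
  const flags = {};
  for (const name of listNames) {
    flags[name] = lists.has(name); // constant-time lookup per list
  }
  return flags;
}

// Usage with made-up data.
const sampleUser = { screen_name: "user1", listed_in: ["list1", "list3"] };
const listFlags = membershipFlags(sampleUser, ["list1", "list2"]);
// listFlags.list1 is true, listFlags.list2 is false
```

Building the Set costs one traversal of listed_in, after which any number of membership checks are cheap.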
If you really want to save that network traffic, you will need to restructure your data model. This re-structure will likely involve two queries to read and write, but less data over the wire. But that's the trade-off.