Firestore pagination by offset - google-cloud-firestore

I would like to create two queries, with pagination option. On the first one I would like to get the first ten records and the second one I would like to get the other all records:
.startAt(0)
.limit(10)
.startAt(9)
.limit(null)
Can anyone confirm that above code is correct for both condition?

Firestore does not support index or offset based pagination. Your query will not work with these values.
Please read the documentation on pagination carefully. Pagination requires that you provide a document reference (or field values in that document) that defines the next page to query. This means that your pagination will typically start at the beginning of the query results, then progress through them using the last document you see in the prior page.

From CollectionReference:
offset(offset) → {Query}
Specifies the offset of the returned results.

As Doug mentioned, Firestore does not support Index/offset - BUT you can get similar effects using combinations of what it does support.
Firestore has it's own internal sort order (usually the document.id), but any query can be sorted .orderBy(), and the first document will be relative to that sorting - only an orderBy() query has a real concept of a "0" position.
Firestore also allows you to limit the number of documents returned .limit(n)
.endAt(), .endBefore(), .startAt(), .startBefore() all need either an object of the same fields as the orderBy, or a DocumentSnapshot - NOT an index
what I would do is create a Query:
const MyOrderedQuery = FirebaseInstance.collection().orderBy()
Then first execute
MyOrderedQuery.limit(n).get()
or
MyOrderedQuery.limit(n).get().onSnapshot()
which will return one way or the other a QuerySnapshot, which will contain an array of the DocumentSnapshots. Let's save that array
let ArrayOfDocumentSnapshots = QuerySnapshot.docs;
Warning Will Robinson! javascript settings is usually by reference,
and even with spread operator pretty shallow - make sure your code actually
copies the full deep structure or that the reference is kept around!
Then to get the "rest" of the documents as you ask above, I would do:
MyOrderedQuery.startAfter(ArrayOfDocumentSnapshots[n-1]).get()
or
MyOrderedQuery.startAfter(ArrayOfDocumentSnapshots[n-1]).onSnapshot()
which will start AFTER the last returned document snapshot of the FIRST query. Note the re-use of the MyOrderedQuery
You can get something like a "pagination" by saving the ordered Query as above, then repeatedly use the returned Snapshot and the original query
MyOrderedQuery.startAfter(ArrayOfDocumentSnapshots[n-1]).limit(n).get() // page forward
MyOrderedQuery.endBefore(ArrayOfDocumentSnapshots[0]).limit(n).get() // page back
This does make your state management more complex - you have to hold onto the ordered Query, and the last returned QuerySnapshot - but hey, now you're paginating.
BIG NOTE
This is not terribly efficient - setting up a listener is fairly "expensive" for Firestore, so you don't want to do it often. Depending on your document size(s), you may want to "listen" to larger sections of your collections, and handle more of the paging locally (Redux or whatever) - Firestore Documentation indicates you want your listeners around at least 30 seconds for efficiency. For some applications, even pages of 10 can be efficient; for others you may need 500 or more stored locally and paged in smaller chucks.

Related

How many reads does using .contains() produce? - Flutter/Firebase

I'm using Firebase & Flutter and wondering how many reads does using .contains() produce?
For example: Let's say we have a button that you can click on and whenever the current user clicks on it, it takes their UID and stores it in an array/list in the Firestore Database.
Then I want to check if the current user's UID is inside of that list. To do this, I'm using .contains(uid). So as an example, let's say the list contains a total of 10 different values/UIDs. Does that mean using .contains() would produce 10 reads or only 1?
Firestore read count based only on the entire document. Also you are fetching the doc and do .contains() in the client side. So this results in only 1 read.
By the way, you can add in an array without reading the doc. It adds to the array only if it does not have the item.
FirebaseFirestore.instance
.collection('<collection name>')
.doc('<docId>').update({'<arrayFieldKey>':FieldValue.arrayUnion([<new item>])});
Hope it helps!

query modified documents and load other from cache in firestore

To reduce the number of reads it is a general technique to maintain timestamp of last edits in documents and comparing timestamp to load only modified documents.
Here is an example from firebase docs:
db.collection('groups')
.where('participants', 'array-contains', 'user123')
.where('lastUpdated', '>', lastFetchTimestamp)
.orderBy('lastUpdated', 'desc')
.limit(25)
They claim this would reduce the reads.
I tried implementing the use-case, I have a document as shown below:
I have sections in my app where I use scorecards to list top scorers, My query is as follows
private void loadFriendScores(UserScorecard scorecard) {
Query friendScoreQuery=scorecardRef.whereIn("uid", scorecard.getFriendsList())
.whereGreaterThan("lastActive", scorecard.getLastActive()).limit(5);
FirestoreRecyclerOptions<UserScorecard> friends = new FirestoreRecyclerOptions
.Builder<UserScorecard>()
.setQuery(friendScoreQuery, UserScorecard.class)
.setLifecycleOwner(getViewLifecycleOwner())
.build();
TopScoresAdapter friendsAdapter = new TopScoresAdapter(friends, getContext(), this);
binding.topScorersFriendsRcv.setAdapter(friendsAdapter);
binding.topScorersFriendsRcv.setLayoutManager(new LinearLayoutManager(getContext()));
}
I assumed the query to load all modified changes along with others (from cache):
The screen on android is as follows:
While I expected it to load all of my friendlist (as I understood from docs).
I suppose they did not mention that we need to fetch the cached list, there is a way to do a cached request in firestore.
but I'm not sure if this is reliable perhaps the cache will be cleaned and the last request would be empty ,
then, you should save the last response using the localstorage library
#react-native-async-storage/async-storage
I'm struggling myself with the costs issue. The reads are way higher then 50 reads and I'm not sure how to count them properly. so I upvoted the issue

Implement a firestore infinite scolling list which updates on collection changes

What am I trying to accomblish?
I am currently facing a bunch of problems implementing a real time updated infinite scrolling list with the firestore backend.
In my application I want to display comments (like in e.g. YouTube or other social media sites) to the user. Since the number of comments in a collection might be quite big, I see an option to paginate the collection, while receiving real time updates based on snapshots. So I initially load x comments with the option to load up to x more items whenever the user presses a button. In the image below x = 3.
The standard solution
Based on other SO questions I figured out that one is supposed to use the .limit() and the .startAfter() methods to implement such behaviour.
So the first page is loaded as:
query = this
.collection
.orderBy('date', descending: true)
.limit(pageSize);
query.snapshots().map((QuerySnapshot snap) {
lastVisible = snap.documents.last;
// convert the DocumentSnapshot into model object
});
All additional pages are loaded with the following code:
query = this.collection
.orderBy('date', descending: true)
.startAfterDocument(lastVisible)
.limit(pageSize);
Furthermore, I'd like to add that this code is located in a repository class which is used with the BLoC pattern similar to the code shown in Felix Angelov's Flutter Todos Tutorial.
While Felix uses a simple flutter list to show the items, I have a list of pages showing comments based on the data provided by their BLoCs. Note that each BLoC accesses a shared repository (parts of the repository code is shown below).
The Problem with the standard solution
With the code shown above I see multiple problems:
If a comment is inserted in the middle of the ordered collection (how is not of importance), the added comment is shown because of the Stream provided by the snapshot. However, another comment that already existed is not longer shown because of the .limit() operator in the query. One could increase the limit by one but I'm not sure how to edit a snapshot query. In the case that editing a snapshot query is not possible, one could create a new (and bigger) query, but that would cost additional reads.
Similar to 1., if a comment in the middle is deleted, the snapshot will return a list which does not longer contain the deleted comment, however another comment (which is already covered by a different page) appears. E.g., in the scenario shown in the image above 5 comments are loaded. Assuming that comment 3 is deleted, comment 2 will show twice.
Improving the standard solution
Based on these two problems discussed above, I decided that the solution is not sufficient and I implemented a solution which first loads x items by obtaining two "interval" documents. Then a query which fetches the required items in an interval using .startAtDocument() and .endAtDocument() is created, which eliminates the .limit() operator.
DocumentSnapshot pageStartDocument;
DocumentSnapshot pageEndDocument;
Future<Stream<List<Comment>>> comments() async {
// This fetches the first and next Document as initialization
// (maybe should be implemented in constructor)
if (pageStartDocument == null) {
Query query = collection
.orderBy('date', descending: true)
.limit(pageSize);
QuerySnapshot snap = await query.getDocuments();
pageStartDocument = snap.documents.first;
pageEndDocument = snap.documents.last;
} else {
Query query = collection
.orderBy('date', descending: true)
.startAfterDocument(pageEndDocument)
.limit(pageSize);
QuerySnapshot snap = await query.getDocuments();
pageStartDocument = snap.documents.first;
pageEndDocument = snap.documents.last;
}
// This fetches a subcollection of elements from the collection
// with the tradeof of double the reads
Query query = this
.collection
.orderBy('date', descending: true)
.startAtDocument(pageStartDocument)
.endAtDocument(pageEndDocument);
return query.snapshots().asyncMap((QuerySnapshot snap) async {
// convert the QuerySnapshot into model objects
});
As commented in the code, this solution has the following drawback:
Since a query is required to obtain the pageStartDocument and pageEndDocument, the number of reads is doubled, because all the data is read again when the second query is created. The performance impact might be neglectable because I believe the data is cashed, however having 2x database read cost can be significant.
Question:
Since I am not only implementing pagination but also real time updates (with collection insertions), the .limit() operator seems to be not working in my case.
How does one implement a pagination with real time updates (without double reads)?
Side Notes:
I watched how Todd Kerpelman devoures a massive gummy bear while explaining pagination, but in the video it seems to be not so trivial (and a point was made that a tradeoff might be necessary).
If further code from my side is required please say so in the comments.
For the scenario of comments it does not really makes sense that an item is inserted into the middle of the (sorted) collection. However I would like to understand how it should be implemented if the scenario requires such a feature.
this may come as a very late answer. The OP probably won't need help anymore, however for anyone who should stumble on this I wrote a tutorial with a solution that partly solve this:
the Bloc keep a list of stream subscription to keep trace of realtime updates to the list.
however concerning the insertion problem, since when you will have paginated streams based on a document cursor, upon insertion or deletion you necessarily need to reset your pagination stream subscriptions unless it is the last page.
Hence my solution around it was to update the list when modifications occur but reset it when insertions or deletions occur.
Here is the link to the tutorial :
https://link.medium.com/2SPf2Qsbsgb

Swift and Cloud Firestore Transactions - getDocuments?

Transactions in Cloud Firestore support getting a document using transaction.getDocument, but even though there is a .getDocuments method, there doesn’t seem to be a .getDocuments for getting multiple documents that works with a transaction.
I have a Yelp-like app using a Cloud Firestore database with the following structure:
- Places to rate are called spots.
- Each spot has a document in the spots collection (identified by a unique documentID).
- Each spot can have a reviews collection containing all reviews for that spot.
- Each review is identified by its own unique documentID, and each review document contains a rating of the spot.
Below is an image of my Cloud Firestore setup with some data.
I’ve tried to create a transaction getting data for all of the reviews in a spot, with the hope that I could then make an updated calculation of average review & save this back out to a property of the spot document. I've tried using:
let db = Firestore.firestore()
db.runTransaction({ (transaction, errorPointer) -> Any? in
let ref = db.collection("spots").document(self.documentID).collection("reviews")
guard let document = try? transaction.getDocuments(ref) else {
print("*** ERROR trying to get document for ref = \(ref)")
return nil
}
…
Xcode states:
Value of type ‘Transaction’ has no member ‘getDocuments’.
There is a getDocument, which that one can use to get a single document (see https://firebase.google.com/docs/firestore/manage-data/transactions).
Is it possible to get a collection of documents in a transaction? I wanted to do this because each place I'm rating (spot) has an averageRating, and whenever there's a change to one of the ratings, I want to call a function that:
- starts a transaction (done)
- reads in all of the current reviews for that spot (can't get to work)
- calculates the new averageRating
- updates the spot with the new averageRating value.
I know Google's FriendlyEats uses a technique where each change is applied to the current average rating value, but I'd prefer to make a precise re-calculation with each change to keep numerical precision (even if it's a bit more expensive w/an additional query).
Thanks for advice.
No. Client libraries do not allow you to make queries inside of transactions. You can only request specific documents inside of a query. You could do something really hacky, like run the query outside of the transaction, then request every individual document inside the transaction, but I would not recommend that.
What might be better is to run this on the server side. Like, say, with a Cloud Function, which does allow you to run queries inside transactions. More importantly, you no longer have to trust the client to update the average review score for a restaurant, which is a Bad Thing.
That said, I still might recommend using a Cloud Function that does some of the same logic that Friendly Eats does, where you say something along the lines of New average = Old average + new review / (Total number of reviews) It'll make sure you're not performing excessive reads if your app gets really popular.

IBM Cloudant DB - get historical data - best way?

I'm pretty confused concerning this hip thing called NoSQL, especially CloudantDB by Bluemix. As you know, this DB doesn't store the values chronologically. It's the programmer's task to sort the entries in case he wants the data to.. well.. be sorted.
What I try to achive is to simply get the last let's say 100 values a sensor has sent to Watson IoT (which saves everything in the connected CloudantDB) in an ORDERED way. In the end it would be nice to show them in a D3.css style kind of graph but that's another task. I first need the values in an ordered array.
What I tried so far: I used curl to get the data via PHP from https://averylongID-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25/_all_docs?limit=20&include_docs=true';
What I get is an unsorted array of 20 row entries with random timestamps. The last 20 entries in the DB. But not in terms of timestamps.
My question is now: Do you know of a way to get the "last" 20 entries? Sorted by timestamp? I did a POST request with a JSON string where I wanted the data to be sorted by the timestamp, but that doesn't work, maybe because of the ISO timestamp string.
Do I really have to write a javascript or PHP script to get ALL the database entries and then look for the 20 or 100 last entries by parsing the timestamp, sorting the array again and then get the (now really) last entries? I can't believe that.
Many thanks in advance!
I finally found out how to get the data in a nice ordered way. The key is to use the _design api together with the _view api.
So a curl request with the following URL / attributes and a query string did the job:
https://alphanumerical_something-bluemix.cloudant.com/iotp_orgID_iotdb_2018-01-25/_design/iotp/_view/by-date?limit=120&q=name:%27timestamp%27
The curl result gets me the first (in terms of time) 120 entries. I just have to find out how to get the last entries, but that's already a pretty good result. I can now pass the data on to a nice JS chart and display it.
One option may be to include the timestamp as part of the ID. The _all_docs query returns documents in order by id.
If that approach does not work for you, you could look at creating a secondary index based on the timestamp field. One type of index is Cloudant Query:
https://console.bluemix.net/docs/services/Cloudant/api/cloudant_query.html#query
Cloudant query allows you to specify a sort argument:
https://console.bluemix.net/docs/services/Cloudant/api/cloudant_query.html#sort-syntax
Another approach that may be useful for you is the _changes api:
https://console.bluemix.net/docs/services/Cloudant/api/database.html#get-changes
The changes API allows you to receive a continuous feed of changes in your database. You could feed these changes into a D3 chart for example.