Reading from the same document multiple times in Firestore - google-cloud-firestore

If I have a function that reads the same document from Firestore multiple times, does each read count towards the read count?
Or does the SDK use the cached version and so only add a single count?
I forgot to add: this is a question about the Admin SDK in a Cloud Function.

The key thing to realize is that you're charged for every document that is read for you on (and usually downloaded from) the server. So if a document is read from the cache, that usually won't count as a charged document read. But if the client needs to check with the server whether its local copy is up to date (the average document-level get() call), that does lead to a document read charge.
The Admin SDKs don't have a persistent cache, so in general each read would have to reach out to the server - and thus count as a charged document read. But some of it depends on how you actually perform the read operation, so it'll be easier to help if you can show an MCVE for that.
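To make the counting concrete, here is a minimal sketch that uses an in-memory stand-in instead of the real Admin SDK. `CountingStore` and `InvocationCache` are hypothetical names invented for this illustration; the only assumption carried over from the answer is that every fetch that reaches the server is one charged read.

```python
from typing import Callable, Dict, Optional


class CountingStore:
    """In-memory stand-in for the database; counts fetches that reach it."""

    def __init__(self, docs: Dict[str, dict]) -> None:
        self._docs = docs
        self.reads = 0  # each fetch here models one charged document read

    def fetch(self, path: str) -> Optional[dict]:
        self.reads += 1
        return self._docs.get(path)


class InvocationCache:
    """Memoizes documents for the lifetime of one function invocation."""

    def __init__(self, fetch: Callable[[str], Optional[dict]]) -> None:
        self._fetch = fetch
        self._seen: Dict[str, Optional[dict]] = {}

    def get(self, path: str) -> Optional[dict]:
        if path not in self._seen:
            # Only the first request for a path reaches the store.
            self._seen[path] = self._fetch(path)
        return self._seen[path]


store = CountingStore({"users/alice": {"name": "Alice"}})

# Without memoization: three fetches, three counted reads.
for _ in range(3):
    store.fetch("users/alice")
uncached_reads = store.reads

# With memoization: three gets, one counted read.
store.reads = 0
cache = InvocationCache(store.fetch)
for _ in range(3):
    cache.get("users/alice")
cached_reads = store.reads
```

The trade-off is staleness: a memoized copy will not reflect writes made after the first fetch, so this only suits values that may safely stay fixed for the duration of the invocation.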

Related

How does cloud FireStore read count work?

This is an image of my project, which has only one document, map1.
Inside the document there are multiple map objects.
When I fetch only the map1 document, my read count increases by 25, and sometimes by 18.
So my question is: why does it increase like this?
Second, I read in the Firestore documentation that
the read count increases according to the number of documents returned by a query.
On the second question first:
The read count you refer to is called the document read count. As that name implies, it is incremented by one for every document that is read on the server on your behalf. So if you request a number of documents from the server, you will be charged for that many document reads.
The first question is harder to say, because we have no way of reproducing the issue based on your post. But the most common cause of unexpected reads for folks new to Firestore is keeping the Firebase console open.
If you have the Firestore console open, it also reads documents; and those are charged document reads too.

MongoDb concurrency best practices

I am new to MongoDB. I am creating an application that manages a very large list of items (resources), and for each resource the application should manage a kind of booking.
My idea is to embed the booking document inside the resource document, and to avoid concurrency problems I need to lock the resource during booking.
I see that MongoDB allows locks at the collection level, but this would create a bottleneck in the booking functionality, because all resources in the collection would be locked while the current booking is in progress. With a large number of users and resources, this solution would perform poorly.
In addition, if a deadlock occurs while booking one resource, all resources will be locked.
Are there alternative solutions or best practices to improve performance and scalability of this use case?
A possible solution would be to lock not at the collection level but at the document level (the resource in my example), so that a user booking one resource doesn't block another user from booking a different resource. Even in that case I am not sure of the final result, because write commands are not executed in parallel: I suppose I'll probably also need a cluster of servers to handle multiple writes in parallel.
You are absolutely right, you should definitely not lock the entire collection for just updating a single document.
Now this problem depends on how you update your document.
If you update your document with a single update query, then since document update is atomic you would have no problem.
But if you first have to read the document, change the document, save the document, then you would have the concurrency problem. Just before you save the changed document, it could be updated by some other request and the document you have read would no longer be up to date, hence your new updates will not be right either.
A simple solution to this concurrency problem is to store a version number (Mongoose calls this field __v) in each of your documents, and to increment it on every update. Then every time you do a read, change, and update, you make the write conditional on the version in the database still being identical to the version you read. When the version number differs, the update fails and you can simply try again.
If you are using Node.js, then you are probably using Mongoose, which adds the __v version key to each document. Note that by default Mongoose only uses it to guard certain array modifications on save(); for a version check on every save(), enable the optimisticConcurrency schema option (available in recent Mongoose releases).
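The read, change, and conditional-write loop can be sketched without a database at all. The following is an illustration under stated assumptions, not MongoDB's or Mongoose's actual implementation: `VersionedStore`, `update_if_version`, and `update_with_retry` are hypothetical names, and the lock only models the fact that a single-document update is atomic.

```python
import threading
from typing import Callable, Dict


class VersionedStore:
    """In-memory stand-in for a collection whose documents carry a version."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}
        self._lock = threading.Lock()  # models atomic single-document updates

    def insert(self, doc_id: str, data: dict) -> None:
        with self._lock:
            self._docs[doc_id] = {**data, "version": 0}

    def read(self, doc_id: str) -> dict:
        with self._lock:
            return dict(self._docs[doc_id])

    def update_if_version(self, doc_id: str, expected: int, data: dict) -> bool:
        """Write only if the stored version still equals `expected`."""
        with self._lock:
            if self._docs[doc_id]["version"] != expected:
                return False  # another writer updated the document first
            self._docs[doc_id] = {**data, "version": expected + 1}
            return True


def update_with_retry(
    store: VersionedStore,
    doc_id: str,
    change: Callable[[dict], dict],
    max_attempts: int = 5,
) -> dict:
    """Re-read and retry whenever the conditional write loses the race."""
    for _ in range(max_attempts):
        current = store.read(doc_id)
        version = current.pop("version")
        if store.update_if_version(doc_id, version, change(current)):
            return store.read(doc_id)
    raise RuntimeError("too many conflicting updates")


store = VersionedStore()
store.insert("room-1", {"bookings": []})
result = update_with_retry(
    store, "room-1", lambda doc: {**doc, "bookings": doc["bookings"] + ["alice"]}
)
```

Against a real MongoDB collection, the same idea is usually expressed by putting the version you read into the filter of a single update and incrementing it in that same update, so the check and the write happen atomically on the server.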

Is Google Firestore get() request for a non existing document, charged?

I recently realized that even if a Firestore query doesn't match any document, I will still be charged for 1 read.
In my case, there could be lots of queries for non-existing docs, and I want to avoid this cost.
In my case, the client already has (or can generate locally) the relevant document Id beforehand, but the client still doesn't know if this document exists or not.
So instead of querying and receiving the doc, I can do get(docId)
Question: Does the Firestore charge for replying error to a get() request of the non-existing document?
A get() call for a document that requires the server to read data is charged as a document read. Since the server needs to check whether the document exists, that is a charged read operation (as far as I know).
The documentation on Firestore pricing says:
Minimum charge for queries
There is a minimum charge of one document read for each query that you
perform, even if the query returns no results.
So it sounds like you will be charged. The important thing to realize is that the indexes that Firestore uses to manage your documents take time and space to maintain, so if you make use of an index, it's reasonable to expect that it's going to cost money because of the resources consumed.
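As a worked example of the rule quoted above, the arithmetic can be written down directly. This is only a sketch of the quoted pricing rule (one read per returned document, with a minimum of one read per query); the function names are hypothetical and actual billing may include details not covered here.

```python
from typing import Iterable


def charged_reads_for_query(documents_returned: int) -> int:
    """One read per returned document, with a minimum of one read per query."""
    if documents_returned < 0:
        raise ValueError("documents_returned must not be negative")
    return max(1, documents_returned)


def total_charged_reads(result_sizes: Iterable[int]) -> int:
    """Sum the charge over several queries, given how many documents each returned."""
    return sum(charged_reads_for_query(n) for n in result_sizes)


# Two queries that match nothing, one that returns 3 documents, one that returns 25.
example_total = total_charged_reads([0, 0, 3, 25])
```

Under this rule, many lookups for documents that do not exist still add up: each empty query contributes one read to the total.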

Firestore full collection update for schema change

I am attempting to figure out a solid strategy for handling schema changes in Firestore. My thinking is that schema changes would often require reading and then writing to every document in a collection (or possibly documents in a different collection).
Here are my concerns:
I don't know how large the collection will be in the future. Will I hit any limitations on how many documents can be read in a single query?
My current plan is to run the schema change script from Cloud Build. Is it possible this will timeout?
What is the most efficient way to do the actual update? (e.g. read document, write update to document, repeat...)
Should I be using batched writes?
Also, feel free to tell me if you think this is the completely wrong approach to implementing schema changes, and suggest a better solution.
I don't know how large the collection will be in the future. Will I hit any limitations on how many documents can be read in a single query?
If the number of documents gets too large to handle in a single query, you can start paginating the results.
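Pagination here means repeatedly asking for the next fixed-size page, starting after the last document already seen. A minimal sketch of that loop, using a plain dictionary as a stand-in for the collection (`fetch_page` and `iterate_collection` are hypothetical helpers; in the real SDK the equivalent is a query ordered by a field, with a start-after cursor and a limit):

```python
from typing import Dict, Iterator, List, Optional, Tuple


def fetch_page(
    docs: Dict[str, dict], start_after: Optional[str], limit: int
) -> List[Tuple[str, dict]]:
    """Stand-in for a query ordered by document ID with a cursor and a limit."""
    ids = sorted(docs)
    if start_after is not None:
        ids = [doc_id for doc_id in ids if doc_id > start_after]
    return [(doc_id, docs[doc_id]) for doc_id in ids[:limit]]


def iterate_collection(
    docs: Dict[str, dict], page_size: int
) -> Iterator[Tuple[str, dict]]:
    """Yield every document exactly once, one page at a time."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(docs, cursor, page_size)
        yield from page
        if len(page) < page_size:
            return  # a short (or empty) page means nothing is left
        cursor = page[-1][0]  # continue after the last ID seen


collection = {f"doc{i}": {"schemaVersion": 1} for i in range(7)}
visited = [doc_id for doc_id, _ in iterate_collection(collection, page_size=3)]
```

Because each page is bounded, memory use stays flat regardless of how large the collection grows; the cost is one query per page.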
My current plan is to run the schema change script from Cloud Build. Is it possible this will timeout?
That's impossible to say at this moment.
What is the most efficient way to do the actual update? (e.g. read document, write update to document, repeat...)
If you need the existing contents of a document to determine its new contents, then you'll indeed need to read it. If you don't need the existing contents, all you need is the path, and you can consider using the Node.js API to only retrieve the document IDs.
Should I be using batched writes?
Batched writes have no performance advantages. In fact, they're often slower than sending the individual update calls in parallel from your code.
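Sending individual updates in parallel usually means keeping a bounded number of writes in flight at once. The sketch below shows that shape with asyncio; `write_one` is a hypothetical placeholder for a single-document update call, and the dictionary stands in for the database, so it says nothing about real latency.

```python
import asyncio
from typing import Dict


async def write_one(
    target: Dict[str, dict], doc_id: str, data: dict, stats: Dict[str, int]
) -> None:
    """Placeholder for one individual update call."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0)  # stands in for the network round trip
    target[doc_id] = data
    stats["in_flight"] -= 1


async def write_all(
    target: Dict[str, dict], updates: Dict[str, dict], max_in_flight: int
) -> Dict[str, int]:
    """Issue every update concurrently, never exceeding max_in_flight at once."""
    stats = {"in_flight": 0, "peak": 0}
    gate = asyncio.Semaphore(max_in_flight)

    async def guarded(doc_id: str, data: dict) -> None:
        async with gate:
            await write_one(target, doc_id, data, stats)

    await asyncio.gather(*(guarded(doc_id, data) for doc_id, data in updates.items()))
    return stats


database: Dict[str, dict] = {}
pending = {f"doc{i}": {"schemaVersion": 2} for i in range(10)}
write_stats = asyncio.run(write_all(database, pending, max_in_flight=3))
```

The cap on concurrent writes is a design choice for this sketch: it keeps a large migration from opening an unbounded number of requests at once, and the right value depends on the environment the script runs in.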

Firestore: Does reading data with references increase the number of requests?

When a document in Firestore is read, Firestore won't return the data of any documents it references. So currently I am making separate requests to Firestore for the data at each reference path. Does this increase the number of requests to the server, and eventually decrease performance and increase cost? How is storing references helpful in terms of requesting data from the server?
Reading a document that has a reference counts as a read of that document. Reading the referenced document counts as a read of another document. So in total that is two reads.
There is no hidden cost-inflation here: if the server were to automatically follow the reference, it would also have to read both documents.
If you're looking to minimize the number of documents you read, you can consider adding the minimum data you need from the referenced document into the document containing the reference. For example, if you have a chat app:
You might want to include the display name of each user posting the message in the message itself, so that you don't have to read the user's profile document.
If you do so, you'll have to consider what to do if the user updates their display name. See my answer here for some options: How to write denormalized data in Firebase
The number of users is likely smaller than the number of chat messages (and rather limited in a specific time-frame), making the number of reads of linked documents lower than the number of messages.
By duplicating the data, you may be inflating the bandwidth usage, especially if the number of users is much lower than the number of messages.
What this boils down to is: you're likely optimizing prematurely, but even if not: there's no one-size-fits-all approach. NoSQL data modeling depends on the use-cases of your app, and Firestore is no different.
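The chat example can be turned into a quick read count. This sketch assumes each referenced profile is read at most once per page of messages; the field name `user_ref` and both function names are hypothetical, chosen only for the illustration.

```python
from typing import List


def reads_with_references(messages: List[dict]) -> int:
    """One read per message plus one read per distinct referenced profile."""
    distinct_users = {message["user_ref"] for message in messages}
    return len(messages) + len(distinct_users)


def reads_with_embedded_names(messages: List[dict]) -> int:
    """The display name lives in the message, so no profile reads are needed."""
    return len(messages)


chat = [
    {"text": "hi", "user_ref": "users/alice"},
    {"text": "hello", "user_ref": "users/bob"},
    {"text": "how are you?", "user_ref": "users/alice"},
    {"text": "fine", "user_ref": "users/bob"},
    {"text": "great", "user_ref": "users/alice"},
]
normalized_reads = reads_with_references(chat)
denormalized_reads = reads_with_embedded_names(chat)
```

With five messages from two users, following references costs seven reads and embedding the names costs five. The gap shrinks as the ratio of users to messages falls, which is why the answer treats this as a judgment call rather than a rule.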