Is a single Firestore write operation guaranteed to be atomic? - google-cloud-firestore

I have a Chat document that represents a chat between two users. It starts out empty, and eventually looks like this:
// chats/CHAT_ID
{
  activeUsers: {
    USER_ID1: true,
    USER_ID2: true
  },
  lastAddedUser: USER_ID2
}
Each user is connected to a different Cloud Run container via websockets.
I would like to send a welcome message to both users once the second user has connected. This message must be sent exactly once.
When a user sends a "connected" message to its websocket, the container performs something like the following:
// Returns a boolean reflecting whether the current container should emit the welcome message to both users
async addUserToChat(userId) {
  // Write operation
  await this.chatDocRef.set({ activeUsers: { [userId]: true }, lastAddedUser: userId }, { merge: true });
  // Read operation
  const chatSnap = await this.chatDocRef.get();
  const chatData = chatSnap.data();
  // activeUsers is a map, so count its keys rather than reading .length
  return Object.keys(chatData.activeUsers).length === 2 && chatData.lastAddedUser === userId;
}
And there is a working mechanism that allows container A to send a message to a user connected to container B.
The issue is that sometimes, each container ends up concluding that it is the one that should send the welcome message to both users.
I am unclear as to why that would happen given Firestore's "immediate consistency model" (per this). The only explanation I can think of that allows a race condition is that write operations involving multiple fields are not guaranteed to be atomic. So this:
await this.chatDocRef.set({ activeUsers: { [userId]: true }, lastAddedUser: userId }, { merge: true })
actually performs two separate updates for activeUsers and lastAddedUser, opening up a scenario where, after container A's partial update of activeUsers, container B completes its write and read operations before container A overwrites lastAddedUser.
But this sounds wrong.
Can anyone shed light on why race conditions might occur?

I no longer have race conditions if I base the logic on the server timestamps instead of the lastAddedUser field.
The document is now simpler:
// chats/CHAT_ID
{
  activeUsers: {
    USER_ID1: true,
    USER_ID2: true
  }
}
And the function looks like this:
// Returns a boolean reflecting whether the current container should emit the welcome message to both users
async addUserToChat(userId) {
  // Write operation
  const writeResult = await this.chatDocRef.set({ activeUsers: { [userId]: true } }, { merge: true });
  // Read operation
  const chatSnap = await this.chatDocRef.get();
  const chatData = chatSnap.data();
  // activeUsers is a map, so count its keys rather than reading .length
  return Object.keys(chatData.activeUsers).length === 2 && writeResult.writeTime.isEqual(chatSnap.updateTime);
}
In other words, the condition for sending the welcome message now becomes: the executing container is the container responsible for the update that resulted in having two users.
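The timestamp comparison can be illustrated without Firestore at all. The following is a minimal sketch, assuming a logical counter stands in for the server timestamp; FakeChatDoc and shouldSendWelcome are hypothetical names, not part of the Firestore SDK. It only shows that, when both containers write before either reads, exactly one of them sees its own write time on the snapshot.

```javascript
// Hypothetical in-memory stand-in for one chat document; NOT the Firestore SDK.
class FakeChatDoc {
  constructor() {
    this.data = { activeUsers: {} };
    this.updateTime = 0; // logical clock standing in for the server timestamp
  }

  // Merge one user into the document in a single step and report this write's time.
  addUser(userId) {
    this.updateTime += 1;
    this.data = { activeUsers: { ...this.data.activeUsers, [userId]: true } };
    return { writeTime: this.updateTime };
  }

  // Snapshot of the current data plus the time of the latest write.
  get() {
    return {
      data: { activeUsers: { ...this.data.activeUsers } },
      updateTime: this.updateTime,
    };
  }
}

// True only for the writer whose own write produced the two-user state it reads back.
function shouldSendWelcome(writeResult, snap) {
  const userCount = Object.keys(snap.data.activeUsers).length;
  return userCount === 2 && writeResult.writeTime === snap.updateTime;
}

// Interleaving: both containers write before either one reads.
const chatDoc = new FakeChatDoc();
const writeA = chatDoc.addUser("USER_ID1");
const writeB = chatDoc.addUser("USER_ID2");
const aSends = shouldSendWelcome(writeA, chatDoc.get());
const bSends = shouldSendWelcome(writeB, chatDoc.get());
console.log({ aSends, bSends }); // { aSends: false, bSends: true }
```

Note that in this sketch any later write landing between a container's write and its read would make neither container match, so the check only covers the two-user case described here.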
While the problem is solved, I am still unclear as to why relying on document data (instead of server metadata) opens up the possibility of race conditions. If anyone knows the explanation behind this phenomenon, please add an answer and I'll accept it as the solution to this question.

Related

Flutter Future timeouts not always working correctly

I need some help using timeouts correctly in Flutter. First, the main goal:
I want to receive data from my Firebase Realtime Database, but the request needs a timeout of 15 seconds. After 15 seconds the timeout should throw an exception so that the frontend can show the user a timeout alert.
So I used the simple way of applying a timeout to a Future.
The function below only checks whether an ID exists under a given Firebase node.
The class that declares this function also has an instance called timeoutControl, of a class that holds a duration and the reasons for the exceptions.
Future<bool> isUserCheckedIn(String oid, String maybeCheckedInUserIdentifier, String onGateId) async {
  try {
    databaseReference = _firebaseDatabase.ref("Boarding").child(oid).child(onGateId);
    final snapshot = await databaseReference.get().timeout(
      Duration(seconds: timeoutControl.durationForTimeOutInSec),
      onTimeout: () => timeoutControl.onEppTimeoutForTask(),
    );
    if (snapshot.hasChild(maybeCheckedInUserIdentifier)) {
      return true;
    } else {
      return false;
    }
  } catch (exception) {
    return false;
  }
}
The TimeOutClass where the instance timeoutControl comes from:
class CustomTimeouts {
  int durationForTimeOutInSec = 15; // How many seconds to keep trying before throwing a timeout exception

  CustomTimeouts();

  // TODO: Implement the exception reasons here later ...
  onEppTimeoutForUpload() {
    throw Exception("Some reason ...");
  }

  onEppTimeoutForTask() {
    throw Exception("Some reason ...");
  }

  onEppTimeoutForDownload() {
    throw Exception("Some reason ...");
  }
}
As you can see, this is the implementation I tried. It works fine most of the time, but sometimes I run into behavior I can't explain. Let me describe what goes wrong in some cases.
The frontend class makes this call:
bool isUserCheckedIn = await service.isUserCheckedIn(placeIdentifier, userId, gateId);
Map<String, dynamic> data = {"gateIdActive": isUserCheckedIn};
/*
  The response here is a custom transaction handler which contains an error or a returned param
  etc., so this isn't relevant for you ...
*/
_gateService.updateGate(placeIdentifier, gateId, data).then((response) {
  if (response.hasError()) {
    setState(() {
      EppDialog.showErrorToast(response.getErrorMessage()); // Shows an error message
      isSendButtonDiabled = false; /* Reset the button's state */
    });
  } else {
    // Create a gate process here ...
    createGateEntrys(); // <-- If the closure's update was successful we also handle some
                        // other data inside the RTDB for other reasons here ...
  }
});
It is important to know that I use the boolean returned by this function call to update other data, which is uploaded to another node in the RTDB. If that update succeeds, the application goes on to update some further entries in the RTDB via createGateEntrys(). That function is called last; it is marked async and is invoked inside the closure without an await.
The data inside my Firebase RTDB:
"GateCheckIns" / "4mrithabdaofgnL39238nH" (the place identifier) / "NFdxcfadaies45a" (the gate identifier) / "nHz2mhagadzadzgadHjoeua334" : 1 (the key is the ID of a user who is checked in)
On real devices this always works without any problems, although I doubt that real device versus simulator is the actual cause of the problem. In the simulator, the function sometimes always returns false, regardless of whether the current user's identifier is among the child nodes. I then realized that the timeout fires almost immediately, after 1-2 seconds: the exception was always one of those thrown from my CustomTimeouts class by the function passed to .timeout(duration, onTimeout: () => ...). I couldn't figure out why because, as I said, I don't face this problem on real devices.
I hope I was able to explain the problem. I know it's a bit complicated, but it's important to me that someone explains what I should pay attention to when using timeouts in this style.
(This is my first question here on Stack Overflow :) )

How can I do an offline batch in Firebase RTDB?

I have reason to believe that some ServerValue.increment() commands are not executing.
In my App, when the user submits, two commands are executed:
Future<void> _submit() async {
  alimentoBloc.descontarAlimento(foodId, quantity);
  salidaAlimentoBloc.crearSalidaAlimento(salidaAlimento);
}
The first command updates the amount of inventory left in the warehouse (using ServerValue.increment)...
Future<bool> descontarAlimento(String foodId, int quantity) async {
  try {
    dbRef.child('inventory/$foodId/quantity')
        .set(ServerValue.increment(-quantity));
  } catch (e) {
    print(e);
  }
  return true;
}
The second command makes a food output register, where it records the quantity, type of food and other key data.
Future<bool> crearSalidaAlimento(SalidaAlimentoModel salidaAlimento) async {
  try {
    dbRef.child('output')
        .push().set(salidaAlimento.toJson());
  } catch (e) {
    print(e);
  }
  return true;
}
After several reviews, I have noticed that the increment command is sometimes not executed, and then the inventory does not match what it should be.
So I would like to do something similar to a transaction, that is: if either of the two commands fails, neither of them should be applied.
Is it possible to do a batch of commands in Firebase Realtime Database without losing the offline functionality?
You can do a multi-path update to perform both writes transactionally:
var id = dbRef.push().key;
Map<String, dynamic> updates = {
  "inventory/$foodId/quantity": ServerValue.increment(-quantity),
  "output/$id": salidaAlimento.toJson(),
};
dbRef.update(updates);
With the above, either both writes are completed, or neither of them is.
While you're offline, the client will fire local events based on its best guess for the current value of the server (which is gonna be 0 if it never read the value), and it will then send all pending changes to the server when it reconnects. For a quick test, see https://jsbin.com/wuhuyih/2/edit?js,console
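To make the "either both writes are completed, or neither of them is" behavior concrete, here is a minimal in-memory sketch. It is not the Firebase SDK: increment, getAt, setAt and multiPathUpdate are hypothetical helpers, and the only assumption is that an all-or-nothing update can be modeled by applying every path to a copy and discarding the copy if any path fails.

```javascript
// Hypothetical in-memory model of a multi-path update; NOT the Firebase SDK.
// increment() returns a sentinel that loosely mirrors ServerValue.increment(delta).
const increment = (delta) => ({ __increment: delta });

// Read the value stored at a slash-separated path, or undefined if it is missing.
function getAt(tree, path) {
  return path.split("/").reduce((node, key) => (node == null ? undefined : node[key]), tree);
}

// Write a value at a slash-separated path, creating intermediate objects as needed.
function setAt(tree, path, value) {
  const keys = path.split("/");
  const last = keys.pop();
  let node = tree;
  for (const key of keys) {
    if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
    node = node[key];
  }
  node[last] = value;
}

// Apply every path or none: work on a copy and return it only if all writes succeed.
function multiPathUpdate(tree, updates) {
  const draft = JSON.parse(JSON.stringify(tree));
  for (const [path, value] of Object.entries(updates)) {
    if (value && typeof value === "object" && "__increment" in value) {
      const current = getAt(draft, path) ?? 0; // a missing value is treated as 0
      if (typeof current !== "number") {
        throw new TypeError(`Cannot increment non-number at ${path}`);
      }
      setAt(draft, path, current + value.__increment);
    } else {
      setAt(draft, path, value);
    }
  }
  return draft; // the original tree is never mutated
}

const before = { inventory: { rice: { quantity: 10 } }, output: {} };
const after = multiPathUpdate(before, {
  "inventory/rice/quantity": increment(-3),
  "output/out1": { foodId: "rice", quantity: 3 },
});
console.log(after.inventory.rice.quantity, before.inventory.rice.quantity); // 7 10
```

If any path throws, the caller keeps the untouched original tree, which is the property the multi-path update above relies on.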
You can't use a transaction while the device is offline.
Transactions need to check the current value in the database, and that is not possible while offline. If you want to make sure they succeed, you would need to check whether a connection is available.

Is it necessary to close a Mongodb Change Stream?

I wrote the following Node/Express/Mongo script:
const { MongoClient } = require("mongodb");

async function main() {
  // CONNECTING TO LOCALHOST (REPLICA SET)
  const client = new MongoClient("mongodb://localhost:27018");
  try {
    // CONNECTION
    await client.connect();
    // EXECUTING MY WATCHER
    console.log("Watching ...");
    await myWatcher(client, 15000);
  } catch (e) {
    // ERROR MANAGEMENT
    console.log(`Error > ${e}`);
  } finally {
    // CLOSING CLIENT CONNECTION ???
    await client.close(); // <-- ???
  }
}
main().catch(console.error);

// MY WATCHER. LISTENING FOR CHANGES FROM MY DATABASE
async function myWatcher(client, timeInMs, pipeline = []) {
  // TARGET TO WATCH
  const watching = client.db("myDatabase").collection("myCollection").watch(pipeline);
  // WATCHING CHANGES ON TARGET
  watching.on("change", (next) => {
    console.log(JSON.stringify(next));
    console.log(`Doing my things...`);
  });
  // CLOSING THE WATCHER ???
  closeChangeStream(timeInMs, watching); // <-- ???
}

// CHANGE STREAM CLOSER
function closeChangeStream(timeInMs = 60000, watching) {
  return new Promise((resolve) => {
    setTimeout(() => {
      console.log("Closing the change stream");
      watching.close();
      resolve();
    }, timeInMs);
  });
}
So, the goal is to keep the myWatcher function always active, watching for any database changes and, for example, sending a user notification when an update is detected. The closeChangeStream function closes the change stream after a fixed delay (timeInMs). So, to keep myWatcher always active, do you recommend not using the closeChangeStream function?
Another thing: with this goal in mind, if I keep the await client.close() call, my code emits an error: Topology is closed. When I remove await client.close(), my code works perfectly. Do you recommend not calling await client.close() so that myWatcher stays active?
I'm a newbie on these topics. Thanks for the advice!
MongoDB change streams are implemented in a pub/sub paradigm.
Send your application to a friend in the Sudan. Have both you and your friend run the application (that has the change stream implemented). If you open up mongosh and run db.getCollection('myCollection').updateOne({_id: ObjectId("6220ee09197c13d24a7997b7")}, {$set: {FirstName: "Bob"}}); both you and your friend will get the console.log for the change stream.
This is assuming you're not running localhost, but you can simulate this with two copies of the applications locally.
The issue comes when you go into production and suddenly have 200 load bearers, 5 developers, etc. running, and your watch fires a ton of writes around the globe.
I believe the practice is to wrap it in a function: put your watch in a function, call that function when you're about to do a write, and close the stream after you do your associated writes.
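The "open the watch, do the writes, then close" pattern from this answer could be sketched roughly as follows. This is a minimal sketch under stated assumptions: FakeChangeStream is a hypothetical stand-in built on Node's EventEmitter, not the MongoDB driver's change stream, and withWatcher is an invented helper name. It only shows the open/close bracketing with try/finally.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical stand-in for a change stream: emits "change" events until closed.
class FakeChangeStream extends EventEmitter {
  constructor() {
    super();
    this.closed = false;
  }

  close() {
    this.closed = true;
    this.removeAllListeners();
  }
}

// Open a watcher only for the duration of `work`, then always close it,
// even if `work` throws.
async function withWatcher(openStream, onChange, work) {
  const stream = openStream();
  stream.on("change", onChange);
  try {
    return await work(stream);
  } finally {
    stream.close();
  }
}

// Usage: the "write" here just emits a change event on the fake stream.
withWatcher(
  () => new FakeChangeStream(),
  (change) => console.log("change seen:", JSON.stringify(change)),
  async (stream) => {
    stream.emit("change", { operationType: "update" });
    return "writes done";
  }
).then((result) => console.log(result));
```

With a real driver, openStream would be whatever call creates the change stream; nothing about that call is assumed here beyond it returning an object with on and close.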

Mongoose how to listen for collection changes

I need to build a Mongo updater process to download MongoDB data to local IoT devices (configuration data, etc.).
My goal is to check some Mongo collections at a fixed interval (1 minute, for example). If a collection has changed (deletion, insertion or update), I will download the full collection to my device. The collections will have no more than a few hundred simple records, so it won't be a lot of data to download.
Is there any mechanism to find out whether a collection has changed since the last poll? What Mongo features should be used in that case?
To listen for changes to your MongoDB collection, set up a Mongoose Model.watch.
const PersonModel = require('./models/person')
const personEventEmitter = PersonModel.watch()
personEventEmitter.on('change', change => console.log(JSON.stringify(change)))
const person = new PersonModel({name: 'Thabo'})
person.save()
// Triggers console log on change stream
// {_id: '...', operationType: 'insert', ...}
Note: This functionality is only available on a MongoDB replica set.
See the Mongoose Model docs for more.
If you want to listen for changes to your DB, use Connection.watch.
See the Mongoose Connection docs for more.
These functions listen for change events from MongoDB Change Streams, available as of MongoDB v3.6.
I think the best solution would be to use post-update middleware.
You can read more about that here
http://mongoosejs.com/docs/middleware.html
I have the same requirement on an embedded system that works quite autonomously and must always be able to adjust its operating parameters automatically, without a reboot.
For this I created a configuration manager class, and in its constructor I coded a "parameter monitor" that checks the database only for the parameters flagged for monitoring. If a new configuration needs to be monitored, I tell the config manager, from another part of the code, to reload it.
As you can see, the process is very simple, and it can of course be improved to avoid overloading the config manager with many updates and to keep updates from overlapping when the interval is very small.
Since there are many settings to read, I open a cursor for a query as soon as the database connection is open. As the data stream delivers each document, I create a proxy for it so that it can be handled according to its type and the internal details of the config manager. I then check whether the property needs to be monitored. If so, I call an inner function named watch, which I created for this purpose. It reads the property of the same name (watch) to get the interval between database checks and registers a timeout for that task; each check re-registers the timeout with the updated interval, or stops checking if watch is no longer set.
this.connection.once('open', () => {
  let cursor = Config.find({}).cursor();
  cursor.on('data', (doc) => {
    this.config[doc.parametro] = criarProxy(doc.parametro, doc.valor);
    if (doc.watch) {
      console.log(sprintf("Preparing to monitor %s", doc.parametro));
      function watch(configManager, doc) {
        console.log("Monitoring parameter: %s", doc.parametro);
        if (doc.watch) setTimeout(() => {
          Config.findOne({
            parametro: doc.parametro
          }).then((fresh) => {
            // "fresh" is the re-read document; renamed so it no longer shadows doc
            console.dir(fresh);
            if (fresh) {
              if (fresh.valor != configManager.config[fresh.parametro]) {
                console.log("Monitored parameter %s was changed!", fresh.parametro);
                configManager.config[fresh.parametro] = criarProxy(fresh.parametro, fresh.valor);
              } else
                console.log("Monitored parameter %s was not changed", fresh.parametro);
              // Re-arm the timer using the freshly read watch interval
              watch(configManager, fresh);
            } else
              console.log("Check the parameter: %s", doc.parametro);
          })
        },
        doc.watch)
      }
      watch(this, doc);
    }
  });
  cursor.on('close', () => {
    if (process.env.DEBUG_DETAIL > 2) console.log("ConfigManager closed cursor data");
    resolv(); // resolv comes from the enclosing scope (not shown here)
  });
  cursor.on('end', () => {
    if (process.env.DEBUG_DETAIL > 2) console.log("ConfigManager end data");
  });
});
As you can see, the code can be improved a lot. If you want to suggest improvements, for your environment or in general, please use the gist: https://gist.github.com/carlosdelfino/929d7918e3d3a6172fdd47a59d25b150

RXJS : Idiomatic way to create an observable stream from a paged interface

I have a paged interface. Given a starting point, a request produces a list of results and a continuation indicator.
I've created an observable that is built by constructing and flat mapping an observable that reads the page. The result of this observable contains both the data for the page and a value to continue with. I pluck the data and flat map it to the subscriber. Producing a stream of values.
To handle the paging I've created a subject for the next page values. It's seeded with an initial value then each time I receive a response with a valid next page I push to the pages subject and trigger another read until such time as there is no more to read.
Is there a more idiomatic way of doing this?
function records(start = 'LATEST', limit = 1000) {
  let pages = new rx.Subject();
  this.connect(start)
    .subscribe(page => pages.onNext(page));
  let records = pages
    .flatMap(page => {
      return this.read(page, limit)
        .doOnNext(result => {
          let next = result.next;
          if (next === undefined) {
            pages.onCompleted();
          } else {
            pages.onNext(next);
          }
        });
    })
    .pluck('data')
    .flatMap(data => data);
  return records;
}
That's a reasonable way to do it. It has a couple of potential flaws in it (that may or may not impact you depending upon your use case):
You provide no way to observe any errors that occur in this.connect(start)
Your observable is effectively hot. If the caller does not immediately subscribe to the observable (perhaps they store it and subscribe later), then they'll miss the completion of this.connect(start) and the observable will appear to never produce anything.
You provide no way to unsubscribe from the initial connect call if the caller changes its mind and unsubscribes early. Not a real big deal, but usually when one constructs an observable, one should try to chain the disposables together so it all cleans up properly if the caller unsubscribes.
Here's a modified version:
It passes errors from this.connect to the observer.
It uses Observable.create to create a cold observable that only starts its business when the caller actually subscribes, so there is no chance of missing the initial page value and stalling the stream.
It combines the this.connect subscription disposable with the overall subscription disposable.
Code:
function records(start = 'LATEST', limit = 1000) {
  return Rx.Observable.create(observer => {
    let pages = new Rx.Subject();
    let connectSub = new Rx.SingleAssignmentDisposable();
    let resultsSub = new Rx.SingleAssignmentDisposable();
    let sub = new Rx.CompositeDisposable(connectSub, resultsSub);
    // Make sure we subscribe to pages before we issue this.connect()
    // just in case this.connect() finishes synchronously (possible if it caches values or something?)
    let results = pages
      .flatMap(page => this.read(page, limit))
      // r.next is the continuation value carried by each page result
      .doOnNext(r => r.next !== undefined ? pages.onNext(r.next) : pages.onCompleted())
      .flatMap(r => r.data);
    resultsSub.setDisposable(results.subscribe(observer));
    // now query the first page
    connectSub.setDisposable(this.connect(start)
      .subscribe(p => pages.onNext(p), e => observer.onError(e)));
    return sub;
  });
}
Note: I've not used the ES6 syntax before, so hopefully I didn't mess anything up here.
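For comparison only, the paging control flow (read a page, emit its data, continue while a next value exists) can be written without Rx as an async generator. This is a different technique from the observable version above, not a replacement for it. The sketch assumes connect and read are passed in as promise-returning functions, which is a hypothetical shape chosen to keep the example self-contained.

```javascript
// Sketch of the paging loop as an async generator (plain JavaScript, not RxJS).
// Assumed shapes: connect(start) resolves to the first page token;
// read(page, limit) resolves to { data: [...], next }, with next undefined on the last page.
async function* records(connect, read, start = "LATEST", limit = 1000) {
  let page = await connect(start);
  while (page !== undefined) {
    const result = await read(page, limit);
    yield* result.data; // emit each record of this page in order
    page = result.next; // undefined ends the loop
  }
}

// Usage with a fake two-page source (hypothetical data).
const fakePages = {
  p1: { data: ["a", "b"], next: "p2" },
  p2: { data: ["c"], next: undefined },
};

(async () => {
  for await (const record of records(async () => "p1", async (page) => fakePages[page])) {
    console.log(record);
  }
})();
```

Because the generator is pull-based, nothing is read until the consumer iterates, and leaving the for await loop early stops further page reads.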