Mongoose how to listen for collection changes - mongodb

I need to build a mongo updater process to download MongoDB data to local IoT devices (configuration data, etc.).
My goal is to watch some mongo collections at a fixed interval (1 minute, for example). If a collection has changed (deletion, insertion or update), I will download the full collection to my device. The collections will have no more than a few hundred simple records, so it's not going to be a lot of data to download.
Is there any mechanism to find out whether a collection has changed since the last poll? What mongo features should be used in that case?

To listen for changes to your MongoDB collection, set up a Mongoose Model.watch.
const PersonModel = require('./models/person')
const personEventEmitter = PersonModel.watch()
personEventEmitter.on('change', change => console.log(JSON.stringify(change)))
const person = new PersonModel({name: 'Thabo'})
person.save()
// Triggers console log on change stream
// {_id: '...', operationType: 'insert', ...}
Note: This functionality is only available on a MongoDB replica set.
See the Mongoose Model docs for more.
If you want to listen for changes to your DB, use Connection.watch.
See the Mongoose Connection docs for more.
These functions listen for change events from MongoDB Change Streams, available as of MongoDB 3.6.
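If you only need to react to certain operations (the question mentions insertion, deletion and update), change streams also accept an aggregation pipeline that filters the events. A minimal sketch under that assumption, reusing the PersonModel from above:
const PersonModel = require('./models/person')

// Only surface the operation types we care about; the pipeline is a normal
// aggregation pipeline applied to the change events themselves.
const changeStream = PersonModel.watch([
  { $match: { operationType: { $in: ['insert', 'update', 'replace', 'delete'] } } }
])

changeStream.on('change', change => {
  // Any event here means the collection differs from what the device last downloaded,
  // so this is a natural place to trigger a re-sync.
  console.log(change.operationType, change.documentKey)
})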

I think the best solution would be to use post-update middleware.
You can read more about that here:
http://mongoosejs.com/docs/middleware.html
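As a rough illustration of that idea (a sketch only; the Config schema and the notifyDevices() helper are assumptions, not from the original question), post middleware can flag the collection as changed whenever this application writes to it:
const mongoose = require('mongoose')

const configSchema = new mongoose.Schema({ parametro: String, valor: String })

// Fires after a successful save() (inserts and updates done through documents).
configSchema.post('save', doc => notifyDevices('save', doc)) // notifyDevices is hypothetical

// Fires after query-based updates and deletes.
configSchema.post('findOneAndUpdate', doc => notifyDevices('update', doc))
configSchema.post('findOneAndDelete', doc => notifyDevices('delete', doc))

const Config = mongoose.model('Config', configSchema)
Keep in mind that Mongoose middleware only runs in the process that issues the query, so it will not notice changes made by other clients writing directly to the database; for that you still need change streams or polling.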

I had the same requirement on an embedded system that works quite autonomously and always needs to adjust its operating parameters without having to reboot.
For this I created a configuration manager class, and in its constructor I coded a "parameter monitor", which checks in the database only the parameters that are flagged for it; of course, if a new configuration needs to be monitored, I tell the config-manager in another part of the code to reload that configuration.
As you can see the process is very simple, and of course it can be improved to avoid overloading the config-manager with many updates and to prevent them from overlapping when the interval is very small.
Since there are many settings to be read, I open a cursor for a query as soon as the database connection is opened. As the data stream sends me new documents, I create a proxy for each one so that it can be manipulated according to its type and the internals of the config-manager. I then check whether the property needs to be monitored; if so, I call an inner function called watch that I created to handle this. It reads the sub-property of the same name to get the default interval for checking the database for updates and registers a timeout for that task; each check recreates the timeout with the updated interval, or stops checking if the watch flag no longer exists.
this.connection.once('open', () => {
  let cursor = Config.find({}).cursor();

  cursor.on('data', (doc) => {
    this.config[doc.parametro] = criarProxy(doc.parametro, doc.valor);
    if (doc.watch) {
      console.log(sprintf("Preparing to monitor %s", doc.parametro));
      // doc.watch holds the polling interval (in ms) for this parameter
      function watch(configManager, doc) {
        console.log("Monitoring parameter: %s", doc.parametro);
        if (doc.watch) setTimeout(() => {
          Config.findOne({
            parametro: doc.parametro
          }).then((current) => {
            console.dir(current);
            if (current) {
              if (current.valor != configManager.config[doc.parametro]) {
                console.log("Monitored parameter %s has changed!", current.parametro);
                configManager.config[doc.parametro] = criarProxy(current.parametro, current.valor);
              } else {
                console.log("Monitored parameter %s has not changed", current.parametro);
              }
              watch(configManager, current); // re-arm the timer with the updated interval
            } else {
              console.log("Check the parameter: %s", doc.parametro);
            }
          });
        }, doc.watch);
      }
      watch(this, doc);
    }
  });

  cursor.on('close', () => {
    if (process.env.DEBUG_DETAIL > 2) console.log("ConfigManager closed cursor data");
    resolve(); // resolve() comes from the enclosing Promise (not shown here)
  });

  cursor.on('end', () => {
    if (process.env.DEBUG_DETAIL > 2) console.log("ConfigManager end data");
  });
});
As you can see the code can be improved a lot; if you want to suggest improvements for your environment, or make it more generic, please use the gist: https://gist.github.com/carlosdelfino/929d7918e3d3a6172fdd47a59d25b150

Related

Is a single Firestore write operation guaranteed to be atomic?

I have a Chat document that represents a chat between two users. It starts out empty, and eventually looks like this:
// chats/CHAT_ID
{
users: {
USER_ID1: true,
USER_ID2: true
},
lastAddedUser: USER_ID2
}
Each user is connected to a different Cloud Run container via websockets.
I would like to send a welcome message to both users once the second user connected. This message must be sent exactly once.
When a user sends a "connected" message to its websocket, the container performs something like the following:
// Return boolean reflecting whether the current container should emit the welcome message to both users
async addUserToChat(userId) {
// Write operation
await this.chatDocRef.set({ activeUsers: { [userId]: true }, lastAddedUser: userId }, { merge: true })
// Read operation
const chatSnap = await this.chatDocRef.get();
const chatData = chatSnap.data();
return chatData.users.length === 2 && chatData.lastAddedUser === userId;
}
And there is a working mechanism that allows container A to send a message to a user connected to container B.
The issue is that sometimes, each container ends up concluding that it is the one that should send the welcome message to both users.
I am unclear as to why that would happen given Firestore's "immediate consistency model" (per this). The only explanation I can think of that allows a race condition is that write operations involving multiple fields are not guaranteed to be atomic. So this:
await this.chatDocRef.set({ activeUsers: { [userId]: true }, lastAddedUser: userId }, { merge: true })
actually performs two separate updates for activeUsers and lastAddedUser, opening the possibility of a scenario where, after a partial update of activeUsers by container A, container B completes its write and read operations before container A overwrites lastAddedUser.
But this sounds wrong.
Can anyone shed light on why race conditions might occur?
I no longer have race conditions if I base the logic on server timestamps instead of the lastAddedUser field.
The document is now simpler:
// chats/CHAT_ID
{
users: {
USER_ID1: true,
USER_ID2: true
}
}
And the function looks like this:
// Return boolean reflecting whether the current container should emit the welcome message to both users
async addUserToChat(userId) {
// Write operation
const writeResult = await this.chatDocRef.set({ activeUsers: { [userId]: true } }, { merge: true })
// Read operation
const chatSnap = await this.chatDocRef.get();
const chatData = chatSnap.data();
return chatData.users.length === 2 && writeResult.writeTime.isEqual(chatSnap.updateTime);
}
In other words, the condition for sending the welcome message now becomes: the executing container is the container responsible for the update that resulted in having two users.
While the problem is solved, I am still unclear as to why relying on document data (instead of server metadata) opens up the possibility of race conditions. If anyone knows the explanation behind this phenomenon, please add an answer and I'll accept it as the solution to this question.

RN Web + Firebase: snapshot listeners unsubscribe in unmount vs in global unmount (using context)

In short: which is the most memory and cost efficient way to use Firestore snapshot listeners: unsubscribing them on every screen unmount, or keeping the unsubscribe functions in context and unsubscribing when the whole site "unmounts"?
Let's say on the home screen I use a snapshot listener for the collection "events", which has 100 documents. Now I navigate through the site and return to the home screen 2 more times while using the site. In this case which option is better memory and cost wise (are there other things to consider), and are there drawbacks?
1. Mount and unmount the listener on each mount and unmount of the home screen.
2. Mount on the home screen and unmount on whole-site "unmount" (for example using window.addEventListener('beforeunload', handleSiteClose)).
The usage of the first is probably familiar to most, but the second could be done with something like this:
- Saving the listener's unsubscribe function in context, with the collection name as the key:
const { listenerHolder, setListenerHolder } = DataContext();

useEffect(() => {
  const newListeners = anyDeepCopyFunction(listenerHolder);
  const collection = 'events';
  if (listenerHolder[collection] === undefined) {
    // listenerBaseComponent would be a function that establishes the listener and returns its unsubscribe function
    const unSub = listenerBaseComponent();
    if (unSub)
      newListeners[collection] = unSub;
  }
  if (Object.entries(newListeners).length !== Object.entries(listenerHolder).length) {
    setListenerHolder(newListeners);
  }
}, []);
- Unsubscribing all listeners (in a component that wraps all screens and is unmounted only when the whole site is closed):
const { listenerHolder, setListenerHolder } = DataContext();

const handleTabClosing = () => {
  Object.entries(listenerHolder).forEach(item => {
    const [key, value] = item;
    if (typeof value === 'function')
      value();
  });
  setListenerHolder({});
}

useEffect(() => {
  window.addEventListener('beforeunload', handleTabClosing)
  return () => {
    window.removeEventListener('beforeunload', handleTabClosing)
  }
})
In both cases the home screen shows the most recent data from the "events" collection, but in my understanding:
- The first approach creates the listener 3 times for collection "events", so 3 x 100 read operations are done.
- The second approach creates the listener 1 time for collection "events", so 1 x 100 read operations are done.
If storing the unsubscribe function in context is possible, and all listener unsubscriptions are handled at once on site unmount or on logout, doesn't this make this approach super easy, more maintainable and more cost efficient? If I needed data from the "events" collection on any other screen, I would not have to make a get call or create a new listener, because I would always have the latest data from "events" while the site is in use. Just check whether the collection name exists as a key in the global "listenerHolder" state; if it does, the most up to date data for events is always available.
Since there wasn't information from others about this use case, I did some testing myself, jumping from this "home screen" to another screen and back multiple times. The "home screen" has about 150 items and the second screen 65.
The results are from the Firebase Cloud Firestore usage tab.
This is the result of reads from that jumping: 654 (1.52pm-1.53pm) + 597 (1.53pm-1.54pm) = 1251 reads.
Then I tested the same jumping back and forth using global context listeners: 61 (1.59pm-2.00pm) + 165 (2.00pm-2.01pm) = 226 reads.
So using listeners in global context results in significantly fewer reads. How much fewer depends on how many times new listeners would otherwise need to be recreated in the normal use case.
I have not yet tested memory usage well enough to compare these two cases. If I do, I will add the results here for others to benefit.

Is it necessary to close a Mongodb Change Stream?

I coded the following Node/Express/Mongo script:
const { MongoClient } = require("mongodb");
const stream = require("stream");

async function main() {
  // CONNECTING TO LOCALHOST (REPLICA SET)
  const client = new MongoClient("mongodb://localhost:27018");
  try {
    // CONNECTION
    await client.connect();
    // EXECUTING MY WATCHER
    console.log("Watching ...");
    await myWatcher(client, 15000);
  } catch (e) {
    // ERROR MANAGEMENT
    console.log(`Error > ${e}`);
  } finally {
    // CLOSING CLIENT CONNECTION ???
    await client.close(); // << ????
  }
}

main().catch(console.error);

// MY WATCHER. LISTENING FOR CHANGES FROM MY DATABASE
async function myWatcher(client, timeInMs, pipeline = []) {
  // TARGET TO WATCH
  const watching = client.db("myDatabase").collection("myCollection").watch(pipeline);
  // WATCHING CHANGES ON TARGET
  watching.on("change", (next) => {
    console.log(JSON.stringify(next));
    console.log(`Doing my things...`);
  });
  // CLOSING THE WATCHER ???
  closeChangeStream(timeInMs, watching); // << ????
}

// CHANGE STREAM CLOSER
function closeChangeStream(timeInMs = 60000, watching) {
  return new Promise((resolve) => {
    setTimeout(() => {
      console.log("Closing the change stream");
      watching.close();
      resolve();
    }, timeInMs);
  });
}
So, the goal is to keep the myWatcher function always active, to watch for any database change and, for example, send the user a notification when an update is detected. The closeChangeStream function closes the change stream X seconds after the watcher starts. So, to keep myWatcher always active, do you recommend not using the closeChangeStream function?
Another thing. With this goal in mind (keeping myWatcher always active): if I keep the await client.close(), my code emits an error: Topology is closed, and when I remove await client.close(), my code works perfectly. Do you recommend not using await client.close() in order to keep myWatcher always active?
I'm a newbie in these topics!
Thanks for the advice and the help!
MongoDB change streams are implemented in a pub/sub paradigm.
Send your application to a friend in Sudan. Have both you and your friend run the application (that has the change stream implemented). If you open up mongosh and run db.getCollection('myCollection').updateOne({_id: ObjectId("6220ee09197c13d24a7997b7")}, {$set: {FirstName: "Bob"}}), both you and your friend will get the console.log from the change stream.
This assumes you're not running on localhost, but you can simulate it with two copies of the application locally.
The issue comes when you go into production and suddenly you have 200 load balancers, 5 developers, etc. running, and your watch fires for a ton of writes around the globe.
I believe the practice is to wrap your watch in a function: open the change stream when you're about to do a write, and close it after you've done your associated writes.
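A rough sketch of that pattern, reusing the database and collection names from the script above (watchDuringWrite is a hypothetical helper, not a driver API):
// Open a change stream only for the duration of the associated write(s).
async function watchDuringWrite(client, doWrite, pipeline = []) {
  const collection = client.db("myDatabase").collection("myCollection");
  const changeStream = collection.watch(pipeline);

  changeStream.on("change", (next) => {
    console.log("Change seen while writing:", JSON.stringify(next));
  });

  try {
    // Perform the write(s) while the stream is open.
    await doWrite(collection);
  } finally {
    // Close the stream once the associated writes are done.
    await changeStream.close();
  }
}

// Example usage:
// await watchDuringWrite(client, (col) => col.updateOne({ name: "Bob" }, { $set: { active: true } }));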

Azure Mobile Services for Xamarin Forms - Conflict Resolution

I'm supporting a production Xamarin Forms app with an offline sync feature implemented using Azure Mobile Services.
We have a lot of production issues related to users losing data, or general instability that goes away if they reinstall the app. After having a look through, I think the issues are around how conflict resolution is handled in the app.
For every entity that we try to sync, we handle MobileServicePushFailedException, traverse the errors returned, and take action.
catch (MobileServicePushFailedException ex)
{
    foreach (var error in ex.PushResult.Errors) // These are MobileServiceTableOperationErrors
    {
        var status = error.Status; // HTTP status code returned
        // Take action based on this status.
        // If it's 409 or 412, we go into conflict resolution and decide whether the client or server version wins.
    }
}
The conflict resolution seems too custom to me, and I'm checking whether there are general guidelines.
For example, we seem to be getting empty values for the 'CreatedAt' and 'UpdatedAt' timestamps on both the local and server versions of the entities returned, which is weird.
var serverItem = error.Result;
var clientItem = error.Item;
// sometimes serverItem.UpdatedAt or clientItem.UpdatedAt is NULL. Since we use these 2 fields to determine who wins, we are stumped here
If anyone can point me to some guidelines or sample code on how these conflicts should generally be handled using the information from MobileServiceTableOperationError, that would be highly appreciated.
I came across the following code snippet in the docs:
// Simple error/conflict handling.
if (syncErrors != null)
{
    foreach (var error in syncErrors)
    {
        if (error.OperationKind == MobileServiceTableOperationKind.Update && error.Result != null)
        {
            // Update failed, reverting to server's copy.
            await error.CancelAndUpdateItemAsync(error.Result);
        }
        else
        {
            // Discard local change.
            await error.CancelAndDiscardItemAsync();
        }
        Debug.WriteLine(@"Error executing sync operation. Item: {0} ({1}). Operation discarded.",
            error.TableName, error.Item["id"]);
    }
}
Surfacing conflicts to the UI is shown in this doc:
private async Task ResolveConflict(TodoItem localItem, TodoItem serverItem)
{
    // Ask the user to choose a resolution between the two versions.
    MessageDialog msgDialog = new MessageDialog(
        String.Format("Server Text: \"{0}\" \nLocal Text: \"{1}\"\n",
            serverItem.Text, localItem.Text),
        "CONFLICT DETECTED - Select a resolution:");
    UICommand localBtn = new UICommand("Commit Local Text");
    UICommand ServerBtn = new UICommand("Leave Server Text");
    msgDialog.Commands.Add(localBtn);
    msgDialog.Commands.Add(ServerBtn);
    localBtn.Invoked = async (IUICommand command) =>
    {
        // To resolve the conflict, update the version of the item being committed. Otherwise, you will keep
        // catching a MobileServicePreConditionFailedException.
        localItem.Version = serverItem.Version;
        // Updating recursively here just in case another change happened while the user was making a decision.
        UpdateToDoItem(localItem);
    };
    ServerBtn.Invoked = async (IUICommand command) =>
    {
        RefreshTodoItems();
    };
    await msgDialog.ShowAsync();
}
I hope this helps provide some direction. Although the Azure Mobile docs have been deprecated, the SDK hasn't changed and should still be relevant. If this doesn't help, let me know what you're using for a backend store.

Vertx CompositeFuture

I am working on a solution where I am using Vert.x 3.8.4 and vertx-mysql-client 3.9.0 for asynchronous database calls.
Here is the scenario that I have been trying to resolve in a proper reactive manner.
I have some master table records which are in an inactive state.
I run a query and get the list of those records from the database.
I did that like this:
Future<List<Master>> locationMasters = getInactiveMasterTableRecords();
locationMasters.onSuccess(locationMasterList -> {
    if (locationMasterList.size() > 0) {
        uploadTargetingDataForAllInactiveLocations(vertx, amazonS3Utility,
            locationMasterList);
    }
});
Now, in the uploadTargetingDataForAllInactiveLocations method, I have a list of items.
I need to iterate over this list and, for each item, download a file from AWS, parse the file and insert its data into the DB.
I understand the way to do this is using CompositeFuture.
Can someone from the Vert.x dev community help me with this, or point me to some available documentation?
I did not find good content on this by googling.
I'm answering this because I was searching for something similar and spent some time before finding an answer, so hopefully this might be useful to someone else in the future.
I believe you want to use CompositeFuture in Vert.x only if you want to coordinate multiple actions: you want an action to execute either when all of the futures your composite future is built upon succeed, or when at least one of them succeeds.
In the first case I would use CompositeFuture.all(List<Future> futures), and in the second case I would use CompositeFuture.any(List<Future> futures).
As per your question, below is sample code where, given a list of items, for each item we run an asynchronous operation (namely downloadAndProcessFile()) which returns a Future, and we want to execute an action doAction() only when all the async operations have succeeded:
List<Future> futures = new ArrayList<>();
locationMasterList.forEach(elem -> {
    Promise<Void> promise = Promise.promise();
    futures.add(promise.future());
    Future<Boolean> processStatus = downloadAndProcessFile(); // doesn't need to be Boolean
    processStatus.onComplete(asyncProcessStatus -> {
        if (asyncProcessStatus.succeeded()) {
            // eventually do stuff with the result
            promise.complete();
        } else {
            promise.fail("Error while processing file whatever");
        }
    });
});

CompositeFuture.all(futures).onComplete(compositeAsync -> {
    if (compositeAsync.succeeded()) {
        doAction(); // <-- here do what you want to do when all futures complete
    } else {
        // at least 1 future failed
    }
});
This solution is probably not perfect and I suppose it can be improved, but this is what I found works for me. Hopefully it will work for someone else too.