How do you sync many writable MongoDB databases? - mongodb

What I mean by writable is that you can CRUD on each database, and it automatically syncs with the others so that all of them are in sync all the time (as much as possible).
I want to start a project for a company with some tricks.
The company is present in many locations (at least 5) and wants the app to run locally (with a local database), but when there's a change (create, update, or delete), the change is propagated to the other databases.
The goal is to have them all synced at every moment, but with the possibility that if the internet connection is lost at one site, they can continue to use the app properly since they are actually connected to the local database. That's why they don't want a fully online database.
They use MongoDB.
I looked at replica sets, but since a replica set has a single master, that seems complicated.
Please can you share solutions to such a situation?

Related

Moodle asynchronous replication

I'm looking to deploy Moodle in the cloud, but I have some 50-odd sites that need access to it, possibly even while temporarily offline. So I'm looking into replicating Moodle down to each site. From what I understand, two data stores require replication: moodledata and the database (PostgreSQL in our case). If I'm not mistaken, moodledata contains the multimedia data, and the database contains, among other things, all the user records. Luckily, the multimedia data will be centralized and is thus synced only one way, down to the nodes; that seems doable. Where I'm stuck is how to handle the Postgres database, where the sync will need to be bidirectional.

Database and application design to guarantee application availability when internet connection temporary lost

I want to create a web application for restaurants, and for business-model reasons it should be an online web application (in the cloud). A restaurant can have an account on the app, create its own menu, and add its waiters and the cook.
The waiters should be able to access the menu and place orders at all times. The main issue I need to decide how to handle is:
"How can I guarantee full-time availability to the waiters and the cook even when the internet connection is lost for several seconds, several minutes, or even hours during the day?"
I was thinking of installing the app on a server in the restaurant's local network, which takes over the responsibility of the cloud server when there is no internet connection, meaning all orders are saved in the local server's DB. As soon as the connection is back, the local DB is synced to the cloud DB (I was told PostgreSQL might have plugins supporting this, via on-premise or something similar), meaning local DB records are pushed to the cloud DB.
Can someone give me a hint on what tech (open source, no enterprise solutions please) to use to accomplish the DB syncing when the internet goes on and off?
Am I on the right track, or completely off with what I suggested above?
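The "save locally, push when the connection is back" idea described above can be sketched as a small outbox queue. This is only an illustration under stated assumptions, not a recommendation of a specific product: the `Outbox` class and the `push` callback are hypothetical names, the queue is held in memory rather than in a real local database, and a dropped connection is assumed to surface as a `ConnectionError`.

```python
from typing import Callable, Dict, List


class Outbox:
    """Local queue of records that have not reached the cloud yet (hypothetical helper)."""

    def __init__(self, push: Callable[[Dict], None]) -> None:
        self._push = push              # assumed: sends one record to the cloud, raises ConnectionError when offline
        self.pending: List[Dict] = []  # records saved locally, oldest first

    def save(self, record: Dict) -> None:
        # Always write locally first so placing an order never waits on the internet.
        self.pending.append(record)

    def flush(self) -> int:
        """Push pending records in order; stop at the first failure and return how many were sent."""
        sent = 0
        while self.pending:
            try:
                self._push(self.pending[0])
            except ConnectionError:
                break  # still offline; keep the rest for the next attempt
            self.pending.pop(0)
            sent += 1
        return sent
```

In a setup like this, the local server would call `save` for every order and call `flush` periodically or whenever connectivity returns. A record is removed from the queue only after `push` succeeds, so a crash between the two steps could send a record twice; the cloud side would need to tolerate that, for example by using a unique order id.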

Sync two offline masters when network available

I have a use case where I need to set up two physical stations at a venue. Each station will be running a couple of app servers and a mongodb server.
I can't rely on the venue's internet access so I need my app to be able to work offline and "sync" the dbs every once in a while.
I initially thought about having two masters that would somehow sync with a remote one but TIL that master-master replication is not possible with mongodb.
I've read about the active-active approach, however, that won't let me write to a different shard when offline.
I'm running out of ideas, any recommendation would be greatly appreciated.
------ Update on what I'm trying to achieve:
I'm working with a venue that has two entrances. The idea is to be able to capture some information from people attending the events (name, email, etc). After getting registered we will print a name tag with some of the info.
Everything sounds pretty easy; however, if possible, I would like to not rely on the venue's network (internet). That's where I started struggling to figure out what the best approach is. I guess what I want is to have a remote mongo but, if the network goes down, somehow keep saving records locally and send them to the remote mongo instance when the network is available again.
Extra considerations:
- Events last a couple of days, some people lose their name tag overnight, they should be able to go to either of the entrances and get it reprinted. So we should be able to find their info even if they registered in entrance A but they are asking for a reprint in entrance B.
More questions:
- Am I overthinking it? Maybe venue's network + a 4G/LTE modem as a backup should be enough? I would prefer not relying on it tho.
I believe you're overthinking things. Here's what I would do if faced with a similar situation:
From the description, it doesn't sound like the two sites need to be connected in real time at all. I would create a server on Entry A, another in Entry B, and consolidate their data each day after the day ended if required. This is because:
- It's unlikely that one person will register at both sites within a single day. If they lose their tag on that day, I'll just tell them to go back to where they registered earlier and get it reprinted there. Worst case, you'll create a duplicate entry (it should be obvious which is the duplicate, since no one would lose their tag within seconds), but I would not anticipate hundreds of people all losing their tags within a day.
- If the attendee lost their tag overnight, both servers will have synced data and should be able to reprint.
- If you're concerned about the venue's Wifi access, just run cables from the server to the printing stations.
Personally, I would argue that the overnight sync is not really needed at all (see the likelihood of people registering twice). I would just collect the data from both servers after the event ended. That is, unless you have specific needs for the combined data from both entries during the 2nd day.
Note: please make sure you're running a replica set with a minimum of 3 nodes. Running a standalone instance in a prod environment is not recommended. Hardware/disk corruption is a common event.
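The end-of-day consolidation suggested in this answer could look roughly like the sketch below. It is a minimal illustration, not the answerer's code: the `consolidate` function and the `email` and `registered_at` field names are assumptions, records are plain dictionaries exported from each entrance's server, and a duplicate is assumed to mean "same email, ignoring case and surrounding spaces".

```python
from typing import Dict, Iterable, List


def consolidate(*sources: Iterable[Dict]) -> List[Dict]:
    """Merge registration records from several servers, keeping the earliest entry per email."""
    merged: Dict[str, Dict] = {}
    for records in sources:
        for rec in records:
            key = rec["email"].strip().lower()  # assumed duplicate key
            kept = merged.get(key)
            # Keep whichever registration happened first; later ones are treated as duplicates.
            if kept is None or rec["registered_at"] < kept["registered_at"]:
                merged[key] = rec
    return sorted(merged.values(), key=lambda r: r["registered_at"])


entrance_a = [{"name": "Ann", "email": "ann@example.com", "registered_at": "2024-05-01T09:00:00"}]
entrance_b = [
    {"name": "Ann", "email": "Ann@Example.com ", "registered_at": "2024-05-01T15:30:00"},
    {"name": "Bob", "email": "bob@example.com", "registered_at": "2024-05-01T10:00:00"},
]
combined = consolidate(entrance_a, entrance_b)
```

Timestamps are compared as ISO 8601 strings here, which sort correctly only if both servers write them in the same format and time zone.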

Realtime Content and Database Replication for Alfresco

I'm currently researching on how to replicate physical files (store contents) and the meta data (database) of Alfresco. This is of course a safety measure in case of server failure or whatsoever.
I am currently running Alfresco's database on PostgreSQL and have so far learned about PostgreSQL's WAL and streaming replication, which I believe I can use to replicate Alfresco's metadata (database) in real time.
The next problem I face is how to replicate Alfresco's repository/physical files (store contents) in real time.
I am currently looking at Alfresco's built-in replication job. But as far as I have read, it is "scheduled" and not real time, and it needs another instance of Alfresco running on the "SLAVE" server.
So my question is:
Does Alfresco's built-in replication job cover both the physical/repository files (store contents) and the metadata (database) contents?
or
What are the viable ways of replicating Alfresco's physical/repository files (store contents) and metadata (database) contents in real time?
The replication service can be used to replicate objects from one Alfresco server to another at the object level, not the file system and database level. So, of course, there are files and database records that are created when an object is replicated, but those are by-products of the object being created in the replication target.
The replication service is really used to make it easier for objects in a particular path to be read by people in another office. When they read the object they get it locally. When they click "Edit" in Share they will be redirected back to the source Alfresco server.
Long story short, the replication service is in no way, shape, or form, to be used to replicate data for backup or disaster recovery.
If you are running on EC2 or a local filer that supports it, it should be enough to take volume snapshots.
Otherwise, you could use something like rsync scheduled with cron.
But this approach sounds risky. I'm not sure how you will ensure that your database is kept in sync with your file system, which is a requirement for your Alfresco repo to remain consistent.

PostgreSQL - Periodically copying data from one database to another

I'm trying to set up an architecture with 2 databases, say preview and live, that have the exact same schemas. The use case is that edits can be made to the preview database and then pushed to the live database after they are vetted and approved. The production application would read from the live database.
What would be the most appropriate way to push all data from the preview database to the live database without bringing the live database down? Ideally the copy from preview to live would be an atomic transaction.
I've worked with this type of setup in MSSQL, but I'm fairly new to Postgres. So I'm open to hearing other ways to architect this (with Schemas perhaps?).
EDIT: The main reason to use separate databases is that I may need more than 1 target database (not just a single "live" database). I also may need to switch target databases on the fly without altering the source database schema.
I think what you're looking for is a "hot standby". This would be a separate instance of Postgresql, possibly on the same server but usually not, which is a near-real-time replica of the primary server.
In broad strokes, this is done by shipping the binary transaction logs from the primary server to the backup server, and then "replaying" them there. The exact mechanism for transmitting the logs may vary depending on your requirements.
Fortunately, the docs on this are excellent:
https://www.postgresql.org/docs/9.3/static/warm-standby.html
https://www.postgresql.org/docs/9.0/static/hot-standby.html
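The "ship the logs, then replay them" idea from this answer can be illustrated with a toy model. This is not how PostgreSQL implements WAL shipping; the `Primary` and `Standby` classes and the log entry layout are made up for illustration, and the "databases" are plain dictionaries.

```python
from typing import Dict, List, Optional, Tuple

# (sequence number, key, value); a value of None means "delete this key"
LogEntry = Tuple[int, str, Optional[str]]


def apply_change(data: Dict[str, str], key: str, value: Optional[str]) -> None:
    if value is None:
        data.pop(key, None)
    else:
        data[key] = value


class Primary:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.log: List[LogEntry] = []

    def write(self, key: str, value: Optional[str]) -> None:
        # Record the change in the log first, then apply it locally.
        self.log.append((len(self.log) + 1, key, value))
        apply_change(self.data, key, value)


class Standby:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.applied = 0  # sequence number of the last replayed entry

    def replay(self, shipped: List[LogEntry]) -> None:
        for seq, key, value in shipped:
            if seq <= self.applied:
                continue  # already replayed, so receiving an entry twice is harmless
            apply_change(self.data, key, value)
            self.applied = seq
```

After `standby.replay(primary.log)` the standby's data matches the primary's as of the last shipped entry. Changes flow in one direction only, which is why a standby like this serves reads or failover rather than accepting its own writes.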