Postgres replication between 2 databases on same server - postgresql

I need to create a replica of an existing database that would copy every modifying operation from master to slave, i.e. create a mirror of sorts. I found a lot of examples on the web, but they all describe the process when master and slave are on different servers.
I would like to create a write replica on the same server where the master is located, without spinning up a second instance of Postgres.
Is it possible to do so, and could you point me in a direction where I could find a solution for how to do it?
Thank you.
P.S. I understand that replication on 2 servers is better, but I just need to do it on one common server.

If you want physical replication, you will need to run two instances of PostgreSQL. If they are on the same server machine, they will need to have different port numbers. The differing port numbers are the only complexity; otherwise it is just like running on two different servers.
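A minimal sketch of what that could look like, assuming a replication user repuser already exists on the primary and /var/lib/pgsql/replica is a placeholder path (the primary also needs pg_hba.conf to allow replication connections):

# Clone the primary's data directory and write standby connection settings
pg_basebackup -h localhost -p 5432 -U repuser -D /var/lib/pgsql/replica -R
# Give the standby its own port before starting it
echo "port = 5433" >> /var/lib/pgsql/replica/postgresql.conf
pg_ctl -D /var/lib/pgsql/replica start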
If you want logical replication, you can do that within a single instance, but you will need to jump through some hoops to create the subscription intra-instance, as described in the "Notes" section of the CREATE SUBSCRIPTION documentation.
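For reference, the documented hoop is that CREATE SUBSCRIPTION would hang while trying to create its own replication slot on the same instance, so the slot has to be created manually first. A rough sketch, with src and dst as placeholder database names:

-- In the source database "src"
CREATE PUBLICATION mypub FOR ALL TABLES;
SELECT pg_create_logical_replication_slot('mysub', 'pgoutput');

-- In the destination database "dst" on the same instance
CREATE SUBSCRIPTION mysub
    CONNECTION 'dbname=src host=localhost port=5432'
    PUBLICATION mypub
    WITH (create_slot = false);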

You could consider using a simple trigger to insert/update/delete data in the other database as soon as the main one gets modified.
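Note that a trigger cannot write into another database directly; it needs something like the dblink extension. A hypothetical sketch covering inserts only, assuming a table users(id int, name text) exists in both databases and the copy is named "replica":

CREATE EXTENSION IF NOT EXISTS dblink;

-- Push every new row in "users" into the same table in the replica database
CREATE OR REPLACE FUNCTION mirror_users_insert() RETURNS trigger AS $$
BEGIN
    PERFORM dblink_exec(
        'dbname=replica host=localhost',
        format('INSERT INTO users (id, name) VALUES (%s, %L)',
               NEW.id, NEW.name));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_mirror
    AFTER INSERT ON users
    FOR EACH ROW EXECUTE FUNCTION mirror_users_insert();

Updates and deletes would need analogous triggers, and dblink opens a connection per statement, so treat this as a sketch of the idea rather than a production setup.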
A more "professional" way would be to use synchronous replication.

Related

Can A Postgres Replication Publication And Subscription Exist On The Same Server

I have a request asking for a read-only schema replica for a role in PostgreSQL. After reading the documentation and better understanding replication in PostgreSQL, I'm trying to identify whether or not I can create the publisher and subscriber within the same database.
Any thoughts on the best approach without having a second server would be greatly appreciated.
You asked two different questions. Same database? No. Since pub/sub requires tables to have the same name (including schema) on both ends, you would be trying to replicate a table onto itself. Using logical replication plugins other than the built-in one might get around this restriction.
Same server? Yes. You can replicate between two databases of the same instance (but see the note in the docs about some extra hoops you need to jump through) or between two instances on the same host. So whichever of those things you meant by "same server", yes, you can.
But it seems like an odd way to do this. If the access is read only, why does it matter whether it is to a replica of the real data or to the real data itself?

Postgres architecture for one machine with several apps

I have one machine on which several applications are hosted. Applications work on separated data and don't interact - each application only needs access to its own data. I want to use PostgreSQL as RDBMS. Which one of the following is best and why?
1. One global Postgres server, one global database, one schema per application.
2. One global Postgres server, one database per application.
3. One Postgres server per application.
Feel free to suggest additional architectures if you think they would be better than the ones above.
The question you need to ask yourself: does any application ever need to access data from another application (in the same SQL statement)? If you can answer that with a clear NO, then you should at least go for separate databases. Cross-database queries aren't that straightforward in Postgres, so if the different applications do need a lot of data from the other applications, then solution 1 might be a deployment layout to think about. If this would only concern very few tables, then using foreign data wrappers with separate databases might still be the better solution.
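As an illustration of the foreign data wrapper option, here is a sketch using postgres_fdw with made-up names (app_b, app_b_reader, orders):

-- In database app_a, expose a table that lives in database app_b
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER app_b_srv
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', port '5432', dbname 'app_b');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER app_b_srv
    OPTIONS (user 'app_b_reader', password 'secret');

-- The imported table can then be queried and joined like a local one
IMPORT FOREIGN SCHEMA public LIMIT TO (orders)
    FROM SERVER app_b_srv INTO public;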
Solution 2 and 3 are more or less the same from the perspective of each application. One thing to keep in mind when deciding between 2 and 3 is availability. Some configuration changes to Postgres require a restart of the service. Is an outage of all applications acceptable in that case, even though the change was only necessary for one?
But you can always start with option 2 and then move databases to different servers later.
Another question to ask is whether all applications will always use the same (major) Postgres version. With solution 2 you must make sure that all applications are compatible with a new Postgres version if one of them wants to upgrade, e.g. because of new features that the application wants to use.
Solution 1 is stupid: a SQL schema is not a database. Use SQL schemas for one application that has multiple "parts" like "production", "sales", "marketing", "finances"...
As long as the final volume of data won't be too heavy and the number of users won't be too high, use only one PG cluster to facilitate administration tasks.
If the volume of data or the number of users increases, it will be time to separate your different databases onto new, distinct PG clusters...

Replicating data between 2 Mongo replica sets

I currently have a mongo replica set consisting of 1 primary and 2 slaves that is used by a read-only application. I'm adding a 2nd read-only application that requires access to the same data. I am considering using the same RS for both applications, but was wondering if there's a way to create a specific type of configuration with Mongo that works something like this:
1 x primary, that handles all writes, but is not seen as part of a replica set by the applications, and then 2 sets of read-only secondaries that replicate from the primary. Each set of secondaries replicates writes from the master. Conceptually, something like:
          /----> RS1: |Secondary1|Secondary2|..|SecondaryN| <--- App1
PRIMARY =>
          \----> RS2: |Secondary1|Secondary2|..|SecondaryN| <--- App2
Is this sort of configuration possible at all? What similar architectures could I consider for this use-case?
Thanks in advance.
Brett
I came across a way to implement this using mongo tooling:
1. Create a replica set to use as the master. Data updates are written to this RS (represented by "PRIMARY" in the diagram). Do not enable authentication on this host.
2. Create 2 replica sets with the same data, completely independent of each other.
3. Schedule regular "mongooplog" runs, using #1 as the --from host and each of the RSs from #2 as the --host target (see the manual; a sketch follows this list).
4. Set up authentication on the RSs from #2 so that applications only get read access to the data.
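For illustration, the scheduled runs from step 3 might look roughly like this (hostnames are placeholders; note that mongooplog was deprecated in MongoDB 3.2 and later removed, so this only applies to older tooling):

mongooplog --from primary.example.com:27017 --host rs1.example.com:27017
mongooplog --from primary.example.com:27017 --host rs2.example.com:27017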
I haven't tested this yet, but from what I can tell, this approach should work for my objectives. Is there anything I've overlooked?
Edit:
While trying to use this approach, I ran into issues when using mongooplog with authentication on the destination. Over and above that, mongooplog doesn't cater for authentication on the source / --from RS, so I wrote a tool to cater for this:
https://github.com/brettcave/mongo-oplogreplay
It supports authentication on both source and destination, as well as replicaset targets.

Consolidating shard data into single persistent DB in MongoDB

We have software that generates a large amount of data in a short period of time, which is stored in a single MongoDB database. To increase write performance we are looking into setting up a sharded cluster to handle the incoming data. Because this is all being done on Amazon EC2 instances, we would prefer to consolidate the data from the sharded cluster onto a single persistent server once the process is done, to save on cost. Obviously we can write a Python script that will port the data off the cluster when done, but I am hoping there is a cleaner, more automated method. Once the data has been written, the access is all read-only, and a single server can handle the workload sufficiently. I was looking for some solution combining replica sets and sharding, but that doesn't seem to be the way those work. Any suggestions for how to best implement this architecture?
One way to migrate a MongoDB with zero downtime is to create a replica set consisting of the old and the new servers and to remove the old ones as soon as the new ones have synced. But that doesn't work when the old database is sharded and the new one isn't, because shards are built from replica sets, not the other way around. That means you have to copy the database the old-fashioned way. There are two methods to do this:
The network method: Use the command db.copyDatabase(<remote_db_name>, <local_db_name>, <remote_host>, <remote_username>, <remote_password>) on the destination to copy the database from the source via the network.
The file method: Do a mongodump on the source to export the data to a file. Then do a mongorestore on the new server to import it.
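A sketch of the file method with placeholder hostnames and a database called mydb (flag spellings may vary between MongoDB versions):

# Export from the old (sharded) cluster via a mongos
mongodump --host oldcluster.example.com --port 27017 --db mydb --out /tmp/dump
# Import into the single persistent server
mongorestore --host persistent.example.com --port 27017 --db mydb /tmp/dump/mydb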

Replicate selected postgresql tables between two servers?

What would be the best way to replicate individual DB tables from a master PostgreSQL server to a slave machine? It can be done with cron+rsync, with whatever PostgreSQL might have built in, or with some sort of OSS tool, but so far the Postgres docs don't seem to cover how to do table replication. I'm not able to do a full DB replication because some tables have license->IP stuff connected to them, and I can't replicate those on the slave machine. I don't need instant replication; hourly would be acceptable as well.
If I just need to rsync, can someone help identify which files within the /var/lib/pgsql directory would need to be synced, or how I would know which tables they belong to?
Starting with Postgres 10, logical replication is built into Postgres! This is often a better solution than external tools. The Postgres docs are great and easy to follow. It's very easy. See the quick setup docs, which in essence boil down to running this:
-- On publisher DB
CREATE PUBLICATION mypub FOR TABLE users, departments;
-- On subscriber DB
CREATE SUBSCRIPTION mysub CONNECTION 'dbname=foo host=bar user=repuser' PUBLICATION mypub;
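Note that for this to work the publisher instance must run with wal_level = logical set in postgresql.conf, and the connection string shown here uses placeholder values.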
You might want to try Bucardo, an open-source tool that synchronizes rows between tables even if they are in remote locations. It's a very simple piece of software, and it is capable of creating one-way synchronization relationships as well.
Check out http://bucardo.org/wiki/Bucardo
You cannot get anything useful by copying individual table files in the data directory. If you want to replicate selected tables, there are a number of good options.
http://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling