Workaround for excluding a table in streaming replication in PostgreSQL

I have 2 database nodes working as master-slave with streaming replication in place. In one of our use cases, we need to exclude a table from getting replicated to the slave. Is there a way or a workaround to exclude a table from getting copied to the slave if I have to stay on this WAL-based streaming replication?

It is not possible to do this with physical replication. You could create a role on the master with no privileges on this table, then only allow that role to connect on the replica. This would require you to trust the admin of the replica to respect your wishes, and it wouldn't help if the goal is to reduce the size of the replica.
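A minimal sketch of that privilege-based workaround, assuming the excluded table is public.secret_stuff and the replica-only role is replica_reader (both names are hypothetical):

```sql
-- On the master: a login role that may read everything except the excluded table.
CREATE ROLE replica_reader LOGIN PASSWORD 'change_me';
GRANT USAGE ON SCHEMA public TO replica_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replica_reader;
REVOKE ALL ON public.secret_stuff FROM replica_reader;
```

The role and its privileges replicate to the standby along with everything else; on the replica you then allow only replica_reader to connect (for example via pg_hba.conf entries that list just that role). The table itself is still physically present on the replica.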

Related

Postgres Streaming and Logical Replication

Currently we are using Postgres streaming replication to sync/replicate databases on the primary and replica server. We are planning to use an application to sync data from our secondary (replica) server to our data warehouse, which needs logical replication to be enabled for tracking the changes and syncing the data from our replica server to the data warehouse. Can we enable logical replication on top of streaming replication? Is it possible or good practice to enable both on the same server or database? If so, will there be any performance impact, and what are the considerations or best practices to be followed?
There is no problem with using streaming (physical) replication and logical replication at the same time, but a physical standby server cannot be a logical primary. So you will have to use the same server as the primary for both physical and logical replication. That shouldn't be a problem, since the streaming replication primary and standby are physically identical anyway.
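A rough sketch of that setup, assuming a table named orders should flow to the data warehouse (table, publication, and connection details are placeholders):

```sql
-- On the primary: allow logical decoding and publish the table.
ALTER SYSTEM SET wal_level = 'logical';   -- takes effect after a restart
CREATE PUBLICATION warehouse_pub FOR TABLE orders;

-- On the data warehouse: subscribe to the primary (not to the physical standby).
CREATE SUBSCRIPTION warehouse_sub
    CONNECTION 'host=primary.example.com dbname=app user=repl password=change_me'
    PUBLICATION warehouse_pub;
```

The physical standby keeps streaming from the same primary as before; only the logical subscription has to point at the primary.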

Do I need to archive postgres WAL records if I am already streaming them to a standby server?

I have a postgres master node which is streaming WAL records to a standby slave node. The slave database runs in read only mode and has a copy of all data on the master node. It can be switched to master by creating a recovery.conf file in /tmp.
On the master node I am also archiving WAL records. I am just wondering if this is necessary if they are already streamed to the slave node? The archived WAL records are 27GB at this point. The disk will fill eventually.
A standby server is no backup; it only protects you from hardware failure on the primary.
Just imagine that somebody by mistake deletes data or drops a table; you won't be able to recover from that without a backup.
Create a job that regularly cleans up archived WALs if they exceed a certain age.
Once you have a full backup, you can purge the WAL files that precede it.
The idea is to preserve the WAL files for point-in-time recovery (PITR) in case your server crashes.
If your primary server crashes, you can certainly promote your hot standby to primary, but then you have to build another server as the new hot standby. Typically you don't want to build it using streaming replication.
You will use a full backup plus the archived WAL to build that server, rather than relying on streaming replication.
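Before purging, it helps to check what has actually been archived and which WAL any replication slot still needs; a minimal sketch of those checks on the primary:

```sql
-- Archiver progress: last WAL segment successfully copied to the archive.
SELECT archived_count, last_archived_wal, last_archived_time
FROM pg_stat_archiver;

-- Oldest WAL each replication slot still requires; never purge past these.
SELECT slot_name, restart_lsn
FROM pg_replication_slots;
```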

What is the consistency of a PostgreSQL HA cluster with Patroni?

My understanding is that because failover uses a consensus store (etcd or ZooKeeper), the system will stay consistent under a network partition.
Does this mean that transactions running under the serializable isolation level will also provide linearizability?
If not, which consistency will I get: sequential consistency, causal consistency, ...?
You shouldn't mix up consistency between the primary and the replicas with consistency within the database.
A PostgreSQL database running in a Patroni cluster is a normal database with streaming replicas, so it provides the eventual consistency of streaming replication (all replicas will eventually show the same values as the primary).
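You can watch that eventual consistency in action by checking the replication lag on the primary (column names as of PostgreSQL 10 and later):

```sql
-- Per-standby replication progress as seen from the primary.
SELECT application_name, state, sent_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```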
Serializability guarantees that you can establish an order of the database transactions that ran against the primary such that the outcome of a serial execution in that order is the same as what the workload actually produced.
If I read the definition right, that is just the same as “linearizability”.
Since only one of the nodes in the Patroni cluster can be written to (the primary), this stays true, no matter if the database is in a Patroni cluster or not.
In a distributed context, where we have multiple replicas of an object's state, a schedule is linearizable if it behaves as if all replicas were updated at once at a single point in time.
Once a write completes, all later reads (wall-clock time) from any replica should see the value of that write or the value of a later write.
Since PostgreSQL version 9.6 it is possible to have multiple synchronous standby nodes. This means that if we have 3 servers and use num_sync = 2, the primary will always wait for a write to be on the 2 standbys before committing.
This should satisfy the constraint of a linearizable schedule even across a failover.
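As a minimal sketch, assuming the two standbys report the application_name values node2 and node3 (hypothetical names), this is the corresponding setting on a plain primary; note that when Patroni's synchronous mode is enabled, Patroni manages this parameter itself:

```sql
-- Wait for each commit to be confirmed by any 2 of the listed standbys.
-- (The ANY syntax is PostgreSQL 10+; 9.6 uses the form '2 (node2, node3)'.)
ALTER SYSTEM SET synchronous_standby_names = 'ANY 2 (node2, node3)';
SELECT pg_reload_conf();
```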
Since version 1.2 of Patroni, when synchronous mode is enabled, Patroni will automatically fail over only to a standby that was synchronously replicating at the time of the master failure.
This effectively means that no user visible transaction gets lost in such a case.

In master-master/multi-master replication, who is the secondary?

Silly question, when we talk about secondaries in the context of failover behaviour, with regards to master-master/multi-master, is that basically any node other than the one that we are currently reading from or writing to?
In master-master replication, both nodes are both primary and secondary. In multi-master replication, every node is a secondary, and some or all of them are also primaries.
Multi-master means there are many database servers that can accept writes. In order to stay in sync, each server has to read and apply the writes from all the other nodes, and in that role it behaves as a secondary. In master-slave replication we have only one master and many slaves. The master is the only node that accepts writes, so it has nothing to apply from anyone else and behaves as a primary only.
For example, MySQL 5.6 replication supports master-master replication but not multi-master replication, while MySQL 5.7 also supports multi-master replication. MongoDB only supports master-slave replication.

MongoDB replica set secondary node data loss

I have two mongod instances without replication, each having the same collection name but different data. I then initialized replication between them. The secondary machine copied all the data from the primary machine and lost its original data. Can I recover the original data that was present on the secondary machine?
This is the expected behaviour with MongoDB replica sets: data from the primary is replicated to the secondaries. When you add a server as a new secondary, it does an "initial sync" which copies data from the primary. Replica sets are designed for failover and redundancy; your secondary nodes should have data consistent with the primary, allowing for their current replication lag.
If you have overwritten your previous database, your only option is to restore from a backup.
See also:
Backup and Restoration Strategies
Replica Set Internals Part V: Initial Sync