Is there a standard way of estimating what total bandwidth requirements are between a primary and replica using streaming replication?
I understand we can take the size and number of the WAL files, but I also understand that with streaming replication the data is propagated before a WAL file is filled, so I assume there is some streamed + WAL file calculation to perform.
Short of tracking the data at the network level, is there a rough way to calculate the requirement?
Thanks
Simon
With streaming replication, the WAL is not transferred by shipping complete log files, but it is still the same WAL. So the amount of WAL written to the log files is a good measure of the required bandwidth.
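One rough way to measure that, short of the network level: sample the current WAL insert position twice and divide the byte difference by the interval. A minimal sketch (these are the PostgreSQL 10+ function names; older releases call them pg_current_xlog_location() and pg_xlog_location_diff(), and the LSNs shown are made-up examples):
-- First sample of the WAL insert position:
SELECT pg_current_wal_lsn();   -- e.g. 0/5A000000
-- ...wait a representative interval, say 60 seconds, then sample again:
SELECT pg_current_wal_lsn();   -- e.g. 0/5D3F2A00
-- Bytes of WAL generated per second ~ required replication bandwidth:
SELECT pg_wal_lsn_diff('0/5D3F2A00', '0/5A000000') / 60 AS wal_bytes_per_second;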
I am creating a replication slot and streaming changes from AWS PostgreSQL RDS to a Java process through the JDBC driver.
My replication slot creation code looks like this:
final ReplicationSlotInfo replicationSlotInfo = pgConnection.getReplicationAPI()
.createReplicationSlot()
.logical()
.withSlotName(replicationSlotName)
.withOutputPlugin("wal2json")
.make();
and I get the replication stream using the following code:
PGReplicationStream stream = pgConnection.getReplicationAPI()
        .replicationStream()
        .logical()
        .withSlotName(replicationSlotName)
        .withSlotOption("include-xids", true)
        .withSlotOption("include-timestamp", true)
        .withSlotOption("pretty-print", false)
        .withSlotOption("add-tables", "public.users")
        .withStatusInterval(10, TimeUnit.SECONDS)
        .start();
When the replicator Java process is not running, the WAL size keeps growing. Here is the query I use to find the replication lag:
SELECT
slot_name,
pg_size_pretty(pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn)) AS replicationSlotLag,
active
FROM
pg_replication_slots;
Output:
    slot_name     | replicationslotlag | active
------------------+--------------------+--------
 data_stream_slot | 100 GB             | f
This replication lag grows beyond the RDS disk size, which eventually shuts the RDS instance down.
I thought wal_keep_segments, which was set to 32, would take care of this, but it did not.
Is there any other property I have to set to avoid this situation, even when the Java replication process is not running?
There is a proposal to allow a logical replication slot's WAL retention to be limited. I think that is just what you need, but it is not clear when/if it will become available.
In the meantime, all you can do is monitor the situation and then drop the slot if it starts to fall too far behind. Of course this does mean you will have a problem re-establishing synchronization later, but there is no way around that (other than fixing whatever it is that is causing the replication process to go away and/or fall behind).
Since you say the Java process is not running, dropping the slot is easy to do. If it were running, but just not keeping up, then you would have to do the sad little dance where you kill the WAL sender, then try to drop the slot before it gets restarted (and I don't know how you would do that on RDS).
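Dropping the slot is a single call, using the slot name from your output (this irrevocably releases the retained WAL, so the consumer cannot resume from where it left off):
SELECT pg_drop_replication_slot('data_stream_slot');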
wal_keep_segments is only applicable to physical replication, not logical replication. And it is meant for use instead of slots, not in addition to them: if you have both, WAL is retained until both criteria are met. Indeed, that is the problem you are facing; logical replication cannot be done without slots the way physical replication can.
wal_keep_segments is irrelevant for logical decoding.
With logical decoding, you always have to use a logical replication slot, which is a data structure that marks a position in the transaction log (WAL), so that the server never discards old WAL segments that logical decoding might still need.
That is why your WAL directory grows if you don't consume the changes.
wal_keep_segments specifies a minimum number of old WAL segments to retain. It is used for purposes like streaming replication, pg_receivewal or pg_rewind.
wal_keep_segments specifies the minimum number of segments PostgreSQL should keep in the pg_xlog directory. There can be a few reasons why PostgreSQL doesn't remove old segments:
There is a replication slot at a WAL location older than the WAL files, you can check it with this query:
SELECT slot_name,
lpad((pg_control_checkpoint()).timeline_id::text, 8, '0') ||
lpad(split_part(restart_lsn::text, '/', 1), 8, '0') ||
lpad(substr(split_part(restart_lsn::text, '/', 2), 1, 2), 8, '0')
AS wal_file
FROM pg_replication_slots;
WAL archiving is enabled and archive_command fails. Please check the PostgreSQL logs in this case (see also the pg_stat_archiver query after this list).
There was no checkpoint for a long time.
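For the archiving case, the pg_stat_archiver view (available since PostgreSQL 9.4) shows whether archive_command has been failing and on which file:
SELECT archived_count, last_archived_wal,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;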
We had a severe slowdown of our applications in our HADR environment. We are seeing the following when we run db2pd -hadr:
HADR_FLAGS = STANDBY_RECV_BLOCKED
STANDBY_RECV_BUF_PERCENT = 100
STANDBY_SPOOL_PERCENT = 100
These recovered later and seem better now, with STANDBY_SPOOL_PERCENT coming down gradually. Can you please help me understand the implications of the above values and what needs to be done to ensure we don't get into such a situation again?
This issue is most likely triggered by a peak amount of transactions occurring on the primary: the standby's receive buffer and spool got saturated. Unless you are running with the configuration parameter HADR_SYNCMODE set to SUPERASYNC, you can fall into this situation. The application slowdown was induced by the primary waiting for an acknowledgement from the standby that it had received the log data; since the standby's spool and receive buffers were full at the time, it was delaying this acknowledgement.
You could consider setting HADR_SYNCMODE to SUPERASYNC, but this would also make the system more vulnerable to data loss should there be a failure on the primary. To absorb these temporary peaks, you can make either of the following configuration changes (see the sketch after this list):
Increase the size of the log receive buffer on the standby database by modifying the value of the DB2_HADR_BUF_SIZE registry variable.
Enable log spooling on the standby database by setting the HADR_SPOOL_LIMIT database configuration parameter.
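As a rough sketch of those two changes (SAMPLE and the sizes are placeholders; both values are in units of 4 KB pages, and the registry variable only takes effect after an instance restart):
db2set DB2_HADR_BUF_SIZE=16384
db2 "UPDATE DB CFG FOR SAMPLE USING HADR_SPOOL_LIMIT 100000"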
For further details, you can refer to the HADR Performance Guide.
I've been reading through the WAL chapter of the Postgres manual and was confused by a portion of the chapter:
Using WAL results in a significantly reduced number of disk writes, because only the log file needs to be flushed to disk to guarantee that a transaction is committed, rather than every data file changed by the transaction.
How is it that continuously writing to the WAL is more performant than simply writing to the table/index data itself?
As I see it (forgetting for now the resiliency benefits of WAL), Postgres needs to complete two disk operations: first it needs to commit to the WAL on disk, and then it still needs to change the table data to be consistent with the WAL. I'm sure there's a fundamental aspect of this I've misunderstood, but it seems like adding an additional step between a client transaction and the final state of the table data couldn't actually increase overall performance. Thanks in advance!
You are fundamentally right: the extra writes to the transaction log will per se not reduce the I/O load.
But a transaction will normally touch several files (tables, indexes etc.). If you force all these files out to storage (“sync”), you will incur more I/O load than if you sync just a single file.
Of course all these files will have to be written and sync'ed eventually (during a checkpoint), but often the same data are modified several times between two checkpoints, and then the corresponding files will have to be sync'ed only once.
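As a hypothetical illustration (the table and values are made up; assume accounts has two indexes, so the UPDATE dirties pages in three different files):
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
COMMIT;
-- At COMMIT, only the WAL is flushed: one sequential write to one file.
-- The dirty heap and index pages stay in shared buffers and are written
-- out later, at checkpoint time, even if the same rows were modified
-- many times in between.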
I am planning to migrate my production Oracle cluster to a PostgreSQL cluster. The current system supports 2000 TPS, and in order to sustain that I would be very thankful if someone could clarify the points below.
1) What is the best replication strategy (streaming or DRBD-based replication)?
2) In streaming replication, can the master process traffic without the slave, and when the slave comes back up, does it get what it lost during the downtime?
About TPS: it depends mainly on your hardware and PostgreSQL configuration. I already wrote about it on Stack Overflow in this answer. You cannot expect magic on some notebook-like configuration. Here is my text "PostgreSQL and high data load application".
1) Streaming replication is the simplest and almost "painless" solution, so if you want to start quickly I highly recommend it.
2) Yes, but you have to archive the WAL. See below.
All this being said, here are links I would recommend you read:
how to set streaming replication
example of WAL log archiving script
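For orientation, WAL archiving boils down to a couple of settings on the master. A minimal sketch (/mnt/server/archivedir is a placeholder path, the cp-based command is only suitable for simple setups, releases before 9.6 spell the wal_level value hot_standby, and changes to archive_mode and wal_level require a server restart):
ALTER SYSTEM SET wal_level = replica;
ALTER SYSTEM SET archive_mode = on;
ALTER SYSTEM SET archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f';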
But of course streaming replication has some caveats which you should know about:
problem with increasing some parameters like max_connections
how to add new disk and new tablespace to master and replicas
There is no “best solution” in this case. Which solution you pick depends on your requirements.
Do you need a guarantee of no data loss?
How big a performance hit can you tolerate?
Do you need failover or just a backup?
Do you need PITR (Point In Time Recovery)?
By default, I think a failed slave will be ignored. Depending on your configuration, the slave might take a long time to recover after e.g. a restart.
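To see whether the standby has reconnected after such an outage and how far behind it is, you can check pg_stat_replication on the primary. A sketch using the PostgreSQL 10+ column names (9.x spells them sent_location/replay_location and uses pg_xlog_location_diff):
-- Run on the primary; one row per connected standby.
SELECT client_addr, state, sent_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;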
I'd recommend you read https://www.postgresql.org/docs/10/static/different-replication-solutions.html