Postgres Streaming Replication Bandwidth Estimation

Is there a standard way of estimating what total bandwidth requirements are between a primary and replica using streaming replication?
I understand we can take the size and number of the WAL files, but I also understand that with streaming replication, data is propagated ahead of the WAL file being filled, so I assume there is a streamed + WAL-file type calculation to perform.
Short of tracking the data at the network level is there a rough way to calculate the requirement?
Thanks
Simon

With streaming replication, the WAL is not transferred by shipping complete log files, but it is still the same WAL. So the amount of WAL written to the log files is a good estimate of the required bandwidth.
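As a rough sketch of that calculation, you can sample pg_current_wal_lsn() twice and convert the LSN difference into a bytes-per-second rate (the LSN values below are made up; on the server itself you could also let pg_wal_lsn_diff() do the subtraction):

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' into an absolute byte offset."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def wal_bytes_per_second(lsn_start: str, lsn_end: str, interval_s: float) -> float:
    """Average WAL generation rate between two samples of pg_current_wal_lsn()."""
    return (lsn_to_bytes(lsn_end) - lsn_to_bytes(lsn_start)) / interval_s

# Two hypothetical samples taken an hour apart:
rate = wal_bytes_per_second("16/B374D848", "16/D5A11000", 3600)
print(f"{rate / 1024:.0f} KiB/s of WAL, a floor for the replication bandwidth")
```

This gives a lower bound: streaming adds a little protocol overhead on top of the raw WAL bytes.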

Related

Postgres replication: size sent during streaming replication

I would like to ask a question about PostgreSQL 12 streaming replication. I just want to know the size of the files/data/packets sent out through replication, with the standard configuration of 16 MB per segment. Does it send out exactly 16 MB per transaction, or less? Are there any tools to find this out other than tcpdump or Wireshark? Thanks.
Whenever a WAL record is written to the WAL buffers shared memory area, it will be streamed to the standby immediately. WAL records have variable size. WAL can be streamed to the standby even before the transaction ends.

How does write ahead logging improve IO performance in Postgres?

I've been reading through the WAL chapter of the Postgres manual and was confused by a portion of the chapter:
Using WAL results in a significantly reduced number of disk writes, because only the log file needs to be flushed to disk to guarantee that a transaction is committed, rather than every data file changed by the transaction.
How is it that continuously writing to the WAL is more performant than simply writing to the table/index data itself?
As I see it (forgetting for now the resiliency benefits of WAL), Postgres needs to complete two disk operations: first it needs to commit to the WAL on disk, and then it still needs to change the table data to be consistent with the WAL. I'm sure there's a fundamental aspect of this I've misunderstood, but it seems like adding an additional step between a client transaction and the final state of the table data couldn't actually increase overall performance. Thanks in advance!
You are fundamentally right: the extra writes to the transaction log will per se not reduce the I/O load.
But a transaction will normally touch several files (tables, indexes etc.). If you force all these files out to storage (“sync”), you will incur more I/O load than if you sync just a single file.
Of course all these files will have to be written and sync'ed eventually (during a checkpoint), but often the same data are modified several times between two checkpoints, and then the corresponding files will have to be sync'ed only once.
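A toy back-of-the-envelope model of that effect (it ignores caching, group commit, and full-page writes, so the numbers are only illustrative):

```python
def syncs_without_wal(transactions: int, files_per_txn: int) -> int:
    # Each commit would have to force every touched file out to storage.
    return transactions * files_per_txn

def syncs_with_wal(transactions: int, files_per_txn: int, checkpoints: int) -> int:
    # One WAL flush per commit, plus each touched file synced once per checkpoint.
    return transactions + checkpoints * files_per_txn

# 10,000 transactions each touching 5 files, with 2 checkpoints in the window:
print(syncs_without_wal(10_000, 5))   # 50000 forced writes
print(syncs_with_wal(10_000, 5, 2))   # 10010 forced writes
```

The WAL flushes are also sequential appends to a single file, which is a much cheaper I/O pattern than scattered random writes to many data files.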

Avoiding small files from Kafka connect using HDFS connector sink in distributed mode

We have a topic with messages at a rate of 1 msg per second and 3 partitions, and I am using the HDFS connector to write the data to HDFS in Avro format (the default). It generates files with sizes in the KBs, so I tried altering the following properties in the HDFS connector configuration:
"flush.size":"5000",
"rotate.interval.ms":"7200000"
but the output is still small files. So I need clarity on the following things to solve this issue:
Is the flush.size property mandatory? In case we do not mention the flush.size property, how does the data get flushed?
If we mention the flush size as 5000 and the rotate interval as 2 hours, it flushes the data every 2 hours for the first 3 intervals, but after that it flushes data at seemingly random times. Please find the timings of the file creation (19:14, 21:14, 23:15, 01:15, 06:59, 08:59, 12:40, 14:40); the mismatched intervals stand out. Is it because of the overriding of the properties mentioned? That takes me to the third question.
What is the precedence for flushing if we mention all of the following properties: flush.size, rotate.interval.ms, rotate.schedule.interval.ms?
Increasing the rate of messages and reducing the number of partitions actually shows an increase in the size of the data being flushed. Is that the only way to have control over the small files? How can we handle these properties if the rate of the input events varies and is not stable?
It would be a great help if you could share documentation on handling small files in Kafka Connect with the HDFS connector. Thank you.
If you are using a TimeBasedPartitioner and the messages do not consistently have increasing timestamps, then you will end up with a single writer task dumping a file whenever it sees a message with a lesser timestamp within the rotate.interval.ms window of any given record.
If you want to have consistent bihourly partition windows, then you should be using rotate.interval.ms=-1 to disable it, then rotate.schedule.interval.ms to some reasonable number that is within the partition duration window.
E.g. you have 7200 messages every 2 hours, and it's not clear how large each message is, but let's say 1MB. Then, you'd be holding ~7GB of data in a buffer, and you need to adjust your Connect heap sizes to hold that much data.
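For example, a sink configuration pinned to scheduled rotation only might include properties like these (the values are illustrative; note that scheduled rotation requires a timezone to be configured, and flush.size is set high so the schedule fires first):

```json
"rotate.interval.ms": "-1",
"rotate.schedule.interval.ms": "7200000",
"flush.size": "1000000",
"timezone": "UTC"
```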
The order of precedence is:
scheduled rotation, starting from the top of the hour
flush size or "message-based" time rotation, whichever occurs first, or a record being seen as "before" the start of the current batch
And I believe flush size is mandatory for the storage connectors.
Overall, systems such as Uber's Hudi or the previous Kafka-HDFS tool of Camus Sweeper are more equipped to handle small files. Connect Sink Tasks only care about consuming from Kafka, and writing to downstream systems; the framework itself doesn't recognize Hadoop prefers larger files.

PostgreSQL streaming replication for high load

I am planning to migrate my production Oracle cluster to a PostgreSQL cluster. The current system supports 2000 TPS, and in order to support that TPS, I would be very thankful if someone could clarify the questions below.
1) What is the best replication strategy (streaming or DRBD-based replication)?
2) In streaming replication, can the master process traffic without the slave, and when the slave comes up, does it get what was lost during the downtime?
About TPS: it depends mainly on your hardware and PostgreSQL configuration. I already wrote about it on Stack Overflow in this answer. You cannot expect magic on some notebook-like configuration. Here is my text "PostgreSQL and high data load application".
1) Streaming replication is the simplest and almost "painless" solution. So if you want to start quickly I highly recommend it.
2) Yes, but you have to archive WAL logs. See below.
All this being said, here are the links I would recommend you read:
how to set streaming replication
example of WAL log archiving script
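A minimal sketch of WAL archiving on the primary, in postgresql.conf (the archive directory is illustrative, and wal_level defaults differ between major versions):

```ini
wal_level = replica
archive_mode = on
# Copy each finished WAL segment, refusing to overwrite an existing file:
archive_command = 'test ! -f /mnt/archive/%f && cp %p /mnt/archive/%f'
```

With the segments archived, a standby that was down can replay the archived WAL to catch up even after the primary has recycled its own WAL files.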
But of course streaming replication has some caveats which you should know:
problem with increasing some parameters like max_connections
how to add new disk and new tablespace to master and replicas
There is no “best solution” in this case. Which solution you pick depends on your requirements.
Do you need a guarantee of no data loss?
How big a performance hit can you tolerate?
Do you need failover or just a backup?
Do you need PITR (Point In Time Recovery)?
By default, I think a failed slave will simply be ignored by the master. Depending on your configuration, the slave might take a long time to recover after e.g. a reboot.
I'd recommend you read https://www.postgresql.org/docs/10/static/different-replication-solutions.html

How to modify the configuration of Kafka to process large amount of data

I am using kafka_2.10-0.10.0.1. I have two questions:
- I want to know how I can modify the default configuration of Kafka to process large amount of data with good performance.
- Is it possible to configure Kafka to process the records in memory without storing in disk?
thank you
Is it possible to configure Kafka to process the records in memory without storing in disk?
No. Kafka is all about storing records reliably on disk, and then reading them back quickly off of disk. In fact, its documentation says:
As a result of taking storage seriously and allowing the clients to control their read position, you can think of Kafka as a kind of special purpose distributed filesystem dedicated to high-performance, low-latency commit log storage, replication, and propagation.
You can read more about its design here: https://kafka.apache.org/documentation/#design. The implementation section is also quite interesting: https://kafka.apache.org/documentation/#implementation.
That said, Kafka is also all about processing large amounts of data with good performance. In 2014 it could handle 2 million writes per second on three cheap instances: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines. More links about performance:
https://docs.confluent.io/current/kafka/deployment.html
https://www.confluent.io/blog/optimizing-apache-kafka-deployment/
https://community.hortonworks.com/articles/80813/kafka-best-practices-1.html
https://www.cloudera.com/documentation/kafka/latest/topics/kafka_performance.html
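As a starting point for the first question, producer-side throughput settings such as these are commonly tuned (the values are illustrative, not recommendations; benchmark against your own workload):

```properties
# Larger batches and a short linger let the producer amortize requests:
batch.size=131072
linger.ms=10
# Compression trades CPU for network and disk throughput:
compression.type=lz4
# acks=1 favors throughput over the durability of acks=all:
acks=1
```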