Is there any way I can speed up the rate at which the replicas fetch data from the leader?
I am using bin/kafka-producer-perf-test.sh to test the throughput of my producer, and I have set a client quota of 50 MB/s. Without any replicas I get a throughput of ~50 MB/s, but when the replication factor is set to 3 it drops to ~30 MB/s.
There is no other traffic on the network, so I am not sure why things are slowing down. Is there some parameter, like replica.socket.receive.buffer.bytes or replica.fetch.min.bytes, that needs to be tuned to achieve high throughput? How can I speed up my replicas?
Increasing the value of num.replica.fetchers should help. It is the number of threads used to replicate messages from leaders; increasing it raises the degree of I/O parallelism in the follower broker. The default value is 1.
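For example, in the brokers' server.properties (the values below are illustrative picks, not defaults to copy blindly; replica.socket.receive.buffer.bytes is one of the knobs the question already mentions):

```properties
# server.properties on the brokers -- illustrative values, tune for your cluster.
# More replica fetcher threads per source broker (default 1):
num.replica.fetchers=4
# Bigger socket receive buffer for the replica fetchers (default 65536):
replica.socket.receive.buffer.bytes=1048576
```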
Related
We have too many partitions per topic, and it will be a long time before we can attend to the issue.
In the meantime we are trying to mitigate by decreasing the replication factor.
Will it affect the partition count per broker?
Decreasing the replica count will reduce the number of file handles on the brokers, but it will not affect the partition count of the topics.
Partition counts cannot be reduced.
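Decreasing the replication factor is done with a partition reassignment. A sketch under assumptions (the topic name, partition list, and broker ids 1–3 are placeholders for your cluster; older clusters take --zookeeper instead of --bootstrap-server):

```
# decrease-rf.json lists the desired, smaller replica set per partition.
cat > decrease-rf.json <<'EOF'
{
  "version": 1,
  "partitions": [
    { "topic": "my-topic", "partition": 0, "replicas": [1, 2] },
    { "topic": "my-topic", "partition": 1, "replicas": [2, 3] }
  ]
}
EOF
bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
  --reassignment-json-file decrease-rf.json --execute
```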
I understand that ideally tasks.max = # of partitions for max throughput. But how many tasks per CPU core is ideal?
I am trying to improve Kafka producer throughput. We have CSV reports which are processed and published to a Kafka topic. With the default Kafka settings we get on average 300-500 kbps of throughput. To improve it I have tried some combinations of linger.ms and batch.size, but it is not helping.
Tried with:
"linger.ms=30000", "batch.size=1000000", "buffer.memory=16777216"
"linger.ms=40000", "batch.size=1500000", "buffer.memory=16777216"
but throughput still stayed around 150-200 kbps.
I even tried with smaller linger.ms and batch.size:
linger.ms=200, batch.size=65000
but throughput just decreased further, to 100-150 kbps.
The Kafka topic has 12 partitions, acks is all, and compression is snappy.
Any suggestions are welcome.
There is a comprehensive white paper from Confluent which explains how to increase throughput and which configurations to look at.
Basically, you have already taken the right steps by increasing batch.size and tuning linger.ms. Depending on how much potential data loss you can tolerate, you may also reduce retries. As an important factor for increasing throughput, you should set a compression.type in your producer while at the same time keeping compression.type=producer at the broker level.
Remember that Kafka scales with the partitions and this can only happen if you have enough brokers in your cluster. Having many partitions, all located on the same broker will not increase throughput.
To summarize, the white paper mentions the following producer configurations to increase throughput:
batch.size: increase to 100000 - 200000 (default 16384)
linger.ms: increase to 10 - 100 (default 0)
compression.type=lz4 (default none)
acks=1 (default 1)
retries=0 (default 0)
buffer.memory: increase if there are a lot of partitions (default 33554432)
Keep in mind that, in the end, each cluster behaves differently. In addition, each use case has a different message structure (volume, frequency, byte size, ...). Therefore, it is important to understand the producer configurations mentioned above and test their sensitivity on your actual cluster.
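As a minimal sketch of the summary above, here are those settings in a plain java.util.Properties object. The keys are the standard producer configuration names; the concrete values are illustrative mid-range picks from the ranges listed, not recommendations for any specific cluster:

```java
import java.util.Properties;

public class ThroughputProducerConfig {
    // Producer settings along the lines of the white-paper summary above.
    // Values are illustrative mid-range picks; tune them on your own cluster.
    public static Properties tunedForThroughput() {
        Properties props = new Properties();
        props.setProperty("batch.size", "150000");      // default 16384
        props.setProperty("linger.ms", "50");           // default 0
        props.setProperty("compression.type", "lz4");   // default none
        props.setProperty("acks", "1");                 // weaker durability, higher throughput
        props.setProperty("retries", "0");              // only if data loss is acceptable
        props.setProperty("buffer.memory", "33554432"); // increase if there are many partitions
        return props;
    }

    public static void main(String[] args) {
        tunedForThroughput().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties would be passed to the KafkaProducer constructor together with your bootstrap servers and serializers.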
Although we do not have any performance issues yet and the nodes are pretty much idle, is it advisable to increase the number of Kafka brokers (and ZooKeeper nodes) from 3 to 5 immediately to improve cluster high availability? The intention is then, of course, to increase the replication factor from 3 to 5 as a default config for critical topics.
If a high level of data replication is essential for your business, it is advisable to increase the broker count. Keep in mind that, on top of the extra nodes, you are also adding network load. Obviously, if you increase the number of brokers in the cluster, you decrease the risk of losing high availability.
It depends on your needs. If you do not have to ensure very high availability (as, for example, a bank would), increasing the replication factor will reduce overall performance, because every message written to a topic/partition will be replicated to 5 nodes instead of 3. You can increase the number of nodes for high availability and distribute fewer partitions on every node, but without increasing the replication factor.
What is the max throughput of Kafka in MB/second?
I am trying to send messages of 2 MB each and get a throughput of about 30 records per second (i.e. 60 MB/second).
I wanted to check what theoretical max throughput could be reached.
Kafka is usually network bound -- so it depends on your hardware. Theoretical max for 1Gbit Ethernet would be 125MB/sec.
Also check out this blog post: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
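To make the "network bound" point concrete, a back-of-the-envelope sketch. The NIC speed and replication factor below are assumptions, and the model is deliberately pessimistic (it treats the NIC as shared by ingest and follower fetches and ignores full-duplex, consumers, and protocol overhead):

```java
public class ThroughputEnvelope {
    // Rough ceiling for producer throughput through one leader broker:
    // every produced byte arrives once and is fetched once per follower,
    // so a NIC carrying B MB/s of total traffic supports roughly
    // B / replicationFactor MB/s of producer traffic.
    static double maxProducerMBps(double nicMBps, int replicationFactor) {
        return nicMBps / replicationFactor;
    }

    public static void main(String[] args) {
        double nicMBps = 1000.0 / 8.0; // 1 Gbit/s Ethernet ~ 125 MB/s
        System.out.printf("RF=1: ~%.0f MB/s, RF=3: ~%.0f MB/s%n",
                maxProducerMBps(nicMBps, 1),
                maxProducerMBps(nicMBps, 3));
    }
}
```

Under these assumptions, going from no replication to a replication factor of 3 cutting observed throughput (as in the 50 MB/s vs 30 MB/s numbers earlier in the thread) is roughly what the arithmetic predicts.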