What is the maximum throughput of Kafka in MB/second?
I am sending messages of 2 MB each and getting a throughput of about 30 records per second (i.e. 60 MB/second).
I wanted to check what the theoretical maximum throughput is that could be reached.
Kafka is usually network-bound -- so it depends on your hardware. The theoretical max for 1 Gbit Ethernet would be 125 MB/sec.
Also check out this blog post: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
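If you want to measure the practical ceiling on your own hardware rather than reason from the NIC, Kafka ships a load-generation tool you can point at a test topic. A minimal sketch with 2 MB records; the topic name and broker address are placeholders:

# push 2 MB records as fast as possible (--throughput -1 disables throttling)
bin/kafka-producer-perf-test.sh \
  --topic test \
  --num-records 1000 \
  --record-size 2097152 \
  --throughput -1 \
  --producer-props bootstrap.servers=localhost:9092

The tool reports records/sec and MB/sec at the end, which tells you where your setup saturates relative to the network limit.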
Related
I want to set up a system of 100-200 sensors that send their data (at a frequency of about once every 30 minutes) to an MQTT broker running on a Raspberry Pi. Sensor data is collected on an ESP8266, which transmits it via WiFi to the MQTT broker (at a distance of about 2 meters).
I wanted to know if it is possible for a broker of these characteristics to handle that many connections simultaneously.
Thank you so much!
Diego
A single broker can handle many thousands of clients.
The limiting factor is likely to be the size and frequency of the messages, but assuming the messages are not tens of megabytes each, 200 messages spread over 30 minutes will be trivial: that averages out to roughly one message every 9 seconds.
Even if they all arrive at roughly the same time (allowing for clock drift), small messages will again not be a problem.
I understand that ideally tasks.max = the number of partitions for maximum throughput. But how many tasks per CPU core is ideal?
I am trying to improve Kafka producer throughput. We have CSV reports which are processed and published to a Kafka topic. Using default Kafka settings we get on average 300-500 kbps throughput. To improve the throughput I have tried some combinations of linger.ms and batch.size, but it is not helping.
Tried with:
linger.ms=30000, batch.size=1000000, buffer.memory=16777216
linger.ms=40000, batch.size=1500000, buffer.memory=16777216
and even tried with smaller linger.ms and batch.size:
linger.ms=200, batch.size=65000
but still, throughput stays around 150-200 kbps, sometimes even dropping to 100-150 kbps.
The Kafka topic has 12 partitions.
acks is all, and compression is snappy.
Any suggestions are welcome.
There is a comprehensive white paper from Confluent which explains how to increase throughput and which configurations to look at.
Basically, you have already done the right steps by increasing batch.size and tuning linger.ms. Depending on your tolerance for potential data loss, you may also reduce the retries. As an important factor for increasing throughput, you should use a compression.type in your producer while at the same time setting compression.type=producer at the broker level.
Remember that Kafka scales with partitions, and this can only happen if you have enough brokers in your cluster. Having many partitions all located on the same broker will not increase throughput.
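To check whether your partitions are actually spread out, you can describe the topic; a sketch, assuming a topic named my-topic and a broker reachable at localhost:9092:

bin/kafka-topics.sh --describe --topic my-topic --bootstrap-server localhost:9092

The Leader column shows which broker serves each partition; if all 12 partitions list the same broker id, adding brokers and reassigning partitions should come before any producer tuning.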
To summarize, the white paper mentions the following producer configurations to increase throughput:
batch.size: increase to 100000 - 200000 (default 16384)
linger.ms: increase to 10 - 100 (default 0)
compression.type=lz4 (default none)
acks=1 (default 1)
retries=0 (default 0)
buffer.memory: increase if there are a lot of partitions (default 33554432)
Keep in mind that, in the end, each cluster behaves differently. In addition, each use case has a different structure of messages (volume, frequency, byte size, ...). Therefore, it is important to get an understanding of the mentioned producer configurations and test their sensitivity on your actual cluster.
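As a starting point for such a test, you could benchmark the suggested values with Kafka's bundled perf-test tool. A sketch, assuming your 12-partition topic is named reports and a broker runs at localhost:9092; the record size of 1024 bytes is a placeholder for your average CSV row:

# benchmark the white paper's suggested producer settings
bin/kafka-producer-perf-test.sh \
  --topic reports \
  --num-records 100000 \
  --record-size 1024 \
  --throughput -1 \
  --producer-props bootstrap.servers=localhost:9092 batch.size=200000 linger.ms=100 compression.type=lz4 acks=1

Vary one parameter per run so you can attribute any change in the reported MB/sec to a single setting.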
I have set up a sample Kafka cluster on AWS and am trying to identify the maximum throughput possible with the given configuration. I am currently following the post provided here for this analysis.
https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
I would appreciate it if you could clarify the following issues.
I observed a throughput of 40 MB/s for messages of size 512 bytes (single producer - single consumer) with the given hardware. Assume I need to achieve a throughput of 80 MB/s.
As I understand, one way to do this is to increase the number of partitions per topic and increase the number of threads in the producer and consumer. (Assuming I do not change the default values for batch size, compression ratio, etc.)
1. How do I find the maximum throughput possible with the given hardware, i.e. the point after which we would need to improve our hardware resources to further increase throughput?
(In other words, how do I make the decision "with X GB RAM and Y GB disk space this is the maximum throughput I can achieve; if I need to further improve the throughput I have to upgrade RAM to XX GB and disk space to YY GB"?)
2. Should we scale the cluster vertically or horizontally? What is the recommended approach?
Thank you.
If we define throughput as the volume of data transmitted over the network per second, the maximum throughput should not exceed (number of machines) * (bandwidth per machine). Given a single machine whose NIC is configured at 1 Gbps, the max throughput on that machine cannot be larger than 1 Gbps. In your case, the throughput is 40 MB/s, namely 320 Mbps, which is well below 1 Gbps, meaning there is still room for improvement. However, if your target is far larger than 1 Gbps, you definitely need more machines.
AFAIK, bandwidth is the most likely system bottleneck. Unlike CPU and RAM, it is not easy to scale vertically, so horizontal scaling might be an option.
You could do some math before scaling. Say the throughput target is "produce 2 billion records of 512 bytes each in 1 hour". That is to say, the throughput has to reach 2,000,000,000 * 8 * 512 / 3600 / 1024 / 1024 ≈ 2170 Mbps. Assuming the available bandwidth for a single machine is 700 Mbps (over 70% usage normally brings packet loss), at least 4 machines should be planned for the producer application.
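Written out, the same sizing calculation is:

\[
\text{target} = \frac{2 \times 10^{9} \times 512 \times 8}{3600 \times 1024^{2}} \approx 2170\ \text{Mbps}, \qquad
\text{machines} = \left\lceil \frac{2170}{700} \right\rceil = 4
\]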
Is there any way I can speed up the rate at which the replicas fetch data from the leader?
I am using bin/kafka-producer-perf-test.sh to test the throughput of my producer, and I have set a client quota of 50 MB/s. Without any replicas I get a throughput of ~50 MB/s, but when the replication factor is set to 3, it reduces to ~30 MB/s.
There is no other traffic on the network, so I am not sure why things are slowing down. Is there some parameter, like replica.socket.receive.buffer.bytes or replica.fetch.min.bytes, that needs to be tuned to achieve high throughput? How can I speed up my replicas?
Increasing the value of num.replica.fetchers should help. It is the number of threads used to replicate messages from the leaders; increasing this value increases the degree of I/O parallelism in the follower broker. The default value is 1.
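A minimal sketch of the corresponding broker-side change in server.properties; the value 4 here is an assumption to be tuned against your broker's cores and disks, not a fixed recommendation:

# server.properties on each broker: fetcher threads used to replicate from leaders (default 1)
num.replica.fetchers=4

Raising replica.socket.receive.buffer.bytes can also help on high-bandwidth links, but the fetcher thread count is usually the first lever to try.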