Improving performance of Kafka Producer - apache-kafka

We're running on Apache Kafka 0.10.0.x and Spring 3.x, and cannot use Spring Kafka since it requires Spring Framework 4.x.
Therefore, we are using the native Kafka Producer API to produce messages.
Now the concern that I have is the performance of my producer. My understanding is that a call to producer.send() is what actually establishes the connection to the Kafka broker, puts the message onto the buffer, attempts to send it, and then possibly invokes the callback provided to producer.send().
The KafkaProducer documentation says that it uses a buffer and a separate I/O thread to perform the sends, and that the producer should be closed appropriately so that no resources are leaked.
From what I understand, this means that if I have hundreds of messages to send, every time I invoke producer.send() it attempts to connect to the broker, which is an expensive I/O operation.
Can you please correct my understanding if I am wrong, or suggest a better way to use the KafkaProducer?

The two important configuration parameters of the Kafka producer are batch.size and linger.ms. So you basically have a choice: you can wait until the producer's batch is full, or until the producer times out.
batch.size – an upper limit on how much data the Kafka producer will accumulate in a single batch before sending, specified in bytes.
linger.ms – how long the producer will wait before sending, in order to allow more messages to accumulate in the same batch.
It depends on your use case, but I would suggest taking a closer look at these parameters.
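For illustration, a minimal sketch of wiring these two settings into a producer (the broker address and the values used here are assumptions, not recommendations):
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// Send a batch once 32 KB of records have accumulated for a partition (assumed value).
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32 * 1024);
// ...or after waiting up to 50 ms for more records to show up (assumed value).
props.put(ProducerConfig.LINGER_MS_CONFIG, 50);
KafkaProducer<String, String> producer = new KafkaProducer<>(props);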

Your understanding is partially right.
As #leshkin pointed out, there are configuration parameters to tune how the KafkaProducer buffers messages to be sent.
However, independently of the buffering strategy, the producer takes care of caching established connections to the topic-leader brokers.
Indeed, you can tune how long the producer keeps such a connection around using the connections.max.idle.ms parameter (defaults to 9 minutes).
So, to answer your original question: the I/O cost of establishing a connection to the broker is incurred only on the first send invocation and is amortised over time, as long as you have data to send.
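To get that amortisation in practice, the usual pattern is to create one KafkaProducer and reuse it for every send; a rough sketch, reusing the props object from the snippet above and an assumed list of payloads:
import org.apache.kafka.clients.producer.ProducerRecord;

// Keep idle broker connections around for 9 minutes (the default, shown explicitly).
props.put(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG, 9 * 60 * 1000);
KafkaProducer<String, String> producer = new KafkaProducer<>(props);
for (String payload : payloads) {
    // Connections to the partition leaders are opened lazily on the first send
    // and reused for all subsequent sends.
    producer.send(new ProducerRecord<>("my-topic", payload));
}
producer.close(); // flush buffered records and release the background I/O thread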

You should configure the batch.size, linger.ms and compression.type properties of your Kafka producer (as in the snippet below) to improve performance in the following situations:
1) records are arriving faster than the producer can send them;
2) there is a huge amount of data going through your topic, which puts a heavy load on the producer;
3) throughput is a bottleneck.
// batch.size: 64 KB per partition (4x the 16 KB default).
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16_384 * 4);
// linger.ms: allow a little buffering before each send.
props.put(ProducerConfig.LINGER_MS_CONFIG, 200);
// compression.type: use Snappy compression for the batches.
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
See also: Kafka performance tuning articles on DZone.

Related

Kafka producer timeout issues

I am looking for some clarification regarding the properties that can be used to avoid producer timeouts, whether caused by too much time passing since batch creation (a stuck batch) or by a timeout while fetching metadata. I am confused about whether I should increase max.block.ms or delivery.timeout.ms, and whether I also need to set buffer.memory along with these timeouts to avoid blocking due to memory pressure.
I am using the Spring Kafka template's send method to produce messages, with the producer properties defined in a bean.
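For reference, a sketch of how the properties in question would be set on the producer (the values below are placeholders, not recommendations; which one to raise depends on why the batches are expiring):
// How long send() may block while waiting for metadata or for space in the buffer.
props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 60000);
// Upper bound on the total time between send() returning and the success/failure callback.
props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);
// Total memory available for buffering records that have not yet been sent.
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432L);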

Is it possible to implement dynamic batching with Kafka Producer?

Now I'm doing some tests with Apache Kafka. In the Kafka Producer configuration, the parameters batch.size and linger.ms control the batching strategy. Is it possible to change these parameters dynamically while producing? For example, if the data ingestion rate rises quickly, we may want to increase batch.size to accumulate more messages per batch. I failed to find any example of dynamic batching with the Kafka Producer. Is it possible to implement?
It's possible, but you would have to close and re-open a new Producer instance yourself with the updated configuration at runtime, while making sure that you aren't dropping events during that swap.
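A rough sketch of that swap, assuming you detect the rate change yourself and can stop callers from sending while the producer reference is replaced (all names here are hypothetical):
// Flush and close the current producer so no buffered records are lost,
// then rebuild it with a larger batch size.
producer.flush();
producer.close();
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024); // assumed new value
producer = new KafkaProducer<>(props);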

What atomicity guarantees - if any - does Kafka have regarding batch writes?

We're now moving one of our services from pushing data through legacy communication tech to Apache Kafka.
The current logic is to send a message to IBM MQ and retry if errors occur. I want to replicate that, but I don't have any idea what guarantees the broker provides in that scenario.
Let's say I send 100 messages in a batch via the producer, using the Java client library. Assuming they reach the cluster, is there a possibility that only part of them are accepted (e.g. a disk is full, or some partitions touched by my write are under-replicated)? Can I detect that problem from my producer and retry only those messages that weren't accepted?
I searched for "kafka atomicity guarantee" but came up empty; maybe there's a well-known term for it.
When you say you send 100 messages in one batch, do you mean that you want to control this number of messages yourself, or are you okay with letting the producer batch a certain number of messages and then send the batch?
Because I'm not sure you can control the number of produced messages in one producer batch: the API will queue and batch them for you, but without any guarantee that they all end up in the same batch (I'll check that, though).
If you're okay with letting the API batch a certain number of messages for you, here are some clues about how they are acknowledged.
When dealing with the producer, Kafka provides some reliability guarantees regarding writes (including "batch writes").
As stated in this slideshare post:
https://www.slideshare.net/miguno/apache-kafka-08-basic-training-verisign (83)
The original list of messages is partitioned (randomly if the default partitioner is used) based on their destination partitions/topics, i.e. split into smaller batches.
Each post-split batch is sent to the respective leader broker/ISR (the individual send()’s happen sequentially), and each is acked by its respective leader broker according to request.required.acks
So regarding atomicity: I'm not sure the whole batch will be seen as atomic, given the behaviour described above. Maybe you can make sure your batch of messages uses the same key for each message, so that they all go to the same partition and may thus effectively be written together.
If you need more clarity about the acknowledgement rules when producing, here is how it works, as stated at https://docs.confluent.io/current/clients/producer.html :
You can control the durability of messages written to Kafka through the acks setting.
The default value of "1" requires an explicit acknowledgement from the partition leader that the write succeeded.
The strongest guarantee that Kafka provides is with "acks=all", which guarantees that not only did the partition leader accept the write, but it was successfully replicated to all of the in-sync replicas.
You can also look at the producer's enable.idempotence behaviour if you aim to avoid duplicates while producing.
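For example, a producer aiming for the strongest delivery guarantees might be configured roughly like this (a sketch based on the settings mentioned above, not code from the original answer):
props.put(ProducerConfig.ACKS_CONFIG, "all");              // leader waits for all in-sync replicas
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true); // broker drops duplicate retried batches
props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);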
Yannick

Kafka KStream OutOfOrderSequenceException

Our application intermittently encounters an OutOfOrderSequenceException in our Streams code, which causes the stream thread to stop.
The implementation is simple: two KStreams are joined and the result is written to another topic.
While searching for a solution to this OutOfOrderSequenceException,
I found the documentation below on Confluent:
https://docs.confluent.io/current/streams/concepts.html#out-of-order-handling
But I could not find which settings, configs or trade-offs are being referred to here.
How do I do that bookkeeping manually?
If users want to handle such out-of-order data, generally they need to
allow their applications to wait for longer time while bookkeeping
their states during the wait time, i.e. making trade-off decisions
between latency, cost, and correctness. In Kafka Streams, users can
configure their window operators for windowed aggregations to achieve
such trade-offs (details can be found in the Developer Guide).
From the JavaDocs of OutOfOrderSequenceException:
This exception indicates that the broker received an unexpected sequence number from the producer, which means that data may have been lost. If the producer is configured for idempotence only (i.e. if enable.idempotence is set and no transactional.id is configured), it is possible to continue sending with the same producer instance, but doing so risks reordering of sent records. For transactional producers, this is a fatal error and you should close the producer.
Sequence numbers are numbers that the producer internally assigns to each message written into a topic.
Because it is an internal error, it's hard to tell what the root cause could be, though.
Update:
After upgrading the Kafka brokers and the Kafka Streams version, the issue seems to have subsided.
Also, as per the recommendation in
https://kafka.apache.org/10/documentation/streams/developer-guide/config-streams.html#recommended-configuration-parameters-for-resiliency
I have updated acks to all; the replication factor was already 3.
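For reference, a sketch of those resiliency settings in a Kafka Streams configuration (the application id and bootstrap servers are assumptions):
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

Properties streamsProps = new Properties();
streamsProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
streamsProps.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// Replicate internal changelog and repartition topics three times.
streamsProps.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);
// Require acks from all in-sync replicas for the internal producers.
streamsProps.put(StreamsConfig.producerPrefix(ProducerConfig.ACKS_CONFIG), "all");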

How to set Kafka Producer message rate per second?

I am reading a CSV file and feeding its rows to my Kafka Producer. Now I want my Kafka Producer to produce messages at a rate of 100 messages per second.
Take a look at the linger.ms and batch.size properties of the Kafka Producer.
You have to adjust these properties accordingly to get the desired rate.
The producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay—that is, rather than immediately sending out a record the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together. This can be thought of as analogous to Nagle's algorithm in TCP. This setting gives the upper bound on the delay for batching: once we get batch.size worth of records for a partition it will be sent immediately regardless of this setting, however if we have fewer than this many bytes accumulated for this partition we will 'linger' for the specified time waiting for more records to show up. This setting defaults to 0 (i.e. no delay). Setting linger.ms=5, for example, would have the effect of reducing the number of requests sent but would add up to 5ms of latency to records sent in the absence of load.
If you like stream processing then akka-streams has nice support for throttling: http://doc.akka.io/docs/akka/current/java/stream/stream-quickstart.html#time-based-processing
Then the akka-stream-kafka (aka reactive-kafka) library allows you to connect the two together: http://doc.akka.io/docs/akka-stream-kafka/current/home.html
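If pulling in Akka is too heavy for this, a plain-Java alternative is to pace the sends with a scheduler; a sketch, assuming an already-configured producer and the parsed CSV rows (the topic name and variable names are hypothetical):
import java.util.Iterator;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.producer.ProducerRecord;

// One record every 10 ms is roughly 100 records per second.
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
Iterator<String> rows = csvRows.iterator(); // csvRows: the rows read from the CSV file (assumed)
scheduler.scheduleAtFixedRate(() -> {
    if (rows.hasNext()) {
        producer.send(new ProducerRecord<>("csv-topic", rows.next()));
    } else {
        scheduler.shutdown();
        producer.close();
    }
}, 0, 10, TimeUnit.MILLISECONDS);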
In the Kafka JVM producer, throughput depends on multiple factors, and it is most commonly measured in MB/sec rather than msg/sec. In your example, if each CSV row is, say, 1 MB in size, then you need to tune your producer configs for 100 MB/sec in order to reach your target of 100 msg/sec. While tuning the producer configs, take your batch.size (measured in bytes) into consideration: if it is set too low, the producer sends requests more often and waits for replies from the broker more frequently, which reduces throughput but lowers latency. If you are using an async, callback-based producer, your overall throughput is also limited by how many requests the producer can have outstanding before waiting for a reply from the broker, which is controlled by max.in.flight.requests.per.connection.
If you set batch.size too high, latency suffers instead, because after waiting up to the linger.ms period the producer sends all the accumulated messages for a particular partition to the broker at once. A bigger batch.size also implies a bigger buffer.memory, which might put pressure on GC.