How does kafka compression relate to configurations that refer to bytes? - apache-kafka

It's unclear to me (and I haven't managed to find any documentation that makes it perfectly clear) how compression affects kafka configurations that deal with bytes.
Take a hypothetical message that is exactly 100 bytes, a producer with a batch size of 1000 bytes, and a consumer with a fetch size of 1000 bytes.
With no compression it seems pretty clear that my producer would batch 10 messages at a time and my consumer would poll 10 messages at a time.
Now assume a compression (specified at the producer -- not on the broker) that (for simplicity) compresses to exactly 10% of the uncompressed size.
With that same config, would my producer still batch 10 messages at a time, or would it start batching 100 messages at a time? I.e. is the batch size pre- or post-compression? The docs do say this:
Compression is of full batches of data
...which I take to mean that it would compress 1000 bytes (the batch size) down to 100 bytes and send that. Is that correct?
Same question for the consumer fetch. Given a 1K fetch size, would it poll just 10 messages at a time (because the uncompressed size is 1K) or would it poll 100 messages (because the compressed size is 1K)? I believe that the fetch size will cover the compressed batch, in which case the consumer would be fetching 10 batches as-produced-by-the-producer at a time. Is this correct?
It seems confusing to me that, if I understand correctly, the producer is dealing with pre-compression sizes and the consumer is dealing with post-compression sizes.

It's both simpler and more complicated ;-)
It's simpler in that the producer compresses and the consumer decompresses the same record batches, which travel in Kafka protocol Produce and Fetch requests, and the broker just stores them with zero copy in their native wire format. Kafka does not compress individual messages before they are sent. It waits until a batch of messages (all going to the same partition) is ready to send, then compresses the entire batch and sends it in a Produce request.
It's more complicated because you also have to factor in the linger time, which can trigger the send of a batch before the batch buffer is full. You also have to consider that messages may have different keys, or for other reasons may be going to different topic partitions on different brokers. So it's not true that 10 records of 100 bytes each all go as one batch to one broker in a single Produce request of 1000 bytes (unless all the messages are being sent to a topic with a single partition).
From https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html
The producer maintains buffers of unsent records for each partition.
These buffers are of a size specified by the batch.size config. Making
this larger can result in more batching, but requires more memory
(since we will generally have one of these buffers for each active
partition).
By default a buffer is available to send immediately even if there is
additional unused space in the buffer. However if you want to reduce
the number of requests you can set linger.ms to something greater than
0. This will instruct the producer to wait up to that number of milliseconds before sending a request in hope that more records will
arrive to fill up the same batch. This is analogous to Nagle's
algorithm in TCP. For example, in the code snippet above, likely all
100 records would be sent in a single request since we set our linger
time to 1 millisecond. However this setting would add 1 millisecond of
latency to our request waiting for more records to arrive if we didn't
fill up the buffer. Note that records that arrive close together in
time will generally batch together even with linger.ms=0 so under
heavy load batching will occur regardless of the linger configuration;
however setting this to something larger than 0 can lead to fewer,
more efficient requests when not under maximal load at the cost of a
small amount of latency.
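As a rough illustration of the per-partition buffers described above, here is a toy model in Python. It is only a sketch: the `batch_by_partition` function and the `key % num_partitions` partitioning are assumptions made for the example, not the real partitioner, and it ignores record overhead and compression entirely.

```python
from collections import defaultdict


def batch_by_partition(records, num_partitions, batch_size):
    """Group (key, size_in_bytes) records into per-partition batches.

    Toy model: the partition is key % num_partitions (a hypothetical
    stand-in for the real partitioner), and a batch is closed once adding
    another record would push it past batch_size bytes.
    """
    open_batches = defaultdict(list)  # partition -> sizes in the open batch
    closed = defaultdict(list)        # partition -> list of finished batches
    for key, size in records:
        partition = key % num_partitions
        current = open_batches[partition]
        if current and sum(current) + size > batch_size:
            closed[partition].append(current)
            current = open_batches[partition] = []
        current.append(size)
    # Flush whatever is still open at the end.
    for partition, current in open_batches.items():
        if current:
            closed[partition].append(current)
    return dict(closed)


# Ten 100-byte records with keys 0..9 and a 1000-byte batch size.
records = [(key, 100) for key in range(10)]
one_partition = batch_by_partition(records, num_partitions=1, batch_size=1000)
five_partitions = batch_by_partition(records, num_partitions=5, batch_size=1000)

print(len(one_partition[0][0]))  # 10
print({p: len(b[0]) for p, b in five_partitions.items()})
# {0: 2, 1: 2, 2: 2, 3: 2, 4: 2}
```

With one partition the ten records share a single batch; spread over five partitions, each batch holds only two records, which is the point made above about keys and partitions.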

Related

Kafka: Throughput of producing to thousands of topics with different message rate

The task is routing messages from a single huge source topic to many (a few thousand) destination topics. The overall rate is a few million records per second. It barely handles that load now, and we are looking for a way to optimise it. However, it does not seem to have reached any limit at the hardware or network level, so I suppose it can be improved. Latency isn't important (a delay of a few minutes is fine), and the average message size is less than 1 KiB.
The most obvious way to increase throughput is to make batch.size and linger.ms larger. But the problem is that the message rate differs between destination topics: depending on the destination, the rate may vary from a few messages per second to hundreds of thousands per second.
As I understand it (please correct me if I'm wrong), batch.size is a per-partition parameter. So if we set batch.size too big we will run out of memory, because it is multiplied by the number of destination topics, even if each of them has only one partition. On the other hand, if batch.size is small, the producer will send requests to the broker too often. In each app instance we use a single producer for all destination topics (a ProduceRequest can include batches for different topics). The only way to set this parameter differently per topic is to use a separate producer per topic, but that means thousands of threads and many context switches.
Can we set a minimum size for the actual ProduceRequest, i.e. something like batch.size but for all batches in the request, the opposite of max.request.size?
Or is there any other way to increase the throughput of the producer?
The problem looks solvable, and it seems we have solved it. It's not a big problem for Kafka to stream to 3k topics, but there are some things you should take care of:
1. The Kafka producer tries to allocate batch.size * number_of_destination_partitions of memory at startup. If batch.size is 10 MB and you have 3k topics with 1 partition per topic, the producer will require at least ~30 GB at startup (source code).
So the more destination partitions you have, the smaller the batch.size you have to set, or the more memory you need. We chose a small batch.size.
2. The message rate per destination topic doesn't affect general performance. The Kafka producer sends several batches per request. This is where max.request.size comes into play (source code, maxSize is max.request.size). The higher max.request.size is, the more batches can be sent per request. It is important to understand that reaching batch.size or linger.ms does not instantly trigger sending the batch to the broker. As soon as a batch reaches batch.size or linger.ms, it is marked as sendable and will be processed later together with other batches (source code).
Moreover, batch.size and linger.ms are not the only reasons to mark a batch as sendable (check the previous link). And this is where the batches are actually sent (source code). That's why the same event rate across destination topics is not required, but there are still some nuances, which are described next.
2.1. A few words about linger.ms. I can't say for sure how it acts in this scenario. On the one hand, the larger it is, the longer the Kafka producer will wait to collect messages for a given partition, and the more data for that partition will be sent per request. On the other hand, it seems that the smaller it is, the more batches for different partitions can be packed into one request. There is no certainty about which is better.
3. Although the Kafka producer is able to send more than one batch per request, it can't send more than one batch per request for one specific partition. That's why, if you have a skewed message rate across destination topics, you have to increase the partition count for the most loaded ones to increase throughput. But always remember that increasing the partition count leads to an increase in memory usage.
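The memory estimate from the first point above can be checked with a few lines of arithmetic. This is only a sketch of the answer's rule of thumb (one batch.size buffer per destination partition); the function name is made up for the example, decimal units are assumed, and the 64 KB figure is an arbitrary illustration rather than a recommendation.

```python
def min_batch_memory_bytes(batch_size_bytes, num_partitions):
    """Rule of thumb from the answer: one batch.size buffer per partition."""
    return batch_size_bytes * num_partitions


MB = 1000 * 1000
GB = 1000 * MB

# 10 MB batches, 3000 topics with one partition each.
large = min_batch_memory_bytes(10 * MB, 3000)
print(large / GB)  # 30.0

# A much smaller batch.size (64 KB is an arbitrary illustrative value).
small = min_batch_memory_bytes(64 * 1000, 3000)
print(small / MB)  # 192.0
```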
Actually, the information above helped us solve our performance problems, but there may be other nuances that we don't know about yet.
I hope it will be useful.

Kafka Buffer Size And Time Interval

Kafka keeps data in a buffer as per buffer.memory (32 MB in my case). Does Kafka write records to the topic once the buffer reaches the 32 MB limit, or is there a time limit associated with it as well?
According to the Kafka docs, buffer.memory only specifies the limit on the memory the producer can use for buffering. Setting this property does not make the producer wait until the buffer is full before sending records to the server.
buffer.memory
The total bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server the producer will block for max.block.ms after which it will throw an exception.
This setting should correspond roughly to the total memory the producer will use, but is not a hard bound since not all memory the producer uses is used for buffering. Some additional memory will be used for compression (if compression is enabled) as well as for maintaining in-flight requests.
If you want the producer to wait so that more records accumulate in a batch, you can use the linger.ms property. But as far as I know, there is no strict way to make the producer wait and send records only when the buffer is full.
KafkaProducer
By default a buffer is available to send immediately even if there is additional unused space in the buffer. However, if you want to reduce the number of requests you can set linger.ms to something greater than 0. This will instruct the producer to wait up to that number of milliseconds before sending a request in hope that more records will arrive to fill up the same batch.
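To make the role of buffer.memory concrete, here is a toy model of what the quoted docs describe: a full buffer makes the caller wait, up to max.block.ms, rather than triggering a send. The function and its `drain_bytes_per_ms` parameter are assumptions for illustration only; the real producer's accounting is more involved.

```python
import math


def wait_for_space_ms(buffer_memory, bytes_buffered, record_bytes,
                      drain_bytes_per_ms, max_block_ms):
    """Return how long a caller would wait for buffer space, in ms.

    Toy model: the buffer drains at a constant, hypothetical rate. If the
    needed space does not free up within max_block_ms, raise TimeoutError,
    mirroring the exception the docs mention.
    """
    shortfall = bytes_buffered + record_bytes - buffer_memory
    if shortfall <= 0:
        return 0  # enough room already, no waiting
    if drain_bytes_per_ms <= 0:
        raise TimeoutError("buffer is full and nothing is draining")
    wait_ms = math.ceil(shortfall / drain_bytes_per_ms)
    if wait_ms > max_block_ms:
        raise TimeoutError(f"no space after {max_block_ms} ms")
    return wait_ms


BUFFER_MEMORY = 32 * 1024 * 1024  # the 32 MB from the question

# Plenty of room: the record is buffered without waiting.
print(wait_for_space_ms(BUFFER_MEMORY, 1_000_000, 1000, 100, 60_000))  # 0

# Nearly full: the caller waits until 500 bytes have drained.
print(wait_for_space_ms(BUFFER_MEMORY, BUFFER_MEMORY - 500, 1000, 100, 60_000))  # 5
```

In this model the 32 MB limit only decides whether the caller has to wait; when a batch is actually sent depends on batch.size and linger.ms.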

When does the producer trigger sending a request?

If I send just one record on the producer side and wait, when will the producer send the record to the broker?
In the Kafka docs, I found the config called "linger.ms", and it says:
once we get batch.size worth of records for a partition it will be
sent immediately regardless of this setting, however if we have fewer
than this many bytes accumulated for this partition we will 'linger'
for the specified time waiting for more records to show up.
Based on the docs above, I have two questions.
If the producer receives data whose size reaches batch.size, will it immediately send a request to the broker that contains only that one batch? As we know, one request can contain many batches, so how does that happen?
Does it mean that even if the received data is smaller than batch.size, the producer will still send a request to the broker after waiting linger.ms?
In Kafka, the smallest unit of sending is a record (a key-value pair).
The Kafka producer attempts to send records in batches in order to optimize data transmission. So a single push from the producer to the cluster (to the partition's leader broker, to be precise) could contain multiple records.
Moreover, batching always applies only to a given partition. Records produced to different partitions cannot be batched together, though they could form multiple batches.
There are a few parameters which influence the batching behaviour, as described in the documentation:
buffer.memory -
The total bytes of memory the producer can use to buffer records
waiting to be sent to the server. If records are sent faster than they
can be delivered to the server the producer will block for
max.block.ms after which it will throw an exception.
batch.size -
The producer will attempt to batch records together into fewer
requests whenever multiple records are being sent to the same
partition. This helps performance on both the client and the server.
This configuration controls the default batch size in bytes. No
attempt will be made to batch records larger than this size.
Requests sent to brokers will contain multiple batches, one for each
partition with data available to be sent.
linger.ms -
The producer groups together any records that arrive in between
request transmissions into a single batched request. Normally this
occurs only under load when records arrive faster than they can be
sent out. However in some circumstances the client may want to reduce
the number of requests even under moderate load. This setting
accomplishes this by adding a small amount of artificial delay—that
is, rather than immediately sending out a record the producer will
wait for up to the given delay to allow other records to be sent so
that the sends can be batched together. This can be thought of as
analogous to Nagle's algorithm in TCP. This setting gives the upper
bound on the delay for batching: once we get batch.size worth of
records for a partition it will be sent immediately regardless of this
setting, however if we have fewer than this many bytes accumulated for
this partition we will 'linger' for the specified time waiting for
more records to show up. This setting defaults to 0 (i.e. no delay).
Setting linger.ms=5, for example, would have the effect of reducing
the number of requests sent but would add up to 5ms of latency to
records sent in the absence of load.
So, from the documentation above: linger.ms is an artificial delay that applies when there are not yet enough bytes to transmit, but if the producer accumulates enough bytes before linger.ms has elapsed, the request is sent anyway.
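The rule just described can be written down as a tiny predicate. This is a simplified sketch of the documented behaviour only: `batch_is_sendable` is a made-up name, and the real producer has further conditions for marking a batch as ready that are not modelled here.

```python
def batch_is_sendable(batch_bytes, waited_ms, batch_size, linger_ms):
    """Toy rule: a batch is ready when it is full OR linger.ms has elapsed."""
    return batch_bytes >= batch_size or waited_ms >= linger_ms


BATCH_SIZE = 16384  # bytes; an illustrative value

print(batch_is_sendable(16384, 0, BATCH_SIZE, linger_ms=5))  # True: batch is full
print(batch_is_sendable(100, 2, BATCH_SIZE, linger_ms=5))    # False: still lingering
print(batch_is_sendable(100, 5, BATCH_SIZE, linger_ms=5))    # True: linger elapsed
print(batch_is_sendable(100, 0, BATCH_SIZE, linger_ms=0))    # True: no linger at all
```

The last line corresponds to the single-record case in the question: with linger.ms at its default of 0, one small record is ready to send right away.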
On top of that, batching is also influenced by max.request.size:
max.request.size -
The maximum size of a request in bytes. This setting will limit the
number of record batches the producer will send in a single request to
avoid sending huge requests. This is also effectively a cap on the
maximum record batch size. Note that the server has its own cap on
record batch size which may be different from this.
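As a hedged sketch of how max.request.size limits the number of batches per request, the following toy function packs ready batches into requests greedily. The `pack_requests` name and the greedy strategy are assumptions for the example; it ignores request overhead and the fact that the real producer groups batches by destination broker.

```python
def pack_requests(ready_batches, max_request_size):
    """Greedily pack (partition, size_bytes) batches into requests.

    Toy model: a new request is started whenever adding the next batch
    would exceed max_request_size.
    """
    requests, current, current_bytes = [], [], 0
    for partition, size in ready_batches:
        if current and current_bytes + size > max_request_size:
            requests.append(current)
            current, current_bytes = [], 0
        current.append((partition, size))
        current_bytes += size
    if current:
        requests.append(current)
    return requests


# Four ready batches of 400 bytes each, one per (hypothetical) partition.
ready = [("t-0", 400), ("t-1", 400), ("t-2", 400), ("t-3", 400)]

print(len(pack_requests(ready, max_request_size=1000)))  # 2
print(len(pack_requests(ready, max_request_size=2000)))  # 1
```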

How to set Kafka Producer message rate per second?

I am reading a CSV file and passing its rows as input to my Kafka producer. Now I want my Kafka producer to produce messages at a rate of 100 messages per second.
Take a look at linger.ms and batch.size properties of Kafka Producer.
You have to adjust these properties correspondingly to get desired rate.
The producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay—that is, rather than immediately sending out a record the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together. This can be thought of as analogous to Nagle's algorithm in TCP. This setting gives the upper bound on the delay for batching: once we get batch.size worth of records for a partition it will be sent immediately regardless of this setting, however if we have fewer than this many bytes accumulated for this partition we will 'linger' for the specified time waiting for more records to show up. This setting defaults to 0 (i.e. no delay). Setting linger.ms=5, for example, would have the effect of reducing the number of requests sent but would add up to 5ms of latency to records sent in the absence of load.
If you like stream processing then akka-streams has nice support for throttling: http://doc.akka.io/docs/akka/current/java/stream/stream-quickstart.html#time-based-processing
Then the akka-stream-kafka (aka reactive-kafka) library allows you to connect the two together: http://doc.akka.io/docs/akka-stream-kafka/current/home.html
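If you would rather not pull in a streaming library, the same idea of throttling can be sketched as a plain pacing loop on the client side. This is a minimal sketch and not a Kafka feature: `send_at_rate` is a hypothetical helper, and `send` stands for whatever callable actually hands a row to your producer.

```python
import time


def send_at_rate(rows, send, rate_per_second,
                 clock=time.monotonic, sleep=time.sleep):
    """Call send(row) for each row, at most rate_per_second times a second.

    Each row gets a due time relative to the start, so small delays in one
    iteration do not accumulate into drift over a long file.
    """
    interval = 1.0 / rate_per_second
    start = clock()
    for i, row in enumerate(rows):
        delay = (start + i * interval) - clock()
        if delay > 0:
            sleep(delay)
        send(row)


# Stand-in for a real producer: just collect what would be sent.
sent = []
send_at_rate(["a,1", "b,2", "c,3"], sent.append, rate_per_second=100)
print(sent)  # ['a,1', 'b,2', 'c,3']
```

The `clock` and `sleep` parameters exist only so the pacing can be tested without real waiting; in normal use the defaults are fine.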
In the Kafka JVM producer, throughput depends on multiple factors, and it's most commonly measured in MB/sec rather than messages/sec. In your example, if each row of your CSV is, say, 1 MB in size, then you need to tune your producer configs to achieve 100 MB/sec in order to reach your target throughput of 100 messages/sec. While tuning producer configs, you have to take into consideration your batch.size value (measured in bytes). If it's set too low, the producer will send requests more often and wait for replies from the server. This tends to reduce the producer's throughput, although it can keep latency low. If you are using the async, callback-based producer, your overall throughput will be limited by how many requests the producer can send before waiting for a reply from the server, which is determined by max.in.flight.requests.per.connection.
If you set batch.size too high, producer throughput can also be affected, since after waiting for the linger.ms period the Kafka producer sends all the messages in a batch for that particular partition to the broker at once. A bigger batch.size also means more buffer memory in use, which might put pressure on the GC.

Testing Kafka producer throughput

We have a Kafka cluster consisting of 3 nodes, each with 32GB of RAM and a 6 core 2.5 CPU.
We wrote a Kafka producer that receives tweets from Twitter and sends them to Kafka in batches of 5000 tweets.
In the producer we use the producer.send(list<KeyedMessages>) method.
The average size of a tweet is 7KB.
Printing the time in milliseconds before and after the send statement to measure the time taken to send 5000 messages, we found that it takes about 3.5 seconds.
Questions
Is the way we test the Kafka performance correct?
Is using the send method that takes list of keyed messages the correct way to send batch of messages to Kafka? Is there any other way?
What are the important configurations that affect the producer performance?
You're measuring only the producer side? That metric tells you only how much data you can store in a unit of time.
Maybe that's what you wanted to measure, but since the title of your question is "Kafka performance", I would think that you'd actually want to measure how long it takes for a message to go through Kafka (usually referred to as end-to-end latency).
You'd achieve that by measuring the difference in time between sending a message and receiving that message on the other side, by a consumer.
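A minimal sketch of that measurement: stamp each message with its send time and subtract on the consumer side. Here an in-memory `queue.Queue` stands in for the Kafka topic, and the `produce` and `consume_latency_ms` helpers are made up for the example. Across different machines the two clocks would need to be synchronized for the number to mean anything.

```python
import json
import queue
import time


def produce(channel, payload, clock=time.time):
    """Wrap the payload with its send timestamp and publish it."""
    channel.put(json.dumps({"sent_at": clock(), "payload": payload}))


def consume_latency_ms(channel, clock=time.time):
    """Read one message and return its end-to-end latency in milliseconds."""
    message = json.loads(channel.get())
    return (clock() - message["sent_at"]) * 1000.0


channel = queue.Queue()  # stand-in for a Kafka topic
produce(channel, "tweet text")
latency = consume_latency_ms(channel)
print(f"end-to-end latency: {latency:.3f} ms")
```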
If the cluster is configured correctly (the default configuration will do), you should see latency ranging from only a couple of ms (less than 10ms) up to 50ms (a few tens of milliseconds).
Kafka is able to do that because the messages the consumer reads are still in RAM (page cache and socket buffer cache), so reads don't even touch the disk. Keep in mind that this only works while your consumers are able to keep up, i.e. you don't have a large consumer lag. If a consumer lags behind the producers, the messages will eventually be evicted from the cache (how soon depends on the message rate, i.e. how long it takes for the cache to fill up with new messages) and will then have to be read from disk. Even that is not the end of the world (an order of magnitude slower, in the range of low 100s of ms), because messages are written consecutively, one after another in a straight line, so reading them takes a single disk seek.
BTW, you'd want to give Kafka only a small portion of those 32GB, e.g. 5 to 8GB (even the G1 garbage collector slows down with bigger heaps), and leave everything else unassigned so the OS can use it for page and buffer cache.
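For reference, the producer-side figure implied by the numbers in the question (5000 messages of about 7KB each in about 3.5 seconds) works out as follows. The helper name is made up for the example, and the result is only as accurate as those rounded inputs.

```python
def producer_throughput(messages, avg_message_kb, elapsed_s):
    """Return (messages per second, KB per second) for one timed send."""
    return messages / elapsed_s, messages * avg_message_kb / elapsed_s


msgs_per_s, kb_per_s = producer_throughput(5000, 7, 3.5)
print(round(msgs_per_s))  # 1429
print(kb_per_s)           # 10000.0
```

That is roughly 10 MB/s of producer throughput, which says nothing yet about end-to-end latency.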