max.poll.records in conjunction with fetch.min.bytes - apache-kafka

I am reading this How does max.poll.records affect the consumer poll, as well as apache kafka docs, and I am still not sure if fetch.min.bytes is not changed, and the default is 1, is kafka broker obligated to return max.poll.records of records, if that much is available, or not?
According to our tests, it does not always return that much even if there is sufficient data available in a topic, and the explanation of that parameter from documentation and its sheer name does not imply it should, but some people tend to think the opposite. We also increased the limits that could potentially prevent this from happen, like message.max.bytes, max.message.bytes, max.partition.fetch.bytes, and fetch.max.bytes (that one we actually didn't have to increase, since the default is rather high, 50 MB), but that didn't change a thing.
We also didn't change fetch.max.wait.ms, and default is 500, that is a half of a second, so, if fetch.min.bytes is not set to something more than 1 byte, then this setting becomes effective, ie, it determines how much records is actually returned? Which would mean that if less then max.poll.records was returned, it is because it would take more than 500 ms to fetch that much?

These 2 configurations can be confusing and while at first sight they look similar they work in very different ways.
fetch.min.bytes: This value is one of the fields of Fetch Requests (it's min_bytes in http://kafka.apache.org/protocol#The_Messages_Fetch). This value is used by the broker to decide when to send a Fetch Response back to the client. When a broker receives a Fetch Request it can hold it for up to fetch.max.wait.ms if there are not fetch.min.bytes bytes available for consumption (for example the consumer is at the end of the log or the messages to be consumed add to less than that size).
max.poll.records: This setting is only used within the Consumer and is never sent to brokers. In the background (asynchronously), the consumer client actively fetches records from the broker and buffers them so when poll() is called, it can return records already fetched. As the name suggest, this settings control how many records at most poll() can return from the consumer buffer.

Related

Kafka: Throughput of producing to thousands of topics with different message rate

The task is routing messages from a single huge source topic to many (few thousands) destination topics. Overall rate is about few millions of records per second. It barely handles such payload now, and we are looking for a solution to optimise it. However, it does not seem it reached any limit at hardware or network level, so I suppose it can be improved. A latency isn't important (few minutes delay is fine), an average message size is less than 1 KiB.
The most obvious way to increase throughput is to make batch.size and linger.ms larger. But the problem is a different message rate in destination topics: depends on a message destination the rate may vary from few messages per second to hundreds of thousands per second.
As I understand (please, correct me if I'm wrong), but batch.size is per-partition parameter. So, if we set batch.size too big we will go out of memory, because it was multiplied by a number of destination topics even all of them have only one partition. Otherwise, if batch.size will be small, then producer will send requests to broker too often. In each app instance we use a single producer for all destination topics (ProduceRequest can include batches to different topics). The only way to set this parameter different per topic is using a separate producer per topic, but it means thousands of threads and many context switches.
Can we set a minimum size of actual ProduceRequest, i.e. like batch.size, but for overall batches in the request, i.e. something opposite to max.request.size?
Or is there any way to increase throughput of producer?
the problem looks solveable and seems like we solved. it's not a big problem for Kafka to stream to 3k topics, but there are some things you should take care about:
Kafka-producer tries to allocate batch.size * number_of_destination_partitions memory on the start. if you have batch.size equals 10mb and 3k topics with 1 partition per topic, Kafka-producer will require at least ~30gb on the start (source code).
so the more destination partitions you have, the less batch.size you have to set up or the more memory you need. we chose small batch.size
messages rate per destination topics does't affect general performance. Kafka-producer sends several batches per one request. here max.request.size comes into the play (source code, maxSize is max.request.size). the higher max.request.size, the more batches could be sent per one request. it is important to understand that reaching a batch.size or a linger.ms don't instantly triggers sending batch to the broker. as soon as batch reaches the batch.size or the linger.ms, it is marked as sendable and will be processed later with other batches (source code).
moreover, batch.size or a linger.ms are not the only reasons to mark batch as sendable (check the previous link). and this is where the batches are actually sent (source code). that's why the same events rate per destination topics is not required, but still there are some nuances which will be described next.
2.1. a few words about linger.ms. can't say for sure how it acts in this scenario. on the one hand, the larger it is, the longer Kafka-producer will wait to collect messages for exact partition and the more data for that partition will be send per one request. one the other hand, it seems like the less it is, there more batches for different partitions could be packed into one request. while there is no certainty about how to do better.
despite that Kafka-producer is able to send more than one batch per request, it can't send more that one batch per request for one specific partition. thats why if you have skewed messages rate for destination topics, you have to increase partitions count for most loaded ones to increase throughput. but it's always necessary to remember that an increasing partitions count leads to an increase in memory usage.
actually, an information above helped us to solve our problems with performance. but there may be other nuances that we don't know about yet.
I hope it will be useful.

Kafka fetch max bytes doesn't work as expected

I have a topic worth 1 GB of messages. A. Kafka consumer decides to consume these messages. What could I do to prohibit the consumer from consuming all messages at once? I tried to set the
fetch.max.bytes on the broker
to 30 MB to allow only 30 MB of messages in each poll. The broker doesn't seem to honor that and tries to give all messages at once to the consumer causing Consumer out of memory error. How can I resolve this issue?
Kafka configurations can be quite overwhelming. Typically in Kafka, multiple configurations can work together to achieve a result. This brings flexibility, but flexibility comes with a price.
From the documentation of fetch.max.bytes:
Records are fetched in batches by the consumer, and if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned to ensure that the consumer can make progress.
Only on the consumer side, there are more configurations to consider for bounding the consumer memory usage, including:
max.poll.records: limits the number of records retrieved in a single call to poll. Default is 500.
max.partition.fetch.bytes: limits the number of bytes fetched per partition. This should not be a problem as the default is 1MB.
As per the information in KIP-81, the memory usage in practice should be something like min(num brokers * max.fetch.bytes, max.partition.fetch.bytes * num_partitions).
Also, in the same KIP:
The consumer (Fetcher) delays decompression until the records are returned to the user, but because of max.poll.records, it may end up holding onto the decompressed data from a single partition for a few iterations.
I'd suggest you to also tune these parameters and hopefully this will get you into the desired state.

Better Understanding Min Fetch Bytes Within Kafka?

Looking at some config I'm tuning for Kafka for batching records to a file.
I see min fetch bytes which is the minimum number of bytes returned from a single poll across N partitions of a topic. Here is the scenario I'm concerned about:
I set min fetch to 100mb worth of record data. Let's say I have 250mb worth of data. I do two polls and persist 200mb. Now.. I have 50mb sitting in the queue, but I still want it to be proccessed, but don't plan of having more data to come in. If the timeout is hit, will it just grab the remaining 50mb?
Sorry, I should have looked at the docs a bit more closely. Seeing this is used in conjunction with the timeout.
fetch.max.wait.ms
By setting fetch.min.bytes, you tell Kafka to wait until it has enough
data to send before responding to the consumer. fetch.max.wait.ms lets
you control how long to wait. By default, Kafka will wait up to 500
ms. This results in up to 500 ms of extra latency in case there is not
enough data flowing to the Kafka topic to satisfy the minimum amount
of data to return. If you want to limit the potential latency (usually
due to SLAs controlling the maximum latency of the application), you
can set fetch.max.wait.ms to a lower value. If you set
fetch.max.wait.ms to 100 ms and fetch.min.bytes to 1 MB, Kafka will
receive a fetch request from the consumer and will respond with data
either when it has 1 MB of data to return or after 100 ms, whichever
happens first.
tl;dr if timeout exceeds before queue is filled, it would just return the remaining 50mb

Increase the number of messages read by a Kafka consumer in a single poll

Kafka consumer has a configuration max.poll.records which controls the maximum number of records returned in a single call to poll() and its default value is 500. I have set it to a very high number so that I can get all the messages in a single poll.
However, the poll returns only a few thousand messages(roughly 6000) in a single call even though the topic has many more. How can I further increase the number of messages read by a single consumer?
You can increase Consumer poll() batch size by increasing max.partition.fetch.bytes, but still as per documentation it has limitation with fetch.max.bytes which also need to be increased with required batch size. And also from the documentation there is one other property message.max.bytes in Topic config and Broker config to restrict the batch size. so one way is to increase all of these property based on your required batch size
In Consumer config max.partition.fetch.bytes default value is 1048576
The maximum amount of data per-partition the server will return. Records are fetched in batches by the consumer. If the first record batch in the first non-empty partition of the fetch is larger than this limit, the batch will still be returned to ensure that the consumer can make progress. The maximum record batch size accepted by the broker is defined via message.max.bytes (broker config) or max.message.bytes (topic config). See fetch.max.bytes for limiting the consumer request size
In Consumer Config fetch.max.bytes default value is 52428800
The maximum amount of data the server should return for a fetch request. Records are fetched in batches by the consumer, and if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned to ensure that the consumer can make progress. As such, this is not a absolute maximum. The maximum record batch size accepted by the broker is defined via message.max.bytes (broker config) or max.message.bytes (topic config). Note that the consumer performs multiple fetches in parallel.
In Broker config message.max.bytes default value is 1000012
The largest record batch size allowed by Kafka. If this is increased and there are consumers older than 0.10.2, the consumers' fetch size must also be increased so that the they can fetch record batches this large.
In the latest message format version, records are always grouped into batches for efficiency. In previous message format versions, uncompressed records are not grouped into batches and this limit only applies to a single record in that case.
This can be set per topic with the topic level max.message.bytes config.
In Topic config max.message.bytes default value is 1000012
The largest record batch size allowed by Kafka. If this is increased and there are consumers older than 0.10.2, the consumers' fetch size must also be increased so that the they can fetch record batches this large.
In the latest message format version, records are always grouped into batches for efficiency. In previous message format versions, uncompressed records are not grouped into batches and this limit only applies to a single record in that case.
Most probably your payload is limited by max.partition.fetch.bytes, which is 1MB by default. Refer to Kafka Consumer configuration.
Here's good detailed explanation:
MAX.PARTITION.FETCH.BYTES
This property controls the maximum number of bytes the server will return per partition. The default is 1 MB, which means that when KafkaConsumer.poll() returns ConsumerRecords, the record object will use at most max.partition.fetch.bytes per partition assigned to the consumer. So if a topic has 20 partitions, and you have 5 consumers, each consumer will need to have 4 MB of memory available for ConsumerRecords. In practice, you will want to allocate more memory as each consumer will need to handle more partitions if other consumers in the group fail. max. partition.fetch.bytes must be larger than the largest message a broker will accept (determined by the max.message.size property in the broker configuration), or the broker may have messages that the consumer will be unable to consume, in which case the consumer will hang trying to read them. Another important consideration when setting max.partition.fetch.bytes is the amount of time it takes the consumer to process data. As you recall, the consumer must call poll() frequently enough to avoid session timeout and subsequent rebalance. If the amount of data a single poll() returns is very large, it may take the consumer longer to process, which means it will not get to the next iteration of the poll loop in time to avoid a session timeout. If this occurs, the two options are either to lower max. partition.fetch.bytes or to increase the session timeout.
Hope it helps!

How to set Kafka Producer message rate per second?

I am reading a csv file and giving the rows of this input to my Kafka Producer. now I want my Kafka Producer to produce messages at a rate of 100 messages per second.
Take a look at linger.ms and batch.size properties of Kafka Producer.
You have to adjust these properties correspondingly to get desired rate.
The producer groups together any records that arrive in between request transmissions into a single batched request. Normally this occurs only under load when records arrive faster than they can be sent out. However in some circumstances the client may want to reduce the number of requests even under moderate load. This setting accomplishes this by adding a small amount of artificial delay—that is, rather than immediately sending out a record the producer will wait for up to the given delay to allow other records to be sent so that the sends can be batched together. This can be thought of as analogous to Nagle's algorithm in TCP. This setting gives the upper bound on the delay for batching: once we get batch.size worth of records for a partition it will be sent immediately regardless of this setting, however if we have fewer than this many bytes accumulated for this partition we will 'linger' for the specified time waiting for more records to show up. This setting defaults to 0 (i.e. no delay). Setting linger.ms=5, for example, would have the effect of reducing the number of requests sent but would add up to 5ms of latency to records sent in the absense of load.
If you like stream processing then akka-streams has nice support for throttling: http://doc.akka.io/docs/akka/current/java/stream/stream-quickstart.html#time-based-processing
Then the akka-stream-kafka (aka reactive-kafka) library allows you to connect the two together: http://doc.akka.io/docs/akka-stream-kafka/current/home.html
In Kafka JVM Producer, the throughput depends upon multiple factors. And most commonly it's calculated in MB/sec rather than Msg/sec. In your example, if let's say each of your row in CSV is 1MB in size then you need to tune your producer configs to achieve 100MB/sec, so that you can achieve your target throughput of 100 Msg/sec. While tuning producer configs, you have to take into the consideration what's your batch.size ( measured in bytes ) config value? If it's set too low then producer will try to send messages more often and wait for reply from server. This will improve the producer's throughput. But would impact the latency. If you are using async callback based producer then in this case your overall throughput will be limited by how many number of messages producer can send before waiting for reply from server determined by max.in.flight.request.per.connection.
If you keep batch.size too high then producer throughput will get affected since after waiting for linger.ms period kafka producer will send the all messages in a batch to broker for that particular partition at once. But having bigger batch.size means bigger buffer.memory which might put pressure on GC.