Kafka consumer, which takes effect, max.poll.records or max.partition.fetch.bytes? - apache-kafka

I have trouble understanding these two consumer settings. Let's say I have a topic with 20 partitions, all messages are 1 KB in size, and, to simplify this discussion, suppose that I have only one consumer.
If I set max.partition.fetch.bytes=1024, then, since each partition will give me one message, I will get 20 messages in one go, if I understand it correctly.
But what if I set max.poll.records=10? Could anyone help to provide some explanation? Many thanks.

There is no direct relation between the two configs. max.partition.fetch.bytes specifies the maximum amount of data per partition that the server will return, while max.poll.records specifies the maximum number of records returned in a single call to poll().
So in your case, if you have a topic with 20 partitions, each holding one record of 1 KB, and there is only one consumer subscribed to this topic,
then you can fetch each of the messages, because the message size doesn't exceed max.partition.fetch.bytes, but you will still get at most 10 messages in one poll.
As a result:
Your consumer will get 10 messages in the first call to poll().
Your second poll() will return the other 10 records.
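A minimal sketch of that setup with the standard Java client; the topic name and broker address are placeholders, and the values mirror the scenario in the question:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollLimitsDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "poll-limits-demo");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("max.partition.fetch.bytes", "1024"); // ~1 KB, i.e. roughly one message, per partition per fetch
        props.put("max.poll.records", "10");            // at most 10 records handed to the application per poll()

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            for (int i = 0; i < 5; i++) {
                // Even if ~20 records (one per partition) were fetched from the brokers,
                // each poll() hands back at most 10; the rest are buffered for the next poll().
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                System.out.printf("poll %d returned %d records%n", i, records.count());
            }
        }
    }
}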

Related

Limit for the number of messages in a topic partition

Is there a limit on the total number of messages (or records, or offsets) within a partition of a topic? I know there is a maximum message size limit we can set either per topic or at the broker level, but I want to know whether there is a limit on the total or maximum number of messages within a particular partition.
Thanks
There is a theoretical limit. Kafka stores offsets as a Java Long. Therefore, I'd assume it's around Long.MAX_VALUE, but it would take years for you to really reach it.
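To put a rough number on that, here is a back-of-the-envelope calculation; the one-million-messages-per-second rate is just an assumed figure:

public class OffsetLimit {
    public static void main(String[] args) {
        long maxOffset = Long.MAX_VALUE;            // 9,223,372,036,854,775,807 possible offsets per partition
        double msgsPerSecond = 1_000_000d;          // assumed sustained rate into a single partition
        double seconds = maxOffset / msgsPerSecond;
        double years = seconds / (365.25 * 24 * 3600);
        System.out.printf("~%.0f years to exhaust the offset range at 1M msgs/sec%n", years);
    }
}

At that rate it works out to roughly 290,000 years, so the limit is purely theoretical.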

How does one Kafka consumer read from more than one partition?

I would like to know how one consumer consumes from more than one partition, specifically, in what order are messages read from the different partitions?
I had a peek at the source code (Consumer, Fetcher) but I can't really follow all of it.
This is what I thought would happen:
Partitions are read sequentially. That is: all the messages in one partition will be read before continuing to the next. If we reach max.poll.records without consuming the whole partition, the next fetch will continue reading the current partition until it is exhausted, before going on to the next.
I tried setting max.poll.records to a relatively low number and seeing what happens.
If I send messages to a topic and then start a consumer, all the messages are read from one partition before continuing to the next, even if the number of messages in that partition is higher than max.poll.records.
Then I tried to see if I could "lock" the consumer in one partition, by sending messages to that partition continuously (using JMeter). But I couldn't do it: messages from other partitions were also being read.
The consumer is polling for messages from its assigned partitions in a greedy round-robin way.
e.g. if max.poll.records is set to 100 and the consumer is assigned two partitions, A and B, it will try to poll 100 records from A. If partition A doesn't have 100 messages available, it will poll whatever is left to complete the 100 from partition B.
Although this is not ideal, this way no partition will be starved.
This also explains why ordering is not guaranteed between partitions.
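A small sketch to observe this behaviour (placeholder topic name and broker address): with a relatively low max.poll.records, print how many records each poll() returned per partition.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class PollDistributionDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "poll-distribution-demo");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("max.poll.records", "100"); // a low value makes the per-poll distribution visible

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                // Count how many of the returned records came from each assigned partition.
                for (TopicPartition tp : records.partitions()) {
                    System.out.printf("partition %d -> %d records%n",
                            tp.partition(), records.records(tp).size());
                }
            }
        }
    }
}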
I have read the KIP mentioned in the answer to the question linked in the comments and I think I finally understood how the consumer works.
There are two main configuration options that affect how data is consumed:
max.partition.fetch.bytes: the maximum amount of data that the server will return for a given partition
max.poll.records: the maximum amount of records that are returned each time the consumer polls
The process of fetching from each partition is greedy and proceeds in a round-robin way. Greedy means that as many records as possible will be retrieved from each partition; if all records in a partition occupy less than max.partition.fetch.bytes, then all of them will be fetched; otherwise, only max.partition.fetch.bytes will be fetched.
Now, not all the fetched records will be returned in a poll call. Only max.poll.records will be returned.
The remaining records will be retained for the next call to poll.
Moreover, if the number of retained records is less than max.poll.records, the poll method will start a new round of fetching (pre-fetching) before returning. This means that, usually, the consumer is processing records while new records are being fetched.
If some partitions receive considerably more messages than others, this could lead to the less active partitions not being processed for long periods of time.
The only downside to this approach is that it could lead to some partitions going unconsumed for an extended amount of time when there is a large imbalance between the partition's respective message rates. For example, suppose that a consumer with max messages set to 1 fetches data from partitions A and B. If the returned fetch includes 1000 records from A and no records from B, the consumer will have to process all 1000 available records from A before fetching on partition B again.
In order to prevent this, we could reduce max.partition.fetch.bytes.
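For illustration, a hedged sketch of that mitigation using the 1 KB messages from the original question; the 16 KB figure is only an example value, not a recommendation:

import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class BalancedFetchConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "balanced-consumer");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // With 1 KB records, a 16 KB per-partition cap means a single busy partition can
        // contribute at most ~16 records to a fetch, leaving room for quieter partitions.
        props.put("max.partition.fetch.bytes", "16384");
        props.put("max.poll.records", "100");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // ... subscribe and poll as usual ...
        consumer.close();
    }
}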

How to configure the Kafka consumer so that after autoscaling the total number of messages fetched remains the same?

Let's say I have a consumer running which is fetching from 10 partitions. In one poll request the consumer fetches 10 records per partition so 100 records in total.
Now, after adding one more consumer to the group and rebalancing, both consumers are fetching from 5 partitions, and each consumer is now fetching 50 records in total (10 per partition).
I want to know if there is a way to configure the Kafka consumer so that, even after adding one more consumer, both consumers start fetching 20 records per partition, so that the total remains 100.
I tried using max.poll.records and fetch.max.bytes, but it didn't work for me.
After setting fetch.max.bytes to, say, 1000, Kafka was fetching 25 records from the partitions.
And after setting max.poll.records to, say, 50, each partition had at most 25 records during a poll, so 250 records for 10 partitions. I want to keep the total at 50 records. How can I do that?
There is no direct configuration you can set to tell the KafkaConsumer how many messages exactly it should fetch.
I am sure there are other solutions but I see the following two options:
If you know the message sizes and the messages are of roughly identical byte size, use fetch.min.bytes together with fetch.max.wait.ms to get the minimal required amount of messages. By also adjusting max.poll.records you can try to get to the exact required number.
Use seek of the KafkaConsumer to tell the consumer exactly to which offset position per partition it should fetch the data on the next poll. The seek API is described in the JavaDocs of the KafkaConsumer as "Overrides the fetch offsets that the consumer will use on the next poll(timeout). If this API is invoked for the same partition more than once, the latest offset will be used on the next poll(). Note that you may lose data if this API is arbitrarily used in the middle of consumption, to reset the fetch offsets".
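A rough sketch of the seek-based option, assuming a placeholder topic name and a hypothetical cap of 20 records per partition per poll; apart from the seek() call itself, the structure is just illustrative scaffolding:

import java.time.Duration;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SeekCappedConsumer {
    // Hypothetical per-partition cap chosen for this illustration.
    private static final int MAX_PER_PARTITION = 20;

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "seek-capped");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (TopicPartition tp : records.partitions()) {
                    List<ConsumerRecord<String, String>> partRecords = records.records(tp);
                    int limit = Math.min(partRecords.size(), MAX_PER_PARTITION);
                    for (int i = 0; i < limit; i++) {
                        // process partRecords.get(i) ...
                    }
                    if (partRecords.size() > limit) {
                        // Rewind to the first unprocessed offset so the next poll
                        // re-delivers the surplus instead of skipping it.
                        consumer.seek(tp, partRecords.get(limit).offset());
                    }
                }
            }
        }
    }
}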

Kafka fetch max bytes doesn't work as expected

I have a topic worth 1 GB of messages. A Kafka consumer decides to consume these messages. What could I do to prohibit the consumer from consuming all messages at once? I tried to set
fetch.max.bytes on the broker
to 30 MB to allow only 30 MB of messages in each poll. The broker doesn't seem to honor that and tries to give all the messages at once to the consumer, causing a consumer out-of-memory error. How can I resolve this issue?
Kafka configurations can be quite overwhelming. Typically in Kafka, multiple configurations can work together to achieve a result. This brings flexibility, but flexibility comes with a price.
From the documentation of fetch.max.bytes:
Records are fetched in batches by the consumer, and if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned to ensure that the consumer can make progress.
On the consumer side alone, there are more configurations to consider for bounding the consumer's memory usage, including:
max.poll.records: limits the number of records retrieved in a single call to poll. Default is 500.
max.partition.fetch.bytes: limits the number of bytes fetched per partition. This should not be a problem as the default is 1MB.
As per the information in KIP-81, the memory usage in practice should be something like min(num brokers * fetch.max.bytes, max.partition.fetch.bytes * num_partitions).
Also, in the same KIP:
The consumer (Fetcher) delays decompression until the records are returned to the user, but because of max.poll.records, it may end up holding onto the decompressed data from a single partition for a few iterations.
I'd suggest tuning these parameters as well; hopefully this will get you to the desired state.
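For illustration, a sketch of what that tuning could look like on the consumer side (note that these are consumer properties, not broker properties); the 30 MB / 1 MB / 500 figures only show the shape of the configuration, not recommended values:

import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class BoundedMemoryConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "bounded-memory");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("fetch.max.bytes", String.valueOf(30 * 1024 * 1024));      // cap per fetch request
        props.put("max.partition.fetch.bytes", String.valueOf(1024 * 1024)); // cap per partition
        props.put("max.poll.records", "500");                                // cap per poll()
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // ... subscribe and poll as usual ...
        consumer.close();
    }
}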

Increase the number of messages read by a Kafka consumer in a single poll

Kafka consumer has a configuration max.poll.records which controls the maximum number of records returned in a single call to poll() and its default value is 500. I have set it to a very high number so that I can get all the messages in a single poll.
However, the poll returns only a few thousand messages (roughly 6000) in a single call, even though the topic has many more. How can I further increase the number of messages read by a single consumer?
You can increase the consumer's poll() batch size by increasing max.partition.fetch.bytes, but as per the documentation it is still limited by fetch.max.bytes, which also needs to be increased to match the required batch size. The documentation also mentions one more property, message.max.bytes (broker config) / max.message.bytes (topic config), that restricts the batch size. So one way is to increase all of these properties based on your required batch size:
In the consumer config, max.partition.fetch.bytes has a default value of 1048576:
The maximum amount of data per-partition the server will return. Records are fetched in batches by the consumer. If the first record batch in the first non-empty partition of the fetch is larger than this limit, the batch will still be returned to ensure that the consumer can make progress. The maximum record batch size accepted by the broker is defined via message.max.bytes (broker config) or max.message.bytes (topic config). See fetch.max.bytes for limiting the consumer request size
In the consumer config, fetch.max.bytes has a default value of 52428800:
The maximum amount of data the server should return for a fetch request. Records are fetched in batches by the consumer, and if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned to ensure that the consumer can make progress. As such, this is not an absolute maximum. The maximum record batch size accepted by the broker is defined via message.max.bytes (broker config) or max.message.bytes (topic config). Note that the consumer performs multiple fetches in parallel.
In the broker config, message.max.bytes has a default value of 1000012:
The largest record batch size allowed by Kafka. If this is increased and there are consumers older than 0.10.2, the consumers' fetch size must also be increased so that they can fetch record batches this large.
In the latest message format version, records are always grouped into batches for efficiency. In previous message format versions, uncompressed records are not grouped into batches and this limit only applies to a single record in that case.
This can be set per topic with the topic level max.message.bytes config.
In the topic config, max.message.bytes has a default value of 1000012:
The largest record batch size allowed by Kafka. If this is increased and there are consumers older than 0.10.2, the consumers' fetch size must also be increased so that they can fetch record batches this large.
In the latest message format version, records are always grouped into batches for efficiency. In previous message format versions, uncompressed records are not grouped into batches and this limit only applies to a single record in that case.
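For illustration, a hedged sketch of raising the consumer-side limits together; the byte sizes, record count, and topic name are assumptions, and the broker/topic limits quoted above must also allow batches of the size you expect:

import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class BigBatchConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "big-batch");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("max.poll.records", "100000");                                  // allow many records per poll()
        props.put("max.partition.fetch.bytes", String.valueOf(10 * 1024 * 1024)); // 10 MB per partition
        props.put("fetch.max.bytes", String.valueOf(50 * 1024 * 1024));           // 50 MB per fetch request
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // ... subscribe and poll as usual ...
        consumer.close();
    }
}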
Most probably your payload is limited by max.partition.fetch.bytes, which is 1MB by default. Refer to Kafka Consumer configuration.
Here's good detailed explanation:
MAX.PARTITION.FETCH.BYTES
This property controls the maximum number of bytes the server will return per partition. The default is 1 MB, which means that when KafkaConsumer.poll() returns ConsumerRecords, the record object will use at most max.partition.fetch.bytes per partition assigned to the consumer. So if a topic has 20 partitions, and you have 5 consumers, each consumer will need to have 4 MB of memory available for ConsumerRecords. In practice, you will want to allocate more memory as each consumer will need to handle more partitions if other consumers in the group fail. max.partition.fetch.bytes must be larger than the largest message a broker will accept (determined by the max.message.size property in the broker configuration), or the broker may have messages that the consumer will be unable to consume, in which case the consumer will hang trying to read them.
Another important consideration when setting max.partition.fetch.bytes is the amount of time it takes the consumer to process data. As you recall, the consumer must call poll() frequently enough to avoid session timeout and subsequent rebalance. If the amount of data a single poll() returns is very large, it may take the consumer longer to process, which means it will not get to the next iteration of the poll loop in time to avoid a session timeout. If this occurs, the two options are either to lower max.partition.fetch.bytes or to increase the session timeout.
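For illustration, a rough sketch of those two options on the consumer side; the values are placeholders, and in current clients max.poll.interval.ms is the setting that governs how much time the consumer is allowed between polls:

import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SlowProcessingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "slow-processing");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Option 1: cap how much data one poll can return per partition ...
        props.put("max.partition.fetch.bytes", String.valueOf(512 * 1024));
        // Option 2: ... or allow more time to process it before the consumer is considered dead.
        props.put("max.poll.interval.ms", "600000"); // up to 10 minutes between polls
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // ... subscribe and poll as usual ...
        consumer.close();
    }
}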
Hope it helps!