Kafka: bizarre assignment of partitions in a topic - apache-kafka

I'm not sure how to explain the issue I'm facing with Kafka, but I'll try my best. I have a set of 4 consumers in the same consumer group, named:
absolutegrounds.helper.processor
consuming from a topic with 5 partitions; three of the consumers are therefore assigned 1 partition each and one consumer is assigned 2 partitions, so that the 5 partitions are distributed fairly among the 4 consumers.
But for some reason I cannot figure out, the initial assignment changed so that only 2 consumers hold all the available partitions, i.e. 1 consumer with 3 partitions and 1 consumer with 2 partitions, even though there should theoretically still be 4 consumers in the same consumer group:
[medinen#ocvlp-rks001 kafka_2.11-0.10.0.1]$ ./bin/kafka-run-class.sh kafka.admin.ConsumerGroupCommand --new-consumer --bootstrap-server localhost:9092 --describe --group absolutegrounds.helper.processor
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG OWNER
absolutegrounds.helper.processor AG_TASK_SOURCE 0 27286 31535 4249 consumer-1_/10.132.9.128
absolutegrounds.helper.processor AG_TASK_SOURCE 1 28015 28045 30 consumer-1_/10.132.9.128
absolutegrounds.helper.processor AG_TASK_SOURCE 2 35437 40091 4654 consumer-1_/10.132.9.128
absolutegrounds.helper.processor AG_TASK_SOURCE 3 31765 31874 109 consumer-1_/10.132.8.23
absolutegrounds.helper.processor AG_TASK_SOURCE 4 33279 38003 4724 consumer-1_/10.132.8.23
The most bizarre behaviour is that the other 2 consumers, which have been left out of the consumer group according to the Kafka output above, still seem to be consuming from the topic, judging by my application logs, although I cannot find them anywhere as part of the consumer group. Even more bizarre, one of them believes it is assigned all the partitions in the topic, whereas the other one is assigned only partition 4. See the logs from the application (this is a Spring Boot application using Spring Kafka):
First left-out consumer:
- - 08/05/2017 12:27:29.119 - [-kafka-consumer-1] INFO o.a.k.c.c.i.ConsumerCoordinator - Setting newly assigned partitions [AG_TASK_SOURCE-0, AG_TASK_SOURCE-1, AG_TASK_SOURCE-2, AG_TASK_SOURCE-3, AG_TASK_SOURCE-4] for group absolutegrounds.helper.processor
Second left-out consumer:
- - 08/05/2017 12:27:19.044 - [-kafka-consumer-1] INFO o.a.k.c.c.i.ConsumerCoordinator - Setting newly assigned partitions [AG_TASK_SOURCE-4] for group absolutegrounds.helper.processor
Trying to understand the reasoning behind this behaviour, I've taken a look into the internal topic that stores the consumer offsets and group metadata:
__consumer_offsets
using this command:
kafka/kafka_2.11-0.10.0.1/bin/kafka-console-consumer.sh --consumer.config /tmp/consumer.config --formatter "kafka.coordinator.GroupMetadataManager\$GroupMetadataMessageFormatter" --zookeeper ocvlp-rks003:2181 --topic __consumer_offsets --from-beginning | grep "absolutegrounds.helper.processor"
and this is what I've found:
absolutegrounds.helper.processor::[absolutegrounds.helper.processor,consumer,Stable,Map(consumer-1-170fb8f6-c8d3-4782-8940-350673b859cb -> [consumer-1-170fb8f6-c8d3-4782-8940-350673b859cb,consumer-1,/10.132.8.23,10000], consumer-1-b8d3afc0-159e-4660-bc65-faf68900c332 -> [consumer-1-b8d3afc0-159e-4660-bc65-faf68900c332,consumer-1,/10.132.9.128,10000], consumer-1-dddf10ad-187b-4a29-9996-e05edaad3caf -> [consumer-1-dddf10ad-187b-4a29-9996-e05edaad3caf,consumer-1,/10.132.8.22,10000], consumer-1-2e4069f6-f3a8-4ede-a4f4-aadce6a3adb7 -> [consumer-1-2e4069f6-f3a8-4ede-a4f4-aadce6a3adb7,consumer-1,/10.132.9.129,10000])]
absolutegrounds.helper.processor::[absolutegrounds.helper.processor,consumer,Stable,Map(consumer-1-66de4a46-538c-425f-8e95-5a00ff5eb5fd -> [consumer-1-66de4a46-538c-425f-8e95-5a00ff5eb5fd,consumer-1,/10.132.9.129,10000])]
absolutegrounds.helper.processor::[absolutegrounds.helper.processor,consumer,Stable,Map(consumer-1-5b96166e-e528-48f7-8f6e-18a67328eae6 -> [consumer-1-5b96166e-e528-48f7-8f6e-18a67328eae6,consumer-1,/10.132.9.128,10000], consumer-1-dcfff37a-8ad3-403c-a070-cca82a1f6d21 -> [consumer-1-dcfff37a-8ad3-403c-a070-cca82a1f6d21,consumer-1,/10.132.8.23,10000])]
absolutegrounds.helper.processor::[absolutegrounds.helper.processor,consumer,Stable,Map(consumer-1-5b96166e-e528-48f7-8f6e-18a67328eae6 -> [consumer-1-5b96166e-e528-48f7-8f6e-18a67328eae6,consumer-1,/10.132.9.128,10000], consumer-1-dcfff37a-8ad3-403c-a070-cca82a1f6d21 -> [consumer-1-dcfff37a-8ad3-403c-a070-cca82a1f6d21,consumer-1,/10.132.8.23,10000])]
From the response of Kafka, I can see that at some point in time all 4 consumers were properly distributed across the partitions:
absolutegrounds.helper.processor::[absolutegrounds.helper.processor,consumer,Stable,Map(consumer-1-170fb8f6-c8d3-4782-8940-350673b859cb -> [consumer-1-170fb8f6-c8d3-4782-8940-350673b859cb,consumer-1,/10.132.8.23,10000], consumer-1-b8d3afc0-159e-4660-bc65-faf68900c332 -> [consumer-1-b8d3afc0-159e-4660-bc65-faf68900c332,consumer-1,/10.132.9.128,10000], consumer-1-dddf10ad-187b-4a29-9996-e05edaad3caf -> [consumer-1-dddf10ad-187b-4a29-9996-e05edaad3caf,consumer-1,/10.132.8.22,10000], consumer-1-2e4069f6-f3a8-4ede-a4f4-aadce6a3adb7 -> [consumer-1-2e4069f6-f3a8-4ede-a4f4-aadce6a3adb7,consumer-1,/10.132.9.129,10000])]
However, at some point later, the assignment changed to the current scenario where only 2 of the 4 consumers in the consumer group are assigned partitions.
I've struggled to understand what could have led to this situation, but I cannot find a valid explanation, let alone a fix.
Can anyone help here? Thanks.

Related

Kafka topic partition has missing offsets

I have a Flink streaming application which is consuming data from a Kafka topic that has 3 partitions. Even though the application is continuously running and working without any obvious errors, I see a lag in the consumer group for the Flink app on all 3 partitions.
./kafka-consumer-groups.sh --bootstrap-server $URL --all-groups --describe
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
group-1 topic-test 0 9566 9568 2 - - -
group-1 topic-test 1 9672 9673 1 - - -
group-1 topic-test 2 9508 9509 1 - - -
If I send new records, they get processed, but the lag still exists. I tried to view the last few records for partition 0 and this is what I got (omitting the message part):
./kafka-console-consumer.sh --topic topic-test --bootstrap-server $URL --property print.offset=true --partition 0 --offset 9560
Offset:9560
Offset:9561
Offset:9562
Offset:9563
Offset:9564
Offset:9565
The log-end-offset value is at 9568 and the current offset is at 9566. Why are these offsets not available in the console consumer and why does this lag exist?
There were a few instances where I noticed missing offsets. Example -
Offset:2344
Offset:2345
Offset:2347
Offset:2348
Why did the offset jump from 2345 to 2347 (skipping 2346)? Does this have something to do with how the producer is writing to the topic?
You can describe your topic to see any configuration that was added when it was created. If log compaction is enabled through log.cleanup.policy=compact, the behaviour at runtime will be different. The lag you see can be due to the log-compaction lag settings, and the missing offsets can be caused by messages produced with a key but a null value (tombstones), which compaction removes.
Configuring The Log Cleaner
The log cleaner is enabled by default. This will start the pool of cleaner threads. To enable log cleaning on a particular topic, add the log-specific property log.cleanup.policy=compact.
The log.cleanup.policy property is a broker configuration setting defined in the broker's server.properties file; it affects all of the topics in the cluster that do not have a configuration override in place. The log cleaner can be configured to retain a minimum amount of the uncompacted "head" of the log. This is enabled by setting the compaction time lag log.cleaner.min.compaction.lag.ms.
This can be used to prevent messages newer than a minimum message age from being subject to compaction. If not set, all log segments are eligible for compaction except for the last segment, i.e. the one currently being written to. The active segment will not be compacted even if all of its messages are older than the minimum compaction time lag.
The log cleaner can be configured to ensure a maximum delay after which the uncompacted "head" of the log becomes eligible for log compaction log.cleaner.max.compaction.lag.ms.
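As an illustration of the tombstone case mentioned above, here is a minimal sketch of a producer writing a keyed record with a null value (the topic name and broker address are placeholders, not taken from the question). On a topic with cleanup.policy=compact, the log cleaner will eventually remove such records along with older records for the same key, which is one way offsets can go missing from the log:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TombstoneProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A keyed record with a null value is a "tombstone": on a compacted topic
            // the cleaner eventually drops it together with older records for the same
            // key, leaving gaps in the visible offsets.
            producer.send(new ProducerRecord<>("topic-test", "some-key", null));
            producer.flush();
        }
    }
}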
The lag is calculated based on the latest offset committed by the Kafka consumer (lag = latest offset - latest committed offset). In general, Flink commits Kafka offsets when it performs a checkpoint, so there is always some lag if you check it using the consumer-groups command.
That doesn't mean that Flink hasn't consumed and processed all the messages in the topic/partition; it just means that it has not committed them yet.
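To make the lag formula above concrete, here is a rough sketch of how the same numbers can be read programmatically with the AdminClient (the group id and broker address are placeholders): the group's committed offsets correspond to CURRENT-OFFSET and the log-end offsets to LOG-END-OFFSET in the tool's output.

import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class LagCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        try (Admin admin = Admin.create(props)) {
            // Offsets committed by the group (CURRENT-OFFSET)
            Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets("group-1").partitionsToOffsetAndMetadata().get();

            // Log-end offsets for the same partitions (LOG-END-OFFSET)
            Map<TopicPartition, OffsetSpec> latestSpec = new HashMap<>();
            committed.keySet().forEach(tp -> latestSpec.put(tp, OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> logEnd =
                    admin.listOffsets(latestSpec).all().get();

            // LAG = log-end offset minus last committed offset
            committed.forEach((tp, meta) ->
                    System.out.printf("%s lag=%d%n", tp, logEnd.get(tp).offset() - meta.offset()));
        }
    }
}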

How to create multiple kafka consumer groups in same topic

Below is the required scenario.
Topic-1 has 6 partitions. Now I want to create 3 consumer groups cg1, cg2 and cg3 and map them like this (cg1 - 0,1; cg2 - 2,3; cg3 - 4,5). How can I create this using kafka-console-consumer.sh or kafka-consumer-groups.sh?
The Kafka documentation even describes this scenario, but nowhere explains how to do it.
Any help is appreciated !!!
A Kafka consumer group is a collection of consumers that share the same group id. A consumer group distributes processing by sharing partitions across its consumers.
For example, with a single topic with three partitions and a consumer group with two members, each partition in the topic is assigned to exactly one member of the group.
Note: a topic with n partitions can be consumed by at most n consumers of a consumer group, with 1 partition per consumer.
In your case, if you use a consumer group on a topic, all of the topic's partitions will get assigned to consumers in that group.
But if you are not interested in consumer groups, you can assign partitions to each consumer directly; in that case rebalancing does not come into the picture.
I am using Confluent Kafka 2.6.0-5.1.2:
sh kafka-console-consumer --bootstrap-server localhost:9092 --partition 0 --topic abc --group cg1
sh kafka-console-consumer --bootstrap-server localhost:9092 --partition 1 --topic abc --group cg1
--partition <Integer: partition> : The partition to consume from. Consumption starts from the end of the partition unless '--offset' is specified.
Using the consumer group, you can describe the consumer details:
sh kafka-consumer-groups --bootstrap-server localhost:9020 --describe --group a
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
abc 0 123 678 0 - - -
abc 1 234 345 0 - - -
You can also manually assign partitions through Java, as below:
List<TopicPartition> partitions = new ArrayList<>();
partitions.add(new TopicPartition("abc", 0));
partitions.add(new TopicPartition("abc", 1));
// ... add any further partitions here

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProperties);
consumer.assign(partitions); // manual assignment: no group rebalancing is involved
Note that it isn't possible to mix manual partition assignment (i.e. using assign) with dynamic partition assignment through topic subscription (i.e. using subscribe).
Ref: here
There are the following alternative approaches:
Use 3 separate topics and consume each with its own consumer group.
Programmatically filter partitions while consuming messages (a sketch of this follows below).
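A rough sketch of the second approach (topic name, group id and broker address are placeholders, not from the question): subscribe normally with a group id and simply skip records from the partitions the application is not interested in.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class FilteringConsumer {
    public static void main(String[] args) {
        Set<Integer> wantedPartitions = Set.of(0, 1); // e.g. the partitions intended for cg1

        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "cg1");                     // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Dynamic assignment: Kafka decides which partitions this member gets
            consumer.subscribe(Collections.singletonList("Topic-1"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    if (!wantedPartitions.contains(record.partition())) {
                        continue; // ignore partitions this group does not care about
                    }
                    System.out.printf("p%d@%d: %s%n", record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}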
I can't try it out right now, but I think you need to pass it as a consumer property
kafka-console-consumer.sh --consumer-property group.id=${your_group_id}
or if you have a config file
kafka-console-consumer.sh --consumer.config ${your_config_file}
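For reference, such a config file is just a plain Java properties file of consumer settings; a minimal sketch (the group and client ids are only examples) could contain:

group.id=cg1
client.id=console-consumer-cg1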

Current offset behavior when set by kafka-consumer-groups to earliest?

I have a kafka topic with 25 partitions and the cluster has been running for 5 months.
As per my understanding, for each partition of a given topic the offsets start from 0, 1, 2, ... (unbounded).
I see the log-end-offset at a very high value (right now -> 1230628032).
I created a new consumer group with the offset reset to earliest, so I expected a client in that consumer group to start from offset 0.
The command which I used to create the new consumer group with the offset set to earliest:
kafka-consumer-groups --bootstrap-server <IP_address>:9092 --reset-offsets --to-earliest --topic some-topic --group to-earliest-cons --execute
I see the consumer group being created, and I expected the current-offset to be 0; however, when I described the consumer group, the current offset was very high, at the moment --> 1143755193.
The record retention period is set to 7 days (the standard value).
My question is: why isn't the first offset from which a consumer in this consumer group will read 0? Does it have something to do with data retention?
Can anyone help me understand this?
It is exactly data retention. It is highly probable that Kafka has already removed the old messages with offset 0 from your partitions, so it doesn't make sense to start from 0. Instead, Kafka will set the offset to the earliest message still available on each partition. You can check those offsets using:
./kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list <IP_address>:9092 --topic some-topic --time -2
You will probably see values really close to what you're seeing as new consumer offset.
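The same earliest-available offsets can also be read programmatically; a minimal sketch using the consumer API, assuming a local broker and partition 0 of the topic from the question:

import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class EarliestOffsets {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // beginningOffsets() returns the first offset still present in each partition;
            // after retention has kicked in this is usually well above 0
            List<TopicPartition> partitions = List.of(new TopicPartition("some-topic", 0));
            Map<TopicPartition, Long> earliest = consumer.beginningOffsets(partitions);
            earliest.forEach((tp, offset) -> System.out.println(tp + " -> " + offset));
        }
    }
}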
You can also try and set offset explicitly to 0:
./kafka-consumer-groups.sh --bootstrap-server <IP_address>:9092 --reset-offsets --to-offset 0 --topic some-topic --group to-earliest-cons --execute
However, you will see a warning that offset 0 does not exist, and the higher value mentioned above (the earliest available message) will be used instead:
New offset (0) is lower than earliest offset for topic partition some-topic. Value will be set to 1143755193

How can we run multiple kafka consumers through command line?

I am testing Kafka performance through the shell scripts already provided in the Kafka package. I have created a topic with 10 partitions and I am pumping data as shown below:
./bin/kafka-producer-perf-test.sh --topic test-topic --num-records 9000000 --record-size 300 --throughput 250000 --producer-props bootstrap.servers=110.17.14.302:9092 acks=1 max.in.flight.requests.per.connection=1 batch.size=5000
Now I want to consume the data which I am pumping as shown above from multiple consumers, not just from a single consumer. So I started using kafka-consumer-perf-test.sh. This is what I was doing:
./bin/kafka-consumer-perf-test.sh --zookeeper localhost:2181 --topic test-topic --group test1
Is there any way to run multiple Kafka consumers in a single consumer group through the command line, with each of those consumers working on different partitions, using kafka-consumer-perf-test.sh? I am working with Kafka version 0.10.1.0.
I saw this SO post, but it doesn't say where to configure how many consumers we want to run and which partitions they will work on.
Update:
This is the error I saw:
./bin/kafka-consumer-perf-test.sh --zookeeper 110.27.14.10:2181 --messages 50 --topic test-topic --threads 1
[2017-01-11 22:34:09,785] WARN [ConsumerFetcherThread-perf-consumer-14195_kafka-cluster-3098529006-zeidk-1484174043509-46a51434-2-0], Error in fetch kafka.consumer.ConsumerFetcherThread$FetchRequest#54fb48b6 (kafka.consumer.ConsumerFetcherThread)
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:93)
at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:99)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:83)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:132)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:132)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:132)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:131)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:131)
at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:131)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:130)
at kafka.consumer.ConsumerFetcherThread.fetch(ConsumerFetcherThread.scala:109)
at kafka.consumer.ConsumerFetcherThread.fetch(ConsumerFetcherThread.scala:29)
at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
Just run the same command (i.e., ./bin/kafka-consumer-perf-test.sh) multiple times in different consoles.
About partition assignment: Kafka will do this automatically for you if you use consumer groups.
If you want to do manual partition assignment, you cannot use consumer groups. For this, you cannot use kafka-consumer-perf-test.sh but need to write your own consumer.
Read the JavaDoc here: https://kafka.apache.org/0101/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html

Why does a kafka consumer (0.10.0.0) for a new consumer group not see old/previously published messages?

I have a producer that publishes messages on a topic called 'mytopic' just fine. I have 2 consumers in 2 different consumer groups listening for these messages. I started these 2 consumers and the producer in the following sequence:
1) Start consumer 1 in group 'group1'
2) Start the producer to publish several hundred messages
After some time I check the offset of consumer 1, which is as I expect:
/opt/kafka_2.11-0.10.0.0/bin/kafka-consumer-offset-checker.sh --zookeeper localhost:2181 --topic mytopic --group group1
Output:
Group Topic Pid Offset logSize Lag Owner
group1 mytopic 0 30230 36942 6712 none
3) Now I start consumer 2 in group 'group2' to listen to the same messages, but it comes back with 0 messages on every poll() call.
The offset check for this consumer shows me that its offset is the same as the logSize:
/opt/kafka_2.11-0.10.0.0/bin/kafka-consumer-offset-checker.sh --zookeeper localhost:2181 --topic mytopic --group group2
Output:
Group Topic Pid Offset logSize Lag Owner
group2 mytopic 0 36942 36942 0 none
Same problem for any other consumer with a new consumer group. Why does a consumer joining a new consumer group after messages have been published not see the old messages, even though the messages still exist on the topic (i.e., they haven't been deleted)?
You need to change the parameter auto.offset.reset to the value "earliest" in your consumer configuration -- the default value is "latest", which tells a new consumer group to start consuming at the current end of the log.
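A minimal sketch of that setting in a new consumer's configuration (the group id, topic and broker address are placeholders); note that auto.offset.reset only takes effect when the group has no committed offsets yet:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class FromBeginningConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "group2");                  // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Without a committed offset, start from the beginning of the log instead of the end
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("mytopic"));
            consumer.poll(Duration.ofSeconds(5))
                    .forEach(r -> System.out.println(r.offset() + ": " + r.value()));
        }
    }
}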