Consumer wrongly ignoring already consumed messages - apache-kafka

I'm in the midst of migrating a Kafka cluster (1.0.0) to a new Kafka cluster (3.1). I'm using MirrorMaker2 to mirror the source cluster to the target cluster. My MirrorMaker2 setup looks something like this:
refresh_groups_interval_seconds = 60
refresh_topics_enabled = true
refresh_topics_interval_seconds = 60
sync_group_offsets_enabled = true
sync_topic_configs_enabled = true
emit_checkpoints_enabled = true
When I look at topics that don't have any migrated consumer groups, everything looks fine. When I migrate a consumer group to consume from the target cluster (Kafka 3.1), some consumer groups are migrated successfully, while others get a huge negative lag on some partitions. This results in a lot of log lines like
Reader-18: ignoring already consumed offset <message_offset> for <topic>-<partition>
At first I didn't think this was a big problem; I figured the consumer would eventually catch up. After some investigation, though, it clearly is a problem. I produced a new message on the source cluster, checked which partition and offset that message landed on in the target cluster, and noticed that the migrated consumer ignored the new message and logged
Reader-18: ignoring already consumed offset <message_offset> for <topic>-<partition>
After that I found https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/main/src/main/java/com/google/cloud/teleport/kafka/connector/KafkaUnboundedReader.java#L202
So for some reason my consumer thinks its offset is much lower than it should be - on some partitions, not all. Any ideas on what can be wrong?
I should also mention that the offset difference between partitions can be huge, almost an order of magnitude.
P.S. When migrating, I noticed that I'm unable to update a job. I have to kill the job and start a new one.
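The negative lag and the skipped records described in the question can be illustrated with a small sketch. It assumes that lag is the log end offset minus the group's committed offset, and that the reader drops every record whose offset is below the one it expects next, as the log line suggests. The function names and all numbers are hypothetical and are not taken from the asker's clusters.

```python
from typing import Dict, List


def partition_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Lag per partition: log end offset minus the group's committed offset."""
    return {p: end_offsets[p] - committed[p] for p in committed if p in end_offsets}


def skipped_offsets(record_offsets: List[int], next_expected: int) -> List[int]:
    """Offsets a reader would skip if it drops every record below the
    offset it expects next (assumed behaviour, inferred from the log line)."""
    return [o for o in record_offsets if o < next_expected]


# Hypothetical numbers: partition 1 has a committed offset far beyond
# what the target cluster has actually written.
end_offsets = {0: 1_000, 1: 1_200}
committed = {0: 990, 1: 9_500}

lag = partition_lag(end_offsets, committed)
print(lag)  # {0: 10, 1: -8300}

# Every newly replicated record on partition 1 is below the expected offset.
print(skipped_offsets([1_198, 1_199, 1_200], next_expected=committed[1]))
```

If a partition shows this pattern, comparing the committed offset on the target cluster with the target's actual log end offset for that partition is a reasonable first check.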

Related

Running two instances of MirrorMaker 2.0 halting data replication for newer topics

We tried the scenarios below using MirrorMaker 2.0 and want to know whether the outcome of the second scenario is expected.
Scenario 1.) We ran a single MirrorMaker 2.0 instance using the properties and start command below.
clusters=a,b
tasks.max=10
a.bootstrap.servers=kf-test-cluster-a:9092
a.config.storage.replication.factor=1
a.offset.storage.replication.factor=1
a.security.protocol=PLAINTEXT
a.status.storage.replication.factor=1
b.bootstrap.servers=kf-test-cluster-b:9092
b.config.storage.replication.factor=1
b.offset.storage.replication.factor=1
b.security.protocol=PLAINTEXT
b.status.storage.replication.factor=1
a->b.checkpoints.topic.replication.factor=1
a->b.emit.checkpoints.enabled=true
a->b.emit.heartbeats.enabled=true
a->b.enabled=true
a->b.groups=group1|group2|group3
a->b.heartbeats.topic.replication.factor=1
a->b.offset-syncs.topic.replication.factor=1
a->b.refresh.groups.interval.seconds=30
a->b.refresh.topics.interval.seconds=10
a->b.replication.factor=2
a->b.sync.topic.acls.enabled=false
a->b.topics=.*
Start command: /usr/bin/connect-mirror-maker.sh connect-mirror-maker.properties &
Verification: Created a new topic "test" on the source cluster (a), produced data to it, and ran a consumer on the target cluster (b) against topic "a.test" to verify data replication.
Observation: Worked fine as expected.
Scenario 2.) We ran a second instance of MirrorMaker 2.0 using the same properties as above.
Start command: /usr/bin/connect-mirror-maker.sh connect-mirror-maker.properties &
Verification: Created another topic, "test2", on the source cluster, produced data to it, and ran a consumer on the target cluster (b) against topic "a.test2" to verify data replication.
Observation: MM2 replicated the topic to the target cluster, and a.test2 was present on target cluster b, but the consumer didn't get any records to consume.
In the logs of the newer MirrorMaker 2.0 instance, the MirrorSourceConnector task did not restart after topic replication, whereas it did restart in the single-instance scenario.
NOTE: There were no error logs seen.
I observed the same behavior. Your messages are most likely replicated; you can verify this by checking your consumer group offset. The problem is most likely that your lag is 0, meaning your consumer assumes all previous messages have already been consumed. You can reset the offset or read from the beginning.
Ideally, the checkpoint heartbeat should contain the latest offset, but I currently find it to be empty, even though checkpoint heartbeat replication should be automatic starting with Kafka 2.7.
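One way to see why a consumer with zero lag receives nothing is to model where a consumer starts reading. This is a minimal sketch, assuming the usual rule that a committed offset wins and that `auto.offset.reset` (default `latest` in the Java consumer) applies only when the group has no committed offset. The function name and the sample offsets are hypothetical.

```python
from typing import Optional


def starting_offset(committed: Optional[int], beginning: int, end: int,
                    auto_offset_reset: str = "latest") -> int:
    """Where a consumer starts reading a partition.

    A committed offset wins; otherwise auto.offset.reset decides.
    Only "earliest" and "latest" are modelled in this sketch.
    """
    if committed is not None:
        return committed
    if auto_offset_reset == "earliest":
        return beginning
    if auto_offset_reset == "latest":
        return end
    raise ValueError(f"unsupported auto.offset.reset: {auto_offset_reset}")


# Hypothetical partition of a.test2 holding 5 replicated records (offsets 0..4).
beginning, end = 0, 5

# A fresh group with the default policy starts at the end and sees none of them.
print(starting_offset(None, beginning, end, "latest"))    # 5
# Reading from the beginning exposes the replicated records.
print(starting_offset(None, beginning, end, "earliest"))  # 0
```

In this model, "reset the offset or read from the beginning" corresponds to either changing the committed value or starting a fresh group with `earliest`.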

Confluent.Kafka.KafkaException: Broker: Specified group generation id is not valid

Environment
3-node Kafka Cluster
Amazon MSK
v2.3
1 topic
6 partitions
1 consumer group with 2 consumers
Running in Kubernetes
Confluent .NET SDK 1.2.2
Except for bootstrap.servers and group.id, all of the default settings.
Problem
First, one of my consumers encounters the following exception.
Confluent.Kafka.KafkaException: Broker: Specified group generation id is not valid
at Confluent.Kafka.Impl.SafeKafkaHandle.Commit(IEnumerable`1 offsets)
at Confluent.Kafka.Consumer`2.Commit(IEnumerable`1 offsets)
The exception is trapped and the consumer is supposed to retry, but instead the app sits idle. The container is still up and running, but not consuming any more messages.
What's weirder is that the broker never reassigns that consumer's partitions so the consumer lag on those partitions begins to grow. It seems like the consumer is both alive (since the broker is not reassigning its partitions) and dead (since it cannot commit its offset or consume more messages). If we intervene and manually restart the consumers then the partitions are reassigned and the situation goes back to normal.
I'm not entirely sure what to make of the exception above. Google doesn't offer much. The most relevant lead I have is this issue in GitHub, which involves a broker restarting. To my knowledge, that is not happening in my situation. Any assistance would be greatly appreciated.
At least I have found a solution that works for me.
In my code I was committing manually, with EnableAutoCommit = false.
Somehow a commit could be executed twice for the same offset. I removed the manual commits from the consumer and set EnableAutoCommit = true.
After that it worked.
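As background on the error message itself (this is not the fix described above): in Kafka's group protocol every completed rebalance starts a new group generation, and the coordinator rejects offset commits that carry an older generation id. The class below is a toy model of that check, with hypothetical names; it is not broker or client code.

```python
class ToyGroupCoordinator:
    """Toy model of a group coordinator's generation check (not real broker code)."""

    def __init__(self) -> None:
        self.generation = 1
        self.offsets = {}

    def rebalance(self) -> int:
        """Every completed rebalance bumps the generation id."""
        self.generation += 1
        return self.generation

    def commit(self, member_generation: int, partition: int, offset: int) -> None:
        """Accept a commit only from a member of the current generation."""
        if member_generation != self.generation:
            raise RuntimeError("Specified group generation id is not valid")
        self.offsets[partition] = offset


coordinator = ToyGroupCoordinator()
my_generation = coordinator.generation
coordinator.commit(my_generation, partition=0, offset=42)  # accepted

coordinator.rebalance()  # the group moved on without this member noticing
try:
    coordinator.commit(my_generation, partition=0, offset=43)
    rejected = False
except RuntimeError:
    rejected = True  # a real consumer would need to rejoin the group here
```

In this model, a consumer that catches the exception and simply retries the same commit will keep failing until it rejoins the group and picks up the new generation.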

What happens to consumer groups in Kafka if the entire cluster goes down?

We have a consumer service that is always trying to read data from a topic using a consumer group. Due to redeployments, our Kafka cluster is periodically brought down and recreated.
Whenever the cluster comes back, we observe that although the previous topics are picked up (probably from ZooKeeper), the previous consumer groups are not recreated. Because of this, our running consumer process, which was created with a previous consumer group, gets stuck and never recovers.
Is this how the behavior of the consumer groups should be or is there a configuration we need to enable somewhere?
Any help is greatly appreciated.
Kafka brokers keep a cache of healthy consumers and consumer groups. If the entire cluster is destroyed and recreated, it no longer has knowledge of those consumers and groups, including their offsets. The consumers will have to reconnect and re-establish the group and its offsets from the beginning of the topic.
Operationally it makes more sense to keep the Kafka cluster running long-term, and do version upgrades in a rolling fashion so you don't interrupt the service.

Fixing under replicated partitions in kafka

In our production environment, we often see partitions become under-replicated while messages are being consumed from the topics. We are using Kafka 0.11. From the documentation, what I understand is:
Configuration parameter replica.lag.max.messages was removed. Partition leaders will no longer consider the number of lagging messages when deciding which replicas are in sync.
Configuration parameter replica.lag.time.max.ms now refers not just to the time passed since last fetch request from the replica, but also to time since the replica last caught up. Replicas that are still fetching messages from leaders but did not catch up to the latest messages in replica.lag.time.max.ms will be considered out of sync.
How do we fix this issue? What are the different reasons for replicas to go out of sync? In our scenario, all the Kafka brokers are in a single rack of blade servers, and all use the same network with 10 Gbps Ethernet (simplex). I do not see any reason for the replicas to go out of sync due to the network.
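The documentation excerpt quoted above can be read as a simple time-based check. This is a minimal sketch, assuming the leader tracks, per follower, the last time that follower was fully caught up; the function name and the sample values are hypothetical.

```python
def is_in_sync(now_ms: int, last_caught_up_ms: int, replica_lag_time_max_ms: int) -> bool:
    """A follower stays in sync only if it fully caught up to the leader's
    log end within the last replica_lag_time_max_ms milliseconds."""
    return now_ms - last_caught_up_ms <= replica_lag_time_max_ms


# Hypothetical setting for replica.lag.time.max.ms.
max_lag_ms = 10_000

# Follower A caught up 2 seconds ago: still in sync.
print(is_in_sync(now_ms=60_000, last_caught_up_ms=58_000, replica_lag_time_max_ms=max_lag_ms))  # True

# Follower B is still fetching, but last reached the log end 15 seconds ago:
# it is out of sync regardless of how many messages it is behind.
print(is_in_sync(now_ms=60_000, last_caught_up_ms=45_000, replica_lag_time_max_ms=max_lag_ms))  # False
```

Under this reading, a follower on a healthy network can still drop out of the ISR if it fetches too slowly to reach the leader's log end, for example because of slow disks, long GC pauses, or a burst of produce traffic.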
We faced the same issue. The solution was:
Restart the ZooKeeper leader.
Restart the broker or brokers that are not replicating some of the partitions.
No data loss.
The issue is due to a faulty state in ZooKeeper; there was an open ZooKeeper issue for this, but I don't remember the number.
I faced the same issue on Kafka 2.0.
After restarting the Kafka controller node, all replicas caught up.
But I am still looking for the reason why a few partitions are under-replicated while the other partitions of the same topic on the same nodes work fine; I see this issue on random partitions.
Do NOT run reassignment for all topics together; consider running it in small portions.
Find the topic that has under-replicated partitions and where the reassignment process can't be completed.
Set unclean.leader.election.enable to true for this topic.
Find the under-replicated partition that is stuck for this topic. Check its leader ID.
Stop the broker (just the service, not the instance).
Execute Preferred Replica Election (in yahoo/kafka-manager or manually).
Start the broker back.
Repeat for the rest of the topics that have the same problem.
Also I tried this advice, it didn't help me: https://stackoverflow.com/a/51063607/1929406

Kafka 0.10.0.1 partition reassignment after broker failure

I'm testing kafka's partition reassignment as a precursor to launching a production system. I have several topics with 9 partitions each and a replication factor of 3. I've killed one of the brokers to simulate a failure condition and verified that some topics became under replicated (verification done via a fork of yahoo's kafka manager modified to allow adding a version 0.10.0.1 cluster).
I then started a new broker with a different id. I would now like to distribute partitions to this new broker. I attempted to use kafka manager's reassign partitions functionality however that did not work (possibly due to an improperly modified fork).
I saw that Kafka comes with a bin/kafka-reassign-partitions.sh script, but the docs say that I have to manually write out the partition reassignments for each topic in JSON format. Is there a way to handle this without manually deciding which brokers the partitions must go to?
Hmm what a coincidence that I was doing exactly the same thing today. I don't have an answer you're probably going to like but I achieved what I wanted in the end.
Ultimately, what I did was execute the kafka-reassign-partitions command with the reassignment that the same tool proposed. But in whatever it generated, I just replaced the new broker id with the old failed broker id. For some reason the generated JSON moved everything around.
This will fail (or rather never complete) because the old broker has passed on. I then had to delete the reassignment operation in zookeeper (znode: admin/reassign_partitions or something).
Then I restarted kafka on the new broker and it magically picked up as leader of the partition that was looking for a new replacement leader.
I'll let you know if everything is still working tomorrow and if I still have a job ;-)
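For the original question (avoiding a hand-written plan), one option is to build the reassignment JSON with a short script. This is a minimal sketch of a different, simpler approach than the one described above: it takes the current assignment in the JSON shape that kafka-reassign-partitions.sh works with and swaps the failed broker id for the new one, leaving every other replica in place. The function name, topic name, and broker ids are hypothetical.

```python
import json
from typing import Dict, List


def replace_broker(assignment: Dict, failed_id: int, new_id: int) -> Dict:
    """Return a reassignment plan where failed_id is swapped for new_id
    in every replica list; partitions not touching failed_id are omitted."""
    partitions: List[Dict] = []
    for entry in assignment["partitions"]:
        replicas = entry["replicas"]
        if failed_id not in replicas:
            continue  # nothing to move for this partition
        if new_id in replicas:
            raise ValueError(
                f"broker {new_id} already hosts {entry['topic']}-{entry['partition']}"
            )
        partitions.append({
            "topic": entry["topic"],
            "partition": entry["partition"],
            "replicas": [new_id if b == failed_id else b for b in replicas],
        })
    return {"version": 1, "partitions": partitions}


# Hypothetical current assignment: broker 3 has failed, broker 5 replaces it.
current = {
    "version": 1,
    "partitions": [
        {"topic": "events", "partition": 0, "replicas": [1, 2, 3]},
        {"topic": "events", "partition": 1, "replicas": [2, 4, 1]},
    ],
}

plan = replace_broker(current, failed_id=3, new_id=5)
print(json.dumps(plan))
```

The printed JSON could then be saved to a file and passed to the tool with --reassignment-json-file and --execute. Because only the failed broker's slots change, this avoids the "moved everything around" effect of a freshly generated plan.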