Apache Kafka: kafka.common.OffsetsOutOfOrderException when reassigning __consumer_offsets to new brokers

We upgraded Kafka from 2.2 to 2.6 and added 4 new brokers to the existing 4 brokers. This went fine.
After that we started to reassign topic data to the new brokers. Most topics went fine, but on one of the 50 partitions of __consumer_offsets the reassignment hangs. 49 of the partitions were successfully moved from the old brokers (ids 3, 4, 5, 6) to the new ones (ids 10, 11, 12, 13).
But on __consumer_offsets-18 we consistently get this error (in the server.log of the new brokers):
[2020-10-24 15:04:54,528] ERROR [ReplicaFetcher replicaId=10, leaderId=3, fetcherId=0] Unexpected error occurred while processing data for partition __consumer_offsets-18 at offset 1545264631 (kafka.server.ReplicaFetcherThread)
kafka.common.OffsetsOutOfOrderException: Out of order offsets found in append to __consumer_offsets-18: ArrayBuffer(1545264631, 1545264632,
... thousands of other ids
1545272005, 1545272006, 1545272007)
at kafka.log.Log.$anonfun$append$2(Log.scala:1126)
at kafka.log.Log.append(Log.scala:2340)
at kafka.log.Log.appendAsFollower(Log.scala:1036)
at kafka.cluster.Partition.doAppendRecordsToFollowerOrFutureReplica(Partition.scala:939)
at kafka.cluster.Partition.appendRecordsToFollowerOrFutureReplica(Partition.scala:946)
at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:168)
at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:332)
at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6(AbstractFetcherThread.scala:320)
at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$6$adapted(AbstractFetcherThread.scala:319)
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:553)
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:551)
at scala.collection.AbstractIterable.foreach(Iterable.scala:920)
at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:319)
at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:135)
at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:134)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:117)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
[2020-10-24 15:04:54,534] INFO [ReplicaFetcher replicaId=10, leaderId=4, fetcherId=0] Truncating partition __consumer_offsets-31 to local high watermark 0 (kafka.server.ReplicaFetcherThread)
[2020-10-24 15:04:54,547] WARN [ReplicaFetcher replicaId=10, leaderId=3, fetcherId=0] Partition __consumer_offsets-18 marked as failed (kafka.server.ReplicaFetcherThread)
Any idea what is wrong here? The whole cluster seems to process data nicely; it's just that we can't move over this particular partition. We tried various things (cancelling the reassignment, restarting it with all partitions, restarting it with just partition 18, restarting all brokers), to no avail.
Help is very much appreciated; this is happening only in PROD, after it worked successfully on all test environments.
EDIT: we found that there is actually a discrepancy in the huge list of message offsets in the exception. The relevant part of the list is
... 1545271418, 1545271419, 1, 1, 1545271422, 1545271423,
Obviously the two '1' entries there look really wrong! They should be 1545271420/1545271421. Could it be that the leader really has some kind of data corruption?
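Since the cluster is on Kafka 2.6, the "restarting it with just partition 18" step can also be expressed programmatically through the AdminClient API (available since Kafka 2.4). The following is only a minimal sketch of that idea; the bootstrap address and the target replica list are assumptions, not the real assignment:
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.common.TopicPartition;

import java.util.Arrays;
import java.util.Collections;
import java.util.Optional;
import java.util.Properties;

public class ReassignOffsetsPartition18 {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            TopicPartition tp = new TopicPartition("__consumer_offsets", 18);
            // Assumed target replica list on the new brokers; adjust to the real replication factor.
            NewPartitionReassignment target = new NewPartitionReassignment(Arrays.asList(10, 11, 12));
            admin.alterPartitionReassignments(Collections.singletonMap(tp, Optional.of(target)))
                 .all().get();
            // Passing Optional.empty() instead would cancel an in-flight reassignment for this partition.
        }
    }
}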

We were ultimately able to solve this issue, mainly by sitting and waiting.
The issue was indeed that at some point, somehow, the data on the leader of this __consumer_offsets-18 partition got corrupted. This probably happened during the upgrade from Kafka 2.2 to 2.6. We were doing this in a rather dangerous way, as we know now: we simply stopped all brokers, updated the software on all of them, and restarted them. We thought that since we could afford this outage on the weekend, it would be a safe approach. But we will certainly never do that again, at least not unless we know 100% that all producers and all consumers are really stopped. This was NOT the case during that upgrade: we overlooked one consumer and left it running, and that consumer (group) was storing its offsets in the __consumer_offsets-18 partition. So that action, taking all brokers down and upgrading them while consumers/producers were still running, probably caused the corruption.
Lesson learnt: never do that again; always do rolling upgrades, even if they take a lot longer.
The issue was then actually solved through the nature of the compacted topic. The default settings for compaction are to run it every week (or when a segment gets bigger than 1 GB, which will not happen in our case). Compaction of that __consumer_offsets-18 partition kicked in yesterday evening, and by that the 2 corrupted offsets were purged away. After that the reassignment of the partition to the new brokers worked like a charm.
We could certainly have sped up this recovery by setting the topic parameters so that compaction would kick in earlier. But it was only on Wednesday that we understood the problem could actually be resolved that way, so we decided to leave everything as it was and wait another day.
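For anyone in a similar situation who does not want to wait for the weekly compaction cycle: topic-level overrides such as segment.ms and min.cleanable.dirty.ratio can be lowered temporarily on __consumer_offsets so the log cleaner picks the partition up sooner. This is only a minimal sketch using the AdminClient API; the bootstrap address and override values are examples, and the overrides should be removed again afterwards:
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class SpeedUpOffsetsCompaction {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "__consumer_offsets");

            // Roll segments after one hour and let the cleaner run even for a tiny amount
            // of dirty data (compaction only touches closed segments). Example values only.
            Collection<AlterConfigOp> ops = Arrays.asList(
                    new AlterConfigOp(new ConfigEntry("segment.ms", "3600000"), AlterConfigOp.OpType.SET),
                    new AlterConfigOp(new ConfigEntry("min.cleanable.dirty.ratio", "0.01"), AlterConfigOp.OpType.SET));
            Map<ConfigResource, Collection<AlterConfigOp>> updates = Collections.singletonMap(topic, ops);

            admin.incrementalAlterConfigs(updates).all().get();
            // Afterwards, use AlterConfigOp.OpType.DELETE on the same keys to fall back to the defaults.
        }
    }
}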

Related

Kafka Streams shutdown after IllegalStateException: No current assignment for partition

I have a Kafka Streams application that launches and runs successfully. We have 4 instances of the application running. Occasionally one of our instances of the application is legitimately killed, which causes several rounds of rebalancing until the old node is replaced.
Sometimes during the rebalance, one or more previously healthy nodes fail. The logs indicate that the Streams application transitions into a PENDING_SHUTDOWN state directly after receiving the following exception:
java.lang.IllegalStateException: No current assignment for partition public.chat.message-28
at org.apache.kafka.clients.consumer.internals.SubscriptionState.assignedState(SubscriptionState.java:256)
at org.apache.kafka.clients.consumer.internals.SubscriptionState.resetFailed(SubscriptionState.java:418)
at org.apache.kafka.clients.consumer.internals.Fetcher$2.onFailure(Fetcher.java:621)
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
at org.apache.kafka.clients.consumer.internals.RequestFutureAdapter.onFailure(RequestFutureAdapter.java:30)
at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onFailure(RequestFuture.java:209)
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireFailure(RequestFuture.java:177)
at org.apache.kafka.clients.consumer.internals.RequestFuture.raise(RequestFuture.java:147)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:571)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:389)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:297)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
at org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:292)
at org.apache.kafka.clients.consumer.internals.Fetcher.getAllTopicMetadata(Fetcher.java:275)
at org.apache.kafka.clients.consumer.KafkaConsumer.listTopics(KafkaConsumer.java:1849)
at org.apache.kafka.clients.consumer.KafkaConsumer.listTopics(KafkaConsumer.java:1827)
at org.apache.kafka.streams.processor.internals.StoreChangelogReader.refreshChangelogInfo(StoreChangelogReader.java:259)
at org.apache.kafka.streams.processor.internals.StoreChangelogReader.initialize(StoreChangelogReader.java:133)
at org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:79)
at org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:328)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:866)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:804)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:773)
Prior to this error we often also seem to receive some informational logs reporting a disconnect exception:
Error sending fetch request (sessionId=568252460, epoch=7) to node 4: org.apache.kafka.common.errors.DisconnectException
I have a feeling the two are related but I'm unable to reason why at present.
Is anyone able to give me some hints as to what may be causing this issue and any possible solutions?
Additional Info:
Kafka 2.2.1
32 partitions spread evenly across the 4 worker nodes
StreamsConfig settings:
kafkaStreamProps.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 2);
kafkaStreamProps.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
kafkaStreamProps.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);
kafkaStreamProps.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 120000);
kafkaStreamProps.put(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE);
This looks like it could be related to https://issues.apache.org/jira/browse/KAFKA-9073, which has been fixed in Kafka Streams 2.3.2.
If you can't wait for that release, you could try creating a private build using the changeset from this pull request: https://github.com/apache/kafka/pull/7630/files

How to solve a problem with checkpointed invalid __consumer_offsets and producer epoch on partitions of __transaction_state

I have two kinds of log entries in server.log
First kind:
WARN Resetting first dirty offset of __consumer_offsets-6 to log start offset 918 since the checkpointed offset 903 is invalid. (kafka.log.LogCleanerManager$)
Second kind:
INFO [TransactionCoordinator id=3] Initialized transactionalId Source: AppService Kafka consumer -> Not empty string filter -> CDMEvent mapper -> (NonNull CDMEvent filter -> Map -> Sink: Kafka CDMEvent producer, Nullable CDMEvent filter -> Map -> Sink: Kafka Error producer)-bddeaa8b805c6e008c42fc621339b1b9-2 with producerId 78004 and producer epoch 23122 on partition __transaction_state-45 (kafka.coordinator.transaction.TransactionCoordinator)
I have found a suggestion that removing the checkpoint file might help:
https://medium.com/@anishekagarwal/kafka-log-cleaner-issues-80a05e253b8a
"What we gathered was to:
stop the broker
remove the log cleaner checkpoint file (cleaner-offset-checkpoint)
start the broker
that solved the problem for us."
Is it safe to try that with all checkpoint files (cleaner-offset-checkpoint, log-start-offset-checkpoint, recovery-point-offset-checkpoint, replication-offset-checkpoint), or is it not advisable for any of them?
I have stopped each broker, moved cleaner-offset-checkpoint to a backup location, and started the broker without that file. The brokers started cleanly, deleted a lot of excess segments, and no longer log:
WARN Resetting first dirty offset of __consumer_offsets to log start offset since the checkpointed offset is invalid
so obviously this issue/defect https://issues.apache.org/jira/browse/KAFKA-6266 is not solved yet, even in 2.0.2. However, that didn't compact the consumer offsets according to expectations: offsets.retention.minutes defaults to 10080 (7 days), and I tried to set it explicitly to 5040, but it didn't help. There are still messages more than one month old; since log.cleaner.enable is true by default, they should be compacted, but they are not. The only remaining option is to set the cleanup.policy back to delete for the __consumer_offsets topic, but that is the action that triggered the problem in the first place, so I am a bit reluctant to do that. The problem that I described here, No Kafka Consumer Group listed by kafka-consumer-groups.sh, is also not resolved by this; obviously something prevents kafka-consumer-groups.sh from reading the __consumer_offsets topic (when issued with the --bootstrap-server option; otherwise it reads it from ZooKeeper) and displaying results. That is something Kafka Tool does without problems, and I believe these two problems are connected.
The reason I think that topic is not compacted is that it has messages with exactly the same key (and even the same timestamp) that are older than they should be according to the broker settings. Kafka Tool also ignores certain records and doesn't interpret them as consumer groups in that display. Why kafka-consumer-groups.sh ignores all of them is probably due to some corruption of these records.
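To take some of the guesswork out of which settings actually apply to __consumer_offsets (broker defaults versus topic-level overrides), the effective topic configuration can be read back with the AdminClient. A minimal sketch, assuming a hypothetical bootstrap address:
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

public class DescribeOffsetsTopicConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "__consumer_offsets");
            Config config = admin.describeConfigs(Collections.singleton(topic)).all().get().get(topic);

            // Settings that determine whether and when the offsets topic gets compacted.
            List<String> interesting = Arrays.asList("cleanup.policy", "segment.bytes", "min.cleanable.dirty.ratio");
            config.entries().stream()
                  .filter(e -> interesting.contains(e.name()))
                  .forEach(e -> System.out.println(e.name() + " = " + e.value()));
        }
    }
}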

Kafka UNKNOWN_PRODUCER_ID exception

I sometimes see an UNKNOWN_PRODUCER_ID exception when using Kafka Streams.
2018-06-25 10:31:38.329 WARN 1 --- [-1-1_0-producer] o.a.k.clients.producer.internals.Sender : [Producer clientId=default-groupz-7bd94946-3bc0-4400-8e73-7126b9b9c0d4-StreamThread-1-1_0-producer, transactionalId=default-groupz-1_0] Got error produce response with correlation id 1996 on topic-partition default-groupz-mplat-five-minute-stat-urlCount-counts-store-changelog-0, retrying (2147483646 attempts left). Error: UNKNOWN_PRODUCER_ID
The official documentation says:
This exception is raised by the broker if it could not locate the producer metadata associated with the producerId in question. This could happen if, for instance, the producer's records were deleted because their retention time had elapsed. Once the last records of the producerId are removed, the producer's metadata is removed from the broker, and future appends by the producer will return this exception.
It says one possibility is that a producer has been idle for longer than the retention time (by default a week), so the producer's metadata will be removed from the broker. Are there any other reasons why brokers could not locate producer metadata?
You might be experiencing https://issues.apache.org/jira/browse/KAFKA-7190. As it says in that ticket:
When a streams application has little traffic, then it is possible that consumer purging would delete even the last message sent by a producer (i.e., all the messages sent by this producer have been consumed and committed), and as a result, the broker would delete that producer's ID. The next time when this producer tries to send, it will get this UNKNOWN_PRODUCER_ID error code, but in this case, this error is retriable: the producer would just get a new producer id and retries, and then this time it will succeed.
This issue is also being tracked at https://cwiki.apache.org/confluence/display/KAFKA/KIP-360%3A+Improve+handling+of+unknown+producer
Two things might cause your producer's metadata to be deleted:
The log segments are deleted due to hitting the retention time.
The producer state might expire due to inactivity, which is controlled by the setting transactional.id.expiration.ms (defaults to 7 days).
So if your Kafka is < 2.4, you can work around this by increasing the retention time of your topic's log (e.g. to 30 days, assuming your system allows that) and increasing the transactional.id.expiration.ms setting (e.g. to 24 days) until KIP-360 is released:
log.retention.hours=720
transactional.id.expiration.ms=2073600000
This should ensure that, for low-traffic topics (messages written less often than every 7 days), your producer's metadata state remains stored in the broker's memory for a longer period, thus decreasing the risk of getting UnknownProducerIdException.
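Note that log.retention.hours and transactional.id.expiration.ms above are broker-level settings (server.properties). If only specific topics need to keep data longer, a per-topic retention.ms override can be used instead. This is only a minimal sketch with the AdminClient; the topic name and bootstrap address are placeholders, and it assumes brokers on 2.3+ where incrementalAlterConfigs is available (on older brokers the same override can be applied with the kafka-configs tool):
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class RaiseTopicRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            // Placeholder topic name; 2592000000 ms = 30 days.
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-low-traffic-topic");
            Collection<AlterConfigOp> ops = Collections.singleton(
                    new AlterConfigOp(new ConfigEntry("retention.ms", "2592000000"), AlterConfigOp.OpType.SET));
            Map<ConfigResource, Collection<AlterConfigOp>> updates = Collections.singletonMap(topic, ops);

            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}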

UnknownProducerIdException in Kafka streams when enabling exactly once

After enabling exactly-once processing on a Kafka Streams application, the following error appears in the logs:
ERROR o.a.k.s.p.internals.StreamTask - task [0_0] Failed to close producer due to the following error:
org.apache.kafka.streams.errors.StreamsException: task [0_0] Abort sending since an error caught with a previous record (key 222222 value some-value timestamp 1519200902670) to topic exactly-once-test-topic-v2 due to This exception is raised by the broker if it could not locate the producer metadata associated with the producerId in question. This could happen if, for instance, the producer's records were deleted because their retention time had elapsed. Once the last records of the producerId are removed, the producer's metadata is removed from the broker, and future appends by the producer will return this exception.
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:125)
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:48)
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:180)
at org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1199)
at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:204)
at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:187)
at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:627)
at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:596)
at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:557)
at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:481)
at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74)
at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:692)
at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101)
at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:482)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:474)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.errors.UnknownProducerIdException
We've reproduced the issue with a minimal test case where we move messages from a source stream to another stream without any transformation. The source stream contains millions of messages produced over several months. The KafkaStreams object is created with the following StreamsConfig:
Properties props = new Properties();
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, "exactly_once");
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "Some app id");
props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 1);
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 102400);
The app is able to process some messages before the exception occurs.
Context information:
We're running a 5-node Kafka 1.1.0 cluster with 5 ZooKeeper nodes.
There are multiple instances of the app running.
Has anyone seen this problem before or can give us any hints about what might be causing this behaviour?
Update
We created a new 1.1.0 cluster from scratch and started to process new messages without problems. However, when we imported old messages from the old cluster, we hit the same UnknownProducerIdException after a while.
Next we tried to set the cleanup.policy on the sink topic to compact while keeping the retention.ms at 3 years. Now the error did not occur. However, messages seem to have been lost. The source offset is 106 million and the sink offset is 100 million.
As explained in the comments, there currently seems to be a bug that may cause problems when replaying messages older than the (maximum configurable?) retention time.
At the time of writing this is unresolved; the latest status can always be seen here:
https://issues.apache.org/jira/browse/KAFKA-6817

Cannot create more than 15 topics in Kafka

My colleague and I were testing Kafka on a 3-node cluster and ran into this problem while trying to test the performance of sending messages to multiple topics. We can't create more than 15 topics. The first 15 topics work fine, but when we try to create the 16th topic (and any topic after that), a lot of errors start to appear on the 2 follower servers.
One has a lot of errors like this:
ERROR [ReplicaFetcherThread-0-1], Error for partition [__consumer_offsets,36] to broker 1:org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error when processing the request (kafka.server.ReplicaFetcherThread)
The other has errors like this:
[2017-06-16 18:44:07,146] ERROR [KafkaApi-1] Error when handling request {replica_id=2,max_wait_time=500,min_bytes=1,max_bytes=10485760,topics=[{topic=__consumer_offsets,partitions=[{partition=6,fetch_offset=5,max_bytes=1048576},{partition=36,fetch_offset=3,max_bytes=1048576},{partition=18,fetch_offset=0,max_bytes=1048576}]},{topic=multi-test-12,partitions=[{partition=1,fetch_offset=1,max_bytes=1048576}]},{topic=multi-test-11,partitions=[{partition=2,fetch_offset=1,max_bytes=1048576}]},{topic=__consumer_offsets,partitions=[{partition=0,fetch_offset=5,max_bytes=1048576},{partition=45,fetch_offset=0,max_bytes=1048576}]},{topic=multi-test-16,partitions=[{partition=0,fetch_offset=0,max_bytes=1048576}]},{topic=__consumer_offsets,partitions=[{partition=27,fetch_offset=0,max_bytes=1048576},{partition=12,fetch_offset=0,max_bytes=1048576},{partition=9,fetch_offset=0,max_bytes=1048576}]},{topic=multi-test-10,partitions=[{partition=0,fetch_offset=0,max_bytes=1048576}]},{topic=multi-test-9,partitions=[{partition=2,fetch_offset=0,max_bytes=1048576}]},{topic=__consumer_offsets,partitions=[{partition=39,fetch_offset=0,max_bytes=1048576}]},{topic=multi-test-4,partitions=[{partition=1,fetch_offset=1,max_bytes=1048576}]},{topic=__consumer_offsets,partitions=[{partition=21,fetch_offset=10,max_bytes=1048576}]},{topic=multi-test-3,partitions=[{partition=2,fetch_offset=0,max_bytes=1048576}]},{topic=multi-test-13,partitions=[{partition=0,fetch_offset=1,max_bytes=1048576}]},{topic=__consumer_offsets,partitions=[{partition=3,fetch_offset=10,max_bytes=1048576},{partition=48,fetch_offset=0,max_bytes=1048576}]},{topic=multi-test-8,partitions=[{partition=0,fetch_offset=0,max_bytes=1048576}]},{topic=__consumer_offsets,partitions=[{partition=33,fetch_offset=0,max_bytes=1048576},{partition=30,fetch_offset=15,max_bytes=1048576},{partition=15,fetch_offset=0,max_bytes=1048576}]},{topic=multi-test-1,partitions=[{partition=1,fetch_offset=1,max_bytes=1048576}]},{topic=multi-test-0,partitions=[{partition=2,fetch_offset=0,max_bytes=1048576}]},{topic=multi-test-2,partitions=[{partition=1,fetch_offset=1,max_bytes=1048576}]},{topic=__consumer_offsets,partitions=[{partition=42,fetch_offset=3,max_bytes=1048576},{partition=24,fetch_offset=0,max_bytes=1048576}]}]} (kafka.server.KafkaApis)
kafka.common.NotAssignedReplicaException: Leader 1 failed to record follower 2's position -1 since the replica is not recognized to be one of the assigned replicas for partition multi-test-16-0.
at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:246)
at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:920)
at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:917)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:917)
at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:462)
at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:530)
at kafka.server.KafkaApis.handle(KafkaApis.scala:81)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:62)
at java.lang.Thread.run(Thread.java:748)
We assigned a replication factor of 2 and 3 partitions for each topic, and every topic is created in the same way. I deleted and recreated each topic manually just to make sure that 15-16 is exactly the point where everything goes wrong.
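For reference, each topic in this test was created with the same settings (3 partitions, replication factor 2). A minimal sketch of an equivalent AdminClient call, with a hypothetical bootstrap address and a topic name taken from the logs above:
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Properties;

public class CreateTestTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, replication factor 2, as described in the question.
            NewTopic topic = new NewTopic("multi-test-16", 3, (short) 2);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}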
Well, weird problems always seem to have weird answers.
Turns out the problem was that one of our nodes was using a 32-bit CPU; switching it to a 64-bit machine solved the problem.