Kafka offset topic compaction is not happening - apache-kafka

Kafka 0.11.0.0 has been running in production. We see that log compaction of the consumer offsets topic is not happening. In the consumer offset partitions, we see log segments remaining there for the last 3 months. Log cleaner logs showed that it failed building the map for compaction due to "CorruptRecordException".
Since there were a lot of segment files, each around 100 MB, in the partitions, instead of running DumpLogSegments to find the bad segment we decided to go ahead and delete the old segment files and keep only the ones from the last 3 days. After this, we restarted Kafka and it seemed to work fine.
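(For reference, a single segment can be inspected with the DumpLogSegments tool that ships with Kafka, roughly like the command below; the log path and file name are just placeholders for your environment.)
bin/kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration --files /path/to/__consumer_offsets-NN/00000000000000000000.log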
But within 2 days of doing this, we are seeing the logs getting built up again, just as before. We no longer see a CorruptRecordException in the logs, but the offsets are not getting compacted, and it's been 7 days since.
None of the default values for compaction or retention were changed. preallocate is also set to false. Can anybody give me any insight into what could be going on here?
Edit:
The CorruptRecordException that I was running into seems to originate from AbstractLegacyRecordBatch.java
// read the offset and size fields of the next legacy record from the buffer
long offset = offsetAndSizeBuffer.getLong(Records.OFFSET_OFFSET);
int size = offsetAndSizeBuffer.getInt(Records.SIZE_OFFSET);
// a size smaller than the v0 record overhead means the batch is truncated or corrupt
if (size < LegacyRecord.RECORD_OVERHEAD_V0)
    throw new CorruptRecordException(String.format("Record size is less than the minimum record overhead (%d)", LegacyRecord.RECORD_OVERHEAD_V0));
Any idea about when this can occur, and why the compaction is not happening even after the old segments are deleted?

Related

Kafka: Messages disappearing from topics, largestTime=0

We have messages disappearing from topics on Apache Kafka with versions 2.3, 2.4.0, 2.4.1 and 2.5.0. We noticed this when we make a rolling deployment of our clusters and unfortunately it doesn't happen every time, so it's very inconsistent.
Sometimes we lose all messages inside a topic, other times we lose all messages inside a partition. When this happens the following log is a constant:
[2020-04-27 10:36:40,386] INFO [Log partition=test-lost-messages-5, dir=/var/kafkadata/data01/data] Deleting segments List(LogSegment(baseOffset=6, size=728, lastModifiedTime=1587978859000, largestTime=0)) (kafka.log.Log)
There is also a previous log saying this segment hit the retention time breach of 24 hours. In this example, the message was produced ~12 minutes before the deployment.
Notice that all messages that are wrongly deleted have largestTime=0, while the ones that are properly deleted have a valid timestamp there. From what we read in the documentation and code, it looks like largestTime is used to calculate whether a given segment reached the time breach or not.
Since we can observe this in multiple versions of Kafka, we think this might be related to something external to Kafka, e.g. ZooKeeper.
Does anyone have any ideas of why this could be happening? We are using ZooKeeper 3.6.0.
We found out that the cause was not related to Kafka itself but to the volume where we stored the logs. Still, the following explanation might be useful for educational purposes:
In detail, it was a permission problem where Kafka was not able to read the .timeindex files when the log cleaner was triggered. This caused largestTime to be 0 and led to some messages being deleted way before the retention time.
Each topic partition is divided into several segments, which are stored in separate .log files that contain the actual messages. For each .log file there is a .timeindex file containing a mapping between offsets and timestamps.
When Kafka needs to check if a segment is deletable, it looks up the timestamp of the most recent offset and stores it as largestTime. Then it checks whether the retention limit was reached: currentTime - largestTime > retentionTime.
If so, it deletes the segment and the respective messages.
Since Kafka was not able to read the file, largestTime was 0 and the check currentTime > retentionTime was always true for our 1-day retention.
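A minimal sketch of that check (illustrative only, not the actual Kafka code; the names are made up):

// Illustrative only: why a segment with largestTime=0 always looks expired.
static boolean isDeletable(long largestTimeMs, long retentionMs) {
    long nowMs = System.currentTimeMillis();
    return nowMs - largestTimeMs > retentionMs;
}
// isDeletable(0L, 86_400_000L) is true for any realistic clock value,
// so a segment whose .timeindex could not be read is deleted immediately.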
Ensure date is synced between all Kafka brokers and ZooKeeper nodes.
Bash command: date.
Compare year, day, hour and minute.

Delete a specific record in a Kafka topic using compaction

I am trying to delete a specific message or record from a Kafka topic. I understand that Kafka was not built to do that. But is it possible to use topic compaction with the ability to replace a record with an empty record using a specific Kafka key? How can this be done?
Thank you
Yes, you could get rid of a particular message if you have a compacted topic.
In that case your message key becomes the identifier. If you then want to delete a particular message, you need to send a message with the same key and an empty value to the topic. This is called a tombstone message. Kafka will keep this tombstone around for a configurable amount of time (so your consumers can deal with the deletion). After this set amount of time, the cleaner thread will remove the tombstone message, and the key will be gone from the partition in Kafka.
In general, please note that the old (to-be-deleted) message will not disappear immediately. Depending on the configuration, it can take some time before the individual message is actually replaced.
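A minimal sketch of producing such a tombstone with the Java producer (broker address, topic name and key are placeholders):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TombstoneExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A null value is the tombstone; the key must match the record you want deleted.
            producer.send(new ProducerRecord<>("my-compacted-topic", "user-42", null));
            producer.flush();
        }
    }
}

Once compaction has run (and delete.retention.ms has passed), both the old value and the tombstone for "user-42" are gone from the partition.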
I found this summary of the configurations quite helpful (link to blog):
1) To activate compaction, the cleanup policy cleanup.policy=compact should be set
2) The consumer sees all tombstones as long as the consumer reaches the head of the log in a period less than the topic config delete.retention.ms (the default is 24 hours).
3) The number of these cleaner threads is configurable through the log.cleaner.threads config
4) The cleaner thread then chooses the log with the highest dirty ratio.
dirty ratio = number of bytes in the head / total number of bytes in the log (tail + head)
5) Topic config min.compaction.lag.ms gets used to guarantee a minimum period that must pass before a message can be compacted.
6) To delay the start of compaction of records after they are written, set the broker config log.cleaner.min.compaction.lag.ms (the topic-level equivalent is min.compaction.lag.ms, as in 5). Records won't get compacted until after this period. The setting gives consumers time to get every record.
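As an illustration of how these settings fit together, a compacted topic could be created roughly like this with the Java admin client (topic name, partition/replica counts and the config values are placeholders):

import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CreateCompactedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker

        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic topic = new NewTopic("my-compacted-topic", 3, (short) 2)
                .configs(Map.of(
                    TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT, // 1)
                    TopicConfig.DELETE_RETENTION_MS_CONFIG, "86400000",  // 2) keep tombstones 24h
                    TopicConfig.MIN_COMPACTION_LAG_MS_CONFIG, "60000")); // 5)/6) 1 min before compaction
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}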
The log compaction is introduced as
Log compaction ensures that Kafka will always retain at least the last known value for each message key within the log of data for a single topic partition.
Its guarantees are listed here:
Log compaction is handled by the log cleaner, a pool of background threads that recopy log segment files, removing records whose key appears in the head of the log. Each compactor thread works as follows:
1) It chooses the log that has the highest ratio of log head to log tail
2) It creates a succinct summary of the last offset for each key in the head of the log
3) It recopies the log from beginning to end, removing keys which have a later occurrence in the log. New, clean segments are swapped into the log immediately, so the additional disk space required is just one additional log segment (not a full copy of the log).
4) The summary of the log head is essentially just a space-compact hash table. It uses exactly 24 bytes per entry. As a result, with 8GB of cleaner buffer one cleaner iteration can clean around 366GB of log head (assuming 1k messages).
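As a rough sanity check on that last figure: 8 GB of buffer at 24 bytes per entry is about 8 × 2^30 / 24 ≈ 358 million entries, and at roughly 1 KB per message that covers about 358 million × 1024 bytes ≈ 366 GB of log head.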

kafka partition has lots of log segments

One topic has 20 partitions, and almost every one of them has more than 20,000 log segment files, most of them created months ago. Even after I set retention.ms to a very short value, the segments are not deleted, while other topics recycle normally.
I am wondering what the issue is and how to solve it, because I'm worried that the total number of segments will keep increasing until it is larger than the OS vm.max_map_count, which will damage the Kafka process itself. The following image shows the describe output for the abnormal topic.
Not sure what the issue is exactly, but some things to consider:
Broker vs topic-specific configs. Check to make sure your topic actually has the configs you think it has, and is not inheriting them from the broker settings (see the sketch at the end of this answer).
Configs related to retention. As mentioned by Giorgos Myrianthous, you can look at log.retention.check.interval.ms and log.cleanup.policy. I would also look at the roll-related settings, like log.roll.hours. I believe that in some cases, Kafka will not delete a segment until its partition rolls, even if the segment is old. And rolling follows the following behavior:
The log rolling time is no longer depending on log segment create time. Instead it is now based on the timestamp in the messages. More specifically. if the timestamp of the first message in the segment is T, the log will be rolled out when a new message has a timestamp greater than or equal to T + log.roll.ms (http://kafka.apache.org/20/documentation.html)
So make sure to consider the record timestamps, not just the segment files' age.
Finally:
What version of Kafka are you using?
Have you looked carefully at the broker logs? Broker logs are how I've solved all such problems that I've encountered.
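For the first point above, one way to see the effective topic configs and where each value comes from is describeConfigs; a rough sketch with the Java admin client (broker address and topic name are placeholders):

import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.common.config.ConfigResource;

public class ShowTopicConfigs {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker

        ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic"); // placeholder topic
        try (AdminClient admin = AdminClient.create(props)) {
            Config config = admin.describeConfigs(List.of(topic)).all().get().get(topic);
            // source() tells you whether a value is a topic override,
            // a broker setting, or the hard-coded default
            config.entries().forEach(entry ->
                System.out.printf("%s = %s (%s)%n", entry.name(), entry.value(), entry.source()));
        }
    }
}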

Cannot make all messages in a Kafka topic expired with retention

I often clean up all current messages in a Kafka topic by updating retention.ms to 10. That makes all messages expire after 10 ms. However, sometimes the messages cannot be cleaned up that way, and I had to drop and re-create the topic in order to clean up all messages.
I'm not sure whether it's related to the issue or not, but it often happens after all consumers of that topic have stopped working for some reason.
What could be the root cause for this?
The retention.ms field is a minimum time for the log cleaner. The log cleaner only runs once every so often (the Kafka docs state 300000 ms), and only on closed log segments (default size of 1GB), so you may have to wait for it to run or need more data in the topic.
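If waiting isn't enough, one workaround that follows from the "closed segments" point is to temporarily lower segment.ms together with retention.ms, so the active segment rolls and becomes eligible. A rough sketch with the Java admin client (broker address, topic name and values are placeholders; incrementalAlterConfigs needs Kafka 2.3+, otherwise kafka-configs.sh --alter does the same):

import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class ExpireTopicQuickly {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker

        ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "my-topic"); // placeholder topic
        try (AdminClient admin = AdminClient.create(props)) {
            // very low retention, plus a low segment.ms so the active segment
            // closes and becomes eligible for deletion
            Map<ConfigResource, Collection<AlterConfigOp>> updates = Map.of(topic, List.of(
                new AlterConfigOp(new ConfigEntry("retention.ms", "10"), AlterConfigOp.OpType.SET),
                new AlterConfigOp(new ConfigEntry("segment.ms", "60000"), AlterConfigOp.OpType.SET)));
            admin.incrementalAlterConfigs(updates).all().get();
            // remember to revert both settings once the topic is empty
        }
    }
}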

Kafka vm.max_map_count

We have a Kafka cluster for Kafka stream application.
After some hours our broker went down and we got an OutOfMemory exception.
We saw that vm.max_map_count is not enough and the number of memory maps of the process is above 40K.
Can someone explain what the problem could be, or what influences that parameter?
The number always increases and never goes down.
Based on the pull request at https://github.com/apache/kafka/pull/4358/files (both the change being proposed and the comments reacting to it), it appears that each log segment (i.e. file) in each partition on each topic on the broker consumes two maps.
I would expect the value to rise until you reach a steady-state where all topics have logs that are old enough to start being deleted due to the retention interval. At that point, each new file would be expected to occur at around the same time as an older one is deleted (assuming roughly constant message rates). I would expect the value to drop if topics were deleted or if you changed the configuration of an existing topic or the full broker (e.g. reduce the log retention time or cause the logs to roll over less frequently), and to go up if you change the configuration in the opposite direction.
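As a rough, back-of-the-envelope check under that two-maps-per-segment assumption: the ~40K maps reported above would correspond to roughly 20K live segment files on the broker, and with the common Linux default vm.max_map_count of 65530 the limit would be reached at around 32-33K segments, which is easy to hit when retention or segment-size settings let many small, old segments accumulate.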