Kafka Log Retention does not work for topic - apache-kafka

For a Kafka topic I set segment.ms and retention.ms to 86400000ms (1 day).
With this topic config, I assumed that Kafka would roll a log segment after 1 day and then delete it, because retention.ms is also set to one day.
However, nothing happened. The segments are not rolled and therefore not deleted. At least I see nothing in the server logs and the free space is shrinking continuously.
Mysteriously, if I set segment.ms and retention.ms to a smaller value, for example 1800000ms (30 minutes), everything is working perfectly.
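One possible reading of the one-day case is that nothing is broken and the first deletion is simply further away than expected. The sketch below is a back-of-the-envelope estimate in Python, not broker code. It assumes steady traffic (so a segment rolls soon after segment.ms elapses), that segment.bytes is not reached first, and the broker default log.retention.check.interval.ms of 300000 ms. The helper name worst_case_age_ms is made up for illustration.

```python
def worst_case_age_ms(segment_ms, retention_ms, check_interval_ms=300_000):
    """Rough upper bound on how old a record can get before its segment is deleted.

    Assumed model (a simplification, not Kafka's implementation):
      - the first record of a segment waits up to segment_ms for the roll,
      - the closed segment then waits retention_ms, counted from its newest record,
      - the periodic retention check adds up to check_interval_ms on top.
    """
    return segment_ms + retention_ms + check_interval_ms


DAY_MS = 86_400_000        # value used in the question
HALF_HOUR_MS = 1_800_000   # the smaller value that "worked"

print(worst_case_age_ms(DAY_MS, DAY_MS))              # one-day settings
print(worst_case_age_ms(HALF_HOUR_MS, HALF_HOUR_MS))  # 30-minute settings
```

Under these assumptions the oldest record can stay on disk for roughly two days with the one-day settings, versus about 65 minutes with the 30-minute settings, so disk usage would keep growing for a while before the first segment is removed.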

Related

Topic Retention Time not taking effect

My Topic Configs: min.insync.replicas=2,cleanup.policy=delete,segment.bytes=268435456,retention.ms=172800000,file.delete.delay.ms=60000,max.message.bytes=10485772,delete.retention.ms=86400000,segment.ms=1000
They are the same at the broker level, except that segment.bytes is 1GB there rather than 256MB as at the topic level, and log.roll.ms is 1000 at the broker level while log.roll.hours is 168.
The docs say that log.roll.ms takes precedence over log.roll.hours.
https://docs.confluent.io/platform/current/installation/configuration/broker-configs.html#log-roll-hours
PS: On the same cluster, MM2 is working.
I tried changing some parameters and confirmed that retention worked, but now it doesn't.
Only closed segments will be deleted. If a segment holds less than your configured segment.bytes (256MB) and has not been rolled by time (rolling every 1 second is very excessive, by the way), it stays open and its data remains available. Any segment that is due for deletion is additionally delayed by file.delete.delay.ms.
The problem was that I hadn't waited long enough... It works just as it did before. You can use the parameters above.
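The rule in this answer can be sketched as a small Python model. This is a simplified illustration of the rule as the answer states it, not the broker's actual logic, which handles more cases. The Segment class and the deletable_segments function are hypothetical names, and the timestamps are invented for the example. retention_ms uses the 172800000 ms value from the topic config above.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int
    largest_timestamp_ms: int  # timestamp of the newest record in the segment
    is_active: bool            # True for the segment still being written to


def deletable_segments(segments, now_ms, retention_ms):
    """Return closed segments whose newest record is older than retention_ms."""
    return [
        s for s in segments
        if not s.is_active and now_ms - s.largest_timestamp_ms > retention_ms
    ]


DAY_MS = 86_400_000
now = 10 * DAY_MS
segments = [
    Segment(0, now - 7 * DAY_MS, False),     # closed, newest record 7 days old
    Segment(1000, now - 1 * DAY_MS, False),  # closed, newest record 1 day old
    Segment(2000, now - 60_000, True),       # active segment, still receiving data
]

expired = deletable_segments(segments, now, retention_ms=172_800_000)
print([s.base_offset for s in expired])  # prints [0]
```

Only the first segment qualifies: the second is closed but not yet older than two days, and the third is still open.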

kafka topic with compact,delete does not remove old data

I have a Kafka changelog topic with the cleanup policy COMPACT_DELETE.
It has a retention.ms of 86520000, which is approximately 1 day.
However, I've observed that partitions in this topic have data that is over a month old.
Since DELETE is also part of the cleanup policy, I would expect no messages older than 1 day in any partition of this topic.
The major problem is that this topic grows constantly and never settles down, which is causing disk issues on the Kafka brokers.
I'd like to understand why retention.ms isn't kicking in for COMPACT_DELETE topics.

Undelete messages from kafka

I mixed up the retention.ms and delete.retention.ms properties in Kafka, with the result that some messages I'm interested in are now in a deleted state. The messages have not been removed from disk, because the delete.retention.ms value is large enough.
So I can see that the segment files are on disk, but as far as Kafka is concerned the earliest message is only 1 day old, even though the directory contains files that are 4 or 5 months old.
Is there a way to tell Kafka to move the "earliest offset" backwards and make those messages available again, so that they are not subject to a possible flush?

Kafka - Compact and Time Based Retention

I have tried creating a Kafka topic configuration that uses compaction and deletion, to achieve the following:
Within the retention period, retain the latest version of the key
After the retention period, remove any message whose timestamp is older than that period
For this, I have tried the following topic specific config:
cleanup.policy=[compact,delete]
retention.ms=864000000 (10 days)
min.compaction.lag.ms=3600000 (1 hour)
min.cleanable.dirty.ratio=0.1
segment.ms=3600000 (1 hour)
The broker configuration is as follows:
log.retention.hours=7 days
log.segment.bytes=1.1gb
log.cleanup.policy=delete
delete.retention.ms=1 day
When I set the retention to a smaller value in a test (e.g. 20 minutes or 1 hour), adjusting only retention.ms on the topic, I can see that the data is correctly pruned after the retention period.
I can see that the data is correctly being compacted as expected, but after the 10 day retention period if I read the topic from the beginning, data much older than 10 days is still there. Is this a problem with such a long retention period?
Am I missing any configuration here? I have checked the kafka logs and see the broker is rolling the segments and compacting as expected, but can't see anything about deletes?
The Kafka version is 5.1.2-1.
It might be that your topic and broker configurations set overlapping properties, in which case only the one with higher precedence is actually applied.
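To make the precedence idea concrete, here is a minimal Python sketch of how an effective retention value could be resolved. It relies on two documented behaviours: a topic-level retention.ms overrides the broker defaults, and among the broker settings log.retention.ms wins over log.retention.minutes, which wins over log.retention.hours (default 168). The function name effective_retention_ms is hypothetical, and the broker value "7 days" from the question is written as 168 hours here as an assumption.

```python
def effective_retention_ms(topic_config, broker_config):
    """Resolve the retention that applies to a topic, in milliseconds."""
    # A topic-level override wins over every broker-level default.
    if "retention.ms" in topic_config:
        return int(topic_config["retention.ms"])
    # Broker level: ms takes precedence over minutes, minutes over hours.
    if "log.retention.ms" in broker_config:
        return int(broker_config["log.retention.ms"])
    if "log.retention.minutes" in broker_config:
        return int(broker_config["log.retention.minutes"]) * 60_000
    return int(broker_config.get("log.retention.hours", 168)) * 3_600_000


topic = {"retention.ms": "864000000"}   # 10 days, from the question
broker = {"log.retention.hours": "168"}  # assumed form of "7 days"

print(effective_retention_ms(topic, broker))  # topic override applies
print(effective_retention_ms({}, broker))     # falls back to the broker default
```

In this model the topic's 10-day value is the one in force, so the broker's 7-day setting would not explain data older than 10 days on its own.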

Cannot make all messages in a Kafka topic expired with retention

I often clean up all current messages in a Kafka topic by updating retention.ms to 10, so that all messages expire after 10ms. However, sometimes the messages cannot be cleaned up that way, and I have had to drop and re-create the topic in order to remove them.
I'm not sure whether it's related, but it often happens after all consumers of that topic have stopped working for some reason.
What could be the root cause for this?
The retention.ms value is a minimum age, not an exact deadline. The retention check only runs periodically (the Kafka docs state 300000 ms, the default of log.retention.check.interval.ms) and only on closed log segments (the default segment size is 1GB), so you may have to wait for it to run, or the topic may need more data before the segment is rolled.