Altering kafka topic - deleting config results in "Invalid config(s): retention.bytes" - apache-kafka

I'm trying to delete a topic-level configuration retention.bytes that was applied on my topic. But when I try to delete the config with the command as described in Kafka documentation I am getting the following message:
kafka-0:/opt/bitnami/kafka/bin$ kafka-configs.sh --bootstrap-server kafka:9092 --entity-type topics --entity-name foo --alter --delete-config retention.bytes
Invalid config(s): retention.bytes
I've already dug into Kafka's source code, but the only thing it says is that this error is thrown "if any of the configs to be deleted does not exist". However, when I describe my topic, I can see the config there:
kafka-0:/opt/bitnami/kafka/bin$ kafka-topics.sh --bootstrap-server kafka:9092 --describe --topic foo
Topic: foo PartitionCount: 1 ReplicationFactor: 3 Configs: cleanup.policy=compact,delete,flush.ms=1000,segment.bytes=1073741824,retention.ms=7776000000,flush.messages=10000,max.message.bytes=1000012,retention.bytes=1073741824
Topic: foo Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 2,1,0
So obviously it does exist...
Could anyone pinpoint what the problem could be here?

I needed to do the same thing today and hit what looks like the same issue.
After some tinkering, I figured it out.
First, I was able to add retention.bytes as an override (I set it to 10737418240, ten times the default). After that, I was able to delete it, and the topic fell back to the default.
So I think you can only delete a config if you have overridden it on the topic yourself. Otherwise the topic does not carry that config and simply uses the default value. Which means you cannot make the retention size unlimited by deleting the config.
Edit:
Overriding the default with retention.bytes=-1 will enable unlimited retention (do so at your own risk).
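To make the override-versus-default behaviour concrete, here is a toy model in Python. It is only a sketch of the behaviour described in this answer, not Kafka's actual code: the names InvalidConfigError, effective_config and delete_config are made up for illustration, and the broker default value is an assumption taken from the numbers above.

```python
class InvalidConfigError(Exception):
    """Hypothetical stand-in for the error the CLI reports."""


def effective_config(broker_defaults, topic_overrides):
    """Topic overrides win; everything else falls back to the broker default."""
    return {**broker_defaults, **topic_overrides}


def delete_config(topic_overrides, keys):
    """Remove override keys; keys that are not overrides are rejected."""
    missing = [k for k in keys if k not in topic_overrides]
    if missing:
        raise InvalidConfigError("Invalid config(s): " + ",".join(missing))
    return {k: v for k, v in topic_overrides.items() if k not in keys}


broker_defaults = {"retention.bytes": "1073741824"}  # assumed broker-side default
overrides = {}  # the topic has no override of its own yet

# Add an override, then delete it again: the value falls back to the default.
overrides = {**overrides, "retention.bytes": "10737418240"}
overrides = delete_config(overrides, ["retention.bytes"])
print(effective_config(broker_defaults, overrides))
```

In this model, calling delete_config on a topic with no override raises the same "Invalid config(s)" message, while setting the override to "-1" is what actually changes the effective value.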

Related

How to view topic properties in Apache Kafka

I want to view topic-level properties, something like
"message.timestamp.type": "LogAppendTime",
"cleanup.policy":"compact"
Is it possible to view all the properties that are set at the topic level?
Is there a command that shows my topic-level properties as mentioned above? I googled a lot and found a command, but it does not work for me.
The command is as follows:
kafka-configs.bat --describe --zookeeper localhost:2181 --entity-type topics --entity-name test
An alternative could be
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
that lists some basic information about the provided Kafka topic, along with all non-default configurations (Configs) on topic-level. For example,
Topic:my-topic PartitionCount:1 ReplicationFactor:3 Configs: compression.type=gzip,segment.bytes=1073741824,retention.ms=100,max.message.bytes=100001200,delete.retention.ms=100000
Topic: my-topic Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
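If you want the Configs field as key/value pairs rather than one long line, a small parser can split it. This is a rough sketch that assumes the header-line layout shown in the sample output above; parse_topic_configs is a made-up helper name, not part of any Kafka tooling. Note that some values contain commas themselves (for example cleanup.policy=compact,delete), so a token without "=" is treated as a continuation of the previous value.

```python
def parse_topic_configs(describe_line):
    """Return the Configs field of a --describe header line as a dict."""
    _, found, tail = describe_line.partition("Configs:")
    if not found:
        return {}
    configs = {}
    last_key = None
    for token in tail.strip().split(","):
        token = token.strip()
        if not token:
            continue
        if "=" in token:
            key, value = token.split("=", 1)
            configs[key] = value
            last_key = key
        elif last_key is not None:
            # No "=": part of a list value such as cleanup.policy=compact,delete
            configs[last_key] += "," + token
    return configs


line = ("Topic:my-topic PartitionCount:1 ReplicationFactor:3 Configs: "
        "compression.type=gzip,segment.bytes=1073741824,retention.ms=100,"
        "max.message.bytes=100001200,delete.retention.ms=100000")
for key, value in parse_topic_configs(line).items():
    print(key, "=", value)
```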
It would be helpful to know why you think the kafka-configs script is not working.
Remember that the script only prints configurations that have a non-default value.
If the script only prints something like:
Configs for topic 'test' are
then the topic configuration has probably not been changed.
If you are familiar with Docker, I recommend running a container
with the landoop/kafka-topics-ui image.
https://hub.docker.com/r/landoop/kafka-topics-ui/
It is a useful Docker image that shows all the data for your topics: messages, offsets, configurations, and so on.

Kafka configuration min.insync.replicas not working

These are my early days learning Kafka, and I am checking out every Kafka property and concept on my local machine.
So I came across the property min.insync.replicas, and here is my understanding. Please correct me if I've misunderstood anything.
Once a message is sent to a topic, the message must be written to at least min.insync.replicas replicas.
min.insync.replicas also includes the leader.
If the number of live brokers (and so, indirectly, in-sync replicas) is less than the specified min.insync.replicas, the producer will raise an exception and fail to publish the message.
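The understanding above can be written down as a toy check in Python. This is a simplified model rather than broker code: it assumes the broker enforces min.insync.replicas only for produce requests sent with acks=all (also written -1), and NotEnoughReplicas and accept_write are made-up names used for illustration.

```python
class NotEnoughReplicas(Exception):
    """Local stand-in for Kafka's NotEnoughReplicasException."""


def accept_write(isr, min_insync_replicas, acks):
    """Accept or reject a produce request given the current ISR (leader included)."""
    if str(acks) in ("all", "-1") and len(isr) < min_insync_replicas:
        raise NotEnoughReplicas(
            f"ISR size {len(isr)} is below min.insync.replicas={min_insync_replicas}"
        )
    return True


# Three replicas in sync: acks=all is accepted.
print(accept_write(isr={1, 2, 0}, min_insync_replicas=2, acks="all"))

# Only broker 1 left in the ISR: acks=1 is still accepted in this model,
# while acks=all would be rejected.
print(accept_write(isr={1}, min_insync_replicas=2, acks=1))
```

So when testing this scenario, it is worth confirming which acks value the producer is really using, because with acks=0 or acks=1 the check never applies.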
These are the steps I followed to create the above scenario:
Started 3 brokers locally with broker IDs 0, 1 and 2.
Created the topic insync and set min.insync.replicas to 2
using the following command:
sudo ./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic insync --config min.insync.replicas=2
Describing the topic resulted in the following:
Topic:insync PartitionCount:1 ReplicationFactor:3 Configs:min.insync.replicas=2
Topic: insync Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 1,2,0
At this point, I made sure that the property I provided was picked up by Kafka.
I started sending messages and consuming them from the terminal using the following commands:
Producer: ./kafka-console-producer.sh --broker-list localhost:9092 --topic insync --producer.config ../config/producer.properties
Consumer: ./kafka-console-consumer.sh --zookeeper localhost:2181 --topic insync
At this point, I was able to send and receive messages successfully.
Brought down 2 brokers (0 and 2), described the topic, and got the following:
Topic:insync PartitionCount:1 ReplicationFactor:3 Configs:min.insync.replicas=2
Topic: insync Partition: 0 Leader: 1 Replicas: 2,0,1 Isr: 1
At this point, the In Sync Replicas are just 1(Isr: 1)
Then I tried to produce the message and it worked. I was able to send messages from console-producer and I could see those messages in console consumer.
My Kafka version: kafka_2.10-0.10.0.0
following are the producer properties:
bootstrap.servers=localhost:9092
compression.type=none
batch.size=20
acks=all
I expected the producer to fail with NotEnoughReplicasException, as described in its Javadoc:
public class NotEnoughReplicasException
extends RetriableException
Number of insync replicas for the partition is lower than min.insync.replicas
but it worked normally.
Am I missing something? How can I create the scenario?
Edit:
Instead of producing the messages from the console producer, I tried to generate messages from Java code. This time, I got the expected exception in the Kafka broker, although I expected it in the producer (Java code). As this experiment is raising more questions, I've posted another question.
Is acks set to "all"? If not, try setting it to all.
I believe that error is for the transactional producer; you may need to add this config:
transactional.id=TID-TEST
If it still does not work, please check the replication factor and min.insync.replicas of the internal topic __transaction_state.

set config retention.ms=3600000 still data not delete from Kafka

I set retention.ms=3600000 with the command below, but there is still a lot of data on disk after 1 hour. My disk filled up because of the huge volume of data coming into Kafka.
./bin/kafka-topics.sh --zookeeper zookeeper:2181 --alter --topic topic_1 --config retention.ms=3600000
Describe command
./bin/kafka-topics.sh --zookeeper zookeeper:2181 --describe --topics-with-overrides
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic:topic_1 PartitionCount:3 ReplicationFactor:3 Configs:retention.ms=3600000
Topic:topic_2 PartitionCount:3 ReplicationFactor:3 Configs:retention.ms=3600000
Topic:topic_3 PartitionCount:3 ReplicationFactor:3 Configs:retention.ms=3600000,retention.bytes=104857600
Can anyone advise why Kafka does not delete the data after 1 hour?
According to the describe output, the cleanup policy is set to compact, which enables log compaction instead of deletion and keeps the latest record for each key. To delete all data older than the retention period, you need to set the cleanup policy to delete.
./bin/kafka-topics.sh --zookeeper zookeeper:2181 --alter --topic topic_1 --config cleanup.policy=delete
Check the value of log.retention.check.interval.ms.
This broker setting is the interval at which the log cleaner checks whether any log is eligible for deletion.
As the documentation says, retention.ms controls the maximum time Kafka will retain a log before it discards old log segments to free up space, if the "delete" retention policy is in use.
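Since retention works on whole log segments, it can help to picture the check as a filter over segments. The Python below is a deliberately simplified model, not the broker's implementation: it assumes a segment's age is measured from its newest record, and the Segment class and expired_segments function are made-up names. The real broker applies further conditions that are left out here.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int
    newest_timestamp_ms: int  # timestamp of the newest record in the segment


def expired_segments(segments, now_ms, retention_ms):
    """Return segments whose newest record is older than retention_ms."""
    if retention_ms < 0:  # -1 means no time limit
        return []
    return [s for s in segments if now_ms - s.newest_timestamp_ms > retention_ms]


now_ms = 10_000_000
segments = [
    Segment(base_offset=0, newest_timestamp_ms=1_000_000),    # 9,000,000 ms old
    Segment(base_offset=500, newest_timestamp_ms=7_000_000),  # 3,000,000 ms old
]
for seg in expired_segments(segments, now_ms, retention_ms=3_600_000):
    print("eligible for deletion:", seg.base_offset)
```

In this model, a segment that still receives new records never becomes eligible, however old its first record is, which is one reason data can stay on disk longer than retention.ms.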
It looks like your cleanup.policy is set to compact instead of delete:
bin/kafka-configs.sh --zookeeper zookeeper:2181 --entity-type topics \
--entity-name topic_1 --alter --add-config cleanup.policy=delete
PS: Altering topic configuration from the kafka-topics.sh script (kafka.admin.TopicCommand) has been deprecated. Going forward, please use the kafka-configs.sh script (kafka.admin.ConfigCommand) for this functionality.

Kafka per topic retention.bytes and global log.retention.bytes not working

We are running a 6 node cluster of kafka 0.11.0. We have set a global as well as a per-topic retention in bytes, neither of which is being applied. There are no errors that I can see in the logs, just nothing being deleted (by size; the time retention does seem to be working)
See relevant configs below:
./config/server.properties :
# global retention 75GB or 60 days, segment size 512MB
log.retention.bytes=75000000000
log.retention.check.interval.ms=60000
log.retention.hours=1440
log.cleanup.policy=delete
log.segment.bytes=536870912
topic configuration (30GB):
[tstumpges@kafka-02 kafka]$ bin/kafka-topics.sh --zookeeper zk-01:2181/kafka --describe --topic stg_logtopic
Topic:stg_logtopic PartitionCount:12 ReplicationFactor:3 Configs:retention.bytes=30000000000
Topic: stg_logtopic Partition: 0 Leader: 4 Replicas: 4,5,6 Isr: 4,5,6
Topic: stg_logtopic Partition: 1 Leader: 5 Replicas: 5,6,1 Isr: 5,1,6
...
And disk usage shows 910GB for one partition!
[tstumpges@kafka-02 kafka]$ sudo du -s -h /data1/kafka-data/*
82G /data1/kafka-data/stg_logother3-2
155G /data1/kafka-data/stg_logother2-9
169G /data1/kafka-data/stg_logother1-6
910G /data1/kafka-data/stg_logtopic-4
I can see there are plenty of segment log files (512MB each) in the partition directory... what is going on?!
Thanks in advance,
Thunder
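For reference, retention.bytes is applied per partition, not per topic, so it is worth working out what the settings above should allow. The Python below is only back-of-the-envelope arithmetic using the values from the question; retention_budget is a made-up helper name, and the figures ignore index files and other overhead.

```python
import math


def retention_budget(retention_bytes, segment_bytes, partitions):
    """Rough per-partition and per-topic size budget implied by retention.bytes."""
    return {
        "segments_per_partition": math.ceil(retention_bytes / segment_bytes),
        "bytes_per_partition": retention_bytes,
        "bytes_per_topic_per_replica": retention_bytes * partitions,
    }


budget = retention_budget(
    retention_bytes=30_000_000_000,  # topic-level retention.bytes
    segment_bytes=536_870_912,       # log.segment.bytes (512 MiB)
    partitions=12,
)
for name, value in budget.items():
    print(name, value)
```

A partition directory of 910G is far above the 30GB per-partition budget, so the size limit is clearly not being enforced here.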
Found the answer to this via the Kafka users mailing list. We were apparently hitting Kafka bug KAFKA-6030 (Integer overflow in log cleaner cleanable ratio computation).
Upgrading to v1.0.0 fixed this for us!

Can different Kafka topics have different retention lengths?

I'm looking to have a master topic (with log retention 7 days) and several smaller topics with a filtered corpus with a smaller log retention (2 days). Is this possible?
NOTE: I'm using Kafka v0.10.1.1.
The broker-level setting log.retention.hours (7 days by default; log.retention.ms takes precedence when it is set) applies to all topics, but you can override it with the topic-level config retention.ms when creating the topic, as below:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --topic test \
--partitions 1 --replication-factor 1 --config retention.ms=172800000
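The value 172800000 is simply 2 days expressed in milliseconds. If you would rather compute such values than type them by hand, a tiny helper does it; this is just a convenience sketch in Python, and retention_ms is a made-up function name.

```python
from datetime import timedelta


def retention_ms(**duration):
    """Convert a duration (days=, hours=, minutes=, ...) to whole milliseconds."""
    return timedelta(**duration) // timedelta(milliseconds=1)


print("2 days =", retention_ms(days=2), "ms")
print("7 days =", retention_ms(days=7), "ms")
```

The keyword arguments are passed straight to datetime.timedelta, so combinations such as retention_ms(days=1, hours=12) also work.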
log.retention.hours is a broker property that serves as the default when a topic is created. When you change the configuration of an existing topic using kafka-topics.sh, you should specify a topic-level property.
The topic-level property for log retention time is retention.ms.
From Topic-level configuration in Kafka 0.10.1 documentation:
Property: retention.ms
Default: 7 days
Server Default Property: log.retention.minutes
Description: This configuration controls the maximum time we will retain a log before we will discard old log segments to free up space if we are using the "delete" retention policy. This represents an SLA on how soon consumers must read their data.
So the correct command is
$ bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic as-access --config retention.ms=172800000
You can check whether the configuration is properly applied with the following command.
$ bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic as-access
Then you will see something like below.
Topic:as-access PartitionCount:3 ReplicationFactor:3 Configs:retention.ms=172800000