What happens if Kafka files are deleted? - apache-kafka

This is definitely not the way to do it, and it should probably be handled by the cleanup policy, but that's not the point. Let's imagine the files in log.dirs have been deleted. What's the impact?
Would the broker crash?
Would the offsets start over at 0 after restarting the service?
Would anything need to be done to fix it?

If you delete the files from log.dirs, the data will be deleted but the topic will still exist in the ZooKeeper metadata. The broker won't crash. Once you restart the brokers, they will read the topic as an empty one and you can produce new data.
If you delete the topic from the ZooKeeper metadata as well, the topic will be removed from the broker.
To check the offsets, you can use the command below:
// Before deleting the log.dirs directory for topic 'test1'
kafka_2.12-1.1.1 % bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic test1
test1:0:6
// After deleting the directory and restarting the broker
kafka_2.12-1.1.1 % bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic test1
test1:0:0
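To compare the state before and after in a script, the topic:partition:offset lines printed above can be summed per topic. This is only a minimal sketch: the function name sum_end_offsets is made up for illustration, and it assumes every input line has exactly the three colon-separated fields shown in the output above.

```shell
# Hypothetical helper: sum the end offsets of all partitions.
# Reads GetOffsetShell-style lines (topic:partition:offset) from stdin.
sum_end_offsets() {
  # The offset is the last colon-separated field on each line.
  awk -F: '{ total += $NF } END { print total + 0 }'
}
```

For example, piping the GetOffsetShell command shown above into sum_end_offsets gives a single number per run, which makes a reset to 0 after the restart easy to spot.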

In fact, it depends on how many brokers you have in your cluster, and on how many of them you delete the files from at the same time. If you delete the files from one broker in a 3-broker cluster and your topics have a replication factor of 3, you will not lose anything, and the files will be recreated on the broker where you deleted them.
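One way to see whether the wiped broker has caught up again is to compare the Replicas and Isr columns printed by kafka-topics.sh --describe. The sketch below is an assumption-laden illustration: under_replicated is a hypothetical helper name, and it assumes the usual per-partition line layout ("Partition: 0 ... Replicas: 1,2,3 Isr: 1,2,3"), which may differ between Kafka versions.

```shell
# Hypothetical helper: print the partition numbers whose ISR list is
# shorter than their replica list. Reads --describe output from stdin.
under_replicated() {
  awk '
    {
      part = ""; replicas = ""; isr = ""
      # Pick up the value that follows each labelled column.
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
        if ($i == "Isr:")       isr = $(i + 1)
      }
      # Skip the topic summary line and anything else without these columns.
      if (part == "" || replicas == "") next
      nr = split(replicas, r, ",")
      ni = (isr == "") ? 0 : split(isr, s, ",")
      if (ni < nr) print part
    }'
}
```

If the function prints nothing for a topic, every partition reports as many in-sync replicas as assigned replicas, which is what you would hope to see once the files have been recreated.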

Related

Kafka topics not created empty

I have a Kafka cluster consisting of 3 servers, all connected through ZooKeeper. But when I delete a topic that has some information and create the topic again with the same name, the offset does not start from zero.
I tried restarting both Kafka and ZooKeeper and deleting the topics directly from ZooKeeper.
What I expect is to have a clean topic when I create it again.
I found the problem. A consumer was consuming from the topic and the topic was never actually deleted. I used this tool to have a GUI that allowed me to see the topics easily: https://github.com/tchiotludo/kafkahq. Anyway, the consumer groups can be listed by running this:
bin/kafka-consumer-groups.sh --list --bootstrap-server localhost:9092

Deleted Kafka topic can't be recreated with the same name

I marked a topic for deletion and it sat there forever not deleting (even though delete.topic.enable is set to true). So I followed the instructions and shelled into one of the zookeepers and ran the following to get it deleted:
rmr /brokers/topics/topicname
rmr /admin/delete_topics/topicname
The topic then appeared to be deleted (would not come back on a list command). But then when I tried to recreate it with new configuration (compaction turned on), the in-sync-replicas are empty and I can't consume from the topic. Consumption comes back with 'UNKNOWN_TOPIC_OR_PARTITION' errors even though the list command shows the topic as being there.
Is there a log somewhere I can look at to see why it is not able to get the topic setup properly after deletion and recreation? Am I missing a step and not properly deleting the topic to begin with? Why is the recreated topic not getting properly initialized?
What I ran to delete the topic initially before running the two commands above (this left the topic in 'marked for deletion' for a long time):
./kafka-topics.sh --zookeeper $KAFKAZKHOSTS --delete --topic topicname
What I ran to recreate the topic:
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --zookeeper $KAFKAZKHOSTS --replication-factor 3 --partitions 3 --topic topicname --config cleanup.policy=compact
Kafka version: 1.1.0.2.6.5.3005-27
So I read somewhere that you should restart the brokers and that may solve it. So I tried that and sure enough after the restart the ISRs are in the right state and the topic is consumable again.
I would still like to know under what circumstances this can happen and if there's a way to fix it without restarting brokers since in a production environment I would like to avoid doing that.

Kafka partition directories not deleted in data dir

I am using bin/kafka-topics.sh --zookeeper --delete --topic, and I see in the Kafka logs that the partitions for that topic are marked for deletion. However, I still see the directories for those partitions in the data dir.
Is this expected, and do I have to delete them manually?
The topics haven't been removed from ZooKeeper either; I still see them there. Is this also expected?
Thanks!
There could be several reasons for topics not being deleted automatically.
In order to delete a topic, delete.topic.enable must be set to true.
If it is set to true, the topic should be removed from ZooKeeper and its directories removed from the Kafka data directories (log.dirs). If that doesn't happen, check the logs for problems with the Kafka brokers or ZooKeeper, such as a leader election issue.
In that case, you have to clean up the directories manually.
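Before cleaning up by hand, it helps to list exactly which partition directories are left for the topic. The following is a rough sketch, not an official tool: list_partition_dirs is a hypothetical helper, and it assumes partition directories are named <topic>-<partition number> directly under one of the log.dirs paths.

```shell
# Hypothetical helper: list leftover partition directories for a topic.
# $1: one directory from log.dirs, $2: topic name
list_partition_dirs() {
  for d in "$1"/"$2"-*; do
    [ -d "$d" ] || continue
    name=${d##*/}
    suffix=${name#"$2"-}
    case "$suffix" in
      ''|*[!0-9]*) ;;        # suffix is not a plain partition number, skip
      *) echo "$name" ;;
    esac
  done
}
```

Checking that the suffix is purely numeric keeps a topic such as test1 from matching the directories of a differently named topic like test1-extra.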

Zookeeper client cannot rmr /brokers/topics/MY_TOPIC

I'm trying to remove a Kafka topic with 8 partitions and a replication factor of 2. First I deleted the topic using the kafka-topics.sh --delete command. Then I used zkCli.sh -server slave1.....slave3 and ran rmr /brokers/topics/MY_TOPIC.
However, I still see the topic in /brokers/topics/. I also tried restarting Kafka, and everything is still the same.
By the way, a topic with 1 partition and 1 replica can be deleted successfully.
You can enable deletion of Kafka topics through the broker's server properties.
Add the line below to server.properties:
delete.topic.enable = true
If you remove the topic manually using rmr /brokers/topics/MY_TOPIC, you also need to remove the topic-related metadata from the other nodes in ZooKeeper, e.g. consumer information about that topic. You also need to remove the topic's directories on the Kafka servers.
It is cleaner to enable the topic deletion property and run kafka-topics.sh --delete.
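To confirm what a broker's properties file actually says, something like the sketch below can be used. The function name delete_topic_setting is made up for illustration, and the path you pass in depends on your installation. It only reports an explicit setting in the file; when the property is absent, the broker falls back to its built-in default, which depends on the Kafka version.

```shell
# Hypothetical helper: print the explicit delete.topic.enable value
# from a properties file, or "unset" if the file does not set it.
# $1: path to the broker's properties file
delete_topic_setting() {
  # Ignore commented lines; if the key appears twice, the last one wins.
  value=$(sed -n 's/^[[:space:]]*delete\.topic\.enable[[:space:]]*[=:][[:space:]]*//p' "$1" \
    | tail -n 1 | tr -d '[:space:]')
  echo "${value:-unset}"
}
```

Running it against each broker's file is a quick way to catch one broker that was configured differently from the others.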

Kafka topic is marked for deletion but not getting deleted in kafka 0.9

I am trying to delete my Kafka topic with the following command.
bin/kafka-topics.sh --zookeeper <zkserver>:2181 --delete --topic test1
My Kafka version is 0.9 and I have also set the delete.topic.enable flag to true. Still, when I run the above command, my topic is only marked for deletion and is not actually deleted.
A logical topic is composed of multiple partitions, and each partition may have multiple replicas. In other words, your topic is physically distributed across multiple broker instances.
If any of those instances is down, the topic deletion will not be able to finish.
There was an orphan producer process running against that topic, which had been spawned by my Java Kafka producer program. I eventually discovered it when I started a console consumer on the same topic. After manually killing that process, I was able to delete the topic.