After reboot KAFKA topic appears to be lost - apache-kafka

Having installed Kafka and having read these posts:
kafka loses all topics on reboot
Kafka topic no longer exists after restart
and having therefore moved kafka-logs to a location under /opt, I still note after a reboot that:
I can re-create the topic again.
The kafka-logs directory contains information on topics, offsets, etc., but it gets corrupted.
I am wondering how to rectify this.
Testing of new topics prior to reboot works fine.

There are two potential causes:
If Kafka is running in Docker without a persistent volume, recreating the container discards the previous state and starts a new cluster, so all topics are lost.
Check the Kafka log.dirs (or log.dir) setting and the Zookeeper dataDir path. If either points to the /tmp directory, it will be cleaned on each reboot, so you will lose all logs and topics.

In this VM I noted that the Zookeeper data directory was defined under /tmp. I changed that to /opt (though I presume it should be /var), and Kafka data is no longer cleared when the instance terminates. I am not sure how to explain this completely.
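A quick way to check for the second cause is to scan the two properties files for state directories under /tmp. The following is only a minimal sketch: the helper names parse_properties and volatile_paths are made up for illustration, and it assumes simple key=value lines as found in the stock server.properties and zookeeper.properties files.

```python
from pathlib import PurePosixPath

# Settings that decide where Kafka and ZooKeeper keep their on-disk state.
STATE_KEYS = ("log.dirs", "log.dir", "dataDir")


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def volatile_paths(props, volatile_root="/tmp"):
    """Return (key, path) pairs whose path is volatile_root or below it."""
    root = PurePosixPath(volatile_root)
    flagged = []
    for key in STATE_KEYS:
        # log.dirs may hold several comma-separated directories.
        for raw in props.get(key, "").split(","):
            raw = raw.strip()
            if not raw:
                continue
            path = PurePosixPath(raw)
            if path == root or root in path.parents:
                flagged.append((key, raw))
    return flagged


if __name__ == "__main__":
    # Illustrative input; in practice read the real properties files.
    sample = "broker.id=0\nlog.dirs=/tmp/kafka-logs,/opt/kafka-logs\n"
    for key, path in volatile_paths(parse_properties(sample)):
        print(f"{key} points at {path}, which may be cleared on reboot")
```

Running it against the text of both config/server.properties and config/zookeeper.properties should show whether either service still keeps state under /tmp.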

Related

Cannot delete topic successfully

As far as I know, from version 1.0.x onward delete.topic.enable=true is the default configuration, meaning that if I execute the delete command it should work as expected.
However, I tried deleting the topic and it is still there (I can see it when I describe the topic on all brokers).
I read in some forums that I need to restart the brokers and the zookeepers for the deletion to take effect.
Is there any other way to delete the topic without restarting the zookeepers and the brokers?

Kafka brokers shut down because log dirs have failed

I have a 3-broker Kafka cluster with the Kafka logs in the /tmp directory. I am running the Debezium source connector for MongoDB, which polls data from 4 collections.
However, within 5 minutes of starting the connector, the Kafka brokers shut down with the following error:
[2020-04-16 18:25:08,642] ERROR Shutdown broker because all log dirs in /tmp/kafka-logs-1 have failed (kafka.log.LogManager)
I have tried various suggestions, namely deleting the Kafka logs and cleaning out the Zookeeper logs, but I ran into the same problem again.
I have also noticed that the Kafka logs occupy 100% of the /tmp directory when this happens, so I changed the log retention policy to be based on size:
log.retention.hours=168
log.retention.bytes=1073741824
log.segment.bytes=1073741824
log.retention.check.interval.ms=10000
This also turned out to be futile.
I would like to have some assistance regarding this. Thanks in advance!
Your log files probably got corrupted because you ran out of storage.
I would suggest changing log.dirs in server.properties. Also make sure that you don't use the /tmp location, as it is typically purged when the machine reboots. Once you have changed log.dirs you can restart Kafka.
Note that the older messages will be lost.
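One possible reason the size-based retention did not help is that log.retention.bytes applies per partition rather than per broker, and cleanup removes whole segments, so each partition can exceed the limit by up to one segment. The sketch below estimates a rough worst case under those assumptions. The function name is hypothetical and the partition count is a made-up example, since the question does not state it.

```python
GIB = 1024 ** 3


def worst_case_log_bytes(partitions_on_broker, retention_bytes, segment_bytes):
    """Rough upper bound on disk used by partition logs on one broker.

    Assumes log.retention.bytes applies per partition and that cleanup
    removes whole segments, so each partition can overshoot the limit by
    up to one segment. Index files and other overhead are ignored.
    """
    return partitions_on_broker * (retention_bytes + segment_bytes)


if __name__ == "__main__":
    # Retention and segment sizes from the question; 20 partitions is an
    # assumed figure for illustration only.
    estimate = worst_case_log_bytes(
        partitions_on_broker=20,
        retention_bytes=1073741824,
        segment_bytes=1073741824,
    )
    print(f"about {estimate / GIB:.0f} GiB")
```

If the estimate is larger than the space available in the log directory, lowering log.segment.bytes along with log.retention.bytes, or moving log.dirs to a larger volume, may be worth trying.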

mapR Kafka cannot start second time round

To date I have either used an existing professional installation for Hadoop with components running, or, installed Kafka and used the also-supplied Zookeeper in a native VM.
I am trying to get the mapR Community Edition Sandbox to run now.
There is a Kafka library on mapR, but there is no Kafka process shown when using jps, which seems odd. I managed to get Kafka to start once.
There is a Zookeeper service on mapR but it uses port 5181, not 2181.
Kafka uses port 9092.
The log.dirs for Kafka was set to /tmp/kafka-logs; I changed that to /opt/kafka-logs.
The dataDir was also set to /tmp/zookeeper; I changed that to /opt/zookeeper.
I also changed the Zookeeper port to 5181, as that is what mapR uses.
It ran once, and then I restarted and I still get this type of error:
java.io.FileNotFoundException: /tmp/kafka-logs/.lock (Permission denied)
I think I have done chmod 777 where required, but I changed the paths from /tmp to /opt/..., so why is it picking /tmp up again?
I have the impression that it keeps pointing to /tmp regardless of the updates to the configuration.
I also see a warning - although I do not think this is an issue:
[2019-01-14 13:26:46,355] WARN No meta.properties file under dir /tmp/kafka-logs/meta.properties (kafka.server.BrokerMetadataCheckpoint)
Maybe because of mapR Streams I cannot configure it to run natively?
OK, I could delete the question as I solved it, but for those on mapR, here is what I deduced:
You need to update the port from 2181 to 5181 in server.properties immediately. In this case we integrate with an existing Zookeeper instance.
Likewise, update the log.dirs for Kafka from /tmp/kafka-logs to /opt/kafka-logs as soon as possible.
Likewise, update the dataDir from /tmp/zookeeper to /opt/zookeeper as soon as possible.
Trying to fix these settings later leads to all sorts of issues. I ended up just re-installing and doing it right from scratch.
mapR has a faster alternative called mapR Streams, which implements the Kafka API. I did not want to use that for what I was doing, but the mapR Sandbox has a lot of up-to-date items straight out of the box, certainly compared to Cloudera.

How to save a kafka topic at shutdown

I'm configuring my first Kafka network. I can't seem to find any support for saving a configured topic. I know I can configure a topic from the quickstart guide here, but how do I save it? I thought I could add the topic info to a .properties file inside the config dir, but I don't see any support for that.
If I shut down my machine, my topic is deleted. How do I save the configuration?
Could the topic be deleted because you are using the default broker config? With the default config, Kafka logs are stored under the /tmp folder, which gets wiped during a machine reboot. You could change the broker config and pick another location for the Kafka logs.

How to create a durable topic in kafka

I am new to Kafka and am still learning the basics. I want to create a durable topic that is preserved even after the Zookeeper and/or Kafka server shuts down.
What I notice is this: I have a Zookeeper and a Kafka server running on my local MacBook. When I shut down the Zookeeper server and quickly bring it up again, I can see the previously created topics. But if I restart the system and then restart the Zookeeper server, I don't see the topic that I had created earlier.
I am running kafka_2.9.2-0.8.1.1 on my local system.
It happens because /tmp gets cleaned on reboot, resulting in the loss of your data.
To fix this, modify your Zookeeper dataDir property (in config/zookeeper.properties) and your Kafka log.dirs property (in config/server.properties) to point somewhere NOT in /tmp.
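The edit itself is a one-line change in each file. As a minimal sketch of applying it programmatically, the hypothetical helper below rewrites a single setting in properties text and leaves comments and other settings alone. It assumes simple key=value lines; the target path in the usage note is only an example.

```python
def set_property(text, key, value):
    """Return properties text with key set to value.

    Replaces the first active `key=...` line and leaves comments and other
    settings untouched; appends the setting if the key is absent.
    """
    out, replaced = [], False
    for line in text.splitlines():
        stripped = line.strip()
        is_target = (
            not stripped.startswith("#")
            and "=" in stripped
            and stripped.split("=", 1)[0].strip() == key
        )
        if is_target and not replaced:
            out.append(f"{key}={value}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"
```

For example, set_property(server_text, "log.dirs", "/opt/kafka-logs") on the contents of config/server.properties, and set_property(zk_text, "dataDir", "/opt/zookeeper") on config/zookeeper.properties. The new directories need to exist and be writable by the user running the services, and both services need a restart afterwards.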