When does Zookeeper change Kafka cluster ID? - apache-kafka

Sometimes I encounter the following error when trying to run a Kafka broker:
ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.InconsistentClusterIdException: The Cluster ID m1Ze6AjGRwqarkcxJscgyQ doesn't match stored clusterId Some(1TGYcbFuRXa4Lqojs4B9Hw) in meta.properties. The broker is trying to join the wrong cluster. Configured zookeeper.connect may be wrong.
at kafka.server.KafkaServer.startup(KafkaServer.scala:220)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)
at kafka.Kafka$.main(Kafka.scala:84)
at kafka.Kafka.main(Kafka.scala)
[2020-01-04 15:58:43,303] INFO shutting down (kafka.server.KafkaServer)
I know what the error message means and that it can be solved by removing meta.properties from the Kafka log dir.
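For reference, that workaround is simply deleting the file before restarting the broker; a minimal sketch, assuming the default log.dirs of /tmp/kafka-logs from server.properties:
rm /tmp/kafka-logs/meta.properties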
I now want to know exactly when this happens so that I can prevent it. Why/when does the cluster ID that Zookeeper stores sometimes change?

Kafka requires Zookeeper to start. When you run Zookeeper with the out-of-the-box setup, its data/log dir is set to the tmp directory.
When your instance/machine restarts, the tmp directory is flushed, so Zookeeper loses its identity and creates a new one. Kafka cannot connect to the new Zookeeper ensemble because its meta.properties file still contains a reference to the previous cluster_id.
To avoid this, configure $KAFKA_HOME/config/zookeeper.properties and change dataDir to a directory that will persist after Zookeeper shuts down / restarts.
This prevents Zookeeper from losing its cluster_id every time the machine restarts.
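A minimal sketch of the change, where /var/lib/zookeeper is only an illustrative path (any location outside /tmp that survives reboots works):
# $KAFKA_HOME/config/zookeeper.properties
dataDir=/var/lib/zookeeper    # instead of the default /tmp/zookeeper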

Related

Delete the kafka connect topics without stopping the process

I was running a Kafka Connect worker in distributed mode (it's a test cluster). I wanted to reset the default connect-* topics, so without stopping the worker I removed them. After the worker restarted, I'm getting this error.
ERROR [Worker clientId=connect-1, groupId=debezium-cluster1] Uncaught exception in herder work thread, exiting: (org.apache.kafka.connect.runtime.distributed.DistributedHerder:324)
org.apache.kafka.common.config.ConfigException:
Topic 'connect-offsets' supplied via the 'offset.storage.topic' property is required to have 'cleanup.policy=compact' to guarantee consistency and durability of source connector offsets,
but found the topic currently has 'cleanup.policy=delete'.
Continuing would likely result in eventually losing source connector offsets and problems restarting this Connect cluster in the future.
Change the 'offset.storage.topic' property in the Connect worker configurations to use a topic with 'cleanup.policy=compact'.
Deleting the internal topics while the workers are still running sounds risky. The workers have internal state, which now no longer matches the state in the Kafka brokers.
A safer approach would be to shut down the workers (or at least shut down all the connectors), delete the topics, and restart the workers/connectors.
It looks like the topics got auto-created, perhaps by the workers when you deleted them mid-flight.
You could manually apply the configuration change to the topic as suggested, or you could specify a new set of topics for the worker to use (a connect01- prefix, for example) and let the workers recreate them correctly.
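A minimal sketch of both options, assuming a local broker on localhost:9092 and the default topic names (older brokers/tools may need --zookeeper instead of --bootstrap-server):
# Option 1: make the existing offsets topic compacted
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name connect-offsets --alter --add-config cleanup.policy=compact
# Option 2: point the worker at fresh topics and let it recreate them
# (in the distributed worker properties file)
offset.storage.topic=connect01-offsets
config.storage.topic=connect01-configs
status.storage.topic=connect01-status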

Kafka brokers shut down because log dirs have failed

I have a 3-broker Kafka cluster with the Kafka logs in the /tmp directory. I am running the Debezium source connector for MongoDB, which polls data from 4 collections.
However within 5 mins after starting the connector, the Kafka brokers are shutting down with the following error:
[2020-04-16 18:25:08,642] ERROR Shutdown broker because all log dirs in /tmp/kafka-logs-1 have failed (kafka.log.LogManager)
I have tried the different suggestions, viz. deleting the Kafka logs and cleaning out the Zookeeper logs, but I ran into the same problem again.
I have also noticed that the Kafka logs occupy 100% of the /tmp directory when this happens, so I have also changed the log retention policy based on size.
log.retention.hours=168
log.retention.bytes=1073741824
log.segment.bytes=1073741824
log.retention.check.interval.ms=10000
This also turned out to be futile.
I would like to have some assistance regarding this. Thanks in advance!
Your log files probably got corrupted because you ran out of storage.
I would suggest changing log.dirs in server.properties. Also make sure that you don't use the /tmp location, as it is going to be purged once your machine turns off. Once you have changed log.dirs you can restart Kafka.
Note that the older messages will be lost.
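A minimal sketch of the change, with an illustrative persistent path (pick any directory outside /tmp with enough free space):
# $KAFKA_HOME/config/server.properties
log.dirs=/var/lib/kafka-logs    # instead of the default /tmp/kafka-logs
After changing log.dirs, restart each broker so it creates fresh log directories (and a fresh meta.properties) in the new location.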

Getting Exception in thread "main" kafka.zookeeper.ZooKeeperClientTimeoutException when creating topic

I am new to Kafka and have started learning on my own.
I am trying to create a topic in Kafka; my Zookeeper is running, but every time I get the error below.
This means your Zookeeper instance(s) are not running properly.
Please check the Zookeeper log files for any errors. By default, Zookeeper logs are stored in the server.log file.
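A quick way to check whether Zookeeper is actually reachable is to query its client port; a sketch assuming the default port 2181 (on Zookeeper 3.5+ the four-letter-word commands must be whitelisted via 4lw.commands.whitelist):
echo ruok | nc localhost 2181    # a healthy server answers imok
echo stat | nc localhost 2181    # prints the server mode and client connections
zkServer.sh status               # or ask the service script directly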

Kafka parcel not installed well

I am using a Cloudera cluster (5.5.1-1.cdh5.5.1) and I installed the Kafka parcel (2.0.1-1.2.0.1.p0.5). When I tried to add the Kafka service to the cluster, I got the following error on every host I tried to add a role to: Error found before invoking supervisord: User [kafka] does not exist.
I created the user manually on every node needed and restarted the service, and it seemed to work until I tried to modify the Kafka config with Cloudera Manager. Cloudera's Kafka configuration isn't "bound" to the real Kafka configuration.
The error when I try to modify the broker.id:
Fatal error during KafkaServerStartable startup. Prepare to shutdown
kafka.common.InconsistentBrokerIdException: Configured broker.id 101
doesn't match stored broker.id 145 in meta.properties. If you moved
your data, make sure your configured broker.id matches. If you intend
to create a new broker, you should remove all data in your data
directories (log.dirs).
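The exception itself points at the remedy: the configured broker.id must match the one recorded in meta.properties, or the data directories must be wiped for a genuinely new broker. A minimal sketch of the check, with an illustrative data path (use whatever log.dirs Cloudera Manager actually configured):
cat /var/local/kafka/data/meta.properties   # shows the stored broker.id (145 here)
# either set broker.id back to the stored value in the service configuration,
# or remove the contents of log.dirs if this is intended to be a brand-new broker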

How to create a durable topic in kafka

I am new to Kafka and am still learning the basics. I want to create a durable topic which is preserved even after the Zookeeper and/or Kafka server shut down.
What I notice is this: I have a Zookeeper and Kafka server running on my local MacBook. When I shut down the Zookeeper server and bring it up again quickly, I can see the previously created topics. But if I restart the system and then restart the Zookeeper server, I don't see the topic that I had created earlier.
I am running kafka_2.9.2-0.8.1.1 on my local system.
It happens because /tmp is getting cleaned after reboot, resulting in the loss of your data.
To fix this, modify your Zookeeper dataDir property (in config/zookeeper.properties) and Kafka log.dirs (in config/server.properties) to point somewhere NOT in /tmp.
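To confirm the topic actually survives, create it, restart both services (or reboot), and list topics again; a sketch assuming the 0.8.x command-line tools and a local Zookeeper on port 2181:
kafka-topics.sh --zookeeper localhost:2181 --create --topic durable-test --partitions 1 --replication-factor 1
# ... restart Zookeeper and Kafka, or reboot the machine ...
kafka-topics.sh --zookeeper localhost:2181 --list    # durable-test should still be listed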