How do I restore dead Kafka brokers? - apache-kafka

I had a Kafka cluster with three brokers.
I killed two of them by mistake.
I restarted them with the same server.properties config files that were used to run them the first time, but they are not functioning correctly.
By this I mean that when I run bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic <topic_name> --from-beginning, it prints nothing, although when I replace localhost in --zookeeper localhost:2181 with the address of the broker that had not been killed, it shows all the messages.
It seems like the two restarted Kafka brokers cannot find the cluster they should belong to.
How can I fix this?
Also, how do brokers in a cluster recognize each other's addresses? Through the zookeeper.connect field in the server.properties file?
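For background on that last point: each broker registers itself with the Zookeeper ensemble named in zookeeper.connect and discovers the other brokers from there, so all brokers of one cluster must point at the same ensemble. A minimal sketch of the relevant server.properties lines (host names here are illustrative, not taken from the question):
# server.properties (sketch): all brokers in one cluster share the same zookeeper.connect
broker.id=0
zookeeper.connect=zk1.example.internal:2181,zk2.example.internal:2181,zk3.example.internal:2181
# the address this broker advertises to clients and to the other brokers
advertised.listeners=PLAINTEXT://broker-0.example.internal:9092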

Related

Kafka topic creation: Timed out waiting for node assignment

I am trying to set up the following on a single remote AWS EC2 machine.
1 Kafka Zookeeper
3 Kafka brokers
1 Kafka topic
I have done the following:
I have created two copies of the server.properties file and named them server_1.properties and server_2.properties, where I changed the following values:
changed broker.id=0 to broker.id=1 and broker.id=2 respectively
changed advertised.listeners=PLAINTEXT://your.host.name:9092 to advertised.listeners=PLAINTEXT://3.72.250.103:9092, advertised.listeners=PLAINTEXT://3.72.250.103:9093, and advertised.listeners=PLAINTEXT://3.72.250.103:9094 in the three config files
changed log.dirs=/tmp/kafka-logs to log.dirs=/tmp/kafka-logs_1 and log.dirs=/tmp/kafka-logs_2 in the respective files
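To make the described changes concrete, a sketch of the edited lines in the two copied files (the original server.properties keeps broker.id=0, port 9092, and /tmp/kafka-logs):
# server_1.properties (only the changed lines)
broker.id=1
advertised.listeners=PLAINTEXT://3.72.250.103:9093
log.dirs=/tmp/kafka-logs_1

# server_2.properties (only the changed lines)
broker.id=2
advertised.listeners=PLAINTEXT://3.72.250.103:9094
log.dirs=/tmp/kafka-logs_2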
All three brokers start up fine, and
bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids
shows all three brokers
But when I try to create a topic like:
bin/kafka-topics.sh --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --create --partitions 5 --replication-factor 1 --topic cars
the command times out with the error message: Timed out waiting for node assignment
What have I tried:
I tried creating only one broker - that worked fine
I tried creating the topic with --bootstrap-server localhost:9092 only - that did not work
I tried changing listeners=PLAINTEXT://:9092 to listeners=PLAINTEXT://3.72.250.103:9092 (and likewise for the 2nd and 3rd brokers) - I was not even able to start the brokers with this configuration.
I am not sure what to try next

Kafka multi/two nodes setup with producer and consumer on different server/node

I am struggling to set up a multi node Kafka cluster.
To simplify my request, assume I have two servers/nodes, node1 and node2. node1 has IP 100.100.100.1 and node2 has IP 100.100.100.2. All the configuration will be under the user kafka#node1 and kafka#node2.
Currently, I can set up the single node, single broker Kafka on either node by following the quick start example: https://kafka.apache.org/quickstart.
I can also create a simulated log/topic (to simplify, the producer is just messages typed into the console/terminal) and write to the producer on node1. That topic can also be consumed on node1; let's name this topic logtest.
What I want to achieve is that Kafka on node2 can consume the topic logtest produced on node1. However, I do not know where to start, and I could not find a good post explaining how to set up such a connection so that node2 can consume topics/producers from other nodes/servers (in this example, node1). SSH login has been set up so that no password is needed between node1 and node2 to copy files.
My question is generally how to set up Kafka on two nodes (with content produced on one node and consumed on another one), via command line (.sh files)?
Since you should never have an even number of Zookeeper servers, start Zookeeper on node1
Set these properties on node1 and start Kafka
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://external.lan.ip.here:9092
Point node2 zookeeper.connect string at node1, give it a different broker.id, and start Kafka on it after setting similar listener properties.
You now have a Kafka cluster of 2 nodes and can use --bootstrap-server localhost:9092 (or each other's addresses) in the shell scripts on either host to create topics, produce, and consume.
Repeat to expand the cluster further
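A minimal sketch of what those steps look like with the two nodes from the question, 100.100.100.1 (node1) and 100.100.100.2 (node2); ports and paths follow the stock config:
# node1: config/server.properties (Zookeeper also runs on node1)
broker.id=1
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://100.100.100.1:9092
zookeeper.connect=100.100.100.1:2181

# node2: config/server.properties
broker.id=2
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://100.100.100.2:9092
zookeeper.connect=100.100.100.1:2181

# create the topic with replication-factor 2 so both brokers hold the data,
# then produce on node1 and consume on node2
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 2 --partitions 1 --topic logtest
bin/kafka-console-producer.sh --broker-list 100.100.100.1:9092 --topic logtest
bin/kafka-console-consumer.sh --bootstrap-server 100.100.100.2:9092 --topic logtest --from-beginning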
A multi-broker Kafka cluster is what you are looking for. A multi-broker cluster has multiple nodes/brokers (Kafka binaries) working in sync as a cluster to provide fault tolerance and scalability.
Setting up the cluster:-
After setting up a basic cluster (1 zookeeper node with one kafka-broker) as provided here, you need to do the following steps to add another broker to the cluster.
Configure the new broker with a distinct id. In the new broker's server.properties, change the broker id to a unique value.
Adding a new server name/port it responds to:-
Add the old zookeeper location to the new broker's configuration. The cluster below is being run on localhost (broker-0 on port 9092, broker-1 on port 9093, zookeeper on port 2181). (Zookeeper is the one deciding how to maintain leadership and keep the cluster serving data.)
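As a sketch, the second broker's properties for the localhost cluster just described could look like this (the file name server-1.properties, also used in the start command below, is only illustrative):
# config/server-1.properties (broker-1 on the same machine)
broker.id=1
listeners=PLAINTEXT://localhost:9093
log.dirs=/tmp/kafka-logs-1
zookeeper.connect=localhost:2181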
Start the new broker now, using the new properties file:
./kafka-server-start.sh ../config/server-1.properties
When the server comes up without failing, we can go ahead and check whether the new broker has joined the cluster successfully via the zookeeper CLI (this is important, as zookeeper is the one maintaining leadership and broker aliveness):
./zookeeper-shell.sh localhost:2181
ls /brokers/ids
The ids [0, 1] show that both brokers are part of the same cluster.
You now have a 1-zookeeper, 2-broker Kafka cluster.
Creating a topic:-
Once the cluster is up, we need to create a topic that is spread across the 2 brokers. Each topic has its data split into parts (partitions), and each partition has a duplicate copy to prevent data loss (replication-factor).
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 2 --topic helloWorld
Check how the topic is setup:-
./kafka-topics.sh --describe --zookeeper localhost:2181 --topic helloWorld
Here we can see each partition and its replicas spread across the two Kafka brokers.
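Roughly, the describe output looks like this; leader and replica assignments are illustrative and will differ per run:
Topic: helloWorld   PartitionCount: 2   ReplicationFactor: 2   Configs:
    Topic: helloWorld   Partition: 0   Leader: 0   Replicas: 0,1   Isr: 0,1
    Topic: helloWorld   Partition: 1   Leader: 1   Replicas: 1,0   Isr: 1,0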
Topic Consumption:-
To start producing use:-
./kafka-console-producer.sh --broker-list localhost:9092 --topic helloWorld
To start consuming use:-
./kafka-console-consumer.sh --bootstrap-server localhost:9093 --topic helloWorld
Notice how I am producing to :9092 (broker-0) but am consuming from :9093 (broker-1).
This shows how the data is being synced internally between the brokers. However, for best practices and high availability, always include all the brokers in the --broker-list parameter.
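For instance, with both local brokers listed (a sketch following the ports used above):
./kafka-console-producer.sh --broker-list localhost:9092,localhost:9093 --topic helloWorld
./kafka-console-consumer.sh --bootstrap-server localhost:9092,localhost:9093 --topic helloWorld --from-beginning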

Kafka Topic Creation with --bootstrap-server gives timeout Exception (kafka version 2.5)

When trying to create topic using --bootstrap-server,
I am getting exception "Error while executing Kafka topic command: Timed out waiting for a node" :-
kafka-topics --bootstrap-server localhost:9092 --topic boottopic --replication-factor 3 --partitions
However following works fine, using --zookeeper :-
kafka-topics --zookeeper localhost:2181 --topic boottopic --replication-factor 3 --partitions
I am using Kafka version 2.5, and to my knowledge, since version 2.2+ all the offsets and metadata are stored on the broker itself, so while creating a topic there should be no need to connect to zookeeper.
Please help to understand this behaviour
Note - I have set up a Zookeeper quorum and Kafka broker cluster each containing 3 instance on a single machine (for dev purposes)
Old question, but I'll answer anyway for the sake of internet wisdom.
You probably have auth set up; when using --bootstrap-server you also need to specify your credentials with --command-config.
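A sketch of what that can look like if, for example, SASL/PLAIN over SSL is configured on the brokers; the file name and credentials are placeholders:
# client.properties (placeholder values)
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="admin" password="admin-secret";

# then pass it to the tool
kafka-topics --bootstrap-server localhost:9092 --command-config client.properties \
  --create --topic boottopic --replication-factor 3 --partitions 3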
since version >2.2, all the ... metadata are stored on the broker itself
False. Topic metadata is still stored on Zookeeper until KIP-500 is completed.
The AdminClient.createTopics() method that is used internally will, however, delegate to Zookeeper from the Controller broker node in the cluster.
Hard to say what the error is, but the most common issues are that Kafka is not running, that you have SSL enabled and the certs are wrong, or that the listeners are misconfigured.
For example, in the listeners, the default broker port on a Cloudera Kafka installation would be 6667, not 9092
each containing 3 instance on a single machine
Running 3 instances on one machine does not improve resiliency or performance unless you have 3 CPUs and 3 separate HDDs on that one motherboard.
"Error while executing Kafka topic command: Timed out waiting for a
node"
This seems like your broker is down or is inaccessible from where you are running those commands or it hasn't started yet (perhaps still starting).
Sometimes the broker startup takes long because it performs some cleaning operations. You may want to check your Kafka broker startup logs and see if it is ready and then try creating the topics by giving in the bootstrap servers.
There could also be some errors during your Kafka broker startup, like "Too many open files", a wrong zookeeper URL, or zookeeper not being accessible by your broker, to name a few.
If you are able to create topics by passing in your Zookeeper URL, that means Zookeeper is up, but it does not necessarily mean that your Kafka broker(s) are also up and running, since Zookeeper can start without a broker but not vice versa.
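Two quick checks, assuming a stock installation where the broker logs to logs/server.log under the Kafka directory (the path is an assumption):
# look for the broker's "started" line in its log
grep "started (kafka.server.KafkaServer)" logs/server.log

# check whether a broker actually answers at the bootstrap address
bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092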

Kafka: why topic creation in Kafka is done via the zookeeper host instead of the broker

I am trying to learn Kafka,
This is the command I saw for topic creation.
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic test
Why do we pass the zookeeper host and port for topic creation instead of the broker(s) information?
What does zookeeper do with the topic?
Is the topic info persisted in zookeeper?
Which Kafka version are you using? In the latest version, the --zookeeper parameter is deprecated and you can use --bootstrap-server instead, providing the Kafka broker(s) address(es).
Anyway, the topic information will still be stored in Zookeeper. This is the way Kafka works.
The fact that --zookeeper is deprecated is because development is going in the direction of making clients less aware of Zookeeper and doing all the operations by connecting to the Kafka brokers; it's then the broker that performs the operation on Zookeeper.
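Side by side, the old and the newer form of the same command (ports assume the defaults):
# deprecated: the tool talks to Zookeeper directly
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic test

# current: the tool talks to a broker, which then updates Zookeeper itself
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 2 --topic test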

Kafka: org.apache.zookeeper.KeeperException$NoNodeException while creating topic on multi server setup

I am trying to set up a multi-node Kafka 0.8.2.2 cluster with 1 producer, 1 consumer, and 3 brokers, all on different machines.
While creating a topic on the producer, I am getting the error org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/ids. The complete console output is available here. There is no error in the Kafka producer's log.
Command I am using to run Kafka is:
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic edwintest
Note: The Zookeeper service is running on all the servers, and all three brokers have Kafka servers running on them (only brokers need the Kafka server, right?).
The configuration of my producer.properties is as follows:
metadata.broker.list=<IP.OF.BROKER.1>:9092,<IP.OF.BROKER.2>:9092,<IP.OF.BROKER.3>:9092
producer.type=sync
compression.codec=none
serializer.class=kafka.serializer.DefaultEncoder
Below are some of the many articles I was using as reference:
Zookeeper & Kafka Install : A single node and a multiple broker cluster - 2016
Step by Step of Installing Apache Kafka and Communicating with Spark
At first glance it seems like you're calling create topic against a local zookeeper which is not aware of any of your kafka brokers. You should call ./bin/kafka-topics.sh --create --zookeeper <IP.OF.BROKER.1>:2181
The issue was that I was trying to connect to the zookeeper on localhost. My understanding was that zookeeper needs to be running on the producer, the consumer, and the Kafka brokers, and that the producer -> broker and broker -> consumer communication is done via zookeeper. But that was incorrect. Actually:
Zookeeper and the Kafka servers should be running only on the broker servers. While creating a topic or publishing content to a topic, the public DNS of any of the Kafka brokers should be passed with the --zookeeper option. There is no need to run a Kafka server on the producer or consumer instance.
Correct command will be:
./bin/kafka-topics.sh --create --zookeeper <Public-DNS>:<PORT> --replication-factor 1 --partitions 3 --topic edwintest
where Public-DNS is the DNS of any of the Kafka brokers and PORT is the port of the zookeeper service.