How to understand which partition replica kafka broker is down - apache-kafka

There's a topic with 22 replicas, 50 partitions, and 22 running Kafka brokers.
The topic's manual assignment screen in Kafka Manager shows "Broker Down" in all topic partitions, as seen in the image.
How can I determine which Kafka broker is down, using the CLI or Kafka Manager?
Currently, I look at which broker id is missing from the partition replicas.

This information on brokers is also maintained in Zookeeper, so you could go onto one of the Zookeeper nodes and use the CLI to extract it. Here's a command sequence you could use:
On the command line, issue the command zookeeper-client; this should bring up the Zookeeper command prompt.
At the new prompt, issue the command ls /brokers/ids; this should return the ids of all the active brokers.
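For example, a session might look like the following (the broker ids shown are illustrative, not from the question's 22-broker cluster):
$ zookeeper-client
[zk: localhost:2181(CONNECTED) 0] ls /brokers/ids
[1, 2, 3, 5, 6]
Any id from your expected broker set that is missing from this list (here, broker 4) belongs to a broker that is down, since only live brokers keep their ephemeral registration under /brokers/ids.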

Related

Creating a topic in Kafka only in one cluster

I run the Docker file in this location, which creates 3 clusters with 3 brokers. When I run docker ps, I can see 3 Zookeeper and 3 Kafka instances.
When I create a topic on one of the Kafka brokers (0.0.0.0:9092) and list the broker topics, I see that topic listed on all brokers.
My expectation was to see that topic only under the broker I picked.
What am I missing?

Unable to connect Kafka consumer to Kafka cluster

We have a Kafka cluster with 3 broker nodes. When all are up and running, the consumer is able to read data from Kafka. However, if I stop all Kafka servers and bring up only 2 of them, excluding the one that was stopped last, the consumer is unable to connect to the Kafka cluster.
What could be the reason behind this? Thanks in advance.
I would guess that the problem could be offsets.topic.replication.factor on the brokers, which defaults to 3, while you are now running a cluster with only 2 brokers.
This is the internal topic where consumers store their offsets while consuming, and it was created with a replication factor of 3 on the first run.
When, on the second run, you start only 2 brokers, one replica of that topic is missing, and that could be the problem.
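One way to check this theory, assuming you can reach the cluster, is to describe the internal offsets topic and look at the in-sync replicas (the ZooKeeper address is illustrative):
bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Partitions whose only in-sync replica was the broker you stopped last would explain why the consumer cannot fetch its committed offsets.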

Doubts Regarding Kafka Cluster Setup

I have a use case where I want to set up a Kafka cluster. Initially I have 1 Kafka broker (A) and 1 Zookeeper node. Below are my queries:
On adding a new Kafka broker (B) to the cluster, will all data present on broker A be distributed automatically? If not, what do I need to do to distribute the data?
Now let's suppose case 1 is somehow solved and my data is distributed on both brokers. Due to some maintenance issue, I want to take down server B.
How do I transfer the data of broker B to the already existing broker A or to a new broker C?
How can I increase the replication factor of my topics at runtime?
How can I change the Zookeeper IPs present in the Kafka broker config at runtime, without restarting Kafka?
How can I dynamically change the Kafka configuration at runtime?
Regarding the Kafka client:
Do I need to specify all Kafka broker IPs in the Kafka client connection string?
And every time a broker is added or removed, do I need to add or remove its IP from the Kafka client connection string? That would always require restarting my producers and consumers.
Note:
Kafka Version: 2.0.0
Zookeeper: 3.4.9
Broker Size : (2 core, 8 GB RAM) [4GB for Kafka and 4 GB for OS]
To run a topic from a single Kafka broker, you will have to set a replication factor of 1 when creating that topic (explicitly, or implicitly via default.replication.factor). This means the topic's partitions will sit on a single broker, even after increasing the number of brokers.
You will have to increase the number of replicas as described in the Kafka documentation, and also make sure the internal __consumer_offsets topic has enough replicas. This starts the replication process, and eventually the original broker will be the leader of every topic partition, with the other broker a fully caught-up follower. You can use kafka-topics.sh --describe to check that every partition has both brokers in the ISR (in-sync replicas).
Once that is done, you should be able to take the original broker offline, and Kafka will elect the new broker as the leader of every topic partition. Don't forget to update the clients so they are aware of the new broker as well, in case a client needs to restart while the original broker is down (otherwise it won't find the cluster).
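For example, under the assumption that the topic is named my-topic and ZooKeeper runs locally (both illustrative), the ISR check looks like this:
bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic my-topic
Every partition line in the output should list both broker ids under Isr before you take the original broker offline.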
Here are the answers in brief:
Yes, the data present on broker A will also be distributed to Kafka broker B.
You can set up three brokers A, B and C, so if A fails then B and C will take over, if B also fails then C will take over, and so on.
You can increase the replication factor of your topics.
For example, you could create increase-replication-factor.json and put this content in it:
{"version":1,
"partitions":[
{"topic":"signals","partition":0,"replicas":[0,1,2]},
{"topic":"signals","partition":1,"replicas":[0,1,2]},
{"topic":"signals","partition":2,"replicas":[0,1,2]}
]}
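The JSON on its own does nothing; assuming ZooKeeper runs on localhost (illustrative), you then apply it with the reassignment tool:
bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute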
To increase the number of partitions for an existing topic (note: this is a separate operation from increasing replicas), specify the new partition count with the command below (say, an increase from 2 to 3):
bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic-to-increase --partitions 3
There is a zoo.cfg file where you can add the IPs and configuration related to ZooKeeper.
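As an illustration (the host names are hypothetical), a ZooKeeper ensemble is listed in zoo.cfg like this, and each broker references the same ensemble via zookeeper.connect in server.properties:
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
Note, though, that ZooKeeper 3.4.x has no runtime reconfiguration, so changes to these entries require a restart of the affected services.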

Why do we need to mention Zookeeper details even though Apache Kafka configuration file already has it?

I have been using Apache Kafka in a (plain vanilla) Hadoop cluster for the past few months, and out of curiosity I am asking this question, just to gain additional knowledge about it.
Kafka's server.properties file already has the parameter below:
zookeeper.connect=localhost:2181
And I am starting Kafka Server/Broker with the following command :
bin/kafka-server-start.sh config/server.properties
So I assume that Kafka picks up the Zookeeper details when the Kafka server itself starts. If that's the case, why do we need to explicitly mention the Zookeeper properties when we create Kafka topics? The syntax is given below for reference:
bin/kafka-topics.sh --create --zookeeper localhost:2181
--replication-factor 1 --partitions 1 --topic test
As per the Kafka documentation, we need to start Zookeeper before starting the Kafka server, so I don't think Kafka can be started with the Zookeeper details commented out in server.properties.
But at least, can we use Kafka to create topics and to start a Kafka producer/consumer without explicitly mentioning Zookeeper in their respective commands?
The zookeeper.connect parameter in the Kafka properties file is needed so that each Kafka broker in the cluster can connect to the Zookeeper ensemble.
Zookeeper keeps information about connected brokers and handles the controller election. Beyond that, it keeps information about topics, quotas and ACLs, for example.
When you use the kafka-topics.sh tool, the topic creation happens at the Zookeeper level first; from there the information is propagated to the Kafka brokers, and topic partitions are created and assigned to them (thanks to the elected controller). This connection to Zookeeper will not be needed in the future, thanks to the new Admin Client API, which provides admin operations executed against the Kafka brokers directly. For example, there is an open JIRA (https://issues.apache.org/jira/browse/KAFKA-5561), which I'm working on, for having the tool use this API for topic admin operations.
Regarding producer and consumer: the producer doesn't need to connect to Zookeeper, and only the "old" consumer (before the 0.9.0 version) needs the Zookeeper connection, because it saves topic offsets there; from 0.9.0 onward, the "new" consumer saves topic offsets in a real topic (__consumer_offsets). To use it, you have to use the bootstrap-server option on the command line instead of the zookeeper one.
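For instance, the console consumer shipped with Kafka can be pointed straight at a broker (host, port and topic name are illustrative):
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
No --zookeeper option is needed here; the consumer discovers the rest of the cluster from the bootstrap broker.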

Zookeeper client cannot rmr /brokers/topics/MY_TOPIC

I'm trying to remove a Kafka topic with 8 partitions and 2 replicas. First I deleted that topic using the kafka-topics.sh --delete command. Then I used zkCli.sh -server slave1.....slave3 and ran rmr /brokers/topics/MY_TOPIC.
However, I still see that topic in /brokers/topics/. I also tried restarting Kafka; everything is still the same.
Btw, a topic with 1 partition and 1 replica can be deleted successfully.
You can set a server property to enable deletion of Kafka topics.
Add the line mentioned below in server.properties:
delete.topic.enable=true
If you are removing the topic manually using rmr /brokers/topics/MY_TOPIC, then you also need to remove the topic-related metadata from other nodes in ZooKeeper, e.g. the consumer information about that topic. You also need to remove the topic's directory on the Kafka server.
It is cleaner to enable the topic delete property and execute kafka-topics.sh --delete.
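Put together, a clean delete could look like this (the ZooKeeper address and topic name are illustrative). First set delete.topic.enable=true in server.properties on every broker and restart the brokers, then:
bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic MY_TOPIC
bin/kafka-topics.sh --zookeeper localhost:2181 --list
The second command should no longer show MY_TOPIC once the brokers have completed the (asynchronous) deletion.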