Find broker id used in the Kafka cluster - apache-kafka

I want to know the list of broker ids in use in a Kafka cluster. For example, in a cluster with 10 nodes, if I create a topic with 10 (or more) partitions, I can see the brokers it has been assigned to in the output of a describe-topic command:
./bin/kafka-topics --describe --zookeeper <zkconnect>:2181 --topic rbtest3
Can I collect this information without creating a topic?

You can get the list of broker ids in use with the Zookeeper CLI:
zookeeper-3.4.8$ ./bin/zkCli.sh -server zookeeper-1:2181 ls /brokers/ids | tail -1
[0]

You can also use the zookeeper-shell.sh script that ships with the Kafka distribution, like this:
linux$ ./zookeeper-shell.sh zookeeper-IPaddress:2181 <<< "ls /brokers/ids"
Just add the IP address of any of your Zookeeper servers (and/or change the port if necessary, for example when running multiple Zookeeper instances on the same server).
This alternative can be useful when, for example, you find yourself inside a container (Docker, LXC, etc.) that is exclusively running a Kafka client, while Zookeeper itself is somewhere else (say, in a different container).
I hope it helps. =:)

#kafka broker id
cat $KAFKA_HOME/logs/meta.properties
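If you would rather read the id from a script than eyeball the file, here is a minimal Python sketch. It assumes the file is a plain key=value properties file with a broker.id line, as written by Zookeeper-based brokers; the function name and the sample contents are made up for illustration.

```python
def read_broker_id(properties_text):
    """Return the broker.id value from meta.properties-style text, or None."""
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments (properties files use '#' or '!').
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "broker.id":
            return int(value.strip())
    return None


# Illustrative contents only; the values are invented.
SAMPLE = """#
#Thu Feb 18 13:20:22 UTC 2021
version=0
broker.id=1003
"""

if __name__ == "__main__":
    print(read_broker_id(SAMPLE))
```

To use it on a real broker, pass in the contents of your own meta.properties file (the path depends on your installation).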

If you want to know the broker ID of a specific broker, the easiest way I found is to look at its controller.log:
cat /var/log/kafka/controller.log
[2021-02-18 13:20:22,639] INFO [ControllerEventThread controllerId=1003] Starting (kafka.controller.ControllerEventManager$ControllerEventThread)
[2021-02-18 13:20:22,646] DEBUG [Controller id=1003] Broker 1002 has been elected as the controller, so stopping the election process. (kafka.controller.KafkaController)
controllerId=1003 ---> this is your brokerID (1003)
[substitute your path to the kafka logs, of course ...]

You can use Kafka Manager, an open-source tool from Yahoo.

You can run the following command:
bin/kafka-run-class.sh kafka.tools.GetOffsetShell --topic=<your topic> --broker-list=<your broker list> --time=-2
This lists each partition of the topic with its beginning offset (topic:partition:offset); note that the output identifies partitions rather than broker ids.

Using the Zookeeper CLI
sh /bin/zkCli.sh -server zookeeper-1:2181 ls /brokers/ids
Then, to get the details of a broker, use "get" with one of the ids returned by the previous command:
sh /bin/zkCli.sh -server zookeeper-1:2181 get /brokers/ids/<broker-id>
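The value stored under /brokers/ids/<broker-id> is a JSON document, so it is easy to pick apart in a script. Below is a rough Python sketch, assuming the registration contains host, port and endpoints fields (this is what Zookeeper-based brokers typically write, but check your own output). The sample string and the host name kafka-1 are invented, and the function expects only the JSON line, not the znode stat lines that some zkCli versions print after it.

```python
import json


def broker_endpoint(znode_json):
    """Pick the connection details out of a broker registration JSON string."""
    info = json.loads(znode_json)
    return {
        "host": info.get("host"),
        "port": info.get("port"),
        "endpoints": info.get("endpoints", []),
    }


# Illustrative registration; field values are made up.
SAMPLE_ZNODE = (
    '{"endpoints":["PLAINTEXT://kafka-1:9092"],'
    '"host":"kafka-1","port":9092,"version":4}'
)

if __name__ == "__main__":
    print(broker_endpoint(SAMPLE_ZNODE))
```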

Related

Get total partition count in each Kafka broker

I would like to calculate the number of partitions on each of my brokers. We have a multi-DC distributed architecture and would like to get the partition count per broker for maintenance and admin tasks.
The following was suggested in a blog post and works fine, but it counts at the cluster level; I need a similar script that counts per broker:
zookeeper="ZK_SERVER1:2181,ZK_SERVER2:2181,ZK_SERVER3:2181"
sum=0
for i in $(/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --list --zookeeper "$zookeeper"); do
  count=$(/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper "$zookeeper" --topic "$i" | grep -c Leader)
  sum=$((sum + count)); echo "total partitions is $sum"
done
Partition count is exposed as a JMX MBean.
Install an agent such as Prometheus JMX Exporter, Datadog, New Relic, etc. on each broker, then collect and aggregate that information, adding tags for DC for further grouping as necessary.
Otherwise, I don't see why you couldn't add another loop to your script for a list of different Zookeeper endpoints for each Kafka cluster.
To get per-broker counts from that script, you need to parse the describe output (the Leader and Replicas fields) rather than just count lines.
You could also use the Admin interface: list the topics, describe their metadata (which should contain the hosting broker IDs), then describe the cluster and match the IDs.
This is more or less what kafka-topics does underneath with different commands, as it's just a wrapper around the underlying Java application.
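As a rough illustration of the parsing approach, here is a Python sketch that takes the text printed by kafka-topics --describe and counts partitions per broker, both as leader and as replica. It assumes each partition line contains "Leader: <id>" and "Replicas: <id>,<id>,..." (older versions print the same fields without the space after the colon, which the patterns also accept). The function name and the sample output are made up for the example.

```python
import re
from collections import Counter

LEADER_RE = re.compile(r"Leader:\s*(-?\d+)")
REPLICAS_RE = re.compile(r"Replicas:\s*([\d,]+)")


def count_partitions_per_broker(describe_output):
    """Return (leader_counts, replica_counts) keyed by broker id."""
    leaders = Counter()
    replicas = Counter()
    for line in describe_output.splitlines():
        leader = LEADER_RE.search(line)
        if not leader:
            continue  # topic summary line, not a partition line
        # A leader of -1 (no leader) is counted under the key -1.
        leaders[int(leader.group(1))] += 1
        hosted = REPLICAS_RE.search(line)
        if hosted:
            for broker_id in hosted.group(1).split(","):
                if broker_id:
                    replicas[int(broker_id)] += 1
    return leaders, replicas


# Illustrative describe output for a hypothetical 3-partition topic.
SAMPLE = """Topic: rbtest3  PartitionCount: 3  ReplicationFactor: 2  Configs:
    Topic: rbtest3  Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1,2
    Topic: rbtest3  Partition: 1  Leader: 2  Replicas: 2,3  Isr: 2,3
    Topic: rbtest3  Partition: 2  Leader: 3  Replicas: 3,1  Isr: 3,1
"""

if __name__ == "__main__":
    leader_counts, replica_counts = count_partitions_per_broker(SAMPLE)
    for broker_id in sorted(replica_counts):
        print(broker_id, leader_counts[broker_id], replica_counts[broker_id])
```

You would feed it the combined describe output of all topics, for example by piping the command output into a small wrapper that reads standard input.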

How to understand which partition replica kafka broker is down

There's a topic with 22 replicas, 50 partitions, and 22 running Kafka brokers.
The topic's manual assignment screen in Kafka Manager shows "Broker Down" in all topic partitions, as seen in the image.
How can I determine which Kafka broker is down using the CLI or Kafka Manager?
Currently, I look for the broker id that is missing from the partition replicas.
This information on brokers is also maintained in Zookeeper, so you could go onto one of the Zookeeper nodes and use the CLI to extract it. Here's a command sequence you could use:
On the command line, issue the command zookeeper-client; this should open the Zookeeper command prompt.
At the new prompt, issue the command ls /brokers/ids; this should return the ids of all the active brokers.
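Once you have that list, finding the broker that is down is a set difference against the ids you expect. A small Python sketch, assuming the CLI prints the ids as a bracketed list such as [1, 2, 4] on a line of its own; the function names and the sample session text are hypothetical.

```python
import re

# Matches a line that is only a bracketed list of integers, e.g. "[1, 2, 4]" or "[]".
LIST_RE = re.compile(r"\[\s*((?:\d+\s*,\s*)*\d+)?\s*\]")


def parse_broker_ids(ls_output):
    """Return the ids from the last line that looks like '[1, 2, 4]'."""
    ids = set()
    for line in ls_output.splitlines():
        match = LIST_RE.fullmatch(line.strip())
        if match:
            ids = {int(tok) for tok in (match.group(1) or "").split(",") if tok.strip()}
    return ids


def missing_brokers(expected_ids, ls_output):
    """Return the expected broker ids that are not registered, sorted."""
    return sorted(set(expected_ids) - parse_broker_ids(ls_output))


# Illustrative session output; prompt lines are ignored by the parser.
SAMPLE = """[zk: localhost:2181(CONNECTED) 0] ls /brokers/ids
[1, 2, 4]
"""

if __name__ == "__main__":
    print(missing_brokers(range(1, 5), SAMPLE))
```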

Kafka Topic Creation with --bootstrap-server gives timeout Exception (kafka version 2.5)

When trying to create a topic using --bootstrap-server, I am getting the exception "Error while executing Kafka topic command: Timed out waiting for a node":
kafka-topics --bootstrap-server localhost:9092 --topic boottopic --replication-factor 3 --partitions
However, the following works fine using --zookeeper:
kafka-topics --zookeeper localhost:2181 --topic boottopic --replication-factor 3 --partitions
I am using Kafka version 2.5 and, as far as I know, since version >2.2, all the offsets and metadata are stored on the broker itself. So, while creating a topic, there's no need to connect to Zookeeper.
Please help me understand this behaviour.
Note - I have set up a Zookeeper quorum and Kafka broker cluster each containing 3 instance on a single machine (for dev purposes)
Old question, but I'll answer anyway for the sake of internet wisdom.
You probably have auth set; when using --bootstrap-server, you also need to specify your credentials with --command-config.
"since version >2.2, all the ... metadata are stored on the broker itself"
False. Topic metadata is still stored on Zookeeper until KIP-500 is completed.
However, the AdminClient.createTopics() method that is used internally will delegate to Zookeeper from the Controller broker node in the cluster.
Hard to say what the error is, but the most common issues are that Kafka is not running, that you have SSL enabled and the certs are wrong, or that the listeners are misconfigured.
For example, in the listeners, the default broker port on a Hortonworks (HDP) Kafka installation would be 6667, not 9092.
"each containing 3 instance on a single machine"
Running 3 instances on one machine does not improve resiliency or performance unless you have 3 CPUs and 3 separate HDDs on that one motherboard.
"Error while executing Kafka topic command: Timed out waiting for a node"
This seems like your broker is down or is inaccessible from where you are running those commands or it hasn't started yet (perhaps still starting).
Sometimes the broker startup takes a long time because it performs some cleaning operations. You may want to check your Kafka broker startup logs to see whether it is ready, and then try creating the topics by passing the bootstrap servers.
There could also be some errors during your Kafka broker startup like Too many open files or wrong zookeeper url, zookeeper not being accessible by your broker, to name a few.
Being able to create topics by passing in your Zookeeper URL means that Zookeeper is up, but it does not necessarily mean that your Kafka broker(s) are also up and running, since Zookeeper can start without a broker but not vice versa.
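A quick way to tell "broker not reachable" apart from other causes is to test the TCP port from the machine where you run the command. Here is a minimal Python sketch using only the standard library; the function name is made up, and localhost:9092 is simply the address from the question. Note that a successful connect only shows that something is listening there; it says nothing about SSL, auth, or the advertised listeners.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Address taken from the question; adjust to your own bootstrap server.
    print(port_is_open("localhost", 9092))
```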

Kafka Consumer: No entry found for connection

I am trying to check the kafka consumer by consuming the data from a topic on a remote Kafka cluster. I am getting the following error when I use the kafka-console-consumer.sh:
ERROR Error processing message, terminating consumer process: (kafka.tools.ConsoleConsumer$)
java.lang.IllegalStateException: No entry found for connection 2147475658
at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:330)
at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:134)
at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:885)
at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:276)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.tryConnect(ConsumerNetworkClient.java:548)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$FindCoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:655)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$FindCoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:635)
at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:204)
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:575)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:389)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:297)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:231)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:316)
at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1214)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1179)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1164)
at kafka.tools.ConsoleConsumer$ConsumerWrapper.receive(ConsoleConsumer.scala:436)
at kafka.tools.ConsoleConsumer$.process(ConsoleConsumer.scala:104)
at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:76)
at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:54)
at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
Processed a total of 0 messages
Here is the command that I use:
./bin/kafka-console-consumer.sh --bootstrap-server SSL://{IP}:{PORT},SSL://{IP}:{PORT},SSL://{IP}:{PORT} --consumer.config ./config/consumer.properties --topic MYTOPIC --group MYGROUP
Here is the ./config/consumer.properties file:
bootstrap.servers=SSL://{IP}:{PORT},SSL://{IP}:{PORT},SSL://{IP}:{PORT}
# consumer group id
group.id=MYGROUP
# What to do when there is no initial offset in Kafka or if the current
# offset does not exist any more on the server: latest, earliest, none
auto.offset.reset=earliest
#### Security
security.protocol=SSL
ssl.key.password=test1234
ssl.keystore.location=/opt/kafka/config/certs/keystore.jks
ssl.keystore.password=test1234
ssl.truststore.location=/opt/kafka/config/certs/truststore.jks
ssl.truststore.password=test1234
Do you have any idea what the problem is?
I have found the problem. It was a DNS problem in the end. I was reaching the Kafka brokers by their IP addresses, but the brokers reply with their DNS names. After setting up the DNS names on the consumer side, it started working again.
I had this problem (with consumers and producers) when running Kafka and Zookeeper as Docker containers.
The solution was to set advertised.listeners in the config/server.properties file of the Kafka brokers, so that it contains the IP address of the container, e.g.
advertised.listeners=PLAINTEXT://172.15.0.8:9092
See https://github.com/maxant/kafkaplayground/blob/master/start-kafka.sh for an example of a script used to start Kafka inside the container after setting up the properties file correctly.
It seems the Kafka cluster listener property is not configured in server.properties.
In the remote kafka cluster, this property should be uncommented with the proper host name.
listeners=PLAINTEXT://0.0.0.0:9092
In my case, I was getting this error while trying to connect to my Kafka container. I had to pass the following:
-e KAFKA_LISTENERS=PLAINTEXT://0.0.0.0:9092
-e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092
Hope it helps someone
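Since several answers here come down to what the broker advertises, it can help to print the host and port that clients will actually be told to connect to. The Python sketch below splits a listeners or advertised.listeners value into (name, host, port) tuples. It assumes the usual NAME://host:port,NAME://host:port form; the function name and the second example value (SASL_SSL://broker-1:9093) are made up.

```python
def parse_listeners(value):
    """Split 'NAME://host:port,...' into a list of (name, host, port) tuples."""
    result = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, address = entry.partition("://")
        # rpartition keeps any colons inside the host part intact.
        host, colon, port = address.rpartition(":")
        if not sep or not colon:
            raise ValueError(f"unexpected listener format: {entry!r}")
        result.append((name, host, int(port)))
    return result


if __name__ == "__main__":
    for name, host, port in parse_listeners("PLAINTEXT://172.15.0.8:9092"):
        # The client must be able to resolve and reach this host and port.
        print(name, host, port)
```

If the host printed here is a name or address your client machine cannot resolve or reach (for example localhost, when the client runs on a different machine), the consumer will fail after the initial bootstrap connection.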
Are you sure the remote Kafka is running? I would suggest running nmap -p PORT HOST to verify that the port is open (unless it is configured differently, the port should be 9092). If that is OK, you can use kafkacat, which makes things easier: create a consumer by running kafkacat -b HOST:PORT -t YOUR_TOPIC -C -o beginning, or create a producer by running kafkacat -b HOST:PORT -t YOUR_TOPIC -P.

kafka-topics.sh --describe doesn't return anything

I am running a Kafka cluster composed of 3 nodes.
One of the nodes crashed and it has been behaving oddly since then...
The following does not return anything on the malfunctioning node:
kafka-topics.sh --describe --zookeeper mynode01:2181
However, querying the topics on the other nodes returns the expected topics.
Another thing I saw is that zookeeper seems to be missing some directories:
./zkCli.sh -server mynode01
[zk: localhost:2181(CONNECTED) 1] ls /
[controller, zookeeper]
Whereas if I check any other node it comes back with:
[zk: localhost:2181(CONNECTED) 0] ls /
[isr_change_notification, zookeeper, admin, consumers, config, controller, brokers]
The logs report the following entry:
Error for partition [myqueue-1,0] to broker 1:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
I tried a couple of things already to sort this out, with no joy:
Restart the kafka cluster, so that other node becomes leader.
Assign a different leader for the topics affected by running ./kafka-reassign-partitions.sh
Stop kafka and zookeeper services on the affected node, remove kafka-logs and zkdata and start them back up.
Although the cluster seems to be able to treat this node as any other and switch the roles of leader/follower with no issues... it looks like it got out of sync at some point and is not able to recover itself.
Any idea?
Thanks in advance
I was able to solve the issue by stopping the zookeeper and kafka services on the affected node and removing the snapshots in the zkdata directory and the associated transaction logs in the zklog directory.
After starting zookeeper back up on the affected node, the missing znodes were re-synced.