Kafka Static IP and Service Discovery - apache-kafka

I have a three-node Kafka cluster that also has a three-node Zookeeper ensemble managing it. My configuration for this cluster looks like
Node 1
IP -
Kafka Port - 9092
Zookeeper Port - 2181
Node 2
IP -
Kafka Port - 9092
Zookeeper Port - 2181
Node 3
IP -
Kafka Port - 9092
Zookeeper Port - 2181
For each of these nodes I have both the Zookeeper and Kafka configuration files. My sample Zookeeper config file looks like
# Zookeeper server config
since each Zookeeper instance needs to know about each other Zookeeper instance and generally from what I have seen, even when managing massive Kafka clusters, there is usually less than 10 Zookeeper nodes. So here we would only need to keep track of 10 IPs. Also from my understanding, these IPs are not as volatile and usually do not change often if ever.
For my Kafka configuration file I have the following on each node
# Kafka server properties file
broker.id=<ID for this node>
listeners=PLAINTEXT://<IP of this node>:9092
Now it makes sense to me that each Kafka node we introduce into our cluster has to be aware of all the Zookeeper nodes so it can be managed. But the issue for me is that as we scale the Kafka nodes up or down, we are less certain about their IPs. For example, if I wanted to create a new Kafka topic, I would use the kafka-topics.sh shell file that they provide and type something like
kafka-topics.sh --create --topic MyTopic --bootstrap-server <IP of one of the Kafka nodes>
# Could also use the broker-list option instead of bootstrap-server to allow multiple IPs
The problem for me is, we never know which Kafka IPs are up and running, so passing the IPs to --bootstrap-server seems like a guessing game, or I need to manually check a working node for its IP.
So for Kafka, how do I configure a static IP (maybe virtual IP?) so that other services that use my Kafka cluster can always connect to it? How do I perform service discovery for a cluster with changing IPs?

there is usually less than 10 Zookeeper nodes
According to Kafka: The Definitive Guide, 7 is generally the max size of a Zookeeper cluster for large Kafka clusters. Personally, I've not seen more than 5 on a Kafka cluster serving millions of events a day...
You could make a DNS record that resolves to the healthy instances
However, if IPs aren't static, then clients, in general, would have issues because partition leaders are hosted by IP and broker ID. If an ID moves to a new IP or an IP no longer resolves to a (healthy) Kafka broker, your clients start experiencing errors
Note: both --bootstrap-server and --broker-list accept multiple addresses, but only the console producer uses the --broker-list parameter.
There are also other ways to create topics, such as Terraform where you could statically store the Kafka addresses as a variable in source code and rarely ever change it. In particular, you don't need to list every IP each time you use a Kafka client, only a handful
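To make the DNS suggestion above concrete, here is a minimal Python sketch that expands one DNS name into a bootstrap list. The record name `kafka.example.internal`, the function name, and the injectable `resolver` parameter are assumptions made for illustration; they are not part of any Kafka tooling, and the sketch only handles IPv4 addresses.

```python
import socket


def bootstrap_servers(dns_name, port=9092, resolver=socket.getaddrinfo):
    """Expand one DNS name into a comma-separated host:port list.

    `resolver` defaults to socket.getaddrinfo. It is injectable only so the
    function can be exercised without a real DNS record (an assumption of
    this sketch, not something Kafka requires).
    """
    infos = resolver(dns_name, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the resolved IPv4 address.
    addresses = sorted({info[4][0] for info in infos})
    return ",".join(f"{address}:{port}" for address in addresses)
```

The resulting string could be passed to --bootstrap-server. As noted above, a client only needs a handful of reachable brokers to bootstrap, so the record does not have to list every node.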


Kafka broker setup

To connect to a Kafka cluster I've been provided with a set of bootstrap servers with name and port:
Kafka and Zookeeper are running on the instance s4. From reading https://jaceklaskowski.gitbooks.io/apache-kafka/content/kafka-properties-bootstrap-servers.html, it states:
bootstrap server is a comma-separated list of host and port pairs that
are the addresses of the Kafka brokers in a "bootstrap" Kafka cluster
that a Kafka client connects to initially to bootstrap itself.
I reference the above bootstrap server definition as I'm trying to understand the relationship between the kafka brokers s1,s2,s3 and kafka,zookeeper running on s4.
To connect to the Kafka cluster, I set the broker to a CSV list of 's1,s2,s3'. When I send messages to the CSV list of brokers, to verify the messages are added to the topic, I ssh onto the s4 box and view the messages on the topic.
What is the link between the Kafka brokers s1,s2,s3 and s4? I cannot ssh onto any of the brokers s1,s2,s3 as these brokers do not seem accessible using ssh, should s1,s2,s3 be accessible?
The individual responsible for the setup of the Kafka box is no longer available, and I'm confused as to how this configuration works. I've searched for config references of the brokers s1,s2,s3 on s4 but there does not appear to be any configuration.
When Kafka is being set up and configured what allows the linking between the brokers (in this case s1,s2,s3) and s4?
I start Kafka and Zookeeper on the same server, s4.
Should Kafka and Zookeeper also be running on s1,s2,s3?
What is the link between the Kafka brokers s1,s2,s3 and s4?
As per the Kafka documentation about adding nodes to a cluster, each server must share the same zookeeper.connect string and have a unique broker.id to be part of the cluster.
You may check which nodes are in the cluster via zookeeper-shell with an ls /brokers/ids, or via the Kafka AdminClient API, or kafkacat -L
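The membership rule above (same zookeeper.connect, unique broker.id) can be checked offline if you can get hold of each broker's server.properties. The following is a minimal Python sketch under that assumption; the helper names are hypothetical and the parser only handles plain key=value lines, not the full Java properties syntax.

```python
def parse_properties(text):
    """Parse the simple key=value subset of a Java .properties file."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def cluster_problems(configs):
    """Return a list of problems found across several brokers' properties."""
    problems = []
    parsed = [parse_properties(config) for config in configs]
    # Every broker in one cluster must point at the same Zookeeper path.
    if len({p.get("zookeeper.connect") for p in parsed}) != 1:
        problems.append("zookeeper.connect differs between brokers")
    # Every broker must register under its own ID.
    ids = [p.get("broker.id") for p in parsed]
    if len(ids) != len(set(ids)):
        problems.append("broker.id is not unique")
    return problems
```

An empty list means the files are consistent with being one cluster. It does not prove the brokers are actually running; the zookeeper-shell or kafkacat checks above are still the way to confirm that.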
should s1,s2,s3 be accessible?
Via SSH? They don't have to be.
They should respond to TCP connections from your Kafka client machines on their Kafka server ports, though
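A quick way to test that kind of reachability from a client machine is a plain TCP connect. This is a minimal Python sketch; the function name and the default timeout are assumptions, and a successful connect only shows that the port is open, not that the broker is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For the setup in the question you would call it once per bootstrap server, for example `can_connect("s1", 9092)`, where 9092 stands in for whatever port you were given.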
Should Kafka and Zookeeper also be running on s1,s2,s3?
You should not have 4 Zookeeper servers in a cluster (odd numbers, only)
Otherwise, you've at least been given some ports for Kafka on those machines, therefore Kafka should be running on them.

multiple kafka clusters on single zookeeper ensemble

I currently have a 3 node Kafka cluster which connects to base chroot path in my zookeeper ensemble.
Now, I want to add a new 5 node Kafka cluster which will connect to some other chroot path in the same zookeeper ensemble.
Will these configurations work as in the relative paths for the two chroots? I understand that the original Kafka cluster should have been connected on some path other than the base chroot path for better isolation.
Also, is it good to have same zookeeper ensemble across Kafka clusters? The documentation says that it is generally better to have isolated zookeeper ensembles for different clusters.
If you're only limited to a single Zookeeper cluster, then it should work out fine with a unique chroot that doesn't collide with the other cluster's znodes.
It is not "good" to share, no, because Zookeeper losing quorum causes two clusters to be down, but again if you're limited on hardware, then it'll still work
Note: You can only afford to lose one ZK server with 3 nodes in the cluster, which is why a cluster of 5 is recommended
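The arithmetic behind that note is the majority rule: an ensemble stays available only while more than half of its servers are up. A minimal Python sketch (the function name is hypothetical):

```python
def tolerated_failures(ensemble_size):
    """Servers that can fail while a strict majority (quorum) remains."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    quorum = ensemble_size // 2 + 1
    return ensemble_size - quorum
```

With this rule, 3 servers tolerate one failure, 5 tolerate two, and 4 still tolerate only one, which is why even-sized ensembles add cost without adding fault tolerance.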

How to retrieve zookeeper host details from Kafka brokerslist

I have list of Brokers for my Kafka cluster. How can I get the zookeeper host using Brokerslist?
If I got your question right you want to register your brokers at a zookeeper cluster. This actually works the other way round: You have to tell each broker where your zookeeper-server (or cluster) can be found. Have a look at the broker configuration setting zookeeper.connect. Together with the broker.id it will register each broker at the zookeeper cluster.
Hope that answers your question.
You cannot.
Zookeeper is intended to be abstracted away. There is no such API or method to get Zookeepers connected to a broker.
You'll need to SSH to a broker in that list (which you could do from Java)

kafka with multiple zookeeper config

A bit confused about clustering setup:
Zookeeper could be setup as a cluster by configuring myid (1,2,3...) in the file and having for example zookeeper1:2888:3888, zookeeper2:2889:3889 in the zoo.cfg file
For Kafka, in the server.properties file, is it must to specify the full list of zookeeper server for parameter zookeeper.connect, or just 1 is enough? Is there any differences?
I've seen practices of specifying the full list of zookeeper server even when creating a topic, e.g. /opt/kafka/bin/kafka-topics.sh --create --zookeeper x.x.x.x:2181,x.x.x.x:2181,x.x.x.x:2181 --replication-factor 1 --partitions 1 --topic sample_test
---Production and DR setup (large latency is expected between production and dr)---
Let's say, having 1 Kafka (kafka1) and 1 zookeeper server (zookeeper1) in production, 1 kafka (kafka2) and 1 zookeeper server (zookeeper2) in DR, and form those 2 zookeepers into a cluster;
running uReplicator to replicate data in production to DR; from uReplicator example, it seems the configuration is like: kafka1 (in production) is connecting to "zookeeper1:2181/cluster1", and kafka2 (in DR) is connecting to "zookeeper1:2181/cluster2", what's the meaning of "/cluster1", "/cluster2"? what's the right config for this scenario, what's the idea of having kafka2 in DR connects to zookeeper1 in prod?
is it must to specify the full list of zookeeper server for parameter zookeeper.connect
It is good practice to put at least 3 or 5. If you only put one, and that goes down, Kafka will likely not work as expected, or fail out.
in DR, and form those 2 zookeepers into a cluster
It's not generally encouraged to share Zookeeper clusters between Kafka clusters, as Kafka puts a reasonable amount of load on Zookeeper for high volume Kafka clusters.
Though, as you point out
connecting to "zookeeper1:2181/cluster1", and kafka2 (in DR) is connecting to "zookeeper1:2181/cluster2", what's the meaning of "/cluster1", "/cluster2"?
This is called a Chroot in Zookeeper. Think of it like a directory, or namespace for each unique Kafka cluster within the Zookeeper cluster.
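To show how the chroot sits inside the connection string, here is a minimal Python sketch that splits a zookeeper.connect value into its server list and chroot path. The function name is hypothetical, and the sketch assumes the usual layout where a single chroot is appended after the last host:port.

```python
def split_zookeeper_connect(connect):
    """Split 'host1:2181,host2:2181/chroot' into (hosts, chroot)."""
    hosts_part, slash, path = connect.partition("/")
    hosts = [host.strip() for host in hosts_part.split(",") if host.strip()]
    # No path (or a bare trailing slash) means the root of the Zookeeper tree.
    chroot = "/" + path if slash and path else "/"
    return hosts, chroot
```

For "zookeeper1:2181/cluster1" this gives the single server zookeeper1:2181 and the chroot /cluster1, so the brokers of that Kafka cluster keep all of their znodes under /cluster1.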
what's the idea of having kafka2 in DR connects to zookeeper1 in prod?
Well, you wouldn't. If Kafka2 has its own unique topic data that is not being replicated to Kafka1, then pointing at the Zookeeper data that says those topics existed on Kafka2, but not Kafka1 will only result in confusion and error.
Also, I am not familiar with how uReplicator works, only MirrorMaker, but you'll also want to prepare a DR strategy for Zookeeper, not only Kafka
You have two questions in there. I'll try to tackle the first one at least:
Specifying only one zookeeper server:port is usually enough, but in production instances/properties, you always want to configure all of them. If one of the servers is down, but the cluster is still up and running (say, 2 out of 3 Zookeeper servers are up), Kafka will try the next server in the config, until it finds one it can talk to. However, if the only one you chose to put happens to be down at that exact time, the server won't be able to talk to Zookeeper at all. It's best to always include the entire list of zookeeper servers in configuration.

I can't produce data to Kafka when using the script, but I can list topics with the script

everybody,there is a virtual server in the local area network which ip is, and my machine ip is
Today, I tried to use my machine to send some messages to my virtual server with the Kafka console producer
$ bin/kafka-console-producer.sh --broker-list --topic test
but there is something wrong. The description of the problem is :
[2017-04-10 17:25:40,396] ERROR Error when sending message to topic test with key: null, value: 6 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for test-0 due to 1568 ms has passed since batch creation plus linger time
But when I use the kafka-topics script to list topics, it works:
$ bin/kafka-topics.sh --list --zookeeper
This problem has confused me for a long time. Can anybody help me solve it?
If you have a Zookeeper instance running, you can of course ask for the list of topics. However, it seems that you have no Kafka broker available.
You may have Zookeeper running but not Kafka.
Your Kafka producer might be running on a machine which cannot access your virtual machine where your Kafka broker is running.
Also, not only should the broker port be open, it must also be answered by the broker, i.e., the (advertised) listeners of your Kafka broker must contain your virtual machine's IP (an IP accessible from where your Kafka producer is running, because a VM can have multiple IPs and there is no rule that all of them will be accessible).
For example, if your virtual machine has two IPs and your producer on another machine points to one of them, you must first be able to ping and telnet to that IP.
Next, you must have this IP in the advertised listeners in your Kafka broker.
You should set this IP only as your bootstrap.servers in your Kafka Producer.
You should also ensure that the port is not open only to localhost. For example, when you run netstat, it should not show just localhost:9092; the port should be bound to any local address or to your IP.
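The listener advice above can be turned into a small check on the configured value. This is a minimal Python sketch, assuming IPv4 literals or hostnames and a value of the form PLAINTEXT://host:port; the function names and the sample addresses in the usage note are made up for illustration.

```python
import ipaddress


def listener_host(listener):
    """Extract the host from a listener such as 'PLAINTEXT://10.0.0.5:9092'."""
    _, _, address = listener.partition("://")
    host, _, _ = address.rpartition(":")
    return host


def is_loopback_only(listener):
    """Return True if the listener host can only be reached from the broker itself."""
    host = listener_host(listener)
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (a hostname or an empty host), so the string
        # alone cannot tell us; treat it as not provably loopback.
        return False
```

A value such as PLAINTEXT://localhost:9092 is flagged, while PLAINTEXT://192.168.1.20:9092 is not. The check only reads the string; it does not verify that the address is reachable from the producer's machine.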