Kafka: org.apache.zookeeper.KeeperException$NoNodeException while creating topic on multi server setup - apache-kafka

I am trying to setup multi node Kafka-0.8.2.2 cluster with 1 Producer, 1 consumer and 3 brokers all on different machines.
While creating topic on producer, I am getting error as org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/ids. Complete console output is available here. There is no error in Kafka Producer's log.
Command I am using to run Kafka is:
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic edwintest
Note: Zookeeper service is running on all the servers, and all three brokers are having Kafka servers running on them (Only Brokers need Kafka Server. Right?).
Configuration of my producer.properties is as:
metadata.broker.list=<IP.OF.BROKER.1>:9092,<IP.OF.BROKER.2>:9092,<IP.OF.BROKER.3>:9092
producer.type=sync
compression.codec=none
serializer.class=kafka.serializer.DefaultEncoder
Below are some of the many articles I was using as reference:
Zookeeper & Kafka Install : A single node and a multiple broker cluster - 2016
Step by Step of Installing Apache Kafka and Communicating with Spark

At the very first glance it seems like you're calling create topic to a local zookeeper which is not aware of any of your kafka-brookers. You should call ./bin/kafka-topics.sh --create --zookeeper <IP.OF.BROKER.1>:2181

The issue was because I was trying to connect to the zookeeper of localhost. My understanding was zookeeper is needed to be running on producer, consumer, and the Kafka brokers, and the communication is done between producer -> broker and broker -> consumer via zookeeper. But that was incorrect. Actually:
Zookeeper and Kafka servers should be running only on on broker servers. While creating the topic or publishing the content to the topic, public DNS of any of the Kafka broker should be passed with --zookeeper option. There is no need to run Kafka server on producer or consumer instance.
Correct command will be:
./bin/kafka-topics.sh --create --zookeeper <Public-DNS>:<PORT> --replication-factor 1 --partitions 3 --topic edwintest
where: Public-DNS is the DNS of any of the Kafka broker and PORT is the port of the zookeeper service.

Related

Not able to read messages from kafka consumer in kafka cluster setup

I have created two kafka brokers in a kafka cluster. When one broker is down I am not able to get any data to kafka consumer.
I am using this command to read messages from consumer:
bin/kafka-console-consumer.sh --topic test_kafka_cluster \
--bootstrap-server 127.0.0.1:9092,127.0.0.2:9092 --from-beginning
Here as per your console consumer configuration, IP address used here are 127.0.0.1 and 127.0.0.2 and two bootstrap servers are configured as 9092.
Verify both the ip's are reachable
bin/kafka-console-consumer.sh --topic test_kafka_cluster \
--bootstrap-server 127.0.0.1:9092,127.0.0.2:9092 --from-beginning
Ideally when we run tow kafka broker instance it will be running in two different ports.
Presuming Kafka is running in local
Eg: localhost:9092 localhost:9093
Kafka instances running on two different host:
Eg: 127.0.0.3:9092, 127.0.0.2:9092
If Kafka is running on docker/docker toolbox:
Console consumer on Docker toolbox:
docker exec <container-name> kafka-console-consumer --bootstrap-server 192.168.99.100:9093 --topic <topic-name> --from-beginning
Console consumer on Docker:
docker exec <container-name> kafka-console-consumer --bootstrap-server localhost:9093 localhost 9092 --topic <topic-name> --from-beginning
There are two parameters that affect topics availability for a consumer:
min.insync.replicas (minISR): Minimum number of in-sync partition's replicas.
replication.factor (RF): Total number of partition's replicas.
If you want your consumer to survive a broker outage, then you must have RF > minISR. With RF=2 and minISR=2 you can't tolerate any broker down, with RF=3 and minISR=2 you can tolerate 1 broker down, with RF=5 and minISR=2 you can tolerate 3 brokers down, and so on.
Note that the internal __consumer_offsets topic is used to store consumer offsets and has a default RF value of 3, which is not achievable in your cluster of 2 nodes. So you also need to set offsets.topic.replication.factor=1 at the cluster level.

Kafka:why topic creation in kafka is done in zookeper host instead of broker

I am trying to learn Kafka,
This is the command I saw for topic creation.
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partition 2 --topic test
why do we pass zookeeper host,port for topic creation, instead to broker(s) information?
what does zookeeper do with topic?
Does topic info is persisted in zookeeper
Which Kafka version are you using? In the latest version, the --zookeeper parameter is deprecated and you can use the --bootstrap-server instead, providing Kafka broker(s) address(s).
Anyway, the topic information will be still stored in Zookeeper. This is the way how Kafka works.
The fact that --zookeeper is deprecated is because the development is going in the direction of making clients less aware of Zookeeper and doing all the operations connecting to the Kafka brokers; it's the broker doing the operation on Zookeeper then.

Consuming and Producing Kafka messages on different servers

How can I produce and consume messages from different servers?
I tried the Quickstart tutorial, but there is no instructions on how to setup for multi server clusters.
My Steps
Server A
1)bin/zookeeper-server-start.sh config/zookeeper.properties
2)bin/kafka-server-start.sh config/server.properties
3)bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor
1 --partitions 1 --topic test
4)bin/kafka-console-producer.sh --broker-list SERVER-a.IP:9092 --topic test
Server B
1A)bin/kafka-console-consumer.sh --bootstrap-server SERVER-a.IP:9092 --topic
test --from-beginning
1B)bin/kafka-console-consumer.sh --bootstrap-server SERVER-a.IP:2181 --topic
test --from-beginning
When I run 1A) consumer and enter messages into the producer, there is no messages appearing in the consumer. Its just blank.
When I run 1B consumer instead, I get a huge & very fast stream of error logs in Server A until I Ctrl+C the consumer. See below
Error log on Server A streaming at hundreds per second
WARN Exception causing close of session 0x0 due to java.io.EOFException (org.apache.zookeeper.server.NIOServerCnxn)
O Closed socket connection for client /188.166.178.40:51168 (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)
Thanks
Yes, if you want to have your producer on Server A and your consumer on server B, you are in the right direction.
You need to run a Broker on server A to make it work.
bin/kafka-server-start.sh config/server.properties
The other commands are correct.
If anyone is looking for a similar topic for kafka-steams application, it appears that multiple kafka cluster is not supported yet:
Here is a documentation from kafka: https://kafka.apache.org/10/documentation/streams/developer-guide/config-streams.html#bootstrap-servers
bootstrap.servers
(Required) The Kafka bootstrap servers. This is the same setting that is used by the underlying producer and consumer clients to connect to the Kafka cluster. Example: "kafka-broker1:9092,kafka-broker2:9092".
Tip:
Kafka Streams applications can only communicate with a single Kafka
cluster specified by this config value. Future versions of Kafka
Streams will support connecting to different Kafka clusters for
reading input streams and writing output streams.

Kafka bootstrap-servers vs zookeeper in kafka-console-consumer

I'm trying to test run a single Kafka node with 3 brokers & zookeeper. I wish to test using the console tools. I run the producer as such:
kafka-console-producer --broker-list localhost:9092,localhost:9093,localhost:9094 --topic testTopic
Then I run the consumer as such:
kafka-console-consumer --zookeeper localhost:2181 --topic testTopic --from-beginning
And I can enter messages in the producer and see them in the consumer, as expected. However, when I run the updated version of the consumer using bootstrap-server, I get nothing. E.g
kafka-console-consumer --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic testTopic --from-beginning
This worked fine when I had one broker running on port 9092 so I'm thoroughly confused. Is there a way I can see what zookeeper is providing as the bootstrap server? Is the bootstrap server different from the broker list? Kafka compiled with Scala 2.11.
I have no idea what was wrong. Likely I put Kafka or Zookeeper in a weird state. After deleting the topics in the log.dir of each broker AND the zookeeper topics in /brokers/topics then recreating the topic, Kafka consumer behaved as expected.
Bootstrap servers are same as kafka brokers. And if you want to see the list of bootstrap server zookeeper is providing, you can query ZNode information via any ZK client. All active brokers are registered under /brokers/ids/[brokerId]. All you need is zkQuorum address. Below command will give you
list of active bootstrap servers :
./zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
I experienced the same problem when using mismatched versions of:
Kafka client libraries
Kafka scripts
Kafka brokers
In my exact scenario I was using Confluent Kafka client libraries version 0.10.2.1 with Confluent Platform 3.3.0 w/ Kafka broker 0.11.0.0. When I downgraded my Confluent Platform to 3.3.2 which matched my client libraries the consumer worked as expected.
My theory is that the latest kafka-console-consumer using the new Consumer API was only retrieving messages using the latest format. There were a number of message format changes introduced in Kafka 0.11.0.0.

relationship zookeeper and kafka

I am learning kafka CLI. And have a question.
why we have to use --zookeeper option when creating topics and consuming message, however when producing messages, we just use --broker-list ,which just refers to Kafka itself,
1.create topics
./kafka-topics.sh --create --zookeeper `docker-machine ip bigdata` --replication-factor 1 --partitions 1 --topic bigdata
2.produce message
./kafka-console-producer.sh --broker-list `docker-machine ip bigdata`:9092 --topic bigdata
3.consume message
./kafka-console-consumer.sh --zookeeper `docker-machine ip bigdata`:2181 --topic bigdata
I know kafka must use zookeeper for coordination. But I still don't get it very clear from the CLI commmand
You have to distinguish between topic management and topic consumption. ZK in not only used for broker coordination but also for topic management.
For topic management, ZK is used to store topic metadata and brokers do (currently, Kafka 0.10.1) not offer an API for topic management. Thus, the admin CLI tools do actually talk directly to ZK (and not to the broker). This will change in the future, when the new "admin client" is fully implemented (c.f. https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations)
For topic consumption, ZK is not required and consumer and producer clients only talk to the brokers.