Consuming and Producing Kafka messages on different servers - apache-kafka

How can I produce and consume messages from different servers?
I tried the Quickstart tutorial, but there are no instructions on how to set up a multi-server cluster.
My Steps
Server A
1) bin/zookeeper-server-start.sh config/zookeeper.properties
2) bin/kafka-server-start.sh config/server.properties
3) bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
4) bin/kafka-console-producer.sh --broker-list SERVER-a.IP:9092 --topic test
Server B
1A) bin/kafka-console-consumer.sh --bootstrap-server SERVER-a.IP:9092 --topic test --from-beginning
1B) bin/kafka-console-consumer.sh --bootstrap-server SERVER-a.IP:2181 --topic test --from-beginning
When I run the 1A) consumer and enter messages into the producer, no messages appear in the consumer. It's just blank.
When I run the 1B) consumer instead, I get a huge and very fast stream of error logs on Server A until I Ctrl+C the consumer. See below.
Error log on Server A streaming at hundreds per second
WARN Exception causing close of session 0x0 due to java.io.EOFException (org.apache.zookeeper.server.NIOServerCnxn)
INFO Closed socket connection for client /188.166.178.40:51168 (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)
Thanks

Yes, if you want to have your producer on Server A and your consumer on Server B, you are headed in the right direction.
You need to run a broker on Server A to make it work.
bin/kafka-server-start.sh config/server.properties
The other commands are correct.
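One thing worth checking beyond that (not covered in the question) is whether the broker on Server A advertises an address that Server B can actually reach; if advertised.listeners is left unset, remote clients can connect for bootstrapping, get back a hostname that only resolves locally, and then sit silently. A minimal sketch of the relevant lines in config/server.properties, assuming SERVER-a.IP is routable from Server B:
# Listen on all interfaces, but advertise the public address to clients
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://SERVER-a.IP:9092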

If anyone is looking at a similar topic for a Kafka Streams application, it appears that multiple Kafka clusters are not supported yet.
Here is the relevant documentation from Kafka: https://kafka.apache.org/10/documentation/streams/developer-guide/config-streams.html#bootstrap-servers
bootstrap.servers
(Required) The Kafka bootstrap servers. This is the same setting that is used by the underlying producer and consumer clients to connect to the Kafka cluster. Example: "kafka-broker1:9092,kafka-broker2:9092".
Tip:
Kafka Streams applications can only communicate with a single Kafka cluster specified by this config value. Future versions of Kafka Streams will support connecting to different Kafka clusters for reading input streams and writing output streams.
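In practice this means a Streams application carries exactly one bootstrap.servers entry pointing at one cluster. A minimal sketch of the two required settings (the application id and broker addresses are illustrative):
# Required Kafka Streams configs; one cluster only
application.id=my-streams-app
bootstrap.servers=kafka-broker1:9092,kafka-broker2:9092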

Related

Can I have a Kafka Connect service as a standalone service?

I am using Apache Kafka on the 1st server and Apache ZooKeeper on the 2nd server.
I want to have the Kafka Connect service on another server. Is it possible to run it as a standalone service? I need to have Apache Kafka Connect or Confluent Kafka Connect.
There is no such thing as "Confluent Kafka (Connect)".
Kafka Connect is Apache 2.0 licensed and released as part of Apache Kafka. Confluent and other vendors write plugins (free or enterprise-licensed) for Kafka Connect.
Yes, it is recommended to run Connect on a separate set of servers from both the brokers and the ZooKeeper nodes. In order to run it, you will need to download all of Kafka and then use bin/connect-distributed.sh, or you can run it via Docker containers.
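For example, using the sample worker config shipped with a stock Kafka download:
bin/connect-distributed.sh config/connect-distributed.properties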
You can easily run a Kafka Connect standalone (single-process) service from any server, provided you have configured both the connector and worker properties correctly.
A gist of it here, if you are interested.
In standalone mode all work is performed in a single process. It is easy to set up and get started with, but it does not benefit from some of the features of Kafka Connect such as fault tolerance.
You can start a standalone process with the following command:
bin/connect-standalone.sh config/connect-standalone.properties connector1.properties connector2.properties
The first parameter is the configuration for the worker; it includes connection parameters, serialization format, and how frequently to commit offsets.
All workers (both standalone and distributed) require a few configs:
bootstrap.servers - List of Kafka servers used to bootstrap connections to Kafka
key.converter / value.converter - Converter classes used to convert between the Kafka Connect format and the serialized form that is written to Kafka.
offset.storage.file.filename - File to store offset data
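Put together, a minimal connect-standalone.properties covering those settings might look like this (values are illustrative; the JSON converter ships with Kafka):
# Worker connection and serialization settings
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Where the standalone worker stores source offsets
offset.storage.file.filename=/tmp/connect.offsets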
A simple standalone connector (importing data from a file into Kafka)
Create a topic called connect-test with 10 partitions (up to you) and a replication factor of 2:
kafka-topics --create --zookeeper localhost:2181 --replication-factor 2 --partitions 10 --topic connect-test
To start a standalone Kafka connector, we need the following three configuration files, located under C:\kafka_2.11-1.1.0\config. Update them as follows:
connect-standalone.properties
offset.storage.file.filename=C:/kafka_2.11-1.1.0/connectConfig/connect.offsets
connect-file-source.properties
file=C:/kafka/Workspace/kafka.txt
connect-file-sink.properties
file=test.sink.txt
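For context, the file= lines above are the only changes to the stock files; the rest of the shipped connect-file-source.properties and connect-file-sink.properties look roughly like this:
# connect-file-source.properties: read lines from a file into the connect-test topic
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=C:/kafka/Workspace/kafka.txt
topic=connect-test
# connect-file-sink.properties: write records from connect-test out to a file
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test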
Execute the following command:
connect-standalone.bat C:\kafka_2.11-1.1.0\config\connect-standalone.properties C:\kafka_2.11-1.1.0\config\connect-file-source.properties C:\kafka_2.11-1.1.0\config\connect-file-sink.properties
Once the connector is started, the data already in kafka.txt is synced to test.sink.txt and published to the Kafka topic named connect-test. After that, any new lines added to kafka.txt are synced to test.sink.txt and published to the connect-test topic.
Create a Consumer
kafka-console-consumer.bat --bootstrap-server localhost:9096 --topic connect-test --from-beginning
CLI Output
kafka-console-consumer --bootstrap-server localhost:9096 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"This is the stream data for the KAFKA Connector"}
Add a new line, "Consuming the events now", to kafka.txt
CLI Output
kafka-console-consumer --bootstrap-server localhost:9096 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"This is the stream data for the KAFKA Connector"}
{"schema":{"type":"string","optional":false},"payload":"Consuming the events now"}

Consuming Kafka: the basics

I am very new to Kafka. Following a few tutorials, I have the following questions regarding consuming actual Kafka topics.
The situation: there is a server in my workplace that is streaming Kafka topics. I have the topic name. I would like to consume this topic from my machine (Windows WSL2 Ubuntu). From this tutorial, I am able to:
Run zookeeper with this command:
bin/zookeeper-server-start.sh config/zookeeper.properties
Create a broker with:
bin/kafka-server-start.sh config/server.properties
Run a producer console, with a fake topic named quickstart-events at port localhost:9092:
bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
Run a consumer console listening to localhost:9092 and receive the streaming data from the producer:
bin/kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
Now for my real situation: if I know the topic name, what else do I need in order to apply the same steps above and listen to it as a consumer? What are the steps involved? I read in other threads about tunnelling with a jump host. How do I do that?
I understand this question is rather generic. Appreciate any pointers to any relevant readings or guidance.
Based on your company nameserver, the following procedure should be done in your WSL instance to gain an outside connection: see "unable to access network from WSL2".
Then you need to set --bootstrap-server to your company's server:
--bootstrap-server my.company.com:9092
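Putting it together, the full consumer invocation would look something like this (hostname and topic name are placeholders):
bin/kafka-console-consumer.sh --topic your-topic-name --bootstrap-server my.company.com:9092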

Apache kafka broker consuming messages intended for someone else

I have a local Apache Kafka setup with a total of 2 brokers (ids 0 and 1) on ports 9092 and 9093.
I created a topic and published messages to it using this command:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
Then I consumed the messages in another terminal using this command:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test
Up to this point, everything is fine.
But when I run this command:
bin/kafka-console-producer.sh --broker-list localhost:9093 --topic test
and write some messages, they show up in the 2nd terminal, where I had typed this command:
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test
Why are messages produced to port 9093 being delivered to the consumer on 9092?
Your cluster consists of two brokers. It is not important which host you use for the initial connection: with a Kafka client you don't specify which broker you consume from or produce to. Those hostnames are only used to discover the whole list of Kafka brokers (the cluster).
According to documentation:
https://kafka.apache.org/documentation/#producerconfigs
https://kafka.apache.org/documentation/#consumerconfigs
bootstrap.servers
A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. The client will make use of all servers irrespective of which servers are specified here for bootstrapping—this list only impacts the initial hosts used to discover the full set of servers.
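A quick way to see this with the setup above: point a second console consumer at the other broker, and it reads the very same topic, because both bootstrap addresses lead to the same cluster.
bin/kafka-console-consumer.sh --bootstrap-server localhost:9093 --topic test --from-beginning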

Kafka bootstrap-servers vs zookeeper in kafka-console-consumer

I'm trying to test a Kafka setup with 3 brokers and ZooKeeper on a single node. I wish to test using the console tools. I run the producer as such:
kafka-console-producer --broker-list localhost:9092,localhost:9093,localhost:9094 --topic testTopic
Then I run the consumer as such:
kafka-console-consumer --zookeeper localhost:2181 --topic testTopic --from-beginning
And I can enter messages in the producer and see them in the consumer, as expected. However, when I run the updated version of the consumer using --bootstrap-server, I get nothing. E.g.
kafka-console-consumer --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic testTopic --from-beginning
This worked fine when I had one broker running on port 9092, so I'm thoroughly confused. Is there a way I can see what ZooKeeper is providing as the bootstrap server? Is the bootstrap server different from the broker list? Kafka was compiled with Scala 2.11.
I have no idea what was wrong; most likely I had put Kafka or ZooKeeper into a weird state. After deleting the topics in the log.dir of each broker AND the ZooKeeper topics under /brokers/topics, then recreating the topic, the Kafka consumer behaved as expected.
Bootstrap servers are the same as the Kafka brokers. If you want to see the list of bootstrap servers ZooKeeper is providing, you can query the znode information via any ZK client. All active brokers are registered under /brokers/ids/[brokerId]. All you need is the zkQuorum address. The command below will give you the list of active bootstrap servers:
./zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
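And to see the host and port a particular broker registered (assuming a broker with id 0 exists), read its znode; the returned JSON includes the broker's endpoints:
./zookeeper-shell.sh localhost:2181 <<< "get /brokers/ids/0"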
I experienced the same problem when using mismatched versions of:
Kafka client libraries
Kafka scripts
Kafka brokers
In my exact scenario I was using Confluent Kafka client libraries version 0.10.2.1 with Confluent Platform 3.3.0 and Kafka broker 0.11.0.0. When I downgraded my Confluent Platform to 3.2.2, which matched my client libraries, the consumer worked as expected.
My theory is that the latest kafka-console-consumer, using the new consumer API, was only retrieving messages in the latest format. There were a number of message format changes introduced in Kafka 0.11.0.0.
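If you suspect a similar mismatch, one sanity check is the API-versions tool that ships with Kafka 0.10.2 and later; it reports which API versions the broker supports, which differ across these releases (broker address is a placeholder):
bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092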

Kafka: org.apache.zookeeper.KeeperException$NoNodeException while creating topic on multi server setup

I am trying to set up a multi-node Kafka 0.8.2.2 cluster with 1 producer, 1 consumer, and 3 brokers, all on different machines.
While creating a topic from the producer, I am getting the error org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /brokers/ids. The complete console output is available here. There is no error in the Kafka producer's log.
The command I am using to create the topic is:
./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic edwintest
Note: The ZooKeeper service is running on all the servers, and all three brokers have the Kafka server running on them (only brokers need the Kafka server, right?).
The configuration of my producer.properties is:
metadata.broker.list=<IP.OF.BROKER.1>:9092,<IP.OF.BROKER.2>:9092,<IP.OF.BROKER.3>:9092
producer.type=sync
compression.codec=none
serializer.class=kafka.serializer.DefaultEncoder
Below are some of the many articles I was using as reference:
Zookeeper & Kafka Install : A single node and a multiple broker cluster - 2016
Step by Step of Installing Apache Kafka and Communicating with Spark
At first glance it seems like you're issuing the create-topic call against a local ZooKeeper which is not aware of any of your Kafka brokers. You should call ./bin/kafka-topics.sh --create --zookeeper <IP.OF.BROKER.1>:2181
The issue was that I was trying to connect to the ZooKeeper on localhost. My understanding had been that ZooKeeper needed to be running on the producer, the consumer, and the Kafka brokers, and that communication happened between producer -> broker and broker -> consumer via ZooKeeper. But that was incorrect. Actually:
ZooKeeper and the Kafka servers should be running only on the broker servers. While creating a topic or publishing content to a topic, the public DNS of any of the Kafka brokers should be passed with the --zookeeper option. There is no need to run a Kafka server on the producer or consumer instance.
Correct command will be:
./bin/kafka-topics.sh --create --zookeeper <Public-DNS>:<PORT> --replication-factor 1 --partitions 3 --topic edwintest
where Public-DNS is the DNS name of any of the Kafka brokers and PORT is the port of the ZooKeeper service.