I'm trying to test-run a single-node Kafka setup with three brokers and ZooKeeper. I want to test using the console tools. I run the producer like this:
kafka-console-producer --broker-list localhost:9092,localhost:9093,localhost:9094 --topic testTopic
Then I run the consumer as such:
kafka-console-consumer --zookeeper localhost:2181 --topic testTopic --from-beginning
And I can enter messages in the producer and see them in the consumer, as expected. However, when I run the updated version of the consumer using --bootstrap-server, I get nothing, e.g.:
kafka-console-consumer --bootstrap-server localhost:9092,localhost:9093,localhost:9094 --topic testTopic --from-beginning
This worked fine when I had one broker running on port 9092, so I'm thoroughly confused. Is there a way I can see what ZooKeeper is providing as the bootstrap server? Is the bootstrap server different from the broker list? Kafka compiled with Scala 2.11.
I have no idea what was wrong; most likely I put Kafka or ZooKeeper in a weird state. After deleting the topic data in the log.dir of each broker AND the ZooKeeper topic entries under /brokers/topics, then recreating the topic, the consumer behaved as expected.
Bootstrap servers are the same as the Kafka brokers. If you want to see the list of bootstrap servers ZooKeeper is providing, you can query the ZNode information via any ZK client. All active brokers are registered under /brokers/ids/[brokerId]. All you need is the zkQuorum address. The command below will give you the list of active bootstrap servers:
./zookeeper-shell.sh localhost:2181 <<< "ls /brokers/ids"
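If you also want the host and port each broker registered, fetch the ZNode of an individual broker; the ID 0 below is an assumption (actual IDs depend on each broker.id setting):
./zookeeper-shell.sh localhost:2181 <<< "get /brokers/ids/0"
This prints the broker's registration JSON, including its advertised endpoints.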
I experienced the same problem when using mismatched versions of:
Kafka client libraries
Kafka scripts
Kafka brokers
In my exact scenario I was using Confluent Kafka client libraries version 0.10.2.1 with Confluent Platform 3.3.0 with Kafka broker 0.11.0.0. When I downgraded my Confluent Platform to 3.3.2, which matched my client libraries, the consumer worked as expected.
My theory is that the latest kafka-console-consumer, which uses the new consumer API, was only retrieving messages in the latest format. There were a number of message format changes introduced in Kafka 0.11.0.0.
Related
I am running Apache Kafka on a first server and Apache ZooKeeper on a second server.
I want to run a Kafka Connect service on another server. Is it possible to use a standalone service? Do I need Apache Kafka Connect or Confluent Kafka Connect?
There is no such thing as "Confluent Kafka (Connect)".
Kafka Connect is Apache 2.0 licensed and released as part of Apache Kafka. Confluent and other vendors write plugins (free or enterprise-licensed) for Kafka Connect.
Yes, it is recommended to run Connect on a separate set of servers from either the brokers or ZooKeeper. In order to run it, you will need to download all of Kafka, then use bin/connect-distributed, or you can run it via Docker containers.
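For example, with the stock worker config that ships with the Kafka distribution:
bin/connect-distributed.sh config/connect-distributed.properties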
You can easily run a Kafka Connect standalone (single-process) service from any server, provided you have configured both the connector and worker properties correctly.
A gist of it here, if you are interested.
In standalone mode all work is performed in a single process. It is easy to set up and get started, but it does not benefit from some of the features of Kafka Connect such as fault tolerance.
You can start a standalone process with the following command:
bin/connect-standalone.sh config/connect-standalone.properties connector1.properties connector2.properties
The first parameter is the configuration for the worker; it includes connection parameters, serialization format, and how frequently to commit offsets.
All workers (both standalone and distributed) require a few configs (a minimal example follows this list):
bootstrap.servers - List of Kafka brokers used to bootstrap connections to Kafka
key.converter / value.converter - Converter classes used to convert between Kafka Connect format and the serialized form that is written to Kafka
offset.storage.file.filename - File to store offset data in (standalone mode only)
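A minimal connect-standalone.properties might look like the following; the broker address and offsets path are placeholders, and the keys come from the stock file shipped with Kafka:
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000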
A simple standalone connector (importing data from a file into Kafka)
To create a topic called connect-test with 10 partitions (up to us) and a replication factor of 2:
kafka-topics --create --zookeeper localhost:2181 --replication-factor 2 --partitions 10 --topic connect-test
To start a standalone Kafka connector, we need the following three configuration files, located under C:\kafka_2.11-1.1.0\config. Update them as follows:
connect-standalone.properties
offset.storage.file.filename=C:/kafka_2.11-1.1.0/connectConfig/connect.offsets
connect-file-source.properties
file=C:/kafka/Workspace/kafka.txt
connect-file-sink.properties
file=test.sink.txt
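For reference, the stock versions of these two files in the Kafka distribution also set the connector name, class, and topic; a minimal sketch (connector names are the stock defaults, the topic matches this example):
connect-file-source.properties:
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=C:/kafka/Workspace/kafka.txt
topic=connect-test
connect-file-sink.properties:
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test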
Execute the following command:
connect-standalone.bat C:\kafka_2.11-1.1.0\config\connect-standalone.properties C:\kafka_2.11-1.1.0\config\connect-file-source.properties C:\kafka_2.11-1.1.0\config\connect-file-sink.properties
Once the connector is started, the data initially in kafka.txt is synced to test.sink.txt and published to the Kafka topic named connect-test. Then any changes to the kafka.txt file are synced to test.sink.txt and published to the connect-test topic.
Create a Consumer
kafka-console-consumer.bat --bootstrap-server localhost:9096 --topic connect-test --from-beginning
CLI Output
kafka-console-consumer --bootstrap-server localhost:9096 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"This is the stream data for the KAFKA Connector"}
Add a new line, “Consuming the events now”, into kafka.txt.
CLI Output
kafka-console-consumer --bootstrap-server localhost:9096 --topic connect-test --from-beginning
{"schema":{"type":"string","optional":false},"payload":"This is the stream data for the KAFKA Connector"}
{"schema":{"type":"string","optional":false},"payload":"Consuming the events now"}
I am trying to learn Kafka.
This is the command I saw for topic creation:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic test
Why do we pass the ZooKeeper host and port for topic creation, instead of the broker(s) information?
What does ZooKeeper do with the topic?
Is topic info persisted in ZooKeeper?
Which Kafka version are you using? In the latest versions, the --zookeeper parameter is deprecated and you can use --bootstrap-server instead, providing the Kafka broker address(es).
Anyway, the topic information will still be stored in ZooKeeper; this is how Kafka works.
The fact that --zookeeper is deprecated is because development is going in the direction of making clients less aware of ZooKeeper, with all operations done by connecting to the Kafka brokers; it's then the broker that performs the operation on ZooKeeper.
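With a recent release (Kafka 2.2 or later), the same topic can be created against the brokers directly, for example:
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 2 --topic test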
Looking through the instructions --
https://www.cloudera.com/documentation/kafka/latest/topics/kafka_command_line.html
I'm running these test command lines and one set works, but the other set doesn't.
Following the instructions it works, but I noticed it has --zookeeper as a parameter, and I thought that was discontinued.
Producer:
/usr/bin/kafka-console-producer --broker-list local-ip:9092 --topic test
Consumer:
/usr/bin/kafka-console-consumer --bootstrap-server local-ip:9092 --topic test --from-beginning
The above doesn't work on the Cloudera version, but works on my standalone Kafka installs.
This works on Cloudera:
/usr/bin/kafka-console-consumer --zookeeper local-ip:2181 --topic test --from-beginning
Trying to understand the difference between Cloudera's Kafka version (3.0.0-1.3.0.0.p0.40?) and mine (2.11-0.11.0.1), or whether something has to be turned on or off.
I've seen some similar questions and tried following them, to no avail. I think it's something to do with Cloudera.
Updated answer:
In my case, I had two brokers configured and Kafka's config value offsets.topic.replication.factor set to 3. When Kafka tries to create a topic with more replicas than there are available brokers, an exception is thrown and the topic is not created. (The new consumer stores its offsets in the internal __consumer_offsets topic, which is created on first use; if it cannot be created, the consumer receives nothing.)
The solution is to set offsets.topic.replication.factor = 2 and try again. Maybe you need to remove and redeploy the brokers.
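The corresponding line in each broker's server.properties (the value 2 matches the two-broker setup described above):
offsets.topic.replication.factor=2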
I don't know why (maybe it's a bug in Cloudera's Kafka release), but I solved it with a local Kafka test.
I downloaded the latest version of Kafka from https://kafka.apache.org/downloads and updated the broker config file config\server.properties to use the remote ZooKeeper server. With this, I had a mixed-configuration cluster:
brokers in my laptop
zookeeper in the cloudera cluster
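The only change needed in config\server.properties for this was pointing the local broker at the remote ZooKeeper; the hostname below is the same placeholder used in the commands that follow:
zookeeper.connect=zookeeper.cloudera-cluster:2181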
With this configuration, I created a topic and ran kafka-console-consumer and kafka-console-producer from my laptop, but against the remote ZooKeeper:
$ kafka-topics --create --zookeeper zookeeper.cloudera-cluster:2181 --replication-factor 1 --partitions 1 --topic test
$ kafka-console-consumer --bootstrap-server localhost:9092 --topic test
$ kafka-console-producer --broker-list localhost:9092 --topic test
This works properly. Furthermore, with this setup the topic __consumer_offsets was created automatically, and now the new-consumer version works perfectly. At this point, you can remove the topic you created, stop the local brokers, and start using the Kafka cluster normally.
Is this a bug from Cloudera's release?
Maybe the Cloudera version is not able to create __consumer_offsets automatically?
Kafka version downloaded: kafka_2.11-1.0.0.tgz
Cloudera's kafka version: 3.0.0-1.3.0.0.p0.40
How can I produce and consume messages from different servers?
I tried the Quickstart tutorial, but there are no instructions on how to set up multi-server clusters.
My Steps
Server A
1)bin/zookeeper-server-start.sh config/zookeeper.properties
2)bin/kafka-server-start.sh config/server.properties
3)bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
4)bin/kafka-console-producer.sh --broker-list SERVER-a.IP:9092 --topic test
Server B
1A)bin/kafka-console-consumer.sh --bootstrap-server SERVER-a.IP:9092 --topic test --from-beginning
1B)bin/kafka-console-consumer.sh --bootstrap-server SERVER-a.IP:2181 --topic test --from-beginning
When I run the 1A consumer and enter messages into the producer, no messages appear in the consumer. It's just blank.
When I run the 1B consumer instead, I get a huge and very fast stream of error logs on Server A until I Ctrl+C the consumer. See below.
Error log on Server A, streaming at hundreds of lines per second:
WARN Exception causing close of session 0x0 due to java.io.EOFException (org.apache.zookeeper.server.NIOServerCnxn)
INFO Closed socket connection for client /188.166.178.40:51168 (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)
Thanks
Yes, if you want to have your producer on Server A and your consumer on Server B, you are headed in the right direction.
You need to run a broker on Server A to make it work:
bin/kafka-server-start.sh config/server.properties
The other commands are correct for variant 1A. Variant 1B fails because --bootstrap-server must point at a broker port (9092), not ZooKeeper's port 2181; that mismatch is what floods ZooKeeper on Server A with connection errors.
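One common gotcha in cross-server setups, not mentioned in the original answer but worth checking: the broker on Server A must advertise an address reachable from Server B, e.g. in config/server.properties (the address is a placeholder):
advertised.listeners=PLAINTEXT://SERVER-a.IP:9092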
If anyone is looking at a similar topic for a kafka-streams application, it appears that multiple Kafka clusters are not supported yet:
Here is a documentation from kafka: https://kafka.apache.org/10/documentation/streams/developer-guide/config-streams.html#bootstrap-servers
bootstrap.servers
(Required) The Kafka bootstrap servers. This is the same setting that is used by the underlying producer and consumer clients to connect to the Kafka cluster. Example: "kafka-broker1:9092,kafka-broker2:9092".
Tip:
Kafka Streams applications can only communicate with a single Kafka cluster specified by this config value. Future versions of Kafka Streams will support connecting to different Kafka clusters for reading input streams and writing output streams.
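For completeness, in a Streams app this setting looks like the following, expressed here as the underlying config keys (the application id is a placeholder; the broker addresses reuse the documentation's example):
application.id=my-streams-app
bootstrap.servers=kafka-broker1:9092,kafka-broker2:9092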
I am new to Kafka!
Is it possible to exclude a topic from a broker in Apache Kafka? How?
To run a producer for a topic we give the command:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic kafkatopic
We do not have a broker option in this command. Can we exclude a broker anyway, either on the command line or via the Java API?
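One mechanism that does control which brokers host a topic is manual replica assignment at creation time; this is a suggestion, not something from the thread, and the broker IDs 0 and 1 below are placeholders. For example, to create a two-partition topic whose replicas live only on brokers 0 and 1, leaving every other broker out:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --topic kafkatopic --replica-assignment 0:1,1:0
Each comma-separated entry lists the replica brokers for one partition, so partition 0 goes to brokers 0 and 1, and partition 1 to brokers 1 and 0.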
Also, is there any documentation of the Java API for Kafka? Any tutorials?
Thanks in advance :)