Kafka client can't receive messages

I have Kafka and ZooKeeper set up on a remote machine. On that machine, the quickstart test from the official website works:
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic listings-incoming
This is a message
This is another message
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic listings-incoming --from-beginning
This is a message
This is another message
but when I use the consumer script from my local machine, it does not work:
bin/kafka-console-consumer.sh --bootstrap-server X.X.X.X:9092 --topic listings-incoming --from-beginning --consumer-property group.id=group2
No messages show up, but what does show is:
[2017-08-11 14:39:56,425] WARN Auto-commit of offsets {listings-incoming-4=OffsetAndMetadata{offset=0, metadata=''}, listings-incoming-2=OffsetAndMetadata{offset=0, metadata=''}, listings-incoming-3=OffsetAndMetadata{offset=0, metadata=''}, listings-incoming-0=OffsetAndMetadata{offset=0, metadata=''}, listings-incoming-1=OffsetAndMetadata{offset=0, metadata=''}} failed for group group1: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
Update:
My ZooKeeper and Kafka are running on the same machine. Right now my advertised.listeners configuration is this:
advertised.listeners=PLAINTEXT://the.machine.ip.address:9092
I tried to change it to:
advertised.listeners=PLAINTEXT://my.client.ip.address:9092
and then ran the client-side consumer script, which gives this error:
[2017-08-11 15:49:01,591] WARN Error while fetching metadata with correlation id 3 : {listings-incoming=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-08-11 15:49:22,106] WARN Bootstrap broker 10.161.128.238:9092 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-11 15:49:22,232] WARN Error while fetching metadata with correlation id 7 : {listings-incoming=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-08-11 15:49:22,340] WARN Error while fetching metadata with correlation id 8 : {listings-incoming=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-08-11 15:49:40,453] WARN Bootstrap broker 10.161.128.238:9092 disconnected (org.apache.kafka.clients.NetworkClient)
[2017-08-11 15:49:40,531] WARN Error while fetching metadata with correlation id 12 : {listings-incoming=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)

You probably have not configured advertised.listeners properly in the broker's server.properties file.
From https://kafka.apache.org/documentation/
advertised.listeners: Listeners to publish to ZooKeeper for clients to use, if different than the listeners above. In IaaS environments, this may need to be different from the interface to which the broker binds. If this is not set, the value for listeners will be used.
and in the same documentation:
listeners: Listener List - Comma-separated list of URIs we will listen on and the listener names. If the listener name is not a security protocol, listener.security.protocol.map must also be set. Specify hostname as 0.0.0.0 to bind to all interfaces. Leave hostname empty to bind to default interface. Examples of legal listener lists:
PLAINTEXT://myhost:9092,SSL://:9091
CLIENT://0.0.0.0:9092,REPLICATION://localhost:9093
So if advertised.listeners is not set and listeners is listening only on localhost:9092, 127.0.0.1:9092, or 0.0.0.0:9092, then clients will be told to connect to localhost when they make a metadata request to the bootstrap server. That works when the client actually runs on the same machine as the broker, but it fails when you connect remotely.
You should set advertised.listeners to be a fully qualified domain name or public IP address for the host that the broker is running on.
For example
advertised.listeners=PLAINTEXT://kafkabrokerhostname.confluent.io:9092
or
advertised.listeners=PLAINTEXT://192.168.1.101:9092
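Putting both settings together, a minimal sketch of the relevant server.properties lines for remote access (192.168.1.101 stands in for the broker's externally reachable address):
# Bind on all interfaces...
listeners=PLAINTEXT://0.0.0.0:9092
# ...but advertise an address that remote clients can actually reach.
advertised.listeners=PLAINTEXT://192.168.1.101:9092
After restarting the broker, the remote console consumer from the question should receive metadata pointing at a reachable address instead of localhost.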

Related

Kafka producer does not signal that all brokers are unreachable

When all brokers/nodes of a cluster are unreachable, the error in the Kafka producer callback is a generic "Topic XXX not present in metadata after 60000 ms".
When I activate the DEBUG log level, I can see that all attempts to deliver the message to any node are failing:
DEBUG org.apache.kafka.clients.NetworkClient - Initialize connection to node node2.url:443 (id: 2 rack: null) for sending metadata request
DEBUG org.apache.kafka.clients.NetworkClient - Initiating connection to node node2.url:443 (id: 2 rack: null) using address node2.url:443/X.X.X.X:443
....
DEBUG org.apache.kafka.clients.NetworkClient - Disconnecting from node 2 due to socket connection setup timeout. The timeout value is 16024 ms.
DEBUG org.apache.kafka.clients.NetworkClient - Initialize connection to node node0.url:443 (id: 0 rack: null) for sending metadata request
DEBUG org.apache.kafka.clients.NetworkClient - Initiating connection to node node0.url:443 (id: 0 rack: null) using address node0.url:443/X.X.X.X:443
....
DEBUG org.apache.kafka.clients.NetworkClient - Disconnecting from node 0 due to socket connection setup timeout. The timeout value is 17408 ms.
and so on, until, after the deliver timeout, the send() Callback gets the error:
ERROR my.kafka.SenderClass - Topic XXX not present in metadata after 60000 ms.
Unlike the bootstrap URL, all nodes could be unreachable, for example because of wrong DNS entries or the like.
How can the application understand that all nodes were not reachable? This is traced only as DEBUG info and is not available to the producer send() callback.
Such error detail at the application level would speed up troubleshooting. This kind of error is usually signaled by standard SOAP/REST web service interfaces.
The producer only cares about the cluster Controller for bootstrapping and about the leaders of the partitions it needs to write to (one of those leaders could be the Controller). In other words, it doesn't need to know about "all" brokers.
How can the application understand that all nodes were not reachable?
If you set acks=1 or acks=all, then the callback should know at least one broker had the data written. If not, there was some error.
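As a minimal sketch of what that looks like in Java (the topic name comes from the error message above; the bootstrap address and the handling are illustrative assumptions):
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SendWithCallback {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "node0.url:443"); // from the logs above
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // callback succeeds only after broker acknowledgment
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("XXX", "hello"), (metadata, exception) -> {
                if (exception != null) {
                    // When every broker is unreachable this is the generic
                    // TimeoutException: Topic XXX not present in metadata after 60000 ms
                    System.err.println("send failed: " + exception);
                } else {
                    // At least one broker (the partition leader) wrote the record.
                    System.out.println("acked at offset " + metadata.offset());
                }
            });
        } // close() flushes outstanding records before returning
    }
}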
You can use an AdminClient, outside of the producer client, to describe the topic(s) and fetch metadata about the leader partitions, then use standard TCP socket requests to try to reach those advertised listeners from Java.
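A sketch of that approach, reusing the node addresses from the DEBUG logs above (note that describeCluster() itself times out with an ExecutionException if even the bootstrap brokers are unreachable, which is already a usable application-level signal):
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.Node;

public class BrokerReachability {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "node0.url:443,node2.url:443");
        props.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, "10000");

        try (AdminClient admin = AdminClient.create(props)) {
            // Advertised host:port of every broker in the cluster metadata.
            for (Node node : admin.describeCluster().nodes().get()) {
                try (Socket socket = new Socket()) {
                    // Plain TCP connect to the advertised listener, 5 s timeout.
                    socket.connect(new InetSocketAddress(node.host(), node.port()), 5000);
                    System.out.println("reachable:   " + node.host() + ":" + node.port());
                } catch (Exception e) {
                    System.out.println("unreachable: " + node.host() + ":" + node.port() + " (" + e + ")");
                }
            }
        }
    }
}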
FWIW, port 443 should ideally be reserved for HTTPS traffic, not Kafka. Kafka is not a REST/SOAP service.

Not able to access Kafka (Confluent) installed on an Azure VM using public IP

I have installed confluent-oss-5.0.0 on an Azure VM and exposed all the necessary ports so it can be reached via the public IP address.
I tried the following changes to etc/kafka/server.properties to achieve this, but with no luck:
Approach - 1
listeners=PLAINTEXT://:9092
advertised.listeners=PLAINTEXT://<publicIP>:9092
--------------------------------------
Approach - 2
advertised.listeners=PLAINTEXT://<publicIP>:9092
--------------------------------------
Approach - 3
listeners=PLAINTEXT://<publicIP>:9092
I got the errors below:
pj@pj-HP-EliteBook-840-G1:~/confluent-kafka/confluent-oss-5.0.0/bin$ kafka-console-producer --broker-list <publicIp>:9092 --topic pj_test123
>dfsds
[2019-03-25 19:13:38,784] WARN [Producer clientId=console-producer] Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
pj@pj-HP-EliteBook-840-G1:~/confluent-kafka/confluent-oss-5.0.0/bin$ kafka-console-producer --broker-list <publicIp>:9092 --topic pj_test123
>message1
>message2
>[2019-03-25 19:20:13,216] ERROR Error when sending message to topic pj_test123 with key: null, value: 3 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.TimeoutException: Expiring 2 record(s) for pj_test123-0: 1503 ms has passed since batch creation plus linger time
[2019-03-25 19:20:13,218] ERROR Error when sending message to topic pj_test123 with key: null, value: 3 bytes with error: (org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
pj@pj-HP-EliteBook-840-G1:~/confluent-kafka/confluent-oss-5.0.0/bin$ kafka-console-consumer --bootstrap-server <publicIp>:9092 --topic pj_test123 --from-beginning
[2019-03-25 19:29:27,742] WARN [Consumer clientId=consumer-1, groupId=console-consumer-42352] Error while fetching metadata with correlation id 2 : {pj_test123=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
pj@pj-HP-EliteBook-840-G1:~/confluent-kafka/confluent-oss-5.0.0/bin$ kafka-console-consumer --bootstrap-server <publicIp>:9092 --topic pj_test123 --from-beginning
[2019-03-25 19:27:06,589] WARN [Consumer clientId=consumer-1, groupId=console-consumer-33252] Connection to node 0 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
All other services, like ZooKeeper, Kafka Connect, and the REST API, are working fine using <PublicIP>:<port>.
kafka-topics --zookeeper 13.71.115.20:2181 --list   # this works
Ref:
Not able to access messages from confluent kafka on EC2
https://kafka.apache.org/documentation/#brokerconfigs
Why I cannot connect to Kafka from outside?
Solutions
Thanks @Robin Moffatt, it works for me. I made the changes below, along with allowing all Kafka-related ports in the Azure networking rules:
kafka@kafka:~/confluent-oss-5.0.0$ sudo vi etc/kafka/server.properties
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:19092
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
advertised.listeners=INTERNAL://<privateIp>:9092,EXTERNAL://<publicIp>:19092
inter.broker.listener.name=INTERNAL
You need to configure both internal and external listeners for your broker. This article details how: https://rmoff.net/2018/08/02/kafka-listeners-explained/.
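As a usage note (a sketch based on the configuration above, not part of the original answer): with that dual-listener layout, remote clients connect to the EXTERNAL listener's port (19092), while brokers and co-located services keep using the INTERNAL one (9092), e.g.:
kafka-console-producer --broker-list <publicIp>:19092 --topic pj_test123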
You will also have to give public access to port 9092 (your broker's port). To do that:
Go to your Virtual machine in Azure portal
Select Networking under settings in the left menu
Add inbound port rule
Add port 9092 to be accessible from anywhere
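If you prefer the Azure CLI over the portal, the same inbound rule can be added like this (a sketch; the resource group and VM names are placeholders):
az vm open-port --resource-group myResourceGroup --name myKafkaVm --port 9092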

How to fix the Java Kafka Producer error "Received invalid metadata error in produce request on partition" and Out of Memory when broker is down

I have been creating a Kafka producer example using Java, sending normal data (just "Test" + an integer as the value) to Kafka. I used the properties below; after I started the producer client and messages were on the way, I killed the broker and suddenly received the error message below instead of the producer retrying.
The setup uses 3 brokers and a topic with 3 partitions, replication factor 3, and no min.insync.replicas.
Below are the configured properties:
config.put(ProducerConfig.ACKS_CONFIG, "all");
config.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1");
config.put(CommonClientConfigs.RETRIES_CONFIG, 60);
config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
config.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG ,10000);
config.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG ,30000);
config.put(ProducerConfig.MAX_BLOCK_MS_CONFIG ,10000);
config.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG , 1048576);
config.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
config.put(ProducerConfig.LINGER_MS_CONFIG, 0);
config.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 1073741824); // 1GB
and the result when I kill all my brokers, or sometimes just one of them, is as below:
Error:
WARN org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=producer-1] Got error produce response with correlation id 124 on topic-partition testing001-0, retrying (59 attempts left). Error: NETWORK_EXCEPTION
27791 [kafka-producer-network-thread | producer-1] WARN org.apache.kafka.clients.producer.internals.Sender - [Producer clientId=producer-1] Received invalid metadata error in produce request on partition testing001-0 due to org.apache.kafka.common.errors.NetworkException: The server disconnected before a response was received.. Going to request metadata update now
28748 [kafka-producer-network-thread | producer-1] ERROR org.apache.kafka.common.utils.KafkaThread - Uncaught exception in thread 'kafka-producer-network-thread | producer-1':
java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.<init>(Unknown Source)
    at java.nio.ByteBuffer.allocate(Unknown Source)
    at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:112)
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:335)
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:296)
    at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:560)
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:496)
    at org.apache.kafka.common.network.Selector.poll(Selector.java:425)
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239)
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163)
    at java.lang.Thread.run(Unknown Source)
I assume you are testing the producer. When a producer connects to the Kafka cluster, you pass all broker IPs and ports as a comma-separated string; in your case there are three brokers. When the producer tries to connect to the cluster, as part of initialization the cluster controller responds with the cluster metadata. Assume your producer only publishes messages to a single topic: the cluster maintains a leader among the brokers for that topic, and after identifying the leader, your producer communicates only with that leader while it is alive.
In your testing scenario, you deliberately kill broker instances. When that happens, the Kafka cluster needs to identify a new leader for your topic and the controller has to pass the new metadata to your producer. If the metadata changes quite frequently (in your case, you may kill another broker in the meantime), the producer may be working from invalid metadata.
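One more hedged observation, beyond the original answer: the OutOfMemoryError in the trace is plausibly aggravated by buffer.memory being set to 1 GB, since the producer allocates that accumulator from the same JVM heap as everything else. A sketch of more conservative settings, continuing the question's config block (values are illustrative, not tuned recommendations; delivery.timeout.ms exists only in clients 2.1+):
// Keep the accumulator small enough to fit comfortably in the heap
// (the Kafka default is 32 MB; a 1 GB buffer needs a correspondingly large -Xmx).
config.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
// Bound the total time a record may spend retrying while brokers are down.
config.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 120000);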

Correlation Id errors for Kafka console producer and consumer

I have 2 Kafka brokers backed by 3 ZooKeeper nodes. I want to test the Kafka nodes by running kafka-console-producer and kafka-console-consumer locally on each node.
So I SSH into one of my Kafka brokers using 2 different terminals. In terminal #1 I run the consumer like so:
/opt/kafka/bin/kafka-console-consumer.sh --zookeeper a.b.c.d:2181 --topic test1
Where a.b.c.d is the private IP of one of my 3 ZK nodes.
Then in terminal #2 I run the producer like so:
/opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test1
I am able to start both the consumer and producer just fine without any issues.
However, in the producer terminal, if I "fire" a message at the test1 topic by entering some text (such as "hello") and hitting the ENTER key, I immediately begin seeing this:
[2017-01-17 19:45:57,353] WARN Error while fetching metadata with correlation id 0 : {test1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-01-17 19:45:57,372] WARN Error while fetching metadata with correlation id 1 : {test1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-01-17 19:45:57,477] WARN Error while fetching metadata with correlation id 2 : {test1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2017-01-17 19:45:57,582] WARN Error while fetching metadata with correlation id 3 : {test1=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
...and it keeps going!
And, in the consumer terminal, even though I don't get any errors when I start the consumer, after about 30 seconds I get the following warning message:
[2017-01-17 19:46:07,292] WARN Fetching topic metadata with correlation id 1 for topics [Set(test1)] from broker [BrokerEndPoint(1,ip-x-y-z-w.ec2.internal,9092)] failed (kafka.client.ClientUtils$)
java.nio.channels.ClosedChannelException
at kafka.network.BlockingChannel.send(BlockingChannel.scala:110)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:80)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:79)
at kafka.producer.SyncProducer.send(SyncProducer.scala:124)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:94)
at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
Interestingly, ip-x-y-z-w.ec2.internal is the private DNS for the other Kafka broker, so perhaps this is some kind of failure during interbroker communication?
Any ideas as to what is going on here and what I can do to troubleshoot?
Update
Here's my entire server.properties file for both Kafka nodes:
listeners=PLAINTEXT://0.0.0.0:9092
advertised.host.name=<private-aws-ec2-ip-addr>.ec2.internal
advertised.listeners=PLAINTEXT://0.0.0.0:9092
broker.id=1
port=9092
num.partitions=4
zookeeper.connect=zkA:2181,zkB:2181,zkC:2181
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
log.dirs=/tmp/kafka-logs
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connection.timeout.ms=6000
offset.metadata.max.bytes=4096
Please let me know if anything looks like a config smell.

Kafka 0.8 All Good & rocks! .... Kafka 0.7 not able to make it happen

Kafka 0.8 works great. I am able to use CLI as well as write my own producers/consumers!
Checking Zookeeper... and I see all the topics and partitions created successfully for 0.8.
Kafka 0.7 does not work!
Why Kafka 0.7? I am using Kafka Spout from Storm which is made for Kafka 0.7.
First I just want to run a CLI-based producer/consumer for Kafka 0.7, which I am unable to do. I carry out the following steps:
I delete all the topics/partitions etc. in ZooKeeper that were created by my Kafka 0.8 setup.
I change the dataDir in zoo.cfg to point to a different location.
Now I start the Kafka 0.7 server. It starts successfully. However, I don't know why it registers the broker topics I deleted again.
Now I start the Kafka producer:
bin/kafka-console-producer.sh --zookeeper localhost:2181 --topic topicime
& it starts successfully:
[2013-06-28 14:06:05,521] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2013-06-28 14:06:05,606] INFO Creating async producer for broker id = 0 at 0:0 (kafka.producer.ProducerPool)
Time to send some messages & oops I get this error:
[2013-06-28 14:07:19,650] INFO Disconnecting from 0:0 (kafka.producer.SyncProducer)
[2013-06-28 14:07:19,653] ERROR Connection attempt to 0:0 failed, next attempt in 1 ms (kafka.producer.SyncProducer)
java.net.ConnectException: Connection refused
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:364)
at sun.nio.ch.Net.connect(Net.java:356)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623)
at kafka.producer.SyncProducer.connect(SyncProducer.scala:173)
at kafka.producer.SyncProducer.getOrMakeConnection(SyncProducer.scala:196)
at kafka.producer.SyncProducer.send(SyncProducer.scala:92)
at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:135)
at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:58)
at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:44)
at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:116)
at scala.collection.immutable.Stream.foreach(Stream.scala:254)
at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:70)
at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:41)
Note that Zookeeper is already running.
Any help would really be appreciated.
EDIT:
I don't even see the topic being created in zookeeper. I am running the following command:
bin/kafka-console-producer.sh --zookeeper localhost:2181 --topic topicime
After the command everything is fine & I get the following message:
[2013-06-28 14:30:17,614] INFO Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13f805c6673004b, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2013-06-28 14:30:17,615] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2013-06-28 14:30:17,700] INFO Creating async producer for broker id = 0 at 0:0 (kafka.producer.ProducerPool)
However, now when I type a string to send, I get the above error (Connection refused!).
INFO Disconnecting from 0:0 (kafka.producer.SyncProducer)
The above line has the error hidden in it: 0:0 is not a valid host and port. The solution is to explicitly set the host IP to be registered in ZooKeeper by setting the "hostname" property in server.properties.
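A minimal sketch of that fix in the Kafka 0.7 broker's server.properties (the IP address is a placeholder for the broker's real, reachable address):
# Kafka 0.7: address to register in ZooKeeper; producers and consumers connect here
hostname=192.168.1.10
After restarting the broker, the producer should see a real host:port in ZooKeeper instead of 0:0.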
Consider checking out the storm-kafka fork, available at https://github.com/wurstmeister/storm-kafka-0.8-plus
I'm installing it right now for our servers =).