Node disconnected errors in Kafka Streams API - apache-kafka

I have created a Kafka consumer using the Kafka Streams API, and I'm able to fetch records from the Kafka topic and process them successfully.
I'm seeing the error below in the application logs ("Node -2 disconnected"), but there is no visible impact: the Streams application continues to fetch transactions from the topic successfully.
org.apache.kafka.clients.NetworkClient : [AdminClient clientId=consumer-5322b972-1ef9-4976-b7fa-39a934374757-admin] Node -2 disconnected.
Can someone explain what this error means, and is there any way to avoid it? I wrote the consumer using Spring Cloud Stream and Kafka Streams.

It simply means your client disconnected from a broker. That can happen for many reasons, such as a temporary network blip.
As long as more than one in-sync replica of your data is available, the consumer will continue to work.
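If the disconnect lines are just log noise, one option is to make the clients a little less eager about reconnecting; alternatively, you could raise the log level for org.apache.kafka.clients.NetworkClient. A minimal sketch of the first option, assuming a plain-Java Kafka Streams setup (the application id, broker address, and backoff values are placeholders, not recommendations):

```java
import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.streams.StreamsConfig;

public class ReconnectTuning {
    public static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");  // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
        // Wait a bit longer before the first reconnect attempt after a disconnect...
        props.put(CommonClientConfigs.RECONNECT_BACKOFF_MS_CONFIG, "1000");
        // ...and cap the exponentially increasing backoff between attempts.
        props.put(CommonClientConfigs.RECONNECT_BACKOFF_MAX_MS_CONFIG, "10000");
        return props;
    }
}
```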

Related

Kafka Streams app goes down when one of the brokers goes down and the other brokers are running

We have a Kafka Streams app using Spring Cloud Stream, connecting to 3 brokers on AWS MSK with replication factor 3.
There is regular patching once a month, during which only a single broker is down at a time. Yet the Kafka Streams app goes down (SHUTDOWN state) even when just one broker is down and the other 2 brokers are up, with the exception:
org.apache.kafka.clients.NetworkClient : [AdminClient clientId=blabla] Connection to node 1 could not be established. Broker may not be available.
Can someone guide me on ensuring the Kafka Streams app doesn't go down during this interval while the other 2 brokers are up?

Does a Kafka broker always check if it's the leader while responding to read/write requests

I am seeing org.apache.kafka.common.errors.NotLeaderForPartitionException on my producer, which I understand happens when the producer tries to produce messages to a broker that is not the leader for the partition.
Does that mean that each time a leader fulfills a write request, it first checks whether it is the leader?
If so, does that translate into a ZooKeeper request for every write, just to know whether the node is the leader?
How the Producer Gets Metadata About Brokers
The producer sends a metadata request with a list of topics to one of the brokers you supplied when configuring the producer.
The response from the broker contains the list of partitions in those topics and the leader for each partition. The producer caches this information, so it knows where to send each message.
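This is easy to observe from the Java client. A minimal sketch (the broker address and topic name are placeholders): KafkaProducer#partitionsFor issues a metadata request for the topic if nothing is cached, and each returned PartitionInfo carries the partition's current leader.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.serialization.StringSerializer;

public class MetadataPeek {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // partitionsFor() triggers a metadata request (if not already cached)
            // and returns each partition together with its current leader broker.
            for (PartitionInfo p : producer.partitionsFor("my-topic")) { // placeholder topic
                System.out.printf("partition %d -> leader %s%n", p.partition(), p.leader());
            }
        }
    }
}
```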
When the Producer Will Refresh Metadata
I think this depends on which Kafka client you use; there are some small differences between the Ruby, Java, and other clients. For example, in Java:
the producer fetches metadata when the client initializes, then periodically refreshes it based on an expiration time (as sketched below);
the producer also forces a metadata update when a request fails with an error such as InvalidMetadataException.
The ruby-kafka client, by contrast, usually refreshes metadata on initialization or when an error occurs.
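In the Java client, the periodic refresh is governed by the metadata.max.age.ms producer setting; a minimal sketch (the broker address is a placeholder and the interval shown is illustrative):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class MetadataRefreshConfig {
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        // Force a metadata refresh after this interval even if no errors have
        // occurred (the default is 300000 ms, i.e. 5 minutes).
        props.put(ProducerConfig.METADATA_MAX_AGE_CONFIG, "60000");
        return props;
    }
}
```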

A Kafka broker is gracefully shut down, and incorrect metadata was passed to the Kafka Connect client

To maintain the server, one of the 20 brokers was shut down gracefully, but the whole Kafka Connect cluster (sink) died with an NPE. The replication factor of all topics was more than 2; there were 50 topics and 200 partitions. Going through the error and the Kafka library source code, it appears the error occurred while the Connect client was caching metadata received from the broker, including the set of broker node ids and the partition info.
How can this happen, and how can it be dealt with in the future?
(Both broker and client are v2.3.1.)
This is a bug. The Connect cluster should not be negatively impacted by a broker shutting down, and it should not throw an NPE.
Please open a ticket at https://issues.apache.org/jira/projects/KAFKA/issues/. It's also best if you paste the stack trace as text instead of an image.

Kafka Streams application stops working after no messages have been read for a while

I have noticed that my Kafka Streams application stops working when it has not read new messages from the Kafka topic for a while. This is the third time I have seen it happen.
No messages have been produced to the topic for 5 days. My Kafka Streams application, which also hosts a spark-java web server, is still responsive. However, messages I produce to the Kafka topic are no longer read by Kafka Streams. When I restart the application, all messages are fetched from the broker.
How can I make my Kafka Streams application more resilient to this scenario? It feels as though Kafka Streams has an internal "timeout" after which it closes the connection to the Kafka broker when no messages have been received, but I could not find such a setting in the documentation.
I use Kafka 1.1.0 and Kafka Streams 1.0.0.
Kafka Streams does not have an internal timeout that permanently closes the connection to the Kafka broker; the broker, on the other hand, does close idle client connections after a timeout (connections.max.idle.ms). But Streams will keep trying to reconnect once it has processed result data ready to be sent to the brokers. So I'd suspect your observed issue came from some other cause.
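For reference, the idle timeout mentioned above also has a client-side counterpart that can be overridden from the Streams configuration. A minimal sketch, assuming a plain-Java setup (the application id, broker address, and the 15-minute value are placeholders, not recommendations):

```java
import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.streams.StreamsConfig;

public class IdleConnectionTuning {
    public static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app"); // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // placeholder
        // Client side: how long a connection may sit idle before the client
        // itself closes it (default 540000 ms = 9 minutes). The broker enforces
        // its own connections.max.idle.ms (default 600000 ms = 10 minutes).
        props.put(CommonClientConfigs.CONNECTIONS_MAX_IDLE_MS_CONFIG, "900000");
        return props;
    }
}
```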
Could you share a sketch of your application topology and the config properties you used, so I can better understand your issue?

Sending only one message per broker in a Kafka cluster

I am using a single-cluster, multi-broker Kafka setup with Spark Streaming to fetch data. The problem I am facing is that, under one topic, the same message gets sent across all brokers.
How do I limit Kafka to sending only one message per broker so that there is no duplication of messages under one topic?