How do I determine the request timeout of my Kafka producer? - apache-kafka

I have a Kafka producer on a live machine which writes its data to a Kafka topic. What happens when there is a network issue or Kafka goes down? Does that affect the performance of the live machine, given that the producer will time out each send request after the configured request timeout (in milliseconds)?
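As a rough illustration (not part of the question itself), here is a minimal Java producer sketch showing the timeout-related settings that bound how long a send can hang when the broker is unreachable. The broker address and topic name are placeholders, and delivery.timeout.ms requires kafka-clients 2.1 or newer.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.Properties;

    public class TimeoutAwareProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // How long the producer waits for a broker response to a single request.
            props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 5000);
            // Upper bound on the total time send() may take, retries included (kafka-clients 2.1+).
            props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 30000);
            // How long send() itself may block, e.g. waiting for metadata while the cluster is unreachable.
            props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 10000);

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("events", "key", "value"), (metadata, exception) -> {
                    if (exception != null) {
                        // A TimeoutException surfaces here instead of hanging the calling thread forever.
                        System.err.println("Send failed: " + exception.getMessage());
                    }
                });
            }
        }
    }

With this setup a broker outage shows up as a callback with a TimeoutException, and the calling thread on the live machine is never blocked longer than max.block.ms.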

Related

Do we need a Kafka connection pool on the client?

I have a REST webservice with traffic of about 1 million requests per day. In that REST service, for each request, we send a message to a remote Kafka topic (Confluent Platform). Do I have to set up a Kafka connection pool to improve performance?
No, you don't need Kafka connection pooling. The Kafka client keeps its connections to the Kafka cluster open and manages them for you. As long as you have enough partitions configured for the Kafka topic, it should be alright.
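To make that concrete, here is a minimal sketch (not from the original answer) of reusing one thread-safe KafkaProducer across all REST request handlers instead of pooling connections; the broker address is a placeholder.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.Properties;

    public final class SharedProducer {
        // A single thread-safe producer shared by the whole web service; no pool needed.
        private static final Producer<String, String> PRODUCER = create();

        private static Producer<String, String> create() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            return new KafkaProducer<>(props);
        }

        // Call this from each REST request handler; send() is asynchronous and thread-safe.
        public static void publish(String topic, String key, String value) {
            PRODUCER.send(new ProducerRecord<>(topic, key, value));
        }

        private SharedProducer() { }
    }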

Kafka Consumer as a service which continuously polls messages from a Kafka Topic

I need to create an independent server which continuously polls data from a Kafka topic. So far I have created a Vert.x server and a Kafka consumer. The issue is that when we start the server, it drains the queue and polls all the messages currently present, but when new messages arrive later it is not able to consume them. I need to implement some logic in my server that enables it to continuously poll the data and drain the queue.
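For reference, the standard Java consumer pattern for this is an endless poll loop: poll() returns whatever is currently available (possibly nothing) and keeps picking up new messages as they arrive. A minimal sketch, ignoring the Vert.x integration and using placeholder broker address, group id, and topic name:

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;

    public class PollingService {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "polling-service");       // placeholder group id
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("my-topic")); // placeholder topic name
                // Keep calling poll(); each call returns the records available right now
                // (possibly none) and later calls pick up new messages as they arrive.
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                    }
                }
            }
        }
    }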

Does a Kafka broker always check whether it is the leader while responding to read/write requests?

I am seeing org.apache.kafka.common.errors.NotLeaderForPartitionException on my producer, which I understand happens when the producer tries to produce messages to a broker that is not the leader for the partition.
Does that mean that each time a leader fulfills a write request, it first checks whether it is the leader or not?
If yes, does that translate to a ZooKeeper request for every write request to find out whether the node is the leader?
How the Producer Gets Metadata About Brokers
The producer sends a metadata request with a list of topics to one of the brokers you supplied when configuring the producer.
The response from the broker contains a list of the partitions in those topics and the leader for each partition. The producer caches this information and therefore knows where to direct the messages.
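As a small illustration (my addition, not part of the original answer), you can inspect that cached mapping yourself with the producer's partitionsFor() call, which returns the partitions of a topic together with the current leader node for each; the broker address and topic name below are placeholders.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.PartitionInfo;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.Properties;

    public class LeaderLookup {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // partitionsFor() triggers a metadata request if the topic is not cached yet
                // and returns the partition-to-leader mapping the producer will use.
                for (PartitionInfo p : producer.partitionsFor("my-topic")) { // placeholder topic name
                    System.out.printf("partition %d -> leader %s%n", p.partition(), p.leader());
                }
            }
        }
    }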
When the Producer Refreshes Metadata
I think this depends on which Kafka client you use; there are some small differences between the Ruby, Java, and other clients. For example, in Java:
the producer fetches metadata when the client initializes, and then periodically refreshes it based on an expiration time;
the producer also forces a metadata refresh when a request error occurs, such as an InvalidMetadataException.
The ruby-kafka client, however, usually refreshes metadata only on initialization or when an error occurs.
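For the Java client, that periodic refresh interval is controlled by metadata.max.age.ms (default 5 minutes). A minimal sketch with a placeholder broker address:

    import org.apache.kafka.clients.producer.ProducerConfig;

    import java.util.Properties;

    public class MetadataRefreshConfig {
        public static Properties producerProps() {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
            // Force a proactive metadata refresh every 60 s even without errors
            // (the default metadata.max.age.ms is 300000 ms = 5 minutes).
            props.put(ProducerConfig.METADATA_MAX_AGE_CONFIG, 60000);
            return props;
        }
    }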

Kafka Streams application stops working after no message have been read for a while

I have noticed that my Kafka Streams application stops working when it has not read new messages from the Kafka topic for a while. It is the third time that I have seen this happen.
No messages have been produced to the topic for 5 days. My Kafka Streams application, which also hosts a spark-java webserver, is still responsive. However, the messages I produce to the Kafka topic are no longer read by Kafka Streams. When I restart the application, all messages are fetched from the broker.
How can I make my Kafka Streams application more resilient to this kind of scenario? It feels as if Kafka Streams has an internal "timeout" after which it closes the connection to the Kafka broker when no messages have been received. I could not find such a setting in the documentation.
I use Kafka 1.1.0 and Kafka Streams 1.0.0.
Kafka Streams does not have an internal timeout that controls when to permanently close a connection to the Kafka broker; the Kafka broker, on the other hand, does have a timeout value for closing idle connections from clients. But Streams will keep trying to reconnect once it has some processed result data that is ready to be sent to the brokers. So I'd suspect your observed issue comes from some other cause.
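If you want to see where those idle-connection timeouts live in a Streams application, here is a hedged sketch (my addition, not from the original answer) that overrides connections.max.idle.ms for the embedded consumer and producer clients; the application id and broker address are placeholders.

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.streams.StreamsConfig;

    import java.util.Properties;

    public class IdleConnectionConfig {
        public static Properties streamsProps() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");  // placeholder app id
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
            // Raise the client-side idle timeout from the default 9 minutes to 10 minutes.
            // The broker still closes idle connections after its own connections.max.idle.ms
            // (default 10 minutes), and the clients simply reconnect when needed.
            props.put(StreamsConfig.consumerPrefix(ConsumerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG), 600000);
            props.put(StreamsConfig.producerPrefix(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG), 600000);
            return props;
        }
    }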
Could you share a sketch of your application topology and the config properties you used, so I can better understand your issue?

Storm Kafka spout does not commit offsets in local cluster; spout retrieves the same message repeatedly

I have set up a Storm topology which gets its input data from a Kafka server, using the kafka-storm package. I have successfully implemented the connection between the Kafka server and the Storm topology in a local cluster, but I am facing some issues retrieving data from the Kafka server.
The Kafka spout retrieves the same message repeatedly at runtime, even though I set spoutconfig.forceFromStart=false and spoutconfig.startOffsetTime=-1.
Note: when I stop and restart the cluster, the data is sent correctly based on the latest offset.
I figured it out by myself: the issue was with the output collector's ack() method. I had implemented my bolt with BaseBasicBolt, and it did not acknowledge the KafkaSpout. I replaced it with BaseRichBolt and called this.collector.ack(tuple) manually.
Now it works fine.
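For anyone hitting the same problem, here is a rough sketch of a bolt built on BaseRichBolt that acks (or fails) each tuple explicitly; package names and the prepare() signature assume Storm 2.x, and the processing step is left as a placeholder.

    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Tuple;

    import java.util.Map;

    public class AckingBolt extends BaseRichBolt {
        private OutputCollector collector;

        @Override
        public void prepare(Map<String, Object> conf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void execute(Tuple tuple) {
            try {
                // ... process the tuple here (placeholder) ...
                collector.ack(tuple);  // tells the Kafka spout the message was handled, so it is not replayed
            } catch (Exception e) {
                collector.fail(tuple); // the spout will replay the tuple
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // no output stream in this sketch
        }
    }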