KAFKA commit issue - apache-kafka

I am very new to KAFKA broker and as per the requirement producer has to COMMIT messages.(Using librdkafka c/c++ libraries)
So, Firstly in my producer.c i used rd_kafka_commit (rk,NULL,0) but i got below error
RD_KAFKA_RESP_ERR__UNKNOWN_GROUP = -179 (Unknown client group)
Now I am using this method rd_kafka_commit_transaction() in my producer.c but getting compilation error below.
undefined reference to `rd_kafka_init_transactions'
Please help me how to proceed further.

Related

Kafka Producer Retry and Failed record handling

My requirement as follows -
apart from broker metadata related error -I try to simulate a RecordTooLargeException while sending the message to the Kafka Topic.
For the producer configuration I add acks: all and retries: 5
Also I use addCallback method to send the message.
I received org.apache.kafka.common.errors.RecordTooLargeException: The message is 2000103 bytes when serialized which is larger than 1048576, which is the value of the max.request.size configuration.
but I did not notice any retry ( 5 times ) in the log.
My requirement is retry 5 times , then marked the record as permanent failure and send back to the call back handler - for further reprocess the failed record( ex. send to DLT or DB)
How can I achieve this kind of retry and handling?
It's simple. As per theory KAFKA Producer API doesn't retry on RecordTooLargeException, that means it is a non-retriable exception. If you still want to break this and retry irrespectively, then you can catch that Exception string through the Search String when error returned from the broker and retry from the catch block as many as times you want.
KafkaProducer has two types of errors. Retriable errors are those that can be resolved by sending the message again. For example, a connection error can be resolved because the connection may get reestablished. A “not leader for partition” error can be resolved when a new leader is elected for the partition and the client metadata is refreshed. KafkaProducer can be configured to retry those errors automatically, so the application code will get retriable exceptions only when the number of retries was exhausted and the error was not resolved. Some errors will not be resolved by retrying — for example, “Message size too large.” In those cases, KafkaProducer will not attempt a retry and will return the exception immediately.
-- Kafka: The Definitive Guide 2nd Edition, Chapter 3
RecordTooLargeException is a non-retriable exception, retrying makes no sense if the max.request.size configuration does not change. Therefore, Kafka producer will not attempt a retry and will return the exception immediately. The callback handler will be triggered for further reprocess.

Getting error while publishing message to kafka topic

I am new to Kafka. I have written a simple JAVA program to generate a message using avro schema. I have generated a specific record. The record is generated successfully. My schema is not yet registered with my local environment. It is currently registered with some other environment.
I am using the apache kafka producer library to publish the message to my local environment kafka topic. Can I publish the message to the local topic or the schema needs to be registered with the local schema registry as well.
Below are the producer properties -
properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, KafkaAvroSerializer.class);
properties.put(KafkaAvroSerializerConfig.SCHEMA_REGISTRY_URL_CONFIG, "https://schema-registry.xxxx.service.dev:443");```
Error I am getting while publishing the message -
``` org.apache.kafka.common.errors.SerializationException: Error registering Avro schema:
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: User is denied operation Write on Subject: xxx.avro-value; error code: **40301**
The issue was kafka producer by default tries to register the schema on topic. So, I added the below - properties.put(KafkaAvroSerializerConfig.AUTO_REGISTER_SCHEMAS, false);
and it resolved the issue.

UnknownProducerIdException in Kafka streams when enabling exactly once

After enabling exactly once processing on a Kafka streams application, the following error appears in the logs:
ERROR o.a.k.s.p.internals.StreamTask - task [0_0] Failed to close producer
due to the following error:
org.apache.kafka.streams.errors.StreamsException: task [0_0] Abort
sending since an error caught with a previous record (key 222222 value
some-value timestamp 1519200902670) to topic exactly-once-test-topic-
v2 due to This exception is raised by the broker if it could not
locate the producer metadata associated with the producerId in
question. This could happen if, for instance, the producer's records
were deleted because their retention time had elapsed. Once the last
records of the producerId are removed, the producer's metadata is
removed from the broker, and future appends by the producer will
return this exception.
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:125)
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:48)
at org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:180)
at org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1199)
at org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:204)
at org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:187)
at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:627)
at org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:596)
at org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:557)
at org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:481)
at org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74)
at org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:692)
at org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101)
at org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:482)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:474)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.kafka.common.errors.UnknownProducerIdException
We've reproduced the issue with a minimal test case where we move messages from a source stream to another stream without any transformation. The source stream contains millions of messages produced over several months. The KafkaStreams object is created with the following StreamsConfig:
StreamsConfig.PROCESSING_GUARANTEE_CONFIG = "exactly_once"
StreamsConfig.APPLICATION_ID_CONFIG = "Some app id"
StreamsConfig.NUM_STREAM_THREADS_CONFIG = 1
ProducerConfig.BATCH_SIZE_CONFIG = 102400
The app is able to process some messages before the exception occurs.
Context information:
we're running a 5 node Kafka 1.1.0 cluster with 5 zookeeper nodes.
there are multiple instances of the app running
Has anyone seen this problem before or can give us any hints about what might be causing this behaviour?
Update
We created a new 1.1.0 cluster from scratch and started to process new messages without problems. However, when we imported old messages from the old cluster, we hit the same UnknownProducerIdException after a while.
Next we tried to set the cleanup.policy on the sink topic to compact while keeping the retention.ms at 3 years. Now the error did not occur. However, messages seem to have been lost. The source offset is 106 million and the sink offset is 100 million.
As explained in the comments, there currently seems to be a bug that may cause problems when replaying messages older than the (maximum configurable?) retention time.
At time of writing this is unresolved, the latest status can always be seen here:
https://issues.apache.org/jira/browse/KAFKA-6817

Kafka Producer is not retring when Broker is Down

I have setup up Kafka using version 0.9 with the basic configuration as
1 Broker 1 Topic and 1 Partition.
Below are Producer Configurations that I have added to enable the retry from Producer.
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.RETRIES_CONFIG, 5);
props.put(ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG, 500);
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 500);
props.put(ProducerConfig.METADATA_MAX_AGE_CONFIG, 50);
I understand from the documents that
Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error.
Both my Broker & Zookeeper are down and the retry operation is not working.
ERROR o.s.k.s.LoggingProducerListener - Exception thrown when sending a message to topic TestTopic1|
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 500 ms.
I need to know if I am missing anything here for the retry to work.
Resend (retry) works only if you have connection to the Broker and something happened during sending a message.
So, if your Broker is dead, there is no any reason to send message at all - no connection. And that is an exception about.
I think retries should work anyway, even if the broker is down. This is the whole reason to have retries in the first place. Could be a temporary network issue after all.
There is a bug in the Kafka 0.9.0.1 producer which causes retries not to work. See here.
Fixed in 0.9.0.2 (which is not released yet) and 0.10. I'd upgrade the broker to 0.10 and try again.
As #artem answered Kafka producer config is not designed to retry when broker is down. It only retries during transient errors which is pretty much useless to be honest. It beats me why spring-Kafka did not take care of it.
Anyways to solve the situation I handled this with #Retry config with springboot. Checkin this SO answer for details : https://stackoverflow.com/a/65248428/6621377

Kafka : Error from SyncGroup, The request timed out

Recently we are experiencing "Error from SyncGroup: The request timed out" frequently with the Java Kafka APIs.
This issue usually happens with few topic or consumer group in Kafka cluster. Does anyone can provide some pointers about this error?
As a workaround, if I change the consumer group name I don't see the error.
Broker Version : 0.9.0
Kafka client version : 0.9.0.1
Exception in thread "main" org.apache.kafka.common.KafkaException: Unexpected error from SyncGroup: The request timed out.
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupRequestHandler.handle(AbstractCoordinator.java:444)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$SyncGroupRequestHandler.handle(AbstractCoordinator.java:411)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:665)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:644)
at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:380)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:274)
#zer0Id0l
We have had the same problem recently. It happens because some Kafka Streams messages have meta information footprint which is more than a regular one (when you don't use Kafka Streams). To fix the issue, go to __consumer_offsets topic settings and set max.message.bytes param higher than it is by default. For example, in our case we have max.message.bytes = 20971520. That will completely solve your problem.