Kafka Producer : Handle Exception in Async Send with Callback - apache-kafka

I need to catch exceptions when sending asynchronously to Kafka. The Kafka producer API provides the function send(ProducerRecord record, Callback callback). But when I tested it against the following two scenarios:
Kafka broker down
Topic not pre-created
the callbacks are not called. Instead I get warnings in the log for the unsuccessful send (as shown below).
Questions:
So are the callbacks called only for specific exceptions?
When does the Kafka client try to connect to the broker during an async send: on every batch send, or periodically?
Kafka Warning Image
Note : I am also using linger.ms setting of 25 sec to batch send my records.
import java.io.IOException;
import java.util.Properties;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ProducerDemo {
    static KafkaProducer<String, String> producer;

    public static void main(String[] args) throws IOException {
        final Logger logger = LoggerFactory.getLogger(ProducerDemo.class);
        Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.ACKS_CONFIG, "1");
        properties.setProperty(ProducerConfig.LINGER_MS_CONFIG, "30000");
        producer = new KafkaProducer<String, String>(properties);
        String topic = "first_topic";
        for (int i = 0; i < 5; i++) {
            String value = "hello world " + Integer.toString(i);
            String key = "id_" + Integer.toString(i);
            ProducerRecord<String, String> record = new ProducerRecord<String, String>(topic, key, value);
            producer.send(record, new Callback() {
                public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                    // executed every time a record is successfully sent or an exception is thrown
                    if (e == null) {
                        // No exception
                    } else {
                        // Exception handling
                    }
                }
            });
        }
        // close() blocks until previously sent records have completed
        producer.close();
    }
}

You get those warnings for a non-existent topic because of a resilience mechanism built into KafkaProducer. If you wait a bit longer (it should be 60 seconds by default), the callback will eventually be called:
Here's my snippet:
So, when something goes wrong and the async send does not succeed, it will eventually fail with a failed future and/or a callback invoked with an exception.
If you are not running transactionally, this can still mean that some messages from the batch reached the broker while others did not.
This will almost certainly be a problem if you need a blocking-style acknowledgement to an upstream system (such as an HTTP ingestion interface) for every message sent to Kafka. The only way to get that is to block on every message with the future's get, as described in the documentation:
In general, I've noticed a lot of questions related to KafkaProducer delivery semantics and guarantees. They could definitely be documented better.
One more thing, since you mentioned linger.ms:
Note that records that arrive close together in time will generally
batch together even with linger.ms=0 so under heavy load batching will
occur regardless of the linger configuration
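
The blocking approach mentioned above (calling get on the returned future) can be sketched without a broker. This is only a minimal model: `sendAsync` is a hypothetical stand-in for `producer.send(record)`, and the offset value and error message are made up for illustration. With the real client you would call `get` on the `Future` returned by `send` in the same way.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BlockingSendSketch {

    // Hypothetical stand-in for producer.send(record): the future completes with
    // an offset, or fails when the simulated broker is unavailable.
    static Future<Long> sendAsync(String value, boolean brokerUp) {
        CompletableFuture<Long> future = new CompletableFuture<>();
        if (brokerUp) {
            future.complete(42L);
        } else {
            future.completeExceptionally(new IllegalStateException("broker unavailable"));
        }
        return future;
    }

    // Blocks until the outcome is known and turns it into a per-message acknowledgement.
    static String sendAndAcknowledge(String value, boolean brokerUp) {
        try {
            long offset = sendAsync(value, brokerUp).get(5, TimeUnit.SECONDS);
            return "ACK offset=" + offset;
        } catch (ExecutionException e) {
            // An asynchronous failure surfaces here, wrapped in ExecutionException.
            return "NACK " + e.getCause().getMessage();
        } catch (TimeoutException e) {
            return "NACK timed out";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "NACK interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndAcknowledge("hello", true));
        System.out.println(sendAndAcknowledge("hello", false));
    }
}
```

With the real producer, blocking like this after every send means each record waits for its own batch to go out, so a large linger.ms (30 seconds in the question's code) would likely be added to every call.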

For the first question, here is the answer.
According to the Apache Kafka documentation, you can capture the exceptions listed at the link below in the onCompletion method when you implement the Callback interface:
https://kafka.apache.org/25/javadoc/org/apache/kafka/clients/producer/Callback.html
For the second question, the combination of the properties below controls when records are sent and, as far as I understand, it is the same for synchronous and asynchronous calls.
linger.ms
max.block.ms
https://kafka.apache.org/documentation/#linger.ms

So are the callbacks called only for specific exceptions ?
Yes, that's how it works. From documentation (2.5.0):
* Fully non-blocking usage can make use of the {@link Callback} parameter to provide a callback that
* will be invoked when the request is complete.
Notice the important part: when the request is complete, which means that the producer must have accepted the record and sent the ProduceRequest to the Kafka broker. Without digging too deep into internals, this means that broker metadata must be present and the partition must exist.
For a formal specification, take a good look at send()'s Javadoc and possibly at KafkaProducer's implementation of the doSend method. There you will see that several exceptions can be thrown by the submitting call itself (instead of returning a future and invoking the callback), e.g.:
if broker metadata is not available in timeout given,
if data could not be serialized,
if serialized form was too large, etc.
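
The distinction above, between failures raised by the send call itself and failures reported through the callback, can be modeled without a broker. In this sketch `send`, `SendCallback`, and the failure conditions are hypothetical stand-ins, not the Kafka client's code; which real exception takes which path is something to confirm against send()'s Javadoc for your client version. The point is only that a caller may need both a try/catch around the call and an error check inside the callback.

```java
import java.util.ArrayList;
import java.util.List;

class SendFailurePathsSketch {

    // Hypothetical counterpart of Kafka's Callback interface.
    interface SendCallback {
        void onCompletion(Long offset, Exception error);
    }

    // Hypothetical send: a record that cannot be serialized is rejected by
    // throwing to the caller; a delivery problem is reported via the callback.
    static void send(String value, boolean deliveryFails, SendCallback callback) {
        if (value == null) {
            throw new IllegalArgumentException("cannot serialize null");
        }
        // From here on the record counts as accepted; the outcome goes to the callback.
        if (deliveryFails) {
            callback.onCompletion(null, new IllegalStateException("delivery failed"));
        } else {
            callback.onCompletion(0L, null);
        }
    }

    // Records what the caller observes on each path.
    static List<String> trySend(String value, boolean deliveryFails) {
        List<String> events = new ArrayList<>();
        try {
            send(value, deliveryFails, (offset, error) ->
                    events.add(error == null ? "callback: ok" : "callback: " + error.getMessage()));
        } catch (RuntimeException e) {
            events.add("thrown: " + e.getMessage());
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(trySend("hello", false));
        System.out.println(trySend("hello", true));
        System.out.println(trySend(null, false));
    }
}
```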

Quarkus/Smallrye reactive kafka - Endpoint success/failure response from Message

I'm looking to respond to a REST endpoint with a Success/Failure response that dynamically accepts a topic as a query param. In Quarkus with smallrye reactive messaging the code would look something like below wrapping the payload with OutgoingKafkaRecordMetadata
i.e. https://myendpoint/publishToKafka?topic=myDynamicTopic
@Channel("test")
Emitter<byte[]> kafkaEmitter;

@POST
@Path("/publishToKafka")
public CompletionStage<Void> publishRecord(@QueryParam("topic") String topic, byte[] payload) {
    kafkaEmitter.send(Message.of(payload).addMetadata(OutgoingKafkaRecordMetadata.<String>builder()
            .withKey("my-key")
            .withTopic("myDynamicTopic")
            .build()));
}
From the Quarkus docs: "If the endpoint does not return a CompletionStage, the HTTP response may be written before the message is sent to Kafka, and so failures won’t be reported to the user." The example here describes this process when you send a payload directly (i.e. emitter.send(payload), which returns a CompletionStage, whereas emitter.send(message) returns void), but this requires configuring the topic in advance. Is it possible to specify metadata with a Message and still respond to the calling client with a success/failure response? (I don't mind whether it's with Emitter and CompletionStage or MutinyEmitter and Uni.)
Any advice or suggestions would be appreciated.
Because you use a Message (as you need to specify the topic), you need something a bit more convoluted:
@Channel("test")
Emitter<byte[]> kafkaEmitter;

@POST
@Path("/publishToKafka")
public CompletionStage<Void> publishRecord(@QueryParam("topic") String topic, byte[] payload) {
    // Completed by the ack/nack handlers below and returned to the caller.
    CompletableFuture<Void> future = new CompletableFuture<>();
    Message<byte[]> message = Message.of(payload).addMetadata(OutgoingKafkaRecordMetadata.<String>builder()
            .withKey("my-key")
            .withTopic("myDynamicTopic")
            .build());
    message = message.withAck(() -> {
        // Message acknowledged: report success to the HTTP caller.
        future.complete(null);
        return CompletableFuture.completedFuture(null);
    }).withNack(t -> {
        // Message rejected: pass the cause on to the HTTP caller.
        future.completeExceptionally(t);
        return CompletableFuture.completedFuture(null);
    });
    kafkaEmitter.send(message);
    return future;
}
In this snippet, I also attach the ack and nack handlers, which are called when the message is either acknowledged (accepted by the broker) or rejected (something went wrong).
These callbacks report to future, a CompletableFuture created in the method. This is the object to return, as it does what you want: it indicates the outcome.
I know the callbacks are slightly complicated. This is mainly due to the spec: we have to return CompletableFuture.completedFuture(...) to acknowledge that the nack process itself was successful. If we returned future instead (which we have just completed with future.completeExceptionally(t)), this would be interpreted as a failure during the nack process. It would basically be the equivalent of a throw within a catch block in the imperative world.
Fortunately, an easier version will be available soonish (no worries, we won't break anything).

Error handling in Spring Cloud Kafka Streams

I'm using Spring Cloud Stream with Kafka Streams. Let's say I have a processor, a Function that converts a KStream of Strings to a KStream of CityProgrammes. It invokes an API to find the City by name, followed by another transformation that finds any events near that city.
Now the problem is that if any error happens during the transformation, the whole application stops. I want to send that one particular message to a DLQ and move along. I've been reading for days and everyone suggests handling errors within the called services, but that is nonsense in my opinion; besides, I still need to return a KStream: how do I do that within a catch?
I also looked at UncaughtExceptionHandler, but it is not aware of the message and can only restart the processing, which won't skip the invalid message.
This might sound like an A-B problem, so to rephrase the question: how do I maintain the flow in a KStream when an exception occurs and send the invalid item to the DLQ?
When it comes to the application-level errors you have, it is up to the application itself how the error is handled. Kafka Streams and the Spring Cloud Stream binder mainly support deserialization and serialization errors at the framework level. Although that is the case, I think your scenario can be handled. If you are using Kafka Client prior to 2.8, here is an SO answer I gave before on something similar: https://stackoverflow.com/a/66749750/2070861
If you are using Kafka/Streams 2.8, here is an idea that you can use. However, the code below should only be used as a starting point. Adjust it according to your use case. Read more on how branching works in Kafka Streams 2.8. The branching API is significantly refactored in 2.8 from the prior versions.
public Function<KStream<?, String>, KStream<?, Foo>> convert() {
    // Single-slot holder so the branch predicate can hand its result to the map step.
    Foo[] foo = new Foo[1];
    return input -> {
        final Map<String, ? extends KStream<?, String>> branches =
                input.split(Named.as("foo-")).branch((key, value) -> {
                    try {
                        foo[0] = new Foo(); // your API call for the CityProgramme conversion here, possibly
                        return true;
                    }
                    catch (Exception e) {
                        // Conversion failed: send the raw value to the DLT and keep it out of this branch.
                        Message<?> message = MessageBuilder.withPayload(value).build();
                        streamBridge.send("to-my-dlt", message);
                        return false;
                    }
                }, Branched.as("bar"))
                .defaultBranch();
        // Branch names are the split prefix plus the Branched name.
        final KStream<?, String> kStream = branches.get("foo-bar");
        return kStream.map((key, value) -> new KeyValue<>("", foo[0]));
    };
}
The default branch is ignored in this code because that only contains the records that threw exceptions. Those were handled by the catch statement above in which we send the records to a DLT programmatically. Finally, we get the good records and map them to a new KStream and send it through the outbound.
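
The idea behind that code (convert inside a try/catch, divert failures to a dead-letter destination, keep going with the rest) can be illustrated with plain collections. This is only a sketch under stated assumptions: `DeadLetterSketch`, `Result`, and `convertAll` are hypothetical names, and the lists stand in for the KStream and the DLT topic; it is not Kafka Streams or Spring Cloud Stream code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class DeadLetterSketch {

    // Holds the two outcomes of processing a sequence of records.
    static final class Result<R> {
        final List<R> converted = new ArrayList<>();
        final List<String> deadLetters = new ArrayList<>();
    }

    // Converts each record in order. A record whose conversion throws is added
    // to the dead-letter list, and processing continues with the next record.
    static <R> Result<R> convertAll(List<String> records, Function<String, R> converter) {
        Result<R> result = new Result<>();
        for (String record : records) {
            try {
                result.converted.add(converter.apply(record));
            } catch (RuntimeException e) {
                result.deadLetters.add(record);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Integer::parseInt stands in for the API call that may fail.
        Result<Integer> result = convertAll(Arrays.asList("1", "oops", "3"), Integer::parseInt);
        System.out.println("converted:    " + result.converted);
        System.out.println("dead letters: " + result.deadLetters);
    }
}
```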

Artemis message routing

I'm using ActiveMQ Artemis 2.17.0 and I'm facing routing issues.
I've implemented a plugin that logs in the beforeMessageRoute hook, and I see that some messages are routed from topic.private.abc.task.V1 to topic.abc.rawmessage.V1.
There is no divert set up, and topics and queues are created dynamically by the producers and consumers. There is a setting that maps destinations matching clustered.*.> to virtual topics:
private TransportConfiguration getServerTransportConfiguration() {
    Map<String, Object> extraProps = new HashMap<>();
    extraProps.put("virtualTopicConsumerWildcards", "clustered.*.>;2");
    Map<String, Object> params = new HashMap<>();
    params.put("scheme", "tcp");
    params.put("port", port);
    params.put("host", hostname);
    return new TransportConfiguration("org.apache.activemq.artemis.core.remoting.impl.netty.NettyAcceptorFactory", params, "netty-acceptor", extraProps);
}
Both topic.private.abc.task.V1 and topic.abc.rawmessage.V1 are valid topics but they are not supposed to be linked.
What could explain that behavior?
Here is the plugin code:
@Override
public void beforeMessageRoute(Message message, RoutingContext context, boolean direct, boolean rejectDuplicates) throws ActiveMQException {
    Map<String, Object> map = new HashMap<>();
    map.put("RoutingContext", new RoutingContextLogView(context));
    logger.info(mapper.writeValueAsString(map));
    ActiveMQServerPlugin.super.beforeMessageRoute(message, context, direct, rejectDuplicates);
}

public class RoutingContextLogView {
    private RoutingContext routingContext;

    public RoutingContextLogView(RoutingContext routingContext) {
        this.routingContext = routingContext;
    }

    public String getAddress() {
        return routingContext.getAddress() != null ? routingContext.getAddress().toString() : null;
    }

    public String getPreviousAddress() {
        return routingContext.getPreviousAddress() != null ? routingContext.getPreviousAddress().toString() : null;
    }

    public String getRoutingType() {
        return routingContext.getRoutingType() != null ? routingContext.getRoutingType().name() : null;
    }

    public String getPreviousRoutingType() {
        return routingContext.getPreviousRoutingType() != null ? routingContext.getPreviousRoutingType().name() : null;
    }
}
Despite the odd logging, the flow followed by the message seems to be OK (i.e. the message is produced to topic.abc.rawmessage.V1 and consumed from topic.abc.rawmessage.V1). I'm just wondering why there is message routing at all and why the previousAddress in the RoutingContext is wrong.
The RoutingContext object, which is used internally by the broker, is reusable. This is done for performance reasons to prevent having to re-create the RoutingContext for every routing operation no matter what. As one might guess, routing messages is a very common operation in the broker so it pays to optimize it as much as possible. Reusing the RoutingContext means fewer objects are created and thrown away which means less garbage needs to be cleaned up which means fewer pauses and better overall performance by the broker.
The fact that the previousAddress is different here from the address where the current message is going to be routed is not a problem. It just means that the context won't be re-used for this routing operation and therefore will be cleared. As the name suggests, the beforeMessageRoute method is invoked before any routing logic is performed (e.g. clearing the RoutingContext). If you inspect the RoutingContext using afterMessageRoute then you should see that it was cleared and populated with the proper details.
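
The effect described above can be reproduced with a toy model of a reusable context. Everything in this sketch is hypothetical and heavily simplified (`Context`, `SHARED`, and `route` are not Artemis classes or methods); it only shows why a hook that runs before the routing logic can observe the address left over from an earlier operation, while a hook that runs afterwards sees the current one.

```java
class ReusableContextSketch {

    // Hypothetical, heavily simplified stand-in for a reusable routing context.
    static final class Context {
        String previousAddress; // address of the last operation that used this context

        void clear() {
            previousAddress = null;
        }
    }

    // One shared instance, reused across routing operations to avoid allocations.
    static final Context SHARED = new Context();

    // Routes to the given address and returns what a "before" hook and an
    // "after" hook would each read from the shared context.
    static String[] route(String address) {
        String seenBefore = SHARED.previousAddress; // hook runs before any routing logic

        if (!address.equals(SHARED.previousAddress)) {
            SHARED.clear(); // leftovers belong to another address, so discard them
        }
        SHARED.previousAddress = address; // populate for the current operation

        String seenAfter = SHARED.previousAddress; // hook runs after routing
        return new String[] {seenBefore, seenAfter};
    }

    public static void main(String[] args) {
        route("topic.private.abc.task.V1");
        String[] seen = route("topic.abc.rawmessage.V1");
        System.out.println("before: " + seen[0]);
        System.out.println("after:  " + seen[1]);
    }
}
```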
Message "sending" and message "routing" (both of which have plugin hooks) are related but distinct operations. A message is "sent" in response to a client operation. Sends always result in a route. However, not all routes are the results of sends. A message can be routed due to internal broker operations which do not involve a send (e.g. moving messages around a cluster, expiring a message, cancelling an undeliverable message to a dead-letter address, using a divert, etc.).
I would caution you against inspecting internal broker state (which can be subtle and nuanced) and assuming a problem exists when everything else indicates that the broker is functioning normally. In this case you said that you were "facing routing issues" and that "some message are routed from topic.private.abc.task.V1 to topic.abc.rawmessage.V1" when, in fact, there was no routing issue and messages were not actually being routed from topic.private.abc.task.V1 to topic.abc.rawmessage.V1. From what I can see everything is in fact functioning normally.

How shutdown KafkaListener when error occurs

I wrote a Listener in this way
@Autowired
private KafkaListenerEndpointRegistry kafkaListenerEndpointRegistry;

@KafkaListener(containerFactory = "cdcKafkaListenerContainerFactory", errorHandler = "errorHandler")
public void consume(@Payload String message) throws Exception {
    ...
}

@Bean
public KafkaListenerErrorHandler errorHandler() {
    return ((message, e) -> {
        kafkaListenerEndpointRegistry.stop();
        return null;
    });
}
In the @KafkaListener annotation I specified my error handler, which simply stops the consumer.
It seems to work, but I have some questions.
Is there a built-in error handler for this purpose? I've read that ContainerStoppingErrorHandler can be used, but I cannot set it because @KafkaListener's errorHandler accepts only beans of type KafkaListenerErrorHandler.
I see that kafkaListenerEndpointRegistry.stop(); performs a graceful stop, so the offset of the consumed message is committed before stopping.
What I would like to know is what happens when kafkaListenerEndpointRegistry.stop(); has been called and, before the listener is fully stopped, another message arrives on the topic.
Is this message consumed?
I imagine this scenario:
time0: kafkaListenerEndpointRegistry.stop() is called
time1: a message is pushed to the listened topic
time2: kafkaListenerEndpointRegistry.stop() completes the graceful stop
I'm worried about a message arriving at time1. What would happen in this scenario?
Do not stop the container within the listener.
ContainerStoppingErrorHandler is set on the container factory, not the annotation.
If you are using Spring Boot, just declare the error handler as a bean and boot will wire it in.
Otherwise add the error handler to the container factory bean.
With this error handler, throwing an exception will immediately stop the container.

ASP.NET Web Api: Delegate after Request

I have a problem with streams and the web api.
I return a stream which is consumed by the Web API. Currently, I put the socket back into a pool right after getting the stream, but this causes some errors.
Instead, I must put the socket into the pool AFTER the request has ended (the stream has been consumed and is now closed).
Is there a delegate for this, or some other best practice?
Example code:
public HttpResponseMessage Get(int fileId)
{
    HttpResponseMessage response = new HttpResponseMessage(HttpStatusCode.OK);
    Stream s = GetFile(fileId);
    response.Content = new StreamContent(s);
    return response;
}

Stream GetFile(int id)
{
    FSClient fs = GetFSClient();
    Stream s = fs.GetFileStream(id);
    AddFSToPool(fs); // problem: the client is pooled while its stream is still being read
    return s;
}
GetFile uses a self-written file server client.
It has an option to reuse file server connections. These connections are stored in a pool (the pool contains only unused connections). When the next request calls GetFSClient(), it gets a connected client from the pool (and removes it from the pool).
But if another request comes in and takes a connection from the pool (because it is marked unused), the stream may still be in use.
So I want to put the FSClient into the pool after the request has ended and the stream has been fully consumed.
Is there an entry point for that?
Stream is meant to be a volatile/temporary resource; no wonder it implements IDisposable.
Also, Stream is not thread-safe: it has a Position, so once it has been read to the end it must be reset to the start, and if two threads read the same stream they will most likely read different chunks.
As such, I would not even attempt to solve this problem. Reusing streams on a web site (inherently multi-user and multi-threaded) is not recommended.
UPDATE
As I said, I still think the best option is to rethink the solution, but if you need to register something that runs after the request finishes, use RegisterForDispose on the request:
public HttpResponseMessage Get(HttpRequestMessage req, int fileId)
{
    ....
    req.RegisterForDispose(myStream);
}