KafkaTemplate for dead letter handler in Spring-Kafka - apache-kafka

Do I need a separate KafkaTemplate for DeadLetterPublishingRecoverer?
I have a KafkaTemplate used to send messages to Kafka, and I also have a KafkaListenerContainerFactory with a SeekToCurrentErrorHandler and a DeadLetterPublishingRecoverer, which in turn requires me to provide a KafkaTemplate. Do I really need this second template just for DLQ handling, or could I use that same KafkaTemplate for my normal Kafka operations? I suppose I could also use a non-generic KafkaTemplate for both, but I suspect that is far from best practice.

If the generic types are different, you can either configure two templates, or use <Object, Object> (as long as your serializer can handle both types).
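For example, a minimal sketch (bean names, generic types and back-off values are illustrative, not prescriptive) that reuses a single KafkaTemplate<Object, Object> for both regular sends and the recoverer:
@Bean
public KafkaTemplate<Object, Object> kafkaTemplate(ProducerFactory<Object, Object> pf) {
    return new KafkaTemplate<>(pf);
}

@Bean
public ConcurrentKafkaListenerContainerFactory<Object, Object> kafkaListenerContainerFactory(
        ConsumerFactory<Object, Object> cf, KafkaTemplate<Object, Object> template) {
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(cf);
    // Reuse the same template to publish failed records to the <topic>.DLT dead letter topic
    factory.setErrorHandler(new SeekToCurrentErrorHandler(
            new DeadLetterPublishingRecoverer(template), new FixedBackOff(1000L, 2L)));
    return factory;
}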

Related

Does KStream filter consume every message?

I have used Kafka in the past, but never the streams API. I am tasked with building a scalable service that accepts websocket connections and routes outbound messages from a central topic to the correct session based on user id.
This looks ridiculously simple using KStream<String, Object>. From one online tutorial:
builder.stream(inputTopic, Consumed.with(Serdes.String(), publicationSerde))
.filter((name, publication) -> "George R. R. Martin".equals(publication.getName()))
.to(outputTopic, Produced.with(Serdes.String(), publicationSerde));
But does the filter command consume every message from the topic and perform a filter in application space? Or does KStream<K, V> filter(Predicate<? super K,? super V> predicate) contain hooks into the inner workings of Kafka that allow it only to receive messages matching the correct key?
The wording in the KStream<K,V> javadoc seems to suggest the former: "consumed message by message."
If the only purpose of the filter is to consume every message of a topic and throw away those that are not relevant, I could do that by hand.
You are correct - messages need to be deserialized, then inspected against a predicate (in application space).
throw away those that are not relevant, I could do that by hand
Sure, you could, but Kafka Streams has useful methods for defining session windows. Plus, you wouldn't need to define a consumer and producer instance to forward to new topics.
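For reference, doing it "by hand" looks roughly like this (a sketch only; topic names, properties and the Publication type are placeholders), and it performs the same per-record work that the Streams filter does:
try (KafkaConsumer<String, Publication> consumer = new KafkaConsumer<>(consumerProps);
     KafkaProducer<String, Publication> producer = new KafkaProducer<>(producerProps)) {
    consumer.subscribe(Collections.singletonList(inputTopic));
    while (true) {
        // Every record is fetched and deserialized before the predicate can run
        for (ConsumerRecord<String, Publication> record : consumer.poll(Duration.ofMillis(500))) {
            if ("George R. R. Martin".equals(record.value().getName())) {
                producer.send(new ProducerRecord<>(outputTopic, record.key(), record.value()));
            }
        }
    }
}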

How to Handle Deserialization Exception & Converting to New Schema with Spring Cloud Stream?

I am having trouble understanding how to properly handle a deserialization exception within Spring Cloud Stream, primarily because the framework implementation does not support headers and the DLQ is supposed to use a different schema than the original message. So the process flow needs to be: consume message -> deserialization error -> DlqHandler -> serialize with NEW schema -> send to DLQ
The documentation linked below doesn't make it clear whether that is even possible. I have seen quite a few examples of SeekToCurrentErrorHandler for Spring-Kafka, but to my knowledge those are different implementations and don't show how I could catch the deserialization error and then run custom code to serialize into a new format and go from there.
My main question is: Is capturing the deserialization exception and reserializing possible with spring cloud streams (kafka)?
Spring Cloud Documentation for DLQ
Yes, but not using the binding retry or DLQ properties.
Instead, add a ListenerContainerCustomizer bean and customize the binding's listener container with a SeekToCurrentErrorHandler configured for the retries you need and, probably, a subclass of the DeadLetterPublishingRecoverer using an appropriately configured KafkaTemplate and possibly overriding the createProducerRecord method.
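A rough sketch of that wiring (bean names and generic types are assumptions, and the createProducerRecord override is left out because its exact signature varies between Spring Kafka versions):
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<byte[], byte[]>> customizer(
        KafkaTemplate<Object, Object> dlqTemplate) { // template configured with the new DLQ schema's serializer
    return (container, destination, group) -> {
        // Subclass DeadLetterPublishingRecoverer and override createProducerRecord
        // if the DLQ record needs to be built with a different schema
        DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(dlqTemplate);
        container.setErrorHandler(new SeekToCurrentErrorHandler(recoverer, new FixedBackOff(0L, 3L)));
    };
}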

KStreams with multi event types topic

I'm struggling with Kafka and its multi-event-types-per-topic concept. According to this article, there are cases where it's fine to keep events of different types in a single topic, and I believe my case meets all the prerequisites. Without going deep into the idea: I want to keep commands and events in the same topic under the same key to preserve the order of the events.
In my case I'm using Avro and would like to use io.confluent.kafka.serializers.subject.RecordNameStrategy for serialization of the events coming from the topic, and I would like to use the Kafka Streams API to avoid the low-level API. However, KStream is a Java class designed around generics and type parameters, and I'm not sure of the right way to express such a polymorphic topic with it, since I'm using Avro records and autogenerated classes, where I cannot build an inheritance tree of objects or use composition to encapsulate the payload inside some wrapper class.
Using Object in the KStream definition, letting the schema registry convert the data, and then filtering by type doesn't look right to me...
I also thought about defining different consumers for the same topic that are supposed to read only events of the right type, but I don't have a clue how to filter them before they reach my KStream...
And here is my question: what would be the right way of achieving this with KStream?
I will appreciate any help or ideas. Thanks!
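To make the question concrete, here is a rough sketch of the "filter by type" idea mentioned above (OrderCommand and OrderEvent are hypothetical Avro-generated classes, and valueSerde stands for an Avro Serde configured with RecordNameStrategy):
KStream<String, SpecificRecord> all = builder.stream(topic,
        Consumed.with(Serdes.String(), valueSerde));

// Split by concrete generated class; every record is still deserialized first
KStream<String, OrderCommand> commands = all
        .filter((key, value) -> value instanceof OrderCommand)
        .mapValues(value -> (OrderCommand) value);
KStream<String, OrderEvent> events = all
        .filter((key, value) -> value instanceof OrderEvent)
        .mapValues(value -> (OrderEvent) value);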

Handle Deserialization Error (Dead Letter Queue) in a kafka consumer

After some research I found a few configurations I can use to handle this:
default.deserialization.exception.handler - from the StreamsConfig
errors.deadletterqueue.topic.name - from the SinkConnector config
I can't seem to find the equivalent configuration for a simple consumer.
I want to start a simple consumer and have DLQ handling, whether by just specifying the DLQ topic and letting Kafka produce to it (like the sink connector) or by providing my own class that will produce it (like the Streams config).
How can I achieve a DLQ with a simple consumer?
EDIT: Another option I figured out is simply handling it in my Deserializer class: just catch the exception there and produce the record to my DLQ.
But that would mean I'll need to create a producer in my deserializer class...
Is this the best practice for handling a DLQ from a consumer?
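For illustration, a hand-rolled variant (all names here are placeholders) is to consume raw byte[] and deserialize in the poll loop, so the DLQ producer stays out of the Deserializer class:
try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps);
     KafkaProducer<byte[], byte[]> dlqProducer = new KafkaProducer<>(producerProps)) {
    consumer.subscribe(Collections.singletonList("my-topic"));
    while (true) {
        for (ConsumerRecord<byte[], byte[]> record : consumer.poll(Duration.ofMillis(500))) {
            try {
                MyEvent event = deserialize(record.value()); // hypothetical deserialization helper
                process(event);                              // hypothetical business logic
            } catch (SerializationException e) {
                // Forward the original bytes untouched to the dead letter topic
                dlqProducer.send(new ProducerRecord<>("my-topic.DLQ", record.key(), record.value()));
            }
        }
    }
}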

Kafka Producer design - multiple topics

I'm trying to implement a Kafka producer/consumer model and am deliberating whether creating a separate publisher thread per topic would be preferable to having a single publisher handle multiple topics. Any help would be appreciated.
PS: I'm new to Kafka
By separate publisher thread, I think you mean separate producer objects. If so:
Since messages are stored as key-value pairs in Kafka, different topics can have different key-value types.
So if your Kafka topics have different key-value types, for example:
Topic1 - key:String, value:Student
Topic2 - key:Long, value:Teacher
and so on, then you should use multiple producers. This is because the KafkaProducer class asks you for the key and value serializers when the object is constructed.
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // broker address required to construct the producer
props.put("key.serializer", StringSerializer.class);
props.put("value.serializer", LongSerializer.class);
// The serializers are fixed for the lifetime of this producer instance
KafkaProducer<String, Long> producer = new KafkaProducer<>(props);
You may also write a generic serializer for all the types, but it is better to know beforehand what we are doing with the producer.
I prefer the Keep It Stupid Simple (KISS) approach for obvious reasons: one producer (or multiple producers), one topic each.
From Wikipedia,
The KISS principle states that most systems work best if they are kept simple rather than made complicated; therefore, simplicity should be a key goal in design, and unnecessary complexity should be avoided.
As for the possibility of one producer supporting multiple topics, I would also stay away from that.
Starting with version 2.5, you can use a RoutingKafkaTemplate to select the producer at runtime, based on the destination topic name.
https://docs.spring.io/spring-kafka/reference/html/#routing-template
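A sketch along the lines of the linked reference (topic patterns and serializer choices are illustrative):
@Bean
public RoutingKafkaTemplate routingTemplate(GenericApplicationContext context,
        ProducerFactory<Object, Object> pf) {
    // Clone the default producer factory with a different value serializer
    Map<String, Object> configs = new HashMap<>(pf.getConfigurationProperties());
    configs.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
    DefaultKafkaProducerFactory<Object, Object> bytesPF = new DefaultKafkaProducerFactory<>(configs);
    context.registerBean("bytesPF", DefaultKafkaProducerFactory.class, () -> bytesPF);

    // The producer factory is chosen at send time by matching the destination topic name
    Map<Pattern, ProducerFactory<Object, Object>> map = new LinkedHashMap<>();
    map.put(Pattern.compile("bytes-.*"), bytesPF);
    map.put(Pattern.compile(".+"), pf); // default factory (e.g. StringSerializer)
    return new RoutingKafkaTemplate(map);
}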
A single publisher can handle multiple topics, and you can customize the producer config according to each topic's needs.
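For example, a single producer instance can send to any number of topics because the destination is chosen per record (topic names and values are placeholders):
KafkaProducer<String, String> producer = new KafkaProducer<>(props);
// The topic is part of each ProducerRecord, not of the producer itself
producer.send(new ProducerRecord<>("orders", "key-1", "order payload"));
producer.send(new ProducerRecord<>("payments", "key-2", "payment payload"));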
I think a separate producer thread for each topic would be preferred because, if a particular producer goes down, only its topic is impacted and all the remaining topics keep working smoothly without any problem.
If we create one publisher for all topics and that publisher goes down for some reason, then all the topics are impacted.