Spring Kafka Consume JsonNode objects - apache-kafka

I have a service that is producing Kafka messages with a payload of type com.fasterxml.jackson.databind.JsonNode. When I consume this message I want it to be deserialized into a POJO, but I'm getting the following error:
IllegalArgumentException: Incorrect type specified for header
'kafka_receivedMessageKey'. Expected [class com.example.Person] but
actual type is [class com.fasterxml.jackson.databind.node.ObjectNode]
How do I configure either the producer or the consumer to work as I intend?
I think these are the approaches I can take:
Kafka Consumer should ignore __Key_TypeId__ if it is a subclass of JsonNode
OR
Kafka Producer should produce a message without __Key_TypeId__ header if it is a subclass of JsonNode
But how do I implement either of these approaches? Or is there another option?

See the reference manual.
You can either set JsonSerializer.ADD_TYPE_INFO_HEADERS to false on the producer or JsonDeserializer.USE_TYPE_INFO_HEADERS to false on the consumer.
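For example, a minimal sketch of both options (the property constants come from spring-kafka's JsonSerializer/JsonDeserializer; com.example.Person is the target type from the question):
// Producer side: stop JsonSerializer from adding __TypeId__/__Key_TypeId__ headers
Map<String, Object> producerProps = new HashMap<>();
producerProps.put(JsonSerializer.ADD_TYPE_INFO_HEADERS, false);
// ...or consumer side: ignore any type headers and fall back to the configured default type
Map<String, Object> consumerProps = new HashMap<>();
consumerProps.put(JsonDeserializer.USE_TYPE_INFO_HEADERS, false);
consumerProps.put(JsonDeserializer.KEY_DEFAULT_TYPE, "com.example.Person");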

Related

Does Kafka Consumer Deserializer have to match Producer Serializer?

Does the deserializer used by a consumer have to match the serializer used by the producer?
If a producer writes JSON values to messages, could the consumer choose to use the ByteArrayDeserializer, StringDeserializer, or JsonDeserializer regardless of which serializer the producer used, or does it have to match?
If they have to match, what is the consequence of using a different one? Would this result in an exception, no data or something else?
It has to be compatible, not necessarily the matching pair
ByteArrayDeserializer can consume anything.
StringDeserializer assumes UTF-8 encoded bytes, so payloads written by other serializers will come back as (possibly garbled) strings upon consumption
JsonDeserializer will attempt to parse the message and will fail on invalid JSON
If you use an Avro, Protobuf, MessagePack, or other binary-format deserializer, it will also attempt to parse the message and will fail if it doesn't recognize its respective format
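For instance, a topic whose values were written as JSON can be read with either of the more permissive deserializers; a sketch using the plain Kafka client (topic/group names are made up):
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "json-readers");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
// ByteArrayDeserializer hands back the raw bytes untouched, so it works for any payload
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
// StringDeserializer would also work here, because JSON is UTF-8 text:
// props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(props);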

Can a producer have multiple Kafka topics?

I have multiple producers writing to a single topic, which is the default as defined in policy. Is it possible to create a new topic without changing the default topic? In other words, is it possible for one producer to send the same logs to multiple topics?
In other words, is it possible for one producer to send the same logs to multiple topics?
Yes, one producer can produce to multiple topics. The relationship between a producer and topics is not one-to-one.
Example:
producer.send(new ProducerRecord<String, String>("my-topic", "key", "val"));
The send() method takes a ProducerRecord which contains the topic name. So we can give different topic names to each send() call.
However, the key.serializer and value.serializer matter: we specify only one key.serializer and one value.serializer per producer, not per topic.
That being the case, all messages sent by that producer, for any topic, must be serializable by those serializers.
If you want to support different objects, either write a custom serializer that is common to all of them (perhaps a JSON serializer) or convert your objects to a format your serializers can handle (for example, String for StringSerializer, byte[] for ByteArraySerializer, etc.)
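A minimal sketch of one producer writing to two topics through the same pair of serializers (topic names are made up):
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
KafkaProducer<String, String> producer = new KafkaProducer<>(props);
// The topic is chosen per record, but every record goes through the same String serializers
producer.send(new ProducerRecord<>("app-logs", "key1", "some log line"));
producer.send(new ProducerRecord<>("audit-logs", "key2", "the same log line"));
producer.close();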

Publish messages that could not be de-serialized to DLT topic

I do not understand how messages that could not be de-serialized can be written to a DLT topic with Spring Kafka.
I configured the consumer according to the spring kafka docs and this works well for exceptions that occur after de-serialization of the message.
But when the message is not de-serializable, an org.apache.kafka.common.errors.SerializationException is thrown while polling for messages.
Subsequently, SeekToCurrentErrorHandler.handle(Exception thrownException, List<ConsumerRecord<?, ?>> records, ...) is called with this exception but with an empty list of records and is therefore unable to write something to DLT topic.
How can I write those messages to DLT topic also?
The problem is that the exception is thrown by the Kafka client itself so Spring doesn't get to see the actual record that failed.
That's why we added the ErrorHandlingDeserializer2 which can be used to wrap the actual deserializer; the failure is passed to the listener container and re-thrown as a DeserializationException.
See the documentation.
When a deserializer fails to deserialize a message, Spring has no way to handle the problem, because it occurs before the poll() returns. To solve this problem, version 2.2 introduced the ErrorHandlingDeserializer2. This deserializer delegates to a real deserializer (key or value). If the delegate fails to deserialize the record content, the ErrorHandlingDeserializer2 returns a null value and a DeserializationException in a header that contains the cause and the raw bytes. When you use a record-level MessageListener, if the ConsumerRecord contains a DeserializationException header for either the key or value, the container’s ErrorHandler is called with the failed ConsumerRecord. The record is not passed to the listener.
The DeadLetterPublishingRecoverer has logic to detect the exception and publish the failed record.
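A rough sketch of that wiring (class and property names are from spring-kafka 2.2.x; the MyEvent type, group id, and bean layout are only illustrative):
@Bean
public ConsumerFactory<String, MyEvent> consumerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    // Wrap the real value deserializer so failures surface as DeserializationExceptions
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer2.class);
    props.put(ErrorHandlingDeserializer2.VALUE_DESERIALIZER_CLASS, JsonDeserializer.class);
    props.put(JsonDeserializer.VALUE_DEFAULT_TYPE, MyEvent.class.getName());
    props.put(JsonDeserializer.TRUSTED_PACKAGES, "*");
    return new DefaultKafkaConsumerFactory<>(props);
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, MyEvent> kafkaListenerContainerFactory(
        ConsumerFactory<String, MyEvent> cf, KafkaTemplate<Object, Object> template) {
    ConcurrentKafkaListenerContainerFactory<String, MyEvent> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(cf);
    // Failed records (including ones carrying a DeserializationException) are published to <topic>.DLT
    factory.setErrorHandler(new SeekToCurrentErrorHandler(new DeadLetterPublishingRecoverer(template)));
    return factory;
}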

Handle Deserialization Error (Dead Letter Queue) in a kafka consumer

After some research I found a few configurations I can use to handle this.
default.deserialization.exception.handler - from the StreamsConfig
errors.deadletterqueue.topic.name - from the SinkConnector config
I can't seem to find the equivalent configuration for a simple consumer.
I want to start a simple consumer and have DLQ handling, whether that's by just stating the DLQ topic and letting Kafka produce to it (like in the sink connector) or by providing my own class that will produce it (like in the Streams config).
How can I achieve a DLQ with a simple consumer?
EDIT: Another option I figured out is simply handling it in my Deserializer class: just catch the exception there and produce the record to my DLQ (see the sketch below).
But that would mean I'd need to create a producer in my deserializer class...
Is this the best practice to handle a DLQ from a consumer?
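For reference, a rough sketch of the idea from the EDIT: a wrapping Deserializer that catches the failure and publishes the raw bytes to a DLQ topic. Every name here (DlqAwareDeserializer, Order, the "orders.DLQ" topic, createDlqProducer()) is made up for illustration.
public class DlqAwareDeserializer implements Deserializer<Order> {

    private final Deserializer<Order> delegate = new JsonDeserializer<>(Order.class);
    // The downside mentioned above: the deserializer has to own a producer
    private final KafkaProducer<byte[], byte[]> dlqProducer = createDlqProducer();

    @Override
    public Order deserialize(String topic, byte[] data) {
        try {
            return delegate.deserialize(topic, data);
        } catch (RuntimeException e) {
            // Could not deserialize: forward the raw bytes to the DLQ instead of failing the poll
            dlqProducer.send(new ProducerRecord<>("orders.DLQ", data));
            return null; // the consuming code must be prepared to skip null values
        }
    }
}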

Storm - Reliable Spout

How to set a Storm spout as reliable?
Is there any property to set?
Is KafkaSpout reliable by default?
How to change the reliability in KafkaSpout?
KafkaSpout is reliable by default. Here's the code from the PartitionManager class, which is responsible for reading messages from a Kafka topic:
collector.emit(tup, new KafkaMessageId(_partition, toEmit.offset()));
As you can see, the second parameter of the emit method is a KafkaMessageId. You can pass a message id in your own spouts in a similar way; the message id can be an ordinary integer.
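A minimal sketch of the same pattern in a custom spout (class and field names are illustrative): emit each tuple with a message id and override ack()/fail() so Storm can report the outcome back.
public class ReliableNumberSpout extends BaseRichSpout {

    private SpoutOutputCollector collector;
    private int counter = 0;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        int msgId = counter++;
        // Emitting with a message id (second argument) makes the tuple tracked, i.e. reliable
        collector.emit(new Values(msgId), msgId);
    }

    @Override
    public void ack(Object msgId) {
        // called when the tuple tree was fully processed
    }

    @Override
    public void fail(Object msgId) {
        // called on timeout or explicit failure; re-emit or log here
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("number"));
    }
}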