Apache Kafka Consumer Deserialization Error

I am using Kafka for consuming messages. While consuming, there is a possibility that I receive messages that cause a DeserializationException. I want to skip the records that cause the DeserializationException and process the ones that do not cause any issues.
All Kafka-related properties are configured through application properties, like below:
kafka:
  producer:
    bootstrap-servers:
      - PRODUCER_BROKERS
    key-serializer: org.apache.kafka.common.serialization.StringSerializer
    value-serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
  consumer:
    key-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
    value-deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
    bootstrap-servers:
      - CONSUMER_BROKERS
    properties:
      key.deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
      value.deserializer: org.springframework.kafka.support.serializer.ErrorHandlingDeserializer
      spring.deserializer.key.delegate.class: org.apache.kafka.common.serialization.StringDeserializer
      spring.deserializer.value.delegate.class: io.confluent.kafka.serializers.KafkaAvroDeserializer
When I googled, I found a solution that implements an ErrorHandler (How to catch deserialization error in Kafka-Spring?), but since I am configuring everything through properties I am not sure how to bind it to the ConcurrentKafkaListenerContainerFactory. What is a better approach?

Your configuration looks correct.
The default error handler (DefaultErrorHandler) will discard (log) the records that fail deserialization.
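If you want to tweak that behaviour while keeping the pure-properties setup, you should not need to build the ConcurrentKafkaListenerContainerFactory yourself: as far as I know, Spring Boot (2.6+) wires any CommonErrorHandler bean into the auto-configured factory. A minimal sketch, assuming spring-kafka 2.8+ where DefaultErrorHandler is the default (the bean name and back-off values below are just illustrative):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.listener.DefaultErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Configuration
public class KafkaErrorHandlingConfig {

    @Bean
    public DefaultErrorHandler kafkaErrorHandler() {
        // DeserializationException is classified as fatal by default, so those
        // records are logged and skipped without retries; other exceptions are
        // retried twice with a 1-second back-off before being given up on.
        return new DefaultErrorHandler(new FixedBackOff(1000L, 2L));
    }
}

If you need the bad records preserved rather than just logged, a DeadLetterPublishingRecoverer can be passed as the first constructor argument instead of relying on the default logging recoverer.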

Related

How to add key serializer and value serializer in kafka console producer

I have the below properties set in my Spring Boot Kafka producer application.yaml:
consumer-properties:
  key.deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
  value.deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
producer-properties:
  key.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
  value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
I have to produce a message from the Kafka console producer, e.g.
kafka-console-producer --bootstrap-server confluent-cp-kafka:9092 --topic TSTTOPIC --producer-property key.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer value.serializer: io.confluent.kafka.serializers.KafkaAvroSerializer
but it's not working, and when I produce a message from the console producer I get an error in the consumer log as below.
You cannot use colons on the CLI.
If you want to use your property file, then pass --producer.config with the producer.properties file
Otherwise, you can use kafka-avro-console-producer along with --producer-property key.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
As for the Avro serializers, you appear to be missing key.schema or value.schema plus schema.registry.url, which are properties read only by the kafka-avro-console-producer; that would explain why your Avro consumer is unable to read the data (it was sent as plain text).

How to provide more than one trusted package to deserialization in .yml?

When I create a consumer and try to deserialize the object, I get this error:
Caused by: IllegalArgumentException: The class 'com.domain.project2.package2.SomeEvent' is not in the trusted packages: [java.util, java.lang, com.domain.project2.package1, com.domain.project2.package2]. If you believe this class is....
My .yml config:
spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.support.serializer.JsonSerializer
    consumer:
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.support.serializer.JsonDeserializer
      properties:
        spring:
          json:
            trusted:
              packages: 'com.domain.project2.package1, com.domain.project2.package2'
I presume you mean:
spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
    consumer:
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
      properties:
        spring:
          json:
            trusted:
              packages: 'com.domain.project2.package1, com.domain.project2.package2'
since you should be using the Spring JsonSerializer/JsonDeserializer (org.springframework.kafka.support.serializer), not the non-existent org.apache.kafka variants.
Beyond that, the problem is the space after the comma.
Use 'com.domain.project2.package1,com.domain.project2.package2'.
We should probably trim the packages to remove extraneous spaces.
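If the YAML keeps fighting you, the trusted packages can also be set in code instead. A minimal sketch of that alternative (the class name, group id and bean wiring below are illustrative, not from the original post; it assumes a String key and a JSON value carrying type headers from the Spring JsonSerializer):

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.support.serializer.JsonDeserializer;

@Configuration
public class TrustedPackagesConfig {

    @Bean
    public ConsumerFactory<String, Object> consumerFactory() {
        Map<String, Object> config = new HashMap<>();
        config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");

        JsonDeserializer<Object> valueDeserializer = new JsonDeserializer<>();
        // addTrustedPackages is varargs, so there are no comma/space parsing issues
        valueDeserializer.addTrustedPackages(
                "com.domain.project2.package1", "com.domain.project2.package2");

        return new DefaultKafkaConsumerFactory<>(
                config, new StringDeserializer(), valueDeserializer);
    }
}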

How to configure multiple kafka consumer in application.yml file

Actually I have a Spring Boot based micro-service, and I have used Kafka to produce/consume data from different systems.
Now my question: I have two different topics, and based on the topic I have two different consumer classes to consume the data.
How do I define multiple consumer properties in the application.yml file?
I configured one consumer in application.yml like below:
spring:
  kafka:
    consumer:
      bootstrap-servers: http://199.968.98.101:9092
      group-id: groupid-QA-02
      auto-offset-reset: latest
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
I am using @KafkaListener in my consumer classes. Here is an example of a consumer method I use in the code:
@KafkaListener(topics = "${app.topic.b2b_tf_ta_req}", groupId = "${app.topic.groupoId}")
public void consume(String message) throws Exception {
}
As far as I know, bootstrap-servers accepts a comma-separated list of servers,
i.e. if you set it to server1:9092,server2:9092, Kafka should connect to all of them.
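For the two-topics part of the question, the shared spring.kafka.consumer.* settings in application.yml can stay as they are, with only the per-topic differences expressed on each listener. A minimal sketch (the topic and group property names below are made up for illustration):

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class TwoTopicConsumers {

    // Both listeners reuse the common consumer settings from application.yml;
    // only the topic and group id differ per listener.
    @KafkaListener(topics = "${app.topic.first}", groupId = "${app.topic.first-group}")
    public void consumeFirst(String message) {
        // handle messages from the first topic
    }

    @KafkaListener(topics = "${app.topic.second}", groupId = "${app.topic.second-group}")
    public void consumeSecond(String message) {
        // handle messages from the second topic
    }
}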

Producer Partition Count override not effective

Interpreting this - https://docs.spring.io/spring-cloud-stream/docs/current/reference/htmlsingle/#_producer_properties
my understanding is that if the partitionCount override is less than the actual number of partitions on an existing Kafka topic, then the producer should use the actual number of partitions rather than the override value. My experience is that the producer uses the partitionCount value regardless of how many partitions (> partitionCount) are actually configured on the Kafka topic.
Ideally, I would like the producer to read the number of partitions on a pre-configured topic from kafka and write messages across all available partitions.
Spring-Cloud version: Finchley.RELEASE
Kafka Broker version : 1.0.0
application.yml:
spring:
  application:
    name: my-app
  cloud:
    stream:
      default:
        contentType: application/json
      kafka:
        binder:
          brokers:
            - ${KAFKA_HOST}:${KAFKA_PORT}
          auto-create-topics: false
      bindings:
        input-channel:
          destination: input-topic
          contentType: application/json
          group: input-group
        output-channel:
          destination: output-topic
          contentType: application/json
          producer:
            partition-count: 2
            partition-key-expression: payload['Id']
So I am expecting that if the output topic is already configured with 6 partitions, the producer will recognise this and write to all of them. Could someone please verify my interpretation above, or point out what I am missing to get the desired functionality?

Kafka Consumer - receiving messages Inconsistently

I can send and receive messages on the command line against a local Kafka installation. I can also send messages through Java code, and those messages show up in a Kafka command prompt. I also have Java code for the Kafka consumer. The code received messages yesterday; however, it doesn't receive any messages this morning. The code has not been changed. I am wondering whether the property configuration is quite right or not. Here is my configuration:
The Producer:
bootstrap.servers - localhost:9092
group.id - test
key.serializer - StringSerializer.class.getName()
value.serializer - StringSerializer.class.getName()
and the ProducerRecord is set as
ProducerRecord<String, String>("test", "mykey", "myvalue")
The Consumer:
zookeeper.connect - "localhost:2181"
group.id - "test"
zookeeper.session.timeout.ms - 500
zookeeper.sync.time.ms - 250
auto.commit.interval.ms - 1000
key.deserializer - org.apache.kafka.common.serialization.StringDeserializer
value.deserializer - org.apache.kafka.common.serialization.StringDeserializer
and for Java code:
Map<String, Integer> topicCount = new HashMap<>();
topicCount.put("test", 1);
Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreams = consumer
.createMessageStreams(topicCount);
List<KafkaStream<byte[], byte[]>> streams = consumerStreams.get(topic);
What is missing?
A number of things could be going on.
First, your consumer's ZooKeeper session timeout is very low (500 ms), which means the consumer may be experiencing many "soft failures" due to garbage collection pauses. When this happens, the consumer group will rebalance, which can pause consumption. If this is happening very frequently, the consumer could get into a state where it never consumes messages because it's constantly being rebalanced. I suggest increasing the ZooKeeper session timeout to 30 seconds to see if this resolves the issue; if so, you can experiment with setting it lower.
Second, can you confirm new messages are being produced to the "test" topic? Your consumer will only consume new messages that it hasn't committed yet. It's possible the topic doesn't have any new messages.
Third, do you have other consumers in the same consumer group that could be processing the messages? If one consumer is experiencing frequent soft failures, other consumers will be assigned its partitions.
Finally, you're using the "old" consumer, which will eventually be removed. If possible, I suggest moving to the "new" consumer (KafkaConsumer.java), which was made available in Kafka 0.9, although I can't promise this will resolve your issue.
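For reference, here is a minimal sketch of the "new" consumer loop, reusing the topic, group id and broker address from the question's configuration (it assumes a client recent enough to have poll(Duration); on the 0.9-era API the overload is poll(long)):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class NewConsumerExample {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "test");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("test"));
            while (true) {
                // poll returns whatever records arrived since the last committed offset
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}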
Hope this helps.