Spring Cloud Stream custom value deserializer does not work

I have this simple Spring Cloud Stream function:
@Configuration
public class ItemProcessor {

    @Bean
    public Serde<Wish> WishSerde() {
        Serde<Wish> wishSerde = DebeziumSerdes.payloadJson(Wish.class);
        wishSerde.configure(Collections.singletonMap("from.field", "after"), false);
        return wishSerde;
    }

    @Bean
    public Serde<Long> KeySerde() {
        final Serde<Long> keySerde = DebeziumSerdes.payloadJson(Long.class);
        keySerde.configure(Collections.emptyMap(), true);
        return keySerde;
    }

    @Bean
    public Function<KStream<Long, Wish>, KStream<Long, Wish>> processItems() {
        return (models) -> models
                .peek((k, v) -> System.out.println(k + ": " + v));
    }
}
And here is the Wish model
@Data
@NoArgsConstructor
@AllArgsConstructor
@ToString
public class Wish {
    public long wish_id;
    public long user_id_fk;
    public long item_id_fk;
    public long wish_status;
}
And the application.yml
spring.cloud:
  function.definition: processItems
  stream:
    bindings:
      processItems-in-0:
        destination: source.wish
      processItems-out-0:
        destination: processed.wish
    kafka:
      streams:
        binder:
          brokers: 127.0.0.1:9092
I am using a configured bean for the Serde, as described in the documentation, and the Debezium JsonSerde, as described here, to deserialize the objects created by Debezium.
The incoming message looks like this:
{"wish_id":759}|{"before":null,"after":{"wish_id":759,"user_id_fk":2,"item_id_fk":823,"wish_status":1},"source":{"version":"1.6.0.Final","connector":"mysql","name":"JDP","ts_ms":1635151905000,"snapshot":"false","db":"jdb","sequence":null,"table":"wish","server_id":1,"gtid":null,"file":"mysql-bin.000008","pos":2694699,"row":0,"thread":null,"query":null},"op":"c","ts_ms":1635151886089,"transaction":null}
where the key and value are separated by |.
I need the content of the 'after' field to use as the data for the Wish model, and this Serde with this configuration is supposed to do exactly that. But I get the following error at runtime:
Exception in thread "processItems-applicationId-062d6a97-b543-47ea-b938-b1b520a6faa8-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Deserialization exception handler is set to fail upon a deserialization error. If you would rather have the streaming pipeline continue after a deserialization error, please set the default.deserialization.exception.handler appropriately.
at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:82)
at org.apache.kafka.streams.processor.internals.RecordQueue.updateHead(RecordQueue.java:176)
at org.apache.kafka.streams.processor.internals.RecordQueue.addRawRecords(RecordQueue.java:112)
at org.apache.kafka.streams.processor.internals.PartitionGroup.addRawRecords(PartitionGroup.java:185)
at org.apache.kafka.streams.processor.internals.StreamTask.addRecords(StreamTask.java:895)
at org.apache.kafka.streams.processor.internals.TaskManager.addRecordsToTasks(TaskManager.java:1008)
at org.apache.kafka.streams.processor.internals.StreamThread.pollPhase(StreamThread.java:812)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:625)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:564)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:523)
Caused by: java.lang.RuntimeException: com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException: Unrecognized field "before" (class ir.jdro.kafkaStream.rdbAggregator.model.Wish), not marked as ignorable (4 known properties: "wish_status", "wish_id", "user_id_fk", "item_id_fk"])
at [Source: UNKNOWN; line: -1, column: -1] (through reference chain: ir.jdro.kafkaStream.rdbAggregator.model.Wish["before"])
at io.debezium.serde.json.JsonSerde$JsonDeserializer.deserialize(JsonSerde.java:95)
at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:60)
at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:58)
at org.apache.kafka.streams.processor.internals.RecordDeserializer.deserialize(RecordDeserializer.java:66)
Once I set "unknown.properties.ignored" to true in the Serde configuration, so that the application no longer raises the exception, I get the following output:
759|{"wish_id":0,"user_id_fk":0,"item_id_fk":0,"wish_status":0}
This shows that the key deserialization works fine but the value deserialization doesn't: the Serde is apparently mapping the whole Debezium envelope onto Wish (hence the earlier complaint about the unknown "before" field), so every field is left at its default. I can't find where I am wrong!
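For reference, the Serde can be sanity-checked outside the binder with a minimal sketch like the one below (the envelope JSON is abbreviated from the sample message above; the Wish class from the question and debezium-core on the classpath are assumed):

import io.debezium.serde.DebeziumSerdes;
import org.apache.kafka.common.serialization.Serde;

import java.nio.charset.StandardCharsets;
import java.util.Collections;

public class WishSerdeCheck {
    public static void main(String[] args) {
        // Same configuration as the WishSerde bean in the question.
        Serde<Wish> serde = DebeziumSerdes.payloadJson(Wish.class);
        serde.configure(Collections.singletonMap("from.field", "after"), false);

        // Abbreviated Debezium envelope taken from the sample message.
        String envelope = "{\"before\":null,\"after\":{\"wish_id\":759,\"user_id_fk\":2,"
                + "\"item_id_fk\":823,\"wish_status\":1},\"op\":\"c\"}";

        Wish wish = serde.deserializer()
                .deserialize("source.wish", envelope.getBytes(StandardCharsets.UTF_8));
        System.out.println(wish); // expected: Wish(wish_id=759, user_id_fk=2, ...)
    }
}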

Related

Quarkus Kafka - Batch/Bulk message consumer

I want to batch process. In my use case, Kafka producer messages are sent one by one, and I want to read them as a list in the consumer application. I can do that with the Spring Kafka library (Spring Kafka batch listener).
Is there any way to do this with the quarkus-smallrye-reactive-messaging-kafka library?
I tried the example below but got an error.
ERROR [io.sma.rea.mes.provider] (vert.x-eventloop-thread-3) SRMSG00200: The method org.MyConsumer#aggregate has thrown an exception: java.lang.ClassCastException: class org.TestConsumer cannot be cast to class io.smallrye.mutiny.Multi (org.TestConsumer is in unnamed module of loader io.quarkus.bootstrap.classloading.QuarkusClassLoader #6f2c0754; io.smallrye.mutiny.Multi is in unnamed module of loader io.quarkus.bootstrap.classloading.QuarkusClassLoader #4c1638b)
application.properties:
kafka.bootstrap.servers=hosts
mp.messaging.connector.smallrye-kafka.group.id=KafkaQuick
mp.messaging.connector.smallrye-kafka.auto.offset.reset=earliest
mp.messaging.incoming.test-consumer.connector=smallrye-kafka
mp.messaging.incoming.test-consumer.value.deserializer=org.TestConsumerDeserializer
TestConsumerDeserializer:
public class TestConsumerDeserializer extends JsonbDeserializer<TestConsumer> {
    public TestConsumerDeserializer() {
        // pass the class to the parent
        super(TestConsumer.class);
    }
}
MyConsumer:
@ApplicationScoped
public class MyConsumer {
    @Incoming("test-consumer")
    // @Outgoing("aggregated-channel")
    public void aggregate(Multi<Message<TestConsumer>> in) {
        System.out.println(in);
    }
}
Batch support has been added to the Quarkus Kafka connector.
See https://quarkus.io/guides/kafka#receiving-kafka-records-in-batches.
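For reference, a minimal sketch of that batch mode, applied to the channel and payload type from the question (batch mode has to be enabled on the channel with mp.messaging.incoming.test-consumer.batch=true):

import java.util.List;
import javax.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.reactive.messaging.Incoming;

@ApplicationScoped
public class MyBatchConsumer {

    @Incoming("test-consumer")
    public void consume(List<TestConsumer> payloads) {
        // Each invocation receives the whole batch returned by one Kafka poll.
        System.out.println("batch size: " + payloads.size());
    }
}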
I don't understand the reason for the ClassCastException in the question, but I found solutions for reading bulk/batch messages using quarkus-smallrye-reactive-messaging-kafka.
Solution 1:
@Incoming("test-consumer-topic")
@Outgoing("aggregated-channel")
public Multi<List<TestConsumer>> aggregate(Multi<TestConsumer> in) {
    return in.groupItems().intoLists().every(Duration.ofSeconds(5));
}

@Incoming("aggregated-channel")
public void test(List<TestConsumer> test) {
    System.out.println("size: " + test.size());
}
Solution 2:
@Incoming("test-consumer-topic")
@Outgoing("events-persisted")
public Multi<Message<TestConsumer>> processPayloadStream(Multi<Message<TestConsumer>> messages) {
    return messages
            .groupItems().intoLists().of(4)
            .emitOn(Infrastructure.getDefaultWorkerPool())
            .flatMap(messages1 -> {
                persist(messages1);
                return Multi.createFrom().items(messages1.stream());
            }).emitOn(Infrastructure.getDefaultExecutor());
}

public void persist(List<Message<TestConsumer>> messages) {
    System.out.println("messages size: " + messages.size());
}

@Incoming("events-persisted")
public CompletionStage<Void> messageAcknowledging(Message<TestConsumer> message) {
    return message.ack();
}
Note: this uses the application.properties config from the question.
References:
Support subscribing with Multi<Message<>>...
Get Bulk polled message from kafka

Kafka Streams: Define multiple Kafka Streams using Spring Cloud Stream for each set of topics

I am trying to do a simple POC with Kafka Streams, but I am getting an exception while starting the application. I am using Spring-Kafka and Kafka-Streams 2.5.1 with Spring Boot 2.3.5.
Kafka Streams configuration:
@Configuration
public class KafkaStreamsConfig {
    private static final Logger log = LoggerFactory.getLogger(KafkaStreamsConfig.class);

    @Bean
    public Function<KStream<String, String>, KStream<String, String>> processAAA() {
        return input -> input.peek((key, value) -> log
                .info("AAA Cloud Stream Kafka Stream processing : {}", input.toString().length()));
    }

    @Bean
    public Function<KStream<String, String>, KStream<String, String>> processBBB() {
        return input -> input.peek((key, value) -> log
                .info("BBB Cloud Stream Kafka Stream processing : {}", input.toString().length()));
    }

    @Bean
    public Function<KStream<String, String>, KStream<String, String>> processCCC() {
        return input -> input.peek((key, value) -> log
                .info("CCC Cloud Stream Kafka Stream processing : {}", input.toString().length()));
    }

    /*
    @Bean
    public KafkaStreams kafkaStreams(KafkaProperties kafkaProperties) {
        final Properties props = new Properties();
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaProperties.getBootstrapServers());
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "groupId-1");
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, JsonSerde.class);
        props.put(JsonDeserializer.VALUE_DEFAULT_TYPE, JsonNode.class);
        final KafkaStreams kafkaStreams = new KafkaStreams(kafkaStreamTopology(), props);
        kafkaStreams.start();
        return kafkaStreams;
    }

    @Bean
    public Topology kafkaStreamTopology() {
        final StreamsBuilder streamsBuilder = new StreamsBuilder();
        streamsBuilder.stream(Arrays.asList(AAATOPIC, BBBInputTOPIC, CCCInputTOPIC));
        return streamsBuilder.build();
    }
    */
}
The application.yaml is configured as below. The idea is that I have 3 input and 3 output topics; the component takes input from an input topic and writes output to an output topic.
spring:
  application.name: consumerapp-1
  cloud:
    function:
      definition: processAAA;processBBB;processCCC
    stream:
      kafka.binder:
        brokers: 127.0.0.1:9092
        autoCreateTopics: true
        auto-add-partitions: true
      kafka.streams.binder:
        configuration:
          commit.interval.ms: 1000
          default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
          default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
      bindings:
        processAAA-in-0:
          destination: aaaInputTopic
        processAAA-out-0:
          destination: aaaOutputTopic
        processBBB-in-0:
          destination: bbbInputTopic
        processBBB-out-0:
          destination: bbbOutputTopic
        processCCC-in-0:
          destination: cccInputTopic
        processCCC-out-0:
          destination: cccOutputTopic
The exception thrown is:
Caused by: java.lang.IllegalArgumentException: Trying to prepareConsumerBinding public abstract void org.apache.kafka.streams.kstream.KStream.to(java.lang.String,org.apache.kafka.streams.kstream.Produced) but no delegate has been set.
at org.springframework.util.Assert.notNull(Assert.java:201)
at org.springframework.cloud.stream.binder.kafka.streams.KStreamBoundElementFactory$KStreamWrapperHandler.invoke(KStreamBoundElementFactory.java:134)
Can anyone help me with Kafka Streams / Spring-Kafka code samples for processing with multiple input and output topics?
Update: 21-Jan-2021
After removing all the kafkaStreams and kafkaStreamTopology bean configurations, I am getting the message below in an infinite loop. Message consumption is still not working. I have checked the subscriptions in application.yaml against the @Bean function definitions; they all look OK to me, but I still get this cross-wiring error. I have replaced the application.properties with the application.yaml above.
2021-01-21 14:12:43,336 WARN org.apache.kafka.clients.consumer.internals.ConsumerCoordinator [consumerapp-1-75eec5e5-2772-4999-acf2-e9ef1e69f100-StreamThread-1] [Consumer clientId=consumerapp-1-75eec5e5-2772-4999-acf2-e9ef1e69f100-StreamThread-1-consumer, groupId=consumerapp-1] We received an assignment [cccParserTopic-0] that doesn't match our current subscription Subscribe(bbbParserTopic); it is likely that the subscription has changed since we joined the group. Will try re-join the group with current subscription
I have managed to solve the problem, and I am writing this for the benefit of others. If you want to include multiple streams in a single app jar, the key is to define a separate application id for each of your streams; otherwise all the stream threads join the same consumer group while subscribing to different topics, which explains the mismatched-assignment warnings above. I knew this all along, but I was not aware of how to define it. The answer is something I managed to dig out after reading the SCSt documentation. Below is how the application.yaml can be defined.
spring:
  application.name: kafkaMultiStreamConsumer
  cloud:
    function:
      # needed for the imperative @StreamListener style
      definition: processAAA; processBBB; processCCC
    stream:
      kafka:
        binder:
          brokers: 127.0.0.1:9092
          min-partition-count: 3
          replication-factor: 2
          transaction:
            transaction-id-prefix: transaction-id-2000
          autoCreateTopics: true
          auto-add-partitions: true
        streams:
          binder:
            functions:
              # needed for the functional style: one application-id per stream
              processBBB:
                application-id: SampleBBBapplication
              processAAA:
                application-id: SampleAAAapplication
              processCCC:
                application-id: SampleCCCapplication
            configuration:
              commit.interval.ms: 1000
              default.key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
              default.value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
      bindings:
        # Below is for imperative-style programming using the
        # @StreamListener and @SendTo annotations in the .java class
        inputAAA:
          destination: aaaInputTopic
        outputAAA:
          destination: aaaOutputTopic
        inputBBB:
          destination: bbbInputTopic
        outputBBB:
          destination: bbbOutputTopic
        inputCCC:
          destination: cccInputTopic
        outputCCC:
          destination: cccOutputTopic
        # Functional-style programming using Function<KStream...>. Use either one
        # of the two styles, as both are not required. If you use both it is OK,
        # but only one of them works; from what I have seen, @StreamListener is
        # always triggered. Below is the functional style.
        processAAA-in-0:
          destination: aaaInputTopic
          group: processAAA-group
        processAAA-out-0:
          destination: aaaOutputTopic
          group: processAAA-group
        processBBB-in-0:
          destination: bbbInputTopic
          group: processBBB-group
        processBBB-out-0:
          destination: bbbOutputTopic
          group: processBBB-group
        processCCC-in-0:
          destination: cccInputTopic
          group: processCCC-group
        processCCC-out-0:
          destination: cccOutputTopic
          group: processCCC-group
Once the above is defined, we now need to define the individual Java classes where the stream processing logic is implemented. Your Java class can be something like the one below; create similar classes for the other 2 or N streams as per your requirement. One example: AAASampleStreamTask.java
@Component
@EnableBinding(AAASampleChannel.class) // one channel interface corresponding to the in-topic and out-topic
public class AAASampleStreamTask {
    private static final Logger log = LoggerFactory.getLogger(AAASampleStreamTask.class);

    @StreamListener(AAASampleChannel.INPUT)
    @SendTo(AAASampleChannel.OUTPUT)
    public KStream<String, String> processAAA(KStream<String, String> input) {
        input.foreach((key, value) -> log.info("Annotation AAA *Sample* Cloud Stream Kafka Stream processing {}", String.valueOf(System.currentTimeMillis())));
        ...
        // do other business logic
        ...
        return input;
    }

    /**
     * Use the style above or below. The style below is the newer one, starting
     * from ScSt 3.0 if I am not wrong. These are 2 different styles of consuming
     * Kafka Streams using SCSt. If we have both, the one above gets priority,
     * as per my observation.
     */
    /*
    @Bean
    public Function<KStream<String, String>, KStream<String, String>> processAAA() {
        return input -> input.peek((key, value) -> log.info(
                "Functional AAA *Sample* Cloud Stream Kafka Stream processing : {}", String.valueOf(System.currentTimeMillis())));
        ...
        // do other business logic
        ...
    }
    */
}
The channel interface is required only if you go with the imperative style of programming, not the functional one.
AAASampleChannel.java
public interface AAASampleChannel {
    String INPUT = "inputAAA";
    String OUTPUT = "outputAAA";

    @Input(INPUT)
    KStream<String, String> inputAAA();

    @Output(OUTPUT)
    KStream<String, String> outputAAA();
}
Looks like you are mixing Spring Cloud Stream and Spring Kafka in the application. When using the binder, you don't need to directly define components required by Spring Kafka, such as KafkaStreams and Topology; they are created by SCSt implicitly. Can you remove the following beans and try again?
@Bean
public KafkaStreams kafkaStreams(KafkaProperties kafkaProperties) {
and
@Bean
public Topology kafkaStreamTopology() {
If you are still facing issues, please share a small reproducible sample so that we can triage it further.

Kafka Streams Spring Cloud Stream join selectKey

Could you please help me configure a Spring Cloud Stream app based on Kafka? I'm facing an issue with the selectKey operation.
Let me explain what I'm trying to achieve:
2 incoming topics, Person and RefGenre
Person contains the key of RefGenre (in its value)
public class Person {
    String nom;
    String prenom;
    String codeGenre; // <<--- here is the key of the second topic refgenre
}
So I'm using the selectKey operator to prepare my stream before the join operation.
A new topic is created by selectKey (my-app-KSTREAM-KEY-SELECT-0000000004-repartition), and then a serialization issue happens:
Exception in thread "my-app-3c57b31c-28e5-4199-b07d-87f8940425ab-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: ClassCastException while producing data to topic my-app-KSTREAM-KEY-SELECT-0000000004-repartition. A serializer (key: org.apache.kafka.common.serialization.StringSerializer / value: statefull.serde.PersonWithGenreSerde) is not compatible to the actual key or value type (key type: java.lang.String / value type: statefull.model.Person). Change the default Serdes in StreamConfig or provide correct Serdes via method parameters (for example if using the DSL, #to(String topic, Produced<K, V> produced) with Produced.keySerde(WindowedSerdes.timeWindowedSerdeFrom(String.class))).
Where can I specify the Serde for this repartition topic, and can I specify the name of this "internal" topic?
@Bean
public BiFunction<KStream<String, Person>, KTable<String, ReferentielGenre>, KStream<Long, PersonWithGenre>> joinKtable() {
    return (persons, referentielGenres) ->
            persons.selectKey((k, v) -> v.getCodeGenre())
                   .join(referentielGenres,
                         (person, genre) -> new PersonWithGenre(person.getNom(), person.getPrenom(), genre),
                         Joined.with(Serdes.String(), new PersonWithGenreSerde(), null));
}
Here is the full code of my non-working job: https://github.com/YohanAlard/joinkstream
Is there a better way to handle this use case?
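For what it's worth, the serdes passed to Joined.with describe the join inputs (the key after selectKey, the stream value, and the table value), not the join result, and Joined also accepts a name that is used to derive the repartition topic name. A hedged sketch of what that might look like (PersonSerde and ReferentielGenreSerde are assumed helper serdes, not taken from the question):

@Bean
public BiFunction<KStream<String, Person>, KTable<String, ReferentielGenre>, KStream<String, PersonWithGenre>> joinKtable() {
    return (persons, referentielGenres) ->
            persons.selectKey((k, v) -> v.getCodeGenre())
                   .join(referentielGenres,
                         (person, genre) -> new PersonWithGenre(person.getNom(), person.getPrenom(), genre),
                         // serdes of the join inputs, plus a name used for the repartition topic
                         Joined.with(Serdes.String(), new PersonSerde(), new ReferentielGenreSerde(), "person-by-genre"));
}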

Error using "condition parameter header" @StreamListener of new release Chelsea.RC1

I am trying to use the event filter to reduce the number of topics the application uses, via the new feature available in the new Spring Cloud Stream release (Chelsea.RC1). The message is being created with the correct header; however, inspecting the contents of the message in the queue, the message does not contain the header, only the body with the payload.
public void sendEnroll(EnrollCommand data) {
    // MessageChannel
    outputEnroll.send(MessageBuilder
            .withPayload(data)
            .setHeader("brand", "MASTERCARD")
            .setHeader("operation", Operation.ENROLL)
            .build());
}
Consumer
@Service
@EnableBinding(Channel.class)
public class EnrollConsumer {
    @Autowired
    private EnrollService service;

    @StreamListener(target = Channel.INPUT_ENROLL, condition = "headers['brand']=='MASTERCARD'")
    public void enrollConsumer(@Payload String command) {
        System.out.println(command);
        // service.enrollment(command);
    }
}
The consumer service gives the following warning:
WARN -kafka-listener-1 o.s.c.s.b.DispatchingStreamListenerMessageHandler:62 - Cannot find a #StreamListener matching for message with id: 7baae934-7484-a7fd-91b0-ba906558bb13
You have to map your custom headers:
spring.cloud.stream.kafka.binder.headers = brand,operation
That information is present in the documentation.

Error creating unit test with Spring Cloud Stream using Kafka

I don't know how to write a sample test using Kafka; I tried to follow the Spring guide but it doesn't work. Can someone help me?
@RunWith(SpringRunner.class)
@SpringBootTest
@DirtiesContext
public class EnrollSenderTest {
    @Autowired
    public EnrollSender producer;

    @Autowired
    private BinderFactory<MessageChannel> binderFactory;

    @Autowired
    private MessageCollector messageCollector;

    @SuppressWarnings("unchecked")
    @Test
    public void test() {
        Message<String> message = new GenericMessage<>("hello");
        producer.sendEnroll(message);
        Message<String> received = (Message<String>) messageCollector.forChannel(producer.getOutput()).poll();
        assertThat(received.getPayload(), equalTo("hello"));
    }
}
And my producer class is:
@Service
@EnableBinding(Source.class)
public class EnrollSender {
    private final MessageChannel output;

    public EnrollSender(Source output) {
        this.output = output.output();
    }

    public void sendEnroll(Object enroll) {
        output.send(MessageBuilder.withPayload(enroll).build());
    }

    public MessageChannel getOutput() {
        return output;
    }
}
But it gives the following error:
java.lang.IllegalStateException: Failed to load ApplicationContext
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'messageCollector' defined in class path resource [org/springframework/cloud/stream/test/binder/TestSupportBinderAutoConfiguration.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springframework.cloud.stream.test.binder.MessageCollector]: Factory method 'messageCollector' threw exception; nested exception is java.lang.NoSuchMethodError: org.springframework.cloud.stream.binder.BinderFactory.getBinder(Ljava/lang/String;Ljava/lang/Class;)Lorg/springframework/cloud/stream/binder/Binder;
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springframework.cloud.stream.test.binder.MessageCollector]: Factory method 'messageCollector' threw exception; nested exception is java.lang.NoSuchMethodError: org.springframework.cloud.stream.binder.BinderFactory.getBinder(Ljava/lang/String;Ljava/lang/Class;)Lorg/springframework/cloud/stream/binder/Binder;
Caused by: java.lang.NoSuchMethodError: org.springframework.cloud.stream.binder.BinderFactory.getBinder(Ljava/lang/String;Ljava/lang/Class;)Lorg/springframework/cloud/stream/binder/Binder;
Marius Bogoevici, here are my dependencies:
dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Camden.SR4"
    }
}
compile 'org.springframework.cloud:spring-cloud-starter-stream-kafka'
compile group: 'org.springframework.cloud', name: 'spring-cloud-stream-test-support', version: '1.1.1.RELEASE'
Looks like you have a mismatched dependency set on the classpath (i.e. an older version of Spring Cloud Stream core).
You can solve this by removing the version for spring-cloud-stream-test-support because the Camden.SR4 BOM will provide the correct one.
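In Gradle terms that would look something like this (a sketch; the Camden.SR4 BOM then pins the test-support version):

dependencyManagement {
    imports {
        mavenBom "org.springframework.cloud:spring-cloud-dependencies:Camden.SR4"
    }
}
// no explicit version here: let the Camden.SR4 BOM supply the matching one
compile 'org.springframework.cloud:spring-cloud-starter-stream-kafka'
compile 'org.springframework.cloud:spring-cloud-stream-test-support'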
Moreover, if you want to test with an embedded Kafka instance, you can find an example here: https://github.com/spring-cloud/spring-cloud-stream-samples/blob/master/multibinder/src/test/java/multibinder/RabbitAndKafkaBinderApplicationTests.java#L57
(The example shows you how to configure the Kafka binder with an embedded broker for testing - it also shows how to use two different binders within the same app, but probably you don't care about that).
This is because of the incompatible versions as pointed out by Marius above.
You would need either Camden.SR5, which has compatible versions of Spring Cloud Stream and the Spring Cloud Stream test support, or Camden.SR4 with Spring Cloud Stream test support version 1.1.0.RELEASE.
This is a change that went in between 1.1.0.RELEASE and 1.1.1.RELEASE of Spring Cloud Stream.