How to write a generic deserializer with Spring Cloud Stream? (Batch Consumer)

I was communicating between my cloud services. I was sending a Person from one service and receiving it in another, and it was working without any serializer or deserializer.
My code is like this:
@EnableBinding({ PersonStream.class })
public class StreamConfiguration {
}

public interface PersonStream {

    String OUTPUT = "person-topic-out";
    String INPUT = "person-topic-in";

    @Input(INPUT)
    SubscribableChannel inboundPerson();

    @Output(OUTPUT)
    MessageChannel outboundPerson();
}

@Service
public class PersonProducer {

    @Autowired
    private PersonStream personStream;

    @Scheduled(fixedRate = 2000, initialDelay = 10000)
    public void publishPerson() {
        MessageChannel messageChannel = personStream.outboundPerson();
        Person person = new Person("Omer", "Celik");
        messageChannel.send(
                MessageBuilder.withPayload(person)
                        .build());
    }
}

@Service
public class PersonListener {

    @StreamListener(value = PersonStream.INPUT)
    private void personBulkReceiver(Person person) {
        System.out.println(person.getName());
    }
}
spring:
  cloud:
    stream:
      kafka:
        binders:
          defaultKafka:
            type: kafka
            environment:
              spring:
                cloud:
                  stream:
                    kafka:
                      binder:
                        brokers: localhost:9092
      bindings:
        person-topic-in:
          binder: defaultKafka
          destination: person-topic
          contentType: application/person
          group: omercelik
        person-topic-out:
          binder: defaultKafka
          destination: person-topic
          contentType: application/json
After that I needed to consume the data as a batch. When consuming in batch mode, the handler's parameter type becomes a List. I solved that problem by writing a deserializer, so the parameter type is List<Person>. However, that way I have to write a deserializer for every payload type. Is there a generic deserializer? How can I write a generic deserializer?
I do not use an Avro schema. Waiting for your suggestions...
@Service
public class PersonListener {

    @StreamListener(value = PersonStream.INPUT)
    private void personBulkReceiver(List<Person> person) {
        System.out.println(person.get(0).getName());
        System.out.println("personBulkReceiver : " + person.size());
    }
}

public class PersonDeserializer implements Deserializer<Person> {

    @Override
    public Person deserialize(String s, byte[] bytes) {
        ObjectMapper objectMapper = new ObjectMapper();
        try {
            Person p = objectMapper.readValue(bytes, Person.class);
            return p;
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}
spring:
  cloud:
    stream:
      kafka:
        binders:
          defaultKafka:
            type: kafka
            environment:
              spring:
                cloud:
                  stream:
                    kafka:
                      binder:
                        brokers: localhost:9092
          bulkKafka:
            type: kafka
            environment:
              spring:
                cloud:
                  stream:
                    kafka:
                      binder:
                        brokers: localhost:9092
                        configuration:
                          max.poll.records: 1500
                          fetch.min.bytes: 1000000
                          fetch.max.wait.ms: 10000
                          value.deserializer: tr.cloud.stream.examples.PersonDeserializer
      bindings:
        person-topic-in:
          binder: bulkKafka
          destination: person-topic
          contentType: application/person
          group: omercelik
          consumer:
            batch-mode: true
        person-topic-out:
          binder: defaultKafka
          destination: person-topic
          contentType: application/json
Code: https://github.com/omercelikceng/spring-cloud-stream-batch-consumer
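For reference, a minimal sketch of what a generic Jackson-based deserializer could look like: the target class is taken from the consumer configuration instead of being hard-coded, so one class can serve every payload type. The generic.deserializer.target.type key is only an illustrative name; Spring Kafka's own org.springframework.kafka.support.serializer.JsonDeserializer follows the same idea with its spring.json.value.default.type property.

import java.io.IOException;
import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.common.serialization.Deserializer;

public class GenericJsonDeserializer<T> implements Deserializer<T> {

    // Illustrative config key used to pass the fully qualified name of the target class.
    public static final String TARGET_TYPE = "generic.deserializer.target.type";

    private final ObjectMapper objectMapper = new ObjectMapper();
    private Class<T> targetType;

    @Override
    @SuppressWarnings("unchecked")
    public void configure(Map<String, ?> configs, boolean isKey) {
        try {
            targetType = (Class<T>) Class.forName((String) configs.get(TARGET_TYPE));
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Unknown target type for " + TARGET_TYPE, e);
        }
    }

    @Override
    public T deserialize(String topic, byte[] data) {
        if (data == null) {
            return null;
        }
        try {
            return objectMapper.readValue(data, targetType);
        } catch (IOException e) {
            throw new RuntimeException("Could not deserialize payload", e);
        }
    }
}

The target class would then be supplied next to value.deserializer in the binder configuration, for example generic.deserializer.target.type: tr.cloud.stream.examples.Person.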

Related

Spring cloud stream consumer poll timeout has expired

We have a small microservice that reads from a Kafka topic and writes to MQTT, using Spring Cloud Stream. It works fine, but after some time we get the following exception and no further messages are published to MQTT:
"2022-10-18 16:22:29.861 WARN 1 --- [d | tellus-mqtt] o.a.k.c.c.internals.ConsumerCoordinator : [Consumer clientId=consumer-mqtt-2, groupId=mqtt] consumer poll timeout has expired. This means the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time processing messages. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches r
Is there a way to programmatically resubscribe or recover from this timeout?
Could we implement a custom health check for the actuator that includes the consumer, so that the pod gets automatically restarted by k8s? Something like:
management:
  endpoint:
    health:
      group:
        liveness:
          include: livenessstate,binders
Where binders is the kafka component.
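As a rough sketch of that idea (the class name, the recordLastActivity() hook and the ten-minute threshold below are assumptions, not an existing Spring or binder API), a custom liveness contributor could track when a record was last processed and report DOWN once the consumer has been idle for too long:

import java.time.Duration;
import java.time.Instant;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component("consumerLiveness")
public class ConsumerLivenessHealthIndicator implements HealthIndicator {

    private static final Duration MAX_IDLE = Duration.ofMinutes(10); // assumed threshold
    private volatile Instant lastActivity = Instant.now();

    // Call this from the consumer whenever a message has been processed successfully.
    public void recordLastActivity() {
        this.lastActivity = Instant.now();
    }

    @Override
    public Health health() {
        Duration idle = Duration.between(lastActivity, Instant.now());
        return idle.compareTo(MAX_IDLE) > 0
                ? Health.down().withDetail("idleFor", idle.toString()).build()
                : Health.up().build();
    }
}

The indicator's name would then be added to the liveness group's include list (alongside livenessstate) so that Kubernetes restarts the pod once it reports DOWN.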
EDIT: Here is the consumer code (OutputConfig class):
@Configuration
@Log4j2
@Profile("output")
public class OutputConfig {

    private final Mqtt3ReactorClient outboundMqttClient;
    private final Mqtt3ReactorClient outboundRootMqttClient;
    private final MeterUtils meterUtils;

    @Autowired
    public OutputConfig(@Qualifier("outboundMqttClient") Mqtt3ReactorClient outboundMqttClient,
                        @Qualifier("outboundRootMqttClient") Mqtt3ReactorClient outboundRootMqttClient,
                        MeterUtils meterUtils) {
        this.outboundMqttClient = outboundMqttClient;
        this.outboundRootMqttClient = outboundRootMqttClient;
        this.meterUtils = meterUtils;
        log.info("Starting Output Config!");
    }

    @Bean
    public Consumer<Flux<Output.GatewayNotification>> kafka() {
        return new Output(outboundMqttClient, meterUtils);
    }

    @Bean
    public Consumer<Flux<Output.GatewayNotification>> kafkaRoot() {
        return new Output(outboundRootMqttClient, meterUtils);
    }
}
And Output class:
@Log4j2
public class Output implements Consumer<Flux<Output.GatewayNotification>> {

    public static final HexFormat FORMAT = HexFormat.of().withDelimiter(" ").withUpperCase();

    private final Mqtt3ReactorClient outboundMqttClient;
    private final MeterUtils meterUtils;

    public Output(Mqtt3ReactorClient outboundMqttClient, MeterUtils meterUtils) {
        this.outboundMqttClient = outboundMqttClient;
        this.meterUtils = meterUtils;
    }

    @Override
    public void accept(Flux<Output.GatewayNotification> gatewayNotifications) {
        Flux<Mqtt3Publish> messagesToPublish = gatewayNotifications
                .map(gatewayNotification -> Mqtt3Publish.builder()
                        .topic(gatewayNotification.getAddress())
                        .qos(MqttQos.AT_LEAST_ONCE)
                        .payload(Base64.getDecoder().decode(gatewayNotification.getPayload()))
                        .build());
        outboundMqttClient.publish(messagesToPublish)
                .doOnNext(publishResult -> {
                    log.debug(
                            "Publish acknowledged: " + FORMAT.formatHex(publishResult.getPublish().getPayloadAsBytes()));
                    meterUtils.incrementCounter("output");
                })
                .doOnError(error -> log.error(error.getMessage()))
                .subscribe();
    }

    @Data
    public static class GatewayNotification {
        private String address;
        private String payload;
        private Long buildingId;
    }
}
HiveMqMqttConfig:
@Configuration
@Log4j2
public class HiveMqMqttConfig {

    @Value("${mqtt.endpointUrl}")
    private String endpointUrl;
    @Value("${mqtt.rootEndpointUrl}")
    private String rootEndpointUrl;
    @Value("${mqtt.inboundClientId}")
    private String inboundClientId;
    @Value("${mqtt.outboundClientId}")
    private String outboundClientId;
    @Value("${mqtt.caFilename:#{null}}")
    private String caFilename;
    @Value("${mqtt.inboundPrivateKeyFilename:#{null}}")
    private String inboundPrivateKeyFilename;
    @Value("${mqtt.inboundRootPrivateKeyFilename:#{null}}")
    private String inboundRootPrivateKeyFilename;
    @Value("${mqtt.inboundClientCertFilename:#{null}}")
    private String inboundClientCertFilename;
    @Value("${mqtt.inboundRootClientCertFilename:#{null}}")
    private String inboundRootClientCertFilename;
    @Value("${mqtt.outboundPrivateKeyFilename:#{null}}")
    private String outboundPrivateKeyFilename;
    @Value("${mqtt.outboundRootPrivateKeyFilename:#{null}}")
    private String outboundRootPrivateKeyFilename;
    @Value("${mqtt.outboundClientCertFilename:#{null}}")
    private String outboundClientCertFilename;
    @Value("${mqtt.outboundRootClientCertFilename:#{null}}")
    private String outboundRootClientCertFilename;

    @Bean(name = "inboundMqttClient")
    public Mqtt3ReactorClient inboundMqttClient() {
        var client = Mqtt3ReactorClient.from(buildMqtt3Client(endpointUrl, UUID.randomUUID().toString(), caFilename, inboundPrivateKeyFilename, inboundClientCertFilename));
        connectClient(client);
        return client;
    }

    @Bean(name = "inboundRootMqttClient")
    public Mqtt3ReactorClient inboundRootMqttClient() {
        var client = Mqtt3ReactorClient.from(buildMqtt3Client(rootEndpointUrl, UUID.randomUUID().toString(), caFilename, inboundRootPrivateKeyFilename, inboundRootClientCertFilename));
        connectClient(client);
        return client;
    }

    @Bean(name = "outboundMqttClient")
    public Mqtt3ReactorClient outboundMqttClient() {
        var client = Mqtt3ReactorClient.from(buildMqtt3Client(endpointUrl, UUID.randomUUID().toString(), caFilename, outboundPrivateKeyFilename, outboundClientCertFilename));
        connectClient(client);
        return client;
    }

    @Bean(name = "outboundRootMqttClient")
    public Mqtt3ReactorClient outboundRootMqttClient() {
        var client = Mqtt3ReactorClient.from(buildMqtt3Client(rootEndpointUrl, UUID.randomUUID().toString(), caFilename, outboundRootPrivateKeyFilename, outboundRootClientCertFilename));
        connectClient(client);
        return client;
    }

    private Mqtt3Client buildMqtt3Client(String endpointUrl, String clientId, String caFilename, String privateKeyFilename, String clientCertFilename) {
        log.info("Creating mqtt3 client with client id: {}", clientId);
        // endpoint is in the form 'protocol://host:port'
        String[] endpointUrlComponents = endpointUrl.split(":");
        String host = endpointUrlComponents[1].substring(2);
        int port = Integer.parseInt(endpointUrlComponents[2]);
        Mqtt3ClientBuilder mqtt3ClientBuilder = Mqtt3Client.builder()
                .identifier(clientId)
                .serverHost(host)
                .serverPort(port)
                .automaticReconnectWithDefaultConfig();
        try {
            if (caFilename != null && !caFilename.isEmpty()) {
                boolean isUsingKeyBasedAuthentication = privateKeyFilename != null && !privateKeyFilename.isEmpty() && clientCertFilename != null && !clientCertFilename.isEmpty();
                PemFileSslContext context = isUsingKeyBasedAuthentication
                        ? new PemFileSslContext(getStreamFromClassPathOrLocal(caFilename), getStreamFromClassPathOrLocal(privateKeyFilename), getStreamFromClassPathOrLocal(clientCertFilename))
                        : new PemFileSslContext(new ClassPathResource(caFilename).getInputStream());
                context.getSocketFactory();
                mqtt3ClientBuilder
                        .sslConfig()
                        .keyManagerFactory(context.getKeyManagerFactory())
                        .trustManagerFactory(context.getTrustManagerFactory())
                        .applySslConfig();
            }
        } catch (IOException | NoSuchAlgorithmException | KeyStoreException | CertificateException |
                 InvalidKeySpecException | UnrecoverableKeyException | PemFileSslContext.SocketFactoryCreationFailedException e) {
            throw new RuntimeException(e);
        }
        return mqtt3ClientBuilder.build();
    }

    private InputStream getStreamFromClassPathOrLocal(String uri) throws IOException {
        return new ClassPathResource(uri).getInputStream();
    }

    private void connectClient(Mqtt3ReactorClient mqtt3ReactorClient) {
        Mono<Mqtt3ConnAck> connAckSingle = mqtt3ReactorClient.connect();
        connAckSingle
                .doOnSuccess(connAck -> log.info("Connected, " + connAck.getReturnCode()))
                .doOnError(throwable -> log.info("Connection failed, " + throwable.getMessage()))
                .subscribe();
    }
}
config:
management:
  endpoint:
    health:
      group:
        liveness:
          include: livenessstate,kafkaConsumers
spring:
  cloud:
    stream:
      kafka:
        bindings:
          kafka-in-0:
            consumer:
              configuration:
                max.poll.records: 10
          kafkaRoot-in-0:
            consumer:
              configuration:
                max.poll.records: 10
      function:
        definition: kafka;kafkaRoot
      bindings:
        kafka-in-0:
          destination: output
          group: mqtt
          consumer:
            concurrency: 1
        kafkaRoot-in-0:
          destination: output
          group: mqtt-root
          consumer:
            concurrency: 1
... (certs/endpoints omitted)

RabbitTransactionManager not rolling back in a ChainedTransactionManager when an error occurs

I'm trying to use one transaction manager (ChainedTransactionManager) for Rabbit and Kafka, chaining RabbitTransactionManager and KafkaTransactionManager. We intend to achieve a best-effort one-phase commit.
To test it, the transactional method throws an exception after the two operations (sending a message to a Rabbit exchange and publishing an event to Kafka). When running the test, the logs suggest a rollback is initiated, but the message ends up in Rabbit anyway.
Notes:
We're using QPid to simulate in-memory RabbitMQ for testing (version 7.1.12)
We're using an in-memory Kafka for testing (spring-kafka-test)
Other relevant frameworks/libraries: spring-cloud-stream
Here's the method where the problem occurs:
@Transactional
public void processMessageAndEvent() {
    Message<String> message = MessageBuilder
            .withPayload("Message to RabbitMQ")
            .build();
    outputToRabbitMQExchange.output().send(message);
    outputToKafkaTopic.output().send(
            MessageBuilder.withPayload("Message to Kafka")
                    .setHeader(KafkaHeaders.MESSAGE_KEY, "Kafka message key")
                    .build()
    );
    throw new RuntimeException("We want the previous changes to rollback");
}
Here is the main Spring-boot application configuration:
@SpringBootApplication
@EnableTransactionManagement
public class MyApplication {

    public static void main(String[] args) {
        SpringApplication.run(MyApplication.class, args);
    }
}
Here is the TransactionManager configuration:
@Bean
public RabbitTransactionManager rabbitTransactionManager(ConnectionFactory cf) {
    return new RabbitTransactionManager(cf);
}

@Bean(name = "transactionManager")
@Primary
public ChainedTransactionManager chainedTransactionManager(RabbitTransactionManager rtm, BinderFactory binders) {
    ProducerFactory<byte[], byte[]> pf = ((KafkaMessageChannelBinder) binders.getBinder("kafka", MessageChannel.class))
            .getTransactionalProducerFactory();
    KafkaTransactionManager<byte[], byte[]> ktm = new KafkaTransactionManager<>(pf);
    ktm.setTransactionSynchronization(AbstractPlatformTransactionManager.SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
    return new ChainedKafkaTransactionManager<>(ktm, rtm);
}
And finally, the relevant configuration in the application.yml file:
spring:
  application:
    name: my-application
  main:
    allow-bean-definition-overriding: true
  cloud:
    stream:
      bindings:
        source_outputToRabbitMQExchange:
          content-type: application/json
          destination: outputToRabbitMQExchange
          group: ${spring.application.name}
        sink_outputToKafkaTopic:
          content-type: application/json
          destination: outputToKafkaTopic
          binder: kafka
      rabbit:
        bindings:
          output_outputToRabbitMQExchange:
            producer:
              transacted: true
              routing-key-expression: headers.myKey
      kafka:
        bindings:
          sink_outputToKafkaTopic:
            producer:
              transacted: true
        binder:
          brokers: ${...kafka.hostname}
          transaction:
            transaction-id-prefix: ${CF_INSTANCE_INDEX}.${spring.application.name}.T
      default-binder: rabbit
  kafka:
    producer:
      properties:
        max.block.ms: 3000
        transaction.timeout.ms: 5000
        enable.idempotence: true
        retries: 1
        acks: all
    bootstrap-servers: ${...kafka.hostname}
When we execute the method, we can see the message is still in Rabbit despite the logs saying the transaction is to be rolled back.
Is there anything we could be missing or have misunderstood?
@EnableBinding is deprecated in favor of the newer functional programming model.
That said, I copied your code/config pretty much as-is (transacted is not a Kafka producer binding property) and it works fine for me (Boot 2.4.5, cloud 2020.0.2)...
@SpringBootApplication
@EnableTransactionManagement
@EnableBinding(Bindings.class)
public class So67297869Application {

    public static void main(String[] args) {
        SpringApplication.run(So67297869Application.class, args);
    }

    @Bean
    public RabbitTransactionManager rabbitTransactionManager(ConnectionFactory cf) {
        return new RabbitTransactionManager(cf);
    }

    @Bean(name = "transactionManager")
    @Primary
    public ChainedTransactionManager chainedTransactionManager(RabbitTransactionManager rtm, BinderFactory binders) {
        ProducerFactory<byte[], byte[]> pf = ((KafkaMessageChannelBinder) binders.getBinder("kafka",
                MessageChannel.class))
                        .getTransactionalProducerFactory();
        KafkaTransactionManager<byte[], byte[]> ktm = new KafkaTransactionManager<>(pf);
        ktm.setTransactionSynchronization(AbstractPlatformTransactionManager.SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
        return new ChainedKafkaTransactionManager<>(ktm, rtm);
    }

    @Bean
    public ApplicationRunner runner(Foo foo) {
        return args -> {
            foo.send("test");
        };
    }
}

interface Bindings {

    @Output("source_outputToRabbitMQExchange")
    MessageChannel rabbitOut();

    @Output("sink_outputToKafkaTopic")
    MessageChannel kafkaOut();
}

@Component
class Foo {

    @Autowired
    Bindings bindings;

    @Transactional
    public void send(String in) {
        bindings.rabbitOut().send(MessageBuilder.withPayload(in)
                .setHeader("myKey", "test")
                .build());
        bindings.kafkaOut().send(MessageBuilder.withPayload(in)
                .setHeader(KafkaHeaders.MESSAGE_KEY, "test".getBytes())
                .build());
        throw new RuntimeException("fail");
    }
}
spring:
  application:
    name: my-application
  main:
    allow-bean-definition-overriding: true
  cloud:
    stream:
      bindings:
        source_outputToRabbitMQExchange:
          content-type: application/json
          destination: outputToRabbitMQExchange
          group: ${spring.application.name}
        sink_outputToKafkaTopic:
          content-type: application/json
          destination: outputToKafkaTopic
          binder: kafka
      rabbit:
        bindings:
          source_outputToRabbitMQExchange:
            producer:
              transacted: true
              routing-key-expression: headers.myKey
      kafka:
        binder:
          brokers: localhost:9092
          transaction:
            transaction-id-prefix: foo.${spring.application.name}.T
      default-binder: rabbit
  kafka:
    producer:
      properties:
        max.block.ms: 3000
        transaction.timeout.ms: 5000
        enable.idempotence: true
        retries: 1
        acks: all
    bootstrap-servers: localhost:9092
logging:
  level:
    org.springframework.transaction: debug
    org.springframework.kafka: debug
    org.springframework.amqp.rabbit: debug
2021-04-28 09:35:32.488 DEBUG 53253 --- [ main] o.s.a.r.t.RabbitTransactionManager : Initiating transaction rollback
2021-04-28 09:35:32.489 DEBUG 53253 --- [ main] o.s.a.r.connection.RabbitResourceHolder : Rolling back messages to channel: Cached Rabbit Channel: AMQChannel(amqp://guest@127.0.0.1:5672/,2), conn: Proxy@3c770db4 Shared Rabbit Connection: SimpleConnection@1f736d00 [delegate=amqp://guest@127.0.0.1:5672/, localPort= 63439]
2021-04-28 09:35:32.490 DEBUG 53253 --- [ main] o.s.a.r.t.RabbitTransactionManager : Resuming suspended transaction after completion of inner transaction
2021-04-28 09:35:32.490 DEBUG 53253 --- [ main] o.s.k.t.KafkaTransactionManager : Initiating transaction rollback
2021-04-28 09:35:32.490 DEBUG 53253 --- [ main] o.s.k.core.DefaultKafkaProducerFactory : CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@38e83838] abortTransaction()
And there is no message in the queue that I bound to the exchange with RK #.
What versions are you using?
EDIT
And here is the equivalent app after removing the deprecations, using the functional model and StreamBridge (same yaml):
@SpringBootApplication
@EnableTransactionManagement
public class So67297869Application {

    public static void main(String[] args) {
        SpringApplication.run(So67297869Application.class, args);
    }

    @Bean
    public RabbitTransactionManager rabbitTransactionManager(ConnectionFactory cf) {
        return new RabbitTransactionManager(cf);
    }

    @Bean(name = "transactionManager")
    @Primary
    public ChainedTransactionManager chainedTransactionManager(RabbitTransactionManager rtm, BinderFactory binders) {
        ProducerFactory<byte[], byte[]> pf = ((KafkaMessageChannelBinder) binders.getBinder("kafka",
                MessageChannel.class))
                        .getTransactionalProducerFactory();
        KafkaTransactionManager<byte[], byte[]> ktm = new KafkaTransactionManager<>(pf);
        ktm.setTransactionSynchronization(AbstractPlatformTransactionManager.SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
        return new ChainedKafkaTransactionManager<>(ktm, rtm);
    }

    @Bean
    public ApplicationRunner runner(Foo foo) {
        return args -> {
            foo.send("test");
        };
    }
}

@Component
class Foo {

    @Autowired
    StreamBridge bridge;

    @Transactional
    public void send(String in) {
        bridge.send("source_outputToRabbitMQExchange", MessageBuilder.withPayload(in)
                .setHeader("myKey", "test")
                .build());
        bridge.send("sink_outputToKafkaTopic", MessageBuilder.withPayload(in)
                .setHeader(KafkaHeaders.MESSAGE_KEY, "test".getBytes())
                .build());
        throw new RuntimeException("fail");
    }
}

Spring Cloud Sleuth with Reactor Kafka

I'm using Reactor Kafka in a Spring Boot Reactive app, with Spring Cloud Sleuth for distributed tracing.
I've setup Sleuth to use a custom propagation key from a header named "traceId".
I've also customized the log format to print the header in my logs, so a request like
curl -H "traceId: 123456" -X POST http://localhost:8084/parallel
will print 123456 in every log anywhere downstream starting from the Controller.
I would now like this header to be propagated via Kafka too. I understand that Sleuth has built-in instrumentation for Kafka, so the header should be propagated automatically; however, I'm unable to get this to work.
From my Controller, I produce a message onto a Kafka topic, and then have another Kafka consumer pick it up for processing.
Here's my Controller:
@RestController
@RequestMapping("/parallel")
public class BasicController {

    private Logger logger = Loggers.getLogger(BasicController.class);

    KafkaProducerLoadGenerator generator = new KafkaProducerLoadGenerator();

    @PostMapping
    public Mono<ResponseEntity> createMessage() {
        int data = (int) (Math.random() * 100000);
        return Flux.just(data)
                .doOnNext(num -> logger.info("Generating document for {}", num))
                .map(generator::generateDocument)
                .flatMap(generator::sendMessage)
                .doOnNext(result ->
                        logger.info("Sent message {}, offset is {} to partition {}",
                                result.getT2().correlationMetadata(),
                                result.getT2().recordMetadata().offset(),
                                result.getT2().recordMetadata().partition()))
                .doOnError(error -> logger.error("Error in subscribe while sending message", error))
                .single()
                .map(tuple -> ResponseEntity.status(HttpStatus.OK).body(tuple.getT1()));
    }
}
Here's the code that produces messages on to the Kafka topic
@Component
public class KafkaProducerLoadGenerator {

    private static final Logger logger = Loggers.getLogger(KafkaProducerLoadGenerator.class);

    private static final String bootstrapServers = "localhost:9092";
    private static final String TOPIC = "load-topic";

    private KafkaSender<Integer, String> sender;
    private static int documentIndex = 0;

    public KafkaProducerLoadGenerator() {
        this(bootstrapServers);
    }

    public KafkaProducerLoadGenerator(String bootstrapServers) {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "load-generator");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, IntegerSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        SenderOptions<Integer, String> senderOptions = SenderOptions.create(props);
        sender = KafkaSender.create(senderOptions);
    }

    @NewSpan("generator.sendMessage")
    public Flux<Tuple2<DataDocument, SenderResult<Integer>>> sendMessage(DataDocument document) {
        return sendMessage(TOPIC, document)
                .map(result -> Tuples.of(document, result));
    }

    public Flux<SenderResult<Integer>> sendMessage(String topic, DataDocument document) {
        ProducerRecord<Integer, String> producerRecord = new ProducerRecord<>(topic, document.getData(), document.toString());
        return sender.send(Mono.just(SenderRecord.create(producerRecord, document.getData())))
                .doOnNext(record -> logger.info("Sent message to partition={}, offset={} ", record.recordMetadata().partition(), record.recordMetadata().offset()))
                .doOnError(e -> logger.error("Error sending message " + documentIndex, e));
    }

    public DataDocument generateDocument(int data) {
        return DataDocument.builder()
                .header("Load Data")
                .data(data)
                .traceId("trace" + data)
                .timestamp(Instant.now())
                .build();
    }
}
My consumer looks like this:
@Component
@Scope(scopeName = ConfigurableBeanFactory.SCOPE_PROTOTYPE)
public class IndividualConsumer {

    private static final Logger logger = Loggers.getLogger(IndividualConsumer.class);

    private static final String bootstrapServers = "localhost:9092";
    private static final String TOPIC = "load-topic";

    private int consumerIndex = 0;

    public ReceiverOptions setupConfig(String bootstrapServers) {
        Map<String, Object> properties = new HashMap<>();
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        properties.put(ConsumerConfig.CLIENT_ID_CONFIG, "load-topic-consumer-" + consumerIndex);
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "load-topic-multi-consumer-2");
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, IntegerDeserializer.class);
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, DataDocumentDeserializer.class);
        return ReceiverOptions.create(properties);
    }

    public void setIndex(int i) {
        consumerIndex = i;
    }

    @EventListener(ApplicationReadyEvent.class)
    public Disposable consumeMessage() {
        ReceiverOptions<Integer, DataDocument> receiverOptions = setupConfig(bootstrapServers)
                .subscription(Collections.singleton(TOPIC))
                .addAssignListener(receiverPartitions -> logger.debug("onPartitionsAssigned {}", receiverPartitions))
                .addRevokeListener(receiverPartitions -> logger.debug("onPartitionsRevoked {}", receiverPartitions));
        Flux<ReceiverRecord<Integer, DataDocument>> messages = Flux.defer(() -> {
            KafkaReceiver<Integer, DataDocument> receiver = KafkaReceiver.create(receiverOptions);
            return receiver.receive();
        });
        Consumer<? super ReceiverRecord<Integer, DataDocument>> acknowledgeOffset = record -> record.receiverOffset().acknowledge();
        return messages
                .publishOn(Schedulers.newSingle("Parallel-Consumer"))
                .doOnError(error -> logger.error("Error in the reactive chain", error))
                .delayElements(Duration.ofMillis(100))
                .doOnNext(record -> {
                    logger.info("Consumer {}: Received from partition {}, offset {}, data with index {}",
                            consumerIndex,
                            record.receiverOffset().topicPartition(),
                            record.receiverOffset().offset(),
                            record.value().getData());
                })
                .doOnNext(acknowledgeOffset)
                .doOnError(error -> logger.error("Error receiving record", error))
                .retryBackoff(100, Duration.ofSeconds(5), Duration.ofMinutes(5))
                .subscribe();
    }
}
I would expect Sleuth to automatically carry over the built-in Brave trace and the custom headers to the consumer, so that the trace covers the entire transaction.
However, I have two problems:
1. The generator bean doesn't get the same trace as the one in the Controller. It uses a different (and new) trace for every message sent.
2. The trace isn't propagated from the Kafka producer to the Kafka consumer.
I can resolve #1 above by replacing the generator bean with a simple Java class and instantiating it in the controller. However that means I can't autowire other dependencies, and in any case it doesn't solve #2.
I am able to load an instance of the bean brave.kafka.clients.KafkaTracing, so I know it's being loaded by Spring. However, it doesn't look like the instrumentation is working. I inspected the content on Kafka using Kafka Tool, and no headers are populated on any message.
In fact the consumer doesn't have a trace at all.
2020-05-06 23:57:32.898 INFO parallel-consumer:local [123-21922,578c510e23567aec,578c510e23567aec] 8180 --- [reactor-http-nio-3] rja.parallelconsumers.BasicController : Generating document for 23965
2020-05-06 23:57:32.907 INFO parallel-consumer:local [52e02d36b59c5acd,52e02d36b59c5acd,52e02d36b59c5acd] 8180 --- [single-11] r.p.kafka.KafkaProducerLoadGenerator : Sent message to partition=17, offset=0
2020-05-06 23:57:32.908 INFO parallel-consumer:local [123-21922,578c510e23567aec,578c510e23567aec] 8180 --- [single-11] rja.parallelconsumers.BasicController : Sent message 23965, offset is 0 to partition 17
2020-05-06 23:57:33.012 INFO parallel-consumer:local [-,-,-] 8180 --- [parallel-5] r.parallelconsumers.IndividualConsumer : Consumer 8: Received from partition load-topic-17, offset 0, data with index 23965
In the log above, [123-21922,578c510e23567aec,578c510e23567aec] is [custom-trace-header, brave traceId, brave spanId]
What am I missing?
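One manual workaround, sketched below, is to copy the custom trace header onto the outgoing Kafka record yourself so it at least reaches the consumer as a record header. The helper name and the traceId argument are assumptions layered on the question's code (DataDocument and the KafkaSender come from the generator above); this is not part of Sleuth's instrumentation.

import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.producer.ProducerRecord;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.kafka.sender.KafkaSender;
import reactor.kafka.sender.SenderRecord;
import reactor.kafka.sender.SenderResult;

public class TracedSend {

    // Sends one document and manually attaches the custom "traceId" header to the Kafka record.
    public static Flux<SenderResult<Integer>> sendWithTraceId(KafkaSender<Integer, String> sender,
            String topic, DataDocument document, String traceId) {
        ProducerRecord<Integer, String> record =
                new ProducerRecord<>(topic, document.getData(), document.toString());
        record.headers().add("traceId", traceId.getBytes(StandardCharsets.UTF_8));
        return sender.send(Mono.just(SenderRecord.create(record, document.getData())));
    }
}

On the consumer side the header can be read back with record.headers().lastHeader("traceId") and restored into the logging context by hand; this is only a stopgap compared to getting Sleuth's Kafka instrumentation to wrap the producer and receiver.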

Transactional Kafka Producer

I am trying to make my Kafka producer transactional.
I am sending 10 messages. If any error occurs, no message should be sent to Kafka, i.e. none or all.
I am using Spring Boot's KafkaTemplate.
@Configuration
@EnableKafka
public class KakfaConfiguration {

    @Bean
    public ProducerFactory<String, String> producerFactory() {
        Map<String, Object> config = new HashMap<>();
        // props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SSL");
        // props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, appProps.getJksLocation());
        // props.put(SslConfigs.SSL_TRUSTSTORE_PASSWORD_CONFIG, appProps.getJksPassword());
        config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put(ProducerConfig.ACKS_CONFIG, acks);
        config.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, retryBackOffMsConfig);
        config.put(ProducerConfig.RETRIES_CONFIG, retries);
        config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        config.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "prod-99");
        return new DefaultKafkaProducerFactory<>(config);
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }

    @Bean(name = "ktm")
    public KafkaTransactionManager kafkaTransactionManager() {
        KafkaTransactionManager ktm = new KafkaTransactionManager(producerFactory());
        ktm.setTransactionSynchronization(AbstractPlatformTransactionManager.SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
        return ktm;
    }
}
I am sending 10 messages as shown below, as described in the documentation. 9 messages should be sent, and 1 message has a size over 1 MB, so it gets rejected by the Kafka broker with a RecordTooLargeException.
https://docs.spring.io/spring-kafka/reference/html/#using-kafkatransactionmanager
@Component
@EnableTransactionManagement
class Sender {

    @Autowired
    private KafkaTemplate<String, String> template;

    private static final Logger LOG = LoggerFactory.getLogger(Sender.class);

    @Transactional("ktm")
    public void sendThem(List<String> toSend) throws InterruptedException {
        List<ListenableFuture<SendResult<String, String>>> futures = new ArrayList<>();
        CountDownLatch latch = new CountDownLatch(toSend.size());
        ListenableFutureCallback<SendResult<String, String>> callback = new ListenableFutureCallback<SendResult<String, String>>() {

            @Override
            public void onSuccess(SendResult<String, String> result) {
                LOG.info(" message sucess : " + result.getProducerRecord().value());
                latch.countDown();
            }

            @Override
            public void onFailure(Throwable ex) {
                LOG.error("Message Failed ");
                latch.countDown();
            }
        };
        toSend.forEach(str -> {
            ListenableFuture<SendResult<String, String>> future = template.send("t_101", str);
            futures.add(future); // keep each future so it can be checked below
            future.addCallback(callback);
        });
        if (latch.await(12, TimeUnit.MINUTES)) {
            LOG.info("All sent ok");
        } else {
            for (int i = 0; i < toSend.size(); i++) {
                if (!futures.get(i).isDone()) {
                    LOG.error("No send result for " + toSend.get(i));
                }
            }
        }
    }
}
But when I look at the topic t_hello_world, 9 messages are there. My expectation was to see 0 messages, since my producer is transactional. How can I achieve that?
I am getting the following logs:
2020-04-30 18:04:36.036 ERROR 18688 --- [ scheduling-1] o.s.k.core.DefaultKafkaProducerFactory : commitTransaction failed: CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@1eb5a312, txId=prod-990]
org.apache.kafka.common.KafkaException: Cannot execute transactional method because we are in an error state
at org.apache.kafka.clients.producer.internals.TransactionManager.maybeFailWithError(TransactionManager.java:923) ~[kafka-clients-2.4.1.jar:na]
at org.apache.kafka.clients.producer.internals.TransactionManager.lambda$beginCommit$2(TransactionManager.java:297) ~[kafka-clients-2.4.1.jar:na]
at org.apache.kafka.clients.producer.internals.TransactionManager.handleCachedTransactionRequestResult(TransactionManager.java:1013) ~[kafka-clients-2.4.1.jar:na]
at org.apache.kafka.clients.producer.internals.TransactionManager.beginCommit(TransactionManager.java:296) ~[kafka-clients-2.4.1.jar:na]
at org.apache.kafka.clients.producer.KafkaProducer.commitTransaction(KafkaProducer.java:713) ~[kafka-clients-2.4.1.jar:na]
at org.springframework.kafka.core.DefaultKafkaProducerFactory$CloseSafeProducer.commitTransaction(DefaultKafkaProducerFactory.java
Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
2020-04-30 18:04:36.037 WARN 18688 --- [ scheduling-1] o.s.k.core.DefaultKafkaProducerFactory : Error during transactional operation; producer removed from cache; possible cause: broker restarted during transaction: CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@1eb5a312, txId=prod-990]
2020-04-30 18:04:36.038 INFO 18688 --- [ scheduling-1] o.a.k.clients.producer.KafkaProducer : [Producer clientId=producer-prod-990, transactionalId=prod-990] Closing the Kafka producer with timeoutMillis = 5000 ms.
2020-04-30 18:04:36.038 INFO 18688 --- [oducer-prod-990] o.a.k.clients.producer.internals.Sender : [Producer clientId=producer-prod-990, transactionalId=prod-990] Aborting incomplete transaction due to shutdown
Uncommitted records are written to the log; when a transaction commits or rolls back, an extra record is written to the log with the state of the transaction.
Consumers, by default, see all records, including the uncommitted records (but not the special commit/abort record).
For the console consumer, you need to set the isolation level to read_committed. See the help:
--isolation-level <String>    Set to read_committed in order to
                              filter out transactional messages
                              which are not committed. Set to
                              read_uncommitted to read all
                              messages. (default: read_uncommitted)
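The same applies to a plain Java consumer: unless it sets isolation.level=read_committed it will also see the records of the aborted transaction. A minimal sketch (the topic name and group id are just placeholders):

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReadCommittedConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "tx-demo");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        // Only records from committed transactions become visible to this consumer.
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("t_101"));
            consumer.poll(Duration.ofSeconds(5)).forEach(record -> System.out.println(record.value()));
        }
    }
}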
If I provide the configuration below in the yml file, will I still need to create the factory, template and transaction-manager beans as given in the example code?
And for the given transaction example, if I use a simple consumer (Java code) or Kafka Tool, will I be able to view any records for the transaction above? Hopefully not; am I correct as per the transaction example?
spring:
  profiles: local
  kafka:
    producer:
      client-id: book-event-producer-client
      bootstrap-servers: localhost:9092,localhost:9093,localhost:9094
      key-serializer: org.apache.kafka.common.serialization.IntegerSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
      transaction-id-prefix: tx-${random.uuid}
      properties:
        enable.idempotence: true
        acks: all
        retries: 2
        metadata.max.idle.ms: 10000
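With those spring.kafka.* properties in place (in particular the transaction-id-prefix), Spring Boot should auto-configure the transactional ProducerFactory, the KafkaTemplate and a KafkaTransactionManager, so the hand-written factory/template/transaction-manager beans from the configuration class above should not be needed. A minimal usage sketch with the auto-configured template (the topic name is illustrative): any exception thrown inside the callback aborts the whole transaction, so either all sends become visible to read_committed consumers or none do.

import java.util.List;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

@Component
class AllOrNothingSender {

    private final KafkaTemplate<String, String> template;

    AllOrNothingSender(KafkaTemplate<String, String> template) {
        this.template = template;
    }

    public void sendAllOrNothing(List<String> toSend) {
        // Runs all sends inside a single Kafka transaction.
        template.executeInTransaction(ops -> {
            toSend.forEach(value -> ops.send("t_101", value));
            return null;
        });
    }
}

As noted above, a plain consumer or Kafka Tool will still show the aborted records unless it reads with isolation.level=read_committed.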

KafkaStreamsStateStore not working when the store value is an Avro SpecificRecord

I have a Spring Cloud Kafka Streams application that uses a StateStore in the Processor API, when using a transformer to perform a deduplication.
The state store key-value are of the following types: <String, TransferEmitted>.
When running the application, at the moment of putting a value in the state store (dedupStore.put(key, value)), I get this exception:
Caused by: java.lang.ClassCastException: com.codependent.outboxpattern.account.TransferEmitted cannot be cast to java.lang.String
This is due to the fact that the default value serde for the KafkaStreamsStateStore is a StringSerde.
Thus, I have added the valueSerde parameter in the KafkaStreamsStateStore annotation, indicating the one for a SpecificAvroSerde:
@KafkaStreamsStateStore(name = DEDUP_STORE, type = KafkaStreamsStateStoreProperties.StoreType.KEYVALUE,
        valueSerde = "io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde")
Now I get a NullPointerException in AbstractKafkaAvroSerializer.serializeImpl because at id = this.schemaRegistry.getId(subject, schema); schemaRegistry is null:
Caused by: org.apache.kafka.common.errors.SerializationException: Error serializing Avro message
Caused by: java.lang.NullPointerException
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:82)
at io.confluent.kafka.serializers.KafkaAvroSerializer.serialize(KafkaAvroSerializer.java:53)
at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:65)
at io.confluent.kafka.streams.serdes.avro.SpecificAvroSerializer.serialize(SpecificAvroSerializer.java:38)
Despite having configured the schema registry as a Spring bean...
@Configuration
class SchemaRegistryConfiguration {

    @Bean
    fun schemaRegistryClient(@Value("\${spring.cloud.stream.schema-registry-client.endpoint}") endpoint: String): SchemaRegistryClient {
        val client = ConfluentSchemaRegistryClient()
        client.setEndpoint(endpoint)
        return client
    }
}
...when Kafka sets up the SpecificAvroSerde it uses the no-params constructor so it doesn't initialize the schema registry client:
public class SpecificAvroSerde<T extends SpecificRecord> implements Serde<T> {

    private final Serde<T> inner;

    public SpecificAvroSerde() {
        this.inner = Serdes.serdeFrom(new SpecificAvroSerializer(), new SpecificAvroDeserializer());
    }

    public SpecificAvroSerde(SchemaRegistryClient client) {
        if (client == null) {
            throw new IllegalArgumentException("schema registry client must not be null");
        } else {
            this.inner = Serdes.serdeFrom(new SpecificAvroSerializer(client), new SpecificAvroDeserializer(client));
        }
    }
How can I configure this application so that it allows to serialize a StateStore<String, TransferEmitted>?
EXCERPTS FROM THE PROJECT (source available at https://github.com/codependent/kafka-outbox-pattern)
KStream
const val DEDUP_STORE = "dedup-store"

@EnableBinding(KafkaStreamsProcessor::class)
class FraudKafkaStreamsConfiguration(private val fraudDetectionService: FraudDetectionService) {

    @KafkaStreamsStateStore(name = DEDUP_STORE, type = KafkaStreamsStateStoreProperties.StoreType.KEYVALUE)
    @StreamListener
    @SendTo("output")
    fun process(@Input("input") input: KStream<String, TransferEmitted>): KStream<String, TransferEmitted> {
        return input
                .transform(TransformerSupplier { DeduplicationTransformer() }, DEDUP_STORE)
                .filter { _, value -> fraudDetectionService.isFraudulent(value) }
    }
}
Transformer
@Suppress("UNCHECKED_CAST")
class DeduplicationTransformer : Transformer<String, TransferEmitted, KeyValue<String, TransferEmitted>> {

    private lateinit var dedupStore: KeyValueStore<String, TransferEmitted>
    private lateinit var context: ProcessorContext

    override fun init(context: ProcessorContext) {
        this.context = context
        dedupStore = context.getStateStore(DEDUP_STORE) as KeyValueStore<String, TransferEmitted>
    }

    override fun transform(key: String, value: TransferEmitted): KeyValue<String, TransferEmitted>? {
        return if (isDuplicate(key)) {
            null
        } else {
            dedupStore.put(key, value)
            KeyValue(key, value)
        }
    }

    private fun isDuplicate(key: String) = dedupStore[key] != null

    override fun close() {
    }
}
application.yml
spring:
  application:
    name: fraud-service
  cloud:
    stream:
      schema-registry-client:
        endpoint: http://localhost:8081
      kafka:
        streams:
          binder:
            configuration:
              application:
                id: fraud-service
              default:
                key:
                  serde: org.apache.kafka.common.serialization.Serdes$StringSerde
              schema:
                registry:
                  url: http://localhost:8081
      bindings:
        input:
          destination: transfer
          contentType: application/*+avro
        output:
          destination: fraudulent-transfer
          contentType: application/*+avro
server:
  port: 8086
logging:
  level:
    org.springframework.cloud.stream: debug
I ran into the same issue and forgot that schema.registry.url needs to be passed in to make sure that you can store Avro records in your state store.
For example:
@Bean
public StoreBuilder eventStore(Map<String, String> schemaConfig) {
    final Duration windowSize = Duration.ofMinutes(DUPLICATION_WINDOW_DURATION);

    // retention period must be at least window size -- for this use case, we don't need a longer retention period
    // and thus just use the window size as retention time
    final Duration retentionPeriod = windowSize;

    // We have to specify schema.registry.url here, otherwise schemaRegistry value will end up null
    KafkaAvroSerializer serializer = new KafkaAvroSerializer();
    KafkaAvroDeserializer deserializer = new KafkaAvroDeserializer();
    serializer.configure(schemaConfig, true);
    deserializer.configure(schemaConfig, true);

    final StoreBuilder<WindowStore<Object, Long>> dedupStoreBuilder = Stores.windowStoreBuilder(
            Stores.persistentWindowStore(STORE_NAME,
                    retentionPeriod,
                    windowSize,
                    false
            ),
            Serdes.serdeFrom(serializer, deserializer),
            // timestamp value is long
            Serdes.Long());

    return dedupStoreBuilder;
}

@Bean
public Map<String, String> schemaConfig(@Value("${spring.cloud.stream.schemaRegistryClient.endpoint}") String url) {
    return Collections.singletonMap("schema.registry.url", "http://localhost:8081");
}
Here's the application.yml file:
spring:
  cloud:
    stream:
      schemaRegistryClient:
        endpoint: http://localhost:8081
After I did this, I was able to get this Store properly configured and didn't see a NullPointerException anymore.
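The same approach should carry over to the question's KeyValueStore<String, TransferEmitted>. There the Avro record is the store value rather than the key, so the serde would be configured with isKey = false and passed as the value serde of a key-value store builder. A rough sketch, reusing the schemaConfig bean above and assuming the DEDUP_STORE name and TransferEmitted type from the question:

import java.util.Map;
import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;
import org.springframework.context.annotation.Bean;

@Bean
public StoreBuilder<KeyValueStore<String, TransferEmitted>> dedupStore(Map<String, String> schemaConfig) {
    // Configure the serde with schema.registry.url so its internal schema registry client is not null.
    SpecificAvroSerde<TransferEmitted> valueSerde = new SpecificAvroSerde<>();
    valueSerde.configure(schemaConfig, false); // false = this serde is used for the store values
    return Stores.keyValueStoreBuilder(
            Stores.persistentKeyValueStore(DEDUP_STORE),
            Serdes.String(),
            valueSerde);
}

The DeduplicationTransformer can then keep looking the store up under DEDUP_STORE exactly as before.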