Kafka - Retries and Recovery not invoked

I have implemented a custom listener, i.e. not using the @KafkaListener annotation, because my application needs to listen to topics dynamically. I have seen suggestions to move to Spring Kafka 2.6.x, but I can't upgrade because I am stuck (at least for now) on Spring 5.1.x.RELEASE, which means I can only use Spring Kafka 2.2.x.
My question is: how can I achieve retry, recovery and error handling with Spring Kafka 2.2.x?
ConcurrentKafkaListenerContainerFactory listenerFactory = new ConcurrentKafkaListenerContainerFactory();
listenerFactory.setConsumerFactory(consumerFactory);
configurer.configure(listenerFactory, consumerFactory);
listenerFactory.setConcurrency(listenerConcurrency);
listenerFactory.setStatefulRetry(Boolean.TRUE);
listenerFactory.setBatchListener(isBatchListener);
listenerFactory.getContainerProperties().setTransactionManager(chainedKafkaTransactionManager);
listenerFactory.getContainerProperties().setAckOnError(false);
listenerFactory.setRetryTemplate(retryTemplate(kafkaEhCacheRetryManager));
listenerFactory.setRecoveryCallback(kafkaRecoverer);
My retry template looks like:
RetryTemplate retryTemplate(EhCacheCacheManager kafkaEhCacheRetryManager) {
ExponentialBackOffPolicy exponentialBackOffPolicy = new ExponentialBackOffPolicy();
exponentialBackOffPolicy.setInitialInterval(initialIntervalForRetries);
exponentialBackOffPolicy.setSleeper(new ThreadWaitSleeper());
exponentialBackOffPolicy.setMultiplier(2.0);
exponentialBackOffPolicy.setMaxInterval(maxIntervalForRetries);
RetryTemplate retryTemplate = new RetryTemplate();
retryTemplate.setRetryContextCache(new KafkaEhRetryContextCache(kafkaEhCacheRetryManager));
retryTemplate.setBackOffPolicy(exponentialBackOffPolicy);
// KafkaTransactionalRetryPolicy extends SimpleRetryPolicy
KafkaTransactionalRetryPolicy retryPolicy = new KafkaTransactionalRetryPolicy(kafkaTemplate);
retryPolicy.setMaxAttempts(maxAttempts);
retryTemplate.setRetryPolicy(retryPolicy);
return retryTemplate;
}
My listener looks like:
public class MyKafkaListener implements MessageListener<String, String> {
@Override
@Transactional(value = "chainedKafkaTransactionManager")
public void onMessage(final ConsumerRecord<String, String> consumerRecord){
throw new RuntimeException("thrown out of out anger");
}
}
with this config:
spring.kafka:
  bootstrap-servers: ${service.kakfa.host}
  admin:
    client-id: test-consumers
    bootstrap-servers: ${service.kakfa.host}
  consumer:
    bootstrap-servers: ${service.kakfa.host}
    group-id: local-consumers
    client-id: local-consumers
    auto-offset-reset: earliest
    value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
    enable-auto-commit: false
    isolation-level: read_committed
  producer:
    bootstrap-servers: ${service.kakfa.host}
    client-id: local-producer
    acks: all
    retries: 3
    transaction-id-prefix: local-producer-tx-
    properties:
      enable.idempotence: true
      transactional.id: tran-id-1-
      max.in.flight.requests.per.connection: 5
  listener:
    concurrency: 1
I have seen several examples on StackOverflow on how to do this, but none has worked so far.

The retry mechanism you are trying to use only applies to @KafkaListener methods; it is built into the listener adapter used to invoke the listener POJO.
In newer versions, the SeekToCurrentErrorHandler and DefaultAfterRollbackProcessor have a back off (since 2.3), eliminating the need for a retry template at the listener level in favor of retry at the container level.
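For illustration, container-level retry on 2.3+ might look like the sketch below (the consumerFactory bean, the types and the retry counts are assumptions, not taken from the question):

import org.springframework.context.annotation.Bean;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
        ConsumerFactory<String, String> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // retry each failed delivery twice more, 1 second apart, before giving up
    factory.setErrorHandler(new SeekToCurrentErrorHandler(new FixedBackOff(1000L, 2L)));
    return factory;
}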
With your own listener you would have to use a RetryTemplate within the listener code itself.
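For Spring Kafka 2.2.x with a hand-wired MessageListener, a minimal sketch could run the business logic through a RetryTemplate inside onMessage(), with a RecoveryCallback invoked once the retries are exhausted (the retryTemplate wiring and the process()/recover() helpers are illustrative assumptions):

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.listener.MessageListener;
import org.springframework.retry.support.RetryTemplate;

public class RetryingKafkaListener implements MessageListener<String, String> {

    private final RetryTemplate retryTemplate;

    public RetryingKafkaListener(RetryTemplate retryTemplate) {
        this.retryTemplate = retryTemplate;
    }

    @Override
    public void onMessage(ConsumerRecord<String, String> consumerRecord) {
        retryTemplate.execute(context -> {
            // business logic; throwing here triggers the back off and another attempt
            process(consumerRecord);
            return null;
        }, context -> {
            // called only after the retry policy is exhausted
            recover(consumerRecord, context.getLastThrowable());
            return null;
        });
    }

    private void process(ConsumerRecord<String, String> consumerRecord) { /* ... */ }

    private void recover(ConsumerRecord<String, String> consumerRecord, Throwable cause) { /* ... */ }
}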
BTW, Spring 5.1.x is no longer supported https://github.com/spring-projects/spring-framework/wiki/Spring-Framework-Versions#supported-versions

Related

Quarkus: Smallrye Kafka configure channels for distinct bootstrap servers using different KEYS and PASSWORD

My goal is to produce events on 2 different channels using distinct bootstrap servers, with JAAS configuration over SASL_SSL, but I am not able to set up the channels to authenticate correctly against the bootstrap servers.
I've tried the following setup
mp.messaging.outgoing.channel1.bootstrap.servers=${KAFKA1}
mp.messaging.outgoing.channel1.ssl.endpoint-identification-algorithm=https
mp.messaging.outgoing.channel1.security.protocol=SASL_SSL
mp.messaging.outgoing.channel1.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="${KEY1}" password="${PWD1}";
mp.messaging.outgoing.channel1.sasl.mechanism=PLAIN
mp.messaging.outgoing.channel2.bootstrap.servers=${KAFKA2}
mp.messaging.outgoing.channel2.ssl.endpoint-identification-algorithm=https
mp.messaging.outgoing.channel2.security.protocol=SASL_SSL
mp.messaging.outgoing.channel2.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="${KEY2}" password="${PWD2}";
mp.messaging.outgoing.channel2.sasl.mechanism=PLAIN
Using this setup I am receiving errors on the channel initialization.
2023-01-18 13:57:10 13:57:10.445 ERROR [Application] (main) Failed to start application (with profile prod): java.lang.IllegalArgumentException: Could not find a 'KafkaClient' entry in the JAAS configuration. System property 'java.security.auth.login.config' is not set
2023-01-18 13:57:10 at org.apache.kafka.common.security.JaasContext.defaultContext(JaasContext.java:131)
2023-01-18 13:57:10 at org.apache.kafka.common.security.JaasContext.load(JaasContext.java:96)
2023-01-18 13:57:10 at org.apache.kafka.common.security.JaasContext.loadClientContext(JaasContext.java:82)
2023-01-18 13:57:10 at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:167)
2023-01-18 13:57:10 at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:81)
2023-01-18 13:57:10 at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:105)
The initial setup used the default bootstrap settings, and it worked fine until the second cluster was brought into the equation.
kafka.bootstrap.servers='${KAFKA1}'
kafka.ssl.endpoint-identification-algorithm=https
kafka.security.protocol=SASL_SSL
kafka.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="${Key1}" password="${PWD1}";
kafka.sasl.mechanism=PLAIN
I've tried what is described above, but I cannot figure out how to configure the channels to authenticate against 2 different bootstrap servers.
As the error says, you need a JAAS conf set in your JVM system properties
-Djava.security.auth.login.config=/path/to/kafka-jaas.conf
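If you go that route, the referenced file typically contains a KafkaClient section along these lines (a sketch; the login module and credentials are placeholders for whatever your cluster uses):

KafkaClient {
    org.apache.kafka.common.security.plain.PlainLoginModule required
    username="my-username"
    password="my-password";
};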
After reading the documentation (https://quarkus.io/guides/kafka), I found the kafka-configuration channel attribute, which allows supplying a custom configuration map for a channel.
Therefore the solution is to implement a provider bean:
@ApplicationScoped
@Slf4j
public class KafkaConfigBean {

    @Produces
    @Identifier("kafka1")
    public Map<String, Object> kafkaConfig() {
        HashMap<String, Object> config = new HashMap<>();
        config.put("security.protocol", "SASL_SSL");
        config.put("sasl.mechanism", "PLAIN");
        String saslConfig = String.format("org.apache.kafka.common.security.plain.PlainLoginModule required username=\"%S\" password=\"%s\";",
                System.getenv("KEY1"), System.getenv("PWD1"));
        config.put("sasl.jaas.config", saslConfig);
        log.info("Initialized Kafka 1 config");
        return config;
    }

    @Produces
    @Identifier("kafka2")
    public Map<String, Object> kafkaConfigPTT() {
        HashMap<String, Object> config = new HashMap<>();
        config.put("security.protocol", "SASL_SSL");
        config.put("sasl.mechanism", "PLAIN");
        String saslConfig = String.format("org.apache.kafka.common.security.plain.PlainLoginModule required username=\"%S\" password=\"%s\";",
                System.getenv("KEY2"), System.getenv("PWD2"));
        config.put("sasl.jaas.config", saslConfig);
        log.info("Initialized Kafka 2 config");
        return config;
    }
}
This results in the following configuration file:
mp.messaging.outgoing.channel1.bootstrap.servers=${KAFKA1}
mp.messaging.outgoing.channel1.kafka-configuration=kafka1
mp.messaging.outgoing.channel2.bootstrap.servers=${KAFKA2}
mp.messaging.outgoing.channel2.kafka-configuration=kafka2

Exactly once semantic with spring kafka

I'm trying to test my exactly-once configuration to make sure all the configs I set are correct and the behavior is as I expect.
I seem to encounter a problem with duplicate sends.
public static void main(String[] args) {
MessageProducer producer = new ProducerBuilder()
.setBootstrapServers("kafka:9992")
.setKeySerializerClass(StringSerializer.class)
.setValueSerializerClass(StringSerializer.class)
.setProducerEnableIdempotence(true).build();
MessageConsumer consumer = new ConsumerBuilder()
.setBootstrapServers("kafka:9992")
.setIsolationLevel("read_committed")
.setTopics("someTopic2")
.setGroupId("bla")
.setKeyDeserializerClass(StringDeserializer.class)
.setValueDeserializerClass(MapDeserializer.class)
.setConsumerMessageLogic(new ConsumerMessageLogic() {
@Override
public void onMessage(ConsumerRecord cr, Acknowledgment acknowledgment) {
producer.sendMessage(new TopicPartition("someTopic2", cr.partition()),
new OffsetAndMetadata(cr.offset() + 1),"something1", "im in transaction", cr.key());
acknowledgment.acknowledge();
}
}).build();
consumer.start();
}
This is my "test"; you can assume the builders put the right configuration in place.
ConsumerMessageLogic is a class that handles the "process" part of the read-process-write cycle that the exactly-once semantics support.
Inside the producer class I have a sendMessage method like so:
public void sendMessage(TopicPartition topicPartition, OffsetAndMetadata offsetAndMetadata,String sendToTopic, V message, PK partitionKey) {
try {
KafkaRecord<PK, V> partitionAndMessagePair = producerMessageLogic.prepareMessage(topicPartition.topic(), partitionKey, message);
if(kafkaTemplate.getProducerFactory().transactionCapable()){
kafkaTemplate.executeInTransaction(operations -> {
sendMessage(message, partitionKey, sendToTopic, partitionAndMessagePair, operations);
operations.sendOffsetsToTransaction(
Map.of(topicPartition, offsetAndMetadata),"bla");
return true;
});
}else{
sendMessage(message, partitionKey, topicPartition.topic(), partitionAndMessagePair, kafkaTemplate);
}
}catch (Exception e){
failureHandler.onFailure(partitionKey, message, e);
}
}
I create my consumer like so:
/**
* Start the message consumer
* The record event will be delegate on the onMessage()
*/
public void start() {
initConsumerMessageListenerContainer();
container.start();
}
/**
* Initialize the kafka message listener
*/
private void initConsumerMessageListenerContainer() {
// start a acknowledge message listener to allow the manual commit
messageListener = consumerMessageLogic::onMessage;
// start and initialize the consumer container
container = initContainer(messageListener);
// sets the number of consumers, the topic partitions will be divided by the consumers
container.setConcurrency(springConcurrency);
springContainerPollTimeoutOpt.ifPresent(p -> container.getContainerProperties().setPollTimeout(p));
if (springAckMode != null) {
container.getContainerProperties().setAckMode(springAckMode);
}
}
private ConcurrentMessageListenerContainer<PK, V> initContainer(AcknowledgingMessageListener<PK, V> messageListener) {
return new ConcurrentMessageListenerContainer<>(
consumerFactory(props),
containerProperties(messageListener));
}
When I create my producer, I create it with a UUID as the transaction prefix, like so:
public ProducerFactory<PK, V> producerFactory(boolean isTransactional) {
ProducerFactory<PK, V> res = new DefaultKafkaProducerFactory<>(props);
if(isTransactional){
((DefaultKafkaProducerFactory<PK, V>) res).setTransactionIdPrefix(UUID.randomUUID().toString());
((DefaultKafkaProducerFactory<PK, V>) res).setProducerPerConsumerPartition(true);
}
return res;
}
Now, after everything is set up, I bring up 2 instances on a topic with 2 partitions;
each instance gets 1 partition of the consumed topic.
I send a message and wait in debug for the transaction timeout (to simulate loss of connection) in instance A. Once the timeout passes, the other instance (instance B) automatically processes the record and sends it to the target topic, because a rebalance occurred.
So far so good.
Now when I release the breakpoint on instance A, it says it is rebalancing and couldn't commit, but I still see another output record in my destination topic.
My expectation was that instance A would not continue its work once I release the breakpoint, as the record was already processed.
Am I doing something wrong?
Can this scenario be achieved?
Edit 2:
After Gary's remarks about executeInTransaction, I still get a duplicate record if I freeze one of the instances until the timeout and release it after the other instance has processed the record; the frozen instance then processes and produces the same record to the output topic...
public static void main(String[] args) {
MessageProducer producer = new ProducerBuilder()
.setBootstrapServers("kafka:9992")
.setKeySerializerClass(StringSerializer.class)
.setValueSerializerClass(StringSerializer.class)
.setProducerEnableIdempotence(true).build();
MessageConsumer consumer = new ConsumerBuilder()
.setBootstrapServers("kafka:9992")
.setIsolationLevel("read_committed")
.setTopics("someTopic2")
.setGroupId("bla")
.setKeyDeserializerClass(StringDeserializer.class)
.setValueDeserializerClass(MapDeserializer.class)
.setConsumerMessageLogic(new ConsumerMessageLogic() {
@Override
public void onMessage(ConsumerRecord cr, Acknowledgment acknowledgment) {
producer.sendMessage("something1", "im in transaction");
}
}).build();
consumer.start(producer.getProducerFactory());
}
The new sendMessage method in the producer, without executeInTransaction:
public void sendMessage(V message, PK partitionKey, String topicName) {
try {
KafkaRecord<PK, V> partitionAndMessagePair = producerMessageLogic.prepareMessage(topicName, partitionKey, message);
sendMessage(message, partitionKey, topicName, partitionAndMessagePair, kafkaTemplate);
}catch (Exception e){
failureHandler.onFailure(partitionKey, message, e);
}
}
I also changed the consumer container creation to have a transaction manager with the same ProducerFactory, as suggested:
/**
* Initialize the kafka message listener
*/
private void initConsumerMessageListenerContainer(ProducerFactory<PK,V> producerFactory) {
// start a acknowledge message listener to allow the manual commit
acknowledgingMessageListener = consumerMessageLogic::onMessage;
// start and initialize the consumer container
container = initContainer(acknowledgingMessageListener, producerFactory);
// sets the number of consumers, the topic partitions will be divided by the consumers
container.setConcurrency(springConcurrency);
springContainerPollTimeoutOpt.ifPresent(p -> container.getContainerProperties().setPollTimeout(p));
if (springAckMode != null) {
container.getContainerProperties().setAckMode(springAckMode);
}
}
private ConcurrentMessageListenerContainer<PK, V> initContainer(AcknowledgingMessageListener<PK, V> messageListener, ProducerFactory<PK,V> producerFactory) {
return new ConcurrentMessageListenerContainer<>(
consumerFactory(props),
containerProperties(messageListener, producerFactory));
}
@NonNull
private ContainerProperties containerProperties(MessageListener<PK, V> messageListener, ProducerFactory<PK,V> producerFactory) {
ContainerProperties containerProperties = new ContainerProperties(topics);
containerProperties.setMessageListener(messageListener);
containerProperties.setTransactionManager(new KafkaTransactionManager<>(producerFactory));
return containerProperties;
}
My expectation is that the broker, once it receives the processed record from the frozen instance, will know that the record was already handled by another instance because it contains the exact same metadata (or does it? I mean, the PID will be different, but should it be?).
Maybe the scenario I'm looking for is not even supported by the exactly-once support Kafka and Spring currently provide...
If I have 2 instances of read-process-write, that means I have 2 producers with 2 different PIDs.
Now when I freeze one of the instances and the unfrozen instance takes over processing the record due to a rebalance, it will send the record with its own PID and a sequence number in the metadata.
When I release the frozen instance, it sends the same record, but with its own PID, so there is no way the broker will know it is a duplicate...
Am I wrong? How can I avoid this scenario? I thought the rebalance stops the instance and doesn't let it complete its processing (where it produces the duplicate record), since it no longer has responsibility for that record.
Adding the logs:
Frozen instance: you can see the freeze time at 10:53:34, and I released it at 10:54:02 (the rebalance time is 10 secs).
2020-06-16 10:53:34,393 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296]
Created new Producer: CloseSafeProducer
[delegate=org.apache.kafka.clients.producer.KafkaProducer#5c7f5906]
2020-06-16 10:53:34,394 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296]
CloseSafeProducer
[delegate=org.apache.kafka.clients.producer.KafkaProducer#5c7f5906]
beginTransaction()
2020-06-16 10:53:34,395 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.t.KafkaTransactionManager.doBegin:149] Created
Kafka transaction on producer [CloseSafeProducer
[delegate=org.apache.kafka.clients.producer.KafkaProducer#5c7f5906]]
2020-06-16 10:54:02,157 INFO [${sys:spring.application.name}] [kafka-
coordinator-heartbeat-thread | bla]
[o.a.k.c.c.i.AbstractCoordinator.:] [Consumer clientId=consumer-bla-1,
groupId=bla] Group coordinator X.X.X.X:9992 (id: 2147482646 rack:
null) is unavailable or invalid, will attempt rediscovery
2020-06-16 10:54:02,181 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1]
[o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296]
Sending offsets to transaction: {someTopic2-
0=OffsetAndMetadata{offset=23, leaderEpoch=null, metadata=''}}
2020-06-16 10:54:02,189 INFO [${sys:spring.application.name}] [kafka-
producer-network-thread | producer-b76e8aba-8149-48f8-857b-
a19195f5a20abla.someTopic2.0] [i.i.k.s.p.SimpleSuccessHandler.:] Sent
message=[im in transaction] with offset=[252] to topic something1
2020-06-16 10:54:02,193 INFO [${sys:spring.application.name}] [kafka-
producer-network-thread | producer-b76e8aba-8149-48f8-857b-
a19195f5a20abla.someTopic2.0] [o.a.k.c.p.i.TransactionManager.:]
[Producer clientId=producer-b76e8aba-8149-48f8-857b-
a19195f5a20abla.someTopic2.0, transactionalId=b76e8aba-8149-48f8-857b-
a19195f5a20abla.someTopic2.0] Discovered group coordinator
X.X.X.X:9992 (id: 1001 rack: null)
2020-06-16 10:54:02,263 INFO [${sys:spring.application.name}] [kafka-
coordinator-heartbeat-thread | bla]
[o.a.k.c.c.i.AbstractCoordinator.:] [Consumer clientId=consumer-bla-1,
groupId=bla] Discovered group coordinator 192.168.144.1:9992 (id:
2147482646 rack: null)
2020-06-16 10:54:02,295 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.t.KafkaTransactionManager.processCommit:740]
Initiating transaction commit
2020-06-16 10:54:02,296 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296]
CloseSafeProducer
[delegate=org.apache.kafka.clients.producer.KafkaProducer#5c7f5906]
commitTransaction()
2020-06-16 10:54:02,299 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1]
[o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296]
Commit list: {}
2020-06-16 10:54:02,301 INFO [${sys:spring.application.name}]
[consumer-0-C-1] [o.a.k.c.c.i.AbstractCoordinator.:] [Consumer
clientId=consumer-bla-1, groupId=bla] Attempt to heartbeat failed for
since member id consumer-bla-1-b3ad1c09-ad06-4bc4-a891-47a2288a830f is
not valid.
2020-06-16 10:54:02,302 INFO [${sys:spring.application.name}]
[consumer-0-C-1] [o.a.k.c.c.i.ConsumerCoordinator.:] [Consumer
clientId=consumer-bla-1, groupId=bla] Giving away all assigned
partitions as lost since generation has been reset,indicating that
consumer is no longer part of the group
2020-06-16 10:54:02,302 INFO [${sys:spring.application.name}]
[consumer-0-C-1] [o.a.k.c.c.i.ConsumerCoordinator.:] [Consumer
clientId=consumer-bla-1, groupId=bla] Lost previously assigned
partitions someTopic2-0
2020-06-16 10:54:02,302 INFO [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.l.ConcurrentMessageListenerContainer.info:279]
bla: partitions lost: [someTopic2-0]
2020-06-16 10:54:02,303 INFO [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.l.ConcurrentMessageListenerContainer.info:279]
bla: partitions revoked: [someTopic2-0]
2020-06-16 10:54:02,303 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1]
[o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296]
Commit list: {}
The regular instance that takes over the partition and produces the record after the rebalance:
2020-06-16 10:53:46,536 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296]
Created new Producer: CloseSafeProducer
[delegate=org.apache.kafka.clients.producer.KafkaProducer#26c76153]
2020-06-16 10:53:46,537 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296]
CloseSafeProducer
[delegate=org.apache.kafka.clients.producer.KafkaProducer#26c76153]
beginTransaction()
2020-06-16 10:53:46,539 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.t.KafkaTransactionManager.doBegin:149] Created
Kafka transaction on producer [CloseSafeProducer
[delegate=org.apache.kafka.clients.producer.KafkaProducer#26c76153]]
2020-06-16 10:53:46,556 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1]
[o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296]
Sending offsets to transaction: {someTopic2-
0=OffsetAndMetadata{offset=23, leaderEpoch=null, metadata=''}}
2020-06-16 10:53:46,563 INFO [${sys:spring.application.name}] [kafka-
producer-network-thread | producer-1d8e74d3-8986-4458-89b7-
6d3e5756e213bla.someTopic2.0] [i.i.k.s.p.SimpleSuccessHandler.:] Sent
message=[im in transaction] with offset=[250] to topic something1
2020-06-16 10:53:46,566 INFO [${sys:spring.application.name}] [kafka-
producer-network-thread | producer-1d8e74d3-8986-4458-89b7-
6d3e5756e213bla.someTopic2.0] [o.a.k.c.p.i.TransactionManager.:]
[Producer clientId=producer-1d8e74d3-8986-4458-89b7-
6d3e5756e213bla.someTopic2.0, transactionalId=1d8e74d3-8986-4458-89b7-
6d3e5756e213bla.someTopic2.0] Discovered group coordinator
X.X.X.X:9992 (id: 1001 rack: null)
2020-06-16 10:53:46,668 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.t.KafkaTransactionManager.processCommit:740]
Initiating transaction commit
2020-06-16 10:53:46,669 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296]
CloseSafeProducer
[delegate=org.apache.kafka.clients.producer.KafkaProducer#26c76153]
commitTransaction()
2020-06-16 10:53:46,672 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1]
[o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296]
Commit list: {}
2020-06-16 10:53:51,673 DEBUG [${sys:spring.application.name}]
[consumer-0-C-1]
[o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296]
Received: 0 records
I noticed they both note the exact same offset to commit:
Sending offsets to transaction: {someTopic2-0=OffsetAndMetadata{offset=23, leaderEpoch=null, metadata=''}}
I thought that when they try to commit the exact same thing, the broker would abort one of the transactions...
I also noticed that if I reduce transaction.timeout.ms to just 2 seconds, it doesn't abort the transaction no matter how long I freeze the instance in debug...
Maybe the transaction.timeout.ms timer starts only after I send the message?
You must not use executeInTransaction at all - see its Javadocs; it is used when there is no active transaction or if you explicitly don't want an operation to participate in an existing transaction.
You need to add a KafkaTransactionManager to the listener container; it must have a reference to the same ProducerFactory as the template.
Then, the container will start the transaction and, if successful, send the offset to the transaction.
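A minimal sketch of that wiring (class names, topics and property maps are illustrative assumptions, not the asker's builders): the KafkaTransactionManager handed to the container is built from the same ProducerFactory as the KafkaTemplate, so the plain send() in the listener joins the container-started transaction, and the container sends the consumed offsets to that transaction before committing it.

import java.util.Map;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;
import org.springframework.kafka.listener.ContainerProperties;
import org.springframework.kafka.listener.MessageListener;
import org.springframework.kafka.transaction.KafkaTransactionManager;

public class TransactionalWiringSketch {

    ConcurrentMessageListenerContainer<String, String> buildContainer(
            Map<String, Object> consumerProps, Map<String, Object> producerProps) {

        DefaultKafkaProducerFactory<String, String> pf = new DefaultKafkaProducerFactory<>(producerProps);
        pf.setTransactionIdPrefix("tx-");
        KafkaTemplate<String, String> template = new KafkaTemplate<>(pf);

        ContainerProperties containerProps = new ContainerProperties("someTopic2");
        containerProps.setMessageListener((MessageListener<String, String>) cr ->
                // plain send, no executeInTransaction: it participates in the container's transaction
                template.send("something1", cr.key(), "im in transaction"));
        // same ProducerFactory as the template
        containerProps.setTransactionManager(new KafkaTransactionManager<>(pf));

        return new ConcurrentMessageListenerContainer<>(
                new DefaultKafkaConsumerFactory<>(consumerProps), containerProps);
    }
}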

How to handle UnknownProducerIdException

We are having some trouble with Spring Cloud and Kafka: sometimes our microservice throws an UnknownProducerIdException. This happens when the transactional.id.expiration.ms parameter has expired on the broker side.
My question: would it be possible to catch that exception and retry the failed message? If yes, what would be the best way to handle it?
I have took a look at:
- https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=89068820
- Kafka UNKNOWN_PRODUCER_ID exception
We are using Spring Cloud Hoxton.RELEASE and Spring Kafka 2.2.4.RELEASE.
We are using AWS's managed Kafka offering, so we can't set a new value for the property I mentioned above.
Here is some trace of the exception:
2020-04-07 20:54:00.563 ERROR 5188 --- [ad | producer-2] o.a.k.c.p.internals.TransactionManager : [Producer clientId=producer-2] The broker returned org.apache.kafka.common.errors.UnknownProducerIdException: This exception is raised by the broker if it could not locate the producer metadata associated with the producerId in question. This could happen if, for instance, the producer's records were deleted because their retention time had elapsed. Once the last records of the producerId are removed, the producer's metadata is removed from the broker, and future appends by the producer will return this exception. for topic-partition test.produce.another-2 with producerId 35000, epoch 0, and sequence number 8
2020-04-07 20:54:00.563 INFO 5188 --- [ad | producer-2] o.a.k.c.p.internals.TransactionManager : [Producer clientId=producer-2] ProducerId set to -1 with epoch -1
2020-04-07 20:54:00.565 ERROR 5188 --- [ad | producer-2] o.s.k.support.LoggingProducerListener : Exception thrown when sending a message with key='null' and payload='{...}' to topic <some-topic>:
To reproduce this exception:
- I used the Confluent Docker images and set the environment variable KAFKA_TRANSACTIONAL_ID_EXPIRATION_MS to 10 seconds so I wouldn't have to wait too long for this exception to be thrown.
- In another process, send one message every 10 seconds to the topic the Java application listens on.
Here is a code example:
File Bindings.java
import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.SubscribableChannel;

public interface Bindings {

    @Input("test-input")
    SubscribableChannel testListener();

    @Output("test-output")
    MessageChannel testProducer();
}
File application.yml (don't forget to set the environment variable KAFKA_HOST):
spring:
  cloud:
    stream:
      kafka:
        binder:
          auto-create-topics: true
          brokers: ${KAFKA_HOST}
          transaction:
            producer:
              error-channel-enabled: true
          producer-properties:
            acks: all
            retry.backoff.ms: 200
            linger.ms: 100
            max.in.flight.requests.per.connection: 1
            enable.idempotence: true
            retries: 3
            compression.type: snappy
            request.timeout.ms: 5000
            key.serializer: org.apache.kafka.common.serialization.StringSerializer
          consumer-properties:
            session.timeout.ms: 20000
            max.poll.interval.ms: 350000
            enable.auto.commit: true
            allow.auto.create.topics: true
            auto.commit.interval.ms: 12000
            max.poll.records: 5
            isolation.level: read_committed
          configuration:
            auto.offset.reset: latest
      bindings:
        test-input:
          # contentType: text/plain
          destination: test.produce
          group: group-input
          consumer:
            maxAttempts: 3
            startOffset: latest
            autoCommitOnError: true
            queueBufferingMaxMessages: 100000
            autoCommitOffset: true
        test-output:
          # contentType: text/plain
          destination: test.produce.another
          group: group-output
          producer:
            acks: all
debug: true
The listener handler:
@SpringBootApplication
@EnableBinding(Bindings.class)
public class PocApplication {

    private static final Logger log = LoggerFactory.getLogger(PocApplication.class);

    public static void main(String[] args) {
        SpringApplication.run(PocApplication.class, args);
    }

    @Autowired
    private BinderAwareChannelResolver binderAwareChannelResolver;

    @StreamListener(Topics.TESTLISTENINPUT)
    public void listen(Message<?> in, String headerKey) {
        final MessageBuilder builder;
        MessageChannel messageChannel;
        messageChannel = this.binderAwareChannelResolver.resolveDestination("test-output");
        Object payload = in.getPayload();
        builder = MessageBuilder.withPayload(payload);
        try {
            log.info("Event received: {}", in);
            if (!messageChannel.send(builder.build())) {
                log.error("Something happend trying send the message! {}", in.getPayload());
            }
            log.info("Commit success");
        } catch (UnknownProducerIdException e) {
            log.error("UnkownProducerIdException catched ", e);
        } catch (KafkaException e) {
            log.error("KafkaException catched ", e);
        } catch (Exception e) {
            System.out.println("Commit failed " + e.getMessage());
        }
    }
}
Regards
} catch (UnknownProducerIdException e) {
log.error("UnkownProducerIdException catched ", e);
To catch exceptions there, you need to set the sync kafka producer property (https://cloud.spring.io/spring-cloud-static/spring-cloud-stream-binder-kafka/3.0.3.RELEASE/reference/html/spring-cloud-stream-binder-kafka.html#kafka-producer-properties). Otherwise, the error comes back asynchronously
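With the Kafka binder, that is the producer sync property; a sketch, assuming the test-output binding from the question:

spring:
  cloud:
    stream:
      kafka:
        bindings:
          test-output:
            producer:
              sync: true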
You should not "eat" the exception there; it must be thrown back to the container so the container will roll back the transaction.
Also,
}catch (Exception e) {
System.out.println("Commit failed " + e.getMessage());
}
The commit is performed by the container after the stream listener returns to the container so you will never see a commit error here; again, you must let the exception propagate back to the container.
The container will retry the delivery according to the consumer binding's retry configuration.
Probably you can also use the callback function to handle the exception. I'm not sure about the Spring library for Kafka, but if you are using the plain Kafka client you can do something like this:
producer.send(record, new Callback() {
    public void onCompletion(RecordMetadata metadata, Exception e) {
        if (e != null) {
            e.printStackTrace();
            if (e.getClass().equals(UnknownProducerIdException.class)) {
                logger.info("UnknownProducerIdException caught");
                // retry, send(...), topic, partition and msg are assumed fields/helpers of the surrounding class
                while (--retry >= 0) {
                    send(topic, partition, msg);
                }
            }
        } else {
            logger.info("The offset of the record we just sent is: " + metadata.offset());
        }
    }
});

Spring Cloud Stream Kafka Binder: "Invalid transition attempted from state IN_TRANSACTION to state IN_TRANSACTION"

I'm trying to do a PoC of the "exactly once delivery" concept with Apache Kafka using Spring Cloud Stream + the Kafka binder.
I installed Apache Kafka "kafka_2.11-1.0.0" and defined "transactionIdPrefix" in the producer, which I understand is the only thing I need to do to enable transactions in Spring Kafka. But when I do that and run simple Source & Sink bindings within the same application, I see some messages are received and printed by the consumer and some get an error.
For example, message #6 received:
[49] Received message [Payload String content=FromSource1 6][Headers={kafka_offset=1957, scst_nativeHeadersPresent=true, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer#6695c9a9, kafka_timestampType=CREATE_TIME, my-transaction-id=my-id-6, id=302cf3ef-a154-fd42-6b43-983778e275dc, kafka_receivedPartitionId=0, contentType=application/json, kafka_receivedTopic=test10, kafka_receivedTimestamp=1514384106395, timestamp=1514384106419}]
but message #7 had an error "Invalid transition attempted from state IN_TRANSACTION to state IN_TRANSACTION":
2017-12-27 16:15:07.405 ERROR 7731 --- [ask-scheduler-4] o.s.integration.handler.LoggingHandler : org.springframework.messaging.MessageHandlingException: error occurred in message handler [org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder$ProducerConfigurationMessageHandler#7d3bbc0b]; nested exception is org.apache.kafka.common.KafkaException: TransactionalId my-transaction-3: Invalid transition attempted from state IN_TRANSACTION to state IN_TRANSACTION, failedMessage=GenericMessage [payload=byte[13], headers={my-transaction-id=my-id-7, id=d31656af-3286-99b0-c736-d53aa57a5e65, contentType=application/json, timestamp=1514384107399}]
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:153)
at org.springframework.cloud.stream.binder.AbstractMessageChannelBinder$SendingHandler.handleMessageInternal(AbstractMessageChannelBinder.java:575)
What does this error mean?
Is something missing in my configuration?
Do I need to implement the Source or the Sink differently when transactions are enabled?
UPDATE:
I opened an issue on the project's GitHub; please refer to the discussion there.
I couldn't find an example of how to use Spring Cloud Stream with the Kafka binder + transactions enabled.
To reproduce: create a simple Maven project with Spring Boot version "2.0.0.M5" and "spring-cloud-stream-dependencies" version "Elmhurst.M3", and create a simple application with this configuration:
server:
  port: 8082
spring:
  kafka:
    producer:
      retries: 5555
      acks: "all"
  cloud:
    stream:
      kafka:
        binder:
          autoAddPartitions: true
          transaction:
            transactionIdPrefix: my-transaction-
      bindings:
        output1:
          destination: test10
          group: test111
          binder: kafka
        input1:
          destination: test10
          group: test111
          binder: kafka
          consumer:
            partitioned: true
I also created simple Source and Sink classes:
@EnableBinding(SampleSink.MultiInputSink.class)
public class SampleSink {

    @StreamListener(MultiInputSink.INPUT1)
    public synchronized void receive1(Message<?> message) {
        System.out.println("[" + Thread.currentThread().getId() + "] Received message " + message);
    }

    public interface MultiInputSink {
        String INPUT1 = "input1";

        @Input(INPUT1)
        SubscribableChannel input1();
    }
}
and:
@EnableBinding(SampleSource.MultiOutputSource.class)
public class SampleSource {

    AtomicInteger atomicInteger = new AtomicInteger(1);

    @Bean
    @InboundChannelAdapter(value = MultiOutputSource.OUTPUT1, poller = @Poller(fixedDelay = "1000", maxMessagesPerPoll = "1"))
    public synchronized MessageSource<String> messageSource1() {
        return new MessageSource<String>() {
            public Message<String> receive() {
                String message = "FromSource1 " + atomicInteger.getAndIncrement();
                Map<String, Object> m = new HashMap<>(); // header map (declaration was missing in the original snippet)
                m.put("my-transaction-id", "my-id-" + UUID.randomUUID());
                return new GenericMessage(message, new MessageHeaders(m));
            }
        };
    }

    public interface MultiOutputSource {
        String OUTPUT1 = "output1";

        @Output(OUTPUT1)
        MessageChannel output1();
    }
}
I opened a ticket about this on the project's GitHub.
Please refer to the answers and discussion there:
https://github.com/spring-cloud/spring-cloud-stream/issues/1166
The first answer there was:
The binder doesn't currently support producer-initiated transactions.
Transactions are supported for processors (where the consumer starts
the transaction and the producer participates in that transaction).
You should be able to use spring-kafka directly to initiate a
transaction on the producer side when there is no consumer.
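A minimal sketch of that suggestion (class names are illustrative assumptions): inject a transactional KafkaTemplate, i.e. one whose ProducerFactory has a transactionIdPrefix, and start the producer-only transaction yourself, for example with executeInTransaction():

import org.springframework.kafka.core.KafkaTemplate;

public class ProducerOnlyTransactionSketch {

    // the template's ProducerFactory must be configured with a transactionIdPrefix
    private final KafkaTemplate<String, String> kafkaTemplate;

    public ProducerOnlyTransactionSketch(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    public void publish(String payload) {
        kafkaTemplate.executeInTransaction(ops -> {
            // everything sent inside this callback commits or aborts as one transaction
            ops.send("test10", payload);
            return true;
        });
    }
}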

Error reading field 'topic_metadata' in Kafka

I am trying to connect to my broker on AWS with auto.create.topics.enable=true in my server.properties file, but when I try to connect to the broker using the Java client producer I get the following error.
1197 [kafka-producer-network-thread | producer-1] ERROR
org.apache.kafka.clients.producer.internals.Sender - Uncaught error in
kafka producer I/O thread:
org.apache.kafka.common.protocol.types.SchemaException: Error reading
field 'topic_metadata': Error reading array of size 619631, only 37
bytes available at
org.apache.kafka.common.protocol.types.Schema.read(Schema.java:73) at
org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:380)
at
org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:449)
at
org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
at
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:229)
at
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:134)
at java.lang.Thread.run(Unknown Source)
Following is my client producer code:
public static void main(String[] argv) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "http://XX.XX.XX.XX:9092");
    props.put("acks", "all");
    props.put("retries", 0);
    props.put("batch.size", 16384);
    props.put("linger.ms", 0);
    props.put("buffer.memory", 33554432);
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("block.on.buffer.full", true);

    Producer<String, String> producer = new KafkaProducer<String, String>(props);
    try {
        for (int i = 0; i < 10; i++) {
            producer.send(new ProducerRecord<String, String>("topicjava", Integer.toString(i), Integer.toString(i)));
            System.out.println("Tried sending:" + i);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    producer.close();
}
Can someone help me resolve this?
I have faced a similar issue. The problem here is a mismatch between the kafka-clients version in the pom file and the Kafka server version.
I was using kafka-clients 0.10.0.0_1 but the Kafka server was still on 0.9.0.0. Once I upgraded the Kafka server to 0.10, the issue got resolved.
<dependency>
<groupId>org.apache.servicemix.bundles</groupId>
<artifactId>org.apache.servicemix.bundles.kafka-clients</artifactId>
<version>0.10.0.0_1</version>
</dependency>
Looks like I was setting the wrong properties on the client side; also, my server.properties file had properties which were not meant for the client I was using. So I decided to change the Java client to version 0.9.0.0 using Maven.
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.11</artifactId>
<version>0.9.0.0</version>
</dependency>
My server.properties file is as below:
broker.id=0
port=9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
log.cleaner.enable=false
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=9000
delete.topic.enable=true
advertised.host.name=<aws public Ip>
advertised.port=9092
My producer code looks like this:
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
public class HelloKafkaProducer
{
public static void main(String args[]) throws InterruptedException, ExecutionException {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,"IP:9092");
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,StringSerializer.class.getName());
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,StringSerializer.class.getName());
KafkaProducer<String,String> producer = new KafkaProducer<String,String>(props);
boolean sync = false;
String topic="loader1";
String key = "mykey";
for(int i=0;i<1000;i++)
{
String value = "myvaluehasbeensent"+i+i;
ProducerRecord<String,String> producerRecord = new ProducerRecord<String,String>(topic, key, value);
if (sync) {
producer.send(producerRecord).get();
} else {
producer.send(producerRecord);
}
}
producer.close();
}
}
Make sure that you use the correct versions. Let's say you use the following Maven dependency:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka-0.8_2.10</artifactId>
<version>${flink.version}</version>
</dependency>
So the artifact equals: flink-connector-kafka-0.8_2.10
Now check if you use the correct Kafka version:
cd /KAFKA_HOME/libs
Now find kafka_YOUR-VERSION-sources.jar.
In my case I have kafka_2.10-0.8.2.1-sources.jar. So it works fine! :)
If you use different versions, just change the Maven dependencies OR download the correct Kafka version.
I solved this problem by editing the /etc/hosts file.
Check your hosts file to see whether the ZooKeeper or other brokers' IPs are missing from it.