Kafka Consumer not getting invoked when the Kafka Producer is set to Sync

I have a requirement where two topics need to be maintained, one with a synchronous approach and the other with an asynchronous one.
The asynchronous one works as expected, invoking the consumer, but with the synchronous approach the consumer code is never invoked.
Below is the producer configuration declared in the config file:
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9093");
props.put(ProducerConfig.RETRIES_CONFIG, 3);
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.LINGER_MS_CONFIG, 1);
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
I have enabled autoFlush (true) here:
@Bean(name = "KafkaPayloadSyncTemplate")
public KafkaTemplate<String, KafkaPayload> KafkaPayloadSyncTemplate() {
    return new KafkaTemplate<String, KafkaPayload>(producerFactory(), true);
}
Control stops thereafter; after the recordMetadataResults object is returned, no call ever reaches the consumer.
private List<RecordMetadata> sendPayloadToKafkaTopicInSync() throws InterruptedException, ExecutionException {
    final List<RecordMetadata> recordMetadataResults = new ArrayList<RecordMetadata>();
    KafkaPayload kafkaPayload = constructKafkaPayload();
    ListenableFuture<SendResult<String, KafkaPayload>> future =
            KafkaPayloadSyncTemplate.send(TestTopic, kafkaPayload);
    SendResult<String, KafkaPayload> results = future.get();
    recordMetadataResults.add(results.getRecordMetadata());
    return recordMetadataResults;
}
Consumer Code
public class KafkaTestListener {

    @Autowired
    TestServiceImpl TestServiceImpl;

    public final CountDownLatch countDownLatch = new CountDownLatch(1);

    @KafkaListener(id = "POC", topics = "TestTopic", group = "TestGroup")
    public void listen(ConsumerRecord<String, KafkaPayload> record, Acknowledgment acknowledgment) {
        countDownLatch.countDown();
        TestServiceImpl.consumeKafkaMessage(record);
        System.out.println("Acknowledgment : " + acknowledgment);
        acknowledgment.acknowledge();
    }
}
Based on the issue, I have two questions:
Should we manually call listen() inside the listener class when it's a sync producer? If yes, how do we do that?
If the listener (@KafkaListener) gets called automatically, what other setup/configuration do I need to add to make this work?
Thanks for the inputs in advance.
-Srikant

You should be sure that you use consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); in your consumer properties.
Not sure what you mean by sync/async, but produce and consume are fully independent operations, and you can't affect the consumer from the producer side, because the Kafka broker sits in between.
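For illustration, a minimal sketch of such consumer properties; the group id, deserializer choices, and use of DefaultKafkaConsumerFactory here are assumptions, not taken from the question:

Map<String, Object> consumerProps = new HashMap<>();
consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9093");
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "TestGroup");
// with no committed offsets, start from the beginning of the topic
// instead of skipping to the end
consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
return new DefaultKafkaConsumerFactory<>(consumerProps);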

Related

error handling when consuming as a batch in kafka

I have a Kafka consumer written in Java Spring Boot (Spring Kafka). My consumer looks like below.
@RetryableTopic(
        attempts = "4",
        backoff = @Backoff(delay = 1000, multiplier = 2.0),
        autoCreateTopics = "false",
        topicSuffixingStrategy = TopicSuffixingStrategy.SUFFIX_WITH_INDEX_VALUE,
        include = {ResourceAccessException.class, MyCustomRetryableException.class})
@KafkaListener(topics = "myTopic", groupId = "myGroup", autoStartup = "true", concurrency = "3")
public void consume(@Header(KafkaHeaders.RECEIVED_TOPIC) String topic,
                    @Header("custom_header_1") String customHeader1,
                    @Header("custom_header_2") String customHeader2,
                    @Header("custom_header_3") String customHeader3,
                    @Header(required = false, name = KafkaHeaders.RECEIVED_MESSAGE_KEY) String key,
                    @Payload(required = false) String message) {
    log.info("-------------------------");
    log.info(key);
    log.info(message);
    log.info("-------------------------");
}
I have used the @RetryableTopic annotation to handle errors. I have written a custom exception class, and whenever a method throws my custom exception (MyCustomRetryableException.class), it is retried according to the backoff and number of attempts defined in the annotation. So here I don't have to do anything: Kafka simply publishes failing messages to the correct DLT topic. All I have to do is create the DLT-related topics, since I have used autoCreateTopics = "false".
Now I'm trying to consume messages in batches. I changed my Kafka config like below in order to consume batch-wise.
@Bean
public ConsumerFactory<String, Object> consumerFactory() {
    Map<String, Object> config = new HashMap<>();
    // default configs like bootstrap servers, key and value deserializers are here
    config.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "5");
    return new DefaultKafkaConsumerFactory<>(config);
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setCommitLogLevel(LogIfLevelEnabled.Level.DEBUG);
    factory.setBatchListener(true);
    return factory;
}
Now that I have added a batch listener, @RetryableTopic is no longer supported. So how can I achieve publishing failed messages to the DLT, which was previously handled by @RetryableTopic?
If anyone can answer with an example it would be great. Thank you in advance.
See the documentation.
Use a DefaultErrorHandler with a DeadLetterPublishingRecoverer.
Non blocking retries are not supported; the retries will use the configured BackOff.
Throw a BatchListenerFailedException to indicate which record in the batch failed and just that one will be sent to the DLT.
With any other exception, the whole batch will be retried (and sent to the DLT if retries are exhausted).
https://docs.spring.io/spring-kafka/docs/current/reference/html/#retrying-batch-eh
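A minimal sketch of that setup, assuming a KafkaTemplate bean exists for the recoverer; the back-off values and the process() helper are illustrative, not from the question:

@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> template) {
    // after retries are exhausted, publish the failing record to <topic>.DLT
    DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template);
    // blocking retries: up to 3 attempts, 1 second apart (non-blocking retries
    // are not supported for batch listeners)
    return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 3L));
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory(
        ConsumerFactory<String, Object> consumerFactory, DefaultErrorHandler errorHandler) {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    factory.setBatchListener(true);
    factory.setCommonErrorHandler(errorHandler); // takes over what @RetryableTopic did
    return factory;
}

@KafkaListener(topics = "myTopic", groupId = "myGroup")
public void consume(List<ConsumerRecord<String, Object>> records) {
    for (int i = 0; i < records.size(); i++) {
        try {
            process(records.get(i)); // hypothetical per-record processing
        } catch (Exception e) {
            // tells the error handler which record failed; only that one is sent to the DLT
            throw new BatchListenerFailedException("record processing failed", e, i);
        }
    }
}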

Kafka transactional producer writes messages even if producer commit is not invoked

For me it seems like the Kafka transactional producer is behaving like a regular producer: the messages are visible on the topic as soon as send is called for each message. Maybe I am missing something basic. I was expecting the messages to appear in the topic only after the producer's commit method is called. In my code below producer.commitTransaction() is commented out, but I still get the messages in the topic. Thanks for any pointers.
public static void main(String[] args) {
    try {
        Properties producerConfig = new Properties();
        producerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "...");
        producerConfig.put(ProducerConfig.CLIENT_ID_CONFIG, "transactional-producer-1");
        producerConfig.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true); // enable idempotence
        producerConfig.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "test-transactional-id-1"); // set transaction id
        producerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        producerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        Producer<String, String> producer = new KafkaProducer<>(producerConfig);
        producer.initTransactions(); // initiate transactions
        try {
            producer.beginTransaction(); // begin transaction
            for (Integer i = 0; i < 1000; i++) {
                producer.send(new ProducerRecord<String, String>("t_test", i.toString(), "value_" + i));
            }
            // producer.commitTransaction(); // commit
        } catch (KafkaException e) {
            // For all other exceptions, just abort the transaction and try again.
            producer.abortTransaction();
        }
        producer.close();
    } catch (Exception e) {
        System.out.println(e.toString());
    }
}
When it comes to transactions in Kafka, you need to consider a producer/consumer pair. The producer itself, as you have observed, just produces data and either commits the transaction or not.
Only in interplay with a consumer do you "complete" a transaction, by setting the KafkaConsumer configuration isolation.level to read_committed (by default it is set to read_uncommitted). This configuration is described as:
isolation.level: Controls how to read messages written transactionally. If set to read_committed, consumer.poll() will only return transactional messages which have been committed. If set to read_uncommitted (the default), consumer.poll() will return all messages, even transactional messages which have been aborted. Non-transactional messages will be returned unconditionally in either mode.
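For illustration, a minimal consumer sketch that only sees committed transactional messages (the group id is an assumption):

Properties consumerConfig = new Properties();
consumerConfig.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "...");
consumerConfig.put(ConsumerConfig.GROUP_ID_CONFIG, "transactional-consumer-1");
consumerConfig.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
consumerConfig.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
// skip records belonging to transactions that were aborted or never committed
consumerConfig.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
Consumer<String, String> consumer = new KafkaConsumer<>(consumerConfig);
consumer.subscribe(Collections.singletonList("t_test"));

With this setting, the 1000 records produced but never committed in the code above would not be returned by poll().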

How to check if Kafka Consumer is ready

I have the Kafka auto.offset.reset policy set to latest and am missing the first few messages. If I add a sleep of 20 seconds before starting to send messages to the input topic, everything works as desired. I am not sure if the problem is the consumer taking a long time for partition rebalancing. Is there a way to know whether the consumer is ready before starting to poll?
You can use consumer.assignment(); it returns the set of assigned partitions, which you can check to verify that all of the partitions available for that topic have been assigned.
If you are using the spring-kafka project, you can include the spring-kafka-test dependency and use the method below to wait for topic assignment, but you need to have a container:
ContainerTestUtils.waitForAssignment(Object container, int partitions);
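For illustration, a brief usage sketch assuming a Spring test with an embedded broker and a registry-managed listener container; the listener id "POC" is hypothetical:

@Autowired
private KafkaListenerEndpointRegistry registry;

@Autowired
private EmbeddedKafkaBroker embeddedKafka;

@Test
public void waitUntilListenerIsReady() throws Exception {
    // look up the container created for @KafkaListener(id = "POC", ...)
    MessageListenerContainer container = registry.getListenerContainer("POC");
    // blocks until the container's consumer has been assigned all partitions
    ContainerTestUtils.waitForAssignment(container, embeddedKafka.getPartitionsPerTopic());
    // now it is safe to start producing test messages
}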
You can do the following:
I have a test that reads data from a Kafka topic.
You can't use a KafkaConsumer in a multithreaded environment, but you can pass an "AtomicReference assignment" parameter, update it in the consumer thread, and read it in another thread.
For example, a snippet of working code from the project, used for testing:
private void readAvro(String readFromKafka,
                      AtomicBoolean needStop,
                      List<Event> events,
                      String bootstrapServers,
                      int readTimeout) {
    // shared holder for the assignment, written by the consumer thread and read here
    AtomicReference<Set<TopicPartition>> assignment = new AtomicReference<>();
    new Thread(() -> readAvro(bootstrapServers, readFromKafka, needStop, events, readTimeout, assignment)).start();

    long startTime = System.currentTimeMillis();
    long maxWaitingTime = 30_000;
    for (long time = System.currentTimeMillis(); System.currentTimeMillis() - time < maxWaitingTime; ) {
        Set<TopicPartition> assignments = Optional.ofNullable(assignment.get()).orElse(new HashSet<>());
        System.out.println("[!kafka-consumer!] Assignments [" + assignments.size() + "]: "
                + assignments.stream().map(v -> String.valueOf(v.partition())).collect(Collectors.joining(",")));
        if (assignments.size() > 0) {
            break;
        }
        try {
            Thread.sleep(1_000);
        } catch (InterruptedException e) {
            e.printStackTrace();
            needStop.set(true);
            break;
        }
    }
    System.out.println("Subscribed! Wait summary: " + (System.currentTimeMillis() - startTime));
}
private void readAvro(String bootstrapServers,
                      String readFromKafka,
                      AtomicBoolean needStop,
                      List<Event> events,
                      int readTimeout,
                      AtomicReference<Set<TopicPartition>> assignment) {
    KafkaConsumer<String, byte[]> consumer = (KafkaConsumer<String, byte[]>) queueKafkaConsumer(bootstrapServers, "latest");
    System.out.println("Subscribed to topic: " + readFromKafka);
    consumer.subscribe(Collections.singletonList(readFromKafka));
    long started = System.currentTimeMillis();
    while (!needStop.get()) {
        assignment.set(consumer.assignment());
        ConsumerRecords<String, byte[]> records = consumer.poll(1_000);
        events.addAll(CommonUtils4Tst.readEvents(records));
        if (readTimeout == -1) {
            if (events.size() > 0) {
                break;
            }
        } else if (System.currentTimeMillis() - started > readTimeout) {
            break;
        }
    }
    needStop.set(true);
    synchronized (MainTest.class) {
        MainTest.class.notifyAll();
    }
    consumer.close();
}
P.S.
needStop - a global flag to stop all running threads, if any, in case of failure or success
events - the list of objects that I want to check
readTimeout - how long we will wait to read all the data; if readTimeout == -1, stop as soon as we read anything
Thanks to Alexey (I have also voted up); I seem to have resolved my issue by essentially following the same idea.
Just want to share my experience... In our case we are using Kafka in a request & response way, somewhat like RPC: the request is sent on one topic and then we wait for the response on another topic. We ran into a similar issue, i.e. missing the first response.
I tried calling KafkaConsumer.assignment() repeatedly (with Thread.sleep(100)), but it didn't seem to help. Adding a KafkaConsumer.poll(50) seems to have primed the consumer (group), and the first response is received too. I tested a few times and it works consistently now.
BTW, testing requires stopping the application & deleting the Kafka topics and, for good measure, restarting Kafka too.
PS: Just calling poll(50) without the assignment() fetching logic, as Alexey mentioned, may not guarantee that the consumer (group) is ready.
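A minimal sketch of that priming pattern, using the Duration-based poll from newer clients; the topic name and timeouts are illustrative:

consumer.subscribe(Collections.singletonList("response-topic"));
// an initial poll triggers the group join and kicks off partition assignment
consumer.poll(Duration.ofMillis(50));
// then wait until partitions are actually assigned before sending the request
while (consumer.assignment().isEmpty()) {
    consumer.poll(Duration.ofMillis(100));
}
// note: any records these polls return still need to be handled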
You can modify an AlwaysSeekToEndListener (listens only to new messages) to include a callback:
public class AlwaysSeekToEndListener<K, V> implements ConsumerRebalanceListener {

    private final Consumer<K, V> consumer;
    private Runnable callback;

    public AlwaysSeekToEndListener(Consumer<K, V> consumer) {
        this.consumer = consumer;
    }

    public AlwaysSeekToEndListener(Consumer<K, V> consumer, Runnable callback) {
        this.consumer = consumer;
        this.callback = callback;
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        consumer.seekToEnd(partitions);
        if (callback != null) {
            callback.run();
        }
    }
}
and subscribe with a latch callback:
CountDownLatch initLatch = new CountDownLatch(1);
consumer.subscribe(singletonList(topic), new AlwaysSeekToEndListener<>(consumer, () -> initLatch.countDown()));
initLatch.await(); // blocks until consumer is ready and listening
then proceed to start your producer.
If your policy is set to latest (which only takes effect when there are no previously committed offsets), then you should not worry about 'missing' messages, because you're telling Kafka not to care about messages that were sent before your consumers were ready.
If you care about 'previous' messages, you should set the policy to earliest.
In any case, whatever the policy, the behaviour you are seeing is transient: once committed offsets are saved in Kafka, on every restart the consumers will pick up where they previously left off.
I needed to know if a Kafka consumer was ready before doing some testing, so I tried consumer.assignment(). It returned the set of partitions assigned, but there was a problem: it did not tell me whether the partitions assigned to the group had their offsets set, so later, when I tried to use the consumer, the offsets were not set correctly.
The solution was to use committed(), which gives you the last committed offsets of the partitions you pass as arguments.
So you can do something like: consumer.committed(consumer.assignment())
If there are no partitions assigned yet, it will return:
{}
If there are partitions assigned but no offsets yet:
{name.of.topic-0=null, name.of.topic-1=null}
But if there are partitions and offsets:
{name.of.topic-0=OffsetAndMetadata{offset=5197881, leaderEpoch=null, metadata=''}, name.of.topic-1=OffsetAndMetadata{offset=5198832, leaderEpoch=null, metadata=''}}
With this information you can use checks such as:
consumer.committed(consumer.assignment()).isEmpty();
consumer.committed(consumer.assignment()).containsValue(null);
And with this information you can be sure that the Kafka consumer is ready.
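Putting that together, a small helper along these lines (assuming a client version, 2.4 or later, where committed() accepts a set of partitions):

// true only when partitions are assigned and every one of them has a committed offset
static boolean isConsumerReady(KafkaConsumer<?, ?> consumer) {
    Map<TopicPartition, OffsetAndMetadata> committed = consumer.committed(consumer.assignment());
    return !committed.isEmpty() && !committed.containsValue(null);
}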

In Spring-Kafka, which method does the same thing as onPartitionsRevoked?

I know that in Spring-Kafka we have the methods below:
void registerSeekCallback(ConsumerSeekCallback callback);
void onPartitionsAssigned(Map assignments, ConsumerSeekCallback callback);
void onIdleContainer(Map assignments, ConsumerSeekCallback callback);
But which one does the same thing as the native ConsumerRebalanceListener method onPartitionsRevoked?
"This method will be called before a rebalance operation starts and
after the consumer stops fetching data. It is recommended that offsets
should be committed in this callback to either Kafka or a custom
offset store to prevent duplicate data."
If I want to implement the ConsumerRebalanceListener, how can I pass the KafkaConsumer reference? I only see the Consumer from Spring-Kafka.
=========update======
Hi Gary, when I add the RebalanceListener to the ContainerProperties, I can see that both methods get triggered. However, I get an exception saying something like "Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing". Do you have any idea?
=========== update 2 ============
public ConcurrentMessageListenerContainer<Integer, String> createContainer(
        ContainerProperties containerProps, IKafkaConsumer iKafkaConsumer) {
    Map<String, Object> props = consumerProps();
    DefaultKafkaConsumerFactory<Integer, String> cf = new DefaultKafkaConsumerFactory<Integer, String>(props);
    RebalanceListner rebalanceListner = new RebalanceListner(cf.createConsumer());
    CustomKafkaMessageListener ckml = new CustomKafkaMessageListener(iKafkaConsumer, rebalanceListner);
    CustomRecordFilter cff = new CustomRecordFilter();
    FilteringAcknowledgingMessageListenerAdapter faml = new FilteringAcknowledgingMessageListenerAdapter(ckml, cff, true);
    SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy();
    retryPolicy.setMaxAttempts(5);
    FixedBackOffPolicy backOffPolicy = new FixedBackOffPolicy();
    backOffPolicy.setBackOffPeriod(1500); // 1.5 seconds
    RetryTemplate rt = new RetryTemplate();
    rt.setBackOffPolicy(backOffPolicy);
    rt.setRetryPolicy(retryPolicy);
    rt.registerListener(ckml);
    RetryingAcknowledgingMessageListenerAdapter rml = new RetryingAcknowledgingMessageListenerAdapter(faml, rt);
    containerProps.setConsumerRebalanceListener(rebalanceListner);
    containerProps.setMessageListener(rml);
    containerProps.setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL_IMMEDIATE);
    containerProps.setErrorHandler(ckml);
    containerProps.setAckOnError(false);
    ConcurrentMessageListenerContainer<Integer, String> container = new ConcurrentMessageListenerContainer<>(
            cf, containerProps);
    container.setConcurrency(1);
    return container;
}
You can add a RebalanceListener to the container's ContainerProperties that are passed into the constructor.
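A minimal sketch of that wiring; the rebalance callbacks receive the affected partitions directly, so no KafkaConsumer reference is needed (the logging is illustrative):

containerProps.setConsumerRebalanceListener(new ConsumerRebalanceListener() {

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // runs before the rebalance starts, after the consumer stops fetching;
        // a good place to persist any externally tracked offsets
        System.out.println("Revoked: " + partitions);
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        System.out.println("Assigned: " + partitions);
    }
});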

Kafka consumer does not start from latest message

I want to have a Kafka Consumer which starts from the latest message in a topic.
Here is the java code:
private static Properties properties = new Properties();
private static KafkaConsumer<String, String> consumer;

static
{
    properties.setProperty("bootstrap.servers", "localhost");
    properties.setProperty("enable.auto.commit", "true");
    properties.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    properties.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    properties.setProperty("group.id", "test");
    properties.setProperty("auto.offset.reset", "latest");
    consumer = new KafkaConsumer<>(properties);
    consumer.subscribe(Collections.singletonList("mytopic"));
}
@Override
public StreamHandler call() throws Exception
{
    while (true)
    {
        ConsumerRecords<String, String> consumerRecords = consumer.poll(200);
        Iterable<ConsumerRecord<String, String>> records = consumerRecords.records("mytopic");
        for (ConsumerRecord<String, String> rec : records)
        {
            System.out.println(rec.value());
        }
    }
}
Although the value for auto.offset.reset is latest, the consumer starts from messages that belong to two days ago and then catches up with the latest messages.
What am I missing?
Have you run this same code before with the same group.id? The auto.offset.reset parameter is only used if there is not an existing offset already stored for your consumer. So if you've run the example previously, say two days ago, and then you run it again, it will start from the last consumed position.
Use seekToEnd() if you would like to manually go to the end of the topic.
See https://stackoverflow.com/a/32392174/1392894 for a slightly more thorough discussion of this.
If you want to manually control the position of your offsets you need to set enable.auto.commit = false.
If you want to position all offsets to the end of each partition then call seekToEnd()
https://kafka.apache.org/0102/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#seekToEnd(java.util.Collection)
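A minimal sketch of that manual positioning; seekToEnd() only takes effect once partitions are assigned, so it is wired into a rebalance listener here (the same idea as the AlwaysSeekToEndListener shown earlier):

consumer.subscribe(Collections.singletonList("mytopic"), new ConsumerRebalanceListener() {

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // jump every newly assigned partition to the log end, ignoring committed offsets
        consumer.seekToEnd(partitions);
    }
});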