The precedence of polling of the consumer in batch mode - apache-kafka

spring.kafka.consumer.max-poll-records = 2000 //each record of size 5kb takes 100 ms so to process entire batch takes 500 sec i.e 8 min 20 sec = 900000 //15 min = 600000 //10 min = 10485760 //10MB
spring.kafka.consumer.fetch-min-size = 5242880 // fetch.min.bytes - 5mb
spring.kafka.listener.concurrency = 1
With the above configuration, the consumer polls records at irregular intervals, sometimes every 2 or 3 minutes, even though it has been set to 10 min.
How is this happening? What is the precedence of polling? Is it based on max-poll-records, max.partition.fetch.bytes or fetch-min-size?
I tried raising the attribute values below to fetch the maximum number of records, but I still see only 200 to 300 records being processed.
= 900000 //15 min
spring.kafka.consumer.max-poll-records = 2000 //each record of size 5kb takes 100 ms so to process entire batch, takes 2000*100 ms =200sec i.e 3 min 20 sec which is way less than the max poll interval (10min) = 600000 //10 min = 20971520
spring.kafka.consumer.fetch-min-size = 104857600 // fetch.min.bytes - 20mb
Am I doing something wrong?

Assuming the records are exactly 5kb, the poll will return when 1k records are received or 10 minutes elapse, whichever happens first.
You will only ever get max.poll.records if they are immediately available.
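To make the arithmetic behind these numbers concrete, here is a minimal sketch in plain Java with no Kafka dependency. The class and method names are hypothetical, and it assumes every record is exactly 5 KB with no protocol overhead, which will not hold precisely in practice.

```java
class FetchSizingSketch {

    // Number of fixed-size records the broker has to accumulate before
    // fetch.min.bytes is satisfied (ceiling division).
    static long recordsToSatisfyFetchMin(long fetchMinBytes, long recordSizeBytes) {
        return (fetchMinBytes + recordSizeBytes - 1) / recordSizeBytes;
    }

    // Time the listener needs to work through one full batch.
    static long batchProcessingMillis(int maxPollRecords, long perRecordMillis) {
        return maxPollRecords * perRecordMillis;
    }

    public static void main(String[] args) {
        long recordSize = 5 * 1024; // assumed: 5 KB per record, as in the question
        // fetch-min-size = 5242880 from the first configuration
        System.out.println(recordsToSatisfyFetchMin(5_242_880L, recordSize)); // prints 1024
        // max-poll-records = 2000 at 100 ms per record
        System.out.println(batchProcessingMillis(2000, 100)); // prints 200000
    }
}
```

Under these assumptions the fetch is answered after roughly 1k records, well before 2000 records could accumulate, which matches the statement that a full max.poll.records batch only appears when the records are immediately available.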
It looks to me like these "smaller" batches are the remnants of the previous fetch; with this code:
@SpringBootApplication
public class So68201599Application {
    private static final Logger log = LoggerFactory.getLogger(So68201599Application.class);

    public static void main(String[] args) {
        SpringApplication.run(So68201599Application.class, args);
    }

    // The container is not started automatically, so all records are already in the topic when it starts.
    @KafkaListener(id = "so68201599", topics = "so68201599", autoStartup = "false")
    public void listen(ConsumerRecords<?, ?> in) {
        // Log the batch size, then the record count per partition.
        log.info("" + in.count() + "\n"
                + in.partitions().stream()
                        .map(part -> "" + part.partition() + "(" + in.records(part).size() + ")")
                        .collect(Collectors.toList()));
    }

    @Bean
    public NewTopic topic() {
        ...
    }

    @Bean
    public ApplicationRunner runner(KafkaTemplate<String, String> template, KafkaListenerEndpointRegistry registry) {
        String msg = new String(new byte[1024*5]);
        return args -> {
            List<Future<?>> futures = new ArrayList<>();
            IntStream.range(0, 9000).forEach(i -> futures.add(template.send("so68201599", msg)));
            futures.forEach(fut -> {
                try {
                    fut.get(10, TimeUnit.SECONDS);
                }
                catch (InterruptedException | ExecutionException | TimeoutException e) {...}
            });
            ...
        };
    }
}
spring.kafka.consumer.max-poll-records = 2000 = 10000
spring.kafka.consumer.fetch-min-size = 10240000
I get
2021-07-08 15:04:11.131 INFO 45792 --- [o68201599-0-C-1] com.example.demo.So68201599Application : 2000
[1(201), 0(201), 5(201), 4(201), 3(201), 2(201), 9(201), 8(201), 7(201), 6(191)]
2021-07-08 15:04:11.137 INFO 45792 --- [o68201599-0-C-1] com.example.demo.So68201599Application : 10
2021-07-08 15:04:21.170 INFO 45792 --- [o68201599-0-C-1] com.example.demo.So68201599Application : 1809
[1(201), 0(201), 5(201), 4(201), 3(201), 2(201), 9(201), 8(201), 7(201)]
2021-07-08 15:04:21.214 INFO 45792 --- [o68201599-0-C-1] com.example.demo.So68201599Application : 2000
[1(201), 0(201), 5(201), 4(201), 3(201), 2(201), 9(201), 8(201), 7(201), 6(191)]
2021-07-08 15:04:21.215 INFO 45792 --- [o68201599-0-C-1] com.example.demo.So68201599Application : 10
2021-07-08 15:04:31.248 INFO 45792 --- [o68201599-0-C-1] com.example.demo.So68201599Application : 1809
[1(201), 0(201), 5(201), 4(201), 3(201), 2(201), 9(201), 8(201), 7(201)]
2021-07-08 15:04:41.267 INFO 45792 --- [o68201599-0-C-1] com.example.demo.So68201599Application : 1083
[1(27), 0(87), 5(189), 4(93), 3(114), 2(129), 9(108), 8(93), 7(42), 6(201)]
2021-07-08 15:04:51.276 INFO 45792 --- [o68201599-0-C-1] com.example.demo.So68201599Application : 201
2021-07-08 15:05:01.279 INFO 45792 --- [o68201599-0-C-1] com.example.demo.So68201599Application : 78
I don't know why the second and the three penultimate fetches timed out, though.
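The 2000/10 pattern in the log is consistent with the "remnants of the previous fetch" explanation: 10 partitions with 201 buffered records each give 2010 records, which are handed to the listener in chunks of at most max.poll.records. The following is a simplified sketch of that chunking in plain Java. The class and method names are hypothetical, and the real consumer buffers fetched data per partition rather than as one counter.

```java
import java.util.ArrayList;
import java.util.List;

class PollRemnantSketch {

    // Splits the records buffered by one fetch into the batch sizes that
    // successive poll() calls would return, each capped at max.poll.records.
    static List<Integer> pollSizes(int fetchedRecords, int maxPollRecords) {
        List<Integer> sizes = new ArrayList<>();
        int remaining = fetchedRecords;
        while (remaining > 0) {
            int batch = Math.min(remaining, maxPollRecords);
            sizes.add(batch);
            remaining -= batch;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // 10 partitions x 201 buffered records, max.poll.records = 2000
        System.out.println(pollSizes(10 * 201, 2000)); // prints [2000, 10]
    }
}
```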


Kafka transaction: Receiving CONCURRENT_TRANSACTIONS on AddPartitionsToTxnRequest

I am trying to publish in a transaction a message on 16 Kafka partitions on 7 brokers.
The flow is like this:
open transaction
write a message to 16 partitions
commit transaction
sleep 25 ms
Sometimes the transaction takes over 1 second to complete, with an average of 50 ms.
After enabling trace logging on producer's side, I noticed the following error:
TRACE internals.TransactionManager [kafka-producer-network-thread | producer-1] - [Producer clientId=producer-1, transactionalId=cma-2]
Received transactional response AddPartitionsToTxnResponse(errors={modelapp-ecb-0=CONCURRENT_TRANSACTIONS, modelapp-ecb-9=CONCURRENT_TRANSACTIONS, modelapp-ecb-10=CONCURRENT_TRANSACTIONS, modelapp-ecb-11=CONCURRENT_TRANSACTIONS, modelapp-ecb-12=CONCURRENT_TRANSACTIONS, modelapp-ecb-13=CONCURRENT_TRANSACTIONS, modelapp-ecb-14=CONCURRENT_TRANSACTIONS, modelapp-ecb-15=CONCURRENT_TRANSACTIONS, modelapp-ecb-1=CONCURRENT_TRANSACTIONS, modelapp-ecb-2=CONCURRENT_TRANSACTIONS, modelapp-ecb-3=CONCURRENT_TRANSACTIONS, modelapp-ecb-4=CONCURRENT_TRANSACTIONS, modelapp-ecb-5=CONCURRENT_TRANSACTIONS, modelapp-ecb-6=CONCURRENT_TRANSACTIONS, modelapp-ecb-7=CONCURRENT_TRANSACTIONS, modelapp-ecb-8=CONCURRENT_TRANSACTIONS}, throttleTimeMs=0)
for request (type=AddPartitionsToTxnRequest, transactionalId=cma-2, producerId=59003, producerEpoch=0, partitions=[modelapp-ecb-0, modelapp-ecb-9, modelapp-ecb-10, modelapp-ecb-11, modelapp-ecb-12, modelapp-ecb-13, modelapp-ecb-14, modelapp-ecb-15, modelapp-ecb-1, modelapp-ecb-2, modelapp-ecb-3, modelapp-ecb-4, modelapp-ecb-5, modelapp-ecb-6, modelapp-ecb-7, modelapp-ecb-8])
The Kafka producer retries sending AddPartitionsToTxnRequest(s) several times until it succeeds, but this leads to delays.
The code looks like this:
Properties producerProperties = PropertiesUtil.readPropertyFile(_producerPropertiesFile);
_producer = new KafkaProducer<>(producerProperties);
// A transactional producer has to be initialized once before the first transaction.
_producer.initTransactions();
_producerService = Executors.newSingleThreadExecutor(new NamedThreadFactory(getClass().getSimpleName()));
_producerService.submit(() -> {
    while (!Thread.currentThread().isInterrupted()) {
        try {
            // open transaction
            _producer.beginTransaction();
            // write a message to each partition
            for (int partition = 0; partition < _numberOfPartitions; partition++)
                _producer.send(new ProducerRecord<>(_producerTopic, partition, KafkaRecordKeyFormatter.formatControlMessageKey(_messageNumber, token), EMPTY_BYTE_ARRAY));
            // commit transaction, then sleep 25 ms
            _producer.commitTransaction();
            Thread.sleep(25);
        } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException | UnsupportedVersionException e) {...
        } catch (KafkaException e) {...
        } catch (InterruptedException e) {...}
    }
});
Looking at the broker's code, there seem to be 2 cases in which this error is returned, but I cannot tell why I end up in either of them:
object TransactionCoordinator {
  def handleAddPartitionsToTransaction(...): Unit = {
    if (txnMetadata.pendingTransitionInProgress) {
      // return a retriable exception to let the client backoff and retry
      Left(Errors.CONCURRENT_TRANSACTIONS)
    } else if (txnMetadata.state == PrepareCommit || txnMetadata.state == PrepareAbort) {
      Left(Errors.CONCURRENT_TRANSACTIONS)
    } else {...}
  }
}
Thanks in advance for help!
Later edit:
After enabling trace logging on the broker, we were able to see that the broker sends the END_TXN response to the producer before the transaction reaches the CompleteCommit state. The producer is therefore able to start a new transaction, which the broker rejects while it is still in the PrepareCommit -> CompleteCommit transition.
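The race described in the edit can be illustrated with a small model of the two broker-side checks quoted in the question. This is only a sketch in plain Java: the class, enum and method names are hypothetical, the error is represented as a plain string, and the real coordinator logic contains more states and checks than shown here.

```java
class TxnCoordinatorSketch {

    // Simplified subset of the coordinator's transaction states.
    enum TxnState { ONGOING, PREPARE_COMMIT, PREPARE_ABORT, COMPLETE_COMMIT, COMPLETE_ABORT }

    // Mirrors the two conditions from the broker snippet: a pending state
    // transition, or a transaction that is still being committed or aborted.
    static String handleAddPartitions(TxnState state, boolean pendingTransitionInProgress) {
        if (pendingTransitionInProgress) {
            return "CONCURRENT_TRANSACTIONS";
        } else if (state == TxnState.PREPARE_COMMIT || state == TxnState.PREPARE_ABORT) {
            return "CONCURRENT_TRANSACTIONS";
        }
        return "NONE";
    }

    public static void main(String[] args) {
        // The producer already received the END_TXN response, but the previous
        // transaction has not reached CompleteCommit yet.
        System.out.println(handleAddPartitions(TxnState.PREPARE_COMMIT, false)); // prints CONCURRENT_TRANSACTIONS
        // Once the previous transaction is fully committed, the request is accepted.
        System.out.println(handleAddPartitions(TxnState.COMPLETE_COMMIT, false)); // prints NONE
    }
}
```

In this model a retry from the producer only succeeds after the state has advanced past PrepareCommit, which is where the observed delay would come from.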

partition count reduced at sink during kafka stream record forward

I am using Kafka Streams to process a few Kafka records. I have two nodes: one performs a transformation and the other is the final sink.
My topics, INTER_TOPIC and FINAL_TOPIC, have 20 partitions each. The producer that writes to INTER_TOPIC writes key-value records, and its partitioner is round robin.
Below is the code of my intermediate transform node.
public void streamHandler() {
Properties props = getKafkaProperties();
StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> processStream = builder.stream("INTER_TOPIC",
Consumed.with(Serdes.String(), Serdes.String()));
//processStream.peek((key,value)->System.out.println("key :"+key+" value :"+value));, value) -> getTransformer().transform(key, value)).filter((key,value)->filteroutFailedRequest(key,value)).to("FINAL_TOPIC", Produced.with(Serdes.String(), Serdes.String()));
KafkaStreams IStreams = new KafkaStreams(builder.build(), props);
IStreams.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
public void uncaughtException(Thread t, Throwable e) {
logger.error("Thread Name :" + t.getName() + " Error while processing:", e);
try {;
} catch (IOException e) {
logger.error("Failed streaming ",e);
But my sink is receiving data in only 2 partitions, even though I have 20 stream threads configured and I verified that my producer is writing to all 20 partitions. How can I check whether my transform node is forwarding to all 20 partitions of FINAL_TOPIC?
30 Sep 2019 10:39:41,416 INFO c.j.m.s.StreamHandler [289] [streams-user-61a77203-9afc-4c66-843d-94c20a509793-StreamThread-3] Received
30 Sep 2019 10:39:41,416 INFO c.j.m.s.StreamHandler [289] [streams-user-61a77203-9afc-4c66-843d-94c20a509793-StreamThread-4] Received
30 Sep 2019 10:39:41,416 INFO c.j.m.s.StreamHandler [289] [streams-user-61a77203-9afc-4c66-843d-94c20a509793-StreamThread-3] Received
30 Sep 2019 10:39:41,416 INFO c.j.m.s.StreamHandler [289] [streams-user-61a77203-9afc-4c66-843d-94c20a509793-StreamThread-4] Received
30 Sep 2019 10:40:57,427 INFO c.j.m.s.StreamHandler [289] [streams-user-61a77203-9afc-4c66-843d-94c20a509793-StreamThread-3] Received
30 Sep 2019 10:40:57,427 INFO c.j.m.s.StreamHandler [289] [streams-user-61a77203-9afc-4c66-843d-94c20a509793-StreamThread-4] Received
30 Sep 2019 10:40:57,427 INFO c.j.m.s.StreamHandler [289] [streams-user-61a77203-9afc-4c66-843d-94c20a509793-StreamThread-3] Received
30 Sep 2019 10:40:57,427 INFO c.j.m.s.StreamHandler [289] [streams-user-61a77203-9afc-4c66-843d-94c20a509793-StreamThread-4] Received
"and partition-er is round robin"
Why do you think that the partitioner is round-robin? By default, Kafka Streams applies a hash-based partitioning based on the key.
If you want to change the default partitioner, you can implement the StreamPartitioner interface and pass it via:
Produced.with(Serdes.String(), Serdes.String()).withStreamPartitioner(...)
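To see why key-based hashing can leave most sink partitions empty, consider the following dependency-free sketch. It is not Kafka's implementation: the default partitioner hashes the serialized key with murmur2, whereas this sketch swaps in Arrays.hashCode as a stand-in, and the class name and the example keys are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class KeyHashPartitioningSketch {

    // Stand-in for hash-based partitioning: the same key always maps to the
    // same partition. (Kafka uses murmur2; Arrays.hashCode is an assumption
    // made here to keep the sketch self-contained.)
    static int partitionFor(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % numPartitions;
    }

    // Collects the distinct partitions that a sequence of keys would hit.
    static Set<Integer> partitionsUsed(List<String> keys, int numPartitions) {
        Set<Integer> used = new TreeSet<>();
        for (String key : keys) {
            used.add(partitionFor(key, numPartitions));
        }
        return used;
    }

    public static void main(String[] args) {
        // Hypothetical output keys of the transform step: only two distinct values.
        List<String> keys = Arrays.asList("SUCCESS", "RETRY", "SUCCESS", "RETRY", "SUCCESS");
        // With 20 partitions, two distinct keys can never reach more than 2 of them.
        System.out.println(partitionsUsed(keys, 20).size() <= 2); // prints true
    }
}
```

If the transform step emits only a couple of distinct keys, this kind of mapping would explain a sink that receives data in just 2 of its 20 partitions; whether that is the case here depends on what the transformer does to the key.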

How to monitor kafka consumer lag for transactional consumers

There is a useful metric for monitoring Kafka consumer lag in spring-kafka called kafka_consumer_records_lag_max_records, but it does not work for transactional consumers. Is there a specific configuration to enable the lag metric for transactional consumers?
I have configured my consumer group to work with isolation level read_committed, and the metric reports kafka_consumer_records_lag_max_records{client_id="listener-1",} -Inf
What do you mean by "doesn't work"? I just tested it and it works fine...
@SpringBootApplication
public class So56540759Application {

    public static void main(String[] args) throws IOException {
        ConfigurableApplicationContext context = SpringApplication.run(So56540759Application.class, args);
        // ...
    }

    private MetricName lagNow;

    private MetricName lagMax;

    @Autowired
    private MeterRegistry meters;

    @KafkaListener(id = "so56540759", topics = "so56540759", clientIdPrefix = "so56540759",
            properties = "max.poll.records=1")
    public void listen(String in, Consumer<?, ?> consumer) {
        // Read the lag metrics directly from the consumer and print them next to the payload.
        Map<MetricName, ? extends Metric> metrics = consumer.metrics();
        Metric currentLag = metrics.get(this.lagNow);
        Metric maxLag = metrics.get(this.lagMax);
        System.out.println(in
                + " lag " + currentLag.metricName().name() + ":" + currentLag.metricValue()
                + " max " + maxLag.metricName().name() + ":" + maxLag.metricValue());
        // Compare with the value exposed through Micrometer.
        Gauge gauge = meters.get("kafka.consumer.records.lag.max").gauge();
        System.out.println("lag-max in Micrometer: " + gauge.value());
    }

    @Bean
    public NewTopic topic() {
        return new NewTopic("so56540759", 1, (short) 1);
    }

    @Bean
    public ApplicationRunner runner(KafkaTemplate<String, String> template) {
        // Build the metric names used for the lookups in the listener.
        Set<String> tags = new HashSet<>();
        FetcherMetricsRegistry registry = new FetcherMetricsRegistry(tags, "consumer");
        MetricNameTemplate temp = registry.recordsLagMax;
        this.lagMax = new MetricName(temp.name(), temp.group(), temp.description(),
                Collections.singletonMap("client-id", "so56540759-0"));
        temp = registry.partitionRecordsLag;
        Map<String, String> tagsMap = new LinkedHashMap<>();
        tagsMap.put("client-id", "so56540759-0");
        tagsMap.put("topic", "so56540759");
        tagsMap.put("partition", "0");
        this.lagNow = new MetricName(temp.name(), temp.group(), temp.description(), tagsMap);
        return args -> IntStream.range(0, 10).forEach(i -> template.send("so56540759", "foo" + i));
    }

}
2019-06-11 12:13:45.803 INFO 32187 --- [ main] o.a.k.clients.consumer.ConsumerConfig : ConsumerConfig values: = 5000
auto.offset.reset = earliest
bootstrap.servers = [localhost:9092]
check.crcs = true = so56540759-0 = 540000 = 60000 = false
exclude.internal.topics = true
fetch.max.bytes = 52428800 = 500
fetch.min.bytes = 1 = so56540759 = 3000
interceptor.classes = [] = true
isolation.level = read_committed
... = 60000
2019-06-11 12:13:45.840 INFO 32187 --- [o56540759-0-C-1] o.s.k.l.KafkaMessageListenerContainer : partitions assigned: [so56540759-0]
foo0 lag records-lag:9.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo1 lag records-lag:8.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo2 lag records-lag:7.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo3 lag records-lag:6.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo4 lag records-lag:5.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo5 lag records-lag:4.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo6 lag records-lag:3.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo7 lag records-lag:2.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo8 lag records-lag:1.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
foo9 lag records-lag:0.0 max records-lag-max:9.0
lag-max in Micrometer: 9.0
I do see it going to -Infinity in the MBean if a transaction times out - i.e. if the listener doesn't exit within 60 seconds in my test.
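Since the gauge can legitimately report -Infinity, as in the timed-out transaction case above, monitoring code may want to treat such values as "no measurement" rather than as a lag. A minimal sketch in plain Java follows; the class and method names are hypothetical, and how the gauge value is obtained is left out.

```java
import java.util.OptionalDouble;

class LagGaugeSketch {

    // Returns the lag only when the gauge holds a finite number; NaN and
    // +/-Infinity are treated as "no measurement available".
    static OptionalDouble usableLag(double gaugeValue) {
        if (Double.isNaN(gaugeValue) || Double.isInfinite(gaugeValue)) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(gaugeValue);
    }

    public static void main(String[] args) {
        System.out.println(usableLag(9.0).isPresent());                      // prints true
        System.out.println(usableLag(Double.NEGATIVE_INFINITY).isPresent()); // prints false
    }
}
```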

Producing from localhost to Kafka in HDP Sandbox 2.6.5 not working

I am writing Kafka client producer as:
public class BasicProducerExample {
public static void main(String[] args){
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "");
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.RETRIES_CONFIG, 0);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
props.put("batch.size","16384");// batch size in bytes (an upper bound per batch, not a per-message limit)
Producer<String, String> producer = new KafkaProducer<String, String>(props);
TestCallback callback = new TestCallback();
Random rnd = new Random();
for (long i = 0; i < 2 ; i++) {
//ProducerRecord<String, String> data = new ProducerRecord<String, String>("dke", "key-" + i, "message-"+i );
//Topic and Message
ProducerRecord<String, String> data = new ProducerRecord<String, String>("dke", ""+i);
producer.send(data, callback);
private static class TestCallback implements Callback {
public void onCompletion(RecordMetadata recordMetadata, Exception e) {
if (e != null) {
System.out.println("Error while producing message to topic :" + recordMetadata);
} else {
String message = String.format("sent message to topic:%s partition:%s offset:%s", recordMetadata.topic(), recordMetadata.partition(), recordMetadata.offset());
Error while producing message to topic :null
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
Broker port: localhost:6667 is working.
In your property for BOOTSTRAP_SERVERS_CONFIG, try changing the port number to 6667.
I use Apache Kafka on a Hortonworks (HDP 2.x release) installation. The error message means that the Kafka producer was not able to push the data to the segment log file. From a command-line console, that would mean one of two things:
You are using an incorrect port for the brokers
Your listener configuration is not working
If you encounter the error message while writing via the Scala API, additionally check the connection to the Kafka cluster using telnet <cluster-host> <broker-port>
NOTE: If you are using the Scala API to create a topic, it takes some time for the brokers to learn about the newly created topic. So, immediately after topic creation, producers might fail with the error Failed to update metadata after 60000 ms.
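The telnet check mentioned above can also be done from code. Here is a minimal sketch in plain Java using only java.net; the class and method names are hypothetical, and the default host and port in main are just the values discussed in this thread. A successful TCP connection only shows that the port is reachable, not that the advertised listeners are configured correctly.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class BrokerPortCheck {

    // Returns true if a TCP connection to host:port can be opened within
    // timeoutMillis, roughly what "telnet <cluster-host> <broker-port>" tests.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6667;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 3000));
    }
}
```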
I did the following checks in order to resolve this issue:
The first difference I noticed when checking via Ambari is that Kafka brokers listen on port 6667 on HDP 2.x (Apache Kafka uses 9092 by default).
Next, use the IP instead of localhost.
I executed netstat -na | grep 6667
tcp 0 0* LISTEN
tcp 1 0 CLOSE_WAIT
tcp 0 0 TIME_WAIT
So, I modified the producer call to use the IP and not localhost:
./ --broker-list --topic rdl_test_2
To monitor if you have new records being written, monitor the /kafka-logs folder.
cd /kafka-logs/<topic name>/
ls -lart
-rw-r--r--. 1 kafka hadoop 0 Feb 10 07:24 00000000000000000000.log
-rw-r--r--. 1 kafka hadoop 10485756 Feb 10 07:24 00000000000000000000.timeindex
-rw-r--r--. 1 kafka hadoop 10485760 Feb 10 07:24 00000000000000000000.index
Once the producer successfully writes, the segment log file 00000000000000000000.log will grow in size.
See the size below:
-rw-r--r--. 1 kafka hadoop 10485760 Feb 10 07:24 00000000000000000000.index
-rw-r--r--. 1 kafka hadoop **45** Feb 10 09:16 00000000000000000000.log
-rw-r--r--. 1 kafka hadoop 10485756 Feb 10 07:24 00000000000000000000.timeindex
At this point, you can run the
./ --bootstrap-server --topic rdl_test_2 --from-beginning
response is hello world
After this step, if you want to produce messages via the Scala API, change the listeners value (from localhost to a public IP) and restart the Kafka brokers via Ambari:
A sample producer is as follows:
package com.scalakafka.sample
import java.util.Properties
import java.util.concurrent.TimeUnit
import org.apache.kafka.clients.producer.{ProducerRecord, KafkaProducer}
import org.apache.kafka.common.serialization.{StringSerializer, StringDeserializer}
class SampleKafkaProducer {
case class KafkaProducerConfigs(brokerList: String = "") {
val properties = new Properties()
val batchsize :java.lang.Integer = 1
properties.put("bootstrap.servers", brokerList)
properties.put("key.serializer", classOf[StringSerializer])
properties.put("value.serializer", classOf[StringSerializer])
// properties.put("serializer.class", classOf[StringDeserializer])
properties.put("batch.size", batchsize)
// properties.put("", 1)
// properties.put("buffer.memory", 33554432)
val producer = new KafkaProducer[String, String](KafkaProducerConfigs().properties)
def produce(topic: String, messages: Iterable[String]): Unit = {
messages.foreach { m =>
println(s"Sending $topic and message is $m")
val result = producer.send(new ProducerRecord(topic, m)).get()
println(s"the write status is ${result}")
producer.close(10L, TimeUnit.MILLISECONDS)
Hope this helps someone.

Not able to see kafka messages produced from Java api in the consumer console

I was trying to produce some messages to a topic and then fetch them with the console consumer.
Code used :
import java.util.Date;
import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;
public class SimpleProducer {
private static Producer<String,String> producer;
public SimpleProducer() {
Properties props = new Properties();
// Set the broker list for requesting metadata to find the lead broker
//This specifies the serializer class for keys
props.put("serializer.class", "kafka.serializer.StringEncoder");
// 1 means the producer receives an acknowledgment once the lead replica
// has received the data. This option provides better durability as the
// client waits until the server acknowledges the request as successful.
props.put("request.required.acks", "1");
ProducerConfig config = new ProducerConfig(props);
producer = new Producer<String, String>(config);
public static void main(String[] args) {
int argsCount = args.length;
if (argsCount == 0 || argsCount == 1)
throw new IllegalArgumentException(
"Please provide topic name and Message count as arguments");
String topic = (String) args[0];
String count = (String) args[1];
int messageCount = Integer.parseInt(count);
System.out.println("Topic Name - " + topic);
System.out.println("Message Count - " + messageCount);
SimpleProducer simpleProducer = new SimpleProducer();
simpleProducer.publishMessage(topic, messageCount);
private void publishMessage(String topic, int messageCount) {
for (int mCount = 0; mCount < messageCount; mCount++) {
String runtime = new Date().toString();
String msg = "Message Publishing Time - " + runtime;
// Creates a KeyedMessage instance
KeyedMessage<String, String> data =
new KeyedMessage<String, String>(topic, msg);
// Publish the message
// Close producer connection with broker.
Topic Name - test
Message Count - 10
log4j:WARN No appenders could be found for logger
log4j:WARN Please initialize the log4j system properly.
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
Message Publishing Time - Tue Feb 16 02:00:56 IST 2016
From the command line I supply the topic name "kafkatopic" and a message count of "10". The program runs fine without any exception, but when I try to see the messages from the console, they do not appear. The topic is created.
bin/ --zookeeper localhost:2181 --topic kafkatopic --from-beginning
Can you please help me understand what went wrong?
Two things I would like to point out:
1) You shouldn't specify --zookeeper here; you should use the --bootstrap-server argument.
2) You should check what the broker's configuration file says about listeners and advertised.listeners. They should point correctly to the brokers.
I hope this helps.