Using Kafka Broker: 1.0.1
spring-kafka: 2.1.6.RELEASE
I'm using a batched consumer with the following settings:
// Other settings are not shown..
props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");
I use a Spring listener as follows:
@KafkaListener(topics = "${topics}", groupId = "${consumer.group.id}")
public void receive(final List<String> data,
        @Header(KafkaHeaders.RECEIVED_PARTITION_ID) final List<Integer> partitions,
        @Header(KafkaHeaders.RECEIVED_TOPIC) Set<String> topics,
        @Header(KafkaHeaders.OFFSET) final List<Long> offsets) { // ......code... }
I always find that a few messages remain in the batch and are never received by my listener. It appears that if the remaining messages number fewer than the batch size, they are not consumed (perhaps they are held in memory and never published to my listener). Is there a setting to auto-flush the batch after a time interval, so that these messages are not left unflushed?
What's the best way to deal with this kind of situation with a batch consumer?

I just ran a test without any problems...
@SpringBootApplication
public class So50370851Application {
    public static void main(String[] args) {
        SpringApplication.run(So50370851Application.class, args);
    }
    @Bean
    public ApplicationRunner runner(KafkaTemplate<String, String> template) {
        return args -> {
            for (int i = 0; i < 230; i++) {
                template.send("so50370851", "foo" + i);
            }
        };
    }
    @KafkaListener(id = "foo", topics = "so50370851")
    public void listen(List<String> in) {
        // ...
    }
    @Bean
    public NewTopic topic() {
        return new NewTopic("so50370851", 1, (short) 1);
    }
}
Also, the debug logs show that, after a while, it is polling and fetching 0 records (and this repeats over and over).
That implies the problem is on the sending side.
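To illustrate why a partial batch should not get stuck, here is a toy model in plain Java (no Kafka involved) of how max.poll.records behaves: it is an upper bound on what a single poll returns, not a threshold the consumer waits to reach. The class and method names are made up for this sketch, and real batch sizes also depend on fetch timing.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of poll(): returns at most maxPollRecords and never waits to fill a batch. */
class PollBatchSketch {

    /** Sizes of the batches that successive polls would hand to a batch listener. */
    static List<Integer> batchSizes(int available, int maxPollRecords) {
        List<Integer> sizes = new ArrayList<>();
        int remaining = available;
        while (remaining > 0) {
            // max.poll.records is a cap, so the last batch may be smaller.
            int batch = Math.min(remaining, maxPollRecords);
            sizes.add(batch);
            remaining -= batch;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // 230 records with a cap of 100, as in the test above.
        System.out.println(batchSizes(230, 100)); // prints [100, 100, 30]
    }
}
```

The final 30 records are delivered as their own, smaller batch; nothing needs to be flushed.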


Reactive program exiting early before sending all messages to Kafka

This is a follow-up to a previous reactive Kafka question (Issue while sending the Flux of data to the reactive kafka).
I am trying to send some log records to Kafka using the reactive approach. Here is the reactive code that sends messages using reactive kafka.
public class LogProducer {

    private final KafkaSender<String, String> sender;

    public LogProducer(String bootstrapServers) {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "log-producer");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        SenderOptions<String, String> senderOptions = SenderOptions.create(props);
        sender = KafkaSender.create(senderOptions);
    }

    public void sendMessages(String topic, Flux<Logs.Data> records) throws InterruptedException {
        AtomicInteger sentCount = new AtomicInteger(0);
        sender.send(records
                .map(record -> {
                    LogRecord lrec = record.getRecords().get(0);
                    String id = lrec.getId();
                    Thread.sleep(0, 5); // sleep for 5 ns
                    return SenderRecord.create(new ProducerRecord<>(topic, id,
                            lrec.toString()), id);
                })).doOnNext(res -> sentCount.incrementAndGet()).then()
                .doOnError(e -> {
                    log.error("[FAIL]: Send to the topic: '{}' failed. "
                            + e, topic);
                })
                .doOnSuccess(s -> {
                    log.info("[SUCCESS]: {} records sent to the topic: '{}'", sentCount, topic);
                })
                .subscribe();
    }
}
public class ExecuteQuery implements Runnable {

    private LogProducer producer = new LogProducer("localhost:9092");

    public void run() {
        Flux<Logs.Data> records = ...
        producer.sendMessages(kafkaTopic, records);
        // processing related to the messages sent
    }
}
So even with the Thread.sleep(0, 5); in place, it sometimes does not send all messages to Kafka, and the program exits early after printing the SUCCESS message (log.info("[SUCCESS]: {} records sent to the topic: '{}'", sentCount, topic);). Is there a more concrete way to solve this problem? For example, using some kind of callback, so that the thread waits for all messages to be sent successfully.
I have a Spring console application and run ExecuteQuery through a scheduler at a fixed rate, something like this:
public class Main {

    private ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private ExecutorService executor = Executors.newFixedThreadPool(POOL_SIZE);

    public static void main(String[] args) {
        QueryScheduler scheduledQuery = new QueryScheduler();
        scheduler.scheduleAtFixedRate(scheduledQuery, 0, 5, TimeUnit.MINUTES);
    }

    class QueryScheduler implements Runnable {
        public void run() {
            // preprocessing related to time
            executor.execute(new ExecuteQuery());
            // postprocessing related to time
        }
    }
}
Your Thread.sleep(0, 5); // sleep for 5 ns does nothing to block the main thread, so the main thread exits when it is done, and your ExecuteQuery may not have finished its job yet.
It is not clear how you start your application, but I recommended a Thread.sleep() in the main thread itself to block it. To be precise, in the public static void main(String[] args) method implementation.
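As an alternative to sleeping for a guessed duration, the calling thread can wait on a java.util.concurrent.CountDownLatch that is counted down once per acknowledged send. The sketch below uses only the JDK: a thread pool stands in for the asynchronous sender, and the class and method names are invented for illustration. With Reactor, a comparable effect could be had by counting down in a terminal callback or by blocking on the resulting Mono with block(); that is not shown here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** JDK-only stand-in for "wait until every asynchronous send has completed". */
class AwaitSendsSketch {

    /** Submits `total` fake sends on a background pool and blocks until all are acknowledged. */
    static int sendAndAwait(int total, long timeoutSeconds) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CountDownLatch done = new CountDownLatch(total);
        AtomicInteger sentCount = new AtomicInteger();
        try {
            for (int i = 0; i < total; i++) {
                pool.execute(() -> {
                    sentCount.incrementAndGet(); // stands in for a broker acknowledgement
                    done.countDown();
                });
            }
            // The caller blocks here instead of sleeping for an arbitrary time.
            if (!done.await(timeoutSeconds, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for sends");
            }
            return sentCount.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sends", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndAwait(1000, 10) + " records acknowledged");
    }
}
```

The timeout keeps the caller from hanging forever if an acknowledgement never arrives.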

How to commit the offsets when using KafkaItemReader in spring batch job, once all the messages are processed and written to the .dat file?

I have developed a Spring Batch job that reads from a Kafka topic using the KafkaItemReader class. I want to commit the offsets only when the messages read in the defined chunk have been processed and written successfully to an output .dat file.
public Job kafkaEventReformatjob(
@Qualifier("MaintStep") Step MainStep,
@Qualifier("moveFileToFolder") Step moveFileToFolder,
@Qualifier("compressFile") Step compressFile,
JobExecutionListener listener)
return jobBuilderFactory.get("kafkaEventReformatJob")
.incrementer(new RunIdIncrementer())
Step MainStep(
ItemProcessor<IncomingRecord, List<Record>> flatFileItemProcessor,
ItemWriter<List<Record>> flatFileWriter)
return stepBuilderFactory.get("mainStep")
.<InputRecord, List<Record>> chunk(5000)
// Reader reads all the messages from the Kafka topic and returns them as IncomingRecord.
KafkaItemReader<String, IncomingRecord> kafkaItemReader() {
Properties props = new Properties();
List<Integer> partitions = new ArrayList<>();
return new KafkaItemReaderBuilder<String, IncomingRecord>()
public ItemWriter<List<Record>> writer() {
ListUnpackingItemWriter<Record> listUnpackingItemWriter = new ListUnpackingItemWriter<>();
return listUnpackingItemWriter;
public ItemWriter<Record> flatWriter() {
FlatFileItemWriter<Record> fileWriter = new FlatFileItemWriter<>();
String tempFileName = "abc";
LOGGER.info("Output File name " + tempFileName + " is in working directory ");
String workingDir = service.getWorkingDir().toAbsolutePath().toString();
Path outputFile = Paths.get(workingDir, tempFileName);
fileWriter.setResource(new FileSystemResource(outputFile.toString()));
LOGGER.info("Successfully created the file writer");
return fileWriter;
public TransformProcessor processor() {
return new TransformProcessor();
Writer Class
public void beforeStep(StepExecution stepExecution) {
this.stepExecution = stepExecution;
public void afterStep(StepExecution stepExecution) {
public void write(final List<? extends List<Record>> lists) throws Exception {
List<Record> consolidatedList = new ArrayList<>();
for (List<Record> list : lists) {
if (null != list && !list.isEmpty()) // null check must come before the dereference
count += consolidatedList.size(); // to count Trailer record count
Item Processor
public List process(IncomingRecord record) {
List<Record> recordList = new ArrayList<>();
if (null != record.getEventName() and a few other conditions inside this section) {
// setting values of Record Class by extracting from the IncomingRecord.
recordList.add(the valid records which matching the condition);
return null;
Synchronizing a read operation and a write operation between two transactional resources (a queue and a database, for instance) is possible by using a JTA transaction manager that coordinates both transaction managers (2PC protocol). However, this approach is not possible if one of the resources is not transactional (like the majority of file systems). So unless you use a transactional file system and a JTA transaction manager that coordinates a Kafka transaction manager and a file system transaction manager, you need another approach, such as the Compensating Transaction pattern. In your case, the "undo" operation (compensating action) would be rewinding the offset to where it was before the failed chunk.
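The compensating action can be sketched without any Kafka or Spring Batch classes. In the toy model below, an in-memory list stands in for one topic partition, another list stands in for the output file, and all names are hypothetical. With a real consumer the rewind would correspond to seeking back to the chunk's first offset, and a real file writer may also need to discard a partially written chunk, which this sketch does not model.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy chunk processor: advance the read position, and rewind it if the write fails. */
class ChunkRewindSketch {

    private final List<String> log;                  // stands in for one topic partition
    private int position = 0;                        // next offset to read
    final List<String> written = new ArrayList<>();  // stands in for the output file

    ChunkRewindSketch(List<String> log) {
        this.log = log;
    }

    /** Reads up to chunkSize records and writes them; returns true if the chunk succeeded. */
    boolean processChunk(int chunkSize, boolean failWrite) {
        int chunkStart = position; // remember where this chunk began
        List<String> chunk = new ArrayList<>();
        while (position < log.size() && chunk.size() < chunkSize) {
            chunk.add(log.get(position++));
        }
        try {
            if (failWrite) {
                throw new IllegalStateException("simulated write failure");
            }
            written.addAll(chunk);
            return true;
        } catch (IllegalStateException e) {
            position = chunkStart; // compensating action: rewind to before the failed chunk
            return false;
        }
    }

    int position() {
        return position;
    }

    public static void main(String[] args) {
        ChunkRewindSketch job = new ChunkRewindSketch(List.of("a", "b", "c", "d"));
        job.processChunk(2, false); // succeeds, position moves to 2
        job.processChunk(2, true);  // fails, position is rewound to 2
        job.processChunk(2, false); // the same records are read again and written
        System.out.println(job.written); // prints [a, b, c, d]
    }
}
```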

Kafka: Consumer api: Regression test fails if runs in a group (sequentially)

I have implemented a Kafka application using the consumer API, and I have 2 regression tests implemented with the streams API:
To test the happy path: the test produces data into the input topic that the application listens to; the application consumes it and produces data into the output topic, which the test then consumes and validates against the expected output data.
To test the error path: the behavior is the same as above, although this time the application will produce data into the output topic and the test will consume from the application's error topic and validate against the expected error output.
My code and the regression-test code reside in the same project, under the expected directory structure. In both tests, the data should be picked up by the same listener on the application side.
The problem is :
When I execute the tests individually (manually), each test passes. However, if I execute them together but sequentially (for example: gradle clean build), only the first test passes. The 2nd test fails: the test-side consumer polls for data and, after some time, gives up without finding any.
From debugging, it looks like everything works perfectly the 1st time (test-side and application-side producers and consumers). However, during the 2nd test it seems that the application-side consumer is not receiving any data (it seems that the test-side producer is producing data, but I cannot say that for sure), and hence no data is being produced into the error topic.
What I have tried so far:
After investigating, my understanding is that we are hitting race conditions. To avoid them, I found suggestions such as:
use @DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
Tear down the broker after each test (please see the ".destroy()" on the brokers)
use different topic names for each test
I applied all of them and still could not recover from my issue.
I am providing the code here for perusal. Any insight is appreciated.
Code for 1st test (Testing error path):
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
partitions = 1,
controlledShutdown = false,
topics = {
brokerProperties = {
public class AbstractIntegrationFailurePathTest {
private final int retryLimit = 0;
protected EmbeddedKafkaBroker embeddedFailurePathKafkaBroker;
//To produce data
protected KafkaTemplate<PreferredMediaMsgKey, SendEmailCmd> inputProducerTemplate;
//To read from output error
protected Consumer<PreferredMediaMsgKey, ErrorCmd> outputErrorConsumer;
//Service to execute notification-preference
protected AdapterStreamProperties projectProerties;
protected void subscribe(Consumer consumer, String topic, int attempt) {
try {
embeddedFailurePathKafkaBroker.consumeFromAnEmbeddedTopic(consumer, topic);
} catch (ComparisonFailure ex) {
if (attempt < retryLimit) {
subscribe(consumer, topic, attempt + 1);
public class AdapterStreamFailurePathTestConfig {
private EmbeddedKafkaBroker embeddedKafkaBroker;
private String applicationId;
private String groupId;
//Producer of records that the program consumes
public Map<String, Object> sendEmailCmdProducerConfigs() {
Map<String, Object> results = KafkaTestUtils.producerProps(embeddedKafkaBroker);
return results;
public ProducerFactory<PreferredMediaMsgKey, SendEmailCmd> inputProducerFactory() {
return new DefaultKafkaProducerFactory<>(sendEmailCmdProducerConfigs());
public KafkaTemplate<PreferredMediaMsgKey, SendEmailCmd> inputProducerTemplate() {
return new KafkaTemplate<>(inputProducerFactory());
//Consumer of the error output, generated by the program
public Map<String, Object> outputErrorConsumerConfig() {
Map<String, Object> props = KafkaTestUtils.consumerProps(
applicationId, Boolean.TRUE.toString(), embeddedKafkaBroker);
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
return props;
public Consumer<PreferredMediaMsgKey, ErrorCmd> outputErrorConsumer() {
DefaultKafkaConsumerFactory<PreferredMediaMsgKey, ErrorCmd> rpf =
new DefaultKafkaConsumerFactory<>(outputErrorConsumerConfig());
return rpf.createConsumer(groupId, "notification-failure");
@SpringBootTest(classes = AdapterStreamFailurePathTestConfig.class)
@ActiveProfiles(profiles = "errtest")
public class ErrorPath400Test extends AbstractIntegrationFailurePathTest {
private DataGenaratorForErrorPath400Test datagen;
private AdapterHttpClient httpClient;
private ErroredEmailCmdDeserializer erroredEmailCmdDeserializer;
public void setup() throws InterruptedException {
new GenericResponse(
System.out.println("producer: "+ projectProerties.getInputTopic());
subscribe(outputErrorConsumer , projectProerties.getErrorTopic(), 0);
public void testWithError() throws InterruptedException, InvalidProtocolBufferException, TextFormat.ParseException {
ConsumerRecords<PreferredMediaMsgKeyBuf.PreferredMediaMsgKey, ErrorCommandBuf.ErrorCmd> records;
List<ConsumerRecord<PreferredMediaMsgKeyBuf.PreferredMediaMsgKey, ErrorCommandBuf.ErrorCmd>> outputListOfErrors = new ArrayList<>();
int attempt = 0;
int expectedRecords = 1;
do {
records = KafkaTestUtils.getRecords(outputErrorConsumer);
} while (attempt < expectedRecords && outputListOfErrors.size() < expectedRecords);
//Verify the recipient event stream size
Assert.assertEquals(expectedRecords, outputListOfErrors.size());
//Validate output
public void tearDown() {
The 2nd test is almost the same in structure, although this time the test-side consumer consumes from the application-side output topic (instead of the error topic). I also named the consumers, broker, producer, and topics differently, like:
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
partitions = 1,
controlledShutdown = false,
topics = {
brokerProperties = {
public class AbstractIntegrationSuccessPathTest {
private final int retryLimit = 0;
protected EmbeddedKafkaBroker embeddedKafkaBroker;
//To produce data
protected KafkaTemplate<PreferredMediaMsgKey,SendEmailCmd> sendEmailCmdProducerTemplate;
//To read from output regular topic
protected Consumer<PreferredMediaMsgKey, NotifiedEmailCmd> ouputConsumer;
//Service to execute notification-preference
protected AdapterStreamProperties projectProerties;
protected void subscribe(Consumer consumer, String topic, int attempt) {
try {
embeddedKafkaBroker.consumeFromAnEmbeddedTopic(consumer, topic);
} catch (ComparisonFailure ex) {
if (attempt < retryLimit) {
subscribe(consumer, topic, attempt + 1);
Please let me know if I should provide any more information.
Don't use a fixed port; leave that out and the embedded broker will use a random port; the consumer configs are set up in KafkaTestUtils to point to the random port.
You shouldn't need to dirty the context after each test method - use a different group.id for each test and a different topic.
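One simple way to get a distinct group.id and topic per test is to derive both from a random suffix. This is only a sketch: the helper name and the prefixes are hypothetical, and how the topic is then created on the embedded broker is left out.

```java
import java.util.UUID;

/** Generates names that no other test (or earlier run) will reuse. */
class UniqueTestNamesSketch {

    /** Appends a random UUID; the result uses only letters, digits, and hyphens. */
    static String unique(String prefix) {
        return prefix + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Hypothetical prefixes; each test would call these once in its setup.
        String topic = unique("input-topic");
        String groupId = unique("test-group");
        System.out.println(topic + " / " + groupId);
    }
}
```

Because the group is new for every test, no committed offsets or leftover group members from the previous test can interfere with it.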
In my case, the consumer was not closed properly. I had to do:
public void tearDown() {
    // shutdown hook to correctly close the streams application
    Runtime.getRuntime().addShutdownHook(new Thread(ouputConsumer::close));
}
to resolve it.

What happens to the timestamp of a message in a stream when it's mapped into another stream?

I've an application where I process a stream and convert it into another. Here is a sample:
public void run(final String... args) {
final Serde<Event> eventSerde = new EventSerde();
final Properties props = streamingConfig.getProperties(
props.put(StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG, EventTimestampExtractor.class);
final StreamsBuilder builder = new StreamsBuilder();
KStream<String, Event> eventStream = builder.stream(inputStream);
final Serde<Device> deviceSerde = new DeviceSerde();
.map((key, event) -> {
final Device device = modelMapper.map(event, Device.class);
return new KeyValue<>(key, device);
.to("device_topic", Produced.with(Serdes.String(), deviceSerde));
final Topology topology = builder.build();
final KafkaStreams streams = new KafkaStreams(topology, props);
Here are some details about the app:
Spring Boot 1.5.17
Kafka 2.1.0
Kafka Streams 2.1.0
Spring Kafka 1.3.6
Although a timestamp is set in the messages inside the input stream, I also place an implementation of TimestampExtractor to make sure that a proper timestamp is attached into all messages (as other producers may send messages into the same topic).
Within the code, I receive a stream of events and I basically convert them into different objects and eventually route those objects into different streams.
I'm trying to understand whether the initial timestamp I set is still attached to the messages published into device_topic in this particular case.
The receiving end (of device stream) is like this:
@KafkaListener(topics = "device_topic")
public void onDeviceReceive(final Device device, @Header(KafkaHeaders.RECEIVED_TIMESTAMP) final long timestamp) {
    log.trace("[{}] Received device: {}", timestamp, device);
}
Unfortunately, the printed timestamp seems to be wall-clock time. Is this the expected behaviour, or am I missing something?
Spring Kafka 1.3.x uses a very old 0.11 client; perhaps it doesn't propagate the timestamp. I just tested with Boot 2.1.3 and Spring Kafka 2.2.4 and the timestamp is propagated ok...
public class So54771130Application {
public static void main(String[] args) {
SpringApplication.run(So54771130Application.class, args);
public ApplicationRunner runner(KafkaTemplate<String, String> template) {
return args -> {
template.send("so54771130", 0, 42L, null, "baz");
public KStream<String, String> stream(StreamsBuilder builder) {
KStream<String, String> stream = builder.stream("so54771130");
.map((k, v) -> {
System.out.println("Mapping:" + v);
return new KeyValue<>(null, "bar");
return stream;
public NewTopic topic1() {
return new NewTopic("so54771130", 1, (short) 1);
public NewTopic topic2() {
return new NewTopic("so54771130-1", 1, (short) 1);
@KafkaListener(id = "so54771130", topics = "so54771130-1")
public void listen(String in, @Header(KafkaHeaders.RECEIVED_TIMESTAMP) long ts) {
System.out.println(in + "#" + ts);

Kafka Consumer committing manually based on a condition.

My @KafkaListener consumer commits only once a specific condition is met. Let us say a topic gets the following data from a producer:
"Message 0" at offset[0]
"Message 1" at offset[1]
They are received by the consumer and committed with the help of acknowledgment.acknowledge().
Then the messages below arrive on the topic:
"Message 2" at offset[2]
"Message 3" at offset[3]
The running consumer receives the above data. Here the condition fails, and the above offsets are not committed.
Even if new data arrives on the topic, "Message 2" and "Message 3" should still be picked up by some consumer from the same consumer group, as they are not committed. But this is not happening; the consumer picks up the new message instead.
When I restart my consumer, I get Message 2 and Message 3 back. This should have happened while the consumers were running.
The code is as follows -:
KafkaConsumerConfig file
public class KafkaConsumerConfig {
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
return factory;
public ConsumerFactory<String, String> consumerFactory() {
return new DefaultKafkaConsumerFactory<>(consumerConfigs());
public Map<String, Object> consumerConfigs() {
Map<String, Object> propsMap = new HashMap<>();
propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "100");
propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, "group1");
propsMap.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
return propsMap;
public Listener listener() {
return new Listener();
Listener Class
public class Listener {
public CountDownLatch countDownLatch0 = new CountDownLatch(3);
private Logger LOGGER = LoggerFactory.getLogger(Listener.class);
static int count0 =0;
@KafkaListener(topics = "abcdefghi", group = "group1", containerFactory = "kafkaListenerContainerFactory")
public void listenPartition0(String data, @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
        @Header(KafkaHeaders.OFFSET) List<Long> offsets, Acknowledgment acknowledgment) throws InterruptedException {
count0 = count0 + 1;
LOGGER.info("start consumer 0");
LOGGER.info("received message via consumer 0='{}' with partition-offset='{}'", data, partitions + "-" + offsets);
if (count0%2 ==0)
LOGGER.info("end of consumer 0");
How can I achieve my desired result?
That's correct. The offset is just a number that the consumer instance tracks in memory. Committed offsets are needed for consumers that newly arrive in the group and take over the same partitions. That's why it works as you expect when you restart the application or when a rebalance happens in the group.
To make it work the way you want, consider implementing ConsumerSeekAware in your listener and calling ConsumerSeekCallback.seek() with the offset you would like to start consuming from in the next poll cycle.
public class Listener implements ConsumerSeekAware {
    private final ThreadLocal<ConsumerSeekCallback> seekCallBack = new ThreadLocal<>();
    public void registerSeekCallback(ConsumerSeekCallback callback) {
        this.seekCallBack.set(callback); // keep the callback for this consumer thread
    }
    public void listen(...) {
        this.seekCallBack.get().seek(topic, partition, 0);
    }
}
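The difference between the in-memory position and the committed offset can be shown with a toy model in plain Java. Nothing here is a Kafka API: the class, its fields, and its methods are invented to mirror the behaviour described above, where a running consumer keeps reading from its own position and only a seek (or a restart or rebalance) takes it back to the committed offset.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy consumer: poll() follows the in-memory position, not the committed offset. */
class PositionVsCommitSketch {

    private final List<String> partition = new ArrayList<>(); // stands in for one partition
    private int position = 0;   // where the next poll starts (kept in memory)
    private int committed = 0;  // what a new or restarted consumer would start from

    void produce(String message) {
        partition.add(message);
    }

    /** Returns everything after the current position and advances it. */
    List<String> poll() {
        List<String> records = new ArrayList<>(partition.subList(position, partition.size()));
        position = partition.size();
        return records;
    }

    void commit() {
        committed = position;
    }

    /** What a seek to the last committed offset would do. */
    void seekToCommitted() {
        position = committed;
    }

    public static void main(String[] args) {
        PositionVsCommitSketch consumer = new PositionVsCommitSketch();
        consumer.produce("Message 0");
        consumer.produce("Message 1");
        consumer.poll();
        consumer.commit();                   // offsets 0 and 1 are committed
        consumer.produce("Message 2");
        consumer.produce("Message 3");
        consumer.poll();                     // received, but not committed
        consumer.produce("Message 4");
        System.out.println(consumer.poll()); // prints [Message 4]
        consumer.seekToCommitted();
        System.out.println(consumer.poll()); // prints [Message 2, Message 3, Message 4]
    }
}
```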