Kafka reactor - How to disable KAFKA consumer being autostarted? - apache-kafka

Below is my Kafka consumer:
#Bean("kafkaConfluentInboundReceiver")
#ConditionalOnProperty(value = "com.demo.kafka.core.inbound.confluent.topic-name",
matchIfMissing = false)
public KafkaReceiver<String, Object> kafkaInboundReceiver() {
ReceiverOptions<String, Object> receiverOptions = ReceiverOptions.create(inboundConsumerConfigs());
receiverOptions.schedulerSupplier(() -> Schedulers
.fromExecutorService(applicationContext.getBean("inboundKafkaExecutorService", ExecutorService.class)));
receiverOptions.maxCommitAttempts(kafkaProperties.getKafka().getCore().getMaxCommitAttempts());
return KafkaReceiver.create(receiverOptions.addAssignListener(Collection::iterator)
.subscription(Collections.singleton(
kafkaProperties.getKafka()
.getCore().getInbound().getConfluent()
.getTopicName()))
.commitInterval(Duration.ZERO).commitBatchSize(0));
}
My Kafka consumer is getting started automatically. However, I want to disable the consumer from being auto-started.
I got to know that in Spring Kafka we can do something like this:
factory.setAutoStartup(start);
However, I am not sure how to achieve this (control the auto start/stop behavior) in Reactor Kafka. I want to have something like the below.
Introducing a property to handle the auto start/stop behavior:
@Value("${consumer.autostart:true}")
private boolean start;
Using the above property, I should be able to set the Kafka auto-start flag in Reactor Kafka, something like this:
return KafkaReceiver.create(receiverOptions.addAssignListener(Collection::iterator)
        .subscription(Collections.singleton(
                kafkaProperties.getKafka()
                        .getCore().getInbound().getConfluent()
                        .getTopicName()))
        .commitInterval(Duration.ZERO).commitBatchSize(0)).setAutoStart(start);
Note: .setAutoStart(start);
Is this doable in Reactor Kafka? If so, how do I do it?
Update:
protected void inboundEventHubListener(String topicName, List<String> allowedValues) {
    Scheduler scheduler = Schedulers.fromExecutorService(kafkaExecutorService);
    kafkaEventHubInboundReceiver
            .receive()
            .publishOn(scheduler)
            .groupBy(receiverRecord -> {
                try {
                    return receiverRecord.receiverOffset().topicPartition();
                } catch (Throwable throwable) {
                    log.error("exception in groupBy", throwable);
                    return Flux.empty();
                }
            })
            .flatMap(partitionFlux -> partitionFlux.publishOn(scheduler)
                    .map(record -> {
                        processMessage(record, topicName, allowedValues).block(
                                Duration.ofSeconds(60L)); // this block() is to trigger processing of a message
                        return record;
                    })
                    .concatMap(message -> {
                        log.info("Received message after processing offset: {} partition: {} ",
                                message.offset(), message.partition());
                        return message.receiverOffset()
                                .commit()
                                .onErrorContinue((t, o) -> log.error(
                                        String.format("exception raised while commit offset %s", o), t));
                    }))
            .onErrorContinue((t, o) -> {
                try {
                    if (null != o) {
                        ReceiverRecord<String, Object> record = (ReceiverRecord<String, Object>) o;
                        ReceiverOffset offset = record.receiverOffset();
                        log.debug("failed to process message: {} partition: {} and message: {} ",
                                offset.offset(), record.partition(), record.value());
                    }
                    log.error(String.format("exception raised while processing message %s", o), t);
                } catch (Throwable inner) {
                    log.error("encountered error in onErrorContinue", inner);
                }
            })
            .subscribeOn(scheduler)
            .subscribe();
}
Can I do something like this?
kafkaEventHubInboundReceiverObj = kafkaEventHubInboundReceiver.....subscribeOn(scheduler);
if (consumer.autostart) {
    kafkaEventHubInboundReceiverObj.subscribe();
}

With reactor-kafka there is no concept of "auto start"; you are in complete control.
The consumer is not "started" until you subscribe to the Flux returned from receiver.receive().
Simply delay the flux.subscribe() until you are ready to consume data.
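A minimal sketch of that approach, assuming the kafkaInboundReceiver bean from the question and a consumer.autostart property (the wrapper class, method names and the placeholder processing are illustrative, not part of the original code):

import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;
import reactor.core.Disposable;
import reactor.core.publisher.Mono;
import reactor.kafka.receiver.KafkaReceiver;
import reactor.kafka.receiver.ReceiverRecord;

@Component
public class InboundConsumerLifecycle {

    private final KafkaReceiver<String, Object> kafkaInboundReceiver; // the bean from the question

    @Value("${consumer.autostart:true}")
    private boolean autostart;

    private Disposable subscription;

    public InboundConsumerLifecycle(KafkaReceiver<String, Object> kafkaInboundReceiver) {
        this.kafkaInboundReceiver = kafkaInboundReceiver;
    }

    @EventListener(ApplicationReadyEvent.class)
    public void autoStartIfEnabled() {
        if (autostart) {
            start();
        }
    }

    // can also be called later (e.g. from an endpoint) to start the consumer on demand
    public synchronized void start() {
        if (subscription == null) {
            subscription = kafkaInboundReceiver.receive() // nothing is consumed until subscribe()
                    .flatMap(this::handle)
                    .subscribe();
        }
    }

    public synchronized void stop() {
        if (subscription != null) {
            subscription.dispose(); // cancels the flux and closes the underlying Kafka consumer
            subscription = null;
        }
    }

    private Mono<Void> handle(ReceiverRecord<String, Object> record) {
        // placeholder processing; commit the offset after handling
        return record.receiverOffset().commit();
    }
}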

Related

Multi threading on Kafka Send in Spring reactor Kafka

I have a reactive Kafka application that reads data from a topic, transforms the message and writes to another topic. I have multiple partitions in the topic, so I am creating multiple consumers to read from the topic in parallel. Each consumer runs on a different thread, but it looks like the Kafka send runs on the same thread even though it is called from different consumers.
I tested by logging the thread name to understand the thread workflow. The receive thread name is different for each consumer, but on the Kafka send [kafkaProducerTemplate.send] the thread name [Thread name: producer-1] is the same for all the consumers. I don't understand how that works; I would expect it to be different for all consumers on send as well. Can someone help me understand how this works?
@Bean
public ReceiverOptions<String, String> kafkaReceiverOptions(String topic, KafkaProperties kafkaProperties) {
    ReceiverOptions<String, String> basicReceiverOptions = ReceiverOptions.create(kafkaProperties.buildConsumerProperties());
    return basicReceiverOptions.subscription(Collections.singletonList(topic))
            .addAssignListener(receiverPartitions -> log.debug("onPartitionAssigned {}", receiverPartitions))
            .addRevokeListener(receiverPartitions -> log.debug("onPartitionsRevoked {}", receiverPartitions));
}

@Bean
public ReactiveKafkaConsumerTemplate<String, String> kafkaConsumerTemplate(ReceiverOptions<String, String> kafkaReceiverOptions) {
    return new ReactiveKafkaConsumerTemplate<String, String>(kafkaReceiverOptions);
}

@Bean
public ReactiveKafkaProducerTemplate<String, List<Object>> kafkaProducerTemplate(
        KafkaProperties properties) {
    Map<String, Object> props = properties.buildProducerProperties();
    return new ReactiveKafkaProducerTemplate<String, List<Object>>(SenderOptions.create(props));
}

public void run(String... args) {
    for (int i = 0; i < topicPartitionsCount; i++) {
        readWrite(destinationTopic).subscribe();
    }
}
public Flux<String> readWrite(String destTopic) {
    return kafkaConsumerTemplate
            .receiveAutoAck()
            .doOnNext(consumerRecord -> log.info("received key={}, value={} from topic={}, offset={}",
                    consumerRecord.key(),
                    consumerRecord.value(),
                    consumerRecord.topic(),
                    consumerRecord.offset())
            )
            .doOnNext(consumerRecord -> log.info("Record received from partition {} in thread {}",
                    consumerRecord.partition(), Thread.currentThread().getName()))
            .doOnNext(s -> sendToKafka(s, destTopic))
            .map(ConsumerRecord::value)
            .onErrorContinue((exception, errorConsumer) -> {
                log.error("Error while consuming : {}", exception.getMessage());
            });
}
public void sendToKafka(ConsumerRecord<String, String> consumerRecord, String destTopic) {
    kafkaProducerTemplate.send(destTopic, consumerRecord.key(), transformRecord(consumerRecord))
            .doOnNext(senderResult -> log.info("Record received from partition {} in thread {}",
                    consumerRecord.partition(), Thread.currentThread().getName()))
            .doOnSuccess(senderResult -> {
                log.debug("Sent {} offset : {}", metrics, senderResult.recordMetadata().offset());
            })
            .doOnError(exception -> {
                log.error("Error while sending message to destination topic : {}", exception.getMessage());
            })
            .subscribe();
}
All sends for a producer are run on a single-threaded Scheduler (via .publishOn()).
See DefaultKafkaSender.doSend().
You should create a sender for each consumer.
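As a rough sketch of that suggestion (assuming the kafkaProperties, topicPartitionsCount, destinationTopic and kafkaConsumerTemplate members from the question; the parameter-passing is illustrative), each pipeline can build its own ReactiveKafkaProducerTemplate instead of injecting the single shared bean, so every consumer gets its own underlying sender and therefore its own send thread:

public void run(String... args) {
    Map<String, Object> producerProps = kafkaProperties.buildProducerProperties();
    for (int i = 0; i < topicPartitionsCount; i++) {
        // one template per pipeline -> one underlying sender -> its own single-threaded send scheduler
        ReactiveKafkaProducerTemplate<String, List<Object>> perConsumerTemplate =
                new ReactiveKafkaProducerTemplate<>(SenderOptions.create(producerProps));
        readWrite(destinationTopic, perConsumerTemplate).subscribe();
    }
}

// readWrite(...) and sendToKafka(...) would then accept the per-consumer template as a parameter
// instead of using the shared kafkaProducerTemplate field, e.g.:
public Flux<String> readWrite(String destTopic,
                              ReactiveKafkaProducerTemplate<String, List<Object>> template) {
    return kafkaConsumerTemplate
            .receiveAutoAck()
            .doOnNext(rec -> sendToKafka(rec, destTopic, template))
            .map(ConsumerRecord::value);
}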

Creating a new consumer when consumer stops due to an error in Reactor Kafka

I am working on an application where I have multiple consumers, one for each topic partition, so there is concurrency in reading from the topic. I followed this link to ensure that the consumer gets created again if the existing consumer stops; .repeat() will create the new consumer. I have been trying to test this scenario.
Below is my code along with the test:
@Bean
public ReceiverOptions<String, String> kafkaReceiverOptions(String topic, KafkaProperties kafkaProperties) {
    ReceiverOptions<String, String> basicReceiverOptions = ReceiverOptions.create(kafkaProperties.buildConsumerProperties());
    return basicReceiverOptions.subscription(Collections.singletonList(topic))
            .addAssignListener(receiverPartitions -> log.debug("onPartitionAssigned {}", receiverPartitions))
            .addRevokeListener(receiverPartitions -> log.debug("onPartitionsRevoked {}", receiverPartitions));
}

@Bean
public ReactiveKafkaConsumerTemplate<String, String> kafkaConsumerTemplate(ReceiverOptions<String, String> kafkaReceiverOptions) {
    return new ReactiveKafkaConsumerTemplate<String, String>(kafkaReceiverOptions);
}

@Bean
public ReactiveKafkaProducerTemplate<String, List<Object>> kafkaProducerTemplate(
        KafkaProperties properties) {
    Map<String, Object> props = properties.buildProducerProperties();
    return new ReactiveKafkaProducerTemplate<String, List<Object>>(SenderOptions.create(props));
}

public void run(String... args) {
    for (int i = 0; i < topicPartitionsCount; i++) {
        readWrite(destinationTopic).subscribe();
    }
}
public Flux<String> readWrite(String destTopic) {
    AtomicBoolean repeatConsumer = new AtomicBoolean(false);
    return kafkaConsumerTemplate
            .receiveAutoAck()
            .doOnNext(consumerRecord -> log.debug("received key={}, value={} from topic={}, offset={}",
                    consumerRecord.key(),
                    consumerRecord.value(),
                    consumerRecord.topic(),
                    consumerRecord.offset())
            )
            //.doOnNext(consumerRecord -> log.info("Record received from partition {} in thread {}", consumerRecord.partition(), Thread.currentThread().getName()))
            .doOnNext(s -> sendToKafka(s, destinationTopic))
            .map(ConsumerRecord::value)
            .doOnNext(record -> log.debug("successfully consumed {}={}", Metric[].class.getSimpleName(), record))
            .doOnError(exception -> log.debug("Error occurred while processing the message, attempting retry. Error message: {}", exception.getMessage()))
            .retryWhen(Retry.backoff(Integer.parseInt(retryAttempts), Duration.ofSeconds(Integer.parseInt(retryAttemptsDelay))).transientErrors(true))
            .onErrorContinue((exception, errorConsumerRecord) -> {
                ReceiverRecordException recordException = (ReceiverRecordException) exception;
                log.debug("Retries exhausted for : {}", recordException);
                recordException.getRecord().receiverOffset().acknowledge();
                repeatConsumer.set(true);
            })
            .repeat(repeatConsumer::get); // will create a new consumer if the existing consumer stops
}
public class ReceiverRecordException extends RuntimeException {
    private final ReceiverRecord record;

    ReceiverRecordException(ReceiverRecord record, Throwable t) {
        super(t);
        this.record = record;
    }

    public ReceiverRecord getRecord() {
        return this.record;
    }
}
Test:
@Test
public void readWriteCreatesNewConsumerWhenCurrentConsumerStops() {
    AtomicInteger recordNumber = new AtomicInteger(0);
    Mockito
            .when(reactiveKafkaConsumerTemplate.receiveAutoAck())
            .thenReturn(
                    Flux.create(consumerRecordFluxSink -> {
                        if (recordNumber.getAndIncrement() < 5) {
                            consumerRecordFluxSink.error(new RuntimeException("Kafka down"));
                        } else {
                            consumerRecordFluxSink.next(createConsumerRecord(validMessage));
                            consumerRecordFluxSink.complete();
                        }
                    })
            );
    Flux<String> actual = service.readWrite();
    StepVerifier.create(actual)
            .verifyComplete();
}
When I run the test, I get the retry-exhausted error - onError(reactor.core.Exceptions$RetryExhaustedException: Retries exhausted: 3/3 in a row (3 total)).
My understanding was that onErrorContinue would catch the exception and then continue with the next records, but it looks like it is throwing an exception.
Since it is throwing an exception, how does repeat() work?
I would really appreciate it if someone could help me understand how to test this scenario.

Kafka consumer getting stuck

We are using Kafka streams to insert into PostgreSQL; since the flow is too high, a direct insert is being avoided. The consumer seems to be working well but gets stuck occasionally, and we can't find the root cause.
The consumer has been running for about 6 months and has already consumed billions of records. I don't understand why it's getting stuck of late, and I don't even know where to start debugging.
Below is the code for processing the records:
private static void readFromTopic(DataSource datasource, ConsumerOptions options) {
    KafkaConsumer<String, String> consumer = KafkaConsumerConfig.createConsumerGroup(options);
    Producer<Long, String> producer = KafkaProducerConfig.createKafkaProducer(options);
    if (options.isReadFromAnOffset()) {
        // if we want to assign particular offsets to consume from
        // will work for only a single partition for a consumer
        List<TopicPartition> tpartition = new ArrayList<TopicPartition>();
        tpartition.add(new TopicPartition(options.getTopicName(), options.getPartition()));
        consumer.assign(tpartition);
        consumer.seek(tpartition.get(0), options.getOffset());
    } else {
        // use auto assign partition & offsets
        consumer.subscribe(Arrays.asList(options.getTopicName()));
        log.debug("subscribed to topic {}", options.getTopicName());
    }
    List<Payload> payloads = new ArrayList<>();
    while (true) {
        // the timeout is the time to wait for messages to be received from the broker
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(50));
        if (records.count() != 0)
            log.debug("poll size is {}", records.count());
        Set<TopicPartition> partitions = records.partitions();
        // reading normally as per round robin and the last committed offset
        for (ConsumerRecord<String, String> r : records) {
            log.debug(" Partition : {} Offset : {}", r.partition(), r.offset());
            try {
                JSONArray arr = new JSONArray(r.value());
                for (Object o : arr) {
                    Payload p = JsonIterator.deserialize(((JSONObject) o).toString(), Payload.class);
                    payloads.add(p);
                }
                List<Payload> steplist = new ArrayList<>();
                steplist.addAll(payloads);
                // run a task specified by a Runnable object asynchronously
                CompletableFuture<Void> future = CompletableFuture.runAsync(new Runnable() {
                    @Override
                    public void run() {
                        try {
                            Connection conn = datasource.getConnection();
                            PgInsert.insertIntoPg(steplist, conn, consumer, r, options.getTopicName(),
                                    options.getErrorTopic(), producer);
                        } catch (Exception e) {
                            log.error("error in processing future {}", e);
                        }
                    }
                }, executorService);
                // used to combine all futures
                allfutures.add(future);
                payloads.clear();
            } catch (Exception e) {
                // pushing records which have failed into a new topic
                log.debug("error in kafka consumer {}", e);
                ProducerRecord<Long, String> record = new ProducerRecord<Long, String>(options.getErrorTopic(),
                        r.offset(), r.value());
                producer.send(record);
            }
        }
        // committing after every poll
        consumer.commitSync();
        if (records.count() != 0) {
            Map<TopicPartition, OffsetAndMetadata> metadata = consumer.committed(partitions);
            // reading the committed offsets for each partition after polling
            for (TopicPartition tpartition : partitions) {
                OffsetAndMetadata offsetdata = metadata.get(tpartition);
                if (offsetdata != null && tpartition != null)
                    log.debug("committed offset is " + offsetdata.offset() + " for topic partition "
                            + tpartition.partition());
            }
        }
        // waiting for all threads to complete after each poll
        try {
            waitForFuturesToEnd();
            allfutures.clear();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ExecutionException e) {
            e.printStackTrace();
        }
    }
}
Earlier I thought the reason for it getting stuck was the size of the records being consumed, so I reduced MAX_POLL_RECORDS_CONFIG to 10. This ensures the records fetched in a poll won't exceed 200 KB, since each record can have a maximum size of 20 KB.
I am thinking of using the Spring framework to resolve this issue, but before that I would like to know why exactly the consumer gets stuck. Any insights on this will be helpful.

Reactor Kafka - At-Least-Once - handling failures and offsets in multi partition

Below is the consumer code to receive messages from a Kafka topic (8 partitions) and process them.
@Component
public class MessageConsumer {
    private static final String TOPIC = "mytopic.t";
    private static final String GROUP_ID = "mygroup";
    private final ReceiverOptions consumerSettings;
    private static final Logger LOG = LoggerFactory.getLogger(MessageConsumer.class);

    @Autowired
    public MessageConsumer(@Qualifier("consumerSettings") ReceiverOptions consumerSettings) {
        this.consumerSettings = consumerSettings;
        consumerMessage();
    }

    private void consumerMessage() {
        KafkaReceiver<String, String> receiver = KafkaReceiver.create(receiverOptions(Collections.singleton(TOPIC)));
        Scheduler scheduler = Schedulers.newElastic("FLUX_DEFER", 10, true);
        Flux.defer(receiver::receive)
                .groupBy(m -> m.receiverOffset().topicPartition())
                .flatMap(partitionFlux ->
                        partitionFlux.publishOn(scheduler)
                                .concatMap(m -> {
                                    LOG.info("message received from kafka : " + "key : " + m.key() + " partition: " + m.partition());
                                    return process(m.key(), m.value())
                                            .thenEmpty(m.receiverOffset().commit());
                                }))
                .retryBackoff(5, Duration.ofSeconds(2), Duration.ofHours(2))
                .doOnError(err -> {
                    handleError(err);
                }).retry()
                .doOnCancel(() -> close()).subscribe();
    }

    private void close() {
    }

    private void handleError(Throwable err) {
        LOG.error("kafka stream error : ", err);
    }

    private Mono<Void> process(String key, String value) {
        if (key.equals("error"))
            return Mono.error(new Exception("process error : "));
        LOG.error("message consumed : " + key);
        return Mono.empty();
    }

    public ReceiverOptions<String, String> receiverOptions(Collection<String> topics) {
        return consumerSettings
                .commitInterval(Duration.ZERO)
                .commitBatchSize(0)
                .addAssignListener(p -> LOG.info("Group {} partitions assigned {}", GROUP_ID, p))
                .addRevokeListener(p -> LOG.info("Group {} partitions assigned {}", GROUP_ID, p))
                .subscription(topics);
    }
}
@Bean(name = "consumerSettings")
public ReceiverOptions<String, String> getConsumerSettings() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, GROUP_ID);
    props.put(ConsumerConfig.CLIENT_ID_CONFIG, GROUP_ID);
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    props.put("max.block.ms", "3000");
    props.put("request.timeout.ms", "3000");
    return ReceiverOptions.create(props);
}
On receiving each message, my processing logic returns an empty Mono if the consumed message was processed successfully.
Everything works as expected if there is no error returned in the processing logic.
But if I throw an error to simulate the exception behaviour in my processing logic for a particular message, then I miss processing the message that caused the exception; the stream moves on to the next message.
What I want to achieve is: process the current message and commit the offset if it is successful, then move to the next record.
If there is any exception in processing the message, don't commit the current offset and retry the same message until it succeeds; don't move to the next message until the current message is successful.
Please let me know how to handle processing failures without skipping the message, and how to make the stream start from the offset where the exception was thrown.
Regards,
Vinoth
The below code works for me. The idea is to retry the failed messages a configured number of times; if they still fail, move them to a failed queue and commit the message. At the same time, messages from other partitions are processed concurrently.
If a message from a particular partition fails the configured number of times, restart the stream after a delay so that we can handle dependency failures without hitting them continuously.
@Autowired
public ReactiveMessageConsumer(@Qualifier("consumerSettings") ReceiverOptions consumerSettings, MessageProducer producer) {
    this.consumerSettings = consumerSettings;
    this.fraudCheckService = fraudCheckService;
    this.producer = producer;
    consumerMessage();
}

private void consumerMessage() {
    int numRetries = 3;
    Scheduler scheduler = Schedulers.newElastic("FLUX_DEFER", 10, true);
    KafkaReceiver<String, String> receiver = KafkaReceiver.create(receiverOptions(Collections.singleton(TOPIC)));
    Flux<GroupedFlux<TopicPartition, ReceiverRecord<String, String>>> f = Flux.defer(receiver::receive)
            .groupBy(m -> m.receiverOffset().topicPartition());
    Flux f1 = f.publishOn(scheduler).flatMap(r -> r.publishOn(scheduler).concatMap(b ->
            Flux.just(b)
                    .concatMap(a -> {
                        LOG.error("processing message - order: {} offset: {} partition: {}", a.key(), a.receiverOffset().offset(), a.receiverOffset().topicPartition().partition());
                        return process(a.key(), a.value())
                                .then(a.receiverOffset().commit())
                                .doOnSuccess(d -> LOG.info("committing order {}: offset: {} partition: {} ", a.key(), a.receiverOffset().offset(), a.receiverOffset().topicPartition().partition()))
                                .doOnError(d -> LOG.info("committing offset failed for order {}: offset: {} partition: {} ", a.key(), a.receiverOffset().offset(), a.receiverOffset().topicPartition().partition()));
                    })
                    .retryWhen(companion -> companion
                            .doOnNext(s -> LOG.info(" --> Exception processing message for order {}: offset: {} partition: {} message: {} ", b.key(), b.receiverOffset().offset(), b.receiverOffset().topicPartition().partition(), s.getMessage()))
                            .zipWith(Flux.range(1, numRetries), (error, index) -> {
                                if (index < numRetries) {
                                    LOG.info(" --> Retrying {} order: {} offset: {} partition: {} ", index, b.key(), b.receiverOffset().offset(), b.receiverOffset().topicPartition().partition());
                                    return index;
                                } else {
                                    LOG.info(" --> Retries Exhausted: {} - order: {} offset: {} partition: {}. Message moved to error queue. Commit and proceed to next", index, b.key(), b.receiverOffset().offset(), b.receiverOffset().topicPartition().partition());
                                    producer.sendMessages(ERROR_TOPIC, b.key(), b.value());
                                    b.receiverOffset().commit();
                                    //return index;
                                    throw Exceptions.propagate(error);
                                }
                            })
                            .flatMap(index -> Mono.delay(Duration.ofSeconds((long) Math.pow(1.5, index - 1) * 3)))
                            .doOnNext(s -> LOG.info(" --> Retried at: {} ", LocalTime.now()))
                    )
    ));
    f1.doOnError(a -> {
        LOG.info("Moving to next message because of : ", a);
        try {
            Thread.sleep(5000); // configurable
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }).retry().subscribe();
}
public ReceiverOptions<String, String> receiverOptions(Collection<String> topics) {
    return consumerSettings
            .commitInterval(Duration.ZERO)
            .commitBatchSize(0)
            .addAssignListener(p -> LOG.info("Group {} partitions assigned {}", GROUP_ID, p))
            .addRevokeListener(p -> LOG.info("Group {} partitions assigned {}", GROUP_ID, p))
            .subscription(topics);
}
private Mono<Void> process(OrderId orderId, TraceId traceId) {
    try {
        Thread.sleep(500); // simulate slow response
    } catch (InterruptedException e) {
        // causes the restart
        e.printStackTrace();
    }
    if (orderId.getId().startsWith("error")) // simulate error scenario
        return Mono.error(new Exception("processing message failed for order: " + orderId.getId()));
    return Mono.empty();
}
Create different consumer groups.
Each consumer group would be related to one database.
Create your consumers so that they only process relevant events and push them to the related database. If the database is down, configure the consumer to retry an infinite number of times.
If your consumer dies for any reason, make sure that the new one starts from where the earlier consumer left off. There is a small possibility that your consumer dies right after committing data to the database but before sending the ack to the Kafka broker. You need to update the consumer code to make sure that you process messages exactly once (if needed). A sketch of the first two points follows below.
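As an illustration of the first two points (a hypothetical sketch, not code from the original answer), each target database gets its own group.id, so the groups consume the topic independently and track their own committed offsets:

// hypothetical helper: one ReceiverOptions per target database, differing only in group.id
public ReceiverOptions<String, String> receiverOptionsFor(String databaseName) {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, GROUP_ID + "-" + databaseName); // one group per database
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);               // commit only after the DB write succeeds
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");           // a restarted consumer resumes from the last committed offset
    return ReceiverOptions.<String, String>create(props)
            .subscription(Collections.singleton(TOPIC));
}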

Apache Kafka: Producer Not Producing All Data

I am new to Kafka. My requirement is that I have two tables in the database, source and destination. Now I want to fetch data from the source table and store it in the destination; in between, Kafka will work as the producer and consumer. I have written the code, but the problem is that when the producer produces the data, some records are missed. For example, if I have 100 records in the source table, it does not produce all 100 records. I am using Kafka 0.10.
My Producer Config:
bootstrap.servers=192.168.1.XXX:9092,192.168.1.XXX:9093,192.168.1.XXX:9094
acks=all
retries=2
batch.size=16384
linger.ms=2
buffer.memory=33554432
key.serializer=org.apache.kafka.common.serialization.IntegerSerializer
value.serializer=org.apache.kafka.common.serialization.StringSerializer
My Producer Code:-
public void run() {
    SourceDAO sourceDAO = new SourceDAO();
    Source source;
    int id;
    try {
        logger.debug("INSIDE RUN");
        List<Source> listOfEmployee = sourceDAO.getAllSource();
        Iterator<Source> sourceIterator = listOfEmployee.iterator();
        String sourceJson;
        Gson gson = new Gson();
        while (sourceIterator.hasNext()) {
            source = sourceIterator.next();
            sourceJson = gson.toJson(source);
            id = source.getId();
            producerRecord = new ProducerRecord<Integer, String>(TOPIC, id, sourceJson);
            producerRecords.add(producerRecord);
        }
        for (ProducerRecord<Integer, String> record : producerRecords) {
            logger.debug("Producer Record: " + record.value());
            producer.send(record, new Callback() {
                @Override
                public void onCompletion(RecordMetadata metadata, Exception exception) {
                    logger.debug("Exception: " + exception);
                    if (exception != null)
                        throw new RuntimeException(exception.getMessage());
                    logger.info("The offset of the record we just sent is: " + metadata.offset()
                            + " In Partition : " + metadata.partition());
                }
            });
        }
        producer.close();
        producer.flush();
        logger.info("Size of Record: " + producerRecords.size());
    } catch (SourceServiceException e) {
        logger.error("Unable to Produce data...", e);
        throw new RuntimeException("Unable to Produce data...", e);
    }
}
My Consumer Config:-
bootstrap.servers=192.168.1.XXX:9092,192.168.1.231:XXX,192.168.1.232:XXX
group.id=consume
client.id=C1
enable.auto.commit=true
auto.commit.interval.ms=1000
max.partition.fetch.bytes=10485760
session.timeout.ms=35000
consumer.timeout.ms=35000
auto.offset.reset=earliest
message.max.bytes=10000000
key.deserializer=org.apache.kafka.common.serialization.IntegerDeserializer
value.deserializer=org.apache.kafka.common.serialization.StringDeserializer
Consumer Code:-
public void doWork() {
    logger.debug("Inside doWork of DestinationConsumer");
    DestinationDAO destinationDAO = new DestinationDAO();
    consumer.subscribe(Collections.singletonList(this.TOPIC));
    while (true) {
        ConsumerRecords<String, String> consumerRecords = consumer.poll(1000);
        int minBatchSize = 1;
        for (ConsumerRecord<String, String> rec : consumerRecords) {
            logger.debug("Consumer Received Record: " + rec);
            consumerRecordsList.add(rec);
        }
        logger.debug("Record Size: " + consumerRecordsList.size());
        if (consumerRecordsList.size() >= minBatchSize) {
            try {
                destinationDAO.insertSourceDataIntoDestination(consumerRecordsList);
            } catch (DestinationServiceException e) {
                logger.error("Unable to update destination table");
            }
        }
    }
}
From what can be seen here, I would guess that you did not flush or close the producer properly. Note that send() runs asynchronously and just prepares a batch which is sent later on (depending on the configuration of your producer).
From the Kafka documentation:
The send() method is asynchronous. When called it adds the record to a buffer of pending record sends and immediately returns. This allows the producer to batch together individual records for efficiency.
What you should try is to call producer.close() after you have iterated over all producerRecords (BTW: why are you caching the entire producerRecords? That might cause problems when you have too many records).
If that does not help, you should try to use, e.g., a console consumer to figure out what is missing. Please offer some more code. How is the producer configured? What does your consumer look like? What is the type of producerRecords?
Hope that helps.
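For reference, a minimal sketch of that advice against the question's producer and producerRecords (the callback body is illustrative): hand every record to send() first, then flush(), then close(). In the question's code flush() is called after close(), whereas the usual order is send, flush, close.

for (ProducerRecord<Integer, String> record : producerRecords) {
    producer.send(record, (metadata, exception) -> {
        // illustrative callback: log failures rather than throwing from the producer's I/O thread
        if (exception != null) {
            logger.error("Send failed for record with key " + record.key(), exception);
        } else {
            logger.info("Sent offset " + metadata.offset() + " to partition " + metadata.partition());
        }
    });
}
producer.flush(); // blocks until every buffered record has been sent (or has failed)
producer.close(); // then release the producer's resources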