#KafkaListener : behavior and tracking processing of events - apache-kafka

We are using spring-kafka 2.3.0 in our app . Have observed some processing glitches in the scenarios below with
#Service
#EnableScheduling
public class KafkaService {
public void sendToKafkaProducer(String data) {
kafkaTemplate.send(configuration.getProducer().getTopicName(), data);
}
#KafkaListener(id = "consumer_grpA_id",
topics = "#{__listener.getEnvironmentConfiguration().getConsumer().getTopicName()}", groupId = "consumer_grpA", autoStartup = "false")
public void onMessage(ConsumerRecord<String, String> data) throws Exception {
passA(data);
}
private void passB(String message) {
//counter to keep track of retry attempts
if (counter.containsKey(message.getEventID())) {
//RETRY_COUNT = 5
if (counter.get(message.getEventID()) < RETRY_COUNT) {
retryAgain(message);
}
} else {
firstRetryPass(message);
}
}
private void retryAgain(String message) {
counter.put(message.getEventID(), counter.get(message.getEventID()) + 1);
try {
registry.stop(); //pause the listener
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private void firstRetryPass(String message) {
// First Time Entry for count and time
counter.put(message.getEventID(), 1);
try {
registry.stop();//pause the listener
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private void passA(String message) {
try {
passToTarget(message); //Call target processor
LOGGER.info("Message Processed Successfully to the target");
} catch (Exception e) {
targetUnavailable= true;
passB(message);
}
}
private void passToTarget(String message){
//processor logic, if the target is not available, retry after 15 mins, call passB func
}
#Scheduled(cron = "0 0/15 * 1/1 * ?")
public void scheduledMethod() {
try {
if (targetUnavailable) {
registry.start();
firstTimeStart = false;
}
LOGGER.info(">>>Scheduler Running ?>>>" + registry.isRunning());
} catch (Exception e) {
LOGGER.error(e.getMessage());
}
}
}
On receipt of the first message after a gap in processing, the consumer doesn't pick up the first message. The subsequent messages are processed.
As we don't have the direct access to Kafka topics, we aren't able to identify the process that didn't get picked up from consumer.
How do we track those events that arenot picked up and why is it so.?
We also configured a scheduler whose job is to keep the registry for Kafka running . So is this scheduler required when we already have a listener configured ?
What is the mem and CPU utilization metrics if we keep the listener running. That was one of the reason we used the Kafka registry to stop the listener explicitly whenever the target is down. So need to validate if this approach is sustainable. My hunch is this is against the basic working of Listener, as it's main job is to continue listening for new events irrespective of target status
Edited*

You shouldn't stop the registry on the listener thread unless you use stop(Runnable) - otherwise there will be a deadlock and a delay since the container waits for the listener to exit.
Stopping the container (via the registry) won't actually take effect until any remaining records fetched by the last poll have been processed (unless you set max.poll.records=1.
When the listener exits normally, the record's offset will be committed so that record will not be redelivered on the next start.
You can use the ContainerStoppingErrorHandler for this use case. See here.
Throw an exception and the error handler will stop the container for you.
But that will stop the container on the first try.
If you want retries, use a SeekToCurrentErrorHandler and call the ContainerStoppingErrorHandler from the recoverer after retries are exhausted.

Related

#Transactional with handling error and db-inserts in catch block (Spring Boot)

I would like to rollback a transaction for the data in case of errors and at the same time write the error to db.
I can't manage to do with Transactional Annotations.
Following code produces a runtime-error (1/0) and still writes the data into the db. And also writes the data into the error table.
I tried several variations and followed similar questions in StackOverflow but I didn't succeed to do.
Anyone has a hint, how to do?
#Service
public class MyService{
#Transactional(rollbackFor = Exception.class)
public void updateData() {
try{
processAndPersist(); // <- db operation with inserts
int i = 1/0; // <- Runtime error
}catch (Exception e){
persistError()
trackReportError(filename, e.getMessage());
}
}
#Transactional(propagation = Propagation.REQUIRES_NEW)
public void persistError(String message) {
persistError2Db(message); // <- db operation with insert
}
You need the way to throw an exception in updateData() method to rollback a transaction. And you need to not rollback persistError() transaction at the same time.
#Transactional(rollbackFor = Exception.class)
public void updateData() {
try{
processAndPersist(); // <- db operation with inserts
int i = 1/0; // <- Runtime error
}catch (Exception e){
persistError()
trackReportError(filename, e.getMessage());
throw ex; // if throw error here, will not work
}
}
Just throwing an error will not help because persistError() will have the same transaction as updateData() has. Because persistError() is called using this reference, not a reference to a proxy.
Options to solve
Using self reference.
Using self injection Spring self injection for transactions
Move the call of persistError() outside updateData() (and transaction). Remove #Transactional from persistError() (it will not work) and use transaction of Repository in persistError2Db().
Move persistError() to a separate serface. It will be called using a proxy in this case.
Don't use declarative transactions (with #Transactional annotation). Use Programmatic transaction management to set transaction boundaries manually https://docs.spring.io/spring-framework/docs/3.0.0.M3/reference/html/ch11s06.html
Also keep in mind that persistError() can produce error too (and with high probability will do it).
Using self reference
You can use self reference to MyService to have a transaction, because you will be able to call not a method of MyServiceImpl, but a method of Spring proxy.
#Service
public class MyServiceImpl implements MyService {
public void doWork(MyService self) {
DataEntity data = loadData();
try {
self.updateData(data);
} catch (Exception ex) {
log.error("Error for dataId={}", data.getId(), ex);
self.persistError("Error");
trackReportError(filename, ex);
}
}
#Transactional
public void updateData(DataEntity data) {
persist(data); // <- db operation with inserts
}
#Transactional
public void persistError(String message) {
try {
persistError2Db(message); // <- db operation with insert
} catch (Exception ex) {
log.error("Error for message={}", message, ex);
}
}
}
public interface MyService {
void doWork(MyService self);
void updateData(DataEntity data);
void persistError(String message);
}
To use
MyService service = ...;
service.doWork(service);

PubSub not re-delivering message after ack deadline

I have two subscriber pointing to same subscription of topic use case.
As per document pub sub redeliver the message if subscriber took more time than Acknowledgement deadline to acknowledge the message.
I have configure the default value which is 10 sec. But processing takes approx ~ 1 min to complete and acknowledge.
Below is my sample code
public class SubscribeAsyncExample {
private Subscriber subscriber = null;
#PostConstruct
public void init() throws Exception {
// TODO(developer): Replace these variables before running the sample.
String projectId = "your-project-id";
String subscriptionId = "your-subscription-id";
subscribeAsyncExample(projectId, subscriptionId);
}
public void subscribeAsyncExample(String projectId, String subscriptionId) {
ProjectSubscriptionName subscriptionName = ProjectSubscriptionName.of(projectId, subscriptionId);
// Instantiate an asynchronous message receiver.
MessageReceiver receiver = (PubsubMessage message, AckReplyConsumer consumer) -> {
// Handle incoming message, then ack the received message.
System.out.println("Id: " + message.getMessageId());
System.out.println("Data: " + message.getData().toStringUtf8());
int sleepingTime = 20000;
System.out.println("sleepingTime:" + sleepingTime);
try {
Thread.sleep(sleepingTime);
} catch (InterruptedException e) {
e.printStackTrace();
}
consumer.ack();
System.out.println("test completed");
};
try {
subscriber = Subscriber.newBuilder(subscriptionName, receiver).build();
// Start the subscriber.
subscriber.startAsync().awaitRunning();
System.out.printf("Listening for messages on %s:\n", subscriptionName.toString());
// Allow the subscriber to run for 30s unless an unrecoverable error occurs.
// subscriber.awaitTerminated(30, TimeUnit.SECONDS);
} catch (Exception e) {
e.printStackTrace();
}
}
#PreDestroy
public void preDestroy() throws Exception {
// Shut down the subscriber after 30s. Stop receiving messages.
subscriber.stopAsync();
}
}```
Below is Response
20:53:24,300 INFO [stdout] (Thread-128) Id: 1288313732423842
20:53:24,300 INFO [stdout] (Thread-128) Data: abc13
**20:53:24,300 INFO** [stdout] (Thread-128) sleepingTime:20000
**20:53:44,300 INFO** [stdout] (Thread-128) test completed
When using the Cloud Pub/Sub client libraries, the deadline is automatically extended up to the MaxAckExtensionPeriod specified in the Subscriber.Buider. This extension period defaults to one hour. To change this value, you'd want to change the line that creates the subscriber as follows:
subscriber = Subscriber.newBuilder(subscriptionName, receiver)
.setMaxAckExtensionPeriod(Duration.ofSeconds(10))
.build();

Stop processing kafka messages if something goes wrong during process

In my processor API I store the messages in a key value store and every 100 messages I make a POST request. If something fails while trying to send the messages (api is not responding etc.) I want to stop processing messages. Until there is evidence the API calls work.
Here is my code:
public class BulkProcessor implements Processor<byte[], UserEvent> {
private KeyValueStore<Integer, ArrayList<UserEvent>> keyValueStore;
private BulkAPIClient bulkClient;
private String storeName;
private ProcessorContext context;
private int count;
#Autowired
public BulkProcessor(String storeName, BulkClient bulkClient) {
this.storeName = storeName;
this.bulkClient = bulkClient;
}
#Override
public void init(ProcessorContext context) {
this.context = context;
keyValueStore = (KeyValueStore<Integer, ArrayList<UserEvent>>) context.getStateStore(storeName);
count = 0;
// to check every 15 minutes if there are any remainders in the store that are not sent yet
this.context.schedule(Duration.ofMinutes(15), PunctuationType.WALL_CLOCK_TIME, (timestamp) -> {
if (count > 0) {
sendEntriesFromStore();
}
});
}
#Override
public void process(byte[] key, UserEvent value) {
int userGroupId = Integer.valueOf(value.getUserGroupId());
ArrayList<UserEvent> userEventArrayList = keyValueStore.get(userGroupId);
if (userEventArrayList == null) {
userEventArrayList = new ArrayList<>();
}
userEventArrayList.add(value);
keyValueStore.put(userGroupId, userEventArrayList);
if (count == 100) {
sendEntriesFromStore();
}
}
private void sendEntriesFromStore() {
KeyValueIterator<Integer, ArrayList<UserEvent>> iterator = keyValueStore.all();
while (iterator.hasNext()) {
KeyValue<Integer, ArrayList<UserEvent>> entry = iterator.next();
BulkRequest bulkRequest = new BulkRequest(entry.key, entry.value);
if (bulkRequest.getLocation() != null) {
URI url = bulkClient.buildURIPath(bulkRequest);
try {
bulkClient.postRequestBulkApi(url, bulkRequest);
keyValueStore.delete(entry.key);
} catch (BulkApiException e) {
logger.warn(e.getMessage(), e.fillInStackTrace());
}
}
}
iterator.close();
count = 0;
}
#Override
public void close() {
}
}
Currently in my code if a call to the API fails it will iterate the next 100 (and this will keep happening as long as it fails) and add them to the keyValueStore. I don't want this to happen. Instead I would prefer to stop the stream and continue once the keyValueStore is emptied. Is that possible?
Could I throw a StreamsException?
try {
bulkClient.postRequestBulkApi(url, bulkRequest);
keyValueStore.delete(entry.key);
} catch (BulkApiException e) {
throw new StreamsException(e);
}
Would that kill my stream app and so the process dies?
You should only delete the record from state store after you make sure your record is successfully processed by the API, so remove the first keyValueStore.delete(entry.key); and keep the second one. If not then you can potentially lost some messages when keyValueStore.delete is committed to underlying changelog topic but your messages are not successfully process yet, so it's only at most one guarantee.
Just wrap the calling API code around an infinite loop and keep trying until the record successfully processed, your processor will not consume new message from above processor node cause it's running in a same StreamThread:
private void sendEntriesFromStore() {
KeyValueIterator<Integer, ArrayList<UserEvent>> iterator = keyValueStore.all();
while (iterator.hasNext()) {
KeyValue<Integer, ArrayList<UserEvent>> entry = iterator.next();
//remove this state store delete code : keyValueStore.delete(entry.key);
BulkRequest bulkRequest = new BulkRequest(entry.key, entry.value);
if (bulkRequest.getLocation() != null) {
URI url = bulkClient.buildURIPath(bulkRequest);
while (true) {
try {
bulkClient.postRequestBulkApi(url, bulkRequest);
keyValueStore.delete(entry.key);//only delete after successfully process the message to achieve at least one processing guarantee
break;
} catch (BulkApiException e) {
logger.warn(e.getMessage(), e.fillInStackTrace());
}
}
}
}
iterator.close();
count = 0;
}
Yes you could throw a StreamsException, this StreamTask will be migrate to another StreamThread during re-balance, maybe on the sample application instance. If the API keep causing Exception until all StreamThread had died, your application will not automatically exit and receive below Exception, you should add a custom StreamsException handler to exit your app when all stream threads had died using KafkaStreams#setUncaughtExceptionHandler or listen to Stream State change (to ERROR state):
All stream threads have died. The instance will be in error state and should be closed.
In the end I used a simple KafkaConsumer instead of KafkaStreams, but the bottom line was that I changed the BulkApiException to extend RuntimeException, which I throw again after I log it. So now it looks as follows:
} catch (BulkApiException bae) {
logger.error(bae.getMessage(), bae.fillInStackTrace());
throw new BulkApiException();
} finally {
consumer.close();
int exitCode = SpringApplication.exit(ctx, () -> 1);
System.exit(exitCode);
}
This way the application is exited and the k8s restarts the pod. That was because if the api where I'm trying to forward the requests is down, then there is no point on continue reading messages. So until the other api is back up k8s will restart a pod.

Exception Handling on sending a message via Kafka Template

I was wondering if there is a way to catch an exception/throwable when producing a kafka message via the Kafka Template.
I don't see anything where it would throw a KafkaException if something fails. I need to find out if the message was committed to Kafka before I can continue with my application flow. I know the ListenableFuture would log a failure but I don't know how to capture that onFailure.
public void sendToKafkaTopic(KafkaMessage data) {
ListenableFuture<SendResult<String, KafkaMessage>> future = kafkaTemplate.send(primaryKafkaTopic, data);
future.addCallback(new ListenableFutureCallback<SendResult<String, KafkaMessage>>() {
#Override
public void onSuccess(SendResult<String, KafkaMessage> result) {
log.info("sent message='{}' with offset={}", data,
result.getRecordMetadata().offset());
}
#Override
public void onFailure(Throwable ex) {
log.error("unable to send message='{}'", data, ex);
}
});
}
I was trying to find something like this:
public void sendToKafkaTopic(KafkaMessage data) {
try {
kafkaTemplate.send(primaryKafkaTopic, data);
} catch (RuntimeException err) {
log.error("unable to send message='{}'", data, ex);
throw new CustomException(err);
}
}
}
before I can continue with my application flow
There is nothing wrong if you just do Future.get(). When an exception happens downstream, it is going to be thrown to you from this blocking get().

rxjava: queue scheduler with default idle job

I have a client server application and I'm using rxjava to do server requests from the client. The client should only do one request at a time so I intent to use a thread queue scheduler similar to the trampoline scheduler.
Now I try to implement a mechanism to watch changes on the server. Therefore I send a long living request that blocks until the server has some changes and sends back the result (long pull).
This long pull request should only run when the job queue is idle. I'm looking for a way to automatically stop the watch request when a regular request is scheduled and start it again when the queue becomes empty. I thought about modifying the trampoline scheduler to get this behavior but I have the feeling that this is a common problem and there might be an easier solution?
You can hold onto the Subscription returned by scheduling the long poll task, unsubscribe it if the queue becomes non-empty and re-schedule if the queue becomes empty.
Edit: here is an example with the basic ExecutorScheduler:
import java.util.concurrent.*;
import java.util.concurrent.atomic.*;
public class IdleScheduling {
static final class TaskQueue {
final ExecutorService executor;
final AtomicReference<Future<?>> idleFuture;
final Runnable idleRunnable;
final AtomicInteger wip;
public TaskQueue(Runnable idleRunnable) {
this.executor = Executors.newFixedThreadPool(1);
this.idleRunnable = idleRunnable;
this.idleFuture = new AtomicReference<>();
this.wip = new AtomicInteger();
this.idleFuture.set(executor.submit(idleRunnable));
}
public void shutdownNow() {
executor.shutdownNow();
}
public Future<?> enqueue(Runnable task) {
if (wip.getAndIncrement() == 0) {
idleFuture.get().cancel(true);
}
return executor.submit(() -> {
task.run();
if (wip.decrementAndGet() == 0) {
startIdle();
}
});
}
void startIdle() {
idleFuture.set(executor.submit(idleRunnable));
}
}
public static void main(String[] args) throws Exception {
TaskQueue tq = new TaskQueue(() -> {
while (!Thread.currentThread().isInterrupted()) {
try {
Thread.sleep(1000);
} catch (InterruptedException ex) {
System.out.println("Idle interrupted...");
return;
}
System.out.println("Idle...");
}
});
try {
Thread.sleep(1500);
tq.enqueue(() -> System.out.println("Work 1"));
Thread.sleep(500);
tq.enqueue(() -> {
System.out.println("Work 2");
try {
Thread.sleep(500);
} catch (InterruptedException ex) {
}
});
tq.enqueue(() -> System.out.println("Work 3"));
Thread.sleep(1500);
} finally {
tq.shutdownNow();
}
}
}