How to read the Header values in the Batch listener error handling scenario - apache-kafka

I am trying to handle exceptions at the listener:
@KafkaListener(id = PropertiesUtil.ID,
        topics = "#{'${kafka.consumer.topic}'}",
        groupId = "${kafka.consumer.group.id.config}",
        containerFactory = "containerFactory",
        errorHandler = "errorHandler")
public void receiveEvents(@Payload List<ConsumerRecord<String, String>> consumerRecordList,
                          Acknowledgment acknowledgment) {
    try {
        log.info("Consuming the batch of size {} from kafka topic {}", consumerRecordList.size(),
                consumerRecordList.get(0).topic());
        processEvent(consumerRecordList);
        incrementOffset(acknowledgment);
    } catch (Exception exception) {
        throwOrHandleExceptions(exception, consumerRecordList, acknowledgment);
        .........
    }
}
The Kafka container config:
@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> containerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConcurrency(this.numberOfConsumers);
    factory.getContainerProperties().setAckOnError(false);
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
    factory.setConsumerFactory(getConsumerFactory());
    factory.setBatchListener(true);
    return factory;
}
The listener error handler implementation:
@Bean
public ConsumerAwareListenerErrorHandler errorHandler() {
    return (m, e, c) -> {
        MessageHeaders headers = m.getHeaders();
        List<String> topics = headers.get(KafkaHeaders.RECEIVED_TOPIC, List.class);
        List<Integer> partitions = headers.get(KafkaHeaders.RECEIVED_PARTITION_ID, List.class);
        List<Long> offsets = headers.get(KafkaHeaders.OFFSET, List.class);
        Map<TopicPartition, Long> offsetsToReset = new HashMap<>();
        for (int i = 0; i < topics.size(); i++) {
            int index = i;
            offsetsToReset.compute(new TopicPartition(topics.get(i), partitions.get(i)),
                    (k, v) -> v == null ? offsets.get(index) : Math.min(v, offsets.get(index)));
        }
        ...
    };
}
When I run the same listener without batch processing, I am able to fetch the partition, topic and offset values. But when I enable batch processing and test it, the headers contain only two values, id and timestamp, and the other values are not set. Am I missing anything here?

What version are you using? I just tested it with Boot 2.2.4 (SK 2.3.5) and it works fine...
@SpringBootApplication
public class So60152179Application {

    public static void main(String[] args) {
        SpringApplication.run(So60152179Application.class, args);
    }

    @KafkaListener(id = "so60152179", topics = "so60152179", errorHandler = "eh")
    public void listen(List<String> in) {
        throw new RuntimeException("foo");
    }

    @Bean
    public ConsumerAwareListenerErrorHandler eh() {
        return (m, e, c) -> {
            System.out.println(m);
            return null;
        };
    }

    @Bean
    public ApplicationRunner runner(KafkaTemplate<String, String> template) {
        return args -> {
            template.send("so60152179", "foo");
        };
    }

    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("so60152179").partitions(1).replicas(1).build();
    }

}
spring.kafka.listener.type=batch
spring.kafka.consumer.auto-offset-reset=earliest
GenericMessage [payload=[foo], headers={kafka_offset=[0], kafka_nativeHeaders=[RecordHeaders(headers = [], isReadOnly = false)], kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer#2f2e787f, kafka_timestampType=[CREATE_TIME], kafka_receivedMessageKey=[null], kafka_receivedPartitionId=[0], kafka_receivedTopic=[so60152179], kafka_receivedTimestamp=[1581351585253], kafka_groupId=so60152179}]
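As the output shows, in batch mode each of these headers holds a list with one element per record (kafka_receivedTopic=[so60152179], kafka_offset=[0], and so on). For comparison, a record (non-batch) listener's error handler sees scalar header values; here is a minimal sketch of that variant (not part of the original answer, and the bean name is illustrative):
@Bean
public ConsumerAwareListenerErrorHandler recordModeErrorHandler() {
    return (m, e, c) -> {
        MessageHeaders headers = m.getHeaders();
        // In record mode each header carries a single value instead of a List
        String topic = headers.get(KafkaHeaders.RECEIVED_TOPIC, String.class);
        Integer partition = headers.get(KafkaHeaders.RECEIVED_PARTITION_ID, Integer.class);
        Long offset = headers.get(KafkaHeaders.OFFSET, Long.class);
        System.out.println("failed record at " + topic + "-" + partition + "@" + offset);
        return null;
    };
}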

Related

Mixing Kafka Streams DSL with Processor API to get offset

I am trying to find a way to log the offset when an exception occurs.
Here is what I am trying to achieve:
void createTopology(StreamsBuilder builder) {
    builder.stream(topic, Consumed.with(Serdes.String(), new JsonSerde()))
            .filter(...)
            .mapValues(value -> {
                Map<String, Object> output;
                try {
                    output = decode(value.get("data"));
                } catch (DecodingException e) {
                    LOGGER.error(e.getMessage());
                    // TODO: LOG OFFSET FOR FAILED DECODE HERE
                    return new ArrayList<>();
                }
                ...
                return output;
            })
            .filter((k, v) -> !(v instanceof List && ((List<?>) v).isEmpty()))
            .to(sink_topic);
}
I found this: https://docs.confluent.io/platform/current/streams/developer-guide/dsl-api.html#streams-developer-guide-dsl-transformations-stateful
My understanding is that I need to use the Processor API, but I still haven't found a solution for my issue.
A ValueTransformer can also access the offset via the ProcessorContext passed to init, and I believe it's much easier.
Here is the solution, as suggested by IUSR: https://stackoverflow.com/a/73465691/14945779 (thank you):
static class InjectOffsetTransformer implements ValueTransformer<JsonObject, JsonObject> {

    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
    }

    @Override
    public JsonObject transform(JsonObject value) {
        value.addProperty("offset", context.offset());
        return value;
    }

    @Override
    public void close() {
    }
}
void createTopology(StreamsBuilder builder) {
    builder.stream(topic, Consumed.with(Serdes.String(), new JsonSerde()))
            .filter(...)
            .transformValues(InjectOffsetTransformer::new)
            .mapValues(value -> {
                Map<String, Object> output;
                try {
                    output = decode(value.get("data"));
                } catch (DecodingException e) {
                    LOGGER.warn(String.format("Error reading from topic %s. Last read offset %s:", topic, lastReadOffset), e);
                    return new ArrayList<>();
                }
                lastReadOffset = value.get("offset").getAsLong();
                return output;
            })
            .filter((k, v) -> !(v instanceof List && ((List<?>) v).isEmpty()))
            .to(sink_topic);
}

Creating a new consumer when the consumer stops due to an error in Reactor Kafka

I am working on an application where I have multiple consumers, one for each topic partition, so there is concurrency in reading from the topic. I followed this link to ensure that a new consumer gets created if the existing consumer stops; .repeat() will create the new consumer. I have been trying to test this scenario.
Below is my code along with the test:
@Bean
public ReceiverOptions<String, String> kafkaReceiverOptions(String topic, KafkaProperties kafkaProperties) {
    ReceiverOptions<String, String> basicReceiverOptions = ReceiverOptions.create(kafkaProperties.buildConsumerProperties());
    return basicReceiverOptions.subscription(Collections.singletonList(topic))
            .addAssignListener(receiverPartitions -> log.debug("onPartitionAssigned {}", receiverPartitions))
            .addRevokeListener(receiverPartitions -> log.debug("onPartitionsRevoked {}", receiverPartitions));
}

@Bean
public ReactiveKafkaConsumerTemplate<String, String> kafkaConsumerTemplate(ReceiverOptions<String, String> kafkaReceiverOptions) {
    return new ReactiveKafkaConsumerTemplate<String, String>(kafkaReceiverOptions);
}

@Bean
public ReactiveKafkaProducerTemplate<String, List<Object>> kafkaProducerTemplate(
        KafkaProperties properties) {
    Map<String, Object> props = properties.buildProducerProperties();
    return new ReactiveKafkaProducerTemplate<String, List<Object>>(SenderOptions.create(props));
}

public void run(String... args) {
    for (int i = 0; i < topicPartitionsCount; i++) {
        readWrite(destinationTopic).subscribe();
    }
}
public Flux<String> readWrite(String destTopic) {
    AtomicBoolean repeatConsumer = new AtomicBoolean(false);
    return kafkaConsumerTemplate
            .receiveAutoAck()
            .doOnNext(consumerRecord -> log.debug("received key={}, value={} from topic={}, offset={}",
                    consumerRecord.key(),
                    consumerRecord.value(),
                    consumerRecord.topic(),
                    consumerRecord.offset())
            )
            //.doOnNext(consumerRecord -> log.info("Record received from partition {} in thread {}", consumerRecord.partition(), Thread.currentThread().getName()))
            .doOnNext(s -> sendToKafka(s, destTopic))
            .map(ConsumerRecord::value)
            .doOnNext(record -> log.debug("successfully consumed {}={}", Metric[].class.getSimpleName(), record))
            .doOnError(exception -> log.debug("Error occurred while processing the message, attempting retry. Error message: {}", exception.getMessage()))
            .retryWhen(Retry.backoff(Integer.parseInt(retryAttempts), Duration.ofSeconds(Integer.parseInt(retryAttemptsDelay))).transientErrors(true))
            .onErrorContinue((exception, errorConsumerRecord) -> {
                ReceiverRecordException recordException = (ReceiverRecordException) exception;
                log.debug("Retries exhausted for: {}", recordException);
                recordException.getRecord().receiverOffset().acknowledge();
                repeatConsumer.set(true);
            })
            .repeat(repeatConsumer::get); // will create a new consumer if the existing consumer stops
}
public class ReceiverRecordException extends RuntimeException {

    private final ReceiverRecord record;

    ReceiverRecordException(ReceiverRecord record, Throwable t) {
        super(t);
        this.record = record;
    }

    public ReceiverRecord getRecord() {
        return this.record;
    }
}
Test:
@Test
public void readWriteCreatesNewConsumerWhenCurrentConsumerStops() {
    AtomicInteger recordNumber = new AtomicInteger(0);
    Mockito
            .when(reactiveKafkaConsumerTemplate.receiveAutoAck())
            .thenReturn(
                    Flux.create(consumerRecordFluxSink -> {
                        if (recordNumber.getAndIncrement() < 5) {
                            consumerRecordFluxSink.error(new RuntimeException("Kafka down"));
                        } else {
                            consumerRecordFluxSink.next(createConsumerRecord(validMessage));
                            consumerRecordFluxSink.complete();
                        }
                    })
            );
    Flux<String> actual = service.readWrite();
    StepVerifier.create(actual)
            .verifyComplete();
}
When I run the test, I get the retry-exhausted error: onError(reactor.core.Exceptions$RetryExhaustedException: Retries exhausted: 3/3 in a row (3 total)).
My understanding was that onErrorContinue would catch the exception and then continue with the next records, but it looks like the error is still being propagated. Since an exception is thrown, how does repeat() work?
I would really appreciate it if someone could help me understand how to test this scenario.
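For what it's worth, the terminal signal seen above can be reproduced outside Kafka. The following is a minimal, self-contained sketch (class name and values are illustrative, using plain reactor-core and reactor-test rather than the original service): when Retry.backoff(...) exhausts its attempts the sequence terminates with a RetryExhaustedException, which onErrorContinue does not swallow here, so repeat() never gets a chance to resubscribe, matching the observed test failure.
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;

import reactor.core.publisher.Flux;
import reactor.test.StepVerifier;
import reactor.util.retry.Retry;

public class RetryExhaustedSketch {

    public static void main(String[] args) {
        AtomicInteger subscriptions = new AtomicInteger();

        // A source that fails on every subscription, like the mocked receiveAutoAck()
        Flux<String> source = Flux.defer(() -> {
            subscriptions.incrementAndGet();
            return Flux.error(new RuntimeException("Kafka down"));
        });

        Flux<String> flux = source
                .retryWhen(Retry.backoff(3, Duration.ofMillis(10)).transientErrors(true))
                .onErrorContinue((ex, value) -> System.out.println("dropped: " + ex));

        // The terminal signal is Exceptions$RetryExhaustedException, not a completion,
        // so a plain verifyComplete() cannot succeed.
        StepVerifier.create(flux)
                .expectErrorMatches(ex -> ex.getMessage().contains("Retries exhausted"))
                .verify();

        System.out.println("subscriptions: " + subscriptions.get()); // 1 original + 3 retries
    }
}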

How to Handle a Kafka Record with a Class-Level @KafkaListener with no @KafkaHandler for the Record Value

Normally, when we define a class-level @KafkaListener and method-level @KafkaHandlers, we can define a default @KafkaHandler to handle unexpected payloads.
https://docs.spring.io/spring-kafka/docs/current/reference/html/#class-level-kafkalistener
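For context, such a default handler looks roughly like this (a minimal sketch, not taken from the post; the class, group and topic names are illustrative):
@Component
@KafkaListener(id = "exampleGroup", topics = "exampleTopic")
class ExampleListener {

    @KafkaHandler
    void handleInteger(Integer in) {
        System.out.println("int: " + in);
    }

    // Invoked when no other @KafkaHandler matches the converted payload type.
    @KafkaHandler(isDefault = true)
    void handleUnknown(Object in) {
        System.out.println("unexpected payload: " + in);
    }
}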
But, what should we do if we don't have a default method?
With version 2.6 and later, you can configure a SeekToCurrentErrorHandler to immediately send such messages to a dead letter topic, by examining the exception.
Here is a simple Spring Boot application that demonstrates the technique:
@SpringBootApplication
public class So59256214Application {

    public static void main(String[] args) {
        SpringApplication.run(So59256214Application.class, args);
    }

    @Bean
    public NewTopic topic1() {
        return TopicBuilder.name("so59256214").partitions(1).replicas(1).build();
    }

    @Bean
    public NewTopic topic2() {
        return TopicBuilder.name("so59256214.DLT").partitions(1).replicas(1).build();
    }

    @KafkaListener(id = "so59256214.DLT", topics = "so59256214.DLT")
    void listen(ConsumerRecord<?, ?> in) {
        System.out.println("dlt: " + in);
    }

    @Bean
    public ApplicationRunner runner(KafkaTemplate<String, Object> template) {
        return args -> {
            template.send("so59256214", 42);
            template.send("so59256214", 42.0);
            template.send("so59256214", "No handler for this");
        };
    }

    @Bean
    ErrorHandler eh(KafkaOperations<String, Object> template) {
        SeekToCurrentErrorHandler eh = new SeekToCurrentErrorHandler(new DeadLetterPublishingRecoverer(template));
        BackOff neverRetryOrBackOff = new FixedBackOff(0L, 0);
        BackOff normalBackOff = new FixedBackOff(2000L, 3);
        eh.setBackOffFunction((rec, ex) -> {
            if (ex.getMessage().contains("No method found for class")) {
                return neverRetryOrBackOff;
            }
            else {
                return normalBackOff;
            }
        });
        return eh;
    }

}

@Component
@KafkaListener(id = "so59256214", topics = "so59256214")
class Listener {

    @KafkaHandler
    void integerHandler(Integer in) {
        System.out.println("int: " + in);
    }

    @KafkaHandler
    void doubleHandler(Double in) {
        System.out.println("double: " + in);
    }

}
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.JsonDeserializer
spring.kafka.producer.value-serializer=org.springframework.kafka.support.serializer.JsonSerializer
Result:
int: 42
double: 42.0
dlt: ConsumerRecord(topic = so59256214.DLT, ...

How can we club all the values for the same key into a list and return a Kafka stream with key and value as String

I have data coming in on a Kafka topic as (key:id, {id:1, body:...}), meaning the key for the message is the same as the id. However, there can be multiple messages with the same id but a different body, so I am getting a KStream<String, String>.
Now I want to get all the messages having the same id (key), club all the values into a list, and return a KStream<String, List<String>>.
Any suggestions?
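As a point of comparison with the state-store approach below, the same per-key grouping can also be expressed with the Streams DSL. This is a hedged sketch (not part of the original post) that reuses the ListSerde shown further down; the topic name and class name are illustrative, and it assumes a reasonably recent Kafka Streams version that has Grouped and Materialized:
import java.util.ArrayList;
import java.util.List;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Materialized;

public class GroupValuesByKeySketch {

    static KStream<String, List<String>> buildStream(StreamsBuilder builder) {
        KStream<String, String> input = builder.stream("input-topic",
                Consumed.with(Serdes.String(), Serdes.String()));

        return input
                .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
                .aggregate(
                        ArrayList::new,                                          // start with an empty list per key
                        (key, value, list) -> { list.add(value); return list; }, // append each value for the key
                        Materialized.with(Serdes.String(), new ListSerde<>(Serdes.String())))
                .toStream();                                                     // KStream<String, List<String>>
    }
}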
// Create a stream with a state store
StreamsBuilder builder = new StreamsBuilder();
StoreBuilder<KeyValueStore<String, List<String>>> logTracerStateStore = Stores.keyValueStoreBuilder(
        Stores.persistentKeyValueStore(LOG_TRACE_STATE_STORE), Serdes.String(),
        new ListSerde<String>(Serdes.String()));
// add this to the stream builder
builder.addStateStore(logTracerStateStore);
KStream<String, String> kafkaStream = builder.stream(TOPIC);
splitProcessor(kafkaStream);
logger.info("creating stream for topic {} ..", TOPIC);
final Topology topology = builder.build();
return new KafkaStreams(topology, streamConfiguration(bootstrapServers));
// Stream List Serde
public class ListSerde<T> implements Serde<List<T>> {

    private final Serde<List<T>> inner;

    public ListSerde(final Serde<T> avroSerde) {
        inner = Serdes.serdeFrom(new ListSerializer<>(avroSerde.serializer()),
                new ListDeserializer<>(avroSerde.deserializer()));
    }

    @Override
    public Serializer<List<T>> serializer() {
        return inner.serializer();
    }

    @Override
    public Deserializer<List<T>> deserializer() {
        return inner.deserializer();
    }

    @Override
    public void configure(final Map<String, ?> configs, final boolean isKey) {
        inner.serializer().configure(configs, isKey);
        inner.deserializer().configure(configs, isKey);
    }

    @Override
    public void close() {
        inner.serializer().close();
        inner.deserializer().close();
    }
}
// Serializer & deserializer
public class ListSerializer<T> implements Serializer<List<T>> {

    // private final Comparator<T> comparator;
    private final Serializer<T> valueSerializer;

    public ListSerializer(final Serializer<T> valueSerializer) {
        // this.comparator = comparator;
        this.valueSerializer = valueSerializer;
    }

    @Override
    public void configure(final Map<String, ?> configs, final boolean isKey) {
        // do nothing
    }

    @Override
    public byte[] serialize(final String topic, final List<T> list) {
        final int size = list.size();
        final ByteArrayOutputStream baos = new ByteArrayOutputStream();
        final DataOutputStream out = new DataOutputStream(baos);
        final Iterator<T> iterator = list.iterator();
        try {
            out.writeInt(size);
            while (iterator.hasNext()) {
                final byte[] bytes = valueSerializer.serialize(topic, iterator.next());
                out.writeInt(bytes.length);
                out.write(bytes);
            }
            out.close();
        } catch (final IOException e) {
            throw new RuntimeException("unable to serialize List", e);
        }
        return baos.toByteArray();
    }

    @Override
    public void close() {
    }
}
//------------
public class ListDeserializer<T> implements Deserializer<List<T>> {

    // private final Comparator<T> comparator;
    private final Deserializer<T> valueDeserializer;

    public ListDeserializer(final Deserializer<T> valueDeserializer) {
        // this.comparator = comparator;
        this.valueDeserializer = valueDeserializer;
    }

    @Override
    public void configure(final Map<String, ?> configs, final boolean isKey) {
        // do nothing
    }

    @Override
    public List<T> deserialize(final String s, final byte[] bytes) {
        if (bytes == null || bytes.length == 0) {
            return null;
        }
        final List<T> list = new ArrayList<>();
        final DataInputStream dataInputStream = new DataInputStream(new ByteArrayInputStream(bytes));
        try {
            final int records = dataInputStream.readInt();
            for (int i = 0; i < records; i++) {
                final byte[] valueBytes = new byte[dataInputStream.readInt()];
                // readFully guarantees the whole entry is read (read() may return fewer bytes)
                dataInputStream.readFully(valueBytes);
                list.add(valueDeserializer.deserialize(s, valueBytes));
            }
        } catch (final IOException e) {
            throw new RuntimeException("Unable to deserialize List", e);
        } finally {
            try {
                dataInputStream.close();
            } catch (Exception e2) {
                // ignore close failure
            }
        }
        return list;
    }

    @Override
    public void close() {
    }
}
// Now create the stream processor
public class LogTraceStreamStateProcessor implements Processor<String, String> {

    private static final Logger logger = Logger.getLogger(LogTraceStreamStateProcessor.class);

    IStore stateStore;

    /**
     * Initialize the processor.
     */
    @Override
    public void init(ProcessorContext context) {
        logger.info("initializing processor and looking for monitoring store");
        stateStore = MonitoringStateStoreFactory.getInstance().getStore();
        logger.debug("found the monitoring store - {} ", stateStore);
        stateStore.initLogTraceStoreProcess(context);
        logger.debug("initializing monitoring store.");
    }

    @Override
    public void process(String key, String value) {
        logger.debug("Storing the value for logtrace storage - {} ", value);
        stateStore.storeLogTrace(value);
        logger.debug("finished storing the value for logtrace storage - {} ", value);
    }

    @Override
    public void close() {
    }
}
// access the key value state store like below
KeyValueStore<String, List<String>> stateStore = (KeyValueStore<String, List<String>>) traceStreamContext.getStateStore(EXEID_REQ_REL_STORE);
// Now add a new list for the key if it is a new message; if the key already exists, add the new message to the existing list
public void storeTraceData(String traceData) {
    try {
        TraceEvent tracer = new TraceEvent();
        logger.debug("Received the Trace value - {}", traceData);
        tracer = mapper.readValue(traceData, TraceEvent.class);
        logger.debug("trace unmarshalling has been completed successfully !!!");
        String key = tracer.getExecutionId();
        List<String> listEvents = stateStore.get(key);
        if (listEvents != null && !listEvents.isEmpty()) {
            logger.debug("event is already in store so storing in the list for execution id - {}", key);
            listEvents.add(requestId);
            stateStore.put(key, listEvents);
        } else {
            logger.debug(
                    "event is not present in the store so creating a new list and adding into store for execution id - {}",
                    key);
            List<String> list = new ArrayList<>();
            list.add(requestId);
            stateStore.put(key, list);
        }
    } catch (Throwable e) {
        logger.error("exception while processing the trace event .. ", e);
    } finally {
        try {
            traceStreamContext.commit();
        } catch (Exception e2) {
            e2.printStackTrace();
        }
    }
}
// Now this is how you can access the messages from the state store
public ReadOnlyKeyValueStore<String, List<String>> tracerStore() {
    return waitUntilStoreIsQueryable(KEY_NAME);
}

How to evaluate consuming time in a Kafka Streams application

I have a 1.0.0 Kafka Streams application with the two classes below, FilterByPolicyStreamsApp and FilterByPolicyTransformerSupplier. In my application I read the events, perform some conditional checks, and forward them to another topic on the same Kafka cluster. I am able to measure the producing time with the eventsForwardTimeInMs variable in the FilterByPolicyTransformerSupplier class, but I am unable to get the consuming time (with and without (de)serialization). How can I get this time? Please help me. (One possible way to observe consume-side latencies via the clients' built-in metrics is sketched after the code below.)
FilterByPolicyStreamsApp.java:
public class FilterByPolicyStreamsApp implements CommandLineRunner {

    String policyKafkaTopicName = "policy";
    String policyFilterDataKafkaTopicName = "policy.filter.data";
    String bootstrapServers = "11.1.1.1:9092";
    String sampleEventsKafkaTopicName = "sample-.*";
    String applicationId = "filter-by-policy-app";
    String policyFilteredEventsKafkaTopicName = "policy.filter.events";

    public static void main(String[] args) {
        SpringApplication.run(FilterByPolicyStreamsApp.class, args);
    }

    @Override
    public void run(String... arg0) {
        String policyGlobalTableName = policyKafkaTopicName + ".table";
        String policyFilterDataGlobalTable = policyFilterDataKafkaTopicName + ".table";
        Properties config = new Properties();
        config.put(StreamsConfig.APPLICATION_ID_CONFIG, applicationId);
        config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        config.put(StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG, WallclockTimestampExtractor.class);
        KStreamBuilder builder = new KStreamBuilder();
        builder.globalTable(Serdes.String(), new JsonSerde<>(List.class), policyKafkaTopicName,
                policyGlobalTableName);
        builder.globalTable(Serdes.String(), new JsonSerde<>(PolicyFilterData.class), policyFilterDataKafkaTopicName,
                policyFilterDataGlobalTable);
        KStream<String, SampleEvent> events = builder.stream(Serdes.String(),
                new JsonSerde<>(SampleEvent.class), Pattern.compile(sampleEventsKafkaTopicName));
        events = events.transform(new FilterByPolicyTransformerSupplier(policyGlobalTableName,
                policyFilterDataGlobalTable));
        events.to(Serdes.String(), new JsonSerde<>(SampleEvent.class), policyFilteredEventsKafkaTopicName);
        KafkaStreams streams = new KafkaStreams(builder, config);
        streams.start();
        streams.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            @Override
            public void uncaughtException(Thread t, Throwable e) {
                logger.error(e.getMessage(), e);
            }
        });
    }
}
FilterByPolicyTransformerSupplier.java:
public class FilterByPolicyTransformerSupplier
        implements TransformerSupplier<String, SampleEvent, KeyValue<String, SampleEvent>> {

    private String policyGlobalTableName;

    private String policyFilterDataGlobalTable;

    public FilterByPolicyTransformerSupplier(String policyGlobalTableName,
            String policyFilterDataGlobalTable) {
        this.policyGlobalTableName = policyGlobalTableName;
        this.policyFilterDataGlobalTable = policyFilterDataGlobalTable;
    }

    @Override
    public Transformer<String, SampleEvent, KeyValue<String, SampleEvent>> get() {
        return new Transformer<String, SampleEvent, KeyValue<String, SampleEvent>>() {

            private KeyValueStore<String, List<String>> policyStore;

            private KeyValueStore<String, PolicyFilterData> policyMetadataStore;

            private ProcessorContext context;

            @Override
            public void close() {
            }

            @Override
            public void init(ProcessorContext context) {
                this.context = context;
                // Call punctuate every 1 second
                this.context.schedule(1000);
                policyStore = (KeyValueStore<String, List<String>>) this.context
                        .getStateStore(policyGlobalTableName);
                policyMetadataStore = (KeyValueStore<String, PolicyFilterData>) this.context
                        .getStateStore(policyFilterDataGlobalTable);
            }

            @Override
            public KeyValue<String, SampleEvent> punctuate(long arg0) {
                return null;
            }

            @Override
            public KeyValue<String, SampleEvent> transform(String key, SampleEvent event) {
                long eventsForwardTimeInMs = 0;
                long forwardedEventCount = 0;
                List<String> policyIds = policyStore.get(event.getCustomerCode().toLowerCase());
                if (policyIds != null) {
                    for (String policyId : policyIds) {
                        /*
                        PolicyFilterData policyFilterMetadata = policyMetadataStore.get(policyId);
                        Do some condition checks on the event. If it satisfies them, we will forward the event.
                        if (policyFilterMetadata == null) {
                            continue;
                        }
                        */
                        // Using context.forward as an event can map to multiple policies
                        long startForwardTime = System.currentTimeMillis();
                        context.forward(policyId, event);
                        forwardedEventCount++;
                        eventsForwardTimeInMs += System.currentTimeMillis() - startForwardTime;
                    }
                }
                return null;
            }
        };
    }
}
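As mentioned above, here is a hedged sketch (not part of the original post) of one way to approximate the consuming time: the Kafka clients already record consume-side latencies, for example the stream thread's poll-latency-avg and process-latency-avg, and depending on the version the embedded consumer's fetch-latency-avg as well, and these can be read from the running KafkaStreams instance. Which metrics are present depends on the client version; on 1.0.x the Metric interface exposes value(), while newer clients prefer metricValue().
import java.util.Map;

import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;

public class StreamsLatencyMetricsSketch {

    // Dump all latency-related metrics of a running KafkaStreams instance,
    // e.g. poll-latency-avg / process-latency-avg (and, if exposed, fetch-latency-avg).
    public static void printLatencyMetrics(KafkaStreams streams) {
        for (Map.Entry<MetricName, ? extends Metric> entry : streams.metrics().entrySet()) {
            MetricName name = entry.getKey();
            if (name.name().contains("latency")) {
                System.out.println(name.group() + " / " + name.name() + " = " + entry.getValue().value());
            }
        }
    }
}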