How to send an Avro message to a new topic using a functional KStream (processor) - apache-kafka

I am getting "could not be established. Broker may not be available" when a message is sent to the new topic "failureTopic", or whenever any exception occurs. I am using Kafka version 3.0.0.
@Bean
@SuppressWarnings("unchecked")
public Function<KStream<String, AvroClass>, KStream<String, AvroClass>> process() {
    final AvroClass[] finalMessage = {null};
    return input -> input.branch(
            (k, avroMessage) -> {
                try {
                    finalMessage[0] = subprocess();
                    if (finalMessage[0] != null) {
                        return true;
                    } else {
                        Message<NewAvroClass> mess = MessageBuilder.withPayload(avroMessage).build();
                        streamBridge.send("failureTopic", mess);
                        return false;
                    }
                } catch (Exception e) {
                    handleProcessingException(k, avroMessage);
                    return false;
                }
            },
            (k, v) -> true)[0]
        .map((k, v) -> new KeyValue<>(k, finalMessage[0]));
}
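For comparison, the same routing can be written without the single-element array workaround by mapping each record to the result of subprocess() and filtering out failures afterwards. This is only a minimal sketch, assuming the same subprocess(), streamBridge, and handleProcessingException members used above:

@Bean
public Function<KStream<String, AvroClass>, KStream<String, AvroClass>> process() {
    return input -> input
            // Map each record to the result of subprocess(); null marks a failed record.
            .mapValues((k, avroMessage) -> {
                try {
                    AvroClass result = subprocess();
                    if (result == null) {
                        // Route the original message to the failure topic.
                        streamBridge.send("failureTopic",
                                MessageBuilder.withPayload(avroMessage).build());
                    }
                    return result;
                } catch (Exception e) {
                    handleProcessingException(k, avroMessage);
                    return null;
                }
            })
            // Drop failed records; only successful results continue downstream.
            .filter((k, v) -> v != null);
}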

Related

Mixing Kafka Streams DSL with Processor API to get offset

I am trying to find a way to log the offset when an exception occurs.
Here is what I am trying to achieve:
void createTopology(StreamsBuilder builder) {
    builder.stream(topic, Consumed.with(Serdes.String(), new JsonSerde()))
        .filter(...)
        .mapValues(value -> {
            Map<String, Object> output;
            try {
                output = decode(value.get("data"));
            } catch (DecodingException e) {
                LOGGER.error(e.getMessage());
                // TODO: LOG OFFSET FOR FAILED DECODE HERE
                return new ArrayList<>();
            }
            ...
            return output;
        })
        .filter((k, v) -> !(v instanceof List && ((List<?>) v).isEmpty()))
        .to(sink_topic);
}
I found this: https://docs.confluent.io/platform/current/streams/developer-guide/dsl-api.html#streams-developer-guide-dsl-transformations-stateful, and my understanding is that I need to use the Processor API, but I still haven't found a solution for my issue.
A ValueTransformer can also access the offset via the ProcessorContext passed to init, and I believe it's much easier.
Here is the solution, as suggested by IUSR: https://stackoverflow.com/a/73465691/14945779 (thank you):
static class InjectOffsetTransformer implements ValueTransformer<JsonObject, JsonObject> {

    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
    }

    @Override
    public JsonObject transform(JsonObject value) {
        value.addProperty("offset", context.offset());
        return value;
    }

    @Override
    public void close() {
    }
}
void createTopology(StreamsBuilder builder) {
    builder.stream(topic, Consumed.with(Serdes.String(), new JsonSerde()))
        .filter(...)
        .transformValues(InjectOffsetTransformer::new)
        .mapValues(value -> {
            Map<String, Object> output;
            try {
                output = decode(value.get("data"));
            } catch (DecodingException e) {
                LOGGER.warn(String.format("Error reading from topic %s. Last read offset %s:", topic, lastReadOffset), e);
                return new ArrayList<>();
            }
            lastReadOffset = value.get("offset").getAsLong();
            return output;
        })
        .filter((k, v) -> !(v instanceof List && ((List<?>) v).isEmpty()))
        .to(sink_topic);
}

How to unregister two Vert.x consumers and return an RxJava Completable?

I need some help with RxJava. Currently I have two hash maps, each of which maps a subscription key to a Vert.x message consumer. I want to return a Completable only if I am able to unregister both Vert.x message consumers. How can I achieve this?
Here is the code I am working on:
@Override
public Completable deregisterKeyEvents(String subscriptionId) {
    MessageConsumer<JsonObject> messageConsumer = consumerMap.get(subscriptionId);
    MessageConsumer<JsonObject> subscriptionConsumer = subscriptionConsumerMap.get(subscriptionId);
    if (subscriptionConsumer != null) {
        subscriptionConsumerMap.remove(subscriptionId);
        subscriptionConsumer.unregister(res -> {
            if (res.succeeded()) {
                LOGGER.debug("Subscription channel consumer deregistered successfully!");
            } else {
                LOGGER.error("Unable to de-register Subscription channel consumer");
            }
        });
    }
    if (messageConsumer != null) {
        consumerMap.remove(subscriptionId);
        return Completable.create(emitter -> {
            messageConsumer.unregister(res -> {
                if (res.succeeded()) {
                    emitter.onComplete();
                } else {
                    emitter.onError(res.cause());
                }
            });
        });
    } else {
        LOGGER.warn("There was no consumer registered!");
        return Completable.create(emitter -> emitter.onError(
                new KvNoSuchElementException("Subscription '" + subscriptionId + "' not found")));
    }
}
I want to rewrite the above code so that a Completable is returned only once both subscriptionConsumer.unregister() and messageConsumer.unregister() have succeeded.
The MessageConsumer class is from the Vert.x library (io.vertx.core.eventbus.MessageConsumer).
I'd appreciate any help, thank you.
If you're willing to add Vert.x RxJava2 to your dependencies, you could do this with toCompletable:
@Override
public Completable deregisterKeyEvents(String subscriptionId) {
    MessageConsumer<JsonObject> messageConsumer = consumerMap.get(subscriptionId);
    MessageConsumer<JsonObject> subscriptionConsumer = subscriptionConsumerMap.get(subscriptionId);

    Completable c1;
    if (subscriptionConsumer != null) {
        subscriptionConsumerMap.remove(subscriptionId);
        c1 = CompletableHelper.toCompletable(handler -> subscriptionConsumer.unregister(handler))
                .doOnComplete(() -> LOGGER.debug("Subscription channel consumer deregistered successfully!"))
                .doOnError(t -> LOGGER.error("Unable to de-register Subscription channel consumer", t));
    } else {
        c1 = Completable.complete();
    }

    Completable c2;
    if (messageConsumer != null) {
        consumerMap.remove(subscriptionId);
        c2 = CompletableHelper.toCompletable(handler -> messageConsumer.unregister(handler));
    } else {
        LOGGER.warn("There was no consumer registered!");
        c2 = Completable.error(new KvNoSuchElementException("Subscription '" + subscriptionId + "' not found"));
    }

    return c1.concatWith(c2);
}
Note that this is a bit different from your original code because:
the messageConsumer unregistration happens only after the unregistration of subscriptionConsumer,
the messageConsumer unregistration happens only if unregistration of subscriptionConsumer was successful.
You can use a different method of Completable if that's not the behavior you want.
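If instead both unregistrations should be attempted independently and the result should only terminate once both have finished, merging the two Completables is an option; a minimal sketch, assuming the same c1 and c2 as above:

// Subscribes to c1 and c2 at the same time and completes when both complete.
// mergeArrayDelayError defers any error until both have terminated, so the
// second unregistration is still attempted even if the first one fails.
return Completable.mergeArrayDelayError(c1, c2);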

Creating a new consumer when a consumer stops due to an error in Reactor Kafka

I am working on an application where I have multiple consumers, one for each topic partition, so there is concurrency in reading from the topic. I followed this link to ensure that a new consumer gets created if the existing consumer stops; .repeat() will create the new consumer. I have been trying to test this scenario.
Below is my code along with the test:
@Bean
public ReceiverOptions<String, String> kafkaReceiverOptions(String topic, KafkaProperties kafkaProperties) {
    ReceiverOptions<String, String> basicReceiverOptions = ReceiverOptions.create(kafkaProperties.buildConsumerProperties());
    return basicReceiverOptions.subscription(Collections.singletonList(topic))
            .addAssignListener(receiverPartitions -> log.debug("onPartitionAssigned {}", receiverPartitions))
            .addRevokeListener(receiverPartitions -> log.debug("onPartitionsRevoked {}", receiverPartitions));
}

@Bean
public ReactiveKafkaConsumerTemplate<String, String> kafkaConsumerTemplate(ReceiverOptions<String, String> kafkaReceiverOptions) {
    return new ReactiveKafkaConsumerTemplate<String, String>(kafkaReceiverOptions);
}

@Bean
public ReactiveKafkaProducerTemplate<String, List<Object>> kafkaProducerTemplate(KafkaProperties properties) {
    Map<String, Object> props = properties.buildProducerProperties();
    return new ReactiveKafkaProducerTemplate<String, List<Object>>(SenderOptions.create(props));
}

public void run(String... args) {
    for (int i = 0; i < topicPartitionsCount; i++) {
        readWrite(destinationTopic).subscribe();
    }
}
public Flux<String> readWrite(String destTopic) {
    AtomicBoolean repeatConsumer = new AtomicBoolean(false);
    return kafkaConsumerTemplate
            .receiveAutoAck()
            .doOnNext(consumerRecord -> log.debug("received key={}, value={} from topic={}, offset={}",
                    consumerRecord.key(),
                    consumerRecord.value(),
                    consumerRecord.topic(),
                    consumerRecord.offset()))
            //.doOnNext(consumerRecord -> log.info("Record received from partition {} in thread {}", consumerRecord.partition(), Thread.currentThread().getName()))
            .doOnNext(s -> sendToKafka(s, destinationTopic))
            .map(ConsumerRecord::value)
            .doOnNext(record -> log.debug("successfully consumed {}={}", Metric[].class.getSimpleName(), record))
            .doOnError(exception -> log.debug("Error occurred while processing the message, attempting retry. Error message: {}", exception.getMessage()))
            .retryWhen(Retry.backoff(Integer.parseInt(retryAttempts), Duration.ofSeconds(Integer.parseInt(retryAttemptsDelay))).transientErrors(true))
            .onErrorContinue((exception, errorConsumerRecord) -> {
                ReceiverRecordException recordException = (ReceiverRecordException) exception;
                log.debug("Retries exhausted for : {}", recordException);
                recordException.getRecord().receiverOffset().acknowledge();
                repeatConsumer.set(true);
            })
            .repeat(repeatConsumer::get); // will create a new consumer if the existing consumer stops
}
public class ReceiverRecordException extends RuntimeException {

    private final ReceiverRecord record;

    ReceiverRecordException(ReceiverRecord record, Throwable t) {
        super(t);
        this.record = record;
    }

    public ReceiverRecord getRecord() {
        return this.record;
    }
}
Test:
@Test
public void readWriteCreatesNewConsumerWhenCurrentConsumerStops() {
    AtomicInteger recordNumber = new AtomicInteger(0);
    Mockito
            .when(reactiveKafkaConsumerTemplate.receiveAutoAck())
            .thenReturn(
                    Flux.create(consumerRecordFluxSink -> {
                        if (recordNumber.getAndIncrement() < 5) {
                            consumerRecordFluxSink.error(new RuntimeException("Kafka down"));
                        } else {
                            consumerRecordFluxSink.next(createConsumerRecord(validMessage));
                            consumerRecordFluxSink.complete();
                        }
                    })
            );
    Flux<String> actual = service.readWrite();
    StepVerifier.create(actual)
            .verifyComplete();
}
When I run the test, I get the retry-exhausted exception: onError(reactor.core.Exceptions$RetryExhaustedException: Retries exhausted: 3/3 in a row (3 total)).
My understanding was that onErrorContinue would catch the exception and then continue with the next records, but it looks like it is throwing the exception instead.
Since it is throwing an exception, how does repeat() work?
I would really appreciate it if someone could help me understand how to test this scenario.
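One way to exercise this path in a test is to assert the retry-exhausted error instead of completion, since the mocked source fails five times in a row while Retry.backoff only allows three attempts; a minimal sketch, assuming the same mock setup as above and that readWrite is called with the destination-topic argument from its signature:

@Test
public void readWriteSignalsRetryExhaustedWhenSourceKeepsFailing() {
    // Same mock as above: receiveAutoAck() errors on each of the first five subscriptions.
    Flux<String> actual = service.readWrite(destinationTopic);

    StepVerifier.create(actual)
            // Retry.backoff(3, ...) gives up before the source ever emits a record,
            // so the Flux terminates with a RetryExhaustedException, as observed above.
            .expectErrorMatches(Exceptions::isRetryExhausted)
            .verify();
}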

How to read the header values in a batch listener error handling scenario

I am trying to handle the exception at the listener:
@KafkaListener(id = PropertiesUtil.ID,
        topics = "#{'${kafka.consumer.topic}'}",
        groupId = "${kafka.consumer.group.id.config}",
        containerFactory = "containerFactory",
        errorHandler = "errorHandler")
public void receiveEvents(@Payload List<ConsumerRecord<String, String>> consumerRecordList,
                          Acknowledgment acknowledgment) {
    try {
        log.info("Consuming the batch of size {} from kafka topic {}", consumerRecordList.size(),
                consumerRecordList.get(0).topic());
        processEvent(consumerRecordList);
        incrementOffset(acknowledgment);
    } catch (Exception exception) {
        throwOrHandleExceptions(exception, consumerRecordList, acknowledgment);
        .........
    }
}
The Kafka container config:
@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> containerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConcurrency(this.numberOfConsumers);
    factory.getContainerProperties().setAckOnError(false);
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
    factory.setConsumerFactory(getConsumerFactory());
    factory.setBatchListener(true);
    return factory;
}
The listener error handler implementation:
@Bean
public ConsumerAwareListenerErrorHandler errorHandler() {
    return (m, e, c) -> {
        MessageHeaders headers = m.getHeaders();
        List<String> topics = headers.get(KafkaHeaders.RECEIVED_TOPIC, List.class);
        List<Integer> partitions = headers.get(KafkaHeaders.RECEIVED_PARTITION_ID, List.class);
        List<Long> offsets = headers.get(KafkaHeaders.OFFSET, List.class);
        Map<TopicPartition, Long> offsetsToReset = new HashMap<>();
        for (int i = 0; i < topics.size(); i++) {
            int index = i;
            offsetsToReset.compute(new TopicPartition(topics.get(i), partitions.get(i)),
                    (k, v) -> v == null ? offsets.get(index) : Math.min(v, offsets.get(index)));
        }
        ...
    };
}
When I run this without batch processing, I am able to fetch the partition, topic, and offset values. But when I enable batch processing and try to test it, I get only two values inside the headers (id and timestamp) and the other values are not set. Am I missing anything here?
What version are you using? I just tested it with Boot 2.2.4 (SK 2.3.5) and it works fine...
@SpringBootApplication
public class So60152179Application {

    public static void main(String[] args) {
        SpringApplication.run(So60152179Application.class, args);
    }

    @KafkaListener(id = "so60152179", topics = "so60152179", errorHandler = "eh")
    public void listen(List<String> in) {
        throw new RuntimeException("foo");
    }

    @Bean
    public ConsumerAwareListenerErrorHandler eh() {
        return (m, e, c) -> {
            System.out.println(m);
            return null;
        };
    }

    @Bean
    public ApplicationRunner runner(KafkaTemplate<String, String> template) {
        return args -> {
            template.send("so60152179", "foo");
        };
    }

    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("so60152179").partitions(1).replicas(1).build();
    }
}
spring.kafka.listener.type=batch
spring.kafka.consumer.auto-offset-reset=earliest
GenericMessage [payload=[foo], headers={kafka_offset=[0], kafka_nativeHeaders=[RecordHeaders(headers = [], isReadOnly = false)], kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer#2f2e787f, kafka_timestampType=[CREATE_TIME], kafka_receivedMessageKey=[null], kafka_receivedPartitionId=[0], kafka_receivedTopic=[so60152179], kafka_receivedTimestamp=[1581351585253], kafka_groupId=so60152179}]
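For reference, the per-record values in that output can be read back out of the list-valued headers inside the error handler; a rough sketch along the lines of the question's handler, assuming the same eh() bean and batch listener:

@Bean
public ConsumerAwareListenerErrorHandler eh() {
    return (m, e, c) -> {
        MessageHeaders headers = m.getHeaders();
        // In batch mode each of these headers holds one entry per record in the failed batch.
        List<String> topics = headers.get(KafkaHeaders.RECEIVED_TOPIC, List.class);
        List<Integer> partitions = headers.get(KafkaHeaders.RECEIVED_PARTITION_ID, List.class);
        List<Long> offsets = headers.get(KafkaHeaders.OFFSET, List.class);
        for (int i = 0; i < topics.size(); i++) {
            System.out.printf("failed record %s-%d@%d%n",
                    topics.get(i), partitions.get(i), offsets.get(i));
        }
        return null;
    };
}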

Is it possible to transfer files using Kafka?

I have thousands of files generated each day which I want to stream using Kafka.
When I try to read the file, each line is taken as a separate message.
I would like to know how I can make each file's content a single message in a Kafka topic, and, on the consumer side, how to write each message from the Kafka topic to a separate file.
You can write your own serializer/deserializer for handling files.
For example:
Producer props:
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, YOUR_FILE_SERIALIZER_URI);
Consumer props:
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, YOUR_FILE_DESERIALIZER_URI);
Serializer
public class FileMapSerializer implements Serializer<Map<?, ?>> {

    @Override
    public void close() {
    }

    @Override
    public void configure(Map configs, boolean isKey) {
    }

    @Override
    public byte[] serialize(String topic, Map data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutput out = null;
        byte[] bytes = null;
        try {
            out = new ObjectOutputStream(bos);
            out.writeObject(data);
            bytes = bos.toByteArray();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (out != null) {
                    out.close();
                }
            } catch (IOException ex) {
                // ignore close exception
            }
            try {
                bos.close();
            } catch (IOException ex) {
                // ignore close exception
            }
        }
        return bytes;
    }
}
Deserializer
public class MapDeserializer implements Deserializer<Map> {

    @Override
    public void close() {
    }

    @Override
    public void configure(Map config, boolean isKey) {
    }

    @Override
    public Map deserialize(String topic, byte[] message) {
        ByteArrayInputStream bis = new ByteArrayInputStream(message);
        ObjectInput in = null;
        try {
            in = new ObjectInputStream(bis);
            Object o = in.readObject();
            if (o instanceof Map) {
                return (Map) o;
            } else {
                return new HashMap<String, String>();
            }
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                bis.close();
            } catch (IOException ex) {
                // ignore close exception
            }
            try {
                if (in != null) {
                    in.close();
                }
            } catch (IOException ex) {
                // ignore close exception
            }
        }
        return new HashMap<String, String>();
    }
}
Compose messages in the following form:
final Object kafkaMessage = new ProducerRecord<String, Map>((String) <TOPIC>, Integer.toString(messageId++), messageMap);
messageMap will contain the file name as key and the file content as value; the value can be any serializable object. Hence each message will contain a map of file name to file content, with either a single entry or multiple entries.
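As a usage illustration, a producer along these lines would publish one file per message, with the map holding the file name and its content. This is a rough sketch with hypothetical topic, path, and broker values, reusing the FileMapSerializer above:

// Hypothetical sketch: publish one file per Kafka message as a Map<fileName, fileContent>
// so that it matches the FileMapSerializer shown above.
static void sendFileAsSingleMessage(Path file) throws IOException {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringSerializer");
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, FileMapSerializer.class.getName());

    // One entry: the file name mapped to the whole file content.
    Map<String, String> messageMap = new HashMap<>();
    messageMap.put(file.getFileName().toString(),
            new String(Files.readAllBytes(file), StandardCharsets.UTF_8));

    try (Producer<String, Map<?, ?>> producer = new KafkaProducer<>(props)) {
        // "file-topic" and the message key are illustrative.
        producer.send(new ProducerRecord<>("file-topic", file.getFileName().toString(), messageMap));
    }
}

In practice you would keep a single producer instance open and reuse it for all files rather than creating one per call.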