Kafka headers with String values - apache-kafka

Kafka header values are byte arrays, but for one of the headers I need to work with a String value. Is it possible to have this handled for me somehow, instead of doing the conversion in the listener?

The framework takes care of the conversion automatically:
@SpringBootApplication
public class So71941853Application {
public static void main(String[] args) {
SpringApplication.run(So71941853Application.class, args);
}
@KafkaListener(id = "so71941853", topics = "so71941853")
void listen(String in, @Header("hdr") String foo) {
System.out.println(in + " - " + foo);
}
@Bean
public NewTopic topic() {
return TopicBuilder.name("so71941853").partitions(1).replicas(1).build();
}
@Bean
ApplicationRunner runner(KafkaTemplate<String, String> template) {
template.setDefaultTopic("so71941853");
return args -> {
template.send(new GenericMessage<>("foo", Collections.singletonMap("hdr", "bar".getBytes())));
};
}
}
Result:
foo - bar

Related

Flink - KafkaSink not writing data to kafka topic

I'm trying to read JSON events from Kafka, aggregate them by eventId and category, and write the results to a different Kafka topic through Flink. The program is able to read messages from Kafka, but the KafkaSink is not writing the data back to the other Kafka topic. I'm not sure what mistake I'm making. Can someone please check and let me know where I'm wrong? Here is the code I'm using.
KafkaSource<EventMessage> source = KafkaSource.<EventMessage>builder()
.setBootstrapServers(LOCAL_KAFKA_BROKER)
.setTopics(INPUT_KAFKA_TOPIC)
.setGroupId(LOCAL_GROUP)
.setStartingOffsets(OffsetsInitializer.earliest())
.setValueOnlyDeserializer(new InputDeserializationSchema())
.build();
WindowAssigner<Object, TimeWindow> windowAssigner = TumblingEventTimeWindows.of(WINDOW_SIZE);
DataStream<EventMessage> eventStream = env.fromSource(source, WatermarkStrategy.noWatermarks(), "Event Source");
DataStream<EventSummary> events =
eventStream
.keyBy(eventMessage -> eventMessage.getCategory() + eventMessage.getEventId())
.window(windowAssigner)
.aggregate(new EventAggregator())
.name("EventAggregator test >> ");
KafkaSink<EventSummary> sink = KafkaSink.<EventSummary>builder()
.setBootstrapServers(LOCAL_KAFKA_BROKER)
.setRecordSerializer(KafkaRecordSerializationSchema.builder()
.setTopic(OUTPUT_KAFKA_TOPIC)
.setValueSerializationSchema(new OutputSummarySerializationSchema())
.build())
.setDeliverGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
.build();
events.sinkTo(sink);
These are the POJOs I've created for the input message and the output.
// EventMessage POJO
public class EventMessage implements Serializable {
private Long timestamp;
private int eventValue;
private String eventId;
private String category;
public EventMessage() { }
public EventMessage(Long timestamp, int eventValue, String eventId, String category) {
this.timestamp = timestamp;
this.eventValue = eventValue;
this.eventId = eventId;
this.category = category;
}
.....
}
// EventSummary POJO
public class EventSummary {
public EventMessage eventMessage;
public int sum;
public int count;
public EventSummary() { }
....
}
These are the deserialization and serialization schemas I'm using.
public class InputDeserializationSchema implements DeserializationSchema<EventMessage> {
static ObjectMapper objectMapper = new ObjectMapper();
@Override
public EventMessage deserialize(byte[] bytes) throws IOException {
return objectMapper.readValue(bytes, EventMessage.class);
}
@Override
public boolean isEndOfStream(EventMessage inputMessage) {
return false;
}
@Override
public TypeInformation<EventMessage> getProducedType() {
return TypeInformation.of(EventMessage.class);
}
}
public class OutputSummarySerializationSchema implements SerializationSchema<EventSummary> {
static ObjectMapper objectMapper = new ObjectMapper();
Logger logger = LoggerFactory.getLogger(OutputSummarySerializationSchema.class);
@Override
public byte[] serialize(EventSummary eventSummary) {
if (objectMapper == null) {
objectMapper.setVisibility(PropertyAccessor.FIELD, JsonAutoDetect.Visibility.ANY);
objectMapper = new ObjectMapper();
}
try {
String json = objectMapper.writeValueAsString(eventSummary);
return json.getBytes();
} catch (com.fasterxml.jackson.core.JsonProcessingException e) {
logger.error("Failed to parse JSON", e);
}
return new byte[0];
}
}
I'm using this aggregator for aggregating the JSON messages.
public class EventAggregator implements AggregateFunction<EventMessage, EventSummary, EventSummary> {
private static final Logger log = LoggerFactory.getLogger(EventAggregator.class);
@Override
public EventSummary createAccumulator() {
return new EventSummary();
}
@Override
public EventSummary add(EventMessage eventMessage, EventSummary eventSummary) {
eventSummary.eventMessage = eventMessage;
eventSummary.count += 1;
eventSummary.sum += eventMessage.getEventValue();
return eventSummary;
}
@Override
public EventSummary getResult(EventSummary eventSummary) {
return eventSummary;
}
@Override
public EventSummary merge(EventSummary summary1, EventSummary summary2) {
return new EventSummary(null,
summary1.sum + summary2.sum,
summary1.count + summary2.count);
}
}
Can someone help me on this?
Thanks in advance.
In order for event time windowing to work, you must specify a proper WatermarkStrategy. Otherwise, the windows will never close, and no results will be produced.
The role that watermarks play is to mark a place in a stream, and indicate that the stream is, at that point, complete through some specific timestamp. Until receiving this indicator of stream completeness, windows continue to wait for more events to be assigned to them.
To simplify debugging the watermarks, you might switch to a PrintSink until you get the watermarking working properly. Or, to simplify debugging the KafkaSink, you could switch to using processing-time windows until the sink is working.
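For illustration only, a minimal sketch of such a strategy for this job could look like the following. It assumes EventMessage exposes a getTimestamp() accessor returning epoch milliseconds, and the 5-second out-of-orderness bound is an arbitrary example value (java.time.Duration is required):
// Hedged sketch, not the original code: replaces WatermarkStrategy.noWatermarks()
// so that the event-time windows can close and emit results.
WatermarkStrategy<EventMessage> watermarkStrategy =
    WatermarkStrategy.<EventMessage>forBoundedOutOfOrderness(Duration.ofSeconds(5))
        .withTimestampAssigner((event, recordTimestamp) -> event.getTimestamp());
DataStream<EventMessage> eventStream =
    env.fromSource(source, watermarkStrategy, "Event Source");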

Consuming messages again from a Kafka log-compacted topic

I have a Spring application with a Kafka consumer using a @KafkaListener annotation. The topic being consumed is log compacted, and we might hit a scenario where we must consume the topic's messages again. What's the best way to achieve this programmatically? We don't control the Kafka topic configuration.
@KafkaListener(...)
public void listen(String in, @Header(KafkaHeaders.CONSUMER) Consumer<?, ?> consumer) {
System.out.println(in);
if (this.resetNeeded) {
consumer.seekToBeginning(consumer.assignment());
this.resetNeeded = false;
}
}
If you want to reset when the listener is idle (no records), you can enable idle events and perform the seeks by listening for a ListenerContainerIdleEvent in an ApplicationListener or @EventListener method.
The event has a reference to the consumer.
EDIT
@SpringBootApplication
public class So58769796Application {
public static void main(String[] args) {
SpringApplication.run(So58769796Application.class, args);
}
@KafkaListener(id = "so58769796", topics = "so58769796")
public void listen1(String value, @Header(KafkaHeaders.RECEIVED_MESSAGE_KEY) String key) {
System.out.println("One:" + key + ":" + value);
}
@KafkaListener(id = "so58769796a", topics = "so58769796")
public void listen2(String value, @Header(KafkaHeaders.RECEIVED_MESSAGE_KEY) String key) {
System.out.println("Two:" + key + ":" + value);
}
@Bean
public NewTopic topic() {
return TopicBuilder.name("so58769796")
.compact()
.partitions(1)
.replicas(1)
.build();
}
boolean reset;
@Bean
public ApplicationRunner runner(KafkaTemplate<String, String> template) {
return args -> {
template.send("so58769796", "foo", "bar");
System.out.println("Hit enter to rewind");
System.in.read();
this.reset = true;
};
}
@EventListener
public void listen(ListenerContainerIdleEvent event) {
System.out.println(event);
if (this.reset && event.getListenerId().startsWith("so58769796-")) {
event.getConsumer().seekToBeginning(event.getConsumer().assignment());
}
}
}
and
spring.kafka.listener.idle-event-interval=5000
EDIT2
Here's another technique - in this case we rewind each time the app starts (and on demand)...
@SpringBootApplication
public class So58769796Application implements ConsumerSeekAware {
public static void main(String[] args) {
SpringApplication.run(So58769796Application.class, args);
}
@KafkaListener(id = "so58769796", topics = "so58769796")
public void listen(String value, @Header(KafkaHeaders.RECEIVED_MESSAGE_KEY) String key) {
System.out.println(key + ":" + value);
}
@Bean
public NewTopic topic() {
return TopicBuilder.name("so58769796")
.compact()
.partitions(1)
.replicas(1)
.build();
}
@Bean
public ApplicationRunner runner(KafkaTemplate<String, String> template,
KafkaListenerEndpointRegistry registry) {
return args -> {
template.send("so58769796", "foo", "bar");
System.out.println("Hit enter to rewind");
System.in.read();
registry.getListenerContainer("so58769796").stop();
registry.getListenerContainer("so58769796").start();
};
}
@Override
public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
assignments.keySet().forEach(tp -> callback.seekToBeginning(tp.topic(), tp.partition()));
}
}

spring-kafka Request Reply: Different Types for Request and Reply

The documentation for ReplyingKafkaTemplate, which provides Request-Reply support (introduced in Spring-Kafka 2.1.3), suggests that different types may be used for the Request and the Reply:
ReplyingKafkaTemplate<K, V, R>
where the parameterised type K designates the message Key, V designates the Value (i.e. the Request), and R designates the Reply.
So far so good. But the corresponding supporting classes for implementing the server-side Request-Reply don't seem to support different types for V and R. The documentation suggests using a @KafkaListener with an added @SendTo annotation, which behind the scenes uses a configured replyTemplate on the MessageListenerContainer. But AbstractKafkaListenerEndpoint only supports a single type for the listener as well as the replyTemplate:
public abstract class AbstractKafkaListenerEndpoint<K, V>
implements KafkaListenerEndpoint, BeanFactoryAware, InitializingBean {
...
/**
* Set the {@link KafkaTemplate} to use to send replies.
* @param replyTemplate the template.
* @since 2.0
*/
public void setReplyTemplate(KafkaTemplate<K, V> replyTemplate) {
this.replyTemplate = replyTemplate;
}
...
}
hence V and R need to be the same type.
The example used in the documentation indeed uses String for both Request and Reply.
Am I missing something, or is this a design flaw in the Spring-Kafka Request-Reply support that should be reported and corrected?
This is fixed in the 2.2 release.
For earlier versions, simply inject a raw KafkaTemplate (with no generics).
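As a rough sketch of that workaround (the bean name rawReplyTemplate and the exact wiring are illustrative assumptions, not the library's prescribed setup), the reply template can be declared without generics so the mismatched request and reply value types no longer matter:
// Pre-2.2 workaround sketch: a raw KafkaTemplate can be set on the container factory
// even though the listener consumes Foo and replies with Bar.
@Bean
@SuppressWarnings({ "rawtypes", "unchecked" })
public KafkaTemplate rawReplyTemplate(ProducerFactory pf,
        ConcurrentKafkaListenerContainerFactory<String, Bar> factory) {
    KafkaTemplate template = new KafkaTemplate(pf);
    factory.setReplyTemplate(template);
    return template;
}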
EDIT
@SpringBootApplication
public class So53151961Application {
public static void main(String[] args) {
SpringApplication.run(So53151961Application.class, args);
}
@KafkaListener(id = "so53151961", topics = "so53151961")
@SendTo
public Bar handle(Foo foo) {
System.out.println(foo);
return new Bar(foo.getValue().toUpperCase());
}
@Bean
public ReplyingKafkaTemplate<String, Foo, Bar> replyingTemplate(ProducerFactory<String, Foo> pf,
ConcurrentKafkaListenerContainerFactory<String, Bar> factory) {
ConcurrentMessageListenerContainer<String, Bar> replyContainer =
factory.createContainer("so53151961-replyTopic");
replyContainer.getContainerProperties().setGroupId("so53151961.reply");
ReplyingKafkaTemplate<String, Foo, Bar> replyingKafkaTemplate = new ReplyingKafkaTemplate<>(pf, replyContainer);
return replyingKafkaTemplate;
}
@Bean
public KafkaTemplate<String, Bar> replyTemplate(ProducerFactory<String, Bar> pf,
ConcurrentKafkaListenerContainerFactory<String, Bar> factory) {
KafkaTemplate<String, Bar> kafkaTemplate = new KafkaTemplate<>(pf);
factory.setReplyTemplate(kafkaTemplate);
return kafkaTemplate;
}
@Bean
public ApplicationRunner runner(ReplyingKafkaTemplate<String, Foo, Bar> template) {
return args -> {
ProducerRecord<String, Foo> record = new ProducerRecord<>("so53151961", null, "key", new Foo("foo"));
RequestReplyFuture<String, Foo, Bar> future = template.sendAndReceive(record);
System.out.println(future.get(10, TimeUnit.SECONDS).value());
};
}
@Bean
public NewTopic topic() {
return new NewTopic("so53151961", 1, (short) 1);
}
@Bean
public NewTopic reply() {
return new NewTopic("so53151961-replyTopic", 1, (short) 1);
}
public static class Foo {
public String value;
public Foo() {
super();
}
public Foo(String value) {
this.value = value;
}
public String getValue() {
return this.value;
}
public void setValue(String value) {
this.value = value;
}
@Override
public String toString() {
return "Foo [value=" + this.value + "]";
}
}
public static class Bar {
public String value;
public Bar() {
super();
}
public Bar(String value) {
this.value = value;
}
public String getValue() {
return this.value;
}
public void setValue(String value) {
this.value = value;
}
@Override
public String toString() {
return "Bar [value=" + this.value + "]";
}
}
}
spring.kafka.producer.value-serializer=org.springframework.kafka.support.serializer.JsonSerializer
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.JsonDeserializer
spring.kafka.consumer.enable-auto-commit=false
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.properties.spring.json.trusted.packages=com.example
result
Foo [value=foo]
Bar [value=FOO]

Spring Batch: File not being read

I am trying to create an application that uses the spring-batch-excel extension to read Excel files uploaded through a web interface by its users, in order to parse the Excel files for addresses.
When the code runs, there is no error, but all I get is the following in my log, even though I have log/syso statements throughout my Processor and Writer (these are never called; all I can imagine is that the file is not being read properly and no data is returned to process/write). And yes, the file has data, several thousand records in fact.
Job: [FlowJob: [name=excelFileJob]] launched with the following parameters: [{file=Book1.xlsx}]
Executing step: [excelFileStep]
Job: [FlowJob: [name=excelFileJob]] completed with the following parameters: [{file=Book1.xlsx}] and the following status: [COMPLETED]
Below is my JobConfig
@Configuration
@EnableBatchProcessing
public class AddressExcelJobConfig {
@Bean
public BatchConfigurer configurer(EntityManagerFactory entityManagerFactory) {
return new CustomBatchConfigurer(entityManagerFactory);
}
@Bean
Step excelFileStep(ItemReader<AddressExcel> excelAddressReader,
ItemProcessor<AddressExcel, AddressExcel> excelAddressProcessor,
ItemWriter<AddressExcel> excelAddressWriter,
StepBuilderFactory stepBuilderFactory) {
return stepBuilderFactory.get("excelFileStep")
.<AddressExcel, AddressExcel>chunk(1)
.reader(excelAddressReader)
.processor(excelAddressProcessor)
.writer(excelAddressWriter)
.build();
}
@Bean
Job excelFileJob(JobBuilderFactory jobBuilderFactory,
@Qualifier("excelFileStep") Step excelAddressStep) {
return jobBuilderFactory.get("excelFileJob")
.incrementer(new RunIdIncrementer())
.flow(excelAddressStep)
.end()
.build();
}
}
Below is my AddressExcelReader
The late binding works fine; there is no error. I have tried loading the resource given the file name, in addition to creating a new ClassPathResource and a FileSystemResource. All give me the same results.
@Component
@StepScope
public class AddressExcelReader implements ItemReader<AddressExcel> {
private PoiItemReader<AddressExcel> itemReader = new PoiItemReader<AddressExcel>();
@Override
public AddressExcel read()
throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
return itemReader.read();
}
public AddressExcelReader(@Value("#{jobParameters['file']}") String file, StorageService storageService) {
//Resource resource = storageService.loadAsResource(file);
//Resource testResource = new FileSystemResource("upload-dir/Book1.xlsx");
itemReader.setResource(new ClassPathResource("/upload-dir/Book1.xlsx"));
itemReader.setLinesToSkip(1);
itemReader.setStrict(true);
itemReader.setRowMapper(excelRowMapper());
}
public RowMapper<AddressExcel> excelRowMapper() {
BeanWrapperRowMapper<AddressExcel> rowMapper = new BeanWrapperRowMapper<>();
rowMapper.setTargetType(AddressExcel.class);
return rowMapper;
}
}
Below is my AddressExcelProcessor
@Component
public class AddressExcelProcessor implements ItemProcessor<AddressExcel, AddressExcel> {
private static final Logger log = LoggerFactory.getLogger(AddressExcelProcessor.class);
@Override
public AddressExcel process(AddressExcel item) throws Exception {
System.out.println("Converting " + item);
log.info("Convert {}", item);
return item;
}
}
Again, this is never coming into play (no logs are generated). And if it matters, this is how I'm launching my job from a FileUploadController, in a @PostMapping("/") handler that first stores the uploaded file and then runs the job:
@PostMapping("/")
public String handleFileUpload(@RequestParam("file") MultipartFile file, RedirectAttributes redirectAttributes) {
storageService.store(file);
try {
JobParameters jobParameters = new JobParametersBuilder()
.addString("file", file.getOriginalFilename().toString()).toJobParameters();
jobLauncher.run(job, jobParameters);
} catch (JobExecutionAlreadyRunningException | JobRestartException | JobInstanceAlreadyCompleteException
| JobParametersInvalidException e) {
e.printStackTrace();
}
redirectAttributes.addFlashAttribute("message",
"You successfully uploaded " + file.getOriginalFilename() + "!");
return "redirect:/";
}
And last but not least, here is my AddressExcel POJO:
import lombok.Data;
@Data
public class AddressExcel {
private String address1;
private String address2;
private String city;
private String state;
private String zip;
public AddressExcel() {}
}
UPDATE (10/13/2016)
Following Nghia Do's comments, I also created my own RowMapper, instead of using the BeanWrapperRowMapper, to see if that was the issue. Still the same results.
public class AddressExcelRowMapper implements RowMapper<AddressExcel> {
@Override
public AddressExcel mapRow(RowSet rs) throws Exception {
AddressExcel temp = new AddressExcel();
temp.setAddress1(rs.getColumnValue(0));
temp.setAddress2(rs.getColumnValue(1));
temp.setCity(rs.getColumnValue(2));
temp.setState(rs.getColumnValue(3));
temp.setZip(rs.getColumnValue(4));
return temp;
}
}
It seems all I needed was to add the following to my ItemReader:
itemReader.afterPropertiesSet();
itemReader.open(new ExecutionContext());
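Put together, a sketch of the reader constructor with those two calls added (using the storageService resource from the commented-out line; the rest of the setup is unchanged from the reader shown above) might look like this:
public AddressExcelReader(@Value("#{jobParameters['file']}") String file, StorageService storageService) {
    Resource resource = storageService.loadAsResource(file);
    itemReader.setResource(resource);
    itemReader.setLinesToSkip(1);
    itemReader.setStrict(true);
    itemReader.setRowMapper(excelRowMapper());
    try {
        // The PoiItemReader is wrapped inside a custom ItemReader, so the step does not
        // manage its stream lifecycle; initialize and open it explicitly here.
        itemReader.afterPropertiesSet();
        itemReader.open(new ExecutionContext());
    }
    catch (Exception e) {
        throw new IllegalStateException("Failed to initialize the Excel reader", e);
    }
}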

Spring multiple imapAdapter

I am a novice in Spring, and I don't like code duplication.
I wrote one ImapAdapter that works fine:
@Component
public class GeneralImapAdapter {
private Logger logger = LoggerFactory.getLogger(getClass());
@Autowired
private EmailReceiverService emailReceiverService;
@Bean
@InboundChannelAdapter(value = "emailChannel", poller = @Poller(fixedDelay = "10000", taskExecutor = "asyncTaskExecutor"))
public MessageSource<javax.mail.Message> mailMessageSource(MailReceiver imapMailReceiver) {
return new MailReceivingMessageSource(imapMailReceiver);
}
@Bean
@Value("imaps://<login>:<pass>@<url>:993/inbox")
public MailReceiver imapMailReceiver(String imapUrl) {
ImapMailReceiver imapMailReceiver = new ImapMailReceiver(imapUrl);
imapMailReceiver.setShouldMarkMessagesAsRead(true);
imapMailReceiver.setShouldDeleteMessages(false);
// other setters here
return imapMailReceiver;
}
@ServiceActivator(inputChannel = "emailChannel", poller = @Poller(fixedDelay = "10000", taskExecutor = "asyncTaskExecutor"))
public void emailMessageSource(javax.mail.Message message) {
emailReceiverService.receive(message);
}
}
But I want about 20 adapters like that; the only difference is the imapUrl.
How can I do that without code duplication?
Use multiple application contexts, configured with properties.
There is a sample application that demonstrates this; it uses XML for its configuration, but the same techniques apply with Java configuration.
If you need them to feed into a common emailReceiverService, make the individual adapter contexts child contexts; see the sample's README for pointers on how to do that.
EDIT:
Here's an example, with the service (and channel) in a shared parent context...
@Configuration
@EnableIntegration
public class MultiImapAdapter {
public static void main(String[] args) throws Exception {
AnnotationConfigApplicationContext parent = new AnnotationConfigApplicationContext(MultiImapAdapter.class);
parent.setId("parent");
String[] urls = { "imap://foo", "imap://bar" };
List<ConfigurableApplicationContext> children = new ArrayList<ConfigurableApplicationContext>();
int n = 0;
for (String url : urls) {
AnnotationConfigApplicationContext child = new AnnotationConfigApplicationContext();
child.setId("child" + ++n);
children.add(child);
child.setParent(parent);
child.register(GeneralImapAdapter.class);
StandardEnvironment env = new StandardEnvironment();
Properties props = new Properties();
// populate properties for this adapter
props.setProperty("imap.url", url);
PropertiesPropertySource pps = new PropertiesPropertySource("imapprops", props);
env.getPropertySources().addLast(pps);
child.setEnvironment(env);
child.refresh();
}
System.out.println("Hit enter to terminate");
System.in.read();
for (ConfigurableApplicationContext child : children) {
child.close();
}
parent.close();
}
@Bean
public MessageChannel emailChannel() {
return new DirectChannel();
}
@Bean
public EmailReceiverService emailReceiverService() {
return new EmailReceiverService();
}
}
and
@Configuration
@EnableIntegration
public class GeneralImapAdapter {
@Bean
public static PropertySourcesPlaceholderConfigurer pspc() {
return new PropertySourcesPlaceholderConfigurer();
}
@Bean
@InboundChannelAdapter(value = "emailChannel", poller = @Poller(fixedDelay = "10000"))
public MessageSource<javax.mail.Message> mailMessageSource(MailReceiver imapMailReceiver) {
return new MailReceivingMessageSource(imapMailReceiver);
}
@Bean
@Value("${imap.url}")
public MailReceiver imapMailReceiver(String imapUrl) {
// ImapMailReceiver imapMailReceiver = new ImapMailReceiver(imapUrl);
// imapMailReceiver.setShouldMarkMessagesAsRead(true);
// imapMailReceiver.setShouldDeleteMessages(false);
// // other setters here
// return imapMailReceiver;
MailReceiver receiver = mock(MailReceiver.class);
Message message = mock(Message.class);
when(message.toString()).thenReturn("Message from " + imapUrl);
Message[] messages = new Message[] {message};
try {
when(receiver.receive()).thenReturn(messages);
}
catch (MessagingException e) {
e.printStackTrace();
}
return receiver;
}
}
and
@MessageEndpoint
public class EmailReceiverService {
@ServiceActivator(inputChannel="emailChannel")
public void handleMessage(javax.mail.Message message) {
System.out.println(message);
}
}
Hope that helps.
Notice that you don't need a poller on the service activator; since emailChannel is a DirectChannel, the service will be invoked on the inbound adapter's poller thread, with no need for another async handoff.