Flink - KafkaSink not writing data to kafka topic - apache-kafka

I'm trying to read JSON events from Kafka, aggregate them by eventId and category, and write the results to a different Kafka topic through Flink. The program is able to read messages from Kafka, but the KafkaSink is not writing the data to the output topic. I'm not sure what mistake I'm making. Can someone please check and let me know where I'm wrong? Here is the code I'm using.
KafkaSource<EventMessage> source = KafkaSource.<EventMessage>builder()
.setBootstrapServers(LOCAL_KAFKA_BROKER)
.setTopics(INPUT_KAFKA_TOPIC)
.setGroupId(LOCAL_GROUP)
.setStartingOffsets(OffsetsInitializer.earliest())
.setValueOnlyDeserializer(new InputDeserializationSchema())
.build();
WindowAssigner<Object, TimeWindow> windowAssigner = TumblingEventTimeWindows.of(WINDOW_SIZE);
DataStream<EventMessage> eventStream = env.fromSource(source, WatermarkStrategy.noWatermarks(), "Event Source");
DataStream<EventSummary> events =
eventStream
.keyBy(eventMessage -> eventMessage.getCategory() + eventMessage.getEventId())
.window(windowAssigner)
.aggregate(new EventAggregator())
.name("EventAggregator test >> ");
KafkaSink<EventSummary> sink = KafkaSink.<EventSummary>builder()
.setBootstrapServers(LOCAL_KAFKA_BROKER)
.setRecordSerializer(KafkaRecordSerializationSchema.builder()
.setTopic(OUTPUT_KAFKA_TOPIC)
.setValueSerializationSchema(new OutputSummarySerializationSchema())
.build())
.setDeliverGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
.build();
events.sinkTo(sink);
These are the POJOs I've created for the input message and the output.
# EventMessage POJO
public class EventMessage implements Serializable {
private Long timestamp;
private int eventValue;
private String eventId;
private String category;
public EventMessage() { }
public EventMessage(Long timestamp, int eventValue, String eventId, String category) {
this.timestamp = timestamp;
this.eventValue = eventValue;
this.eventId = eventId;
this.category = category;
}
.....
}
# EventSummary POJO
public class EventSummary {
public EventMessage eventMessage;
public int sum;
public int count;
public EventSummary() { }
....
}
These are the deserialization and serialization schemas I'm using.
public class InputDeserializationSchema implements DeserializationSchema<EventMessage> {
static ObjectMapper objectMapper = new ObjectMapper();
@Override
public EventMessage deserialize(byte[] bytes) throws IOException {
return objectMapper.readValue(bytes, EventMessage.class);
}
@Override
public boolean isEndOfStream(EventMessage inputMessage) {
return false;
}
@Override
public TypeInformation<EventMessage> getProducedType() {
return TypeInformation.of(EventMessage.class);
}
}
public class OutputSummarySerializationSchema implements SerializationSchema<EventSummary> {
static ObjectMapper objectMapper = new ObjectMapper();
Logger logger = LoggerFactory.getLogger(OutputSummarySerializationSchema.class);
@Override
public byte[] serialize(EventSummary eventSummary) {
if (objectMapper == null) {
objectMapper = new ObjectMapper();
objectMapper.setVisibility(PropertyAccessor.FIELD, JsonAutoDetect.Visibility.ANY);
}
try {
String json = objectMapper.writeValueAsString(eventSummary);
return json.getBytes();
} catch (com.fasterxml.jackson.core.JsonProcessingException e) {
logger.error("Failed to parse JSON", e);
}
return new byte[0];
}
}
I'm using this aggregator for aggregating the JSON messages.
public class EventAggregator implements AggregateFunction<EventMessage, EventSummary, EventSummary> {
private static final Logger log = LoggerFactory.getLogger(EventAggregator.class);
@Override
public EventSummary createAccumulator() {
return new EventSummary();
}
@Override
public EventSummary add(EventMessage eventMessage, EventSummary eventSummary) {
eventSummary.eventMessage = eventMessage;
eventSummary.count += 1;
eventSummary.sum += eventMessage.getEventValue();
return eventSummary;
}
@Override
public EventSummary getResult(EventSummary eventSummary) {
return eventSummary;
}
@Override
public EventSummary merge(EventSummary summary1, EventSummary summary2) {
return new EventSummary(null,
summary1.sum + summary2.sum,
summary1.count + summary2.count);
}
}
Can someone help me with this?
Thanks in advance.

In order for event time windowing to work, you must specify a proper WatermarkStrategy. Otherwise, the windows will never close, and no results will be produced.
The role that watermarks play is to mark a place in a stream, and indicate that the stream is, at that point, complete through some specific timestamp. Until receiving this indicator of stream completeness, windows continue to wait for more events to be assigned to them.
To simplify debugging the watermarks, you might switch to a PrintSink until you get the watermarking working properly. Or, to simplify debugging the KafkaSink, you could switch to processing-time windows until the sink is working.
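For example, here is a minimal sketch of a suitable strategy, assuming EventMessage.getTimestamp() returns the event time in epoch milliseconds and that events arrive at most five seconds out of order (both assumptions, since those details aren't shown in the question):

// Sketch only: bounded-out-of-orderness watermarks driven by the event's own timestamp.
WatermarkStrategy<EventMessage> watermarkStrategy =
        WatermarkStrategy.<EventMessage>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                .withTimestampAssigner((event, recordTimestamp) -> event.getTimestamp())
                // keep watermarks advancing even if some Kafka partitions go quiet
                .withIdleness(Duration.ofMinutes(1));

DataStream<EventMessage> eventStream =
        env.fromSource(source, watermarkStrategy, "Event Source");

Passing this strategy to fromSource() in place of WatermarkStrategy.noWatermarks() lets the tumbling event-time windows close and emit results to the sink.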

Related

Springboot Generic JPA AttributeConverter

I have the sample code below; I am trying to write a generic JPA converter which can convert:
a collection of user-defined objects to JSON
and vice versa
Below is the sample code I wrote to achieve this, but it doesn't look correct.
Please take a look.
To be more clear, I need something like this:
List to JSON string
JSON string to List
Please suggest.
@Converter(autoApply = true)
public class SetJsonConverter<E extends Collections> implements AttributeConverter<E, Object> {
@Override
public Object convertToDatabaseColumn(E e) {
return null;
}
@Override
public E convertToEntityAttribute(Object o) {
ObjectMapper objectMapper=new ObjectMapper();
return null;
}
}
JPA will not automatically handle generic converters; each collection type and element type requires its own subclass. You will need to define the base converter as follows:
public class AbstractJsonConverter<T, C extends Collection<T>> implements AttributeConverter<C, String> {
private final ObjectMapper objectMapper;
private final TypeReference<C> collectionType;
public AbstractJsonConverter(ObjectMapper objectMapper, Class<T> elementType, TypeReference<C> collectionType) {
this.objectMapper = objectMapper;
this.collectionType = collectionType;
}
@Override
public String convertToDatabaseColumn(C collection) {
try {
return objectMapper.writeValueAsString(collection);
} catch (JsonProcessingException e) {
throw new RuntimeException(e);
}
}
@Override
public C convertToEntityAttribute(String jsonString) {
try {
return objectMapper.readValue(jsonString, collectionType);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
You then define specific converters as:
@Converter(autoApply = true)
public class UserSetConverter extends AbstractJsonConverter<User, Set<User>> {
public UserSetConverter(ObjectMapper objectMapper) {
super(objectMapper, User.class, new TypeReference<Set<User>>() {});
}
}
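To apply it, reference the converter on the entity attribute. Here is a minimal sketch with a hypothetical Account entity, assuming the collection is persisted into a text column as JSON:

@Entity
public class Account {

    @Id
    private Long id;

    // Stored as a JSON string; UserSetConverter handles the (de)serialization.
    @Convert(converter = UserSetConverter.class)
    @Column(columnDefinition = "text")
    private Set<User> users;
}

Note that plain JPA instantiates converters through a no-arg constructor, so the ObjectMapper constructor injection above only works if the provider is wired to obtain converters from the Spring context; otherwise create the ObjectMapper inside the converter.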

Dynamic Merge of Infinite Reactor streams

Use case:
There is a module which listens for events in synchronous mode. In the same module, using an EmitterProcessor, the events are converted to a Flux and exposed as an infinite stream of events. There is also an upstream module which can subscribe to these event streams. The problem is how to dynamically merge these streams into one and then subscribe to that single stream. As a simple example, say there are N sensors; we want to be able to register these sensors dynamically and listen to their measurements as one merged stream of data. Here is the code sample written to mock this behavior.
Create callback and start listening for events
public interface CallBack {
void callBack(int name);
void done();
}
@Slf4j
@RequiredArgsConstructor
public class CallBackService {
private CallBack callBack;
private final Function<Integer, Integer> func;
public void register(CallBack intf) {
this.callBack = intf;
}
public void startServer() {
log.info("Callback started..");
IntStream.range(0, 10).forEach(i -> {
callBack.callBack(func.apply(i));
try {
Thread.sleep(3000);
} catch (InterruptedException e) {
e.printStackTrace();
}
});
log.info("Callback finished..");
callBack.done();
}
}
Convert the events to streams using an EmitterProcessor
@Slf4j
public class EmitterService implements CallBack {
private EmitterProcessor<Integer> emitterProcessor;
public EmitterService(){
emitterProcessor = EmitterProcessor.create();
}
public EmitterProcessor<Integer> getEmmitor() {
return emitterProcessor;
}
@Override
public void callBack(int name) {
log.info("callbakc {} invoked", name);
//fluxSink.next(name);
emitterProcessor.onNext(name);
}
public void done() {
//fluxSink.complete();
emitterProcessor.onComplete();
}
}
public class WrapperService {
EmitterService service1;
ExecutorService service2;
public Flux<Integer> startService(Function<Integer, Integer> func) {
CallBackService service = new CallBackService(func);
service1 = new EmitterService();
service.register(service1);
service2 = Executors.newSingleThreadExecutor();
service2.submit(service::startServer);
return service1.getEmmitor();
}
public void shutDown() {
service1.getEmmitor().onComplete();
service2.shutdown();
}
}
Subscribe to the events
@Slf4j
public class MainService {
public static void main(String[] args) throws InterruptedException {
TopicProcessor<Integer> stealer = TopicProcessor.<Integer>builder().share(true).build();
CountDownLatch latch = new CountDownLatch(20);
WrapperService n1 =new WrapperService();
WrapperService n2 =new WrapperService();
// n1.startService(i->i).mergeWith(n2.startService(i->i*2)).subscribe(stealer);
n1.startService(i->i).subscribe(stealer);
n2.startService(i->i*2).subscribe(stealer);
stealer.subscribeOn(Schedulers.boundedElastic())
.subscribe(x->{
log.info("Stole=>{}", x);
latch.countDown();
log.info("Latch count=>{}", latch.getCount());
});
latch.await();
n1.shutDown();
n2.shutDown();
stealer.shutdown();
}
}
I tried to use a TopicProcessor with no success. In the code above, the subscription happens for the first source; for the second source there is no subscription. However, if I use n1.startService(i->i).mergeWith(n2.startService(i->i*2)).subscribe(stealer); the subscription works, but there is no dynamic behavior in that case, since the subscriber has to be changed every time a source is added.
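For reference, one way to get the dynamic behavior is to flatten a stream of sources: publish each newly registered source into a shared sink of publishers and flatMap it into a single Flux. A rough sketch (assuming Reactor 3.4+, where the Sinks API replaces EmitterProcessor/TopicProcessor, and reusing n1, n2 and log from the code above):

// Sketch only: a sink that accepts whole sources; flatMap merges them as they arrive.
Sinks.Many<Flux<Integer>> sources = Sinks.many().multicast().onBackpressureBuffer();
Flux<Integer> merged = sources.asFlux().flatMap(flux -> flux);

merged.subscribe(x -> log.info("Stole=>{}", x));

// New sources can be registered at any time without touching the subscriber.
sources.tryEmitNext(n1.startService(i -> i));
sources.tryEmitNext(n2.startService(i -> i * 2));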

multi-tenant application in spring - connecting to DB

Hi Experts,
I am working on a multi-tenant project. It's a table-per-tenant architecture.
We are using Spring and JPA (EclipseLink) for this purpose.
Our use case is that whenever a new customer subscribes to our application, a new database is created for that customer.
Since the Spring configuration is loaded only during start-up, how can we load this new DB configuration at runtime?
Could some one please give some pointers?
Thanks in advance.
BR,
kitty
For multi-tenancy, first you need to create MultitenantConfig.java like the file below.
Here tenants.get("Musa") is my tenant name, which comes from the application.properties file.
@Configuration
@EnableConfigurationProperties(MultitenantProperties.class)
public class MultiTenantConfig extends WebMvcConfigurerAdapter {
/** The Constant log. */
private static final Logger log = LoggerFactory.getLogger(MultiTenantConfig.class);
/** The multitenant config. */
@Autowired
private MultitenantProperties multitenantConfig;
@Override
public void addInterceptors(InterceptorRegistry registry) {
registry.addInterceptor(new MultiTenancyInterceptor());
}
/**
* Data source.
*
* @return the data source
*/
@Bean
public DataSource dataSource() {
Map<Object, Object> tenants = getTenants();
MultitenantDataSource multitenantDataSource = new MultitenantDataSource();
multitenantDataSource.setDefaultTargetDataSource(tenants.get("Musa"));
multitenantDataSource.setTargetDataSources(tenants);
// Call this to finalize the initialization of the data source.
multitenantDataSource.afterPropertiesSet();
return multitenantDataSource;
}
/**
* Gets the tenants.
*
* @return the tenants
*/
private Map<Object, Object> getTenants() {
Map<Object, Object> resolvedDataSources = new HashMap<>();
for (Tenant tenant : multitenantConfig.getTenants()) {
DataSourceBuilder dataSourceBuilder = new DataSourceBuilder(this.getClass().getClassLoader());
dataSourceBuilder.driverClassName(tenant.getDriverClassName()).url(tenant.getUrl())
.username(tenant.getUsername()).password(tenant.getPassword());
DataSource datasource = dataSourceBuilder.build();
for (String prop : tenant.getTomcat().keySet()) {
try {
BeanUtils.setProperty(datasource, prop, tenant.getTomcat().get(prop));
} catch (IllegalAccessException | InvocationTargetException e) {
log.error("Could not set property " + prop + " on datasource " + datasource);
}
}
log.info(datasource.toString());
resolvedDataSources.put(tenant.getName(), datasource);
}
return resolvedDataSources;
}
}
public class MultitenantDataSource extends AbstractRoutingDataSource {
@Override
protected Object determineCurrentLookupKey() {
return TenantContext.getCurrentTenant();
}
}
public class MultiTenancyInterceptor extends HandlerInterceptorAdapter {
@Override
public boolean preHandle(HttpServletRequest req, HttpServletResponse res, Object handler) {
TenantContext.setCurrentTenant("Musa");
return true;
}
}
@ConfigurationProperties(prefix = "multitenancy")
public class MultitenantProperties {
public static final String CURRENT_TENANT_IDENTIFIER = "tenantId";
public static final int CURRENT_TENANT_SCOPE = 0;
private List<Tenant> tenants;
public List<Tenant> getTenants() {
return tenants;
}
public void setTenants(List<Tenant> tenants) {
this.tenants = tenants;
}
}
public class Tenant {
private String name;
private String url;
private String driverClassName;
private String username;
private String password;
private Map<String,String> tomcat;
// setters and getters
}
public class TenantContext {
private static ThreadLocal<Object> currentTenant = new ThreadLocal<>();
public static void setCurrentTenant(Object tenant) {
currentTenant.set(tenant);
}
public static Object getCurrentTenant() {
return currentTenant.get();
}
}
Add the properties below to application.properties:
multitenancy.tenants[0].name=Musa
multitenancy.tenants[0].url=<url>
multitenancy.tenants[0].username=<username>
multitenancy.tenants[0].password=<password>
multitenancy.tenants[0].driver-class-name=<driverclass>

Kafka custom deserializer converting to Java object

I'm using the Spring Kafka integration and I have my own generic value serializer/deserializer, as shown below.
Serializer:
public class KafkaSerializer<T> implements Serializer<T> {
private ObjectMapper mapper;
@Override
public void close() {
}
@Override
public void configure(final Map<String, ?> settings, final boolean isKey) {
mapper = new ObjectMapper();
}
@Override
public byte[] serialize(final String topic, final T object) {
try {
return mapper.writeValueAsBytes(object);
} catch (final JsonProcessingException e) {
throw new IllegalArgumentException(e);
}
}
}
Deserializer:
public class KafkaDeserializer<T> implements Deserializer<T> {
private ObjectMapper mapper;
@Override
public void close() {
}
@Override
public void configure(final Map<String, ?> settings, final boolean isKey) {
mapper = new ObjectMapper();
}
@Override
public T deserialize(final String topic, final byte[] bytes) {
try {
return mapper.readValue(bytes, new TypeReference<T>() {
});
} catch (final IOException e) {
throw new IllegalArgumentException(e);
}
}
}
The serializer is working perfectly, but when it comes to deserializing values while consuming messages I get a LinkedHashMap instead of the desired object. Please enlighten me as to where I'm going wrong. Thanks in advance.
Some things need to be confirmed first:
your Serializer works
the Deserializer also works, but it returns a LinkedHashMap instead of the object you expected, right? And you can't convert that LinkedHashMap to your object.
If so, the question becomes how to convert/cast a LinkedHashMap to an object, and you are already using ObjectMapper. A good post that may answer your question is Casting LinkedHashMap to Complex Object:
mapper.convertValue(desiredObject, new TypeReference<type-of-desiredObject>() { })
ObjectMapper's API is documented [here](https://fasterxml.github.io/jackson-databind/javadoc/2.3.0/com/fasterxml/jackson/databind/ObjectMapper.html#convertValue(java.lang.Object, com.fasterxml.jackson.core.type.TypeReference)).
I hope I haven't missed your intention; please add any missing details so that someone (or I) can improve this answer.
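The root cause is type erasure: new TypeReference<T>() {} built from the type variable T carries no runtime type information, so Jackson falls back to LinkedHashMap. Besides converting afterwards with convertValue, a sketch of an alternative fix is to hand the deserializer the concrete class up front (the class name TypedKafkaDeserializer is just illustrative, and this assumes a kafka-clients version where configure and close have default implementations):

public class TypedKafkaDeserializer<T> implements Deserializer<T> {

    private final ObjectMapper mapper = new ObjectMapper();
    private final Class<T> targetType;

    // The concrete class must be supplied explicitly.
    public TypedKafkaDeserializer(final Class<T> targetType) {
        this.targetType = targetType;
    }

    @Override
    public T deserialize(final String topic, final byte[] bytes) {
        try {
            return bytes == null ? null : mapper.readValue(bytes, targetType);
        } catch (final IOException e) {
            throw new IllegalArgumentException(e);
        }
    }
}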

Get current resource name using MultiResourceItemReader Spring batch

I am using MultiResourceItemReader in Spring Batch to read multiple XML files, and I want to get the current resource. Here is my configuration:
public class MultiFileResourcePartitioner extends MultiResourceItemReader<MyObject> {
@Override
public void update(final ExecutionContext pExecutionContext) throws ItemStreamException {
super.update(pExecutionContext);
if (getCurrentResource() != null && getCurrentResource().getFilename() != null) {
System.out.println("update:" + getCurrentResource().getFilename());
}
}
}
And my reader:
<bean id="myMultiSourceReader"
class="mypackage.MultiFileResourcePartitioner">
<property name="resources" value="file:${input.directory}/*.xml" />
</bean>
The code above reads the XML files correctly, but the method getCurrentResource() returns null.
While debugging, I can see that the batch does enter the update method.
Please help!
There is a specific interface for this problem called ResourceAware: its purpose is to inject the current resource into objects read from a MultiResourceItemReader.
Check this thread for further information.
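Here is a minimal sketch of the ResourceAware route, assuming MyObject is the item type produced by the reader: when the item implements the interface, MultiResourceItemReader injects the current resource into each item it reads, so no reader subclass is needed.

public class MyObject implements ResourceAware {

    private Resource resource;

    // MultiResourceItemReader calls this for every item it reads.
    @Override
    public void setResource(final Resource resource) {
        this.resource = resource;
    }

    public String getSourceFileName() {
        return resource != null ? resource.getFilename() : null;
    }
}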
I tried it with a simple listener that logs the current resource from an injected {@link MultiResourceItemReader} and saves the value to the StepExecutionContext.
To get it working with a step-scoped MultiResourceItemReader I access the proxy directly; see http://forum.springsource.org/showthread.php?120775-Accessing-the-currently-processing-filename, https://gist.github.com/1582202 and https://jira.springsource.org/browse/BATCH-1831.
public class GetCurrentResourceChunkListener implements ChunkListener, StepExecutionListener {
private StepExecution stepExecution;
private Object proxy;
private final List<String> fileNames = new ArrayList<>();
public void setProxy(Object mrir) {
this.proxy = mrir;
}
@Override
public void beforeStep(StepExecution stepExecution) {
this.stepExecution = stepExecution;
}
@Override
public ExitStatus afterStep(StepExecution stepExecution) {
return stepExecution.getExitStatus();
}
@Override
public void beforeChunk(ChunkContext cc) {
if (proxy instanceof Advised) {
try {
Advised advised = (Advised) proxy;
Object obj = advised.getTargetSource().getTarget();
MultiResourceItemReader mrirTarget = (MultiResourceItemReader) obj;
if (mrirTarget != null
&& mrirTarget.getCurrentResource() != null
&& !fileNames.contains(mrirTarget.getCurrentResource().getFilename())) {
String fileName = mrirTarget.getCurrentResource().getFilename();
fileNames.add(fileName);
String index = String.valueOf(fileNames.indexOf(fileName));
stepExecution.getExecutionContext().put("current.resource" + index, fileName);
}
} catch (Exception ex) {
throw new RuntimeException(ex);
}
}
}
@Override
public void afterChunk(ChunkContext cc) {
}
@Override
public void afterChunkError(ChunkContext cc) {
}
}
see https://github.com/langmi/spring-batch-examples-playground for a working example - look for "GetCurrentResource..."
public class CpsFileItemProcessor implements ItemProcessor<T, T> {
@Autowired
MultiResourceItemReader multiResourceItemReader;
private String fileName;
@Override
public FileDetailsEntityTemp process(T item) {
if(multiResourceItemReader.getCurrentResource()!=null){
fileName = multiResourceItemReader.getCurrentResource().getFilename();
}
item.setFileName(fileName);
return item;
}
}