Kafka custom deserializer converting to Java object

I'm using Spring Kafka integration and I have my own generic value serializer/deserializer, as shown below.
Serializer:
public class KafkaSerializer<T> implements Serializer<T> {

    private ObjectMapper mapper;

    @Override
    public void close() {
    }

    @Override
    public void configure(final Map<String, ?> settings, final boolean isKey) {
        mapper = new ObjectMapper();
    }

    @Override
    public byte[] serialize(final String topic, final T object) {
        try {
            return mapper.writeValueAsBytes(object);
        } catch (final JsonProcessingException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
Deserializer:
public class KafkaDeserializer<T> implements Deserializer<T> {

    private ObjectMapper mapper;

    @Override
    public void close() {
    }

    @Override
    public void configure(final Map<String, ?> settings, final boolean isKey) {
        mapper = new ObjectMapper();
    }

    @Override
    public T deserialize(final String topic, final byte[] bytes) {
        try {
            return mapper.readValue(bytes, new TypeReference<T>() {
            });
        } catch (final IOException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
The serializer works perfectly, but when it comes to deserializing values while consuming messages I get a LinkedHashMap instead of the desired object. Please enlighten me on where I'm going wrong. Thanks in advance.

A few things to confirm first:
your Serializer works
the Deserializer also works, but it returns a LinkedHashMap instead of the object you expected, and you can't convert that LinkedHashMap to your object.
If both are confirmed, the question boils down to how to convert/cast a LinkedHashMap to an object, and you are already using ObjectMapper. This post may answer your question: Casting LinkedHashMap to Complex Object
mapper.convertValue(desiredObject, new TypeReference<type-of-desiredObject>() { })
The ObjectMapper API is documented [here](https://fasterxml.github.io/jackson-databind/javadoc/2.3.0/com/fasterxml/jackson/databind/ObjectMapper.html#convertValue(java.lang.Object, com.fasterxml.jackson.core.type.TypeReference))
I hope I haven't missed your intention; please add any missing details so that someone (or I) can improve this answer.
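For illustration, here is a minimal sketch of the convertValue approach on the consumer side. MyEvent stands in for your actual payload class and consumedValue is the LinkedHashMap returned by the generic deserializer; both names are placeholders, not taken from the original code:
ObjectMapper mapper = new ObjectMapper();
// convertValue re-maps the LinkedHashMap fields onto the concrete class
MyEvent event = mapper.convertValue(consumedValue, new TypeReference<MyEvent>() { });
The underlying reason for the LinkedHashMap is type erasure: new TypeReference<T>() {} cannot recover the actual type bound to T at runtime, so Jackson falls back to its default map representation for JSON objects.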

Related

Custom avro message deserialization with Flink

The Flink consumer application I am developing reads from multiple Kafka topics. The messages published in the different topics adhere to the same schema (formatted as Avro). For schema management, I am using the Confluent Schema Registry.
I have been using the following snippet for the KafkaSource and it works just fine.
KafkaSource<MyObject> source = KafkaSource.<MyObject>builder()
        .setBootstrapServers(BOOTSTRAP_SERVERS)
        .setTopics(TOPIC-1, TOPIC-2)
        .setGroupId(GROUP_ID)
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setValueOnlyDeserializer(ConfluentRegistryAvroDeserializationSchema.forSpecific(MyObject.class, SCHEMA_REGISTRY_URL))
        .build();
Now, I want to determine the topic-name for each message that I process. Since the current deserializer is ValueOnly, I started looking into the setDeserializer() method which I felt would give me access to the whole ConsumerRecord object and I can fetch the topic-name from that.
However, I am unable to figure out how to use that implementation. Should I implement my own deserializer? If so, how does the Schema registry fit into that implementation?
You can use the setDeserializer method with a KafkaRecordDeserializationSchema that might look something like this:
public class KafkaUsageRecordDeserializationSchema
        implements KafkaRecordDeserializationSchema<UsageRecord> {

    private static final long serialVersionUID = 1L;

    private transient ObjectMapper objectMapper;

    @Override
    public void open(DeserializationSchema.InitializationContext context) throws Exception {
        KafkaRecordDeserializationSchema.super.open(context);
        objectMapper = JsonMapper.builder().build();
    }

    @Override
    public void deserialize(
            ConsumerRecord<byte[], byte[]> consumerRecord,
            Collector<UsageRecord> collector) throws IOException {
        collector.collect(objectMapper.readValue(consumerRecord.value(), UsageRecord.class));
    }

    @Override
    public TypeInformation<UsageRecord> getProducedType() {
        return TypeInformation.of(UsageRecord.class);
    }
}
Then you can use the ConsumerRecord to access the topic and other metadata.
I took inspiration from the above answer (by David) and added the following custom deserializer -
KafkaSource<Event> source = KafkaSource.<Event>builder()
        .setBootstrapServers(BOOTSTRAP_SERVERS)
        .setTopics(TOPIC-1, TOPIC-2)
        .setGroupId(GROUP_ID)
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setDeserializer(KafkaRecordDeserializationSchema.of(new KafkaDeserializationSchema<Event>() {

            DeserializationSchema<MyObject> deserializationSchema =
                    ConfluentRegistryAvroDeserializationSchema.forSpecific(MyObject.class, SCHEMA_REGISTRY_URL);

            @Override
            public boolean isEndOfStream(Event nextElement) {
                return false;
            }

            @Override
            public Event deserialize(ConsumerRecord<byte[], byte[]> consumerRecord) throws Exception {
                Event event = new Event();
                event.setTopicName(consumerRecord.topic());
                event.setMyObject(deserializationSchema.deserialize(consumerRecord.value()));
                return event;
            }

            @Override
            public TypeInformation<Event> getProducedType() {
                return TypeInformation.of(Event.class);
            }
        })).build();
The Event class is a wrapper over the MyObject class with additional field for storing the topic name.
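For reference, a minimal sketch of what such a wrapper might look like; the field and accessor names are assumptions that merely match the calls in the snippet above:
public class Event implements Serializable {
    private String topicName;   // topic the record was consumed from
    private MyObject myObject;  // the Avro-deserialized payload

    public String getTopicName() { return topicName; }
    public void setTopicName(String topicName) { this.topicName = topicName; }
    public MyObject getMyObject() { return myObject; }
    public void setMyObject(MyObject myObject) { this.myObject = myObject; }
}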

Flink - KafkaSink not writing data to kafka topic

I'm trying to read JSON events from Kafka, aggregate them by eventId and category, and write the results to a different Kafka topic through Flink. The program is able to read messages from Kafka, but the KafkaSink is not writing the data back to the other Kafka topic. I'm not sure what mistake I'm making. Can someone please check and let me know where I'm wrong? Here is the code I'm using.
KafkaSource<EventMessage> source = KafkaSource.<EventMessage>builder()
        .setBootstrapServers(LOCAL_KAFKA_BROKER)
        .setTopics(INPUT_KAFKA_TOPIC)
        .setGroupId(LOCAL_GROUP)
        .setStartingOffsets(OffsetsInitializer.earliest())
        .setValueOnlyDeserializer(new InputDeserializationSchema())
        .build();

WindowAssigner<Object, TimeWindow> windowAssigner = TumblingEventTimeWindows.of(WINDOW_SIZE);

DataStream<EventMessage> eventStream = env.fromSource(source, WatermarkStrategy.noWatermarks(), "Event Source");

DataStream<EventSummary> events =
        eventStream
                .keyBy(eventMessage -> eventMessage.getCategory() + eventMessage.getEventId())
                .window(windowAssigner)
                .aggregate(new EventAggregator())
                .name("EventAggregator test >> ");

KafkaSink<EventSummary> sink = KafkaSink.<EventSummary>builder()
        .setBootstrapServers(LOCAL_KAFKA_BROKER)
        .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                .setTopic(OUTPUT_KAFKA_TOPIC)
                .setValueSerializationSchema(new OutputSummarySerializationSchema())
                .build())
        .setDeliverGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
        .build();

events.sinkTo(sink);
These are the POJOs I've created for the input message and the output.
// EventMessage POJO
public class EventMessage implements Serializable {
    private Long timestamp;
    private int eventValue;
    private String eventId;
    private String category;

    public EventMessage() { }

    public EventMessage(Long timestamp, int eventValue, String eventId, String category) {
        this.timestamp = timestamp;
        this.eventValue = eventValue;
        this.eventId = eventId;
        this.category = category;
    }
    .....
}
// EventSummary POJO
public class EventSummary {
    public EventMessage eventMessage;
    public int sum;
    public int count;

    public EventSummary() { }
    ....
}
These are the deserialization and serialization schemas I'm using.
public class InputDeserializationSchema implements DeserializationSchema<EventMessage> {

    static ObjectMapper objectMapper = new ObjectMapper();

    @Override
    public EventMessage deserialize(byte[] bytes) throws IOException {
        return objectMapper.readValue(bytes, EventMessage.class);
    }

    @Override
    public boolean isEndOfStream(EventMessage inputMessage) {
        return false;
    }

    @Override
    public TypeInformation<EventMessage> getProducedType() {
        return TypeInformation.of(EventMessage.class);
    }
}
public class OutputSummarySerializationSchema implements SerializationSchema<EventSummary> {

    static ObjectMapper objectMapper = new ObjectMapper();
    Logger logger = LoggerFactory.getLogger(OutputSummarySerializationSchema.class);

    @Override
    public byte[] serialize(EventSummary eventSummary) {
        if (objectMapper == null) {
            objectMapper = new ObjectMapper();
            objectMapper.setVisibility(PropertyAccessor.FIELD, JsonAutoDetect.Visibility.ANY);
        }
        try {
            String json = objectMapper.writeValueAsString(eventSummary);
            return json.getBytes();
        } catch (com.fasterxml.jackson.core.JsonProcessingException e) {
            logger.error("Failed to parse JSON", e);
        }
        return new byte[0];
    }
}
I'm using this aggregator for aggregating the JSON messages.
public class EventAggregator implements AggregateFunction<EventMessage, EventSummary, EventSummary> {

    private static final Logger log = LoggerFactory.getLogger(EventAggregator.class);

    @Override
    public EventSummary createAccumulator() {
        return new EventSummary();
    }

    @Override
    public EventSummary add(EventMessage eventMessage, EventSummary eventSummary) {
        eventSummary.eventMessage = eventMessage;
        eventSummary.count += 1;
        eventSummary.sum += eventMessage.getEventValue();
        return eventSummary;
    }

    @Override
    public EventSummary getResult(EventSummary eventSummary) {
        return eventSummary;
    }

    @Override
    public EventSummary merge(EventSummary summary1, EventSummary summary2) {
        return new EventSummary(null,
                summary1.sum + summary2.sum,
                summary1.count + summary2.count);
    }
}
Can someone help me on this?
Thanks in advance.
In order for event time windowing to work, you must specify a proper WatermarkStrategy. Otherwise, the windows will never close, and no results will be produced.
The role that watermarks play is to mark a place in a stream, and indicate that the stream is, at that point, complete through some specific timestamp. Until receiving this indicator of stream completeness, windows continue to wait for more events to be assigned to them.
To simplify debugging the watermarks, you might switch to a PrintSink until you get the watermarking working properly. Or, to simplify debugging the KafkaSink, you could switch to processing time windows until the sink is working.
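For illustration, a minimal sketch of a watermark strategy keyed off the EventMessage timestamp field; the ten-second out-of-orderness bound is an assumption you would tune for your data, and it assumes EventMessage exposes a getTimestamp() accessor for its timestamp field:
WatermarkStrategy<EventMessage> watermarkStrategy =
        WatermarkStrategy.<EventMessage>forBoundedOutOfOrderness(Duration.ofSeconds(10))
                .withTimestampAssigner((event, recordTimestamp) -> event.getTimestamp());

// Replaces WatermarkStrategy.noWatermarks() so that event time windows can close
DataStream<EventMessage> eventStream =
        env.fromSource(source, watermarkStrategy, "Event Source");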

spring.json.type.mapping property ignored when specifying a custom ObjectMapper

I had an issue with injecting a custom ObjectMapper into the Spring Kafka serializer, which I resolved with this answer; LocalDateTime values are now serialized with the right pattern.
@Configuration
public class KafkaCustomizer implements DefaultKafkaProducerFactoryCustomizer {

    @Bean
    public ObjectMapper objectMapper() {
        var mapper = new ObjectMapper();
        var module = new JavaTimeModule();
        var serializer = new LocalDateTimeSerializer(
                DateTimeFormatter.ofPattern(DateConstants.DATETIME_FORMAT_PATTERN));
        module.addSerializer(LocalDateTime.class, serializer);
        mapper.registerModule(module);
        mapper.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS);
        return mapper;
    }

    @Override
    public void customize(DefaultKafkaProducerFactory<?, ?> producerFactory) {
        producerFactory.setValueSerializer(new JsonSerializer<>(objectMapper()));
    }
}
But now I face another problem, the spring.kafka.producer.properties.spring.json.type.mapping property is being ignored.
The __TypeId__ header of my record is set with the FQCN and not with the token I put in the spring.json.type.mapping property: foo > com.foo.package.Foo
When I debugged it, it seems the configure method of the org.springframework.kafka.support.serializer.JsonSerializer class is never invoked:
@Override
public void configure(Map<String, ?> configs, boolean isKey) {
    ...
    if (configs.containsKey(TYPE_MAPPINGS) && !this.typeMapperExplicitlySet
            && this.typeMapper instanceof AbstractJavaTypeMapper) {
        ((AbstractJavaTypeMapper) this.typeMapper)
                .setIdClassMapping(createMappings((String) configs.get(TYPE_MAPPINGS)));
    }
}
But when I disable the customization
@Override
public void customize(DefaultKafkaProducerFactory<?, ?> producerFactory) {
    // producerFactory.setValueSerializer(new JsonSerializer<>(objectMapper()));
}
Then the __TypeId__ header is set with the right token, but as expected I lose the date format from my custom ObjectMapper.
So how do I handle this whole situation?
If you create your own new JsonSerializer<>, you are on your own to feed it the appropriate producer configs. When the serializer instance is not controlled by the Kafka client, its configure() is not called.
I would say it is possible to do it like this in your case:
public void customize(DefaultKafkaProducerFactory<?, ?> producerFactory) {
    JsonSerializer<Object> jsonSerializer = new JsonSerializer<>(objectMapper());
    jsonSerializer.configure(producerFactory.getConfigurationProperties(), false);
    producerFactory.setValueSerializer(jsonSerializer);
}
There is some info in docs: https://docs.spring.io/spring-kafka/docs/current/reference/html/#tip-json, but probably we need to extend it for the programmatic configuration case...
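For reference, and assuming the token and class name from the question, the property would look something like this in application.properties; everything under spring.kafka.producer.properties.* is passed straight through into the producer config map, which is what configure() receives:
spring.kafka.producer.properties.spring.json.type.mapping=foo:com.foo.package.Foo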

Springboot Generic JPA AttributeConverter

I have the sample code below. I am trying to write a generic JPA converter that can convert a collection of user-defined objects to JSON, and vice versa.
Below is the sample code I used to try to achieve this, but it doesn't look correct.
Please take a look.
To be more clear, I need the following:
List to JSON string
JSON string to List
Please suggest.
@Converter(autoApply = true)
public class SetJsonConverter<E extends Collections> implements AttributeConverter<E, Object> {

    @Override
    public Object convertToDatabaseColumn(E e) {
        return null;
    }

    @Override
    public E convertToEntityAttribute(Object o) {
        ObjectMapper objectMapper = new ObjectMapper();
        return null;
    }
}
JPA will not automatically handle generic converters; each collection type and element type requires its own subclass. You will need to define the base converter as follows:
public class AbstractJsonConverter<T, C extends Collection<T>> implements AttributeConverter<C, String> {

    private final ObjectMapper objectMapper;
    private final TypeReference<C> collectionType;

    public AbstractJsonConverter(ObjectMapper objectMapper, Class<T> elementType, TypeReference<C> collectionType) {
        this.objectMapper = objectMapper;
        this.collectionType = collectionType;
    }

    @Override
    public String convertToDatabaseColumn(C collection) {
        try {
            return objectMapper.writeValueAsString(collection);
        } catch (JsonProcessingException e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public C convertToEntityAttribute(String jsonString) {
        try {
            return objectMapper.readValue(jsonString, collectionType);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
You then define specific converters as:
@Converter(autoApply = true)
public class UserSetConverter extends AbstractJsonConverter<User, Set<User>> {

    public UserSetConverter(ObjectMapper objectMapper) {
        super(objectMapper, User.class, new TypeReference<Set<User>>() {});
    }
}
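For illustration, a minimal sketch of how such a converter could be applied to an entity attribute; the entity, column, and field names here are assumptions, and it presumes the converter can be instantiated with its ObjectMapper (for example as a Spring-managed bean):
@Entity
public class Team {

    @Id
    private Long id;

    // Stored as a single JSON text column, converted by UserSetConverter
    @Convert(converter = UserSetConverter.class)
    @Column(columnDefinition = "text")
    private Set<User> members;
}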

How can I return a HashMap from server to client

I have the following method on the server.
public HashMap<String, Set> select()
{
    HashMap<String, Set> mp = new HashMap();
    // some code
    return mp;
}
Whenever I try to return <String, Set> it goes to onFailure, but when I use <String, String> it succeeds. Why is this happening?
I am using GWT RPC and my client code is:
greetingService.select(usertextbox.getText(), new AsyncCallback<HashMap<String, Set>>() {

    public void onFailure(Throwable caught) {
        Window.alert("not done");
    }

    @Override
    public void onSuccess(HashMap hm) {
        Window.alert("done");
    }
});
Service code is
HashMap<String, Set> select(String user);
The service implementation is
public HashMap<String, Set> select(String user)
{
    try {
        Session studentDbSession = new Session("localhost", 5984);
        Database db = studentDbSession.getDatabase("hello");
        Document d = db.getDocument("xyz");
        JSONArray key = d.names().discard(0).discard(0);
        for (int i = 0; i < key.size(); i++)
        {
            if (d.containsKey(key.get(i)))
            {
                k = key.getString(i);
                Set aaa = d.getJSONObject(key.getString(i)).entrySet();
                System.out.println("----------------");
                mp.put(k, aaa);
                return mp;
            }
Always try to avoid raw types. Let me share a sample with you; try this sample at your end first, or use it to validate all the classes in your code.
Sample code:
RemoteService interface
@RemoteServiceRelativePath("greet")
public interface GreetingService extends RemoteService {
    public HashMap<String, Set<String>> select(String input) throws IllegalArgumentException;
}
GreetingServiceAsync interface
public interface GreetingServiceAsync {
    void select(String input, AsyncCallback<HashMap<String, Set<String>>> callback);
}
GreetingServiceImpl class
public class GreetingServiceImpl extends RemoteServiceServlet implements GreetingService {

    @Override
    public HashMap<String, Set<String>> select(String input) throws IllegalArgumentException {
        HashMap<String, Set<String>> output = new HashMap<String, Set<String>>();
        Set<String> set = new HashSet<String>();
        set.add("Hello " + input);
        output.put("greeting", set);
        return output;
    }
}
Entry Point class
public void greetService() {
    GreetingServiceAsync greetingService = GWT.create(GreetingService.class);
    greetingService.select("Mark", new AsyncCallback<HashMap<String, Set<String>>>() {

        @Override
        public void onSuccess(HashMap<String, Set<String>> result) {
            Window.alert(result.get("greeting").iterator().next());
        }

        @Override
        public void onFailure(Throwable caught) {
            Window.alert("fail");
        }
    });
}
web.xml:
<servlet>
    <servlet-name>gwtService</servlet-name>
    <servlet-class>com.x.y.z.server.GWTServiceImpl</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>gwtService</servlet-name>
    <url-pattern>/moduleName/gwtService</url-pattern>
</servlet-mapping>
Output: an alert showing "Hello Mark".
What your GWT RPC call HashMap<String, Set> select(String user); does is the following:
1) client-side: serialize String user in order to send it to the server
2) server-side: deserialize the RPC call, find the implementation of select(String user), and execute it
3) server-side: serialize the return value HashMap<String, Set> in order to return it to the client
4) client-side: deserialize the return value and call the AsyncCallback
The problem lies in step 3), the serializing of HashMap<String, Set>. The HashMap itself is not the issue; it is the Set which causes the error. When serializing a raw class, GWT usually assumes that the generic type is <Object>. And since Object is not serializable in GWT, an exception is thrown.
Fix: as Braj already mentioned, give your Set a serializable generic type, e.g. Set<String>, or define your own interface in a package that is accessible from both the client and server side
public interface UserProperty extends IsSerializable {
}
and change the RPC method like this:
HashMap<String, Set<UserProperty>> select(String user);
Have a look at Braj's answer to see all the places you need to change after changing your RPC method.