Unable to deserialize with a custom Serde using Kafka Streams - Scala

I am trying to create a simple topology that uppercases a Person entity's name using Kafka Streams.
case class Person(id: Int, name: String, age: Int)
My custom Serializer and Deserializer are like this:
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util
import org.apache.kafka.common.serialization.{Deserializer, Serializer}

class KafkaBytesSerializer[T] extends Serializer[T] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = ()

  override def serialize(topic: String, data: T): Array[Byte] = {
    val stream: ByteArrayOutputStream = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(stream)
    oos.writeObject(data)
    oos.close()
    stream.toByteArray
  }

  override def close(): Unit = ()
}

class KafkaBytesDeserializer[T] extends Deserializer[T] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = ()

  override def deserialize(topic: String, data: Array[Byte]): T = {
    val objIn = new ObjectInputStream(new ByteArrayInputStream(data))
    val obj = objIn.readObject().asInstanceOf[T]
    objIn.close()
    obj
  }

  override def close(): Unit = ()
}
The main calling code of the streaming app is this:
import org.apache.kafka.common.serialization.{Serde, Serdes}
import org.apache.kafka.streams.kstream.{Consumed, Produced}
import org.apache.kafka.streams.scala.StreamsBuilder

val personSerde: Serde[Person] =
  Serdes.serdeFrom(new KafkaBytesSerializer[Person], new KafkaBytesDeserializer[Person])

val builder = new StreamsBuilder()
builder
  .stream[String, Person](INPUT_TOPIC)(Consumed.`with`(Serdes.String(), personSerde))
  .map[String, Person]((k, p) => (k, Person(p.id, p.name.toUpperCase(), p.age)))
  .peek((k, p) => println("Key: " + k + " Person: " + p))
  .to(OUTPUT_TOPIC)(Produced.`with`(Serdes.String(), personSerde))
When I run the application, I get a ClassCastException:
[MainApp-consumer-group-b45b436d-1412-494b-9733-f75a61c9b9e3-StreamThread-1] ERROR org.apache.kafka.streams.processor.internals.StreamThread - stream-thread [MainApp-consumer-group-b45b436d-1412-494b-9733-f75a61c9b9e3-StreamThread-1] Encountered the following error during processing:
java.lang.ClassCastException: [B cannot be cast to models.Person
at org.apache.kafka.streams.scala.FunctionsCompatConversions$ValueMapperFromFunction$$anon$6.apply(FunctionsCompatConversions.scala:66)
at org.apache.kafka.streams.kstream.internals.AbstractStream.lambda$withKey$1(AbstractStream.java:103)
at org.apache.kafka.streams.kstream.internals.KStreamMapValues$KStreamMapProcessor.process(KStreamMapValues.java:40)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:201)
I suspect something is going wrong at the deserialization level, but I am not sure why.
Any pointers would be helpful.

The issue is with your producer app. You set value.serializer to com.thebigscale.serdes.PersonSerializer and then try to send an array of bytes. You shouldn't serialize your POJO yourself; the Kafka serializer will do it for you. Just send the Person object instance.
Below I've fixed your code, with comments:
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "com.thebigscale.serdes.PersonSerializer")
val producer = new KafkaProducer[String, Person](props) // <-- Instead BYTE_ARRAY -> Person
val person = new Person(4, "user4", 27)
//val personSerializer = new KafkaBytesSerializer[Person]() // remove
//val bytePerson: BYTE_ARRAY = personSerializer.serialize("", person) // remove
val record = new ProducerRecord[String, Person](KafkaConf.INPUT_TOPIC, "key1", person) // instead BYTE_ARRAY -> Person, bytePerson -> person
producer.send(record, new Callback {
override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit = {
if (exception != null ) {
println("Exception thrown by producer: " + exception)
} else {
println("Record sent successfully: " + metadata)
}
}
})
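For completeness, here is a minimal sketch of the consumer-side wiring that would read these records back. The deserializer class name com.thebigscale.serdes.PersonDeserializer is assumed here, as a hypothetical counterpart to PersonSerializer:

import java.util.Properties
import org.apache.kafka.clients.consumer.KafkaConsumer

// Hypothetical consumer-side configuration; adjust names to your project.
val consumerProps = new Properties()
consumerProps.put("bootstrap.servers", "localhost:9092")
consumerProps.put("group.id", "person-consumer-group")
consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
consumerProps.put("value.deserializer", "com.thebigscale.serdes.PersonDeserializer")
val consumer = new KafkaConsumer[String, Person](consumerProps)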

Related

Extracting GraphStageLogic into a custom stateful implementation and passing it as a parameter to GraphStage gives an exception

Below is a simplified code snippet, where the GraphStageLogic implementation is passed to the GraphStage as a constructor argument:
package akka.shapes.examples.notworking

import akka.actor.ActorSystem
import akka.stream._
import akka.stream.scaladsl.{GraphDSL, RunnableGraph, Sink, Source}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler}

// This is the base graph stage, where the GraphStageLogic and the SinkShape
// are passed as constructor parameters
class BaseGraphStage[T](val shape: SinkShape[T], graphStageLogic: GraphStageLogic) extends GraphStage[SinkShape[T]] {
  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = graphStageLogic
}

// This is a sample stateful extension of GraphStageLogic that accepts the first ten elements only
class CountLogic(sinkShape: SinkShape[Int], maxValue: Int) extends GraphStageLogic(sinkShape) {
  var counter: Long = 0

  override def preStart(): Unit = {
    pull(sinkShape.in)
  }

  setHandler(sinkShape.in, new InHandler {
    override def onPush(): Unit = {
      val e = grab(sinkShape.in)
      println("conditional sink: " + e)
      counter = counter + 1
      counter == maxValue match {
        case true => completeStage()
        case false => pull(sinkShape.in)
      }
    }
  })
}

object SampleSinkNotWorking {
  def main(args: Array[String]): Unit = {
    implicit val actorSystem = ActorSystem("NotWorking")
    implicit val actorMaterializer = ActorMaterializer()

    val inlet = Inlet[Int](name = "sampleInlet")
    val sinkShape = SinkShape(inlet)
    val countGraphStageLogic = new CountLogic(sinkShape, 10)
    val sinkGraphStage = new BaseGraphStage[Int](sinkShape, countGraphStageLogic)
    val sink = Sink.fromGraph(sinkGraphStage)

    val graph = GraphDSL.create() { implicit builder =>
      import GraphDSL.Implicits._
      Source(1 to 100) ~> sink
      ClosedShape
    }
    val runnableGraph = RunnableGraph.fromGraph(graph)
    runnableGraph.run()
  }
}
Running the above code gives an ArrayIndexOutOfBoundsException:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1
    at akka.stream.stage.GraphStageLogic.setHandler(GraphStage.scala:439)
    at akka.shapes.examples.notworking.CountLogic.<init>(SampleSinkNotWorking.scala:24)
    at akka.shapes.examples.notworking.SampleSinkNotWorking$.main(SampleSinkNotWorking.scala:46)
    at akka.shapes.examples.notworking.SampleSinkNotWorking.main(SampleSinkNotWorking.scala)
I tried debugging, and it looks like the Inlet id is -1 and it's not getting reset.
But why is it not getting reset when the GraphStageLogic is passed as a constructor argument to the GraphStage?
I refactored your code a bit and the problem is gone; take a look:
import akka.actor.ActorSystem
import akka.stream._
import akka.stream.scaladsl.{Sink, Source}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, StageLogging}

class BaseGraphStage(maxValue: Int) extends GraphStage[SinkShape[Int]] {
  val inlet = Inlet[Int](name = "sampleInlet")

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
    new GraphStageLogic(shape) with StageLogging {
      var counter: Int = 0

      setHandler(inlet, new InHandler {
        override def onPush(): Unit = {
          val e = grab(inlet)
          log.info(s"$e is consumed")
          counter += 1
          if (counter == maxValue) {
            completeStage()
          } else {
            pull(inlet)
          }
        }
      })

      override def preStart(): Unit =
        pull(inlet)

      override def postStop(): Unit =
        counter = 0
    }

  override def shape: SinkShape[Int] = SinkShape(inlet)
}

object SampleSinkNotWorking {
  def main(args: Array[String]): Unit = {
    implicit val actorSystem = ActorSystem("NotWorking")
    implicit val actorMaterializer = ActorMaterializer()

    val sink = Sink.fromGraph(new BaseGraphStage(10))
    Source(1 to 100).runWith(sink)
  }
}
I can't fully answer your last question, but I think the trick is to create the inlets in the context of the graph stage rather than outside of it, and to use the preStart/postStop hooks. Hope that helps.
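If you do want to keep the logic in a separate class, one alternative (a sketch, relying on the fact that createLogic runs at materialization time, after the stage's inlets are registered) is to pass a factory function rather than a ready-made GraphStageLogic instance:

import akka.stream._
import akka.stream.stage.{GraphStage, GraphStageLogic}

// Sketch: the logic is only built from the shape inside createLogic,
// so setHandler sees an inlet with a valid id.
class FactoryGraphStage[T](makeLogic: SinkShape[T] => GraphStageLogic)
    extends GraphStage[SinkShape[T]] {
  val in: Inlet[T] = Inlet[T]("FactoryGraphStage.in")
  override val shape: SinkShape[T] = SinkShape(in)
  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
    makeLogic(shape)
}

// Usage, reusing CountLogic from the question:
// Sink.fromGraph(new FactoryGraphStage[Int](shape => new CountLogic(shape, 10)))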

Create a custom Serializer and Deserializer in Kafka using Scala

I am using kafka_2.10-0.10.0.1 and scala_2.10.3.
I want to write a custom Serializer and Deserializer using Scala.
I tried with this Serializer (from CustomType) and Deserializer (to obtain a CustomType):
import java.lang.reflect.Type
import java.util
import com.google.gson.Gson
import com.google.gson.reflect.TypeToken
import org.apache.kafka.common.serialization.{Deserializer, Serializer}

class CustomTypeSerializer extends Serializer[CustomType] {
  private val gson: Gson = new Gson()

  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = {
    // nothing to do
  }

  override def serialize(topic: String, data: CustomType): Array[Byte] = {
    if (data == null)
      null
    else
      gson.toJson(data).getBytes
  }

  override def close(): Unit = {
    // nothing to do
  }
}

class CustomTypeDeserializer extends Deserializer[CustomType] {
  private val gson: Gson = new Gson()

  override def deserialize(topic: String, bytes: Array[Byte]): CustomType = {
    val offerJson = gson.toJson(bytes.toString)
    val psType: Type = new TypeToken[CustomType]() {}.getType()
    val ps: CustomType = gson.fromJson(offerJson, psType)
    ps
  }

  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = {
    // nothing to do
  }

  override def close(): Unit = {
    // nothing to do
  }
}
But I got this error:
Exception in thread "main" org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition topic_0_1-1 at offset 26
Caused by: com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was BEGIN_ARRAY at line 1 column 2 path $
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:224)
at com.google.gson.Gson.fromJson(Gson.java:887)
at com.google.gson.Gson.fromJson(Gson.java:852)
at com.google.gson.Gson.fromJson(Gson.java:801)
at kafka.PSDeserializer.deserialize(PSDeserializer.scala:24)
at kafka.PSDeserializer.deserialize(PSDeserializer.scala:18)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:627)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseFetchedData(Fetcher.java:548)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:354)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1000)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938)
Can you help me, please?
Find below a custom serializer and deserializer for the case class User(name: String, id: Int). Replace User in the code with your case class and it will work.
import java.io.{ByteArrayInputStream, ObjectInputStream}
import java.util
import org.apache.kafka.common.serialization.{Deserializer, Serializer}

class CustomDeserializer extends Deserializer[User] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = {
  }

  override def deserialize(topic: String, bytes: Array[Byte]) = {
    val byteIn = new ByteArrayInputStream(bytes)
    val objIn = new ObjectInputStream(byteIn)
    val obj = objIn.readObject().asInstanceOf[User]
    byteIn.close()
    objIn.close()
    obj
  }

  override def close(): Unit = {
  }
}

import java.io.{ByteArrayOutputStream, ObjectOutputStream}

class CustomSerializer extends Serializer[User] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = {
  }

  override def serialize(topic: String, data: User): Array[Byte] = {
    try {
      val byteOut = new ByteArrayOutputStream()
      val objOut = new ObjectOutputStream(byteOut)
      objOut.writeObject(data)
      objOut.close()
      byteOut.close()
      byteOut.toByteArray
    } catch {
      case ex: Exception => throw new Exception(ex.getMessage)
    }
  }

  override def close(): Unit = {
  }
}
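As a side note on the question's deserializer: bytes.toString on an Array[Byte] yields a reference string like "[B@1a2b3c" rather than the JSON text, and deserialize then calls toJson instead of fromJson, so the original JSON is never parsed. A minimal corrected sketch of the Gson-based approach, assuming CustomType is a plain reference type Gson can reflect over:

import java.nio.charset.StandardCharsets
import java.util
import com.google.gson.Gson
import org.apache.kafka.common.serialization.Deserializer

// Sketch: decode the bytes back into the JSON string, then parse it.
class CustomTypeDeserializer extends Deserializer[CustomType] {
  private val gson: Gson = new Gson()

  override def deserialize(topic: String, bytes: Array[Byte]): CustomType =
    if (bytes == null) null
    else gson.fromJson(new String(bytes, StandardCharsets.UTF_8), classOf[CustomType])

  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = ()
  override def close(): Unit = ()
}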

Akka Kafka Custom Serializer

I'm using Akka Kafka (Scala) and want to send custom objects.
import java.util
import org.apache.kafka.common.serialization.Serializer

class TweetsSerializer extends Serializer[Seq[MyCustomType]] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = ???
  override def serialize(topic: String, data: Seq[MyCustomType]): Array[Byte] = ???
  override def close(): Unit = ???
}
How can I correctly write my own serializer? And what should I do with the configs parameter?
I would use the StringSerializer; I mean, I'd convert all my types to strings before producing them. Anyway, this works:
import java.io.UnsupportedEncodingException
import org.apache.kafka.common.errors.SerializationException
import org.apache.kafka.common.serialization.Serializer

case class MyCustomType(a: Int)

class TweetsSerializer extends Serializer[Seq[MyCustomType]] {
  private var encoding = "UTF8"

  override def configure(configs: java.util.Map[String, _], isKey: Boolean): Unit = {
    val propertyName = if (isKey) "key.serializer.encoding"
    else "value.serializer.encoding"
    var encodingValue = configs.get(propertyName)
    if (encodingValue == null) encodingValue = configs.get("serializer.encoding")
    if (encodingValue != null && encodingValue.isInstanceOf[String]) encoding = encodingValue.asInstanceOf[String]
  }

  override def serialize(topic: String, data: Seq[MyCustomType]): Array[Byte] =
    try {
      if (data == null) null
      else data.map { v => v.a.toString }.mkString("").getBytes(encoding)
    } catch {
      case e: UnsupportedEncodingException =>
        throw new SerializationException("Error when serializing string to byte[] due to unsupported encoding " + encoding)
    }

  override def close(): Unit = ()
}
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

object testCustomKafkaSerializer extends App {

  implicit val producerConfig = {
    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092")
    props.setProperty("key.serializer", classOf[StringSerializer].getName)
    props.setProperty("value.serializer", classOf[TweetsSerializer].getName)
    props
  }

  lazy val kafkaProducer = new KafkaProducer[String, Seq[MyCustomType]](producerConfig)

  // Create a Scala future from the Java one
  private def publishToKafka(id: String, data: Seq[MyCustomType]) = {
    kafkaProducer
      .send(new ProducerRecord("outTopic", id, data))
      .get()
  }

  val input = MyCustomType(1)
  publishToKafka("customSerializerTopic", Seq(input))
}
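One thing to watch with the serializer above: mkString("") joins the values with no delimiter, so the payload cannot be split back apart when consuming. A sketch of a matching Deserializer, under the assumption that the serializer is changed to mkString(","):

import java.util
import org.apache.kafka.common.serialization.Deserializer

// Hypothetical counterpart to TweetsSerializer; assumes values were joined
// with "," (mkString(",")) so the payload is reversible.
class TweetsDeserializer extends Deserializer[Seq[MyCustomType]] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = ()

  override def deserialize(topic: String, data: Array[Byte]): Seq[MyCustomType] =
    if (data == null) null
    else new String(data, "UTF8").split(",").toSeq.map(s => MyCustomType(s.toInt))

  override def close(): Unit = ()
}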

Unable to parse JSON with GSON in Scala

I am using Gson to parse JSON in Scala, but an error occurs. The code is as follows; it uses Gson.fromJson(String, java.lang.reflect.Type):
import java.lang.reflect.Type
import com.google.gson.GsonBuilder
import com.google.gson.reflect.TypeToken
import scala.beans.BeanProperty

object GsonUtils {
  val GSON = new GsonBuilder().create()
  def java2Json(obj: Object) = GSON.toJson(obj)
  def json2Java[T](json: String, tyze: Type) = GSON.fromJson(json, tyze)
}

case class Data(@BeanProperty val name: String, @BeanProperty val age: Int)

object GsonUtilsTest {
  def main(args: Array[String]) {
    val d = Data("1", 1)
    val json = GsonUtils.java2Json(d)
    println(json)
    // ERROR
    val d2 = GsonUtils.json2Java(json, classOf[Data]).asInstanceOf[Data]

    val dats = new java.util.ArrayList[Data]()
    dats.add(Data("1", 1))
    val json2 = GsonUtils.java2Json(dats)
    val tyze = new TypeToken[java.util.List[Data]]() {}.getType()
    // ERROR
    GsonUtils.json2Java(json2, tyze)
  }
}
When I run it, an exception is thrown:
Exception in thread "main" java.lang.ClassCastException: com.xyz.Data incompatible with scala.runtime.Nothing$
at java.lang.ClassCastException.<init>(ClassCastException.java:58)
at com.xyz.GsonUtils$.json2Java(GsonUtils.scala:18)
at com.xyz.GsonUtilsTest$.main(GsonUtils.scala:30)
at com.xyz.GsonUtilsTest.main(GsonUtils.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:88)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:613)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Use fromJson(String json, Class classOfT) instead of fromJson(String json, Type typeOfT). With the Type overload there is nothing for the Scala compiler to infer the type parameter T from, so it infers Nothing and inserts a cast to scala.runtime.Nothing$, which is exactly the ClassCastException above; a Class[T] argument pins T.
import java.util
import com.google.gson.GsonBuilder

object GsonUtils {
  val GSON = new GsonBuilder().create()
  def java2Json(obj: Object) = GSON.toJson(obj)
  //def json2Java[T](json: String, tyze: Type) = GSON.fromJson(json, tyze)
  def json2Java[T](json: String, tyze: Class[T]) = GSON.fromJson(json, tyze)
}

case class Data(val name: String, val age: Int)

object GsonUtilsTest {
  def main(args: Array[String]) {
    val d = Data("1", 1)
    val json = GsonUtils.java2Json(d)
    println(json)
    val obj: Data = GsonUtils.json2Java(json, classOf[Data])
    println(s"${obj}")

    val d2: util.ArrayList[Data] = new util.ArrayList[Data]
    d2.add(Data("1", 1))
    d2.add(Data("2", 2))
    val json2 = GsonUtils.java2Json(d2)
    println(json2)
    val dataclass = classOf[util.ArrayList[Data]]
    val obj2: util.ArrayList[Data] = GsonUtils.json2Java(json2, dataclass)
    println(s"${obj2}")
  }
}
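One caveat, as an editor's note: because of type erasure, classOf[util.ArrayList[Data]] only tells Gson about ArrayList, not about Data, so the list elements may come back as Gson's internal LinkedTreeMap rather than Data. For generic containers, the Type overload still works as long as the type parameter is supplied explicitly so nothing is inferred as Nothing:

import com.google.gson.reflect.TypeToken

val listType = new TypeToken[java.util.List[Data]]() {}.getType()
// The explicit type argument pins T, avoiding the Nothing inference.
val typed: java.util.List[Data] =
  GsonUtils.GSON.fromJson[java.util.List[Data]](json2, listType)
println(typed)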

Convert Any type in Scala to Array[Byte] and back

I have a variable value declared as Any in my program.
I want to convert this value to Array[Byte].
How can I serialize to Array[Byte] and back? I found examples related to other types such as Double or Int, but not to Any.
This should do what you need. It's pretty similar to how one would do it in Java.
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object Serialization extends App {

  def serialise(value: Any): Array[Byte] = {
    val stream: ByteArrayOutputStream = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(stream)
    oos.writeObject(value)
    oos.close()
    stream.toByteArray
  }

  def deserialise(bytes: Array[Byte]): Any = {
    val ois = new ObjectInputStream(new ByteArrayInputStream(bytes))
    val value = ois.readObject
    ois.close()
    value
  }

  println(deserialise(serialise("My Test")))
  println(deserialise(serialise(List(1))))
  println(deserialise(serialise(Map(1 -> 2))))
  println(deserialise(serialise(1)))
}
Alternatively, with SerializationUtils from Apache Commons Lang (note the original snippet used isInstanceOf, which would serialize a Boolean; the cast must be asInstanceOf):

import java.io.Serializable
import org.apache.commons.lang3.SerializationUtils

def anyTypeToByteArray(value: Any): Array[Byte] = {
  // Cast (not test) to Serializable before handing off to SerializationUtils
  val valueConverted: Array[Byte] = SerializationUtils.serialize(value.asInstanceOf[Serializable])
  valueConverted
}

def byteArrayToAny(value: Array[Byte]): Any = {
  val valueConverted: Any = SerializationUtils.deserialize(value)
  valueConverted
}
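A quick round-trip check of the Commons Lang variant (assuming commons-lang3 is on the classpath; standard Scala collections are Serializable, so the cast succeeds):

val bytes = anyTypeToByteArray(Map(1 -> "a", 2 -> "b"))
println(byteArrayToAny(bytes)) // Map(1 -> a, 2 -> b)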