I am using kafka_2.10-0.10.0.1 and Scala 2.10.3.
I want to write a custom Serializer and Deserializer in Scala.
I tried the following Serializer (from CustomType) and Deserializer (to obtain a CustomType):
class CustomTypeSerializer extends Serializer[CustomType] {
private val gson: Gson = new Gson()
override def configure(configs: util.Map[String, _], isKey: Boolean):
Unit = {
// nothing to do
}
override def serialize(topic: String, data: CustomType): Array[Byte] = {
if (data == null)
null
else
gson.toJson(data).getBytes
}
override def close(): Unit = {
//nothing to do
}
}
class CustomTypeDeserializer extends Deserializer[CustomType] {
private val gson: Gson = new Gson()
override def deserialize(topic: String, bytes: Array[Byte]): CustomType = {
val offerJson = gson.toJson(bytes.toString)
val psType: Type = new TypeToken[CustomType]() {}.getType()
val ps: CustomType = gson.fromJson(offerJson, psType)
ps
}
override def configure(configs: util.Map[String, _], isKey: Boolean):
Unit = {
// nothing to do
}
override def close(): Unit = {
//nothing to do
}
}
But I got this error:
Exception in thread "main" org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition topic_0_1-1 at offset 26
Caused by: com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was BEGIN_ARRAY at line 1 column 2 path $
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:224)
at com.google.gson.Gson.fromJson(Gson.java:887)
at com.google.gson.Gson.fromJson(Gson.java:852)
at com.google.gson.Gson.fromJson(Gson.java:801)
at kafka.PSDeserializer.deserialize(PSDeserializer.scala:24)
at kafka.PSDeserializer.deserialize(PSDeserializer.scala:18)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:627)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseFetchedData(Fetcher.java:548)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:354)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1000)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938)
Can you help me, please?
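Below are a custom serializer and deserializer for the case class User(name: String, id: Int). They use plain Java serialization (ObjectOutputStream/ObjectInputStream) instead of JSON. Replace User in the code with your own case class and it should work.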
import java.io.{ObjectInputStream, ByteArrayInputStream}
import java.util
import org.apache.kafka.common.serialization.{Deserializer, Serializer}
class CustomDeserializer extends Deserializer[User]{
override def configure(configs: util.Map[String,_],isKey: Boolean):Unit = {
}
override def deserialize(topic:String,bytes: Array[Byte]) = {
val byteIn = new ByteArrayInputStream(bytes)
val objIn = new ObjectInputStream(byteIn)
val obj = objIn.readObject().asInstanceOf[User]
byteIn.close()
objIn.close()
obj
}
override def close():Unit = {
}
}
import java.io.{ObjectOutputStream, ByteArrayOutputStream}
import java.util
import org.apache.kafka.common.serialization.Serializer
class CustomSerializer extends Serializer[User]{
override def configure(configs: util.Map[String,_],isKey: Boolean):Unit = {
}
override def serialize(topic:String, data:User):Array[Byte] = {
try {
val byteOut = new ByteArrayOutputStream()
val objOut = new ObjectOutputStream(byteOut)
objOut.writeObject(data)
objOut.close()
byteOut.close()
byteOut.toByteArray
}
catch {
// Keep the original exception as the cause so the stack trace is not lost
case ex:Exception => throw new Exception(ex.getMessage, ex)
}
}
override def close():Unit = {
}
}
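A note on the exception in the question: the "[" that Gson reports at column 1 is consistent with calling toString on the byte array, which on the JVM returns an identity string (something like [B@1b6d3586) rather than the array's contents. The sketch below uses only the standard library to show the difference. BytesToStringDemo and its method names are made up for illustration and are not part of Kafka or Gson.

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical helper object, for illustration only.
object BytesToStringDemo {
  // What a JSON serializer typically emits: the UTF-8 bytes of the JSON text.
  def toBytes(json: String): Array[Byte] = json.getBytes(UTF_8)

  // Wrong: Array#toString is the JVM default and yields "[B@<hash>", not the content.
  def wrongDecode(bytes: Array[Byte]): String = bytes.toString

  // Right: decode the bytes with the same charset that was used to encode them.
  // Null is passed through because Kafka may hand a deserializer a null payload.
  def decode(bytes: Array[Byte]): String =
    if (bytes == null) null else new String(bytes, UTF_8)
}
```

With Gson, the decoded string could then be passed to gson.fromJson(json, classOf[CustomType]), assuming CustomType has a shape that Gson can populate reflectively.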
I would like to implement an Akka Serializer using uPickle, but I'm not sure it's possible. To do so, I would need to implement a Serializer along these lines:
import akka.serialization.Serializer
import upickle.default._
class UpickleSerializer extends Serializer {
def includeManifest: Boolean = true
def identifier = 1234567
def toBinary(obj: AnyRef): Array[Byte] = {
writeBinary(obj) // ???
}
def fromBinary(bytes: Array[Byte], clazz: Option[Class[_]]): AnyRef = {
readBinary(bytes) // ???
}
}
The problem is I cannot call writeBinary/readBinary without having the relevant Writer/Reader. Is there a way I can look these up based on the object class?
Take a look at the following files; they should give you some ideas:
CborAkkaSerializer.scala
LocationAkkaSerializer.scala
Note: these serializers use CBOR.
I found a way to do it using reflection. The solution assumes that every class that needs to be serialized defines a ReadWriter in its companion object:
class UpickleSerializer extends Serializer {
private var map = Map[Class[_], ReadWriter[AnyRef]]()
def includeManifest: Boolean = true
def identifier = 1234567
def toBinary(obj: AnyRef): Array[Byte] = {
implicit val rw = getReadWriter(obj.getClass)
writeBinary(obj)
}
def fromBinary(bytes: Array[Byte], clazz: Option[Class[_]]): AnyRef = {
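// Use the cached ReadWriter, as toBinary does, instead of a reflective lookup on every call
implicit val rw = getReadWriter(clazz.get)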
readBinary[AnyRef](bytes)
}
private def getReadWriter(clazz: Class[_]) = map.get(clazz) match {
case Some(rw) => rw
case None =>
val rw = lookup(clazz)
map += clazz -> rw
rw
}
private def lookup(clazz: Class[_]) = {
import scala.reflect.runtime.universe
import scala.reflect.runtime.universe.typeOf
val rootMirror = universe.runtimeMirror(clazz.getClassLoader)
val classSymbol = rootMirror.classSymbol(clazz)
val moduleSymbol = classSymbol.companion.asModule
val moduleMirror = rootMirror.reflectModule(moduleSymbol)
val instanceMirror = rootMirror.reflect(moduleMirror.instance)
val members = instanceMirror.symbol.typeSignature.members
members.find(_.typeSignature <:< typeOf[ReadWriter[_]]) match {
case Some(rw) =>
instanceMirror.reflectField(rw.asTerm).get.asInstanceOf[ReadWriter[AnyRef]]
case None =>
throw new RuntimeException("Not found")
}
}
}
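One caveat about the cache in the code above: it is a plain var holding an immutable Map, so concurrent calls can lose updates. If the serializer may be called from several threads (I believe Akka does this, but treat that as an assumption), a concurrent map is a safer choice. A minimal sketch follows, where ClassCache and its load parameter are hypothetical names standing in for the reflective lookup:

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical per-class cache; `load` stands in for the reflective ReadWriter lookup.
final class ClassCache[V](load: Class[_] => V) {
  private val cache = TrieMap.empty[Class[_], V]

  // Safe to call from several threads. Under contention `load` may run more than once,
  // but only one result is stored and returned to all callers.
  def get(clazz: Class[_]): V = cache.getOrElseUpdate(clazz, load(clazz))

  def size: Int = cache.size
}
```

In the serializer this would replace the var map and getReadWriter, for example new ClassCache[ReadWriter[AnyRef]](lookup).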
Below is a simplified code snippet, where the GraphStageLogic implementation is passed to the GraphStage as a constructor argument:
package akka.shapes.examples.notworking
import akka.actor.ActorSystem
import akka.stream._
import akka.stream.scaladsl.{GraphDSL, RunnableGraph, Sink, Source}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler}
//This is base graph stage, where GraphStageLogic and SinkShape are passed in constructor parameter
class BaseGraphStage[T](val shape: SinkShape[T], graphStageLogic: GraphStageLogic) extends GraphStage[ SinkShape[T] ] {
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = graphStageLogic
}
//this is a sample stateful extension of GraphStageLogic, that accepts first ten elements only
class CountLogic(sinkShape: SinkShape[Int], maxValue: Int) extends GraphStageLogic(sinkShape) {
var counter: Long = 0
override def preStart(): Unit = {
pull(sinkShape.in)
}
setHandler(sinkShape.in, new InHandler {
override def onPush(): Unit = {
val e = grab(sinkShape.in)
println("conditional sink : " + e)
counter = counter + 1
counter == maxValue match {
case true => completeStage()
case false => pull(sinkShape.in)
}
}
})
}
object SampleSinkNotWorking {
def main(args: Array[String]): Unit = {
implicit val actorSystem = ActorSystem("NotWorking")
implicit val actorMaterializer = ActorMaterializer()
val inlet = Inlet[Int](name = "sampleInlet")
val sinkShape = SinkShape( inlet )
val countGraphStateLogic = new CountLogic(sinkShape, 10)
val sinkGraphStage = new BaseGraphStage[Int](sinkShape, countGraphStateLogic)
val sink = Sink.fromGraph( sinkGraphStage )
val graph = GraphDSL.create() { implicit builder =>
import GraphDSL.Implicits._
Source(1 to 100) ~> sink
ClosedShape
}
val runnableGraph = RunnableGraph.fromGraph(graph)
runnableGraph.run()
}
}
Running the above code gives an ArrayIndexOutOfBoundsException:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1
at akka.stream.stage.GraphStageLogic.setHandler(GraphStage.scala:439)
at akka.shapes.examples.notworking.CountLogic.<init>(SampleSinkNotWorking.scala:24)
at akka.shapes.examples.notworking.SampleSinkNotWorking$.main(SampleSinkNotWorking.scala:46)
at akka.shapes.examples.notworking.SampleSinkNotWorking.main(SampleSinkNotWorking.scala)
I tried debugging, and it looks like the Inlet id is -1 and is not getting reset.
But why is it not getting reset when the GraphStageLogic is passed as a constructor argument to the GraphStage?
I refactored your code a bit and the problem is gone. Take a look:
class BaseGraphStage(maxValue: Int) extends GraphStage[SinkShape[Int]] {
val inlet = Inlet[Int](name = "sampleInlet")
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
new GraphStageLogic(shape) with StageLogging {
var counter: Int = 0
setHandler(inlet, new InHandler {
override def onPush(): Unit = {
val e = grab(inlet)
log.info(s"$e is consumed")
counter += 1
if (counter == maxValue) {
completeStage()
} else {
pull(inlet)
}
}
})
override def preStart(): Unit =
pull(inlet)
override def postStop(): Unit =
counter = 0
}
override def shape: SinkShape[Int] = SinkShape(inlet)
}
object SampleSinkNotWorking {
def main(args: Array[String]): Unit = {
implicit val actorSystem = ActorSystem("NotWorking")
implicit val actorMaterializer = ActorMaterializer()
val sink = Sink.fromGraph(new BaseGraphStage(10))
Source(1 to 100).runWith(sink)
}
}
I can't fully answer your last question, but I think the trick is to create the inlet inside the graph stage rather than outside it, to build the GraphStageLogic inside createLogic, and to use the preStart and postStop hooks. Hope that helps.
I have Kryo-serialized binary data stored on S3 (thousands of serialized objects).
Alpakka lets me read the content as data: Source[ByteString, NotUsed]. But the Kryo format doesn't use delimiters, so I can't split each serialized object into a separate ByteString using data.via(Framing.delimiter(...)).
So, Kryo actually needs to read the data to understand when an object ends, and it doesn't look streaming-friendly.
Is it possible to implement this in a streaming fashion, so that I end up with a Source[MyObject, NotUsed]?
Here is a graph stage that does that. It handles the case where a serialized object spans two ByteStrings. It would need to be improved for large objects (not my use case) that span more than two ByteStrings in the Source[ByteString, NotUsed].
object KryoReadStage {
def flow[T](kryoSupport: KryoSupport,
`class`: Class[T],
serializer: Serializer[_]): Flow[ByteString, immutable.Seq[T], NotUsed] =
Flow.fromGraph(new KryoReadStage[T](kryoSupport, `class`, serializer))
}
final class KryoReadStage[T](kryoSupport: KryoSupport,
`class`: Class[T],
serializer: Serializer[_])
extends GraphStage[FlowShape[ByteString, immutable.Seq[T]]] {
override def shape: FlowShape[ByteString, immutable.Seq[T]] = FlowShape.of(in, out)
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = {
new GraphStageLogic(shape) {
setHandler(in, new InHandler {
override def onPush(): Unit = {
val bytes =
if (previousBytes.length == 0) grab(in)
else ByteString.fromArrayUnsafe(previousBytes) ++ grab(in)
Managed(new Input(new ByteBufferBackedInputStream(bytes.asByteBuffer))) { input =>
var position = 0
val acc = ListBuffer[T]()
kryoSupport.withKryo { kryo =>
var last = false
while (!last && !input.eof()) {
tryRead(kryo, input) match {
case Some(t) =>
acc += t
position = input.total().toInt
previousBytes = EmptyArray
case None =>
val bytesLeft = new Array[Byte](bytes.length - position)
val bb = bytes.asByteBuffer
bb.position(position)
bb.get(bytesLeft)
last = true
previousBytes = bytesLeft
}
}
push(out, acc.toList)
}
}
}
private def tryRead(kryo: Kryo, input: Input): Option[T] =
try {
Some(kryo.readObject(input, `class`, serializer))
} catch {
case _: KryoException => None
}
})
setHandler(out, new OutHandler {
override def onPull(): Unit = {
pull(in)
}
})
private val EmptyArray: Array[Byte] = Array.empty
private var previousBytes: Array[Byte] = EmptyArray
}
}
override def toString: String = "KryoReadStage"
private lazy val in: Inlet[ByteString] = Inlet("KryoReadStage.in")
private lazy val out: Outlet[immutable.Seq[T]] = Outlet("KryoReadStage.out")
}
Example usage:
client.download(BucketName, key)
.via(KryoReadStage.flow(kryoSupport, `class`, serializer))
.flatMapConcat(Source(_))
It uses some additional helpers below.
ByteBufferBackedInputStream:
class ByteBufferBackedInputStream(buf: ByteBuffer) extends InputStream {
override def read: Int = {
if (!buf.hasRemaining) -1
else buf.get & 0xFF
}
override def read(bytes: Array[Byte], off: Int, len: Int): Int = {
if (!buf.hasRemaining) -1
else {
val read = Math.min(len, buf.remaining)
buf.get(bytes, off, read)
read
}
}
}
Managed:
object Managed {
type AutoCloseableView[T] = T => AutoCloseable
def apply[T: AutoCloseableView, V](resource: T)(op: T => V): V =
try {
op(resource)
} finally {
resource.close()
}
}
KryoSupport:
trait KryoSupport {
def withKryo[T](f: Kryo => T): T
}
class PooledKryoSupport(serializers: (Class[_], Serializer[_])*) extends KryoSupport {
override def withKryo[T](f: Kryo => T): T = {
pool.run(new KryoCallback[T] {
override def execute(kryo: Kryo): T = f(kryo)
})
}
private val pool = {
val factory = new KryoFactory() {
override def create(): Kryo = {
val kryo = new Kryo
(KryoSupport.ScalaSerializers ++ serializers).foreach {
case ((clazz, serializer)) =>
kryo.register(clazz, serializer)
}
kryo
}
}
new KryoPool.Builder(factory).softReferences().build()
}
}
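The core idea of the stage, independent of Akka and Kryo, is to decode as many complete objects as possible from each chunk and carry the undecoded tail over to the next chunk. The sketch below illustrates that idea with a stand-in format (a 4-byte length prefix followed by UTF-8 text) instead of Kryo, so it is not the stage above. CarryOverDecoder and its methods are hypothetical names. Because the leftover is threaded through every step, it also covers records that span more than two chunks.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for Kryo: each record is a 4-byte big-endian length plus UTF-8 bytes.
object CarryOverDecoder {
  // Encode one record (only needed to build sample input).
  def encode(s: String): Array[Byte] = {
    val body = s.getBytes("UTF-8")
    ByteBuffer.allocate(4 + body.length).putInt(body.length).put(body).array()
  }

  // Decode every complete record in leftover ++ chunk.
  // Returns the decoded records and the tail to prepend to the next chunk.
  def step(leftover: Array[Byte], chunk: Array[Byte]): (List[String], Array[Byte]) = {
    val buf = ByteBuffer.wrap(leftover ++ chunk)
    val out = List.newBuilder[String]
    var done = false
    while (!done) {
      if (buf.remaining() < 4) done = true // not even a full length prefix yet
      else {
        buf.mark()
        val len = buf.getInt()
        if (buf.remaining() < len) { buf.reset(); done = true } // incomplete record: rewind
        else {
          val body = new Array[Byte](len)
          buf.get(body)
          out += new String(body, "UTF-8")
        }
      }
    }
    val rest = new Array[Byte](buf.remaining())
    buf.get(rest)
    (out.result(), rest)
  }

  // Thread the leftover through all chunks, as the stage does between onPush calls.
  def decodeAll(chunks: Seq[Array[Byte]]): List[String] =
    chunks.foldLeft((List.empty[String], Array.emptyByteArray)) {
      case ((acc, left), chunk) =>
        val (records, rest) = step(left, chunk)
        (acc ++ records, rest)
    }._1
}
```

In a real stage, any bytes still left over when the upstream finishes would also need handling, for example by failing the stage.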
I'm using Akka Kafka (Scala) and want to send custom objects.
class TweetsSerializer extends Serializer[Seq[MyCustomType]] {
override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = ???
override def serialize(topic: String, data: Seq[MyCustomType]): Array[Byte] = ???
override def close(): Unit = ???
}
How can I correctly write my own serializer? And what should I do with the configs parameter?
I would use the StringSerializer, that is, I'd convert all my types to strings before producing them. That said, the following works:
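import java.io.UnsupportedEncodingException
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.errors.SerializationException
import org.apache.kafka.common.serialization.{Serializer, StringSerializer}
case class MyCustomType(a: Int)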
class TweetsSerializer extends Serializer[Seq[MyCustomType]] {
private var encoding = "UTF8"
override def configure(configs: java.util.Map[String, _], isKey: Boolean): Unit = {
val propertyName = if (isKey) "key.serializer.encoding"
else "value.serializer.encoding"
var encodingValue = configs.get(propertyName)
if (encodingValue == null) encodingValue = configs.get("serializer.encoding")
if (encodingValue != null && encodingValue.isInstanceOf[String]) encoding = encodingValue.asInstanceOf[String]
}
override def serialize(topic: String, data: Seq[MyCustomType]): Array[Byte] =
try {
// Join with a separator so the individual values can be told apart again,
// and honour the encoding chosen in configure instead of hard-coding it
if (data == null) null
else data.map(_.a.toString).mkString(",").getBytes(encoding)
} catch {
case e: UnsupportedEncodingException =>
throw new SerializationException("Error when serializing string to byte[] due to unsupported encoding " + encoding)
}
override def close(): Unit = ()
}
object testCustomKafkaSerializer extends App {
implicit val producerConfig = {
val props = new Properties()
props.setProperty("bootstrap.servers", "localhost:9092")
props.setProperty("key.serializer", classOf[StringSerializer].getName)
props.setProperty("value.serializer", classOf[TweetsSerializer].getName)
props
}
lazy val kafkaProducer = new KafkaProducer[String, Seq[MyCustomType]](producerConfig)
// Send synchronously: get() blocks until the broker has acknowledged the record
private def publishToKafka(id: String, data: Seq[MyCustomType]) = {
kafkaProducer
.send(new ProducerRecord("outTopic", id, data))
.get()
}
val input = MyCustomType(1)
publishToKafka("customSerializerTopic", Seq(input))
}
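A serializer is only useful if the consumer side can reverse it, so it helps to write the encoding and decoding logic as a pair and test the round trip before wiring them into Kafka's Serializer and Deserializer interfaces. The sketch below does that with the standard library only. TweetsCodec is a hypothetical name, and the comma-separated format is an assumption chosen for illustration.

```scala
import java.nio.charset.StandardCharsets.UTF_8

final case class MyCustomType(a: Int)

// Hypothetical codec, kept separate from the Kafka interfaces so it can be tested alone.
object TweetsCodec {
  // Encode as comma-separated integers; null in, null out, as Kafka serializers usually do.
  def encode(data: Seq[MyCustomType]): Array[Byte] =
    if (data == null) null
    else data.map(_.a.toString).mkString(",").getBytes(UTF_8)

  // Inverse of encode; an empty payload decodes to an empty sequence.
  def decode(bytes: Array[Byte]): Seq[MyCustomType] =
    if (bytes == null) null
    else {
      val text = new String(bytes, UTF_8)
      if (text.isEmpty) Seq.empty
      else text.split(",").toSeq.map(s => MyCustomType(s.toInt))
    }
}
```

A Serializer[Seq[MyCustomType]] and the matching Deserializer can then delegate to encode and decode, leaving configure and close empty if there is nothing to configure or release.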
I'm trying to implement a type class solution for error handling in a Play application. What I want is to have some type class instances representing some validated (caught) errors and a default type class instance for any unvalidated (uncaught) errors.
I don't know if this is possible, but here's what I have so far:
trait ResponseError[E] {
def report(e: E)(implicit logger: Logger): Unit
def materialize(e: E): Result
}
trait ValidatedError[E <: Throwable] extends ResponseError[E] {
def report(e: E)(implicit logger: Logger): Unit =
ResponseError.logError(e)
}
trait UnvalidatedError[E <: Throwable] extends ResponseError[E] {
def report(e: E)(implicit logger: Logger): Unit = {
ResponseError.logError(e)
UnvalidatedError.notify(e)
}
}
object ResponseError {
def logError(e: Throwable)(implicit logger: Logger): Unit =
logger.error(e.getMessage)
}
object ValidatedError {
import java.util.concurrent.{ExecutionException, TimeoutException}
implicit val executionError = new ValidatedError[ExecutionException] {
def materialize(e: ExecutionException): Result =
play.api.mvc.Results.BadRequest
}
implicit val timeoutError = new ValidatedError[TimeoutException] {
def materialize(e: TimeoutException): Result =
play.api.mvc.Results.RequestTimeout
}
}
object UnvalidatedError {
implicit val uncaughtError = new UnvalidatedError[Throwable] {
def materialize(e: Throwable): Result =
play.api.mvc.Results.ServiceUnavailable
}
private def notify(e: Throwable) = ??? // send email notification
}
However, how can I make sure that my ValidatedError type class instances are tried first, before falling back to my UnvalidatedError type class instance?
There you go. See my comment for details.
import java.util.concurrent.{TimeoutException, ExecutionException}
type Result = String
val badRequest: Result = "BadRequest"
val requestTimeout: Result = "RequestTimeout"
val serviceUnavailable: Result = "ServiceUnavailable"
class Logger {
def error(s: String) = println(s + "\n")
}
trait ResponseError[E] {
def report(e: E)(implicit logger: Logger): Unit
def materialize(e: E): Result
}
trait ValidatedError[E <: Throwable] extends UnvalidatedError[E] {
override def report(e: E)(implicit logger: Logger): Unit =
ResponseError.logError(e, validated = true)
}
trait UnvalidatedError[E <: Throwable] extends ResponseError[E] {
def report(e: E)(implicit logger: Logger): Unit = {
ResponseError.logError(e, validated = false)
UnvalidatedError.notify(e)
}
}
object ResponseError {
def logError(e: Throwable, validated: Boolean)(implicit logger: Logger): Unit =
logger.error({
validated match {
case true => "VALIDATED : "
case false => "UNVALIDATED : "
}
} + e.getMessage)
}
object ValidatedError {
import java.util.concurrent.{ExecutionException, TimeoutException}
implicit def executionError[E <: ExecutionException] = new ValidatedError[E] {
def materialize(e: E): Result =
badRequest
}
implicit def timeoutError[E <: TimeoutException] = new ValidatedError[E] {
def materialize(e: E): Result =
requestTimeout
}
}
object UnvalidatedError {
implicit def uncaughtError[E <: Throwable] = new UnvalidatedError[E] {
def materialize(e: E): Result =
serviceUnavailable
}
private def notify(e: Throwable) = println("Sending email: " + e) // send email notification
}
def testTypeclass[E](e: E)(implicit logger: Logger, ev: ResponseError[E]): Unit ={
ev.report(e)
}
import ValidatedError._
import UnvalidatedError._
implicit val logger: Logger = new Logger
val executionErr = new ExecutionException(new Throwable("execution exception!"))
testTypeclass(executionErr)
val timeoutErr = new TimeoutException("timeout exception!")
testTypeclass(timeoutErr)
val otherErr = new Exception("other exception!")
testTypeclass(otherErr)
Output:
VALIDATED : java.lang.Throwable: execution exception!
VALIDATED : timeout exception!
UNVALIDATED : other exception!
Sending email: java.lang.Exception: other exception!
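The answer above appears to rely on ValidatedError being a subtype of UnvalidatedError, so that the validated instances win as the more specific ones. A different and commonly used way to get the same "specific first, fallback last" behaviour is the low-priority implicits pattern, where the fallback lives in a parent trait of the companion object. The sketch below shows that alternative with simplified, hypothetical names (Handler, LowPriorityHandlers, describe), not the types from the question.

```scala
import java.util.concurrent.TimeoutException

// Fallback instance lives in a parent trait, so it loses against instances
// defined directly in the companion object when both apply.
trait LowPriorityHandlers {
  implicit def fallback[E <: Throwable]: Handler[E] = new Handler[E] {
    def label(e: E): String = "UNVALIDATED"
  }
}

// Simplified stand-in for the ResponseError type class.
trait Handler[E] { def label(e: E): String }

object Handler extends LowPriorityHandlers {
  // Specific instance: preferred for TimeoutException and its subclasses.
  implicit def timeout[E <: TimeoutException]: Handler[E] = new Handler[E] {
    def label(e: E): String = "VALIDATED"
  }

  def describe[E](e: E)(implicit h: Handler[E]): String = h.label(e)
}
```

Because both instances sit in the implicit scope of Handler, callers need no imports: Handler.describe(new TimeoutException("t")) picks the specific instance, while any other Throwable falls through to the fallback.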