I would like to implement an Akka Serializer using uPickle, but I'm not sure it's possible. To do so I would need to implement a Serializer something like the following:
import akka.serialization.Serializer
import upickle.default._

class UpickleSerializer extends Serializer {
  def includeManifest: Boolean = true
  def identifier = 1234567

  def toBinary(obj: AnyRef): Array[Byte] = {
    writeBinary(obj) // ???
  }

  def fromBinary(bytes: Array[Byte], clazz: Option[Class[_]]): AnyRef = {
    readBinary(bytes) // ???
  }
}
The problem is I cannot call writeBinary/readBinary without having the relevant Writer/Reader. Is there a way I can look these up based on the object class?
Take a look at the following files; you should get some ideas:
CborAkkaSerializer.scala
LocationAkkaSerializer.scala
Note: these serializers use CBOR.
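(Those files live in an external repo, but the pattern they show is a registration-based serializer: every type is registered with its codec up front, so no runtime lookup is needed. A rough uPickle adaptation of that idea could look like the sketch below; the class name and the register helper are illustrative, not the actual file contents.)

import akka.serialization.Serializer
import upickle.default._

abstract class RegisteringUpickleSerializer extends Serializer {
  private var registry = Map.empty[Class[_], ReadWriter[AnyRef]]

  // concrete serializers call this once per message type
  protected def register[T <: AnyRef](clazz: Class[T])(implicit rw: ReadWriter[T]): Unit =
    registry += clazz -> rw.asInstanceOf[ReadWriter[AnyRef]]

  def includeManifest: Boolean = true

  def toBinary(obj: AnyRef): Array[Byte] =
    writeBinary(obj)(registry(obj.getClass))

  def fromBinary(bytes: Array[Byte], clazz: Option[Class[_]]): AnyRef =
    readBinary[AnyRef](bytes)(registry(clazz.get))
}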
I found a way to do it using reflection. I base the solution on the assumption that any object that needs to be serialized should have defined a ReadWriter in its companion object:
class UpickleSerializer extends Serializer {
  private var map = Map[Class[_], ReadWriter[AnyRef]]()

  def includeManifest: Boolean = true
  def identifier = 1234567

  def toBinary(obj: AnyRef): Array[Byte] = {
    implicit val rw = getReadWriter(obj.getClass)
    writeBinary(obj)
  }

  def fromBinary(bytes: Array[Byte], clazz: Option[Class[_]]): AnyRef = {
    implicit val rw = getReadWriter(clazz.get)
    readBinary[AnyRef](bytes)
  }

  // cache the looked-up ReadWriters so reflection runs once per class
  private def getReadWriter(clazz: Class[_]) = map.get(clazz) match {
    case Some(rw) => rw
    case None =>
      val rw = lookup(clazz)
      map += clazz -> rw
      rw
  }

  private def lookup(clazz: Class[_]) = {
    import scala.reflect.runtime.universe._
    val rootMirror = runtimeMirror(clazz.getClassLoader)
    val classSymbol = rootMirror.classSymbol(clazz)
    val moduleSymbol = classSymbol.companion.asModule
    val moduleMirror = rootMirror.reflectModule(moduleSymbol)
    val instanceMirror = rootMirror.reflect(moduleMirror.instance)
    val members = instanceMirror.symbol.typeSignature.members
    // find the first companion member whose type conforms to ReadWriter[_]
    members.find(_.typeSignature <:< typeOf[ReadWriter[_]]) match {
      case Some(rw) =>
        instanceMirror.reflectField(rw.asTerm).get.asInstanceOf[ReadWriter[AnyRef]]
      case None =>
        throw new RuntimeException(s"No ReadWriter found in companion object of $clazz")
    }
  }
}
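For this to work, every serialized message type has to expose an implicit ReadWriter in its companion object, e.g. (Ping here is just a hypothetical message type; macroRW is uPickle's standard derivation):

import upickle.default._

case class Ping(id: Long, payload: String)

object Ping {
  // the serializer's reflective lookup finds this field at runtime
  implicit val rw: ReadWriter[Ping] = macroRW
}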
Spark 3.0 has deprecated UserDefinedAggregateFunction and I was trying to rewrite my UDAF using Aggregator. Basic usage of Aggregator is simple; however, I struggle with a more generic version of the function.
I will try to explain my problem with this example, an implementation of collect_set. It's not my actual case, but it's easier to explain the problem this way:
import org.apache.spark.sql.{Encoders, Row}
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator

class CollectSetDemoAgg(name: String) extends Aggregator[Row, Set[Int], Set[Int]] {
  override def zero = Set.empty
  override def reduce(b: Set[Int], a: Row) = b + a.getInt(a.fieldIndex(name))
  override def merge(b1: Set[Int], b2: Set[Int]) = b1 ++ b2
  override def finish(reduction: Set[Int]) = reduction
  override def bufferEncoder = Encoders.kryo[Set[Int]]
  override def outputEncoder = ExpressionEncoder()
}

// using it:
df.agg(new CollectSetDemoAgg("rank").toColumn as "result").show()
I prefer .toColumn vs .udf.register, but it's not the point here.
Problem:
I cannot make a universal version of this Aggregator; it only works with integers.
I've attempted:
class CollectSetDemo(name: String) extends Aggregator[Row, Set[Any], Set[Any]]
It crashes with error:
No Encoder found for Any
- array element class: "java.lang.Object"
- root class: "scala.collection.immutable.Set"
java.lang.UnsupportedOperationException: No Encoder found for Any
- array element class: "java.lang.Object"
- root class: "scala.collection.immutable.Set"
at org.apache.spark.sql.catalyst.ScalaReflection$.$anonfun$serializerFor$1(ScalaReflection.scala:567)
I could not go with CollectSetDemo[T], because I was not able to produce a proper outputEncoder. Also, when using udaf, I can only work with Spark data types, columns, etc.
I have not found a nice way to solve the situation, but I was able to somewhat work around it. The code was partially borrowed from RowEncoder:
import org.apache.spark.sql.{Encoders, Row}
import org.apache.spark.sql.catalyst.ScalaReflection
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.types._
import scala.reflect.ClassTag
import scala.reflect.runtime.universe._

class CollectSetDemoAgg(name: String, fieldType: DataType) extends Aggregator[Row, Set[Any], Any] {
  override def zero = Set.empty
  override def reduce(b: Set[Any], a: Row) = b + a.get(a.fieldIndex(name))
  override def merge(b1: Set[Any], b2: Set[Any]) = b1 ++ b2
  override def finish(reduction: Set[Any]) = reduction.toSeq
  override def bufferEncoder = Encoders.kryo[Set[Any]]

  // now
  override def outputEncoder = {
    val mirror = ScalaReflection.mirror
    val tt = fieldType match {
      case ArrayType(LongType, _)    => typeTag[Seq[Long]]
      case ArrayType(IntegerType, _) => typeTag[Seq[Int]]
      case ArrayType(StringType, _)  => typeTag[Seq[String]]
      // .. etc etc
      case _ => throw new RuntimeException(s"Could not create encoder for ${name} column (${fieldType})")
    }
    val tpe = tt.in(mirror).tpe
    val cls = mirror.runtimeClass(tpe)
    val serializer = ScalaReflection.serializerForType(tpe)
    val deserializer = ScalaReflection.deserializerForType(tpe)
    new ExpressionEncoder[Any](serializer, deserializer, ClassTag[Any](cls))
  }
}
One thing I had to add was a result data type parameter in the aggregator. The usage then changed to:
df.agg(new CollectSetDemoAgg("rank", new ArrayType(IntegerType, true)).toColumn as "result").show()
I really don't like how it turned out, but it works. I also welcome any suggestions on how to improve it.
A modification of @Ramunas' answer with generics:
import org.apache.spark.sql.{Encoders, Row}
import org.apache.spark.sql.catalyst.ScalaReflection.{deserializerForType, mirror, serializerForType}
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator
import scala.reflect.ClassTag
import scala.reflect.runtime.universe._

class CollectSetDemoAgg[T: TypeTag](name: String) extends Aggregator[Row, Set[T], Seq[T]] {
  override def zero = Set.empty
  override def reduce(b: Set[T], a: Row) = b + a.getAs[T](a.fieldIndex(name))
  override def merge(b1: Set[T], b2: Set[T]) = b1 ++ b2
  override def finish(reduction: Set[T]) = reduction.toSeq
  override def bufferEncoder = Encoders.kryo[Set[T]]

  override def outputEncoder = {
    val tt = typeTag[Seq[T]]
    val tpe = tt.in(mirror).tpe
    val cls = mirror.runtimeClass(tpe)
    val serializer = serializerForType(tpe)
    val deserializer = deserializerForType(tpe)
    new ExpressionEncoder[Seq[T]](serializer, deserializer, ClassTag[Seq[T]](cls))
  }
}
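Usage then stays close to the original, with the element type passed as a type parameter instead of a DataType (reusing the "rank" column from the example above):

df.agg(new CollectSetDemoAgg[Int]("rank").toColumn as "result").show()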
Do I have a misunderstanding of how implicits work in Scala? Given the following trait:
import java.util.Properties

trait hasConfig {
  implicit def string2array(s: java.lang.String): Array[String] = {
    LoadedProperties.getList(s)
  }

  implicit def string2boolean(s: java.lang.String): java.lang.Boolean = {
    s.toLowerCase() match {
      case "true" => true
      case "false" => false
    }
  }

  var config: Properties = new Properties()

  def getConfigs: Properties = config

  def loadConfigs(prop: Properties): Properties = {
    config = prop
    config
  }

  def getConfigAs[T](key: String): T = {
    if (hasConfig(key)) {
      val value: T = config.getProperty(key).asInstanceOf[T]
      value
    }
    else throw new Exception("Key not found in config")
  }

  def hasConfig(key: String): Boolean = {
    config.containsKey(key)
  }
}
Though java.util.Properties contains (String, String) key-value pairs, I expect the following code to work because of the implicit conversions defined above:
import org.scalatest.FunSuite

class hasConfigTest extends FunSuite {
  val recModel = new Object with hasConfig
  //val prop = LoadedProperties.fromFile("test") Read properties from some file
  recModel.loadConfigs(prop)

  test("test string parameter") {
    assert(recModel.getConfigAs[String]("application.id").equals("framework"))
  }

  test("test boolean parameter") {
    assert(recModel.getConfigAs[Boolean]("framework.booleanvalue") == true)
    //Property file contains framework.booleanvalue=true
    //expected to return java.lang.Boolean, got java.lang.String
  }
}
However, I get the following error,
java.lang.String cannot be cast to java.lang.Boolean
java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Boolean
Why is the implicit conversion not taking care of this?
It doesn't work because casting (asInstanceOf) is something entirely different from implicit conversion. There are multiple ways in which you can solve this.
Implicit conversion
If you want to use the hardcore implicit conversions magic, you should rewrite your getConfigAs method like this:
def getConfig(key: String): String = {
  if (hasConfig(key)) {
    val value: String = config.getProperty(key)
    value
  }
  else throw new Exception("Key not found in config")
}
You will have to import the conversions into the current scope when you use getConfig.
val recModel = new Object with hasConfig
import recModel._
recModel.loadConfigs(prop)
val value: Boolean = recModel.getConfig("framework.booleanvalue")
Implicit parameters
A better way would be to keep your current API, but then you will have to introduce an implicit parameter because the implementation of getConfigAs needs access to the conversion.
def getConfigAs[T](key: String)(implicit conv: String => T): T = {
  if (hasConfig(key)) {
    val value: String = config.getProperty(key)
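    // conv is also eligible as an implicit conversion, so returning this
    // String where a T is expected applies conv automatically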
    value
  }
  else throw new Exception("Key not found in config")
}
You will still need to import the necessary conversions at the use site though.
val recModel = new Object with hasConfig
import recModel._
recModel.loadConfigs(prop)
val value = recModel.getConfigAs[Boolean]("framework.booleanvalue")
Typeclasses
A way to avoid having to import your conversions (and possibly implicitly converting all kinds of Strings by accident) is to introduce a new type to encode your conversions. Then you can implement the conversions in its companion object, where implicit search can find them without importing them.
trait Converter[To] {
  def convert(s: String): To
}

object Converter {
  implicit val string2array: Converter[Array[String]] = new Converter[Array[String]] {
    def convert(s: String): Array[String] =
      LoadedProperties.getList(s)
  }

  implicit val string2boolean: Converter[Boolean] = new Converter[Boolean] {
    def convert(s: String): Boolean =
      s.toLowerCase() match {
        case "true" => true
        case "false" => false
      }
  }
}
Then you can change your getConfigAs method.
def getConfigAs[T](key: String)(implicit conv: Converter[T]): T = {
  if (hasConfig(key)) {
    val value: String = config.getProperty(key)
    conv.convert(value)
  }
  else throw new Exception("Key not found in config")
}
And use it.
val recModel = new Object with hasConfig
recModel.loadConfigs(prop)
val value = recModel.getConfigAs[Boolean]("framework.booleanvalue")
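Since Converter is an ordinary type class, the implicit parameter can equivalently be written as a context bound; this is the same method, just sugared:

def getConfigAs[T: Converter](key: String): T =
  if (hasConfig(key)) implicitly[Converter[T]].convert(config.getProperty(key))
  else throw new Exception("Key not found in config")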
You might also want to take a look over here.
Implicit conversions should be defined in scope, for example in an enclosing object, or imported into the current scope. In your case they should be visible in the scope of the hasConfigTest class.
http://docs.scala-lang.org/tutorials/FAQ/finding-implicits
Here's a simple reproducible example:
object m {
  implicit def string2boolean(s: String): Boolean = {
    s.toLowerCase() match {
      case "true" => true
      case "false" => false
    }
  }                                   //> string2boolean: (s: String)Boolean

  println(false || "true")            //> true
  println(false || "false")           //> false
}
I think what you are trying to do is something like this:
import java.util.Properties

object LoadedProperties {
  def getList(s: String): Array[String] = Array.empty
}

object hasConfig {
  sealed trait ConfigReader[T] {
    def read(conf: String): T
  }

  implicit object BooleanConfigReader extends ConfigReader[Boolean] {
    override def read(conf: String): Boolean = conf.toLowerCase() match {
      case "true" => true
      case "false" => false
    }
  }

  implicit object ArrayConfigReader extends ConfigReader[Array[String]] {
    override def read(s: String): Array[String] = {
      LoadedProperties.getList(s)
    }
  }

  var config: Properties = new Properties()

  def getConfigs: Properties = config

  def loadConfigs(prop: Properties): Properties = {
    config = prop
    config
  }

  def getConfigAs[T](key: String)(implicit reader: ConfigReader[T]): T = {
    val prop = config.getProperty(key)
    if (prop == null)
      throw new Exception("Key not found in config")
    reader.read(prop)
  }
}
val props = new Properties()
props.setProperty("a", "false")
props.setProperty("b", "some")
hasConfig.loadConfigs(props)
hasConfig.getConfigAs[Boolean]("a")
hasConfig.getConfigAs[Array[String]]("a")
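With the stubbed LoadedProperties above, the first read returns false and the second an empty Array[String]. Note that reading key "b" as a Boolean would fail at runtime with a MatchError, since BooleanConfigReader only matches "true" and "false".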
Can't I use a generic on the unapply method of an extractor along with an implicit "converter" to support a pattern match specific to the parameterised type?
I'd like to do this (note the use of [T] on the unapply line):
trait StringDecoder[A] {
  def fromString(string: String): Option[A]
}

object ExampleExtractor {
  def unapply[T](a: String)(implicit evidence: StringDecoder[T]): Option[T] = {
    evidence.fromString(a)
  }
}

object Example extends App {
  implicit val stringDecoder = new StringDecoder[String] {
    def fromString(string: String): Option[String] = Some(string)
  }
  implicit val intDecoder = new StringDecoder[Int] {
    def fromString(string: String): Option[Int] = Some(string.charAt(0).toInt)
  }

  val result = "hello" match {
    case ExampleExtractor[String](x) => x // <- type hint barfs
  }
  println(result)
}
But I get the following compilation error
Error: (25, 10) not found: type ExampleExtractor
case ExampleExtractor[String] (x) => x
     ^
It works fine if I have only one implicit val in scope and drop the type hint (see below), but that defeats the object.
object Example extends App {
  implicit val intDecoder = new StringDecoder[Int] {
    def fromString(string: String): Option[Int] = Some(string.charAt(0).toInt)
  }

  val result = "hello" match {
    case ExampleExtractor(x) => x
  }
  println(result)
}
A variant of your typed string decoder looks promising:
trait StringDecoder[A] {
  def fromString(s: String): Option[A]
}

class ExampleExtractor[T](ev: StringDecoder[T]) {
  def unapply(s: String) = ev.fromString(s)
}

object ExampleExtractor {
  def apply[A](implicit ev: StringDecoder[A]) = new ExampleExtractor(ev)
}
then
implicit val intDecoder = new StringDecoder[Int] {
  def fromString(s: String) = scala.util.Try {
    Integer.parseInt(s)
  }.toOption
}

val asInt = ExampleExtractor[Int]
val asInt(nb) = "1111"
seems to produce what you're asking for. One problem remains: it seems that trying to
val ExampleExtractor[Int](nB) = "1111"
results in a compiler crash (at least inside my Scala 2.10.3 SBT console).
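Until that is fixed, the stable-identifier form also works in an ordinary match, which sidesteps the crashing val-pattern syntax (reusing asInt and intDecoder from above):

val asInt = ExampleExtractor[Int]

"1111" match {
  case asInt(n) => println(n)   // prints 1111
  case _        => println("not an Int")
}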
I'm having some problems with a macro I've written to help me log metrics represented as case class instances to InfluxDB. I presume I'm having a type erasure problem and that the type parameter T is getting lost, but I'm not entirely sure what's going on. (This is also my first exposure to Scala macros.)
import scala.language.experimental.macros
import play.api.libs.json.{JsNumber, JsString, JsObject, JsArray}

abstract class Metric[T] {
  def series: String
  def jsFields: JsArray = macro MetricsMacros.jsFields[T]
  def jsValues: JsArray = macro MetricsMacros.jsValues[T]
}

object Metrics {
  case class LoggedMetric(timestamp: Long, series: String, fields: JsArray, values: JsArray)
  case object Kick

  def log[T](metric: Metric[T]): Unit = {
    println(LoggedMetric(
      System.currentTimeMillis,
      metric.series,
      metric.jsFields,
      metric.jsValues
    ))
  }
}
And here's an example metric case class:
case class SessionCountMetric(a: Int, b: String) extends Metric[SessionCountMetric] {
  val series = "sessioncount"
}
Here's what happens when I try to log it:
scala> val m = SessionCountMetric(1, "a")
m: com.confabulous.deva.SessionCountMetric = SessionCountMetric(1,a)
scala> Metrics.log(m)
LoggedMetric(1411450638296,sessioncount,[],[])
Even though the macro itself seems to work fine:
scala> m.jsFields
res1: play.api.libs.json.JsArray = ["a","b"]
scala> m.jsValues
res2: play.api.libs.json.JsArray = [1,"a"]
Here's the actual macro itself:
import scala.language.experimental.macros
import scala.reflect.macros.blackbox.Context

object MetricsMacros {
  private def fieldNames[T: c.WeakTypeTag](c: Context) = {
    val tpe = c.weakTypeOf[T]
    tpe.decls.collect {
      case field if field.isMethod && field.asMethod.isCaseAccessor => field.asTerm.name
    }
  }

  def jsFields[T: c.WeakTypeTag](c: Context) = {
    import c.universe._
    val names = fieldNames[T](c)
    Apply(
      q"play.api.libs.json.Json.arr",
      names.map(name => Literal(Constant(name.toString))).toList
    )
  }

  def jsValues[T: c.WeakTypeTag](c: Context) = {
    import c.universe._
    val names = fieldNames[T](c)
    Apply(
      q"play.api.libs.json.Json.arr",
      names.map(name => q"${c.prefix.tree}.$name").toList
    )
  }
}
}
Update
I tried Eugene's second suggestion like this:
abstract class Metric[T] {
  def series: String
}

trait MetricSerializer[T] {
  def fields: Seq[String]
  def values(metric: T): Seq[Any]
}

object MetricSerializer {
  implicit def materializeSerializer[T]: MetricSerializer[T] = macro MetricsMacros.materializeSerializer[T]
}

object Metrics {
  def log[T: MetricSerializer](metric: T): Unit = {
    val serializer = implicitly[MetricSerializer[T]]
    println(serializer.fields)
    println(serializer.values(metric))
  }
}
with the macro now looking like this:
object MetricsMacros {
  def materializeSerializer[T: c.WeakTypeTag](c: Context) = {
    import c.universe._
    val tpe = c.weakTypeOf[T]
    val names = tpe.decls.collect {
      case field if field.isMethod && field.asMethod.isCaseAccessor => field.asTerm.name
    }
    val fields = Apply(
      q"Seq",
      names.map(name => Literal(Constant(name.toString))).toList
    )
    val values = Apply(
      q"Seq",
      names.map(name => q"metric.$name").toList
    )
    q"""
      new MetricSerializer[$tpe] {
        def fields = $fields
        def values(metric: Metric[$tpe]) = $values
      }
    """
  }
}
However, when I call Metrics.log -- specifically when it calls implicitly[MetricSerializer[T]] -- I get the following error:
error: value a is not a member of com.confabulous.deva.Metric[com.confabulous.deva.SessionCountMetric]
Why is it trying to use Metric[com.confabulous.deva.SessionCountMetric] instead of SessionCountMetric?
Conclusion
Fixed it.
def values(metric: Metric[$tpe]) = $values
should have been
def values(metric: $tpe) = $values
You're in a situation that's very close to one described in a recent question: scala macros: defer type inference.
As things stand right now, you'll have to turn log into a macro. An alternative would be to turn Metric.jsFields and Metric.jsValues into JsFieldable and JsValuable type classes materialized by implicit macros at the call sites of log (http://docs.scala-lang.org/overviews/macros/implicits.html).
I add variables with Dynamic from Scala 2.10.0-RC1 like this:
import language.dynamics
import scala.collection.mutable.HashMap

object Main extends Dynamic {
  private val map = new HashMap[String, Any]

  def selectDynamic(name: String): Any = map(name)
  def updateDynamic(name: String)(value: Any) = { map(name) = value }
}
val fig = new Figure(...) // has a method number
Main.figname = fig
Now, if I want to access Main.figname.number it doesn't work, because the compiler thinks it's of type Any.
But Main.figname.isInstanceOf[Figure] is true, so the value is both Any and Figure, yet it doesn't have Figure's abilities. I can cast it, as in Main.figname.asInstanceOf[Figure].number, and it works, but this is ugly! And I can't present this to my domain users (I'd like to build an internal DSL).
Note: if I use the supertype of Figure instead of Any, it doesn't work either.
Is this a bug in Scala 2.10, or a feature?
It is quite logical: you are explicitly returning instances of Any. A workaround would be to have instances of Dynamic all along:
import language.dynamics
import scala.collection.mutable.HashMap
import scala.reflect.ClassTag

trait DynamicBase extends Dynamic {
  def as[T: ClassTag]: T
  def selectDynamic[T](name: String): DynamicBase
  def updateDynamic(name: String)(value: Any)
}

class ReflectionDynamic(val self: Any) extends DynamicBase with Proxy {
  def as[T: ClassTag]: T = { implicitly[ClassTag[T]].runtimeClass.asInstanceOf[Class[T]].cast(self) }

  // TODO: cache method lookup for faster access + handle NoSuchMethodError
  def selectDynamic[T](name: String): DynamicBase = {
    val ref = self.asInstanceOf[AnyRef]
    val clazz = ref.getClass
    clazz.getMethod(name).invoke(ref) match {
      case dyn: DynamicBase => dyn
      case res => new ReflectionDynamic(res)
    }
  }

  def updateDynamic(name: String)(value: Any) = {
    val ref = self.asInstanceOf[AnyRef]
    val clazz = ref.getClass
    // FIXME: check parameter type, and handle overloads
    clazz.getMethods.find(_.getName == name + "_=").foreach { meth =>
      meth.invoke(ref, value.asInstanceOf[AnyRef])
    }
  }
}

object Main extends DynamicBase {
  def as[T: ClassTag]: T = { implicitly[ClassTag[T]].runtimeClass.asInstanceOf[Class[T]].cast(this) }

  private val map = new HashMap[String, DynamicBase]

  def selectDynamic[T](name: String): DynamicBase = { map(name) }

  def updateDynamic(name: String)(value: Any) = {
    val dyn = value match {
      case dyn: DynamicBase => dyn
      case _ => new ReflectionDynamic(value)
    }
    map(name) = dyn
  }
}
Usage:
scala> class Figure {
| val bla: String = "BLA"
| }
defined class Figure
scala> val fig = new Figure() // has a method number
fig: Figure = Figure#6d1fa2
scala> Main.figname = fig
Main.figname: DynamicBase = Figure#6d1fa2
scala> Main.figname.bla
res40: DynamicBase = BLA
All instances are wrapped in a Dynamic instance.
We can recover the actual type using the as method which performs a dynamic cast.
scala> val myString: String = Main.figname.bla.as[String]
myString: String = BLA
You can add extensions or custom functionality to Any or to predefined value classes by defining an implicit value class like this:
implicit class CustomAny(val self: Any) extends AnyVal {
  def as[T] = self.asInstanceOf[T]
}
Usage:
scala> class Figure {
| val xyz = "xyz"
| }
defined class Figure
scala> val fig = new Figure()
fig: Figure = Figure#73dce0e6
scala> Main.figname = fig
Main.figname: Any = Figure#73dce0e6
scala> Main.figname.as[Figure].xyz
res8: String = xyz
The implicit value class is not costly like a regular class: it is optimised away at compile time, so the call becomes equivalent to a method call on a static object rather than on a newly instantiated wrapper.
You can find more info on implicit value classes here.