Registering a class with Kryo via twitter chill-scala - scala

I'm trying to serialize an instance of a Scala class using Kryo via Twitter's chill-scala library. It's from a library (external jar) and thus, I think, needs to be registered with Kryo.
How do I register a class to (de)serialize using chill-scala?
Here's the core of my code, based mostly on examining the chill-scala test suite.
// This is from the chill-scala test suite
def serialize[T](t: T): Array[Byte] = ScalaKryoInstantiator.defaultPool.toBytesWithClass(t)
def deserialize[T](bytes: Array[Byte]): T =
ScalaKryoInstantiator.defaultPool.fromBytes(bytes).asInstanceOf[T]
/**
* Save a value in cache.
*/
def save[T](key: String, value: T, expiration: Int = 0): Future[T] = {
cache.put(key, serialize[T](value), expiration, TimeUnit.SECONDS)
Future.successful(value)
}
/**
* Finds a value in the cache.
*/
def find[T: ClassTag](key: String): Future[Option[T]] = Future {
val result = deserialize[T](cache.get(key).asInstanceOf[Array[Byte]])
Option(result)
}
When I run it, it throws
com.esotericsoftware.kryo.KryoException: Unable to find class: <name_of_external_class>
More generally, is there documentation someplace on how to use chill-scala? The authors of that package have obviously done a substantial amount of work and I've seen a number of positive references to it -- but no documentation.
Thanks for any pointers,
Byron

I believe that chill is just an extension to Kryo, which provides serializers for supported scala classes better than the Kryo default FieldSerializer.
So, if you do not know how to use chill, you should try to read the docs of Kryo.
And, the exception of
Unable to find class: <name_of_external_class>
is probably because the class is not in your class path. It may not relate to Kryo.
Note that, even not registered in Kryo, the class can still be serialized by Kryo.
If you need good example of using chill, source code of Apache Spark is a good choice. Also, the small example in this repo is acceptable.

Related

Good practices of writing and unit testing Utility methods in scala

In particular situations, you need to have some utility methods that are required across different classes. To solve this situation, you create an Util object wherein you place all these methods
object AggregatorUtil {
def aggregateValues(list : List[BigDecimal]) = //some logic...
}
// Import everything in the Utilities object
import AggregatorUtil._
and then import whichever members of util are required in your class. However, the downside to this is that, as all your methods are inside the singleton object and it becomes tricky to mock the object and unit test methods of the class that use utility methods.
To solve this problem again, the only solution that came to mind was Extracting the functionality out to a trait and then mocking the trait.
Please let me know if there is any other approach for handling and testing of util methods and which one is a rather cleaner approach.
Thanks in advance !!!
Note: -I am using scalatest and mockito in my project.
If you need to mock, putting this all in a mocked-out trait is the way forward. If mocking is unnecessary though, avoid it. Mocking unnecessarily is... unnecessary. You'll just be wasting time and effort for something which provides no additional value.
Mocking is best used when you have complex functionality or functionality in other files which you want to treat as a black box and just assume it works as expected (you'd then typically unit test this stuff separately). If you can avoid it and use the functions' actual functionality though, you'll get a much more realistic view of what your application does and will spot new bugs/breaking changes quicker (if you have mocked out functionality and forget to update your mocks, you might not any spot new bugs you introduce).
A good example of when mocking is necessary is when you're mocking calls to a database in a MVC application (e.g. a Scala Play microservice). You obviously don't want to have to run an actual database when testing your code, so you'd typically mock out your connector layer and return dummy/mocked data from your connector functions.
An example of something you wouldn't mock is something like:
trait MyTrait {
def toInt(str: String): Int
}
val mockedTrait = mock[MyTrait]
when(mockedTrait.toInt(eq("3")).thenReturn(3)
It's a bit of a silly example, but I think it explains the point clearly - doing something like this would be ridiculous. Mocking isn't always the answer.
I mostly mock with Test-Implementation which I find more readable and you don't have to learn a mocking framework.
Here an example:
The Interface:
trait DataRepo {
def persist(data: DataObject): Future[DataObject]
def idents(): Future[List[String]]
def insertData(dataCont: DataObject): Future[Int]
...
}
The Mocked Interface:
object DataRepoMock extends DataRepo {
def persist(data: DataObject): Future[DataObject] = ??? // only implement when needed
def idents(): Future[List[String]] = Future.successful((0 to 10).map(_=>Random.nextInt(100)))
def insertData(dataCont: DataObject): Future[Int] = Future.successful(Random.nextInt(100))
...
}
You can also use all the Scala goodies, like Pattern Matching, to make your Mock react differently to input.
Here is an example that this is not just used by myself;):
EPFLx: scala-reactiveX see Lecture 2.5 Testing Actor Systems:
def fakeGetter(url:String, depth: Int):Props =
Props(new Getter(url, depth){
override def webClient: WebClient = FakeWebClient
})

How do I specify type parameters via a configuration file?

I am building a market simulator using Scala/Akka/Play. I have an Akka actor with two children. The children need to have specific types which I would like to specify as parameters.
Suppose that I have the following class definition...
case class SecuritiesMarket[A <: AuctionMechanismLike, C <: ClearingMechanismLike](instrument: Security) extends Actor
with ActorLogging {
val auctionMechanism: ActorRef = context.actorOf(Props[A], "auction-mechanism")
val clearingMechanism: ActorRef = context.actorOf(Props[C], "clearing-mechanism")
def receive: Receive = {
case order: OrderLike => auctionMechanism forward order
case fill: FillLike => clearingMechanism forward fill
}
}
Instances of this class can be created as follows...
val stockMarket = SecuritiesMarket[DoubleAuctionMechanism, CCPClearingMechanism](Security("GOOG"))
val derivativesMarket = SecuritiesMarket[BatchAuctionMechanism, BilateralClearingMechanism](Security("SomeDerivative"))
There are many possible combinations of auction mechanism types and clearing mechanism types that I might use when creating SecuritiesMarket instance for a particular model/simulation.
Can I specify the type parameters that I wish to use in a given simulation in the application.conf file?
I see two questions here.
Can I get a Class instance from a String?
Yes.
val cls: Class[DoubleAuctionMechanism] = Class.forName("your.app.DoubleAuctionMechanism").asInstanceOf[Class[DoubleAuctionMechanism]]
You would still need the cast, as forName returns Class[_].
Can I instantiate a type with type parameters are not known compile time?
Well sort of, but not really.
object SecuritiesMarket {
def apply[A, C](clsAuc: Class[A], clsClr: Class[C])(security: Security): SecuritiesMarket[A, C] = {
SecuritiesMarket[A, C](security)
}
}
I think auction mechanisms and clearing mechanisms are dependencies for SecurityMarket. I'm guessing you instantiate them in its constructor somehow (how?). If that's the case why not just pass them in as a constructor parameter?
Edit:
I don't see how I could create the child actors inside SecurityMarket
Answering this in the comments; Props[T] can also be written as Props[T](classOfT), which can be simplified as Props(classOfT). Those three are the same. So the following code:
val auctionMechanism: ActorRef = context.actorOf(Props[A], "auction-mechanism")
Can be replaced with:
val classOfA = Class.forName("path.to.A")
val auctionMechanism: ActorRef = context.actorOf(Props(classOfA), "auction-mechanism")
First, application.conf is a runtime artifact and its contents are as far as I know not normally parsed at compile time. When the file is parsed at runtime, the parser creates an instance of the class Config which then controls the Akka setup.
The Typesafe Config library project readme is quite nice and the linked documentation has all of the details:
https://github.com/typesafehub/config/blob/master/README.md.
Second, since template parameters are not available at runtime because of type erasure, you can't normally use application.conf to control templating. You could create a custom build step to parse application.conf and modify your code before compilation, but this is maybe not what you want. (And if you do want a custom build step, perhaps a different .conf would be appropriate.)
Instead you might try simply eliminating the type parameters for the securities market class. Then create a single, simple implementation of the auction and clearing actors. Implement these actors by reading the names of the respective mechanisms from application.conf, instantiating the configured mechanism reflectively, and delegating to the instantiated mechanism. The mechanism classes could be independent of Akka, which is perhaps nice if that's where you keep most of your logic?

Overcoming changes to persistent message classes in Akka Persistence

Let's say I start out with a Akka Persistence system like this:
case class MyMessage(x: Int)
class MyProcessor extends Processor {
def receive = {
case Persistent(m # MyMessage) => m.x
//...
}
}
And then someday I change it to this:
case class MyMessage(x: Int, y: Int)
class MyProcessor extends Processor {
def receive = {
case Persistent(m # MyMessage) => m.x + m.y
//...
}
}
After I deploy my new system, when the instance of MyProcessor tries to restore its state, the journaled messages will be of the former case class. Because it is expecting the latter type, it will throw an OnReplayFailure, rendering the processor useless. Question is: if we were to assume an absent y can equal 0 (or whatever) is there a best practice to overcome this? For example, perhaps using an implicit to convert from the former message to the latter on recovery?
Akka uses Java serialization by default and says that for a long-term project we should use a proper alternative. This is because Java serialization is very difficult to evolve over time. Akka recommends using either Google Protocol Buffers, Apache Thrift or Apache Avro.
With Google Protocol Buffers, for example, in your case you'd be writing something like:
if (p.hasY) p.getY else 0
Akka explains all that in a nice article (admittedly it's not very Google-able):
http://doc.akka.io/docs/akka/current/scala/persistence-schema-evolution.html
and even explains your particular use case of adding a new field to an existing message type:
http://doc.akka.io/docs/akka/current/scala/persistence-schema-evolution.html#Add_fields
A blog post the Akka documentation recommends for a comparison of the different serialization toolkits:
http://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html

Dependency injection with Scala

I was searching a way of doing dependency injection in Scala kind of like Spring or Unity in C# and I found nothing really interesting.
MacWire: I don't understand the benefit as we have to give the class in wire[CASS]. So what's the point if you give the implementation when you call wire? I can do new CASS it will be the same.
Cake pattern with self type: Seems to not answer what I'm searching for.
So I decided to make my implementation and ask you what do you think because it's surprising me that nothing like this has been done before. Maybe my implementation have lot's of issues in real life also.
So here is an example:
trait Messenger {
def send
}
class SkypeMessenger extends Messenger {
def send = println("Skype")
}
class ViberMessenger extends Messenger {
def send = println("Viber")
}
I want here to inject everywhere in my app the implementation configured in only one place:
object App {
val messenger = Inject[Messenger]
def main(args: Array[String]) {
messenger.send
}
}
Note the Inject[Messenger] that I define like below with the config I want (prod or dev):
object Inject extends Injector with DevConfig
trait ProdConfig {
this: Injector =>
register[Messager](new SkypeMessager)
register[Messager](new ViberMessager, "viber")
}
trait DevConfig {
this: Injector =>
register[Messager](new ViberMessager)
register[Messager](new ViberMessager, "viber")
}
And finally here is the Injector which contains all methods apply and register:
class Injector {
var map = Map[String, Any]()
def apply[T: ClassTag] =
map(classTag[T].toString).asInstanceOf[T]
def apply[T: ClassTag](id: String) =
map(classTag[T].toString + id).asInstanceOf[T]
def register[T: ClassTag](instance: T, id: String = "") = {
map += (classTag[T].toString + id -> instance)
instance
}
}
To summaries:
I have a class Injector which is a Map between interfaces/traits (eventually also an id) and an instance of the implementation.
We define a trait for each config (dev, prod...) which contains the registers. It also have a self reference to Injector.
And we create an instance of the Injector with the Config we want
The usage is to call the apply method giving the Interface type (eventually also an id) and it will return the implementation's instance.
What do you think?
You code looks a lot like dependency injection in Lift web framework. You can consult Lift source code to see how it's implemented or just use the framework. You don't have to run a Lift app to use its libraries. Here is a small intro doc. Basically you should be looking at this code in Lift:
package net.liftweb.http
/**
* A base trait for a Factory. A Factory is both an Injector and
* a collection of FactorMaker instances. The FactoryMaker instances auto-register
* with the Injector. This provides both concrete Maker/Vender functionality as
* well as Injector functionality.
*/
trait Factory extends SimpleInjector
You can also check this related question: Scala - write unit tests for objects/singletons that extends a trait/class with DB connection where I show how Lift injector is used.
Thanks guys,
So I make my answer but the one from Aleksey was very good.
I understand better the Cake Pattern with this sample:
https://github.com/freekh/play-slick/tree/master/samples/play-slick-cake-sample
Take a look also to the other implementations without DI and compare:
https://github.com/freekh/play-slick/tree/master/samples/
And so the cake pattern doesn't have a centralized config like we can have with my shown lift style DI. I will anyway use the Cake pattern as it fits well with Slick.
What I didn't like with Subcut is the implicits everywhere. I know there is a way to avoid them but it looks like a fix to me.
Thanks
To comment on MacWire, you are right that you could just use new - and that's the whole point :). MacWire is there only to let you remove some boilerplate from your code, by not having to enumerate all the dependencies again (which is already done in the constructor).
The main idea is that you do the wiring at "the end of the world", where you assemble your application (or you could divide that into trait-modules, but that's optional). Otherwise you just use constructors to express dependencies. No magic, no frameworks.

Serialize Function1 to database

I know it's not directly possible to serialize a function/anonymous class to the database but what are the alternatives? Do you know any useful approach to this?
To present my situation: I want to award a user "badges" based on his scores. So I have different types of badges that can be easily defined by extending this class:
class BadgeType(id:Long, name:String, detector:Function1[List[UserScore],Boolean])
The detector member is a function that walks the list of scores and return true if the User qualifies for a badge of this type.
The problem is that each time I want to add/edit/modify a badge type I need to edit the source code, recompile the whole thing and re-deploy the server. It would be much more useful if I could persist all BadgeType instances to a database. But how to do that?
The only thing that comes to mind is to have the body of the function as a script (ex: Groovy) that is evaluated at runtime.
Another approach (that does not involve a database) might be to have each badge type into a jar that I can somehow hot-deploy at runtime, which I guess is how a plugin-system might work.
What do you think?
My very brief advice is that if you want this to be truly data-driven, you need to implement a rules DSL and an interpreter. The rules are what get saved to the database, and the interpreter takes a rule instance and evaluates it against some context.
But that's overkill most of the time. You're better off having a little snippet of actual Scala code that implements the rule for each badge, give them unique IDs, then store the IDs in the database.
e.g.:
trait BadgeEval extends Function1[User,Boolean] {
def badgeId: Int
}
object Badge1234 extends BadgeEval {
def badgeId = 1234
def apply(user: User) = {
user.isSufficientlyAwesome // && ...
}
}
You can either have a big whitelist of BadgeEval instances:
val weDontNeedNoStinkingBadges = Map(
1234 -> Badge1234,
5678 -> Badge5678,
// ...
}
def evaluator(id: Int): Option[BadgeEval] = weDontNeedNoStinkingBadges.get(id)
def doesUserGetBadge(user: User, id: Int) = evaluator(id).map(_(user)).getOrElse(false)
... or if you want to keep them decoupled, use reflection:
def badgeEvalClass(id: Int) = Class.forName("com.example.badge.Badge" + id + "$").asInstanceOf[Class[BadgeEval]]
... and if you're interested in runtime pluggability, try the service provider pattern.
You can try and use Scala Continuations - they can give you the ability to serialize the computation and run it at later time or even on another machine.
Some links:
Continuations
What are Scala continuations and why use them?
Swarm - Concurrency with Scala Continuations
Serialization relates to data rather than methods. You cannot serialize functionality because it is a class file which is designed to serialize that and object serialization serializes the fields of an object.
So like Alex says, you need a rule engine.
Try this one if you want something fairly simple, which is string based, so you can serialize the rules as strings in a database or file:
http://blog.maxant.co.uk/pebble/2011/11/12/1321129560000.html
Using a DSL has the same problems unless you interpret or compile the code at runtime.