How to serialize/deserialize case class to js.Dynamic with uPickle - scala

I am using uPickle/ScalaJS to deserialize a js.Dynamic object into a case class using this code fragment:
read[myClass](JSON.stringify(dynObj))
where myClass is the case class and dynObj is the js.Dynamic object.
Is there a boilerplate-free and simpler way to do this?
In order to serialize a case class, I have been able to serialize to js.Dynamic using Shapeless using this example as a starting point:
Converting nested case classes to nested Maps using Shapeless
I would like to be able to use uPickle to do this instead. How can I accomplish the round-trip with uPickle?

upickle.default.readJs[myClass](upickle.json.readJs(dynObj))
Should do it. You can wrap it in a nice helper if you find yourself doing it a lot.
Similar calls exist to write things to js.Dynamic, just the other way round:
upickle.json.writeJs(upickle.default.writeJs[myClass](myClassInstance))
Though you can probably leave out the type parameter here, since it will be inferred.

The answer above no longer applies for newer versions of upickle. In version 0.6.5 I had to use the following to deserialize a dynamic object:
val someJsObject: js.Dynamic = ...
upickle.WebJson.transform(someJsObject, implicitly[upickle.default.Reader[TargetType]])
To serialize, you will probably want something like:
val sourceObject: SourceType = ...
implicitly[upickle.default.Writer[SourceType]].write(upickle.WebJson.Builder, sourceObject)
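If you do this a lot, the two calls above can be wrapped in a pair of helpers. This is only a sketch for uPickle 0.6.x on Scala.js: DynamicPickle, fromDynamic and toDynamic are hypothetical names that are not part of uPickle, and the bodies are the same calls shown above, so the same version caveats apply.

```scala
import scala.scalajs.js
import upickle.default.{Reader, Writer}

// Hypothetical helper object; the names are illustrative and not part of uPickle.
object DynamicPickle {
  // js.Dynamic -> case class, using the Reader that uPickle provides for T.
  def fromDynamic[T: Reader](obj: js.Dynamic): T =
    upickle.WebJson.transform(obj, implicitly[Reader[T]])

  // case class -> JavaScript value built by WebJson.Builder.
  // The result type is left to inference because it may differ between uPickle versions.
  def toDynamic[T: Writer](value: T) =
    implicitly[Writer[T]].write(upickle.WebJson.Builder, value)
}
```

With these in scope, a round trip would read as DynamicPickle.fromDynamic[TargetType](someJsObject) and DynamicPickle.toDynamic(sourceObject), assuming a Reader and a Writer are available for the types involved.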

Related

Using Enumerations in Scala Best Practices

I have been using sealed traits and case objects to define enumerated types in Scala and I recently came across another approach to extend the Enumeration class in Scala like this below:
object CertificateStatusEnum extends Enumeration {
val Accepted, SignatureError, CertificateExpired, CertificateRevoked, NoCertificateAvailable, CertChainError, ContractCancelled = Value
}
against doing something like this:
sealed trait CertificateStatus
object CertificateStatus {
case object Accepted extends CertificateStatus
case object SignatureError extends CertificateStatus
case object CertificateExpired extends CertificateStatus
case object CertificateRevoked extends CertificateStatus
case object NoCertificateAvailable extends CertificateStatus
case object CertChainError extends CertificateStatus
case object ContractCancelled extends CertificateStatus
}
What is considered a good approach?
They both get the job done for simple purposes, but in terms of best practice, the use of sealed traits + case objects is more flexible.
The story behind this is that Scala set out to offer everything Java had; Java had enumerations, so Scala had to provide them for interoperability reasons. But Scala does not need them, because it supports ADTs (algebraic data types), so it can express an enumeration in a functional way, like the one you just saw.
You'll encounter certain limitations with the normal Enumeration class:
the compiler cannot check pattern matches for exhaustiveness
it's harder to extend the elements to hold more data besides the String name and the Int id, because Value is final.
at runtime, all enums have the same type because of type erasure, which limits type-level programming; for example, you can't have overloaded methods.
when you write object CertificateStatusEnum extends Enumeration, your values are not typed as CertificateStatusEnum but as CertificateStatusEnum.Value, so you have to use some type aliases to fix that. The problem is that the type of your companion will still be CertificateStatusEnum.Value.type, so you end up with multiple aliases and a rather confusing enumeration.
On the other hand, the algebraic data type comes as a type-safe alternative where you specify the shape of each element and to encode the enumeration you just need sum types which are expressed exactly using sealed traits (or abstract classes) and case objects.
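A minimal sketch of what that looks like in practice, shortened to three cases. The retryable attribute and the describe method are made up for illustration; they only show that each case can carry extra data and that the compiler can check a match over a sealed hierarchy.

```scala
// Sum type: the sealed class fixes the set of cases, and each case can carry extra data.
// `retryable` is a hypothetical attribute, used only for illustration.
sealed abstract class CertificateStatus(val retryable: Boolean)

object CertificateStatus {
  case object Accepted extends CertificateStatus(false)
  case object SignatureError extends CertificateStatus(false)
  case object NoCertificateAvailable extends CertificateStatus(true)

  // Because the hierarchy is sealed, the compiler knows every possible case.
  def describe(status: CertificateStatus): String = status match {
    case Accepted               => "accepted"
    case SignatureError         => "bad signature"
    case NoCertificateAvailable => "no certificate"
    // Removing one of the cases above makes the compiler warn
    // that the match may not be exhaustive.
  }
}
```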
These solve the limitations of the Enumeration class, but you'll encounter some other (minor) drawbacks, though these are not that limiting:
case objects won't have a default order - so if you need one, you'll have to add your id as an attribute in the sealed trait and provide an ordering method.
a somewhat problematic issue is that even though case objects are serializable, if you need to deserialize your enumeration, there is no easy way to deserialize a case object from its enumeration name. You will most probably need to write a custom deserializer.
you can't iterate over them by default as you could using Enumeration. But it's not a very common use case. Nevertheless, it can be easily achieved, e.g.:
object CertificateStatus {
val values: Seq[CertificateStatus] = Seq(
Accepted,
SignatureError,
CertificateExpired,
CertificateRevoked,
NoCertificateAvailable,
CertChainError,
ContractCancelled
)
// rest of the code
}
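Once such a values list exists, the other two drawbacks (no default order, no lookup by name) can be handled with it as well. The following is a sketch under that assumption, shortened to three cases; fromName and ordering are hypothetical names, and the lookup relies on the fact that a case object's generated toString returns its name.

```scala
sealed trait CertificateStatus

object CertificateStatus {
  case object Accepted extends CertificateStatus
  case object SignatureError extends CertificateStatus
  case object CertificateExpired extends CertificateStatus

  // In this sketch the declaration order doubles as the sort order.
  val values: Seq[CertificateStatus] = Seq(Accepted, SignatureError, CertificateExpired)

  // Look a case object up by the name that its generated toString returns.
  def fromName(name: String): Option[CertificateStatus] =
    values.find(_.toString == name)

  // Order statuses by their position in `values`.
  implicit val ordering: Ordering[CertificateStatus] =
    Ordering.by[CertificateStatus, Int](values.indexOf(_))
}
```

Because the ordering lives in the companion object, a call such as List(...).sorted picks it up without an import.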
In practice, there's nothing that you can do with Enumeration that you can't do with sealed trait + case objects. So the former has fallen out of favor compared with the latter.
This comparison only concerns Scala 2.
In Scala 3, they unified ADTs and their generalized versions (GADTs) with enums under a new powerful syntax, effectively giving you everything you need. So you'll have every reason to use them. As Gael mentioned, they became first-class entities.
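As a rough illustration of the Scala 3 syntax (this sketch needs a Scala 3 compiler; EnumDemo and isFailure are made-up names), the same enumeration becomes a single enum definition that comes with values, valueOf and ordinal:

```scala
// Scala 3 only.
enum CertificateStatus {
  case Accepted, SignatureError, CertificateExpired, CertificateRevoked,
    NoCertificateAvailable, CertChainError, ContractCancelled
}

object EnumDemo {
  import CertificateStatus.*

  // Matches on an enum are checked for exhaustiveness, as with a sealed trait.
  def isFailure(status: CertificateStatus): Boolean = status match {
    case Accepted => false
    case _        => true
  }

  // Iteration, lookup by name and a default order are provided by the enum itself.
  val all: Array[CertificateStatus] = CertificateStatus.values
  val parsed: CertificateStatus = CertificateStatus.valueOf("CertChainError")
  val position: Int = CertificateRevoked.ordinal
}
```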
It depends on what you want from enum.
In the first case, you implicitly have an order on items (accessed by the id property). Reordering has consequences.
I'd prefer 'case object'; in some cases an enum item could have extra info in the constructor (like Color with RGB, not just a name).
Also, I'd recommend https://index.scala-lang.org/mrvisser/sealerate or similar libraries, which allow iterating over all elements.

Creating Spark Dataframes from regular classes

I have always seen that, when we are using a map function, we can create a DataFrame from an RDD using a case class, as below:
case class filematches(
row_num:Long,
matches:Long,
non_matches:Long,
non_match_column_desc:Array[String]
)
newrdd1.map(x=> filematches(x._1,x._2,x._3,x._4)).toDF()
This works great, as we all know!
I was wondering why we specifically need case classes here.
We should be able to achieve the same effect using normal classes with parameterized constructors (as they will be vals and not private):
class filematches1(
val row_num:Long,
val matches:Long,
val non_matches:Long,
val non_match_column_desc:Array[String]
)
newrdd1.map(x=> new filematches1(x._1,x._2,x._3,x._4)).toDF
Here, I am using the new keyword to instantiate the class.
Running the above gives me the error:
error: value toDF is not a member of org.apache.spark.rdd.RDD[filematches1]
I am sure I am missing some key concept on case classes vs regular classes here, but I have not been able to find it yet.
To resolve the error
value toDF is not a member of org.apache.spark.rdd.RDD[...]
you should move your case class definition out of the function where you are using it. You can refer to http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Spark-Scala-Error-value-toDF-is-not-a-member-of-org-apache/td-p/29878 for more detail.
On your other query: case classes are syntactic sugar, and they provide the following additional things.
Case classes are different from general classes. They are specially used when creating immutable objects.
They have a default apply function that is used as a constructor to create objects (so less code).
All the variables in a case class are val by default, hence immutable, which is a good thing in the Spark world as all RDDs are immutable.
An example of a case class is
case class Book(name: String)
val book1 = Book("test")
You cannot change the value of book1.name as it is immutable, and you do not need to say new Book() to create the object here.
The class variables are public by default, so you don't need setters and getters.
Moreover, when comparing two objects of case classes, their structure is compared instead of their references.
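The differences listed above can be seen side by side in a small sketch. PlainBook and CaseClassDemo are hypothetical names added for the comparison.

```scala
case class Book(name: String)      // case class: apply, equals, hashCode, copy are generated
class PlainBook(val name: String)  // regular class with a public val

object CaseClassDemo {
  val a = Book("test")             // no `new`: the generated companion apply is used
  val b = Book("test")
  val structural: Boolean = a == b // true: the generated equals compares the fields

  val p = new PlainBook("test")
  val q = new PlainBook("test")
  val reference: Boolean = p == q  // false: a regular class inherits reference equality

  // Fields are vals, so a "change" means creating a modified copy.
  val renamed: Book = a.copy(name = "other")
}
```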
Edit : Spark Uses Following class to Infer Schema
Code Link :
https://github.com/apache/spark/blob/branch-2.4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
If you check the schemaFor function (lines 719 to 791), it converts Scala types to Catalyst types. I think the case that handles non-case classes for schema inference has not been added yet, so every time you try to use a non-case class with schema inference, it falls through to the other case and gives the error Schema for type $other is not supported.
Hope this helps
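If a regular class has to be used anyway, one way around schema inference is to map the objects to Row values and pass the schema explicitly. The sketch below assumes Spark 2.x or later with a SparkSession available; FileMatches1, RegularClassToDF and toDataFrame are hypothetical names, while createDataFrame(RDD[Row], StructType) is part of the SparkSession API.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types._

// Regular (non-case) class; Serializable so that instances can be shipped inside an RDD.
class FileMatches1(
  val rowNum: Long,
  val matches: Long,
  val nonMatches: Long,
  val nonMatchColumnDesc: Array[String]
) extends Serializable

object RegularClassToDF {
  // The schema is written by hand because Spark cannot infer it from a regular class.
  val schema: StructType = StructType(Seq(
    StructField("row_num", LongType, nullable = false),
    StructField("matches", LongType, nullable = false),
    StructField("non_matches", LongType, nullable = false),
    StructField("non_match_column_desc", ArrayType(StringType), nullable = true)
  ))

  def toDataFrame(spark: SparkSession, items: Seq[FileMatches1]): DataFrame = {
    // Convert each object to a generic Row whose fields follow the schema order.
    val rows = spark.sparkContext.parallelize(items).map { m =>
      Row(m.rowNum, m.matches, m.nonMatches, m.nonMatchColumnDesc.toSeq)
    }
    spark.createDataFrame(rows, schema)
  }
}
```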

Use `@annotation.varargs` on constructors

I want to declare a class like this:
class StringSetCreate(val s: String*) {
// ...
}
and call that in Java. The problem is that the constructor is of type
public StringSetCreate(scala.collection.Seq)
So in java, you need to fiddle around with the scala sequences which is ugly.
I know that there is the @annotation.varargs annotation which, if used on a method, generates a second method which takes the Java varargs.
This annotation does not work on constructors, at least I don't know where to put it. I found a Scala Issue SI-8383 which reports this problem. As far as I understand there is no solution currently. Is this right? Are there any workarounds? Can I somehow define that second constructor by hand?
The bug is already filed as https://issues.scala-lang.org/browse/SI-8383 .
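For a workaround I'd recommend using a factory method (perhaps on the companion object), where @varargs should work:
import scala.annotation.varargs
object StringSetCreate {
// @varargs generates an additional Java-style varargs overload: build(String...)
@varargs def build(s: String*) = new StringSetCreate(s: _*)
}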
Then in Java you call StringSetCreate.build("a", "b") rather than using new.

Scala v 2.10: How to get a new instance of a class (object) starting from the class name

I have tens of JSON fragments to parse, and for each one I need to get an instance of the right parser. My idea was to create a config file where to write the name of the class to instantiate for each parser (a kind of map url -> parser) . Getting back to your solution, I cannot call the method I implemented in each parser if I have a pointer to Any. I suppose this is a very common problem with a well-set solution, but I have no idea what the best practices could be.
I really have no experience with Java, Reflection, Class Loading and all that stuff. So,
can anyone write for me the body of the method below? I need to get an instance of a class passed as String (no arguments needed for the constructor, at least so far...)
def createInstance(clazzName: String) = {
// get the Class for the given clazzName, e.g. "net.my.BeautifulClazz"
// instantiate an object and return it
}
Thanks, as usual...
There is a very simple answer:
scala> def createInstance(clazzName: String) = Class.forName(clazzName).newInstance
createInstance: (clazzName: String)Any
scala> createInstance("java.lang.String")
res0: Any = ""
If it works for you, everything is fine. If it doesn't, we have to look into your class loader. This is usually the point where things get dirty.
Depending on what you want to do, look into:
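To be able to call your own method on the result instead of getting back Any, the instance can be cast to a common interface that every parser implements. The following is a sketch under that assumption: JsonParser, UpperCaseParser, createParser and the config map are hypothetical names. It uses getDeclaredConstructor().newInstance() because Class.newInstance is deprecated on newer JDKs; the idea is otherwise the same as above.

```scala
object ParserLoading {
  // Hypothetical common interface that every parser implements.
  trait JsonParser {
    def parse(fragment: String): String
  }

  // Example implementation with a no-argument constructor.
  class UpperCaseParser extends JsonParser {
    def parse(fragment: String): String = fragment.toUpperCase
  }

  // Load the class by name, call its no-argument constructor and cast to the interface.
  // The cast throws ClassCastException if the class does not implement JsonParser.
  def createParser(clazzName: String): JsonParser =
    Class.forName(clazzName)
      .getDeclaredConstructor()
      .newInstance()
      .asInstanceOf[JsonParser]

  // Hypothetical url -> class name mapping, as it might be read from a config file.
  val config: Map[String, String] =
    Map("/upper" -> classOf[UpperCaseParser].getName)
}
```

A lookup then reads ParserLoading.createParser(ParserLoading.config("/upper")).parse(fragment).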
The cake pattern, if you want to combine your classes during compile time
OSGi when you want to build a plugin infrastructure (look here for a very simple example)
Google guice, if you really need dependency injection (e.g. when mixing Scala and Java code) and the cake pattern does not work for you

what is the input type of classOf

I am wondering what type do I put in place of XXX
def registerClass(cl:XXX) = kryo.register(classOf[cl])
EDIT: For why I want to do this.
I have to register many classes using the above code. I wanted to remove the duplication of calling kryo.register several times, hoping to write code like below:
Seq(com.mypackage.class1,com.mypackage.class2,com.mypackage.class3).foreach(registerClass)
Another question, can I pass String instead? and convert it somehow to a class in registerClass?
Seq("com.mypackage.class1","com.mypackage.class2").foreach(registerClass)
EDIT 2:
When I write com.mypackage.class1, it means any class defined in my source. So if I create a class
package com.mypackage.model
class Dummy(val ids:Seq[Int],val name:String)
I would provide com.mypackage.model.Dummy as input
So,
kryo.register(classOf[com.mypackage.model.Dummy])
Kryo is a Java serialization library. The signature of the register method is
register(Class type)
You could do it like this:
def registerClass(cl:Class[_]) = kryo.register(cl)
And then call it like this:
registerClass(classOf[Int])
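Registering a whole list of classes, or class names given as strings, could then look like the sketch below. To keep it runnable without the Kryo dependency, FakeKryo is a made-up stand-in that only records what was registered; with the real library you would call register on your Kryo instance instead. RegisterMany and registerClassByName are hypothetical names as well.

```scala
import scala.collection.mutable.ListBuffer

object RegisterMany {
  // Stand-in for a Kryo instance (an assumption for this sketch, not the real API).
  final class FakeKryo {
    val registered: ListBuffer[Class[_]] = ListBuffer.empty
    def register(cl: Class[_]): Unit = { registered += cl; () }
  }
  val kryo = new FakeKryo

  class Dummy(val ids: Seq[Int], val name: String)

  // Takes an already resolved Class object.
  def registerClass(cl: Class[_]): Unit = kryo.register(cl)

  // Takes a fully qualified class name and resolves it at runtime.
  // Class.forName throws ClassNotFoundException for unknown names.
  def registerClassByName(name: String): Unit = kryo.register(Class.forName(name))

  // classOf needs the type at compile time, so it is applied at the call site.
  Seq[Class[_]](classOf[Dummy], classOf[String]).foreach(registerClass)
  Seq("java.lang.StringBuilder").foreach(registerClassByName)
}
```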
The type parameter to classOf needs to be known at compile time. Without knowing more about what you're trying to do, is there any reason you can't use:
def registerClass(cl:XXX) = kryo.register(cl.getClass)