What is the point of the Serializable interface in Scala?

Here is a runnable demo
import java.io._

object Main extends App {
  case class Value(s: String)

  val serializer = new ObjectOutputStream(new ByteArrayOutputStream())
  serializer.writeObject(Value("123"))
  println("success") //> success
}
Please note that the program succeeds even though I did not mark my class with Serializable. Does Serializable make sense in Scala?

Case classes extend Serializable by default in Scala. If you create a regular class, it needs to extend Serializable explicitly; otherwise writeObject will throw a java.io.NotSerializableException.
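For contrast, a minimal sketch (Plain and PlainDemo are made-up names, not from the question) showing that a regular class without Serializable fails at the writeObject call while the case class goes through:

import java.io._

object PlainDemo extends App {
  class Plain(val s: String)          // regular class: does not extend Serializable
  case class Value(s: String)         // case class: Serializable for free

  val out = new ObjectOutputStream(new ByteArrayOutputStream())
  out.writeObject(Value("123"))       // succeeds
  out.writeObject(new Plain("123"))   // throws java.io.NotSerializableException
}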

Related

Scala inner case class not serializable

I am trying to do a very basic serialization of a very simple case class in Scala:
import org.scalatest.wordspec.AnyWordSpecLike
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

class PersistenceSpec extends AnyWordSpecLike {
  case class TestClass(name: String) extends Serializable

  def serializeSomething(): ByteArrayOutputStream = {
    val testItem = TestClass("My Thing")
    val bos: ByteArrayOutputStream = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(testItem)
    bos
  }

  "serializeSomething" when {
    "executed" must {
      "successfully serialize" in {
        val outputStream = serializeSomething()
        println(outputStream.toString())
      }
    }
  }
}
When I run this test I get a java.io.NotSerializableException on the call to oos.writeObject(testItem), which makes no sense, since case classes automatically implement Serializable, and this is the simplest possible example.
However, if I paste the code for TestClass and serializeSomething() into repl, I am able to call the function, and it works just fine.
What is different when calling my function via scalatest, vs repl that would cause this exception?
One final note: If I change the call from oos.writeObject(testItem) to oos.writeObject("Hello"), it works fine, even when run from scalatest.
You need to define TestClass outside of PersistenceSpec.
Inner class instances automatically get a reference to the instance of the outer class. So, when you write it out, it tries to serialize the PersistenceSpec instance as well, and that of course fails.
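A minimal sketch of that fix, with TestClass moved to the top level of the file (a companion object would also work) so no hidden reference to the spec instance is captured:

import org.scalatest.wordspec.AnyWordSpecLike
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Defined at the top level: no hidden reference to the enclosing spec.
case class TestClass(name: String)

class PersistenceSpec extends AnyWordSpecLike {

  def serializeSomething(): ByteArrayOutputStream = {
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(TestClass("My Thing"))
    oos.close()
    bos
  }

  "serializeSomething" when {
    "executed" must {
      "successfully serialize" in {
        println(serializeSomething().size())
      }
    }
  }
}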

Scala/Spark serializable error - join doesn't work

I am trying to use the join method between two RDDs and save the result to Cassandra, but my code doesn't work. In the beginning I had one huge main method and everything worked well, but when I moved the logic into a function and a class it stopped working. I am new to Scala and Spark.
The code is:
import org.apache.spark.SparkContext
import com.datastax.spark.connector._

class Migration extends Serializable {
  case class userId(offerFamily: String, bp: String, pdl: String) extends Serializable
  case class siteExternalId(site_external_id: Option[String]) extends Serializable
  case class profileData(begin_ts: Option[Long], Source: Option[String]) extends Serializable

  def SparkMigrationProfile(sc: SparkContext) = {
    val test = sc.cassandraTable[siteExternalId](KEYSPACE, TABLE)
      .keyBy[userId]
      .filter(x => x._2.site_external_id != None)

    val profileRDD = sc.cassandraTable[profileData](KEYSPACE, TABLE)
      .keyBy[userId]

    // doesn't work
    test.join(profileRDD)
      .foreach(println)

    // doesn't work
    test.join(profileRDD)
      .saveToCassandra(keyspace, table)
  }
}
At the beginning I get the famous: Exception in thread "main" org.apache.spark.SparkException: Task not serializable at . . .
So I made my main class extend Serializable and also the case classes, but it still doesn't work.
I think you should move your case classes from the Migration class to a dedicated file and/or object. This should solve your problem. Additionally, Scala case classes are serializable by default.
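A hedged sketch of that refactoring (the object name, method signature, and keyspace/table parameters are placeholders, and the connector calls follow the question as-is): the case classes live in a top-level object, so serializing them no longer drags the Migration instance along.

import org.apache.spark.SparkContext
import com.datastax.spark.connector._

// Top level: these case classes capture no reference to Migration.
object MigrationModel {
  case class UserId(offerFamily: String, bp: String, pdl: String)
  case class SiteExternalId(site_external_id: Option[String])
  case class ProfileData(begin_ts: Option[Long], Source: Option[String])
}

class Migration extends Serializable {
  import MigrationModel._

  def sparkMigrationProfile(sc: SparkContext, keyspace: String, table: String): Unit = {
    val test = sc.cassandraTable[SiteExternalId](keyspace, table)
      .keyBy[UserId]
      .filter(_._2.site_external_id.isDefined)

    val profileRDD = sc.cassandraTable[ProfileData](keyspace, table)
      .keyBy[UserId]

    test.join(profileRDD).saveToCassandra(keyspace, table)
  }
}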

Extending generic Serializables with implicit conversions

I am trying to add extension methods to Serializable types and there seems to be a hole in my understanding of the class. Here is a snippet of the basics of what I'm trying to do:
class YesSer extends Serializable
class NoSer

implicit class SerOps[S <: Serializable](s: S) {
  def isSer(msg: String) = {
    println(msg)
    assert(s.isInstanceOf[Serializable])
  }
}

val n = new NoSer
val ln = List(new NoSer, new NoSer)
val y = new YesSer
val ly = List(new YesSer, new YesSer)

// n.isSer("non Serializable")
ln.isSer("list of non Serializable")
y.isSer("Serializable")
ly.isSer("list of Serializable")
It's obvious to me that the line n.isSer won't compile, but it also seems that ln.isSer shouldn't compile either, as its "inner" type is NoSer. Is there some kind of coercion to Serializable of the inner type of ln? Am I trying to do something absolutely bonkers?
List extends Serializable. So List[A].isSer(String) is defined; the type of A does not matter.
Serializable is just a marker interface, used to indicate whether a class is designed to be serializable. Whether or not you will be able to actually serialize the object depends on whether the entire transitive object graph rooted at the object is serializable. Your ln will fail serialization at runtime with a NotSerializableException because it contains non-serializable types. See the javadoc for java.lang.Serializable (which scala.Serializable extends) for more details.
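A quick sketch of that distinction, reusing NoSer and SerOps from the snippet above: the call on ln type-checks because the List itself is Serializable, but actually writing the list out fails at runtime because of its elements.

import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

val ln = List(new NoSer, new NoSer)
ln.isSer("list of non Serializable")   // compiles: List itself extends Serializable

// The object graph contains NoSer instances, so real serialization fails:
try {
  new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(ln)
} catch {
  case e: NotSerializableException => println(s"failed as expected: $e")
}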

How to make case classes/objects NOT serializable in Scala? Any annotation/trait/helper works

In Scala, I would like to disable the Serializable trait of many case classes, since I want objects of these classes to never be serialized and shipped to a remote computer in a distributed computing framework (specifically Apache Spark). Any implementation that does so should trigger an explicit runtime exception when a closure containing such an object is serialized.
I've tried @transient + a null check; it triggers a runtime exception at deserialization (not what I want), and the error message is quite obfuscated. Is there a way to improve this?
Thanks a lot for your advice!
You can implement and mix in a trait that disables serialization:
import java.io.{NotSerializableException, ObjectInputStream, ObjectOutputStream}

trait NotSerializable extends Serializable {
  private def writeObject(out: ObjectOutputStream): Unit = throw new NotSerializableException()
  private def readObject(in: ObjectInputStream): Unit = throw new NotSerializableException()
  private def readObjectNoData(): Unit = throw new NotSerializableException()
}

case class Test(foo: String) extends NotSerializable
An attempt to serialize will then throw an exception:
new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(Test("test"))
|-> java.io.NotSerializableException: A$A39$A$A39
However, which feature of case classes do you actually need?
The simplest solution might be to not use case classes and objects at all.

Apache Spark Task not Serializable when Class extends Serializable

I am consistently having errors regarding Task not Serializable.
I have made a small class and it extends Serializable, which is what I believe is meant to be the case when you need values in it to be serialised.
// sc and Rating come from the surrounding Spark application (not shown)
import scala.util.Random

class SGD(filePath: String) extends Serializable {
  val rdd = sc.textFile(filePath)

  val mappedRDD = rdd.map(x => x.split(" ").slice(0, 3))
    .map(y => Rating(y(0).toInt, y(1).toInt, y(2).toDouble))
    .cache

  val RNG = new Random(1)

  val factorsRDD = mappedRDD.map(x => (x.user, (x.product, x.rating)))
    .groupByKey
    .mapValues(listOfItemsAndRatings =>
      Vector(Array.fill(2){RNG.nextDouble}))
}
The final line always results in a Task not Serializable error. What I do not understand is: the class is Serializable, and the class Random is also Serializable according to the API. So, what am I doing wrong? I consistently can't get stuff like this to work; therefore, I imagine my understanding is wrong. I keep being told the class must be Serializable... well, it is, and it still doesn't work!?
scala.util.Random was not Serializable until 2.11.0-M2.
Most likely you are using an earlier version of Scala.
A class doesn't become Serializable until all its members are Serializable as well (or some other mechanism is provided to serialize them, e.g. transient or readObject/writeObject.)
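A minimal sketch of that rule (NotOk and Holder are made-up names): the outer class extends Serializable, but a single non-serializable member is enough to make writeObject fail.

import java.io._

class NotOk                            // plain class, not Serializable
class Holder extends Serializable {
  val member = new NotOk               // a single non-serializable member
}

// Compiles, but fails at runtime:
// new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(new Holder)
// => java.io.NotSerializableException: NotOk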
I get the following stack trace when running the given example in spark-1.3:
Caused by: java.io.NotSerializableException: scala.util.Random
Serialization stack:
- object not serializable (class: scala.util.Random, value: scala.util.Random@52bbf03d)
- field (class: $iwC$$iwC$SGD, name: RNG, type: class scala.util.Random)
One way to fix it is to move the instantiation of the random variable inside mapValues:
mapValues { listOfItemsAndRatings =>
  val RNG = new Random(1)
  Vector(Array.fill(2)(RNG.nextDouble))
}
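Another workaround that is often suggested (a hedged sketch, not part of the answer above) is to mark the field @transient lazy val, so it is skipped during serialization and re-created lazily on first use in each executor JVM:

import scala.util.Random

class SGD(filePath: String) extends Serializable {
  // Skipped when the instance is serialized; re-initialized lazily per JVM.
  @transient lazy val RNG = new Random(1)
}

Note that each executor then gets its own Random seeded with 1, so the generated values differ from what a single shared instance would produce.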