How do I avoid/fix error "java.io.Serializable" in Scala

How do you typically fix the "java.io.Serializable" error below?
I am guessing the data types in my functions caused it. How do I avoid it, or change the result back to the right type?
def allKeys(sampledf: DataFrame): DataFrame = {......}
val afd12= afd.schema.fieldNames.contains("ID") && afd.schema.fieldNames.contains("CONNECTIDS") match {
case true => allKeys(afd)
case false => "no"
}
afd12.printSchema()
This is the error I get:
afd: java.io.Serializable = [ID: string, ADDITIONALINFO: string ... 87 more fields]
<console>:95: error: value printSchema is not a member of java.io.Serializable
afd12.printSchema()
^

You have to make sure that the pattern match block
match {
case true => allKeys(afd)
case false => "no"
}
returns consistent types. Right now one branch returns Dataset[Row] and the other returns String, so the closest common supertype is java.io.Serializable. The simplest fix is to return an empty DataFrame, with a schema of your choice, instead of "no".
match {
case true => allKeys(afd)
case _ => spark.emptyDataFrame
}
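A minimal sketch of the same idea without Spark, assuming a hypothetical Frame case class as a stand-in for DataFrame and a placeholder allKeys (neither is from the question). Instead of an empty frame, it wraps the result in Option, so both branches share one type and the "columns missing" case stays explicit:

```scala
object ConsistentBranches {
  // Hypothetical stand-in for a Spark DataFrame; only the column names matter here.
  final case class Frame(columns: Seq[String])

  // Placeholder for the asker's allKeys; the real body is not shown in the question.
  def allKeys(df: Frame): Frame = df

  // Both branches produce an Option[Frame], so the result type is Option[Frame]
  // rather than some common supertype of Frame and String.
  def keysIfPresent(df: Frame): Option[Frame] =
    if (df.columns.contains("ID") && df.columns.contains("CONNECTIDS")) Some(allKeys(df))
    else None
}
```

The caller then decides what "no result" means, for example with keysIfPresent(df).foreach(...) or getOrElse.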

Related

Typecasting in Scala

I have an alphanumeric field in an RDD of type AnyRef.
Case1: If it's 99898, I want to cast it as Long
Case2: If it's 0099898, I want to cast it as String
Case3: If it's AB998, I want to cast it as String.
I am trying this:
try {
account_number.asInstanceOf[Long]
} catch {
case _: Throwable => account_number.asInstanceOf[String]
}
But with this I miss case 2, because 0099898 is converted to 99898. Any ideas?
If this field is AnyRef, I wouldn't expect any AnyVals (like Long) there at all: Scala's numbers are not the same as Java's numbers. At best you can have some instance of java.lang.Number there (e.g. java.lang.Long, which is NOT scala.Long).
To turn it into a Long you would have to use pattern matching (type matching or regexp pattern matching) and conversion (NOT casting!):
val isStringID = raw"(0[0-9]+)".r
val isLongID = raw"([0-9]+)".r
account_number match {
case isStringID(id) => id // numeric string starting with 0
case isLongID(id) => id.toLong // numeric string convertible to Long
case l: java.lang.Long => l.toLong // Java's long
case _ => throw new IllegalArgumentException("Expected long or numeric string")
}
However, I would find that completely useless: right now you have Any instead of AnyRef. You could expect it to hold a Long or a String, but that is not represented in the type of the returned value, so the compiler would NOT have any information about safe usages. Personally, I would recommend doing something immediately after matching, e.g. wrapping it in Either, creating an ADT, or passing it to a function which needs a String or a Long.
// can be exhaustively pattern matched, or .folded or passed, etc
val stringOrLong: Either[String, Long] = account_number match {
case isStringID(id) => Left(id)
case isLongID(id) => Right(id.toLong)
case l: java.lang.Long => Right(l.toLong)
case _ => throw new IllegalArgumentException("Expected long or numeric string")
}
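You cannot use .asInstanceOf to turn an arbitrary AnyRef into a Long, because neither is a subtype or supertype of the other. The cast only succeeds when the value is already a boxed java.lang.Long, and throws ClassCastException for a String.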
        Any
       /   \
  AnyVal   AnyRef
     |       |
   Long      |
      \     /
      Nothing
.asInstanceOf would only make sense if you were moving vertically in this hierarchy, not horizontally.
Another option you have is:
import scala.util.{Failure, Success, Try}

def tryConvert(s: String): Either[Long, String] = {
Try(s.toLong).filter(_.toString == s) match {
case Success(value) =>
Left(value)
case Failure(_) =>
Right(s)
}
}
Code run at Scastie.
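As a quick check of the three cases from the question, here is the same round-trip idea wrapped in a hypothetical AccountNumbers object (the object name and the three vals are only for illustration). The filter keeps the Long only when printing it back gives exactly the original text, which is what preserves leading zeros:

```scala
import scala.util.{Failure, Success, Try}

object AccountNumbers {
  // Parse as Long, but keep the result only if it round-trips to the same text.
  def tryConvert(s: String): Either[Long, String] =
    Try(s.toLong).filter(_.toString == s) match {
      case Success(value) => Left(value)
      case Failure(_)     => Right(s)
    }

  // The three cases from the question.
  val plain: Either[Long, String]   = tryConvert("99898")   // Left(99898)
  val padded: Either[Long, String]  = tryConvert("0099898") // Right("0099898")
  val letters: Either[Long, String] = tryConvert("AB998")   // Right("AB998")
}
```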

Scala: How to add match vals to a list val

I have a few vals that are computed with pattern matches.
Here is an example:
val job_ = Try(jobId.toInt) match {
case Success(value) => jobs.findById(value).map(_.id)
.getOrElse( Left(WrongValue("jobId", s"$value is not a valid job id")))
case Failure(_) => jobs.findByName(jobId.toString).map(_.id)
.getOrElse( Left(WrongValue("jobId", s"'$jobId' is not a known job title.")))
}
// Here the value arrives as a string, i.e. "yes", "no", "true" or "false", and is then converted to a boolean
val bool_ = bool.toLowerCase() match {
case "yes" => true
case "no" => false
case "true" => true
case "false" => false
case other => Left(Invalid("bool", s"wrong value received"))
}
Note: Invalid is case class Invalid(x: String, xx: String)
Above, I'm looking up a given job value and checking whether it exists in the db or not.
Now I have a few of these and want to add them to a list. Here is my list val, which I then flatten:
val errors = List(..all my vals errors...).flatten // <--- my_list_val (how do I include val bool_ and val job_)
if (errors.isEmpty) { do stuff }
My result should contain the errors from val bool_ and val job_.
Thanks!
You need to fix the types first. The type of bool_ is Any, which does not give you anything you can work with.
If you want to use Either, you need to use it everywhere.
Then the easiest approach would be to use a for comprehension (I am assuming you're dealing with Either[F, T] here, where WrongValue and Invalid are both subclasses of F, and that you're not really interested in the errors).
for {
foundJob <- job_
_ <- bool_
} yield {
// do stuff
}
Note that in Scala >= 2.13 you can use toIntOption when converting the String to an Int:
val job_: Either[F, T] = jobId.toIntOption match {
case Some(value) => ...
case _ => ...
}
Also, in case expressions, you can use alternatives when you have the same statement for several cases:
val bool_: Either[F, Boolean] = bool.toLowerCase() match {
case "yes" | "true" => Right(true)
case "no" | "false" => Right(false)
case other => Left(Invalid("bool", "wrong value received"))
}
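Putting these pieces together, a minimal self-contained sketch might look like the following. The ValidationError trait, the knownJobIds set (standing in for the jobs lookup) and the function names are assumptions made for illustration, not the asker's real code:

```scala
object Validation {
  // Hypothetical common error type so every Either shares the same Left side.
  sealed trait ValidationError
  final case class WrongValue(field: String, message: String) extends ValidationError
  final case class Invalid(field: String, message: String) extends ValidationError

  // Placeholder standing in for the jobs.findById lookup from the question.
  private val knownJobIds = Set(1, 2, 3)

  def parseJob(jobId: String): Either[ValidationError, Int] =
    jobId.toIntOption match { // toIntOption needs Scala 2.13+
      case Some(id) if knownJobIds.contains(id) => Right(id)
      case Some(id) => Left(WrongValue("jobId", s"$id is not a valid job id"))
      case None     => Left(WrongValue("jobId", s"'$jobId' is not a known job title."))
    }

  def parseBool(raw: String): Either[ValidationError, Boolean] =
    raw.toLowerCase match {
      case "yes" | "true" => Right(true)
      case "no" | "false" => Right(false)
      case _              => Left(Invalid("bool", "wrong value received"))
    }

  // Fail-fast: the for comprehension stops at the first Left.
  def firstError(jobId: String, bool: String): Either[ValidationError, (Int, Boolean)] =
    for {
      job  <- parseJob(jobId)
      flag <- parseBool(bool)
    } yield (job, flag)

  // Collects every error instead of stopping at the first one.
  def allErrors(jobId: String, bool: String): List[ValidationError] =
    List(parseJob(jobId), parseBool(bool)).collect { case Left(e) => e }
}
```

The for comprehension suits the "do stuff only if everything is valid" case, while allErrors is closer to the errors list in the question, where errors.isEmpty decides whether to proceed.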
So, according to your question, and your comments, these are the types you're dealing with.
type ID = Long //whatever id is
def WrongValue(x: String, xx: String) :String = "?-?-?"
case class Invalid(x: String, xx: String)
Now let's create a couple of error values.
val job_ :Either[String,ID] = Left(WrongValue("x","xx"))
val bool_ :Either[Invalid,Boolean] = Left(Invalid("x","xx"))
To combine and report them you might do something like this.
val errors :List[String] =
List(job_, bool_).flatMap(_.swap.toOption.map(_.toString))
println(errors.mkString(" & "))
//?-?-? & Invalid(x,xx)
After fixing the types as @cbley explained, you can just filter your list with a pattern match:
val error = List(/* your variables */).filter(_ match {
case Left(_) => true
case _ => false
})

Understanding Scala default argument message

I am playing around with some Scala code and have met with an error message I don't quite follow. Below is my code
val ignoredIds = Array("one", "two", "three")
def csNotify(x : Any): String = {
case org: String if !ignoredIds.contains(x) =>
println( s" $x should not be here")
"one"
case org : String if ignoredIds.contains(x) =>
println(s"$x should be here")
"two"
}
csNotify("four")
The console output says that the arguments for a default function must be known. The error appears to point at the " String = ". Why would this be the case? Shouldn't the function check the two cases and return a string?
Your case clauses have nothing to match against, because you have left out the match keyword:
val ignoredIds = Array("one", "two", "three")
def csNotify(x : Any): String = x match {
case org: String if !ignoredIds.contains(x) =>
println( s" $x should not be here")
"one"
case org : String if ignoredIds.contains(x) =>
println(s"$x should be here")
"two"
}
csNotify("four")
So basically, when you pass x into the method, you have to give it to match as well.
Amit Prasad's answer already shows how to fix it, but to explain the error message:
{
case org: String if !ignoredIds.contains(x) =>
println( s" $x should not be here")
"one"
case org : String if ignoredIds.contains(x) =>
println(s"$x should be here")
"two"
}
on its own (without ... match before it) is a pattern-matching anonymous function, which can only be used where the compiler knows the argument type from the context, i.e. the expected type must be either PartialFunction[Something, SomethingElse] or a single-abstract-method type (including Something => SomethingElse).
Here the expected type is String, which isn't either of those, so the compiler complains about not knowing what the argument type is.
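For contrast, here is a small sketch of places where a bare case block does compile, because the expected type tells the compiler what the argument type is. The CaseBlocks object and the names describe and length are made up for this illustration:

```scala
object CaseBlocks {
  val ignoredIds = Array("one", "two", "three")

  // Expected type is PartialFunction[Any, String], so the argument type (Any) is known.
  val describe: PartialFunction[Any, String] = {
    case s: String if ignoredIds.contains(s) => "two"
    case _: String                           => "one"
  }

  // Expected type is the function type String => Int, which works as well.
  val length: String => Int = {
    case "" => 0
    case s  => s.length
  }
}
```

Because describe is a PartialFunction, describe.isDefinedAt(42) can be used to check whether an input is handled before applying it.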
You need to use the match keyword here to use cases, because pattern matching needs a value to match against. So use the following code in your function:
x match {
case org: String if !ignoredIds.contains(x) => ???
case org : String if ignoredIds.contains(x) => ???
}
Also, you should consider adding one more case as a default. The parameter x of def csNotify(x: Any): String is of type Any, so anything other than a String can be passed as well, such as an Int, a Boolean or any custom type. In that case, the code will fail with a scala.MatchError.
Depending on the compiler version and flags, there may also be a warning that the match is not exhaustive, as the current code does not handle all possible values of type Any.
If you add a default case to your pattern match, everything not handled by the first two cases (unexpected types or values) goes to the default case, which makes the code more robust:
def csNotify(x : Any): String = x match {
case org: String if !ignoredIds.contains(org) => ???
case org : String if ignoredIds.contains(org) => ???
case org => s"unwanted value: $org" // or any default value
}
Note: Kindly replace ??? with your intended code. :)

Null check for Double/Int Value in Spark

I am new to Spark.
How can I check for null in a Double or Int value in Scala or Spark?
For a String we can do this:
val value = (FirstString.isEmpty()) match {
case true => SecondString
case _ => FirstString
}
I searched a lot, but I only found this for String values. Can you please suggest how to do it for other data types as well?
Thanks in advance.
null is only applicable to AnyRef types (i.e. non-primitive types) in Scala. AnyVal types cannot be set to null.
For example:
// the below are AnyVal(s) and won't compile
val c: Char = null
val i: Int = null
val d: Double = null
String is an AnyRef and so can be null:
// this is ok!
val c: String = null
That's why pattern matching null against Int/Double types is not possible:
// won't compile!
null match {
case a:Int => "is a null Int"
case _ => "something else"
}
Maybe you can simply use Options. For example:
val d: Double = ...
val isNull = Option(d).isEmpty
Or you can use pattern matching:
val d: Double = ...
Option(d) match {
case Some(v) => ??? // use v
case None => ??? // the value was null
}
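Note that Option(...) can only ever be None for a reference type. A minimal sketch, assuming the value arrives as a boxed java.lang.Double (for example from a Java API); the NullableDouble object and orDefault are hypothetical names:

```scala
object NullableDouble {
  // A boxed java.lang.Double is an AnyRef and may be null; a scala.Double may not.
  // Option(boxed) is None for null, so the default is used in that case.
  def orDefault(boxed: java.lang.Double, default: Double): Double =
    Option(boxed).map(_.doubleValue).getOrElse(default)
}
```

For example, NullableDouble.orDefault(null, 0.0) falls back to the default, while a non-null boxed value is unwrapped with doubleValue.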
isEmpty is not at all the same as "check for null". Calling isEmpty on null will fail:
val s: String = null
s.isEmpty // throws NullPointerException
Int and Double can't be null (neither can any other primitive types), so there is no need to check if they are. If you are talking specifically about Spark Rows, you need to check for null before getting an Int/Double/other primitive value:
It is invalid to use the native primitive interface to retrieve a value that is null, instead a user must check isNullAt before attempting to retrieve a value that might be null.

Type Mismatch in scala case match

I am trying to create multiple dataframes in a single foreach, using Spark, as below.
I get the values delivery and click out of row.getAs("type") when I try to print them.
val check = eachrec.foreach(recrd => recrd.map(row => {
row.getAs("type") match {
case "delivery" => val delivery_data = delivery(row.get(0).toString,row.get(1).toString)
case "click" => val click_data = delivery(row.get(0).toString,row.get(1).toString)
case _ => "not sure if this impacts"
}})
)
but I get the error below:
Error:(41, 14) type mismatch; found : String("delivery") required: Nothing
case "delivery" => val delivery_data = delivery(row.get(0).toString,row.get(1).toString)
^
My plan is to create the dataframes using toDF() once I have created the individual delivery objects referenced by delivery_data and click_data, i.e.
delivery_data.toDF() and click_data.toDF().
Please provide any clue regarding the error above (in the match case).
How can I create two dataframes using toDF() in val check?
The val declarations make the first two cases return Unit, but the third case returns a String.
For instance, here z has type Unit, which is what the compiler infers for such a match:
def x = {
val z: Unit = 3 match {
case 2 => val a = 2
case _ => val b = 3
}
}
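I think you also need to convert the matched value to a String:
row.getAs("type").toString
Alternatively, giving getAs an explicit type parameter, row.getAs[String]("type"), tells the compiler the type of the matched value, so it is no longer inferred as Nothing.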
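To make every branch return a value instead of declaring a val, one option is to have the match produce the object itself. The sketch below leaves Spark out entirely: Delivery, Click, toEvent and split are hypothetical names, and each row is modelled as a plain (type, field0, field1) tuple, which is an assumption about the data:

```scala
object Events {
  sealed trait Event
  final case class Delivery(id: String, value: String) extends Event
  final case class Click(id: String, value: String) extends Event

  // Every branch yields an Option[Event], so the match has one well-defined type.
  def toEvent(kind: String, id: String, value: String): Option[Event] =
    kind match {
      case "delivery" => Some(Delivery(id, value))
      case "click"    => Some(Click(id, value))
      case _          => None // unknown types are dropped
    }

  // Splits the parsed rows into the two collections the question asks for.
  def split(rows: Seq[(String, String, String)]): (Seq[Delivery], Seq[Click]) = {
    val events = rows.flatMap { case (kind, id, value) => toEvent(kind, id, value) }
    (events.collect { case d: Delivery => d }, events.collect { case c: Click => c })
  }
}
```

Each of the two resulting sequences could then be turned into its own dataframe, which is where the asker's toDF() step would come in.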