I am new in Spark,
How can I check for for Null value in Double and Int value in scala or Spark.
Like for String We can do like this :
val value = (FirstString.isEmpty()) match {
case true => SecondString
case _ => FirstString
}
I searched for it a lot but i found only for String value. Can you please suggest me for other datatype as well.
Thanks in advance.
null is only applicable for AnyRef (i.e non primitive types) types in Scala. AnyVal types can not be set to null.
For example:
// the below are AnyVal(s) and wont compile
val c: Char = null
val i: Int = null
val d: Double = null
String is an AnyRef and so can be null:
// this is ok!
val c: String = null
That's why pattern matching nulls to Int/Double types is not possible:
// wont compile!
null match {
case a:Int => "is a null Int"
case _ => "something else"
}
May be you can simply use Options. So like:
val d: Double = ...
val isNull = Option(d).isDefined
Or you can use pattern matching:
val d: Double = ...
Option(d) match {
case Some(v) => use v
case _ => you got Null
}
isEmpty is not at all the same as "check for null". Calling isEmpty on null will fail:
val s: String = null
s.isEmpty // throws NullPointerException
Int and Double can't be null (neither can any other primitive types), so there is no need to check if they are. If you are talking specifically about Spark Rows, you need to check for null before getting an Int/Double/other primitive value:
It is invalid to use the native primitive interface to retrieve a value that is null, instead a user must check isNullAt before attempting to retrieve a value that might be null.
Related
I have an alphanumeric field in an RDD of type AnyRef.
Case1: If it's 99898, I want to cast it as Long
Case2: If it's 0099898, I want to cast it as String
Case3: If it's AB998, I want to cast it as String.
I am trying this:
try {
account_number.asInstanceOf[ Long ])
} catch {
case _: Throwable => account_number.asInstanceOf[ String ])
}
But in this, I miss the case2, because 0099898 is converted to 99898. Any ideas?
If this field is AnyRef I wouldn't expect AnyVals there at all (like Long) - Scala's numbers are not equal to Java's numbers. At best you can have there some instance of java.lang.Numeric (e.g. java.lang.Long which is NOT scala.Long).
But to turn it into Long you would have to use pattern matching (with type matching or regexp pattern matching) and conversion (NOT casting!) to
val isStringID = raw"(0[0-9]+)".r
val isLongID = raw"([0-9]+)".r
account_number match {
case isStringID(id) => id // numeric string starting with 0
case isLongID(id) => id.toLong // numeric string convertible to Long
case l: java.lang.Long => l.toLong // Java's long
case _ => throw new IllegalArgumentException("Expected long or numeric string")
}
However, I would find that completely useless - right now you have Any instead of AnyVal. You could expect it to have Long or String but it's not represented by the returned value so compiler would NOT have any information about the safe usages. Personally, I would recommend doing something imediatelly after matching e.g. wrapping it with Either or creating ADT or passing it to function which needs String or Long.
// can be exhaustively pattern matched, or .folded or passed, etc
val stringOrLong: Either[String, Long] = account_number match {
case isStringID(id) => Left(id)
case isLongID(id) => Right(id.toLong)
case l: java.lang.Long => Right(l.toLong)
case _ => throw new IllegalArgumentException("Expected long or numeric string")
}
You cannot use .asInstanceOf to turn AnyRef to Long because neither is subtype or supertype of another, and this operation would always fail.
Any
/ \
AnyVal AnyRef
| |
Long |
\ /
Nothing
.asInstanceOf would only make sense if you were moving vertically in this hierarchy, not horizontally.
Another option you have is:
def tryConvert(s: String): Either[Long, String] = {
Try(s.toLong).filter(_.toString == s) match {
case Success(value) =>
Left(value)
case Failure(_) =>
Right(s)
}
}
Code run at Scastie.
How do you typically fix the "java.io.Serializable" error below?
I am guessing the data types in my functions caused it (?). How do you avoid it OR change the result back to the right type.
def allKeys(sampledf: DataFrame): DataFrame = {......}
val afd12= afd.schema.fieldNames.contains("ID") && afd.schema.fieldNames.contains("CONNECTIDS") match {
case true => allKeys(afd)
case false => "no"
}
afd12.printSchema()
This is the error I get:
afd: java.io.Serializable = [ID: string, ADDITIONALINFO: string ... 87 more fields]
<console>:95: error: value printSchema is not a member of java.io.Serializable
afd12.printSchema()
^
You have to make sure that pattern match block
match {
case true => allKeys(afd)
case false => "no"
}
returns consistent types. Right now one branch returns Dataset[Row] and another String, so the closest common type is Serializable. The simplest fix is to return an empty DataFrame with schema of your choice, instead of no.
match {
case true => allKeys(afd)
case _ => spark.emptyDataFrame
}
I'm new to Scala and Play framework. I try to query all the data for selected columns from a data table and save them as Excel file.
Selected columns usually have different types, such as Int, Str, Timestamp, etc.
I want to convert all value types, include null into String
(null convert to empty string "")
without knowing the actual type of a column, so the code can be used for any tables.
According to Play's document, I can write the implicit converter below, however, this cannot handle null. Googled this for long time, cannot find solution. Can someone please let me know how to handle null in the implicit converter?
Thanks in advance~
implicit def valueToString: anorm.Column[String] =
anorm.Column.nonNull1[String] { (value, meta) =>
val MetaDataItem(qualified, nullable, clazz) = meta
value match {
case s: String => Right(s) // Provided-default case
case i: Int => Right(i.toString()) // Int to String
case t: java.sql.Clob => Right(t.toString()) // Blob/Text to String
case d: java.sql.Timestamp => Right(d.toString()) // Datatime to String
case _ => Left(TypeDoesNotMatch(s"Cannot convert $value: ${value.asInstanceOf[AnyRef].getClass} to String for column $qualified"))
}
}
As indicated in the documentation, if there a Column[T], allowing to parse of column of type T, and if the column(s) can be null, then Option[T] should be asked, benefiting from the generic support as Option[T].
There it is a custom Column[String] (make sure the custom one is used, not the provided Column[String]), so Option[String] should be asked.
import myImplicitStrColumn
val parser = get[Option[String]]("col")
I tried to immitate the default keyword of C#:
private class Default[T] {
private var default : T = _
def get = default
}
Then in the package object I define:
def default[T] = new Default[T].get
I expected default[Int] to be 0, but
println(default[String])
println(default[Int])
println(default[Double])
println(default[Boolean])
all prints null. However
val x = default[Int]
println(x)
prints 0. If I add a type annotation : Any to x it prints null again.
I'm guessing because println expects an argument of type Any the same is happening there.
How is it possible that assigning an expression to a variable of a more general type changes the value of that expression? I find that really counter-intuitive.
Has it something to do with boxing, so that I'm actually calling two different default functions (once with primitive int, once with Integer)? If yes, is there a way to avoid that?
After studying the generated bytecode, I realised what's actually happening. default[T] always returns null, but assigning it to a primitive calls BoxesRunTime.unboxTo... which converts null to whatever the primitive default is.
There are not so many such classes. You could process all of them explicitly:
import scala.reflect.ClassTag
def default[T: ClassTag]: T = (implicitly[ClassTag[T]] match {
case ClassTag.Boolean => false
case ClassTag.Byte => 0: Byte
case ClassTag.Char => 0: Char
case ClassTag.Double => 0: Double
case ClassTag.Float => 0: Float
case ClassTag.Int => 0: Int
case ClassTag.Long => 0: Long
case ClassTag.Short => 0: Short
case ClassTag.Unit => ()
case _ => null.asInstanceOf[T]
}).asInstanceOf[T]
scala> println(default[Int])
0
In Groovy language, it is very simple to check for null or false like:
groovy code:
def some = getSomething()
if(some) {
// do something with some as it is not null or emtpy
}
In Groovy if some is null or is empty string or is zero number etc. will evaluate to false. What is similar concise method of testing for null or false in Scala?
What is the simple answer to this part of the question assuming some is simply of Java type String?
Also another even better method in groovy is:
def str = some?.toString()
which means if some is not null then the toString method on some would be invoked instead of throwing NPE in case some was null. What is similar in Scala?
What you may be missing is that a function like getSomething in Scala probably wouldn't return null, empty string or zero number. A function that might return a meaningful value or might not would have as its return an Option - it would return Some(meaningfulvalue) or None.
You can then check for this and handle the meaningful value with something like
val some = getSomething()
some match {
case Some(theValue) => doSomethingWith(theValue)
case None => println("Whoops, didn't get anything useful back")
}
So instead of trying to encode the "failure" value in the return value, Scala has specific support for the common "return something meaningful or indicate failure" case.
Having said that, Scala's interoperable with Java, and Java returns nulls from functions all the time. If getSomething is a Java function that returns null, there's a factory object that will make Some or None out of the returned value.
So
val some = Option(getSomething())
some match {
case Some(theValue) => doSomethingWith(theValue)
case None => println("Whoops, didn't get anything useful back")
}
... which is pretty simple, I claim, and won't go NPE on you.
The other answers are doing interesting and idiomatic things, but that may be more than you need right now.
Well, Boolean cannot be null, unless passed as a type parameter. The way to handle null is to convert it into an Option, and then use all the Option stuff. For example:
Option(some) foreach { s => println(s) }
Option(some) getOrElse defaultValue
Since Scala is statically type, a thing can't be "a null or is empty string or is zero number etc". You might pass an Any which can be any of those things, but then you'd have to match on each type to be able to do anything useful with it anyway. If you find yourself in this situation, you most likely are not doing idiomatic Scala.
In Scala, the expressions you described mean that a method called ? is invoked on an object called some. Regularly, objects don't have a method called ?. You can create your own implicit conversion to an object with a ? method which checks for nullness.
implicit def conversion(x: AnyRef) = new {
def ? = x ne null
}
The above will, in essence, convert any object on which you call the method ? into the expression on the right hand side of the method conversion (which does have the ? method). For example, if you do this:
"".?
the compiler will detect that a String object has no ? method, and rewrite it into:
conversion("").?
Illustrated in an interpreter (note that you can omit . when calling methods on objects):
scala> implicit def any2hm(x: AnyRef) = new {
| def ? = x ne null
| }
any2hm: (x: AnyRef)java.lang.Object{def ?: Boolean}
scala> val x: String = "!!"
x: String = "!!"
scala> x ?
res0: Boolean = true
scala> val y: String = null
y: String = null
scala> y ?
res1: Boolean = false
So you could write:
if (some ?) {
// ...
}
Or you could create an implicit conversion into an object with a ? method which invokes the specified method on the object if the argument is not null - do this:
scala> implicit def any2hm[T <: AnyRef](x: T) = new {
| def ?(f: T => Unit) = if (x ne null) f(x)
| }
any2hm: [T <: AnyRef](x: T)java.lang.Object{def ?(f: (T) => Unit): Unit}
scala> x ? { println }
!!
scala> y ? { println }
so that you could then write:
some ? { _.toString }
Building (recursively) on soc's answer, you can pattern match on x in the examples above to refine what ? does depending on the type of x. :D
If you use extempore's null-safe coalescing operator, then you could write your str example as
val str = ?:(some)(_.toString)()
It also allows you to chain without worrying about nulls (thus "coalescing"):
val c = ?:(some)(_.toString)(_.length)()
Of course, this answer only addresses the second part of your question.
You could write some wrapper yourself or use an Option type.
I really wouldn't check for null though. If there is a null somewhere, you should fix it and not build checks around it.
Building on top of axel22's answer:
implicit def any2hm(x: Any) = new {
def ? = x match {
case null => false
case false => false
case 0 => false
case s: String if s.isEmpty => false
case _ => true
}
}
Edit: This seems to either crash the compiler or doesn't work. I'll investigate.
What you ask for is something in the line of Safe Navigation Operator (?.) of Groovy, andand gem of Ruby, or accessor variant of the existential operator (?.) of CoffeeScript. For such cases, I generally use ? method of my RichOption[T], which is defined as follows
class RichOption[T](option: Option[T]) {
def ?[V](f: T => Option[V]): Option[V] = option match {
case Some(v) => f(v)
case _ => None
}
}
implicit def option2RichOption[T](option: Option[T]): RichOption[T] =
new RichOption[T](option)
and used as follows
scala> val xs = None
xs: None.type = None
scala> xs.?(_ => Option("gotcha"))
res1: Option[java.lang.String] = None
scala> val ys = Some(1)
ys: Some[Int] = Some(1)
scala> ys.?(x => Some(x * 2))
res2: Option[Int] = Some(2)
Using pattern matching as suggested in a couple of answers here is a nice approach:
val some = Option(getSomething())
some match {
case Some(theValue) => doSomethingWith(theValue)
case None => println("Whoops, didn't get anything useful back")
}
But, a bit verbose.
I prefer to map an Option in the following way:
Option(getSomething()) map (something -> doSomethingWith(something))
One liner, short, clear.
The reason to that is Option can be viewed as some kind of collection – some special snowflake of a collection that contains either zero elements or exactly one element of a type and as as you can map a List[A] to a List[B], you can map an Option[A] to an Option[B]. This means that if your instance of Option[A] is defined, i.e. it is Some[A], the result is Some[B], otherwise it is None. It's really powerful!