Typecasting in Scala - scala

I have an alphanumeric field in an RDD of type AnyRef.
Case1: If it's 99898, I want to cast it as Long
Case2: If it's 0099898, I want to cast it as String
Case3: If it's AB998, I want to cast it as String.
I am trying this:
try {
account_number.asInstanceOf[Long]
} catch {
case _: Throwable => account_number.asInstanceOf[String]
}
But in this, I miss the case2, because 0099898 is converted to 99898. Any ideas?

If this field is AnyRef I wouldn't expect AnyVals there at all (like Long) - Scala's numbers are not equal to Java's numbers. At best you can have there some instance of java.lang.Number (e.g. java.lang.Long which is NOT scala.Long).
But to turn it into Long you would have to use pattern matching (with type matching or regexp pattern matching) and conversion (NOT casting!):
val isStringID = raw"(0[0-9]+)".r
val isLongID = raw"([0-9]+)".r
account_number match {
case isStringID(id) => id // numeric string starting with 0
case isLongID(id) => id.toLong // numeric string convertible to Long
case l: java.lang.Long => l.toLong // Java's long
case _ => throw new IllegalArgumentException("Expected long or numeric string")
}
However, I would find that completely useless: right now you have Any instead of AnyRef. You could expect it to hold a Long or a String, but that is not represented by the type of the returned value, so the compiler would NOT have any information about safe usages. Personally, I would recommend doing something immediately after matching, e.g. wrapping it in an Either, creating an ADT, or passing it to a function which needs a String or a Long.
// can be exhaustively pattern matched, or .folded or passed, etc
val stringOrLong: Either[String, Long] = account_number match {
case isStringID(id) => Left(id)
case isLongID(id) => Right(id.toLong)
case l: java.lang.Long => Right(l.toLong)
case _ => throw new IllegalArgumentException("Expected long or numeric string")
}
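One way to consume the Either from the snippet above is to fold over it. The sketch below is self-contained and makes a few assumptions that are not in the answer: the names AccountNumbers, classify and describe are hypothetical, non-numeric strings such as AB998 are kept as String (Case 3 of the question) instead of being rejected, and digit strings too long to fit in a Long are not handled.

```scala
object AccountNumbers {
  // Hypothetical names; the patterns mirror isStringID / isLongID above.
  private val LeadingZeroId = raw"(0[0-9]+)".r
  private val NumericId     = raw"([0-9]+)".r

  def classify(accountNumber: AnyRef): Either[String, Long] = accountNumber match {
    case s: String =>
      s match {
        case LeadingZeroId(id) => Left(id)         // keeps the leading zeros
        case NumericId(id)     => Right(id.toLong) // conversion, not a cast
        case other             => Left(other)      // assumption: AB998 stays a String
      }
    case l: java.lang.Long => Right(l.longValue)   // boxed Java long
    case other =>
      throw new IllegalArgumentException(s"Expected Long or String, got: $other")
  }

  // fold forces the caller to handle both sides.
  def describe(accountNumber: AnyRef): String =
    classify(accountNumber).fold(s => s"String: $s", l => s"Long: $l")
}
```

Because the result type is Either[String, Long], the compiler knows both possibilities and any caller has to deal with each of them.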
You cannot use .asInstanceOf to turn an AnyRef holding a String into a Long: neither type is a subtype or supertype of the other, and a cast never converts a value, so the operation fails with a ClassCastException.
        Any
       /   \
  AnyVal   AnyRef
     |       |
   Long      |
      \     /
      Nothing
.asInstanceOf would only make sense if you were moving vertically in this hierarchy, not horizontally.
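To make the difference between casting and converting concrete, here is a minimal sketch. It assumes the field holds a String at runtime; the object and value names are made up for illustration.

```scala
import scala.util.Try

object CastVsConvert {
  // Assumption: the AnyRef field actually holds a String at runtime.
  val accountNumber: AnyRef = "99898"

  // A cast only changes the static type; a String is not a Long, so this throws
  // a ClassCastException, which Try captures as a Failure.
  val cast: Try[Long] = Try(accountNumber.asInstanceOf[Long])

  // A conversion parses the text and builds a new Long value.
  val converted: Long = accountNumber.toString.toLong
}
```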

Another option you have is:
import scala.util.{Try, Success, Failure}
def tryConvert(s: String): Either[Long, String] = {
Try(s.toLong).filter(_.toString == s) match {
case Success(value) =>
Left(value)
case Failure(_) =>
Right(s)
}
}
Code run at Scastie.

Related

Scala Pattern Matching using Option[Type]

I am playing around with Scala at the moment and the pattern matching. I have the general idea behind it and can get the basics working. My issue is with Option[]. It is possible to use pattern matching on Option[]'s?
What I am trying to do is make a little function that will take in an Option[String] parameter and then, based on the input, return the string if it's a string and a heads-up if not. I am not too sure how to go about this though. I have tried a few things, but each attempt either fails or, as in the case below, never hits the second case.
def getString(someString: Option[String]): String =
someString match {
case s: Option[String] => someString //also tried things like case: String => ...
case _ => s"no string entered" //and things like case _ => ...
}
This is the easiest way to implement your function:
def getString(someString: Option[String]): String =
someString.getOrElse("no string entered")
If you want to use match it looks like this:
def getString(someString: Option[String]): String =
someString match {
case Some(s) => s
case _ => "no string entered"
}
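As for why the original attempt never reaches its second case: a type pattern like s: Option[String] is only a runtime type test, and both Some and None pass it. A small sketch of the difference, with hypothetical names:

```scala
object OptionMatchDemo {
  // Type pattern: true for Some and for None alike, so a following
  // `case _` could never be reached for a non-null Option.
  def typeTest(o: Option[String]): String = o match {
    case _: Option[_] => "matched the type test"
  }

  // Constructor patterns look inside the Option instead.
  def getString(o: Option[String]): String = o match {
    case Some(s) => s
    case None    => "no string entered"
  }
}
```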

How to pattern match on List('(', List[Char],')')?

I am struggling a bit with some pattern matching on a List[Char]. I would like to extract sub-lists that are enclosed by parentheses. So, I would like to extract "test" as a List[Char] when given "(test)". So basically a match on List('(', List[Char],')'). I am able to match on List('(',t,')') where t is a single character, but not a variable amount of characters.
How should this be declared?
val s = "(test)"
s match {
case List('(',t,')') => {
println("matches single character")
}
case '('::x::y => {
//x will be the first character in the List[Char] (after the '(') and y the tail
}
}
s match {
case '(' +: t :+ ')' => ...
}
Read about custom extractors in Scala and then see http://www.scala-lang.org/api/2.11.8/index.html#scala.collection.$colon$plus$ to understand how it works.
Note that it'll match any suitable Seq[Char], but a string isn't really one; it can only be converted (implicitly or explicitly). So you can use one of
val s: Seq[Char] = ...some String or List[Char]
val s = someString.toSeq
I expect that performance for String should be good enough (and if it's critical, don't use this); but for large List[Char] this will be quite slow.
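Put together, a minimal sketch could look like this. The names ParenMatch and insideParens are hypothetical, and the String is converted to a List[Char] explicitly before matching:

```scala
object ParenMatch {
  // Returns the characters between a leading '(' and a trailing ')', if both are present.
  def insideParens(s: String): Option[List[Char]] = s.toList match {
    case '(' +: inner :+ ')' => Some(inner) // parsed as ('(' +: inner) :+ ')'
    case _                   => None
  }
}
```

The :+ extractor has to walk to the end of a List, which is where the cost for long lists comes from.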

Warning about an unchecked type argument in this Scala pattern match?

This file:
object Test extends App {
val obj = List(1,2,3) : Object
val res = obj match {
case Seq(1,2,3) => "first"
case _ => "other"
}
println(res)
}
Gives this warning:
Test.scala:6: warning: non variable type-argument A in type pattern Seq[A]
is unchecked since it is eliminated by erasure
case Seq(1,2,3) => "first"
Scala version 2.9.0.1.
I don't see how an erased type parameter is needed to perform the match. That first case clause is meant to ask if obj is a Seq with 3 elements equal to 1, 2, and 3.
I would understand this warning if I had written something like:
case strings : Seq[String] => ...
Why do I get the warning, and what is a good way to make it go away?
By the way, I do want to match against something with static type of Object. In the real code I'm parsing something like a Lisp datum - it might be a String, sequence of datums, Symbol, Number, etc.
Here is some insight to what happens behind the scene. Consider this code:
class Test {
new Object match { case x: Seq[Int] => true }
new Object match { case Seq(1) => true }
}
If you compile with scalac -Xprint:12 -unchecked, you'll see the code just before the erasure phase (id 13). For the first type pattern, you will see something like:
<synthetic> val temp1: java.lang.Object = new java.lang.Object();
if (temp1.isInstanceOf[Seq[Int]]())
For the Seq extractor pattern, you will see something like:
<synthetic> val temp3: java.lang.Object = new java.lang.Object();
if (temp3.isInstanceOf[Seq[A]]()) {
<synthetic> val temp4: Seq[A] = temp3.asInstanceOf[Seq[A]]();
<synthetic> val temp5: Some[Seq[A]] = collection.this.Seq.unapplySeq[A](temp4);
// ...
}
In both cases, there is a type test to see if the object is of type Seq (Seq[Int] and Seq[A]). Type parameters will be eliminated during the erasure phase. Thus the warning. Even though the second may be unexpected, it does make sense to check the type since if object is not of type Seq that clause won't match and the JVM can proceed to the next clause. If the type does match, then the object can be cast to Seq and unapplySeq can be called.
RE: thoredge comment on the type check. Maybe we are talking about different things. I was merely saying that:
(o: Object) match {
case Seq(i) => println("seq " + i)
case Array(i) => println("array " + i)
}
translates to something like:
if (o.isInstanceOf[Seq[_]]) { // type check
val temp1 = o.asInstanceOf[Seq[_]] // cast
// verify that temp1 is of length 1 and println("seq " + temp1(0))
} else if (o.isInstanceOf[Array[_]]) { // type check
val temp1 = o.asInstanceOf[Array[_]] // cast
// verify that temp1 is of length 1 and println("array " + temp1(0))
}
The type check is used so that when the cast is done there is no class cast exception.
Whether the warning non variable type-argument A in type pattern Seq[A] is unchecked since it is eliminated by erasure is justified and whether there would be cases where there could be class cast exception even with the type check, I don't know.
Edit: here is an example:
object SeqSumIs10 {
def unapply(seq: Seq[Int]) = if (seq.sum == 10) Some(seq) else None
}
(Seq("a"): Object) match {
case SeqSumIs10(seq) => println("seq.sum is 10 " + seq)
}
// ClassCastException: java.lang.String cannot be cast to java.lang.Integer
Declaring the match object outside at least makes it go away, but I'm not sure why:
class App
object Test extends App {
val obj = List(1,2,3) : Object
val MatchMe = Seq(1,2,3)
val res = obj match {
case MatchMe => "first"
case _ => "other"
}
println(res)
}
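A likely explanation: case MatchMe is a stable identifier pattern, which is checked with == against the scrutinee, so no Seq[A] type test is generated and there is nothing to warn about. The sketch below (hypothetical names) shows that form next to an alternative that keeps a type test but uses a wildcard type argument:

```scala
object StableIdDemo {
  val obj: Object = List(1, 2, 3)
  val Expected    = Seq(1, 2, 3) // upper-case name, so it is usable as a stable identifier

  // Stable identifier pattern: compared with ==, no type pattern involved.
  def viaStableId: String = obj match {
    case Expected => "first"
    case _        => "other"
  }

  // Wildcard type argument: the type test only needs the erased class Seq.
  def viaWildcard: String = obj match {
    case s: Seq[_] if s == Expected => "first"
    case _                          => "other"
  }
}
```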

How to check for null or false in Scala concisely?

In Groovy language, it is very simple to check for null or false like:
groovy code:
def some = getSomething()
if(some) {
// do something with some as it is not null or empty
}
In Groovy, if some is null, an empty string, the number zero, etc., it will evaluate to false. What is a similarly concise method of testing for null or false in Scala?
What is the simple answer to this part of the question assuming some is simply of Java type String?
Also another even better method in groovy is:
def str = some?.toString()
which means if some is not null then the toString method on some would be invoked instead of throwing NPE in case some was null. What is similar in Scala?
What you may be missing is that a function like getSomething in Scala probably wouldn't return null, empty string or zero number. A function that might return a meaningful value or might not would have as its return an Option - it would return Some(meaningfulvalue) or None.
You can then check for this and handle the meaningful value with something like
val some = getSomething()
some match {
case Some(theValue) => doSomethingWith(theValue)
case None => println("Whoops, didn't get anything useful back")
}
So instead of trying to encode the "failure" value in the return value, Scala has specific support for the common "return something meaningful or indicate failure" case.
Having said that, Scala's interoperable with Java, and Java returns nulls from functions all the time. If getSomething is a Java function that returns null, there's a factory object that will make Some or None out of the returned value.
So
val some = Option(getSomething())
some match {
case Some(theValue) => doSomethingWith(theValue)
case None => println("Whoops, didn't get anything useful back")
}
... which is pretty simple, I claim, and won't go NPE on you.
The other answers are doing interesting and idiomatic things, but that may be more than you need right now.
Well, Boolean cannot be null, unless passed as a type parameter. The way to handle null is to convert it into an Option, and then use all the Option stuff. For example:
Option(some) foreach { s => println(s) }
Option(some) getOrElse defaultValue
Since Scala is statically typed, a thing can't be "a null or is empty string or is zero number etc". You might pass an Any which can be any of those things, but then you'd have to match on each type to be able to do anything useful with it anyway. If you find yourself in this situation, you most likely are not doing idiomatic Scala.
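For the two Groovy idioms in the question, a rough Scala counterpart might look like the sketch below. The names NullSafe, safeToString and isUsable are made up; only Option.apply, map and exists come from the standard library.

```scala
object NullSafe {
  // Rough counterpart of Groovy's some?.toString(): None instead of an NPE.
  def safeToString(some: AnyRef): Option[String] =
    Option(some).map(_.toString)

  // Rough counterpart of if (some) for a Java String: false for null and for "".
  def isUsable(s: String): Boolean =
    Option(s).exists(_.nonEmpty)
}
```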
In Scala, the expressions you described mean that a method called ? is invoked on an object called some. Regularly, objects don't have a method called ?. You can create your own implicit conversion to an object with a ? method which checks for nullness.
implicit def conversion(x: AnyRef) = new {
def ? = x ne null
}
The above will, in essence, convert any object on which you call the method ? into the expression on the right hand side of the method conversion (which does have the ? method). For example, if you do this:
"".?
the compiler will detect that a String object has no ? method, and rewrite it into:
conversion("").?
Illustrated in an interpreter (note that you can omit . when calling methods on objects):
scala> implicit def any2hm(x: AnyRef) = new {
| def ? = x ne null
| }
any2hm: (x: AnyRef)java.lang.Object{def ?: Boolean}
scala> val x: String = "!!"
x: String = "!!"
scala> x ?
res0: Boolean = true
scala> val y: String = null
y: String = null
scala> y ?
res1: Boolean = false
So you could write:
if (some ?) {
// ...
}
Or you could create an implicit conversion into an object with a ? method which invokes the specified method on the object if the argument is not null - do this:
scala> implicit def any2hm[T <: AnyRef](x: T) = new {
| def ?(f: T => Unit) = if (x ne null) f(x)
| }
any2hm: [T <: AnyRef](x: T)java.lang.Object{def ?(f: (T) => Unit): Unit}
scala> x ? { println }
!!
scala> y ? { println }
so that you could then write:
some ? { _.toString }
Building (recursively) on soc's answer, you can pattern match on x in the examples above to refine what ? does depending on the type of x. :D
If you use extempore's null-safe coalescing operator, then you could write your str example as
val str = ?:(some)(_.toString)()
It also allows you to chain without worrying about nulls (thus "coalescing"):
val c = ?:(some)(_.toString)(_.length)()
Of course, this answer only addresses the second part of your question.
You could write some wrapper yourself or use an Option type.
I really wouldn't check for null though. If there is a null somewhere, you should fix it and not build checks around it.
Building on top of axel22's answer:
implicit def any2hm(x: Any) = new {
def ? = x match {
case null => false
case false => false
case 0 => false
case s: String if s.isEmpty => false
case _ => true
}
}
Edit: This seems to either crash the compiler or doesn't work. I'll investigate.
What you ask for is something in the line of Safe Navigation Operator (?.) of Groovy, andand gem of Ruby, or accessor variant of the existential operator (?.) of CoffeeScript. For such cases, I generally use ? method of my RichOption[T], which is defined as follows
class RichOption[T](option: Option[T]) {
def ?[V](f: T => Option[V]): Option[V] = option match {
case Some(v) => f(v)
case _ => None
}
}
implicit def option2RichOption[T](option: Option[T]): RichOption[T] =
new RichOption[T](option)
and used as follows
scala> val xs = None
xs: None.type = None
scala> xs.?(_ => Option("gotcha"))
res1: Option[java.lang.String] = None
scala> val ys = Some(1)
ys: Some[Int] = Some(1)
scala> ys.?(x => Some(x * 2))
res2: Option[Int] = Some(2)
Using pattern matching as suggested in a couple of answers here is a nice approach:
val some = Option(getSomething())
some match {
case Some(theValue) => doSomethingWith(theValue)
case None => println("Whoops, didn't get anything useful back")
}
But, a bit verbose.
I prefer to map an Option in the following way:
Option(getSomething()) map (something => doSomethingWith(something))
One liner, short, clear.
The reason is that Option can be viewed as a kind of collection: a special snowflake of a collection that contains either zero elements or exactly one element of a type. Just as you can map a List[A] to a List[B], you can map an Option[A] to an Option[B]. This means that if your instance of Option[A] is defined, i.e. it is Some[A], the result is Some[B], otherwise it is None. It's really powerful!
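A small side-by-side sketch of that analogy, with hypothetical names:

```scala
object OptionAsCollection {
  // map on a List[A] gives a List[B] ...
  def lengths(xs: List[String]): List[Int] = xs.map(_.length)

  // ... and map on an Option[A] gives an Option[B]; None simply stays None.
  def length(x: Option[String]): Option[Int] = x.map(_.length)
}
```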

Scala: short form of pattern matching that returns Boolean

I found myself writing something like this quite often:
a match {
case `b` => // do stuff
case _ => // do nothing
}
Is there a shorter way to check if some value matches a pattern? I mean, in this case I could just write if (a == b) // do stuff, but what if the pattern is more complex? Like when matching against a list or any pattern of arbitrary complexity. I'd like to be able to write something like this:
if (a matches b) // do stuff
I'm relatively new to Scala, so please pardon, if I'm missing something big :)
This is exactly why I wrote these functions, which are apparently impressively obscure since nobody has mentioned them.
scala> import PartialFunction._
import PartialFunction._
scala> cond("abc") { case "def" => true }
res0: Boolean = false
scala> condOpt("abc") { case x if x.length == 3 => x + x }
res1: Option[java.lang.String] = Some(abcabc)
scala> condOpt("abc") { case x if x.length == 4 => x + x }
res2: Option[java.lang.String] = None
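The same two functions work with more complex patterns, such as the list case mentioned in the question. A short sketch, where the object and value names are only illustrative:

```scala
import scala.PartialFunction.{cond, condOpt}

object CondDemo {
  val a = List(1, 2, 3)

  // true only if the pattern matches and its body evaluates to true
  val startsWithOne: Boolean = cond(a) { case 1 :: _ => true }

  // Some(result) if the pattern matches, None otherwise
  val second: Option[Int]   = condOpt(a) { case _ :: x :: _ => x }
  val noSecond: Option[Int] = condOpt(List.empty[Int]) { case _ :: x :: _ => x }
}
```

With cond, the if from the question becomes if (cond(a) { case ... => true }) followed by the body.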
The match operator in Scala is most powerful when used in functional style. This means, rather than "doing something" in the case statements, you would return a useful value. Here is an example for an imperative style:
var value:Int = 23
val command:String = ... // we get this from somewhere
command match {
case "duplicate" => value = value * 2
case "negate" => value = -value
case "increment" => value = value + 1
// etc.
case _ => // do nothing
}
println("Result: " + value)
It is very understandable that the "do nothing" above hurts a little, because it seems superfluous. However, this is due to the fact that the above is written in imperative style. While constructs like these may sometimes be necessary, in many cases you can refactor your code to functional style:
val value:Int = 23
val command:String = ... // we get this from somewhere
val result:Int = command match {
case "duplicate" => value * 2
case "negate" => -value
case "increment" => value + 1
// etc.
case _ => value
}
println("Result: " + result)
In this case, you use the whole match statement as a value that you can, for example, assign to a variable. And it is also much more obvious that the match statement must return a value in any case; if the last case were missing, the compiler could not just make something up.
It is a question of taste, but some developers consider this style to be more transparent and easier to handle in more real-world examples. I would bet that the inventors of the Scala programming language had a more functional use in mind for match, and indeed the if statement makes more sense if you only need to decide whether or not a certain action needs to be taken. (On the other hand, you can also use if in the functional way, because it also has a return value...)
This might help:
class Matches(m: Any) {
def matches[R](f: PartialFunction[Any, R]): Unit = if (f.isDefinedAt(m)) f(m)
}
implicit def any2matches(m: Any) = new Matches(m)
scala> 'c' matches { case x: Int => println("Int") }
scala> 2 matches { case x: Int => println("Int") }
Int
Now, some explanation on the general nature of the problem.
Where may a match happen?
There are three places where pattern matching might happen: val, case and for. The rules for them are:
// throws an exception if it fails
val pattern = value
// filters for pattern, but pattern cannot be "identifier: Type",
// though that can be replaced by "id1 @ (id2: Type)" for the same effect
for (pattern <- object providing map/flatMap/filter/withFilter/foreach) ...
// throws an exception if none of the cases match
value match { case ... => ... }
There is, however, another situation where case might appear, which is function and partial function literals. For example:
val f: Any => Unit = { case i: Int => println(i) }
val pf: PartialFunction[Any, Unit] = { case i: Int => println(i) }
Both functions and partial functions will throw an exception if called with an argument that doesn't match any of the case statements. However, partial functions also provide a method called isDefinedAt which can test whether a match can be made or not, as well as a method called lift, which will turn a PartialFunction[T, R] into a Function[T, Option[R]], which means non-matching values will result in None instead of throwing an exception.
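A minimal sketch of isDefinedAt and lift, using a made-up partial function:

```scala
object PartialFunctionDemo {
  // Defined only for Int arguments.
  val describeInt: PartialFunction[Any, String] = { case i: Int => s"int $i" }

  // lift turns it into a total function that returns None where nothing matches.
  val describeOpt: Any => Option[String] = describeInt.lift
}
```

Calling describeInt("x") directly would throw a MatchError, while describeOpt("x") just returns None.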
What is a match?
A match is a combination of many different tests:
// assign anything to x
case x
// only accepts values of type X
case x: X
// only accepts values matched by pattern
case x @ pattern
// only accepts a value equal to the value X (upper case here makes a difference)
case X
// only accepts a value equal to the value of x
case `x`
// only accept a tuple of the same arity
case (x, y, ..., z)
// only accepts if extractor(value) returns true or Some(Seq()) (some empty sequence)
case extractor()
// only accepts if extractor(value) returns Some something
case extractor(x)
// only accepts if extractor(value) returns Some Seq or Tuple of the same arity
case extractor(x, y, ..., z)
// only accepts if extractor(value) returns Some Tuple2 or Some Seq with arity 2
case x extractor y
// accepts if any of the patterns is accepted (patterns may not contain assignable identifiers)
case x | y | ... | z
Now, extractors are the methods unapply or unapplySeq, the first returning Boolean or Option[T], and the second returning Option[Seq[T]], where None means no match is made, and Some(result) will try to match result as described above.
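To illustrate the three extractor shapes just described, here is a small sketch. All of the objects and methods in it (Even, Half, Words, describe, firstTwo) are invented for the example.

```scala
object ExtractorDemo {
  // unapply returning Boolean: used as `case Even()`
  object Even { def unapply(n: Int): Boolean = n % 2 == 0 }

  // unapply returning Option[T]: used as `case Half(h)`
  object Half {
    def unapply(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None
  }

  // unapplySeq returning Option[Seq[T]]: used with a variable number of sub-patterns
  object Words {
    def unapplySeq(s: String): Option[Seq[String]] = Some(s.trim.split("\\s+").toSeq)
  }

  def describe(n: Int): String = n match {
    case 0                 => "zero"
    case Half(h) if h > 10 => s"twice $h"
    case k @ Even()        => s"even $k" // binds the whole value with @
    case _                 => "odd"
  }

  def firstTwo(s: String): Option[(String, String)] = s match {
    case Words(a, b, _*) => Some((a, b))
    case _               => None
  }
}
```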
So there are all kinds of syntactic alternatives here, which just aren't possible without the use of one of the three constructions where pattern matches may happen. You may be able to emulate some of the features, like value equality and extractors, but not all of them.
Patterns can also be used in for expressions. Your code sample
a match {
case b => // do stuff
case _ => // do nothing
}
can then be expressed as
for(b <- Some(a)) //do stuff
The trick is to wrap a to make it a valid enumerator. E.g. List(a) would also work, but I think Some(a) is closest to your intended meaning.
The best I can come up with is this:
def matches[A](a:A)(f:PartialFunction[A, Unit]) = f.isDefinedAt(a)
if (matches(a){case ... =>}) {
//do stuff
}
This won't win you any style points though.
Kim's answer can be “improved” to better match your requirement:
class AnyWrapper[A](wrapped: A) {
def matches(f: PartialFunction[A, Unit]) = f.isDefinedAt(wrapped)
}
implicit def any2wrapper[A](wrapped: A) = new AnyWrapper(wrapped)
then:
val a = "a" :: Nil
if (a matches { case "a" :: Nil => }) {
println("match")
}
I wouldn't do it, however. The => }) { sequence is really ugly here, and the whole code looks much less clear than a normal match. Plus, you get the compile-time overhead of looking up the implicit conversion, and the run-time overhead of wrapping the match in a PartialFunction (not counting the conflicts you could get with other, already defined matches methods, like the one in String).
To look a little bit better (and be less verbose), you could add this def to AnyWrapper:
def ifMatch(f: PartialFunction[A, Unit]): Unit = if (f.isDefinedAt(wrapped)) f(wrapped)
and use it like this:
a ifMatch { case "a" :: Nil => println("match") }
which saves you your case _ => line, but requires double braces if you want a block instead of a single statement... Not so nice.
Note that this construct is not really in the spirit of functional programming, as it can only be used to execute something that has side effects. We can't easily use it to return a value (therefore the Unit return value), as the function is partial — we'd need a default value, or we could return an Option instance. But here again, we would probably unwrap it with a match, so we'd gain nothing.
Frankly, you're better off getting used to seeing and using match frequently, and moving away from this kind of imperative-style construct (following Madoc's nice explanation).