Why does the empty string not match as Seq.empty? - scala

EDIT: This was an old bug long since fixed in Scala 2.8 and later
During some experimentation around question Pattern matching a String as Seq[Char], I ran across another weird matching phenomenon. Consider the following code that treats a string as a sequence of characters:
def %%&#(input: String) : String = {
val uha : Seq[Char] = input
uha match {
case Seq() => "Empty"
case Seq(first @ _, 'o', 'o') => "Bar"
case _ => "Oh"
}
}
Calling this function with the empty String "" correctly yields "Empty".
However, if I rewrite the first match clause as
case Seq.empty => "Empty"
the matching of "" fails and matches the default clause instead.
Walking through the Scala library source code (which you shouldn't have to do in an ideal world :-) ) I believe that both Seq() and Seq.empty will result in RandomAccessSeq.empty. Apparently, this does not concur with the phenomenon described above because only Seq() matches the empty String.
UPDATE: Upon some further experimentation this question can be narrowed down to the following:
val list = List()
>>> list: List[Nothing] = List()
val emptySeq = Seq.empty
list == emptySeq
>>> res1: Boolean = false
This basically means that an empty Seq does not automatically equal Seq.empty.
So when matching against a constant (as opposed to using an extractor as suggested by starblue), this inequality leads to the failing match.
The same is true when interpreting the empty String as a sequence.

This appears to be a bug in the library. Do you want to file the bug or shall I?
scala> Seq.empty match {case Seq() => "yup"; case _ => "nope"}
res0: java.lang.String = yup
scala> Seq() match {case Seq.empty => "yup"; case _ => "nope"}
res1: java.lang.String = yup
scala> ("" : Seq[Char]) match {case Seq() => "yup"; case _ => "nope"}
res2: java.lang.String = yup
scala> ("" : Seq[Char]) match {case Seq.empty => "yup"; case _ => "nope"}
res3: java.lang.String = nope

In pattern matching, the unapply or unapplySeq functions are used, not apply as you seem to believe.
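To illustrate the point about extractors, here is a minimal sketch. The Words object is hypothetical (it is not part of the Scala library); it only shows that a pattern such as Words(a, b) calls unapplySeq and never apply.

```scala
// Hypothetical extractor object, for illustration only.
object Words {
  // Used by expressions such as Words("a", "b").
  def apply(ws: String*): String = ws.mkString(" ")

  // Used by patterns such as `case Words(a, b)`; apply is never consulted.
  def unapplySeq(s: String): Option[Seq[String]] =
    Some(if (s.isEmpty) Seq.empty[String] else s.split(" ").toSeq)
}

object ExtractorSketch {
  def describe(s: String): String = s match {
    case Words()          => "no words"            // unapplySeq returned an empty Seq
    case Words(only)      => s"one word: $only"    // exactly one element
    case Words(first, _*) => s"starts with $first" // two or more elements
  }
}
```

Removing apply from Words would not affect describe at all, which is why Seq() in a pattern and Seq() in an expression can behave differently.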

Related

Scala Tuple Option

If I have a Scala tuple of Options such as:
(Some(1), None)
(None, Some(1))
(None, None)
And I always want to extract the "Some" value if it exists, and otherwise get None. Is pattern matching the only way?
There is this:
def oneOf[A](tup: (Option[A], Option[A])) = tup._1.orElse(tup._2)
That will return the first option that is defined, or None if neither is.
Edit:
Another way to phrase the same thing is
def oneOf[A](tup: (Option[A], Option[A])) =
tup match { case (first, second) => first.orElse(second) }
It's longer, but perhaps more readable.
This should work:
def f(t: (Option[Int], Option[Int])): Option[Int] = t match {
case (Some(n), _) => Some(n)
case (_, Some(n)) => Some(n)
case _ => None
}
I always want to extract the Some value if it exists, and otherwise get None
You can just use orElse
def orOption[T](p: (Option[T], Option[T])): Option[T] = {
val (o1, o2) = p
o1 orElse o2
}
However, this silently decides what to do if there are two Some values: it keeps the first one.
scala> orOption((Some(1), Some(2)))
res0: Option[Int] = Some(1)
You should probably use pattern matching and then decide what to do if there are two Some values, like throw an exception. Alternatively, consider using a better encoding for the result type than Option.
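One way to make the two-Some case explicit, as a rough sketch: the name exactlyOne and the choice of Either[String, Option[A]] as the result type are assumptions for illustration, not something from the answers above.

```scala
object OneOfSketch {
  // Hypothetical helper: reports the ambiguous case instead of silently
  // preferring the first value.
  def exactlyOne[A](tup: (Option[A], Option[A])): Either[String, Option[A]] =
    tup match {
      case (Some(a), Some(b)) => Left(s"both sides defined: $a and $b")
      case (first, second)    => Right(first.orElse(second))
    }
}
```

The caller then has to handle the Left explicitly, which is the decision the orElse version hides.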

Why can't I match on a Seq.empty?

I tried matching a seq like this:
val users: Seq[User] = ....
users match {
case Seq.empty => ....
case ..
}
I got an error saying:
stable identifier required, but scala.this.Predef.Set.empty found.
Can someone explain why I can't do this? i.e. the theory behind it
Both Seq.apply and Seq.empty are implemented in GenericCompanion, which has no unapply method, so you'd think that pattern matching wouldn't be possible, but you're still able to pattern match on Seq() because Seq.unapplySeq(), implemented in SeqFactory, makes that available.
From the unapplySeq() docs:
This method is called in a pattern match { case Seq(...) => }.
More background
Collections make pattern matching possible via the unapplySeq() method, which gets called when the compiler sees something like case List() => ....
It's interesting that List(42) is the same thing as List.apply(42) but not so in pattern matching:
lst match {
case List(8) => ... // OK
case List.apply(8) => ... // won't compile
}
The same principle applies to Seq() and Seq.empty.
Match on Seq() or Nil instead:
scala> Seq.empty
res0: Seq[Nothing] = List()
scala> val a = Seq(1,2,3)
a: Seq[Int] = List(1, 2, 3)
scala> val b = Seq()
b: Seq[Nothing] = List()
scala> a match {case Seq() => "empty"
| case _ => "other"
| }
res1: String = other
scala> b match {case Seq() => "empty"
| case _ => "other"
| }
res2: String = empty
See @jwvh's answer for technical reasons why.
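For comparison, a small sketch of three ways to match an empty sequence that do compile. The names Empty, viaStableId, viaExtractor and viaGuard are made up for this example. A val is a stable identifier, so it can be used as a constant pattern (compared with ==), whereas Seq.empty is a method call and cannot.

```scala
object StableIdSketch {
  // A val is a stable identifier; the capitalised name makes the pattern
  // below a constant pattern rather than a fresh variable binding.
  val Empty: Seq[Int] = Seq.empty

  def viaStableId(xs: Seq[Int]): String = xs match {
    case Empty => "empty" // compared using ==
    case _     => "other"
  }

  def viaExtractor(xs: Seq[Int]): String = xs match {
    case Seq() => "empty" // goes through Seq.unapplySeq
    case _     => "other"
  }

  def viaGuard(xs: Seq[Int]): String = xs match {
    case s if s.isEmpty => "empty"
    case _              => "other"
  }
}
```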

Scala: How to simplify nested pattern matching statements

I am writing a Hive UDF in Scala (because I want to learn scala). To do this, I have to override three functions: evaluate, initialize and getDisplayString.
In the initialize function I have to:
Receive an array of ObjectInspector and return an ObjectInspector
Check if the array is null
Check if the array has the correct size
Check if the array contains the object of the correct type
To do this, I am using pattern matching and came up with the following function:
override def initialize(genericInspectors: Array[ObjectInspector]): ObjectInspector = genericInspectors match {
case null => throw new UDFArgumentException(functionNameString + ": ObjectInspector is null!")
case _ if genericInspectors.length != 1 => throw new UDFArgumentException(functionNameString + ": requires exactly one argument.")
case _ => {
listInspector = genericInspectors(0) match {
case concreteInspector: ListObjectInspector => concreteInspector
case _ => throw new UDFArgumentException(functionNameString + ": requires an input array.")
}
PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(listInspector.getListElementObjectInspector.asInstanceOf[PrimitiveObjectInspector].getPrimitiveCategory)
}
}
Nevertheless, I have the impression that the function could be made more legible and, in general, prettier since I don't like to have code with too many levels of indentation.
Is there an idiomatic Scala way to improve the code above?
It's typical for patterns to include other patterns. The type of x here is String.
scala> val xs: Array[Any] = Array("x")
xs: Array[Any] = Array(x)
scala> xs match {
| case null => ???
| case Array(x: String) => x
| case _ => ???
| }
res0: String = x
The idiom for "any number of args" is "sequence pattern", which matches arbitrary args:
scala> val xs: Array[Any] = Array("x")
xs: Array[Any] = Array(x)
scala> xs match { case Array(x: String) => x case Array(_*) => ??? }
res2: String = x
scala> val xs: Array[Any] = Array(42)
xs: Array[Any] = Array(42)
scala> xs match { case Array(x: String) => x case Array(_*) => ??? }
scala.NotImplementedError: an implementation is missing
at scala.Predef$.$qmark$qmark$qmark(Predef.scala:230)
... 32 elided
scala> Array("x","y") match { case Array(x: String) => x case Array(_*) => ??? }
scala.NotImplementedError: an implementation is missing
at scala.Predef$.$qmark$qmark$qmark(Predef.scala:230)
... 32 elided
This answer should not be construed as advocating matching your way back to type safety.
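Applied to the question, the nested match could be flattened roughly like this. This is only a sketch: Inspector, ListInspector and PrimitiveInspector are hypothetical stand-ins for the Hive types, and IllegalArgumentException stands in for UDFArgumentException, so that the example is self-contained.

```scala
object InitializeSketch {
  // Hypothetical stand-ins for the Hive ObjectInspector hierarchy.
  sealed trait Inspector
  final case class ListInspector(elementCategory: String) extends Inspector
  final case class PrimitiveInspector(category: String) extends Inspector

  // One flat match: null check, size check and type check are all patterns.
  def initialize(inspectors: Array[Inspector]): String = inspectors match {
    case null =>
      throw new IllegalArgumentException("ObjectInspector is null!")
    case Array(list: ListInspector) => // exactly one element of the right type
      list.elementCategory
    case Array(_) =>                   // exactly one element, wrong type
      throw new IllegalArgumentException("requires an input array.")
    case _ =>                          // zero elements, or more than one
      throw new IllegalArgumentException("requires exactly one argument.")
  }
}
```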

Why is this matching not allowed in scala?

I have to find out whether an element is in a list using Scala, and I am only allowed to use recursion. Why is the following code not working? The match statement seems correct to me. My IDE gives an error on all three case statements: type mismatch, Boolean required.
def isInN(x: Int, l: List[Int]): Boolean = (l,l.head) match {
case Nil,_ => false
case _,x => true
case _,_ => isInN (x,l.tail)
}
You're matching a tuple, so you need to use a proper tuple pattern, i.e. just add parentheses:
def isInN(x: Int, l: List[Int]): Boolean = (l, l.head) match {
case (Nil, _) => false
case (_, x) => true
case (_, _) => isInN (x, l.tail)
}
This compiles, but won't quite work like you want it to. l.head will throw an exception if the list is empty. And x within the match is a fresh variable that matches anything, so the last case can never be reached. To match against the value of an existing identifier, you need to surround it with backticks; otherwise a lowercase name in a pattern just binds whatever is there.
To use a similar pattern match without calling l.head, you can pattern match on the List itself:
def isInN(x: Int, l: List[Int]): Boolean = l match {
case Nil => false
case `x` :: _ => true
case head :: tail => isInN (x, tail)
}
Though this is all trumped by contains in the standard collections library:
scala> List(1, 2, 3, 4).contains(2)
res5: Boolean = true
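The difference between a plain lowercase name and a backticked name in a pattern can be seen in isolation. The function names below are made up for this sketch:

```scala
object BacktickSketch {
  // `case x` introduces a NEW variable x that shadows the parameter,
  // so this matches every candidate.
  def bindsAnything(x: Int, candidate: Int): Boolean = candidate match {
    case x => true
  }

  // With backticks, the pattern compares the candidate to the parameter x.
  def comparesToX(x: Int, candidate: Int): Boolean = candidate match {
    case `x` => true
    case _   => false
  }
}
```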

Initializing Scala's Option[String] from a string so that certain value leads to None

Is there a shorter way of doing this:
val accessControlAllowOrigin = c_dash.getString("access-control-allow-origin") match {
case "" => None
case x => Some(x)
}
This is reading in a Typesafe Config value, where empty string denotes the absence of such a config (in Typesafe Config it's good manners to include all values, not leaving anything out).
Is there something like:
val sopt = Option( s, "magic" )
..which would provide either Some(s) or None if s's value is "magic"?
By looking at the doc I came to:
scala> def f(s: String) = (Some(s) filter( _ != "magic" ))
f: (s: String)Option[String]
scala> f("aaa")
res1: Option[String] = Some(aaa)
scala> f("magic")
res2: Option[String] = None
Is that the simplest?
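You can wrap the value in an Option and use filter, which works the same for an Option as it would for any other collection:
Option(c_dash.getString("access-control-allow-origin")).filter(_.nonEmpty)
Anything not matching the filter predicate will become None. (Calling filter directly on the String would filter its characters and return a String, not an Option.)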
I may be missing something obvious, but how about:
val sopt = if (s == "magic") None else Some(s)
Or, for your f():
def f(s:String) = if (s == "magic") None else Some(s)
or, more generically:
def noneIfDefault(s:String, default: String) = if (s == default) None else Some(s)
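The two suggestions can be combined into one small helper. This is a sketch only: noneIf is a made-up name, and treating null as absent is an assumption that may or may not suit your configuration source.

```scala
object NoneIfSketch {
  // Hypothetical helper: None when s is null or equals the sentinel value.
  // Option(s) maps null to None; filter then drops the sentinel.
  def noneIf(s: String, sentinel: String): Option[String] =
    Option(s).filter(_ != sentinel)
}
```

For the config case in the question this would be called as NoneIfSketch.noneIf(c_dash.getString("access-control-allow-origin"), "").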