Pattern matching in Scala for String and Int

Why does pattern matching in Scala work for String and AnyVals such as Int?
Usually we see things like Case classes or Extractors...

Extractors and case classes are used just for two of 13 kinds of patterns in Scala, "Extractor patterns" and "Constructor patterns" respectively. You can't use Int or String in this kind of pattern (case String(x)). But you can use them in other kinds:
Typed patterns, as in case x: String. In that case there is nothing special about String, you can do the same with any class (but there is something special about Int and other primitives: case x: Int actually checks if the received object is a java.lang.Integer in most cases).
Literal patterns, as in case 0 or case "". Again, nothing special about strings, this works for all literals.
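A minimal sketch of both kinds of pattern on Int and String values. The function name describe is made up for illustration:

```scala
// Literal patterns come first because they are more specific than typed patterns.
def describe(v: Any): String = v match {
  case 0         => "literal zero"           // literal pattern on an Int
  case ""        => "empty string literal"   // literal pattern on a String
  case n: Int    => s"some other Int: $n"    // typed pattern (checks for a boxed Integer here)
  case s: String => s"some other String: $s" // typed pattern
  case _         => "something else"
}
```

Note that no unapply is involved anywhere in this match.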

java.lang.String is enriched with scala.collection.immutable.StringOps (http://www.scala-lang.org/api/2.11.8/#scala.collection.immutable.StringOps), which mixes in scala.collection.immutable.StringLike (http://www.scala-lang.org/api/2.11.8/#scala.collection.immutable.StringLike). There you can find complementary methods, like apply.
String is a bit special as well: you can convert it to a List of Chars and then use List patterns such as case List(a, b) or case x :: xs, bearing in mind that a and b will be Chars, x is a Char and xs is a List[Char].
All primitive types have Rich* classes (e.g. scala.runtime.RichBoolean, scala.runtime.RichByte).
The value class mechanism is used to enrich all of the above-mentioned types (http://docs.scala-lang.org/overviews/core/value-classes.html). At compile time they have a wrapper type, like RichBoolean or RichInt, but at runtime they are plain Boolean or Int values, which avoids the overhead of allocating wrapper objects.
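The String-to-List idea mentioned above can be sketched as follows. The function name firstTwo is hypothetical:

```scala
// Convert the String to a List[Char] first, then use ordinary List patterns.
def firstTwo(s: String): String = s.toList match {
  case List(a, b) => s"exactly two chars: $a and $b"         // a and b are Chars
  case x :: xs    => s"head $x, tail of length ${xs.length}" // x: Char, xs: List[Char]
  case Nil        => "empty"
}
```

The conversion is explicit here: the patterns are matched against the List, not against the String itself.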

val x: Any = 5
def f[T](v: T) = v match {
case _: Int => "Int"
case _: String => "String"
case _ => "Unknown"
}

You don't have to define unapply in a class to be able to use that class in switch/case-style pattern matching. unapply is used to deconstruct an object, so if you want to match in List style (case x :: xs), you should use unapply/unapplySeq. A good example here is regexps, as they are constructed from strings: "something".r (note the .r at the end).
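As a rough illustration of the regex case, the sketch below builds a Regex from a string with .r and uses it as an extractor. The names Date and year are invented for this example:

```scala
// A Regex built with .r provides unapplySeq, so it can be used as an extractor.
val Date = """(\d{4})-(\d{2})-(\d{2})""".r

def year(s: String): Option[Int] = s match {
  case Date(y, _, _) => Some(y.toInt) // each capture group is bound as a String
  case _             => None          // the whole input must match the regex
}
```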

Related

Extract sublist of descendants in Scala

I have a class Foo extends Bar and a List or other collection of base class:
val bars: Iterable[Bar]
I need to extract all Foo elements from the collection. The following is the code:
val fooes: Iterable[Foo] = bars
.filter(x => Try(x.asInstanceOf[Foo]).isSuccess)
.map(_.asInstanceOf[Foo])
Is there a more concise approach?
val fooes: Iterable[Foo] = bars.collect{case foo:Foo => foo}
The .collect() method takes a partial-function as its parameter. In this case the function is defined only for Foo types. All others are ignored.
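A self-contained sketch of the collect approach, with placeholder Bar and Foo classes standing in for the asker's types:

```scala
// Placeholder hierarchy for illustration only.
class Bar
class Foo extends Bar

val bars: Iterable[Bar] = List(new Foo, new Bar, new Foo)

// The partial function is defined only for Foo, so plain Bar elements are dropped
// and the result is statically typed as Iterable[Foo].
val fooes: Iterable[Foo] = bars.collect { case foo: Foo => foo }
```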
A couple of rewrites worth remembering in general:
filter followed by map can be written as collect
isInstanceOf followed by asInstanceOf can be written as a pattern match with a typed pattern
Hence the following discouraged style
bars
.filter { _.isInstanceOf[Foo] }
.map { _.asInstanceOf[Foo] }
can be rewritten in idiomatic style as
bars collect { case foo: Foo => foo }
...writing type tests and casts is rather verbose in Scala. That's
intentional, because it is not encouraged practice. You are usually
better off using a pattern match with a typed pattern. That's
particularly true if you need to do both a type test and a type cast,
because both operations are then rolled into a single pattern match.
Note that a typed pattern is still just a runtime type check followed by a runtime type cast; that is, it is merely nicer stylistic clothing, not an increase in type safety. For example
scala -print -e 'lazy val result: String = (42: Any) match { case v: String => v }'
expands to something like
<synthetic> val x1: Object = scala.Int.box(42);
if (x1.$isInstanceOf[String]()) {
<synthetic> val x2: String = (x1.$asInstanceOf[String]());
...
}
where we clearly see type check isInstanceOf followed by type cast asInstanceOf.

When is case syntactically significant?

A/a case, not case case.
Apparently case a matches anything and binds it to the name a, while case A looks for an A variable and matches anything that == considers equal to A. This came as quite a surprise to me; while I know Scala is case sensitive, I never expected identifier case to affect the parsing rules.
Is it common for Scala's syntax to care about the case of identifiers, or is there only a small number of contexts in which this happens? If there's only a small number of such contexts, what are they? I couldn't find anything on Google; all I got were results about pattern matching.
There is one more that is similar in nature, called a type pattern. In a type pattern, a simple identifier that starts with a lower case letter is a type variable, and all others attempt to match actual types (except _).
For example:
val a: Any = List(1, 2, 3)
val c = 1
// z is a type variable
a match { case b: List[z] => a }
// Type match on `Int`
a match { case b: List[Int] => a }
// type match on the singleton c.type (not a simple lower case identifier)
// (doesn't actually compile because c.type will never conform)
a match { case b: List[c.type] => a }
Type matching like the first example is lesser-known because, well, it's hardly used.
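For the value-pattern case raised in the question (case a versus case A), a small sketch may help. The names Limit, limit, upper, lower and quoted are all invented for illustration. Backticks turn a lower-case identifier into a stable identifier pattern:

```scala
val Limit = 10
val limit = 10

// Upper-case identifier: stable identifier pattern, compared with ==.
def upper(n: Int): String = n match {
  case Limit => "equals Limit"
  case _     => "other"
}

// Lower-case identifier: variable pattern, matches anything and shadows the outer val.
def lower(n: Int): String = n match {
  case limit => s"bound to $limit"
}

// Backticks force the lower-case name to be treated as a stable identifier.
def quoted(n: Int): String = n match {
  case `limit` => "equals limit"
  case _       => "other"
}
```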

Selector of pattern match being exhaustive

Looking at the Scala doc for sealed classes, it says:
If the selector of a pattern match is an instance of a sealed class, the compilation of pattern matching can emit warnings which diagnose that a given set of patterns is not exhaustive, i.e. that there is a possibility of a MatchError being raised at run-time.
I don't quite understand what they meant in this paragraph. My understanding is that if a switch case doesn't cover all the possibilities, then we'll get a warning at compile time saying we might get an error at run time. Is this correct?
I find it strange, because how can we cover ALL the scenarios in a switch case? We would have to match all possible strings, which is just silly, so I take it my understanding is incorrect. Someone care to elucidate, please?
What the paragraph is saying is that in case you have a fixed hierarchy structure like this:
sealed trait Foo
class Bar extends Foo
class Baz extends Foo
class Zab extends Foo
Then when you pattern match on it, the compiler can infer if you've attempted to match on all possible types extending the sealed trait, in this example:
def f(foo: Foo) = foo match {
| case _: Bar => println("bar")
| case _: Baz => println("baz")
| }
<console>:13: warning: match may not be exhaustive.
It would fail on the following input: Zab()
def f(foo: Foo) = foo match {
^
f: (foo: Foo)Unit
Note the beginning of the document says:
A sealed class may not be directly inherited, except if the inheriting
template is defined in the same source file as the inherited class.
This differs from a final class in Java, which cannot be inherited at all, even from within the same file (Java only gained its own sealed classes in Java 17). This is why this logic won't work for String in Scala, which is an alias for java.lang.String. The compiler warning may only be emitted for Scala types that match the above criteria.
I find it strange, because how can we cover ALL the scenarios in a switch case? We would have to match all possible strings
Yes, if the selector has type String (except it isn't a sealed class, because that's a Scala concept and String is a Java class).
which is just silly
No. For strings, you just need a catch-all case, e.g.
val x: String = ...
x match {
case "a" => ...
case "b" => ...
case _ => ...
}
is an exhaustive match: whatever x is, it matches one of the cases. More usefully, you can have:
val x: Option[A] = ...
x match {
case Some(y) => ...
case None => ...
}
and the compiler will be aware the match is exhaustive even without a catch-all case.
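The same reasoning applies to your own sealed hierarchies. A minimal sketch, with a made-up Shape hierarchy:

```scala
// Hypothetical sealed hierarchy: all subclasses live in this file,
// so the compiler knows the complete set of cases.
sealed trait Shape
final case class Circle(r: Double) extends Shape
final case class Square(side: Double) extends Shape

// Exhaustive without a catch-all: removing either case should produce
// a "match may not be exhaustive" warning.
def area(s: Shape): Double = s match {
  case Circle(r) => math.Pi * r * r
  case Square(a) => a * a
}
```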
The wildcard character allows us to cover all the scenarios.
something match {
case one => ...
case two => ...
case _ => ...
}
It is selected whenever all other cases don't match; that is, it is the default case.

Can I use @switch and Enumerations?

Can I use switch-case for pattern matching on enumerations?
I tried
import scala.annotation.switch
object Foo extends Enumeration {
val First = Value
val Second = Value
val Third = Value
}
object Main {
def foo(x: Foo.Value) = (x: @switch) match {
case Foo.First => 1
case Foo.Second => 2
case Foo.Third => 3
}
}
but get the following warning (Scala 2.11.4):
warning: could not emit switch for @switch annotated match
def foo(x: Foo.Value) = (x: @switch) match {
I then tried defining the enumeration in Java instead, since Java's enums are different than Scala's Enumeration. Still no luck.
Is @switch pattern matching only available on primitive types?
To complete Regis's answer: in Scala In Depth, Joshua Suereth states that the following conditions must be true for Scala to apply the tableswitch optimization:
The matched value must be a known integer.
The matched expression must be “simple.” It can’t contain any type checks, if statements, or extractors.
The expression must also have its value available at compile time.
There should be more than two case statements.
The Foo object does not meet any of the above criteria, so it is not eligible for the tableswitch optimisation.
The point of the switch annotation is to make sure that your match is compiled into a tableswitch or lookupswitch JVM instruction. Those instructions only work on ints, which means that the switch annotation will only have any effect on types that can safely fit in an Int, meaning Int itself as well as Char, Byte, Short and Boolean. In addition, the values you match against have to be literal values (as opposed to values stored in a val). Given that an Enumeration value is a reference, it is not compatible with the switch annotation. The restriction to literal values actually means that there is probably no way to use this annotation for Short and Byte, for purely syntactic reasons: there is no support for literal shorts and bytes in Scala, so you have to use a literal int along with a type ascription, as in 123: Byte, but this is not accepted as a pattern.
So that leaves only Int, Char and Boolean as valid types (the usefulness of using @switch for a boolean value is dubious, to say the least).
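A sketch of a match that satisfies these conditions: an Int selector with literal cases. The second function is only one possible workaround for the Enumeration case, not something taken from the answers above. It matches on the integer id of the value, which assumes the default ids assigned by Enumeration (starting at 0) and is therefore brittle if values are reordered. The names name and fooCode are hypothetical:

```scala
import scala.annotation.switch

// Int selector with literal cases: eligible for tableswitch/lookupswitch.
def name(n: Int): String = (n: @switch) match {
  case 1 => "one"
  case 2 => "two"
  case 3 => "three"
  case _ => "many"
}

object Foo extends Enumeration {
  val First, Second, Third = Value
}

// Assumption: default ids, so First.id == 0, Second.id == 1, Third.id == 2.
def fooCode(x: Foo.Value): Int = (x.id: @switch) match {
  case 0 => 1
  case 1 => 2
  case 2 => 3
  case _ => -1
}
```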

Pattern matching a String as Seq[Char]

In Scala it is possible to formulate patterns based on the individual characters of a string by treating it as a Seq[Char].
An example of this feature is mentioned in A Tour of Scala.
This is the example code used there:
object RegExpTest1 extends Application {
def containsScala(x: String): Boolean = {
val z: Seq[Char] = x
z match {
case Seq('s','c','a','l','a', rest @ _*) =>
println("rest is "+rest)
true
case Seq(_*) =>
false
}
}
}
The problem I have with this is the third line of the snippet:
val z: Seq[Char] = x
Why is this sort of cast necessary? Shouldn't a String behave like a Seq[Char] under all circumstances (which would include pattern matching)? However, without this conversion, the code snippet will not work.
There is some real abuse of terminology going on in the question and the comments. There is no cast in this code, and especially "So basically, this is a major concession to Java interoperability, sacrificing some type soundness" has no basis in reality.
A Scala cast looks like this: x.asInstanceOf[Y].
What you see above is an assignment: val z: Seq[Char] = x
This assignment is legal because there is an implicit conversion from String to Seq[Char]. I emphasize again, this is not a cast. A cast is an arbitrary assertion which can fail at runtime. There is no way for the implicit conversion to fail.
The problem with depending on implicit conversions between types, and the answer to the original question, is that implicit conversions only take place if the original value doesn't type check. Since it's perfectly legal to match on a String, no conversion takes place, the match just fails.
Not 100% sure if this is correct, but my intuition says that without this explicit cast you would pattern match against java.lang.String, which is not what you want.
The explicit cast forces the Scala compiler to use Predef.stringWrapper implicit conversion; thus, as RichString extends Seq[Char], you are able to do a pattern match as if the string were a sequence of characters.
I'm going to echo everything that andri said. For interoperability, Scala strings are java.lang.Strings. In Predef, there's an implicit conversion from String to RichString, which implements Seq[Char].
A perhaps nicer way of coding the pattern match, without needing an intermediate val z to hold the Seq[Char]:
def containsScala(x: String): Boolean = {
(x: Seq[Char]) match {
...
}
}
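A complete version of that idea might look like the sketch below. The function name restAfterScala is hypothetical, and unlike the original it returns the remaining characters rather than a Boolean:

```scala
// The type ascription (x: Seq[Char]) triggers the implicit conversion,
// so the patterns are matched against a Seq[Char] rather than a String.
def restAfterScala(x: String): Option[String] = (x: Seq[Char]) match {
  case Seq('s', 'c', 'a', 'l', 'a', rest @ _*) => Some(rest.mkString)
  case _                                      => None
}
```

Note that, like the original, this only recognises "scala" at the start of the string.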