Difference between position of 'if's in pattern matching - scala

Is there a difference between the following two ways of pattern matching:
foo match {
case a if(cond) => println("bar")
case a => println("baz")
case _ => println("default")
}
and
foo match {
case a => if (cond) println("bar") else println("baz")
case _ => println("default")
}

In terms of what you are trying to accomplish, there is no difference. But semantically there is a difference in what you're actually doing.
In the first case, you have a pattern with a guard suffix (the condition). The guard is evaluated only if the pattern in the case matches. If the guard expression evaluates to true, the pattern match succeeds.
In the second case, when the pattern in the case matches, a partial function is executed. Within that partial function body is a condition, and this works out like any function body with a condition inside.
Check out Section 8.4 of the Scala Language Specification for details on pattern matching. It is riveting.
Finally, note both your examples will produce errors or warnings (depending on Scala version) because the defaults are unreachable. I know it is just a contrived example, but just sayin'.

8.4 of the spec says evaluation of expressions may not be in textual order, which is why guards should not be side-effecting.
So if your conditional test is side-effecting, don't put it in a guard.

Related

Scala match case with multiple branch with if

I have a match case with if and the expression is always the same.
I put some pseudo code:
value match {
case A => same expression
case B(_) if condition1 => same expression
case _ if condition2 => same expression
...
case _ => different expression //similar to an else
}
The match contains both case object (case A) matching and case class(case B(_))
Is it the best practice?
Try to explain this code in words. "This function returns one of two values. The first is returned if the input is A. Or if the input is of type B and a condition holds. Oh, or if a different condition holds. Otherwise, it's the other value". That sounds incredibly complex to me.
I have to recommend breaking this down at least a bit. At minimum, you've got two target expressions, and which one is chosen depends on some predicate of value. That sounds like a Boolean to me. Assuming value is of some trait type Foo (which A.type and B extend), you could write
sealed trait Foo {
def isFrobnicated: Boolean = this match {
case A => true
case B(_) if condition1 => true
case _ => condition2
}
}
...
if (value.isFrobnicated) {
same expression
} else {
different expression
}
Now the cognitive load is split into two different, smaller chunks of code to digest, and presumably isFrobnicated will be given a self-documenting name and a chunk of comments explaining why this distinction is important. Anyone reading the bottom snippet can simply understand "Oh, there's two options, based on the frobnication status", and if they want more details, there's some lovely prose they can go read in the isFrobnicated docs. And all of the complexity of "A or B if this or anything if that" is thrown into its own function, separate from everything else.
If A and B don't have a common supertype that you control, then you can always write a standalone function, an implicit class, or (if you're in Scala 3) a proper extension method. Take your pick.
Depending on your actual use case, there may be more that can be done, but this should be a start.

When is case syntactically significant?

A/a case, not case case.
Apparently case a matches anything and binds it to the name a, while case A looks for an A variable and matches anything == considers equal to A. This came as quite a surprise to me; while I know Scala is case sensitive, I never expected identifier case to affect the parsing rules.
Is it common for Scala's syntax to care about the case of identifiers, or is there only a small number of contexts in which this happens? If there's only a small number of such contexts, what are they? I couldn't find anything on Google; all I got were results about pattern matching.
There is one more that is similar in nature, called a type pattern. In a type pattern, a simple identifier that starts with a lower case letter is a type variable, and all others are attempt to match actual types (except _).
For example:
val a: Any = List(1, 2, 3)
val c = 1
// z is a type variable
a match { case b: List[z] => a }
// Type match on `Int`
a match { case b: List[Int] => a }
// type match on the singleton c.type (not a simple lower case identifier)
// (doesn't actually compile because c.type will never conform)
a match { case b: List[c.type] => a }
Type matching like the the first example is lesser-known because, well, it's hardly used.

For returning Option from pattern matching, use Try?

I have lots of pattern matching code that looks as follows: If some Foo matches, a Some of Bar is returned, otherwise None, and there are lots of Foos and long constructors for Bars. Something like:
y match {
case Foo1(z) => Some(Bar1(z))
case Foo2(z) => Some(Bar2(z))
case Foo3(z) => Some(Bar3(z))
case Foo4(z) => Some(Bar4(z))
case _ => None
}
In the actual code, the constructors on the right hand side of the arrows are more complex, and there are more cases.
Now, in order to get rid of the repetitive option constructors (Some), I could do:
Try(
y match {
case Foo1(z) => Bar1(z)
case Foo2(z) => Bar2(z)
case Foo3(z) => Bar3(z)
case Foo4(z) => Bar4(z)
}
).toOption
This looks significantly cleaner to me, and from a semantic perspective it is justifiable, as the case _ is actually a situation that should not occur, thus modeling it as an exception seems justifiable. Note that it is the repetitive Somes that bug me, not the last case.
My question is whether there is some (e.g. performance) penalty to the latter approach I am not aware of.
The second version is less readable, and less correct*. There isn't much of a point in using Try if you're just going to discard the error to None. The Try is only there to catch match errors and convert them to None, but you can do that in one like with case _ => None.
There would be slightly more overhead wrapping in Try, but not enough to matter. Correctness and readability should come first.
If you really don't want to have that extra case, consider wrapping y in Option and using collect:
Option(y) collect {
case Foo1(z) => Bar1(z)
case Foo2(z) => Bar2(z)
case Foo3(z) => Bar3(z)
case Foo4(z) => Bar4(z)
}
Using collect on an Option will wrap the cases in the partial function that are defined in Some, and any cases that do not match will become None.
*By correct, I mean widely accepted use of Try.

Scala: pattern matching Option with case classes and code blocks

I'm starting to learn the great Scala language ang have a question about "deep" pattern matching
I have a simple Request class:
case class Request(method: String, path: String, version: String) {}
And a function, that tries to match an request instance and build a corresponding response:
def guessResponse(requestOrNone: Option[Request]): Response = {
requestOrNone match {
case Some(Request("GET", path, _)) => Response.streamFromPath(path)
case Some(Request(_, _, _)) => new Response(405, "Method Not Allowed", requestOrNone.get)
case None => new Response(400, "Bad Request")
}
}
See, I use requestOrNone.get inside case statement to get the action Request object. Is it type safe, since case statement matched? I find it a bit of ugly. Is it a way to "unwrap" the Request object from Some, but still be able to match Request class fields?
What if I want a complex calculation inside a case with local variables, etc... Can I use {} blocks after case statements? I use IntelliJ Idea with official Scala plugin and it highlights my brackets, suggesting to remove them.
If that is possible, is it good practice to enclose matches in matches?
... match {
case Some(Request("GET", path, _)) => {
var stream = this.getStream(path)
stream match {
case Some(InputStream) => Response.stream(stream.get)
case None => new Response(404, "Not Found)
}
}
}
For the first part of your question, you can name the value you match against with # :
scala> case class A(i: Int)
defined class A
scala> Option(A(1)) match {
| case None => A(0)
| case Some(a # A(_)) => a
| }
res0: A = A(1)
From the Scala Specifications (8.1.3 : Pattern Binders) :
A pattern binder x#p consists of a pattern variable x and a pattern p.
The type of the variable x is the static type T of the pattern p. This
pattern matches any value v matched by the pattern p, provided the
run-time type of v is also an instance of T , and it binds the
variable name to that value.
However, you do not need to in your example: since you're not matching against anything about the Request but just its presence, you could do :
case Some(req) => new Response(405, "Method Not Allowed", req)
For the second part, you can nest matches. The reason Intellij suggests removing the braces is that they are unnecessary : the keyword case is enough to know that the previous case is done.
As to whether it is a good practice, that obviously depends on the situation, but I would probably try to refactor the code into smaller blocks.
You can rewrite the pattern as following (with alias).
case Some(req # Request(_, _, _)) => new Response(405, "Method Not Allowed", req)
You cannot use code block in pattern, only guard (if ...).
There are pattern matching compiler plugin like rich pattern matching.

Why doesn't Scala optimize calls to the same Extractor?

Take the following example, why is the extractor called multiple times as opposed to temporarily storing the results of the first call and matching against that. Wouldn't it be reasonable to assume that results from unapply would not change given the same string.
object Name {
val NameReg = """^(\w+)\s(?:(\w+)\s)?(\w+)$""".r
def unapply(fullName: String): Option[(String, String, String)] = {
val NameReg(fname, mname, lname) = fullName
Some((fname, if (mname == null) "" else mname, lname))
}
}
"John Smith Doe" match {
case Name("Jane", _, _) => println("I know you, Jane.")
case Name(f, "", _) => println(s"Hi ${f}")
case Name(f, m, _) => println(s"Howdy, ${f} ${m}.")
case _ => println("Don't know you")
}
Wouldn't it be reasonable to assume that results from unapply would not change given the same string.
Unfortunately, assuming isn't good enough for a (static) compiler. In order for memoizing to be a legal optimization, the compiler has to prove that the expression being memoized is pure and referentially transparent. However, in the general case, this is equivalent to solving the Halting Problem.
It would certainly be possible to write an optimization pass which tries to prove purity of certain expressions and memoizes them iff and only iff it succeeds, but that may be more trouble than it's worth. Such proofs get very hard very quickly, so they are only likely to succeed for very trivial expressions, which execute very quickly anyway.
What is a pattern match? The spec says it matches the "shape" of the value and binds vars to its "components."
In the realm of mutation, you have questions like, if I match on case class C(var v: V), does a case C(x) capture the mutable field? Well, the answer was no:
https://issues.scala-lang.org/browse/SI-5158
The spec says (sorry) that order of evaluation may be changed, so it recommends against side-effects:
In the interest of efficiency the evaluation of a pattern matching
expression may try patterns in some other order than textual sequence.
This might affect evaluation through side effects in guards.
That's in relation to guard expressions, presumably because extractors were added after case classes.
There's no special promise to evaluate extractors exactly once. (Explicitly in the spec, that is.)
The semantics are only that "patterns are tried in sequence".
Also, your example for regexes can be simplified, since a regex will not be re-evaluated when unapplied to its own Match. See the example in the doc. That is,
Name.NameReg findFirstMatchIn s map {
case Name("Jane",_,_) =>
}
where
object Name { def unapply(m: Regex.Matcher) = ... }