Magic PartialFunction in Scala - scala

I don't think this code should work, but it does (in Scala 2.10):
scala> ((i: Int) => i.toString match {
| case s if s.length == 2 => "A two digit number"
| case s if s.length == 3 => "A three digit number"
| }): PartialFunction[Int,String]
res0: PartialFunction[Int,String] = <function1>
// other interactions omitted
scala> res0.orElse(PartialFunction((i: Int) => i.toString))
res5: PartialFunction[Int,String] = <function1>
scala> res5(1)
res6: String = 1
How does it work? I would expect a MatchError to be thrown inside res0.
The Scala language specification does not seem to explicitly document how res0 should be interpreted.

The trick is that the compiler is not interpreting your definition as a total function converted to a partial function -- it's actually creating a partial function in the first place. You can verify by noting that res0.isDefinedAt(1) == false.
If you actually convert a total function to a partial function, you will get the behavior you expected:
scala> PartialFunction((i: Int) => i.toString match {
| case s if s.length == 2 => "A two digit number"
| case s if s.length == 3 => "A three digit number"
| })
res0: PartialFunction[Int,String] = <function1>
scala> res0 orElse ({ case i => i.toString }: PartialFunction[Int, String])
res1: PartialFunction[Int,String] = <function1>
scala> res1(1)
scala.MatchError: 1 (of class java.lang.String)
// ...
In this example, PartialFunction.apply treats its argument as a total function, so any information about where it's defined is lost.
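The difference between the two cases can be sketched without the REPL. This is only an illustration under my own assumptions: the names DomainDemo, digits, total, wrapped and fallback are made up, and the wrapping is written as { case i => total(i) } rather than PartialFunction.apply, which, as far as I know, is deprecated in later Scala releases.

```scala
object DomainDemo {
  // A genuine partial function: the compiler derives isDefinedAt from the cases.
  val digits: PartialFunction[Int, String] = {
    case i if i.toString.length == 2 => "A two digit number"
    case i if i.toString.length == 3 => "A three digit number"
  }

  // A total function whose body can still throw a MatchError.
  val total: Int => String = i => i.toString match {
    case s if s.length == 2 => "A two digit number"
    case s if s.length == 3 => "A three digit number"
  }

  // Wrapping the total function: the single case matches every Int,
  // so isDefinedAt is always true and the domain information is lost.
  val wrapped: PartialFunction[Int, String] = { case i => total(i) }

  // Fallback that is defined for every Int.
  val fallback: PartialFunction[Int, String] = { case i => i.toString }
}
```

With these definitions, digits orElse fallback falls through to the fallback for 1, while wrapped orElse fallback never reaches it, because wrapped claims to be defined everywhere.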

orElse is defined on PartialFunction so that the argument is treated as a fallback for the cases when the original is not defined. See the API.

You say that if res0 does not match, you want to try your other pf instead. How this essentially works is:
if (res0.isDefinedAt(1)) {
  res0(1)
} else {
  other(1)
}
The orElse call creates an instance of OrElse, which inherits from PartialFunction: https://github.com/scala/scala/blob/master/src/library/scala/PartialFunction.scala#L159
When you now call apply on this OrElse, it will call f1.applyOrElse(x, f2): https://github.com/scala/scala/blob/master/src/library/scala/PartialFunction.scala#L162
This will call if (isDefinedAt(x)) apply(x) else f2(x): https://github.com/scala/scala/blob/master/src/library/scala/PartialFunction.scala#L117-L118
Therefore you will only get a MatchError when neither of the partial functions matches.
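The logic described above can be written out as a small hand-rolled combinator. This is a sketch of the idea only, not the library's OrElse class; orElseSketch and its enclosing object are hypothetical names.

```scala
object OrElseSketch {
  // Hypothetical helper: combines two partial functions the way the answer describes.
  def orElseSketch[A, B](f1: PartialFunction[A, B],
                         f2: PartialFunction[A, B]): PartialFunction[A, B] =
    new PartialFunction[A, B] {
      // The combination is defined wherever either side is defined.
      def isDefinedAt(x: A): Boolean = f1.isDefinedAt(x) || f2.isDefinedAt(x)

      // Try f1 first; otherwise delegate to f2, which may itself throw a MatchError.
      def apply(x: A): B = if (f1.isDefinedAt(x)) f1(x) else f2(x)
    }
}
```

The real implementation goes through applyOrElse, which lets the pattern be evaluated once instead of once for isDefinedAt and once for apply; the sketch ignores that optimisation.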

Related

Scala PartialFunctions giving match error: weird behavior

I'm trying to use a partial function for some validations. Let's take a string as an example:
def isLengthValid: PartialFunction[String, Option[String]] = {
  case s: String if s.length > 5 => Some("Invalid")
}
def isStringValid: PartialFunction[String, Option[String]] = {
  case s: String if s == "valid" => Some("Valid")
}
isLengthValid("valid") orElse isStringValid("valid")
expected output => Some("Valid")
But I'm getting a match Error:
scala.MatchError: valid (of class java.lang.String)
Could anyone explain what is going wrong here? As per my understanding, .isDefinedAt is called internally, so it should not give a MatchError.
P.S ignore the inputs, this is just an example.
Your understanding is wrong. The ScalaDocs page clearly states, "It is the responsibility of the caller to call isDefinedAt before calling apply ...".
Your code isn't calling isDefinedAt, which is causing the exception, so you have to explicitly call it or you can employ other methods that hide the isDefinedAt internally.
Seq("valid") collect (isLengthValid orElse isStringValid)
//res0: Seq[Option[String]] = List(Some(Valid))
Seq("vlad") collect (isLengthValid orElse isStringValid)
//res1: Seq[Option[String]] = List()
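Besides collect, lift and applyOrElse are two other library methods that perform the definedness check for you. A minimal sketch, assuming the two validators from the question are rewritten as vals; SafeCalls, combined, lifted and check are names I made up:

```scala
object SafeCalls {
  val isLengthValid: PartialFunction[String, Option[String]] = {
    case s if s.length > 5 => Some("Invalid")
  }
  val isStringValid: PartialFunction[String, Option[String]] = {
    case s if s == "valid" => Some("Valid")
  }

  // Combine first, apply later.
  val combined: PartialFunction[String, Option[String]] =
    isLengthValid orElse isStringValid

  // lift turns the partial function into a total one that returns an Option.
  val lifted: String => Option[Option[String]] = combined.lift

  // applyOrElse supplies a default for inputs outside the domain.
  def check(s: String): Option[String] =
    combined.applyOrElse(s, (_: String) => Option.empty[String])
}
```

Here check("valid") yields Some("Valid"), and check("vlad") yields None instead of throwing.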
This works as intended if you write the last line as
(isLengthValid orElse isStringValid)("valid")
I suspect the problem is that your version desugars to
(isLengthValid.apply("valid")).orElse(isStringValid.apply("valid"))
This means the apply is calculated before the orElse takes place, which means the partial function is treated as a total function and a match error is thrown as Valy Dia's answer explains. The orElse is actually being called on the result output, not the partial function.
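The two parenthesisations can be compared side by side. In this sketch the validators are the ones from the question, rewritten as vals; Precedence, applyThenCombine and combineThenApply are hypothetical names, and Try is only there to capture the exception.

```scala
import scala.util.Try

object Precedence {
  val isLengthValid: PartialFunction[String, Option[String]] = {
    case s if s.length > 5 => Some("Invalid")
  }
  val isStringValid: PartialFunction[String, Option[String]] = {
    case s if s == "valid" => Some("Valid")
  }

  // Apply first: isLengthValid("valid") throws before Option.orElse is ever reached.
  val applyThenCombine: Try[Option[String]] =
    Try(isLengthValid("valid") orElse isStringValid("valid"))

  // Combine first: PartialFunction.orElse builds one function, which is then applied.
  val combineThenApply: Try[Option[String]] =
    Try((isLengthValid orElse isStringValid)("valid"))
}
```

The first value is a Failure wrapping the MatchError, the second is a Success holding Some("Valid").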
The error message comes from the first expression - isLengthValid.
It is defined only for strings with a length strictly greater than 5. Hence, when it is applied to the string "valid", which has length 5, it throws a MatchError:
scala>"valid".length
res5: Int = 5
isLengthValid("valid")
scala.MatchError: valid (of class java.lang.String)
If the method isLengthValid were defined this way instead (note the greater-than-or-equal sign), it wouldn't throw the MatchError:
def isLengthValid: PartialFunction[String, Option[String]] ={
case s:String if s.length >= 5 => Some("Invalid")
}
scala>isLengthValid("valid")
res8: Option[String] = Some("Invalid")
And the original expression would return an Option:
scala>isLengthValid("valid") orElse isStringValid("valid")
res9: Option[String] = Some("Invalid")
What you could also do here, as explained in this question, is use this definition instead:
val isLengthValid = new PartialFunction[String, Option[String]] {
def isDefinedAt(x: String) = x.length > 5
def apply(x: String) = Some("Invalid")
}
scala>isLengthValid("valid")
res13: Some[String] = Some("Invalid")
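Note that with a hand-written PartialFunction, apply and isDefinedAt are independent methods, so a direct call is no longer guarded by the condition. A small sketch of the consequences, using the same definition with explicit result types (HandWritten is a name I made up):

```scala
object HandWritten {
  val isLengthValid: PartialFunction[String, Option[String]] =
    new PartialFunction[String, Option[String]] {
      def isDefinedAt(x: String): Boolean = x.length > 5
      // Not guarded by isDefinedAt: answers for every input.
      def apply(x: String): Option[String] = Some("Invalid")
    }

  // Direct application bypasses isDefinedAt.
  val direct: Option[String] = isLengthValid("valid")

  // lift and collect consult isDefinedAt, so "valid" (length 5) is skipped.
  val lifted: Option[Option[String]] = isLengthValid.lift("valid")
  val collected: Seq[Option[String]] = Seq("valid", "toolong").collect(isLengthValid)
}
```

So this definition avoids the MatchError on direct calls, but only because apply ignores the domain; callers that respect isDefinedAt still treat "valid" as outside it.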

scala: anonymous partial function strange syntax

I came across code similar to this, and I was surprised that it even compiles:
scala> val halfSize: PartialFunction[String, Int] = _.length match {
case even if even % 2 == 0 => even / 2
}
halfSize: PartialFunction[String,Int] = <function1>
scala> List("x", "xx").collect(halfSize)
res1: List[Int] = List(1)
As far as I know, the valid syntax to define a PartialFunction is a case function:
val halfSize: PartialFunction[String, Int] = {
case s if s.length % 2 == 0 => s.length / 2
}
The first version seems more efficient since it calls length only once. But even in the SLS I was not able to find an explanation of this syntax. Is this an undocumented feature of scalac?
The rules are given in https://www.scala-lang.org/files/archive/spec/2.12/06-expressions.html#anonymous-functions:
The eventual run-time value of an anonymous function is determined by the expected type:
...
PartialFunction[T, U], if the function literal is of the shape x => x match { … }
In this case the literal does have such a shape, as rogue-one's answer explains, so PartialFunction is allowed as the expected type.
(EDIT: actually, it doesn't, since it matches x.length instead of x. This looks like a minor bug, but one which should be fixed by changing the specification.)
A PartialFunction's value receives an additional isDefinedAt member, which is derived from the pattern match in the function literal, with each case's body being replaced by true, and an added default (if none was given) that evaluates to false.
So in this case it ends up with
def isDefinedAt(x: String) = x.length match {
case even if even % 2 == 0 => true
case _ => false
}
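One way to convince yourself of this is to compare the compiler-generated isDefinedAt with the hand-written match. The sketch below uses the case-function form of halfSize so that it does not depend on the syntax under discussion; HalfSizeDemo and isDefinedAtByHand are hypothetical names.

```scala
object HalfSizeDemo {
  // Case-function form: isDefinedAt is synthesized from the guard.
  val halfSize: PartialFunction[String, Int] = {
    case s if s.length % 2 == 0 => s.length / 2
  }

  // Hand-written version of what the answer says the compiler derives.
  def isDefinedAtByHand(x: String): Boolean = x.length match {
    case even if even % 2 == 0 => true
    case _                     => false
  }

  // The two should agree on every sample input.
  val samples: List[String] = List("", "x", "xx", "xxx", "xxxx")
  val agree: Boolean = samples.forall(s => halfSize.isDefinedAt(s) == isDefinedAtByHand(s))
}
```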
val halfSize: PartialFunction[String, Int] = _.length match {
case even if even % 2 == 0 => even / 2
}
The underscore (_) in the above function is just a short hand notation that refers to the single argument of the function. Thus the above snippet is just a short form of
val halfSize: PartialFunction[String, Int] = { (x: String) => x.length match {
case even if even % 2 == 0 => even / 2
}
}

When to use scala triple caret (^^^) vs double caret (^^) and the into method (>>)

Can someone explain how and when to use the triple caret ^^^ (vs the double caret ^^) when designing scala parser combinators? And also when / how to use the parser.into() method (>>).
I'll begin with an example using Scala's Option type, which is similar in some important ways to Parser, but can be easier to reason about. Suppose we have the following two values:
val fullBox: Option[String] = Some("13")
val emptyBox: Option[String] = None
Option is monadic, which means (in part) that we can map a function over its contents:
scala> fullBox.map(_.length)
res0: Option[Int] = Some(2)
scala> emptyBox.map(_.length)
res1: Option[Int] = None
It's not uncommon to care only about whether the Option is full or not, in which case we can use map with a function that ignores its argument:
scala> fullBox.map(_ => "Has a value!")
res2: Option[String] = Some(Has a value!)
scala> emptyBox.map(_ => "Has a value!")
res3: Option[String] = None
The fact that Option is monadic also means that we can apply to an Option[A] a function that takes an A and returns an Option[B] and get an Option[B]. For this example I'll use a function that attempts to parse a string into an integer:
def parseIntString(s: String): Option[Int] = try Some(s.toInt) catch {
case _: Throwable => None
}
Now we can write the following:
scala> fullBox.flatMap(parseIntString)
res4: Option[Int] = Some(13)
scala> emptyBox.flatMap(parseIntString)
res5: Option[Int] = None
scala> Some("not an integer").flatMap(parseIntString)
res6: Option[Int] = None
This is all relevant to your question because Parser is also monadic, and it has map and flatMap methods that work in very similar ways to the ones on Option. It also has a bunch of confusing operators (which I've ranted about before), including the ones you mention, and these operators are just aliases for map and flatMap:
(parser ^^ transformation) == parser.map(transformation)
(parser ^^^ replacement) == parser.map(_ => replacement)
(parser >> nextStep) == parser.flatMap(nextStep)
So for example you could write the following:
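// Since Scala 2.11 this lives in the separate scala-parser-combinators module.
import scala.util.parsing.combinator.RegexParsers

object MyParser extends RegexParsers {
  // Turn a string into a parser that succeeds with its Int value or reports an error.
  def parseIntString(s: String) = try success(s.toInt) catch {
    case t: Throwable => err(t.getMessage)
  }
  val digits: Parser[String] = """\d+""".r
  val numberOfDigits: Parser[Int] = digits ^^ (_.length)             // map
  val ifDigitsMessage: Parser[String] = digits ^^^ "Has a value!"    // map, ignoring the result
  val integer: Parser[Int] = digits >> parseIntString                // flatMap
}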
Where each parser behaves in a way that's equivalent to one of the Option examples above.
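To see that the three operators really are nothing more than map and flatMap, here is a toy parser that needs no library. It is a deliberately simplified stand-in, not the real scala-parser-combinators Parser: the Parser case class, success, failure and the digits implementation are all my own assumptions.

```scala
object ToyParsers {
  // Toy parser: consumes a prefix of the input and returns the result plus the rest.
  final case class Parser[A](run: String => Option[(A, String)]) {
    def map[B](f: A => B): Parser[B] =
      Parser(in => run(in).map { case (a, rest) => (f(a), rest) })

    def flatMap[B](f: A => Parser[B]): Parser[B] =
      Parser(in => run(in).flatMap { case (a, rest) => f(a).run(rest) })

    // The operators are plain aliases.
    def ^^[B](f: A => B): Parser[B] = map(f)
    def ^^^[B](b: => B): Parser[B] = map(_ => b)
    def >>[B](f: A => Parser[B]): Parser[B] = flatMap(f)
  }

  def success[A](a: A): Parser[A] = Parser(in => Some((a, in)))
  def failure[A]: Parser[A] = Parser(_ => None)

  // Accepts one or more leading digits.
  val digits: Parser[String] = Parser { in =>
    val (ds, rest) = in.span(_.isDigit)
    if (ds.isEmpty) None else Some((ds, rest))
  }

  // Fails the parse if the digits do not fit into an Int.
  def parseIntString(s: String): Parser[Int] =
    try success(s.toInt) catch { case _: NumberFormatException => failure[Int] }

  val numberOfDigits: Parser[Int] = digits ^^ (_.length)
  val ifDigitsMessage: Parser[String] = digits ^^^ "Has a value!"
  val integer: Parser[Int] = digits >> parseIntString
}
```

For example, ToyParsers.integer.run("13 rest") gives Some((13, " rest")), mirroring fullBox.flatMap(parseIntString) from the Option example.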

scala unapply that returns boolean

Reading this, I still have questions about unapply() that returns Boolean.
If you take a look at the Scala programming book (2nd edition), page 602, there is an example:
case Email(Twice(x @ UpperCase()), domain) => ...
Where UpperCase is defined as an object that has an unapply() returning Boolean and does not have an apply():
object UpperCase {
def unapply(s:String): Boolean = s.toUpperCase == s
}
Where Twice is something like this:
object Twice {
def apply(s:String): String = s + s
...
The questions are (must be too many, sorry for that):
How does UpperCase().unapply(..) work here?
If I pass DIDI@hotmail.com, then x in the first code snippet is 'DI'. Then we use '@' to bind 'x' and pass it to UpperCase.unapply, i.e. to invoke unapply(x), i.e. unapply('DIDI') (?). Then it returns true.
But why not Option? I tend to think that unapply returns Option, as that is the usual way it works. Is it because Option usually wraps some data, and for the simple case we should NOT wrap a Boolean? And because we do not have apply()?
What is the difference, and when should one use Boolean vs. Option, based on this example?
And this syntax, x @ UpperCase(): is it supposed to substitute the value match case syntax (is that how I am supposed to read it?) if we are matching inside one particular case? It doesn't seem like a unified way/syntax of doing this.
Usually syntax like (assuming that x, y are Int) case AA(x @ myX, y) => println("myX: " + myX) says that x binds to myX, so basically myX is an alias for x in this case. But in our case, x @ UpperCase(), x binds to UpperCase().unapply(), with x passed as the parameter. I mean, binding here is quite an abstract/wide notion.
This is simple:
1) If you return Boolean, your unapply just tests matching query
scala> object UpperCase {
| def unapply(s: String) = s.toUpperCase == s
| }
defined module UpperCase
scala> "foo" match {
| case UpperCase() => true
| case _ => false
| }
res9: Boolean = false
2) If you return Option[T], you create an extractor, which unwraps T
scala> object IsUpperCase {
| def unapply(s: String) = Option(s).map(x => x.toUpperCase == x)
| }
defined module IsUpperCase
scala> "FOO" match {case IsUpperCase(flag) => flag}
res0: Boolean = true
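Putting both kinds of extractor together, the pattern from the question can be tried out as follows. This is a sketch under assumptions: the question elides the rest of Twice and does not show Email, so the unapply methods for those two objects below are my own guesses at plausible definitions, and describe is a made-up helper.

```scala
object Extractors {
  // Boolean extractor: a pure test, binds nothing.
  object UpperCase {
    def unapply(s: String): Boolean = s.toUpperCase == s
  }

  // Option extractor (assumed definition): yields one half if the string is a doubled word.
  object Twice {
    def apply(s: String): String = s + s
    def unapply(s: String): Option[String] = {
      val half = s.length / 2
      val first = s.substring(0, half)
      if (first == s.substring(half)) Some(first) else None
    }
  }

  // Option extractor (assumed definition): splits user and domain.
  object Email {
    def unapply(s: String): Option[(String, String)] = s.split("@") match {
      case Array(user, domain) => Some((user, domain))
      case _                   => None
    }
  }

  // x @ UpperCase() binds x to the value being tested, provided the test succeeds.
  def describe(s: String): String = s match {
    case Email(Twice(x @ UpperCase()), domain) => "match: " + x + " in domain " + domain
    case _                                     => "no match"
  }
}
```

With these definitions, the value handed to UpperCase.unapply is whatever Twice.unapply extracted, so for DIDI@hotmail.com it is "DI", and x is bound to that same value.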
This is not so simple.
The behavior of "boolean test" match just changed in the latest milestone:
apm@mara:~/clones/scala$ ./build/pack/bin/scala
Welcome to Scala version 2.11.0-20130911-042842-a49b4b6375 (OpenJDK 64-Bit Server VM, Java 1.7.0_25).
Type in expressions to have them evaluated.
Type :help for more information.
scala>
scala> object OK { def unapply(s: String) = Some(s) filter (_ == "OK") }
defined object OK
scala> import PartialFunction._
import PartialFunction._
scala> cond("OK") { case OK() => true }
<console>:12: error: wrong number of patterns for object OK offering String: expected 1, found 0
cond("OK") { case OK() => true }
^
<console>:12: error: wrong number of patterns for object OK offering String: expected 1, found 0
cond("OK") { case OK() => true }
^
scala> cond("OK") { case OK(x) => true }
res1: Boolean = true
scala> :q
Previously, you could use the extractor without extracting any fields, just for a "boolean test", even though the result of the unapply is not Boolean:
apm@mara:~/clones/scala$ scalam
Welcome to Scala version 2.11.0-M4 (OpenJDK 64-Bit Server VM, Java 1.7.0_25).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import PartialFunction._
import PartialFunction._
scala> object OK { def unapply(s: String) = Some(s) filter (_ == "OK") }
defined object OK
scala> cond("OK") { case OK() => true }
res0: Boolean = true
scala> cond("OK") { case OK(x) => true }
res1: Boolean = true
Here is some discussion and here is the change in the way extractor signatures are processed.
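Under the newer behavior, a "boolean test" against an Option-returning extractor can still be written by supplying a wildcard for the single value it offers. A minimal sketch, where BooleanTestDemo, IsOk, isOk and isOkBoolean are names I made up:

```scala
import PartialFunction.cond

object BooleanTestDemo {
  // Same extractor as in the transcript: it offers one String.
  object OK {
    def unapply(s: String): Option[String] = Some(s).filter(_ == "OK")
  }

  // One sub-pattern (a wildcard) matches what the extractor offers.
  def isOk(s: String): Boolean = cond(s) { case OK(_) => true }

  // A Boolean-returning extractor is the one matched with an empty argument list.
  object IsOk {
    def unapply(s: String): Boolean = s == "OK"
  }
  def isOkBoolean(s: String): Boolean = cond(s) { case IsOk() => true }
}
```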

Why no partial function type literal?

I wonder why there doesn't exist a literal for partial function types. I have to write
val pf: PartialFunction[Int, String] = {
case 5 => "five"
}
where a literal like :=> would be shorter:
val pf: Int :=> String = {
case 5 => "five"
}
Partial functions are used often and are already something of a "special" feature in Scala, so why is there no special syntax for them?
Probably in part because you don't need a literal: you can always write your own :=> as a type infix operator if you want more concise syntax:
scala> type :=>[A, B] = PartialFunction[A, B]
defined type alias $colon$eq$greater
scala> val pf: Int :=> String = { case 5 => "five" }
pf: :=>[Int,String] = <function1>
scala> pf.isDefinedAt(0)
res0: Boolean = false
scala> pf.isDefinedAt(5)
res1: Boolean = true
I'm not one of the designers of the Scala language, though, so this is more or less a guess about the "why?". You might get better answers over at the scala-debate list, which is a more appropriate venue for language design questions.