Pattern matching with conjunctions (PatternA AND PatternB) - scala

Scala has a language feature to support disjunctions in pattern matching ('Pattern Alternatives'):
x match {
case _: String | _: Int =>
case _ =>
}
However, I often need to trigger an action if the scrutiny satisfies PatternA and PatternB (conjunction.)
I created a pattern combinator '&&' that adds this capability. Three little lines that remind me why I love Scala!
// Splitter to apply two pattern matches on the same scrutiny.
object && {
def unapply[A](a: A) = Some((a, a))
}
// Extractor object matching first character.
object StartsWith {
def unapply(s: String) = s.headOption
}
// Extractor object matching last character.
object EndsWith {
def unapply(s: String) = s.reverse.headOption
}
// Extractor object matching length.
object Length {
def unapply(s: String) = Some(s.length)
}
"foo" match {
case StartsWith('f') && EndsWith('f') => "f.*f"
case StartsWith('f') && EndsWith(e) && Length(3) if "aeiou".contains(e) => "f..[aeiou]"
case _ => "_"
}
Points for discussion
Is there an existing way to do this?
Are there problems with this approach?
Can this approach create any other useful combinators? (for example, Not)
Should such a combinator be added to the standard library?
UPDATE
I've just been asked how the compiler interprets case A && B && C. These are infix operator patterns (Section 8.1.9 of the Scala Reference). You could also express this with standard extract patterns (8.1.7) as &&(&&(A, B), C).' Notice how the expressions are associated left to right, as per normal infix operator method calls likeBoolean#&&inval b = true && false && true`.

I really like this trick. I do not know of any existing way to do this, and I don't foresee any problem with it -- which doesn't mean much, though. I can't think of any way to create a Not.
As for adding it to the standard library... perhaps. But I think it's a bit hard. On the other hand, how about talking Scalaz people into including it? It looks much more like their own bailiwick.

A possible problem with this is the bloated translation that the pattern matcher generates.
Here is the translation of the sample program, generated with scalac -print. Even -optimise fails to simplify the if (true) "_" else throw new MatchError() expressions.
Large pattern matches already generate more bytecode than is legal for a single method, and use of this combinator may amplify that problem.
If && was built into the language, perhaps the translation could be smarter. Alternatively, small improvements to -optimise could help.

Related

What is the advantage of using Option.map over Option.isEmpty and Option.get?

I am a new to Scala coming from Java background, currently confused about the best practice considering Option[T].
I feel like using Option.map is just more functional and beautiful, but this is not a good argument to convince other people. Sometimes, isEmpty check feels more straight forward thus more readable. Is there any objective advantages, or is it just personal preference?
Example:
Variation 1:
someOption.map{ value =>
{
//some lines of code
}
} orElse(foo)
Variation 2:
if(someOption.isEmpty){
foo
} else{
val value = someOption.get
//some lines of code
}
I intentionally excluded the options to use fold or pattern matching. I am simply not pleased by the idea of treating Option as a collection right now, and using pattern matching for a simple isEmpty check is an abuse of pattern matching IMHO. But no matter why I dislike these options, I want to keep the scope of this question to be the above two variations as named in the title.
Is there any objective advantages, or is it just personal preference?
I think there's a thin line between objective advantages and personal preference. You cannot make one believe there is an absolute truth to either one.
The biggest advantage one gains from using the monadic nature of Scala constructs is composition. The ability to chain operations together without having to "worry" about the internal value is powerful, not only with Option[T], but also working with Future[T], Try[T], Either[A, B] and going back and forth between them (also see Monad Transformers).
Let's try and see how using predefined methods on Option[T] can help with control flow. For example, consider a case where you have an Option[Int] which you want to multiply only if it's greater than a value, otherwise return -1. In the imperative approach, we get:
val option: Option[Int] = generateOptionValue
var res: Int = if (option.isDefined) {
val value = option.get
if (value > 40) value * 2 else -1
} else -1
Using collections style method on Option, an equivalent would look like:
val result: Int = option
.filter(_ > 40)
.map(_ * 2)
.getOrElse(-1)
Let's now consider a case for composition. Let's say we have an operation which might throw an exception. Additionaly, this operation may or may not yield a value. If it returns a value, we want to query a database with that value, otherwise, return an empty string.
A look at the imperative approach with a try-catch block:
var result: String = _
try {
val maybeResult = dangerousMethod()
if (maybeResult.isDefined) {
result = queryDatabase(maybeResult.get)
} else result = ""
}
catch {
case NonFatal(e) => result = ""
}
Now let's consider using scala.util.Try along with an Option[String] and composing both together:
val result: String = Try(dangerousMethod())
.toOption
.flatten
.map(queryDatabase)
.getOrElse("")
I think this eventually boils down to which one can help you create clear control flow of your operations. Getting used to working with Option[T].map rather than Option[T].get will make your code safer.
To wrap up, I don't believe there's a single truth. I do believe that composition can lead to beautiful, readable, side effect deferring safe code and I'm all for it. I think the best way to show other people what you feel is by giving them examples as we just saw, and letting them feel for themselves the power they can leverage with these sets of tools.
using pattern matching for a simple isEmpty check is an abuse of pattern matching IMHO
If you do just want an isEmpty check, isEmpty/isDefined is perfectly fine. But in your case you also want to get the value. And using pattern matching for this is not abuse; it's precisely the basic use-case. Using get allows to very easily make errors like forgetting to check isDefined or making the wrong check:
if(someOption.isEmpty){
val value = someOption.get
//some lines of code
} else{
//some other lines
}
Hopefully testing would catch it, but there's no reason to settle for "hopefully".
Combinators (map and friends) are better than get for the same reason pattern matching is: they don't allow you to make this kind of mistake. Choosing between pattern matching and combinators is a different question. Generally combinators are preferred because they are more composable (as Yuval's answer explains). If you want to do something covered by a single combinator, I'd generally choose them; if you need a combination like map ... getOrElse, or a fold with multi-line branches, it depends on the specific case.
It seems similar to you in case of Option but just consider the case of Future. You will not be able to interact with the future's value after going out of Future monad.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Promise
import scala.util.{Success, Try}
// create a promise which we will complete after sometime
val promise = Promise[String]();
// Now lets consider the future contained in this promise
val future = promise.future;
val byGet = if (!future.value.isEmpty) {
val valTry = future.value.get
valTry match {
case Success(v) => v + " :: Added"
case _ => "DEFAULT :: Added"
}
} else "DEFAULT :: Added"
val byMap = future.map(s => s + " :: Added")
// promise was completed now
promise.complete(Try("PROMISE"))
//Now lets print both values
println(byGet)
// DEFAULT :: Added
println(byMap)
// Success(PROMISE :: Added)

Using unapply as constructor in match

Is it good style in Scala to "abuse" the unapply method for pattern matching? What I wanted to do, is matching an object against another, and constructing the other object. Since I am fairly new in Scala, I wound up with the following solution. But it doesnt seem really right to use the unapply methode like this, since it is intended as an extractor. Could someone please give me feedback on this?
object Poker {
def unapply(hand: Hand): Option[Poker] = if(hand.countValueGroups().exists(_._2 == 4)) Some(new Poker(hand)) else None
}
val h = Hand("AC As AH Ad 2h")
h match {
case Poker(han) => println("POKER!!!"+han)
case _ => println("?????")
}
I don't know why this should be bad practice, so I'd say the answer is no, this is normal practice. As one commenter said, the mere purpose of unapply is to be used in pattern matching. While pattern matching is mostly used with the companion object of a case class, the concept is deliberately open to other extractors (example: regular expressions).
The only thing that's weird in your example is to return Option[Poker] with Poker begin a singleton object. Since you can't do much with that, probably you want to use a Boolean instead:
object Poker {
def unapply(hand: Hand): Boolean =
hand.countValueGroups().exists(_._2 == 4)
}
case class Hand(s: String) {
def countValueGroups(): List[(Any, Int)] = List("foo" -> 4) // ???
}
val h = Hand("AC As AH Ad 2h")
h match {
case Poker() => println("POKER!!!")
case _ => println("?????")
}

Treating Identifiers as data in Scala

case class Var(x: Int,y:Int)
def compare(v:Var,compare:String): Boolean = {
v.x compare v.y //--NeedHelp
}
def getComp(v:Var,compName:String):Int={
v.compName //--NeedHelp
}
val v = Var(2,3)
assert(true == compare(v,"<"))
assert(false == compare(v,"=="))
assert(false == compare(v,">"))
assert(2==getComp(v,"x"))
assert(3==getComp(v,"y"))
there are two //--NeedHelp statements, how can I go and write it so that it executes as expected. What I am trying to do, what it is called in Programming world? So that I can Google myself also and know more about it.
So, I think there are two questions here really:
Firstly, getting the code working:
For a very simple, but not very extensible or robust implementation, you can "pattern match" on the Strings to decide what to do:
def compare(v:Var,compare:String): Boolean = {
compare match {
case op if op == "<" => v.x < v.y
case op if op == ">" => v.x > v.y
case _ => false // not very robust, anything could be passed in.
}
}
Side Note: It is worth considering how to prevent the user from passing in unexpected values - for example, if the function took a trait of type Operator instead of just taking the String value, then the compiler would prevent invalid values from being passed in.
def getComp(v:Var,compName:String):Int={
compName match {
case name if name == "x" => v.x
case name if name == "y" => v.y
case _ => 0 // not very robust...
}
}
Both of the above statements could easily be implemented using an if statement either, e.g.
def getComp(v:Var,compName:String):Int={
if (compName == "x")
v.x
else if (compName == "y")
v.y
else
0 // again, not very robust.
}
A more advanced implementation of a similar pattern might make use of some of Scala's object oriented features to create different expression types and use parsers to parse a string of text, convert the text to expressions and then evaluate the expressions.
Secondly, in response to "..what is this called in the Programming World?.."
I think this covers a number of areas. It covers an interpreter, because a lot of what you are writing here is basically an interpreter, i.e. you are passing a string that represents a comparison operation and supplying data in the form to be evaluated in the context of that operation. It could also be referred to as a simple DSL (Domain Specific Language).
Also, just a personal recommendation but I believe, if I recall correctly, the Coursera Scala Course which was run by Martin Odersky, covered off a very simple Expression evaluation example in one of the videos (I think perhaps it is week 4). As I recall, it was a very simple one, but it shows at least at a high level how this kind of approach could be implemented in Scala.
In the general case, what you're looking for is reflection.
You can also find more information online on reflection in Scala. As far as I know Scala operators are methods, which would maybe let you use reflection to find them, but I haven't tried it myself.

Why doesn't Scala optimize calls to the same Extractor?

Take the following example, why is the extractor called multiple times as opposed to temporarily storing the results of the first call and matching against that. Wouldn't it be reasonable to assume that results from unapply would not change given the same string.
object Name {
val NameReg = """^(\w+)\s(?:(\w+)\s)?(\w+)$""".r
def unapply(fullName: String): Option[(String, String, String)] = {
val NameReg(fname, mname, lname) = fullName
Some((fname, if (mname == null) "" else mname, lname))
}
}
"John Smith Doe" match {
case Name("Jane", _, _) => println("I know you, Jane.")
case Name(f, "", _) => println(s"Hi ${f}")
case Name(f, m, _) => println(s"Howdy, ${f} ${m}.")
case _ => println("Don't know you")
}
Wouldn't it be reasonable to assume that results from unapply would not change given the same string.
Unfortunately, assuming isn't good enough for a (static) compiler. In order for memoizing to be a legal optimization, the compiler has to prove that the expression being memoized is pure and referentially transparent. However, in the general case, this is equivalent to solving the Halting Problem.
It would certainly be possible to write an optimization pass which tries to prove purity of certain expressions and memoizes them iff and only iff it succeeds, but that may be more trouble than it's worth. Such proofs get very hard very quickly, so they are only likely to succeed for very trivial expressions, which execute very quickly anyway.
What is a pattern match? The spec says it matches the "shape" of the value and binds vars to its "components."
In the realm of mutation, you have questions like, if I match on case class C(var v: V), does a case C(x) capture the mutable field? Well, the answer was no:
https://issues.scala-lang.org/browse/SI-5158
The spec says (sorry) that order of evaluation may be changed, so it recommends against side-effects:
In the interest of efficiency the evaluation of a pattern matching
expression may try patterns in some other order than textual sequence.
This might affect evaluation through side effects in guards.
That's in relation to guard expressions, presumably because extractors were added after case classes.
There's no special promise to evaluate extractors exactly once. (Explicitly in the spec, that is.)
The semantics are only that "patterns are tried in sequence".
Also, your example for regexes can be simplified, since a regex will not be re-evaluated when unapplied to its own Match. See the example in the doc. That is,
Name.NameReg findFirstMatchIn s map {
case Name("Jane",_,_) =>
}
where
object Name { def unapply(m: Regex.Matcher) = ... }

Is there any fundamental limitations that stops Scala from implementing pattern matching over functions?

In languages like SML, Erlang and in buch of others we may define functions like this:
fun reverse [] = []
| reverse x :: xs = reverse xs # [x];
I know we can write analog in Scala like this (and I know, there are many flaws in the code below):
def reverse[T](lst: List[T]): List[T] = lst match {
case Nil => Nil
case x :: xs => reverse(xs) ++ List(x)
}
But I wonder, if we could write former code in Scala, perhaps with desugaring to the latter.
Is there any fundamental limitations for such syntax being implemented in the future (I mean, really fundamental -- e.g. the way type inference works in scala, or something else, except parser obviously)?
UPD
Here is a snippet of how it could look like:
type T
def reverse(Nil: List[T]) = Nil
def reverse(x :: xs: List[T]): List[T] = reverse(xs) ++ List(x)
It really depends on what you mean by fundamental.
If you are really asking "if there is a technical showstopper that would prevent to implement this feature", then I would say the answer is no. You are talking about desugaring, and you are on the right track here. All there is to do is to basically stitch several separates cases into one single function, and this can be done as a mere preprocessing step (this only requires syntactic knowledge, no need for semantic knowledge). But for this to even make sense, I would define a few rules:
The function signature is mandatory (in Haskell by example, this would be optional, but it is always optional whether you are defining the function at once or in several parts). We could try to arrange to live without the signature and attempt to extract it from the different parts, but lack of type information would quickly come to byte us. A simpler argument is that if we are to try to infer an implicit signature, we might as well do it for all the methods. But the truth is that there are very good reasons to have explicit singatures in scala and I can't imagine to change that.
All the parts must be defined within the same scope. To start with, they must be declared in the same file because each source file is compiled separately, and thus a simple preprocessor would not be enough to implement the feature. Second, we still end up with a single method in the end, so it's only natural to have all the parts in the same scope.
Overloading is not possible for such methods (otherwise we would need to repeat the signature for each part just so the preprocessor knows which part belongs to which overload)
Parts are added (stitched) to the generated match in the order they are declared
So here is how it could look like:
def reverse[T](lst: List[T]): List[T] // Exactly like an abstract def (provides the signature)
// .... some unrelated code here...
def reverse(Nil) = Nil
// .... another bit of unrelated code here...
def reverse(x :: xs ) = reverse(xs) ++ List(x)
Which could be trivially transformed into:
def reverse[T](list: List[T]): List[T] = lst match {
case Nil => Nil
case x :: xs => reverse(xs) ++ List(x)
}
// .... some unrelated code here...
// .... another bit of unrelated code here...
It is easy to see that the above transformation is very mechanical and can be done by just manipulating a source AST (the AST produced by the slightly modified grammar that accepts this new constructs), and transforming it into the target AST (the AST produced by the standard scala grammar).
Then we can compile the result as usual.
So there you go, with a few simple rules we are able to implement a preprocessor that does all the work to implement this new feature.
If by fundamental you are asking "is there anything that would make this feature out of place" then it can be argued that this does not feel very scala. But more to the point, it does not bring that much to the table. Scala author(s) actually tend toward making the language simpler (as in less built-in features, trying to move some built-in features into libraries) and adding a new syntax that is not really more readable goes against the goal of simplification.
In SML, your code snippet is literally just syntactic sugar (a "derived form" in the terminology of the language spec) for
val rec reverse = fn x =>
case x of [] => []
| x::xs = reverse xs # [x]
which is very close to the Scala code you show. So, no there is no "fundamental" reason that Scala couldn't provide the same kind of syntax. The main problem is Scala's need for more type annotations, which makes this shorthand syntax far less attractive in general, and probably not worth the while.
Note also that the specific syntax you suggest would not fly well, because there is no way to distinguish one case-by-case function definition from two overloaded functions syntactically. You probably would need some alternative syntax, similar to SML using "|".
I don't know SML or Erlang, but I know Haskell. It is a language without method overloading. Method overloading combined with such pattern matching could lead to ambiguities. Imagine following code:
def f(x: String) = "String "+x
def f(x: List[_]) = "List "+x
What should it mean? It can mean method overloading, i.e. the method is determined in compile time. It can also mean pattern matching. There would be just a f(x: AnyRef) method that would do the matching.
Scala also has named parameters, which would be probably also broken.
I don't think that Scala is able to offer more simple syntax than you have shown in general. A simpler syntax may IMHO work in some special cases only.
There are at least two problems:
[ and ] are reserved characters because they are used for type arguments. The compiler allows spaces around them, so that would not be an option.
The other problem is that = returns Unit. So the expression after the | would not return any result
The closest I could come up with is this (note that is very specialized towards your example):
// Define a class to hold the values left and right of the | sign
class |[T, S](val left: T, val right: PartialFunction[T, T])
// Create a class that contains the | operator
class OrAssoc[T](left: T) {
def |(right: PartialFunction[T, T]): T | T = new |(left, right)
}
// Add the | to any potential target
implicit def anyToOrAssoc[S](left: S): OrAssoc[S] = new OrAssoc(left)
object fun {
// Use the magic of the update method
def update[T, S](choice: T | S): T => T = { arg =>
if (choice.right.isDefinedAt(arg)) choice.right(arg)
else choice.left
}
}
// Use the above construction to define a new method
val reverse: List[Int] => List[Int] =
fun() = List.empty[Int] | {
case x :: xs => reverse(xs) ++ List(x)
}
// Call the method
reverse(List(3, 2, 1))