Pattern matching in Scala: why can I match undefined variables?

I just started learning Scala and found a piece of code that works just fine, but I don't get why:
sealed abstract class Nat
case class Zero() extends Nat
case class Succ(n: Nat) extends Nat
def add(n: Nat, m: Nat): Nat = {
n match {
case Zero() => m
case Succ(prev) => add(prev, Succ(m))
}
}
The members of Nat and Zero are defined in an extra file (and used later on) like this:
val zero = Zero()
val one = Succ(zero)
val two = Succ(one)
val three = Succ(Succ(one))
val four = Succ(Succ(two))
My question now is: in the second case, prev is never defined. What happens here? The math behind it is clear to me (n + m == (n - 1) + (m + 1), repeated until n == Zero()). OK so far. But all that is defined is Succ(), not some kind of Prev()?

In this case, prev is declared in the case statement, here:
case Succ(prev) => add(prev, Succ(m))
When you write case Succ(prev) ... you are using pattern matching and saying: if n is a Succ, bind the value of its n parameter to the name prev, then return add(...).
So basically you are giving the n parameter of the Succ class the name prev so that you can use it after the arrow =>.
This Scala feature can even be used with regexes, where the captured groups are bound to the variables you name in the pattern.
More info on the docs: https://docs.scala-lang.org/tour/pattern-matching.html
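To illustrate the regex remark: scala.util.matching.Regex provides an extractor, so a regex value can be used as a pattern and its capture groups are bound to the names you choose. This is only a small sketch; RegexPatternDemo, YearMonth and describe are names made up for the example.

```scala
object RegexPatternDemo {
  // A regex with two capture groups. The whole input must match for the pattern to succeed.
  val YearMonth = """(\d{4})-(\d{2})""".r

  def describe(s: String): String = s match {
    // year and month are introduced right here, just like prev in case Succ(prev)
    case YearMonth(year, month) => s"year=$year, month=$month"
    case _                      => "no match"
  }
}
```

The names year and month do not exist anywhere before the pattern; the match itself declares them.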

Scala gives you concise syntax so instead of having to write out something like
if (n.isInstanceOf[Succ]) {
val x = n.asInstanceOf[Succ]
val prev = x.n
add(prev, ...)
}
we can reason at a higher level by considering the structure of data and write simply
case Succ(prev) => add(prev, ...)

Case classes in Scala automatically get a method called unapply, generated in their companion object.
Here is the Scala 2 Doc on Case Classes
It is this unapply method that enables this kind of pattern matching.
If I define a case class with a member called value, I can extract that value by utilizing this unapply to obtain it:
case class Number( value: Int )
val valueOfNumber: Int = Number(5).value
println(valueOfNumber) // 5
// Using unapply
val testNumber: Number = Number(200)
val Number(numberValue) = testNumber
println(numberValue) // 200
When you do case Succ(prev) => add(prev, Succ(m)) you are extracting the value n of Succ as prev by matching on the type signatures of the unapply method.
Hence, prev is defined, it is the value, n, contained by the matched Succ
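To make the role of unapply more tangible, here is a sketch of roughly what you would have to write by hand if Zero and Succ were plain classes instead of case classes. It is an approximation for illustration, not the exact code the compiler generates; ExtractorDemo and toInt are hypothetical names, and Nat is left unsealed here only to keep the sketch free of exhaustivity warnings.

```scala
object ExtractorDemo {
  abstract class Nat
  final class Zero extends Nat
  final class Succ(val n: Nat) extends Nat

  object Zero {
    def apply(): Zero = new Zero
    // A Boolean unapply supports patterns without sub-patterns: case Zero()
    def unapply(z: Zero): Boolean = true
  }

  object Succ {
    def apply(n: Nat): Succ = new Succ(n)
    // Whatever is wrapped in Some(...) is what gets bound to prev in case Succ(prev)
    def unapply(s: Succ): Option[Nat] = Some(s.n)
  }

  def add(n: Nat, m: Nat): Nat = n match {
    case Zero()     => m
    case Succ(prev) => add(prev, Succ(m))
  }

  // Helper to inspect results: counts the nested Succ wrappers.
  def toInt(n: Nat): Int = n match {
    case Zero()     => 0
    case Succ(prev) => 1 + toInt(prev)
  }
}
```

With these definitions the add function from the question works unchanged, which shows that the pattern only relies on unapply, not on anything called Prev.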

Related

Assigning value to arg with val object(arg) = object

I am following this tutorial on GraphQL with Sangria. I am wondering about the following line
val JsObject(fields) = requestJSON
where requestJSON is an object of JsValue. This way of assigning fields is new to me and my question is, if you could name that pattern or provide me with a link to a tutorial regarding this structure.
The important thing to know is that val definitions support a Pattern on the left-hand side of the assignment, thus providing (a subset of the functionality of) Pattern Matching.
So, your example is equivalent to:
val fields = requestJSON match {
case JsObject(foo) => foo
}
See Scala Language Specification Section 4.1 Value Declarations and Definitions for details.
So, for example, if you have a list l and you want to assign the first element and the rest, you could write:
val x :: xs = l
Or, for the fairly common case where a method returns a tuple, you could write:
val (result1, result2) = foo()
It is the Extractor pattern. You can get the same result by implementing the unapply method on an arbitrary object of your own (as shown in the example). When you create a case class, the compiler produces an unapply method for you, so you can do:
case class Person(name : String, surname : String)
val person = Person("gianluca", "aguzzi")
val Person(name, surname) = person
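One consequence worth spelling out: because a pattern val is equivalent to a match with a single case, it throws a MatchError at runtime when the value does not fit the pattern. A small sketch, written with Scala 2 in mind (newer compilers may warn about refutable patterns in val definitions); ValPatternDemo and headOption are made-up names.

```scala
import scala.util.Try

object ValPatternDemo {
  // Succeeds: the list is non-empty, so first = 1 and rest = List(2, 3)
  val first :: rest = List(1, 2, 3)

  // Tuple patterns cannot fail for a tuple-typed right-hand side
  val (quotient, remainder) = (7 / 2, 7 % 2)

  // For an empty list the pattern x :: _ does not match and a MatchError is thrown,
  // which Try turns into a Failure and toOption into None.
  def headOption(l: List[Int]): Option[Int] =
    Try {
      val x :: _ = l
      x
    }.toOption
}
```

So a pattern on the left of val is best reserved for cases where you know the shape of the value; otherwise an explicit match with a fallback case is safer.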

How does Scala transform case classes to be accepted as functions?

I am trying to understand how a case class can be passed as an argument to a function which accepts functions as arguments. Below is an example:
Consider the below function
def !![B](h: Out[B] => A): In[B] = { ... }
If I understood correctly, this is a polymorphic method which has a type parameter B and accepts a function h as a parameter. Out and In are other two classes defined previously.
This function is then being used as shown below:
case class Q(p: Boolean)(val cont: Out[R])
case class R(p: Int)
def g(c: Out[Q]) = {
val rin = c !! Q(true)_
...
}
I am aware that currying is being used to avoid writing the type annotation and instead just writing _. However, I cannot grasp why and how the case class Q is transformed to a function (h) of type Out[B] => A.
EDIT 1 Updated !! above and the In and Out definitions:
abstract class In[+A] {
def future: Future[A]
def receive(implicit d: Duration): A = {
Await.result[A](future, d)
}
def ?[B](f: A => B)(implicit d: Duration): B = {
f(receive)
}
}
abstract class Out[-A]{
def promise[B <: A]: Promise[B]
def send(msg: A): Unit = promise.success(msg)
def !(msg: A) = send(msg)
def create[B](): (In[B], Out[B])
}
These code samples are taken from the following paper: http://drops.dagstuhl.de/opus/volltexte/2016/6115/
TL;DR
Using a case class with multiple parameter lists and applying only the first one yields a partially applied apply call, and eta expansion then turns that method into a function value:
val res: Out[R] => Q = Q.apply(true) _
Longer explanation
To understand the way this works in Scala, we have to understand some fundamentals behind case classes and the difference between methods and functions.
Case classes in Scala are a compact way of representing data. When you define a case class, you get a bunch of convenience methods which are created for you by the compiler, such as hashCode and equals.
In addition, the compiler also generates a method called apply, which allows you to create a case class instance without using the new keyword:
case class X(a: Int)
val x = X(1)
The compiler will expand this call to
val x = X.apply(1)
The same thing will happen with your case class, only that your case class has multiple argument lists:
case class Q(p: Boolean)(val cont: Out[R])
val q: Q = Q(true)(new Out[R] { })
This will get translated to
val q: Q = Q.apply(true)(new Out[R] { })
On top of that, Scala has a way to transform methods, which are a non value type, into a function type which has the type of FunctionX, X being the arity of the function. In order to transform a method into a function value, we use a trick called eta expansion where we call a method with an underscore.
def foo(i: Int): Int = i
val f: Int => Int = foo _
This will transform the method foo into a function value of type Function1[Int, Int].
Now that we possess this knowledge, let's go back to your example:
val rin = c !! Q(true) _
If we just isolate Q here, this call gets translated into:
val rin = Q.apply(true) _
Since the apply method is curried with multiple argument lists, we'll get back a function that, given an Out[R], will create a Q:
val rin: Out[R] => Q = Q.apply(true) _
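Here is a self-contained version you can paste into a Scala 2 REPL to convince yourself. Out is reduced to an empty hypothetical stand-in (the real class from the paper has abstract members), and EtaDemo, mkQ and mkQ2 are names invented for this sketch.

```scala
object EtaDemo {
  // Hypothetical stand-in for the paper's Out class, just enough to type-check
  class Out[-A]
  case class R(p: Int)
  case class Q(p: Boolean)(val cont: Out[R])

  // Only the first parameter list is supplied; the underscore eta-expands the rest
  val mkQ: Out[R] => Q = Q.apply(true) _

  // Same thing through the Q(...) sugar, as written in the question
  val mkQ2: Out[R] => Q = Q(true) _
}
```

Both values are ordinary Function1 instances, which is why one of them can be handed to a method such as !! that expects a function from Out[B] to something.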
"I cannot grasp why and how the case class Q is transformed to a function (h) of type Out[B] => A."
It isn't. In fact, the case class Q has absolutely nothing to do with this! This is all about the object Q, which is the companion module to the case class Q.
Every case class has an automatically generated companion module, which contains (among others) an apply method whose signature matches the primary constructor of the companion class, and which constructs an instance of the companion class.
I.e. when you write
case class Foo(bar: Baz)(quux: Corge)
You not only get the automatically defined case class convenience methods such as accessors for all the elements, toString, hashCode, copy, and equals, but you also get an automatically defined companion module that serves both as an extractor for pattern matching and as a factory for object construction:
object Foo {
def apply(bar: Baz)(quux: Corge) = new Foo(bar)(quux)
def unapply(that: Foo): Option[Baz] = ???
}
In Scala, apply is a method that allows you to create "function-like" objects: if foo is an object (and not a method), then foo(bar, baz) is translated to foo.apply(bar, baz).
The last piece of the puzzle is η-expansion, which lifts a method (which is not an object) into a function (which is an object and can thus be passed as an argument, stored in a variable, etc.) There are two forms of η-expansion: explicit η-expansion using the _ operator:
val printFunction = println _
And implicit η-expansion: in cases where Scala knows 100% that you mean a function but you give it the name of a method, Scala will perform η-expansion for you:
Seq(1, 2, 3) foreach println
And you already know about currying.
So, if we put it all together:
Q(true)_
First, we know that Q here cannot possibly be the class Q. How do we know that? Because Q here is used as a value, but classes are types, and like most programming languages, Scala has a strict separation between types and values. Therefore, Q must be a value. In particular, since we know class Q is a case class, object Q is the companion module for class Q.
Secondly, we know that for a value Q
Q(true)
is syntactic sugar for
Q.apply(true)
Thirdly, we know that for case classes, the companion module has an automatically generated apply method that matches the primary constructor, so we know that Q.apply has two parameter lists.
So, lastly, we have
Q.apply(true) _
which passes the first argument list to Q.apply and then lifts Q.apply into a function which accepts the second argument list.
Note that case classes with multiple parameter lists are unusual, since only the parameters in the first parameter list are considered elements of the case class, and only elements benefit from the "case class magic", i.e. only elements get accessors implemented automatically, only elements are used in the signature of the copy method, only elements are used in the automatically generated equals, hashCode, and toString() methods, and so on.
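A quick way to see this for yourself is the following sketch (MultiListDemo, Foo and the field names are just placeholders): two instances that differ only in the second parameter list still compare equal and print the same.

```scala
object MultiListDemo {
  // quux needs an explicit val to be accessible at all; bar gets its accessor for free
  case class Foo(bar: Int)(val quux: String)

  val a = Foo(1)("x")
  val b = Foo(1)("y")
}
```

Here a == b is true and both print as Foo(1), even though a.quux and b.quux differ, because only bar counts as an element of the case class.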

Why does each new instance of a case class evaluate lazy vals again in Scala?

From what I have understood, Scala treats val definitions as values.
So, any instance of a case class with same parameters should be equal.
But,
case class A(a: Int) {
lazy val k = {
println("k")
1
}
}
val a1 = A(5)
println(a1.k)
Output:
k
res1: Int = 1
println(a1.k)
Output:
res2: Int = 1
val a2 = A(5)
println(a2.k)
Output:
k
res3: Int = 1
I was expecting that for println(a2.k), it should not print k.
Since this is not the required behavior, how should I implement this so that for all instances of a case class with same parameters, it should only execute a lazy val definition only once. Do I need some memoization technique or Scala can handle this on its own?
I am very new to Scala and functional programming so please excuse me if you find the question trivial.
Assuming you're not overriding equals or doing something ill-advised like making the constructor args vars, it is the case that two case class instantiations with same constructor arguments will be equal. However, this does not mean that two case class instantiations with same constructor arguments will point to the same object in memory:
case class A(a: Int)
A(5) == A(5) // true, same as `A(5).equals(A(5))`
A(5) eq A(5) // false
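To tie this back to the lazy val: the lazy val's storage lives in each instance, so equal but distinct instances each evaluate it once. A small sketch that counts evaluations instead of printing; LazyValDemo and the evaluations counter are made up for the example.

```scala
object LazyValDemo {
  // Counts how often the body of k actually runs
  var evaluations = 0

  case class A(a: Int) {
    lazy val k: Int = {
      evaluations += 1
      1
    }
  }
}
```

Accessing k twice on the same instance bumps the counter once; accessing it on a second A(5) bumps it again, even though the two instances are == to each other.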
If you want the constructor to always return the same object in memory, then you'll need to handle this yourself. Maybe use some sort of factory:
case class A private (a: Int) {
lazy val k = {
println("k")
1
}
}
object A {
private[this] val cache = collection.mutable.Map[Int, A]()
def build(a: Int) = {
cache.getOrElseUpdate(a, A(a))
}
}
val x = A.build(5)
x.k // prints k
val y = A.build(5)
y.k // doesn't print anything
x == y // true
x eq y // true
If, instead, you don't care about the constructor returning the same object, but you just care about the re-evaluation of k, you can just cache that part:
case class A(a: Int) {
lazy val k = A.kCache.getOrElseUpdate(a, {
println("k")
1
})
}
object A {
private[A] val kCache = collection.mutable.Map[Int, Int]()
}
A(5).k // prints k
A(5).k // doesn't print anything
The trivial answer is "this is what the language does according to the spec". That's the correct, but not very satisfying answer. It's more interesting why it does this.
It might be clearer that it has to do this with a different example:
case class A[B](b: B) {
lazy val k = {
println(b)
1
}
}
When you're constructing two A's, you can't know whether they are equal, because you haven't defined what it means for them to be equal (or what it means for B's to be equal). And you can't statically initialize k either, as it depends on the passed-in B.
If this has to print twice, it would be entirely unintuitive if that were only the case when k depends on b, but not when it doesn't depend on b.
When you ask
"how should I implement this so that for all instances of a case class with same parameters, it should only execute a lazy val definition only once"
that's a trickier question than it sounds. You make "the same parameters" sound like something that can be known at compile time without further information. It's not, you can only know it at runtime.
And if you only know that at runtime, that means you have to keep all past uses of the instance A[B] alive. This is a built in memory leak - no wonder Scala has no built-in way to do this.
If you really want this - and think long and hard about the memory leak - construct a Map[B, A[B]], and try to get a cached instance from that map, and if it doesn't exist, construct one and put it in the map.
I believe case classes only consider the arguments to their constructor (not any auxiliary constructor) to be part of their equality concept. Consider when you use a case class in a match statement, unapply only gives you access (by default) to the constructor parameters.
Consider anything in the body of a case class as "extra" or "side effect" stuff. I consider it a good tactic to keep case classes as near-empty as possible and put any custom logic in a companion object. E.g.:
case class Foo(a:Int)
object Foo {
def apply(s: String) = Foo(s.toInt)
}
In addition to dhg's answer, I should say that I'm not aware of a functional language that does full constructor memoizing by default. You should understand that such memoizing means that all constructed instances have to stay in memory, which is not always desirable.
Manual caching is not that hard, consider this simple code
import scala.collection.mutable
class Doubler private(a: Int) {
lazy val double = {
println("calculated")
a * 2
}
}
object Doubler{
val cache = mutable.WeakHashMap.empty[Int, Doubler]
def apply(a: Int): Doubler = cache.getOrElseUpdate(a, new Doubler(a))
}
Doubler(1).double //calculated
Doubler(5).double //calculated
Doubler(1).double //most probably not calculated

Calculating nullable nonterminals of a grammar in a functional way (preferably in Scala)

I'm new to functional programming and was wondering how one solves the problem of calculating the set of nullable nonterminals in a context-free grammar in a pure functional way without using variable assignments.
A nullable nonterminal is a nonterminal that directly yields empty, e.g., A ::= , or one whose body consists only of nullable nonterminals, e.g., A ::= B C D, where B, C and D all yield empty.
I'm using the following definitions in Scala to define a grammar:
case class Grammar(name:String, startSymbol:Nonterminal, rules:List[Rule])
case class Rule(head: Nonterminal, body:List[Symbol])
abstract class Symbol
case class Terminal(c:Char) extends Symbol
case class Nonterminal(name:String) extends Symbol
A basic algorithm is to gather all directly nullable nonterminals and put them in a set. Then, in each iteration, determine which production rules have only nullable nonterminals in their body. The heads of those rules are added to the set, and this repeats until no new nonterminal is added.
I have implemented this procedure in Scala as:
def getNullableNonterminals(grammar:Grammar) = {
var nieuw : Set[Nonterminal] = (for(Rule(head, Nil) <- grammar.rules) yield head) (collection.breakOut)
var old = Set[Nonterminal]()
while(old != nieuw) {
old = nieuw
for{
Rule(head, symbols) <- grammar.rules
if symbols.length > 0
if symbols.forall( s => s.isInstanceOf[Nonterminal] && old.contains(s.asInstanceOf[Nonterminal]))
} nieuw = nieuw + head
}
nieuw
}
Although the code is much shorter than the equivalent Java version, it uses variables. Any suggestions
to rewrite this piece of code in a functional style?
Here is a more idiomatic Scala solution:
import scala.annotation.tailrec
object Algorithm {
def getNullableNonterminals(grammar:Grammar) = {
loop(grammar, Set())
}
@tailrec
private def loop(grammar: Grammar, nullablesSoFar: Set[Nonterminal]): Set[Nonterminal] = {
val newNullables = generateNew(grammar, nullablesSoFar)
if (newNullables.isEmpty)
nullablesSoFar //no new nullables found, so we just return the ones we have
else
loop(grammar, nullablesSoFar ++ newNullables) //add the newly found nullables to the solution set and we keep going
}
private def generateNew(grammar: Grammar, nullableSoFar: Set[Nonterminal]) = {
for {
Rule(head, body) <- grammar.rules
if !nullableSoFar.contains(head)
if body.forall(isNullable(_, nullableSoFar))
} yield head
}
//checks if the symbol is nullable given the current set of nullables
private def isNullable(symbol: Symbol, provenNullable: Set[Nonterminal]) = symbol match {
case Terminal(_) => false
case x @ Nonterminal(_) => provenNullable.contains(x)
}
}
The while statement is replaced with a recursive function - loop.
Also, avoid using isInstanceOf - pattern matching is much better suited for this.
Small observation - make the Symbol class sealed, since this can enforce warnings of missing cases in pattern matches.
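If you prefer to separate "one round of the algorithm" from "repeat until nothing changes", the same idea can be sketched with a small generic fixpoint helper. This is just one possible arrangement, not part of the answer above; NullableDemo, step, fixpoint and nullables are hypothetical names, and the grammar types are repeated (with Symbol sealed) so the block stands alone.

```scala
import scala.annotation.tailrec

object NullableDemo {
  sealed abstract class Symbol
  case class Terminal(c: Char) extends Symbol
  case class Nonterminal(name: String) extends Symbol
  case class Rule(head: Nonterminal, body: List[Symbol])

  // One round: add every head whose body consists only of already-known nullables.
  // An empty body trivially satisfies forall, so directly nullable rules come first.
  def step(rules: List[Rule])(known: Set[Nonterminal]): Set[Nonterminal] =
    known ++ rules.collect {
      case Rule(head, body) if body.forall {
            case nt: Nonterminal => known.contains(nt)
            case _: Terminal     => false
          } => head
    }

  // Applies f repeatedly until the value stops changing.
  @tailrec
  def fixpoint[A](f: A => A)(a: A): A = {
    val next = f(a)
    if (next == a) a else fixpoint(f)(next)
  }

  def nullables(rules: List[Rule]): Set[Nonterminal] =
    fixpoint[Set[Nonterminal]](step(rules))(Set.empty)
}
```

No var is needed: the growing set is threaded through the recursive calls, and termination follows from the set only ever growing within a finite set of heads.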
Here is another approach using memoisation (a reference, another reference), which avoids the need for a fixed-point computation as in yours and M. A. D.'s solution. Moreover, it is a general pattern applicable to loads of scenarios. Have a look at the Scalaz implementation.
def getNullableNonterminals(g: Grammar): Iterable[Nonterminal] = {
/* Cache that is used by isNullable to memoise results. */
var cache: Map[Nonterminal, Boolean] = Map()
/* Assumption: For each nonterminal nt there exists only one rule r
* such that r.head == nt.
*/
val rules: Map[Nonterminal, List[Symbol]] = g.rules.map(r => (r.head, r.body)).toMap
def isNullable(s: Symbol): Boolean = s match {
case _: Terminal => false
case nt: Nonterminal =>
/* Either take the cached result, or compute it and store it in the cache. */
cache.getOrElse(nt, {
/* rules(nt) assumes that there is a rule for every nonterminal */
val nullable = rules(nt) forall isNullable
cache += ((nt, nullable))
nullable
})
}
rules.keys filter isNullable
}
Test case:
val ta = Terminal('a')
val tb = Terminal('b')
val ntX = Nonterminal("X")
val ntY = Nonterminal("Y")
val ntZ = Nonterminal("Z")
val ntP = Nonterminal("P")
val ntQ = Nonterminal("Q")
val ntR = Nonterminal("R")
val ntS = Nonterminal("S")
val rX = Rule(ntX, ntP :: ntQ :: Nil)
val rY = Rule(ntY, ntP :: ta :: ntQ :: Nil)
val rZ = Rule(ntZ, ntR :: Nil)
val rP = Rule(ntP, ntQ :: Nil)
val rQ = Rule(ntQ, Nil)
val rR = Rule(ntR, tb :: Nil)
val rS = Rule(ntS, ntX :: ntY :: ntZ :: Nil)
val g = Grammar("Test", ntS, List(rX, rY, rZ, rP, rQ, rR, rS))
getNullableNonterminals(g) foreach println
// Nonterminal(Q), Nonterminal(X), Nonterminal(P)
I have finally found time to write an example of how to do the grammar
nullability computation using circular attribute grammars. The code below uses
our Kiama language processing library for Scala. You can find the full source
code of the example and tests in Kiama. See SemanticAnalysis.scala for the
main attribution code, e.g., nullable.
In short, the approach does the following:
- represents a grammar as an abstract syntax tree structure,
- performs name analysis on the tree structure to resolve references from uses of grammar symbols to definitions of those symbols, and
- computes nullability as a circular attribute on the resulting DAG structure.
The attribute definitions I use are quite similar to the ones used as examples
in the paper Circular Reference Attribute Grammars by Magnusson and Hedin from
LDTA 2003. They implement circular attributes in their JastAdd system and I
would highly recommend the paper for anyone wishing to understand this topic.
We use essentially the same algorithms in Kiama.
Here is the definition of the AST that the example uses. Tree is a Kiama
type that provides some common behaviour.
sealed abstract class GrammarTree extends Tree
case class Grammar (startRule : Rule, rules : Seq[Rule]) extends GrammarTree
case class Rule (lhs : NonTermDef, rhs : ProdList) extends GrammarTree
sealed abstract class ProdList extends GrammarTree
case class EmptyProdList () extends ProdList
case class NonEmptyProdList (head : Prod, tail : ProdList) extends ProdList
case class Prod (symbols : SymbolList) extends GrammarTree
sealed abstract class SymbolList extends GrammarTree
case class EmptySymbolList () extends SymbolList
case class NonEmptySymbolList (head : Symbol, tail : SymbolList) extends SymbolList
sealed abstract class Symbol extends GrammarTree
case class TermSym (name : String) extends Symbol
case class NonTermSym (nt : NonTermUse) extends Symbol
sealed abstract class NonTerm extends GrammarTree {
def name : String
}
case class NonTermDef (name : String) extends NonTerm
case class NonTermUse (name : String) extends NonTerm
The code below shows the definition of the nullable attribute. It
starts out false and then a fixed point "loop" is entered to compute until the
value stabilises. The cases show how to compute the attribute for different
types of nodes in the AST.
Kiama's circular attribute constructor incorporates all of the implementation
of the attributes, including storage caching, fixed point detection etc.
val nullable : GrammarTree => Boolean =
circular (false) {
// nullable of the start rule
case Grammar (r, _) =>
r->nullable
// nullable of the right-hand side of the rule
case Rule (_, rhs) =>
rhs->nullable
// nullable of the component productions
case EmptyProdList () =>
false
case NonEmptyProdList (h, t) =>
h->nullable || t->nullable
// nullable of the component symbol lists
case Prod (ss) =>
ss->nullable
case EmptySymbolList () =>
true
case NonEmptySymbolList (h, t) =>
h->nullable && t->nullable
// terminals are not nullable
case TermSym (_) =>
false
// Non-terminal definitions are nullable if the rule in which they
// are defined is nullable. Uses are nullable if their associated
// declaration is nullable.
case NonTermSym (n) =>
n->nullable
case n : NonTermDef =>
(n.parent[Rule])->nullable
case n : NonTermUse =>
(n->decl).map (nullable).getOrElse (false)
}
The reference attribute decl is the one that connects a non-terminal use to
its corresponding definition on the left-hand side of a rule. The parent
field is a reference from a node to its parent in the AST.
Since the nullability of one rule or symbol depends on the nullability of
others, what you get is a set of attribute occurrences that participate in a
dependence cycle. The result is a declarative version of the nullability calculation that
closely resembles a "textbook" definition. (The example also defines
attributes for FIRST and FOLLOW set computations that are defined in terms of
nullability.) Circular attributes combine memoisation and fixed point
computation in a convenient package for this kind of problem.

Scala pattern matching in the for construct

I have the following data model which I'm going to do pattern matching against later:
abstract class A
case class C(s:String) extends A
abstract class B extends A
case class D(i:Int) extends B
case class E(s:Int, e:Int) extends B
A is the abstract super type of the hierarchy. C is a concrete subclass of A. Other concrete subclasses of A are subclasses of B which is in turn a subclass of A.
Now if I write something like this, it works:
def matchA(a: A) = a match {
case a: C => println("C")
case a: B => println("B")
}
However, in a for loop I cannot match against B. I assume that I need a constructor pattern, but since B is abstract, there is no constructor pattern for B.
val list:List[A] = List(C("a"), D(1), E(2,5), ...)
for (b:B <- list) println(b) // Compile error
for (b @ B <- list) println(b) // Compile error
Here, I would like to print only B instances. Any workaround for this case?
You can use collect:
list.collect { case b: B => println(b) }
If you want to understand this better, I recommend reading about partial functions.
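A side note on collect: since it takes a partial function, it filters and maps in one pass, so you can also use it to get a properly typed List[B] back instead of printing inside the case. A minimal sketch using the classes from the question; CollectDemo and bs are names chosen for the example.

```scala
object CollectDemo {
  abstract class A
  case class C(s: String) extends A
  abstract class B extends A
  case class D(i: Int) extends B
  case class E(s: Int, e: Int) extends B

  val list: List[A] = List(C("a"), D(1), E(2, 5))

  // Elements for which the partial function is not defined (here: C) are dropped,
  // and the static type of the result is narrowed to List[B].
  val bs: List[B] = list.collect { case b: B => b }
}
```

You can then print or process bs with foreach, which keeps the side effect out of the pattern match.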
Sergey is right; you'll have to give up for if you want to pattern match and filter only B instances. If you still want to use a for comprehension for whatever reason, I think one way is to just resort to using a guard:
for (b <- list if b.isInstanceOf[B]) println(b)
But it's always best to pick pattern-matching instead of isInstanceOf. So I'd go with the collect suggestion (if it made sense in the context of the rest of my code).
Another suggestion would be to define a companion object to B with the same name, and define the unapply method:
abstract class A
case class C(s:String) extends A
abstract class B extends A
object B { def unapply(b: B) = Option(b) } // Added a companion to B
case class D(i:Int) extends B
case class E(s:Int, e:Int) extends B
Then you can do this:
for (B(b) <- list) println(b)
So that's not the 'constructor' of B, but the companion's unapply method.
It works, and that's what friends are for, right?
(See http://www.scala-lang.org/node/112 )
If you ask me, the fact that you can't use pattern matching here is an unfortunate inconsistency of scala. Indeed scala does let you pattern match in for comprehensions, as this example will show:
val list:List[A] = List(C("a"), D(1), E(2,5))
for ((b:B,_) <- list.map(_ -> null)) println(b)
Here I temporarily wrap the elements into pairs (with a dummy and unused second value) and then pattern match for a pair where the first element is of type B. As the output shows, you get the expected behaviour:
D(1)
E(2,5)
So there you go, scala does support filtering based on pattern matching (even when matching by type), it just seems that the grammar does not handle pattern matching a single element by type.
Obviously I am not advising to use this trick, this was just to illustrate. Using collect is certainly better.
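For the curious, the filtering comes from how the compiler rewrites a generator with a refutable pattern: in Scala 2 it becomes, roughly, a withFilter that tests the pattern followed by the actual operation. The sketch below writes that rewriting out by hand for the yield variant of the pair trick; ForDesugarDemo and onlyBs are made-up names, and the exact generated code may differ in detail.

```scala
object ForDesugarDemo {
  abstract class A
  case class C(s: String) extends A
  abstract class B extends A
  case class D(i: Int) extends B
  case class E(s: Int, e: Int) extends B

  // Approximately what
  //   for ((b: B, _) <- list.map(_ -> null)) yield b
  // expands to in Scala 2.
  def onlyBs(list: List[A]): List[B] =
    list
      .map(_ -> null)
      .withFilter {
        case (_: B, _) => true  // keep pairs whose first element is a B
        case _         => false // silently drop everything else
      }
      .map { case (b: B, _) => b }
}
```

Seen this way, the pattern in a generator is just a filter plus a binding, which is the same job collect does in a single step.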
Then again, there is another, more general solution if for some reason you really fancy for comprehensions more than anything:
object Value {
def unapply[T]( value: T ) = Some( value )
}
for ( Value(b:B) <- list ) println(b)
We just introduced a dummy extractor in the Value object which does nothing, so that Value(b:B) has the same meaning as just b:B, except that the former does compile. And unlike my earlier trick with pairs, it is relatively readable, and Value only has to be written once; you can then use it at will (in particular, there is no need to write a new extractor for each type you want to pattern match against, as in @Faiz's answer). I'll let you find a better name than Value though.
Finally, there is another workaround that works out of the box (credit goes to Daniel Sobral), but it is slightly less readable and requires a dummy identifier (here foo):
for ( b @(foo:B) <- list ) println(b)
// or similarly:
for ( foo @(b:B) <- list ) println(b)
My 2 cents: you can add a condition in the for comprehension that checks the type, but that would NOT be as elegant as using collect, which would take only instances of class B.