I recently wrote a parser using Scala's parser combinator library. I got curious about the implementation and went digging.
While reading through the code, I saw that ~ sequencing used a case class to hold the left and right values.
Attached is the following comment:
/** A wrapper over sequence of matches.
*
* Given `p1: Parser[A]` and `p2: Parser[B]`, a parser composed with
* `p1 ~ p2` will have type `Parser[~[A, B]]`. The successful result
* of the parser can be extracted from this case class.
*
* It also enables pattern matching, so something like this is possible:
*
* {{{
* def concat(p1: Parser[String], p2: Parser[String]): Parser[String] =
* p1 ~ p2 ^^ { case a ~ b => a + b }
* }}}
*/
case class ~[+a, +b](_1: a, _2: b) {
override def toString = "("+ _1 +"~"+ _2 +")"
}
Given that such code as mentioned is certainly possible, and that parsers defined using a ~ b can be extracted into values via { case a ~ b => ... }, how exactly does this un-application work? I am aware of the unapply method in scala, but none is provided here. Do case classes provide one by default (I think yes)? If so, how does this particular case class become case a ~ b and not case ~(a,b)? Is this a pattern that can be exploited by scala programmers?
This differs from objects with unapply in this question because no unapply method exists–or does it? Do case classes auto-magically receive unapply methods?
Do case classes provide [unapply] by default (I think yes)?
Your suspicions are correct. unapply() is one of the many things automatically supplied in a case class. You can verify this for yourself by doing the following:
1. Write a simple class definition in its own file.
2. Compile the file only through the "typer" phase and save the results. (Invoke scalac -Xshow-phases to see a description of all the compiler phases.)
3. Edit the file. Add the word case before the class definition.
4. Repeat step 2.
5. Compare the two saved results.
From a Bash shell it might look like this.
%%> cat srcfile.scala
class XYZ(arg :Int)
%%> scalac -Xprint:4 srcfile.scala > plainClass.phase4
%%> vi srcfile.scala # add “case”
%%> scalac -Xprint:4 srcfile.scala > caseClass.phase4
%%> diff plainClass.phase4 caseClass.phase4
There will be a lot of compiler noise to wade through, but you'll see that by simply adding case to your class the compiler generates a ton of extra code.
Some of the things to note:
case class instances
  - have Product and Serializable mixed in to the type
  - provide public access to the constructor parameters
  - have methods copy(), productArity, productElement(), and canEqual()
  - override (provide new code for) methods productPrefix, productIterator, hashCode(), toString(), and equals()
the companion object (created by the compiler)
  - has methods apply() and unapply()
  - overrides toString()
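As a minimal sketch of the generated members listed above, the following exercises a few of them directly. The class name Point and its fields are hypothetical and chosen only for illustration.

```scala
object CaseClassDemo {
  // Hypothetical example class; not part of the original question.
  case class Point(x: Int, y: Int)

  val p = Point(1, 2)               // companion apply(): no `new` needed
  val moved = p.copy(y = 5)         // generated copy()
  val arity = p.productArity        // number of elements, from Product
  val first = p.productElement(0)   // first element, typed as Any
  val text = p.toString             // generated toString()
  val same = p == Point(1, 2)       // generated structural equals()

  // The generated extractor is what makes this pattern compile.
  val sum = p match { case Point(a, b) => a + b }
}
```

A plain class with the same parameters would offer none of these members unless they were written by hand.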
If so, how does this particular case class become case a ~ b and not case ~(a,b)?
This turns out to be a nice (if rather obscure) convenience offered by the language.
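The unapply() call returns an Option of a two-element tuple, and any extractor pattern with exactly two sub-patterns may be written in infix notation: case a ~ b means the same as case ~(a, b). Again, this is pretty easy to verify.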
class XX(val c:Char, val n:Int)
object XX {
def unapply(arg: XX): Option[(Char, Int)] = Some((arg.c,arg.n))
}
val a XX b = new XX('g', 9)
//a: Char = g
//b: Int = 9
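The same equivalence can be checked for ~ itself. The sketch below defines a local stand-in for the library's ~ class (an assumption made only so the example is self-contained) and matches it in both prefix and infix form.

```scala
object InfixPatternDemo {
  // Local stand-in for the parser combinator library's `~` class.
  case class ~[+A, +B](_1: A, _2: B)

  val pair = new ~("left", 42)

  // Constructor-style pattern.
  val prefix = pair match { case ~(a, b) => a + b }

  // Infix pattern: the same extractor, written differently.
  val infix = pair match { case a ~ b => a + b }

  // Infix patterns associate to the left, so `a ~ b ~ c` is `(a ~ b) ~ c`.
  val nested = new ~(new ~(1, 2), 3) match { case a ~ b ~ c => a + b + c }
}
```

Left associativity is what lets a chain such as p1 ~ p2 ~ p3 be taken apart with case a ~ b ~ c.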
I am trying to understand how a case class can be passed as an argument to a function which accepts functions as arguments. Below is an example:
Consider the below function
def !![B](h: Out[B] => A): In[B] = { ... }
If I understood correctly, this is a polymorphic method which has a type parameter B and accepts a function h as a parameter. Out and In are other two classes defined previously.
This function is then being used as shown below:
case class Q(p: Boolean)(val cont: Out[R])
case class R(p: Int)
def g(c: Out[Q]) = {
val rin = c !! Q(true)_
...
}
I am aware that currying is being used to avoid writing the type annotation and instead just writing _. However, I cannot grasp why and how the case class Q is transformed to a function (h) of type Out[B] => A.
EDIT 1 Updated !! above and the In and Out definitions:
abstract class In[+A] {
def future: Future[A]
def receive(implicit d: Duration): A = {
Await.result[A](future, d)
}
def ?[B](f: A => B)(implicit d: Duration): B = {
f(receive)
}
}
abstract class Out[-A]{
def promise[B <: A]: Promise[B]
def send(msg: A): Unit = promise.success(msg)
def !(msg: A) = send(msg)
def create[B](): (In[B], Out[B])
}
These code samples are taken from the following paper: http://drops.dagstuhl.de/opus/volltexte/2016/6115/
TL;DR
Partially applying a case class that has multiple parameter lists yields a partially applied call to apply, and eta expansion then turns that method into a function value:
val res: Out[R] => Q = Q.apply(true) _
Longer explanation
To understand the way this works in Scala, we have to understand some fundamentals behind case classes and the difference between methods and functions.
Case classes in Scala are a compact way of representing data. When you define a case class, you get a bunch of convenience methods which are created for you by the compiler, such as hashCode and equals.
In addition, the compiler also generates a method called apply, which allows you to create a case class instance without using the new keyword:
case class X(a: Int)
val x = X(1)
The compiler will expand this call to
val x = X.apply(1)
The same thing will happen with your case class, only that your case class has multiple argument lists:
case class Q(p: Boolean)(val cont: Out[R])
val q: Q = Q(true)(new Out[R] { })
Will get translated to
val q: Q = Q.apply(true)(new Out[R] { })
On top of that, Scala has a way to transform methods, which are not values, into function values of type FunctionX, X being the arity of the function. To transform a method into a function value, we use a technique called eta expansion, written by following the method with an underscore.
def foo(i: Int): Int = i
val f: Int => Int = foo _
This will transform the method foo into a function value of type Function1[Int, Int].
Now that we possess this knowledge, let's go back to your example:
val rin = c !! Q(true) _
If we just isolate Q(true) _ here, the call gets translated into:
val h = Q.apply(true) _
Since the apply method has multiple argument lists, we get back a function that, given an Out[R], creates a Q:
val h: Out[R] => Q = Q.apply(true) _
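A self-contained sketch of the same mechanism follows. Token is a hypothetical stand-in for Out[R], introduced only so the snippet compiles without the paper's definitions; the trailing underscore is the explicit eta-expansion syntax used in the question.

```scala
object PartialApplyDemo {
  // Hypothetical stand-in for Out[R].
  final class Token(val id: Int)

  // Same shape as Q in the question: two parameter lists.
  case class Q(p: Boolean)(val cont: Token)

  // Supplying only the first list and eta-expanding gives a function
  // that still expects the second list.
  val h: Token => Q = Q.apply(true) _

  // Calling the function supplies the second list: Q(true)(new Token(7)).
  val q: Q = h(new Token(7))
}
```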
I cannot grasp why and how the case class Q is transformed to a function (h) of type Out[B] => A.
It isn't. In fact, the case class Q has absolutely nothing to do with this! This is all about the object Q, which is the companion module to the case class Q.
Every case class has an automatically generated companion module, which contains (among others) an apply method whose signature matches the primary constructor of the companion class, and which constructs an instance of the companion class.
I.e. when you write
case class Foo(bar: Baz)(quux: Corge)
You not only get the automatically defined case class convenience methods such as accessors for all the elements, toString, hashCode, copy, and equals, but you also get an automatically defined companion module that serves both as an extractor for pattern matching and as a factory for object construction:
object Foo {
def apply(bar: Baz)(quux: Corge) = new Foo(bar)(quux)
def unapply(that: Foo): Option[Baz] = ???
}
In Scala, apply is a method that allows you to create "function-like" objects: if foo is an object (and not a method), then foo(bar, baz) is translated to foo.apply(bar, baz).
The last piece of the puzzle is η-expansion, which lifts a method (which is not an object) into a function (which is an object and can thus be passed as an argument, stored in a variable, etc.) There are two forms of η-expansion: explicit η-expansion using the _ operator:
val printFunction = println _
And implicit η-expansion: in cases where Scala knows 100% that you mean a function but you give it the name of a method, Scala will perform η-expansion for you:
Seq(1, 2, 3) foreach println
And you already know about currying.
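The two forms of η-expansion described above can be sketched side by side. The method name double is hypothetical.

```scala
object EtaExpansionDemo {
  // A method: not a value, so it cannot be stored in a variable as is.
  def double(i: Int): Int = i * 2

  // Explicit η-expansion with the underscore.
  val explicit: Int => Int = double _

  // Implicit η-expansion: map expects a function, so the method is lifted.
  val doubled = Seq(1, 2, 3).map(double)

  // The lifted function is an ordinary value and can be passed around.
  val viaValue = Seq(1, 2, 3).map(explicit)
}
```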
So, if we put it all together:
Q(true)_
First, we know that Q here cannot possibly be the class Q. How do we know that? Because Q here is used as a value, but classes are types, and like most programming languages, Scala has a strict separation between types and values. Therefore, Q must be a value. In particular, since we know class Q is a case class, object Q is the companion module for class Q.
Secondly, we know that for a value Q
Q(true)
is syntactic sugar for
Q.apply(true)
Thirdly, we know that for case classes, the companion module has an automatically generated apply method that matches the primary constructor, so we know that Q.apply has two parameter lists.
So, lastly, we have
Q.apply(true) _
which passes the first argument list to Q.apply and then lifts Q.apply into a function which accepts the second argument list.
Note that case classes with multiple parameter lists are unusual, since only the parameters in the first parameter list are considered elements of the case class, and only elements benefit from the "case class magic", i.e. only elements get accessors implemented automatically, only elements are used in the signature of the copy method, only elements are used in the automatically generated equals, hashCode, and toString() methods, and so on.
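A small sketch of that caveat, using a hypothetical class Tagged whose second parameter list holds a field that is not an element:

```scala
object SecondListDemo {
  // `id` is an element of the case class; `note` is not.
  case class Tagged(id: Int)(val note: String)

  val a = Tagged(1)("first")
  val b = Tagged(1)("second")

  val equalDespiteNotes = a == b   // only `id` takes part in equals()
  val shown = a.toString           // `note` is not printed
  val arity = a.productArity       // counts only the first parameter list

  // The generated extractor also sees only `id`.
  val extracted = a match { case Tagged(i) => i }
}
```

Note that note is accessible here only because it is explicitly declared as a val.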
Is there a better explanation than "this is how it works"? I mean, I tried this one:
class TestShortMatch[T <: AnyRef] {
def foo(t: T): Unit = {
val f = (_: Any) match {
case Val(t) => println(t)
case Sup(l) => println(l)
}
}
class Val(t: T)
class Sup(l: Number)
}
and the compiler complains:
Cannot resolve symbol 'Val'
Cannot resolve symbol 'Sup'
Of course, if I add case before each of the classes it works fine. But what is the reason? Does the compiler make some optimization and generate specific bytecode?
The reason is twofold. Pattern matching is just syntactic sugar for using extractors, and case classes happen to give you a couple of methods for free, one of which is an extractor method that corresponds to the primary constructor.
If you want your example above to work, you need to define an unapply method inside companion objects Val and Sup. To do that you need accessor methods (which are only generated for val parameters, so you'll have to make your constructor parameters vals):
class Val[T](val t: T)
class Sup(val l: Number)
object Val {
def unapply[T](v: Val[T]): Option[T] = Some(v.t)
}
object Sup {
def unapply(s: Sup): Option[Number] = Some(s.l)
}
At which point you can do something like val Val(v) = new Val("hi"). More often than not, though, it is better to make your class a case class. Then the only time you need to define an extractor yourself is when you want an additional one.
The usual example (to which I can't seem to find a reference) is coordinates:
case class Coordinate(x: Double, y: Double)
And then you can define custom extractors like
object Polar {
def unapply(c: Coordinate): Option[(Double,Double)] = {...}
}
object Cartesian {
def unapply(c: Coordinate): Option[(Double,Double)] = Some((c.x,c.y))
}
to convert to the two different representations, all when you pattern match.
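The body of Polar.unapply is elided above. One possible way to fill it in, offered as an assumption rather than the answer's own code, is to return the radius and the angle in radians:

```scala
object CoordinateDemo {
  case class Coordinate(x: Double, y: Double)

  object Cartesian {
    def unapply(c: Coordinate): Option[(Double, Double)] = Some((c.x, c.y))
  }

  // Assumed body: (radius, angle in radians). The original leaves this open.
  object Polar {
    def unapply(c: Coordinate): Option[(Double, Double)] =
      Some((math.hypot(c.x, c.y), math.atan2(c.y, c.x)))
  }

  // The same value can be taken apart in either representation.
  def radius(c: Coordinate): Double = c match { case Polar(r, _) => r }
  def abscissa(c: Coordinate): Double = c match { case Cartesian(x, _) => x }
}
```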
You can use pattern matching on arbitrary classes, but you need to implement an unapply method, used to "de-construct" the object.
With a case class, the unapply method is automatically generated by the compiler, so you don't need to implement it yourself.
When you write exp match { case Val(pattern) => ... case ... }, that is equivalent to something like this:
Val.unapply(exp) match {
case Some(pattern) =>
...
case _ =>
// code to match the other cases goes here
}
That is, it uses the result of the companion object's unapply method to see whether the match succeeded.
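The equivalence can be sketched with a hand-written extractor. The code below is an approximation of what the compiler does, not its actual output; the class Val here is a simplified, non-generic version of the one in the question.

```scala
object DesugarDemo {
  class Val(val t: String)
  object Val {
    def unapply(v: Val): Option[String] = Some(v.t)
  }

  // With pattern matching syntax.
  def viaMatch(x: Any): String = x match {
    case Val(s) => s
    case _      => "no match"
  }

  // Roughly the same logic by hand: a type test, then a call to unapply.
  def byHand(x: Any): String =
    if (x.isInstanceOf[Val]) {
      Val.unapply(x.asInstanceOf[Val]) match {
        case Some(s) => s
        case None    => "no match"
      }
    } else "no match"
}
```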
If you define a case class, it automatically defines a companion object with a suitable unapply method. For a normal class it doesn't. The motivation for that is the same as for the other things that get automatically defined for case classes (like equals and hashCode, for example): by declaring a class as a case class, you're making a statement about how you want the class to behave. Given that, there's a good chance that the auto-generated methods will do what you want. For a general class, it's up to you to define these methods the way you want them to behave.
Note that parameters for case classes are vals by default, which isn't true for normal classes. So your class class Val(t: T) doesn't even have any way to access t from the outside. So it isn't even possible to define an unapply method that gets at the value of t. That's another reason why you don't get an automatically generated unapply for normal classes: It isn't even possible to generate one unless all parameters are vals.
In Action.scala from the Play framework there is the following code. Why does it define a trait Handler without any method or field? What is the purpose or benefit of defining an empty trait?
trait Handler
/**
* A handler that is able to tag requests. Usually mixed in to other handlers.
*/
trait RequestTaggingHandler extends Handler {
def tagRequest(request: RequestHeader): RequestHeader
}
Building on @user2864740
A simple example. (This is just one use-case)
Let's define a data structure for simple expressions. We want numbers to exist and a plus which combines expressions.
trait Expression
case class Number(i: Int) extends Expression
case class Plus(e1: Expression, e2: Expression) extends Expression
Now in order to evaluate an Expression, we define a method like this.
def evaluate(e: Expression): Int = e match {
case Number(i) => i
case Plus(e1, e2) => evaluate(e1) + evaluate(e2)
}
Since we have Expression as a parameter for Plus, we can put Plus or Number inside it.
val myExpression = Plus(Plus(Number(1),Number(2)), Number(4))
evaluate(myExpression) //yields 7
We just used the empty trait as a common super type (a connection) for Number and Plus, enabling us to pattern-match for evaluate and use Plus inside Plus.
I hope this is not too confusing.
I have the following data model which I'm going to do pattern matching against later:
abstract class A
case class C(s:String) extends A
abstract class B extends A
case class D(i:Int) extends B
case class E(s:Int, e:Int) extends B
A is the abstract super type of the hierarchy. C is a concrete subclass of A. Other concrete subclasses of A are subclasses of B which is in turn a subclass of A.
Now if I write something like this, it works:
def doMatch(a: A) = a match {
  case a: C => println("C")
  case a: B => println("B")
}
However, in a for loop I cannot match against B. I assume that I need a constructor pattern, but since B is abstract, there is no constructor pattern for B.
val list:List[A] = List(C("a"), D(1), E(2,5), ...)
for (b:B <- list) println(b) // Compile error
for (b@B <- list) println(b) // Compile error
Here, I would like to print only B instances. Any workaround for this case?
You can use collect:
list.collect { case b: B => println(b) }
If you want to understand this better, I recommend reading about partial functions.
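As a sketch of why collect fits here: it filters and maps in one pass using a partial function, so the result can be typed as List[B]. The hierarchy is repeated from the question to keep the snippet self-contained.

```scala
object CollectDemo {
  abstract class A
  case class C(s: String) extends A
  abstract class B extends A
  case class D(i: Int) extends B
  case class E(s: Int, e: Int) extends B

  val list: List[A] = List(C("a"), D(1), E(2, 5))

  // Elements that do not match the partial function are dropped.
  val onlyBs: List[B] = list.collect { case b: B => b }
}
```

Returning the matched values instead of printing inside collect keeps the side effect separate, for example onlyBs.foreach(println).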
Sergey is right; you'll have to give up the for comprehension if you want to pattern match and filter only B instances. If you still want to use a for comprehension for whatever reason, one way is to resort to a guard:
for (b <- list if b.isInstanceOf[B]) println(b)
But it's always best to pick pattern-matching instead of isInstanceOf. So I'd go with the collect suggestion (if it made sense in the context of the rest of my code).
Another suggestion would be to define a companion object to B with the same name, and define the unapply method:
abstract class A
case class C(s:String) extends A
abstract class B extends A
object B { def unapply(b: B) = Option(b) } // Added a companion to B
case class D(i:Int) extends B
case class E(s:Int, e:Int) extends B
Then you can do this:
for (B(b) <- list) println(b)
So that's not the 'constructor' of B, but the companion's unapply method.
It works, and that's what friends are for, right?
(See http://www.scala-lang.org/node/112 )
If you ask me, the fact that you can't use pattern matching here is an unfortunate inconsistency of Scala. Indeed, Scala does let you pattern match in for comprehensions, as this example will show:
val list:List[A] = List(C("a"), D(1), E(2,5))
for ((b:B,_) <- list.map(_ -> null)) println(b)
Here I temporarily wrap the elements into pairs (with a dummy and unused second value) and then pattern match for a pair where the first element is of type B. As the output shows, you get the expected behaviour:
D(1)
E(2,5)
So there you go, scala does support filtering based on pattern matching (even when matching by type), it just seems that the grammar does not handle pattern matching a single element by type.
Obviously I am not advising to use this trick, this was just to illustrate. Using collect is certainly better.
Then again, there is another, more general solution if for some reason you really fancy for comprehensions more than anything:
object Value {
def unapply[T]( value: T ) = Some( value )
}
for ( Value(b:B) <- list ) println(b)
We just introduced a dummy extractor in the Value object which does nothing, so that Value(b:B) has the same meaning as just b:B, except that the former does compile. Unlike my earlier trick with pairs, it is relatively readable, and Value only has to be written once; you can then use it at will (in particular, there is no need to write a new extractor for each type you want to pattern match against, as in @Faiz's answer). I'll let you find a better name than Value though.
Finally, there is another work around that works out of the box (credit goes to Daniel Sobral), but is slightly less readable and requires a dummy identifier (here foo):
for ( b @(foo:B) <- list ) println(b)
// or similarly:
for ( foo @(b:B) <- list ) println(b)
my 2 cents: You can add a condition in the for comprehension checking type but that would NOT be as elegant as using collect which would take only instances of class B.
Is it possible to combine guard conditions with pattern matching within sealed case class declarations?
I realise it's possible to include guard conditions within the match block, but I feel it would be beneficial to define these conditions up front in the sealed case classes.
This would allow developers to define strict set of possible inputs which the compiler would check when pattern matching.
So in summary I'd like to be able to do the equivalent of something like this:
// create a set of pattern matchable cases with guards built in
sealed abstract class Args
case class ValidArgs1(arg1:Int,arg2:Int) if arg1>1 && arg2<10 extends Args
case class ValidArgs2(arg1:Int,arg2:Int) if arg1>5 && arg2<6 extends Args
case class InvalidArgs(arg1:Int,arg2:Int) if arg1<=1 && arg2>=10 extends Args
// the aim of this is to achieve pattern matching against an exhaustive set of
// pre-defined possibilities
def process(args:Args){
args match
{
case ValidArgs1 => // do this
case ValidArgs2 => // do this
case InvalidArgs => // do this
}
}
+1 for an interesting speculative question. Since you are not operating at the type level, you cannot verify the instantiation at compile time, except maybe for very special checks with macros, e.g. when you are passing literals to the constructor.
On the other hand, your scenario, the pattern matching, is a runtime action. For that to work, you could use extractors instead of case classes.
case class Args(arg1: Int, arg2: Int)
object ValidArgs1 {
def apply(arg1: Int, arg2: Int): Args = {
val res = Args(arg1, arg2)
require(unapply(res))
res
}
def unapply(args: Args): Boolean = args.arg1 > 1 && args.arg2 < 10
}
def process(args: Args) = args match {
case ValidArgs1() => "ok"
case _ => "invalid"
}
process(ValidArgs1(2, 9))
process(Args(1, 10))
process(Args(3, 4))
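If exhaustiveness checking is the main goal, a different runtime approach is a sealed hierarchy whose instances are created through a single classifying function. This is only a sketch under assumptions: the names Checked and classify are hypothetical, and nothing here stops a caller from constructing ValidArgs1(0, 99) directly.

```scala
object ClassifyDemo {
  sealed trait Checked
  case class ValidArgs1(arg1: Int, arg2: Int) extends Checked
  case class InvalidArgs(arg1: Int, arg2: Int) extends Checked

  // Hypothetical helper: the range check runs once, when the value is built.
  def classify(arg1: Int, arg2: Int): Checked =
    if (arg1 > 1 && arg2 < 10) ValidArgs1(arg1, arg2)
    else InvalidArgs(arg1, arg2)

  // Because Checked is sealed, the compiler can warn about a missing case.
  def process(c: Checked): String = c match {
    case ValidArgs1(_, _)  => "ok"
    case InvalidArgs(_, _) => "invalid"
  }
}
```

The condition is still evaluated at run time; only the completeness of the match is checked at compile time.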
I don't believe that you will be able to have general constraints/assertions that are checked at compile-time in Scala, because Scala does not have a static verifier which would be needed to do this. If you are interested, have a look at (research) languages/tools such as ESC/Java, Spec#, Dafny or VeriFast.
There might be ways of having a very limited amount of static checking with the regular Scala compiler by using type-level programming or Scala macros, but this is just a wild guess of mine, as I am not familiar with either of them. To be honest, I must admit that I would be quite surprised if macros can actually help here.
What works is runtime assertion checking, e.g.
case class Foo(arg1: Int, arg2: Int) {
require(arg1 < arg2, "The first argument must be strictly less than " +
"the second argument.")
}
Foo(0, 0)
/* java.lang.IllegalArgumentException: requirement failed:
* The first argument must be strictly less than the second
* argument.
*/
but that probably isn't what you had in mind.