Creating two related ASTs with sealed case classes in Scala

Whenever I've had to create an AST in Scala, I've used the abstract sealed trait/case class pattern. It has worked really well so far; compiler-checked pattern matching is a big win.
However, I've now hit a problem that I can't wrap my head around. What if I have two languages where one is a subset of the other? As a simple example, consider a lambda calculus where every variable is bound, and another related language where the variables can be bound or free.
First language:
abstract sealed class Expression
case class Variable(val scope: Lambda, val name:String) extends Expression
case class Lambda(val v: Variable, val inner: Expression) extends Expression
case class Application(val function: Expression, val input: Expression) extends Expression
Second Language:
abstract sealed class Expression
case class Variable(val name:String) extends Expression
case class Lambda(val v: Variable, val inner: Expression) extends Expression
case class Application(val function: Expression, val input: Expression) extends Expression
Where the only change is the removal of scope from Variable.
As you can see, there is a lot of redundancy. But because I'm using sealed classes, it's hard to think of a good way to extend them. Another challenge in combining them is that every Lambda and Application would then need to keep track of the language of its parameters, at the type level.
This example is not so bad because it is very small, but imagine the same problem for strict HTML/weak HTML.

The classical answer to this problem is to have a single general AST and an additional pass for validation. You'll have to live with ASTs that are well-formed syntactically, but won't pass validation (type-checking).
If you want to distinguish at the type level, the type-checking pass could produce a new AST. You might be able to use path-dependent types for that.
As a side-note, your example seems to have a cycle: to create a Lambda you need a Variable, but to create a Variable you need the outer Lambda.
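The "single general AST plus validation pass" idea above can be sketched as follows. This is a minimal sketch, assuming the scope-free Variable from the second language; the Validation object and its freeVariables and isClosed helpers are hypothetical names, not something from the question.

```scala
// One general AST: variables carry only a name, so both bound and free
// variables are representable.
sealed trait Expression
final case class Variable(name: String) extends Expression
final case class Lambda(v: Variable, inner: Expression) extends Expression
final case class Application(function: Expression, input: Expression) extends Expression

// Hypothetical validation pass: a tree belongs to the stricter language
// (every variable bound) when it has no free variables.
object Validation {
  def freeVariables(e: Expression, bound: Set[String] = Set.empty): Set[String] =
    e match {
      case Variable(n)       => if (bound(n)) Set.empty else Set(n)
      case Lambda(v, inner)  => freeVariables(inner, bound + v.name)
      case Application(f, i) => freeVariables(f, bound) ++ freeVariables(i, bound)
    }

  def isClosed(e: Expression): Boolean = freeVariables(e).isEmpty
}
```

With this shape, Lambda(Variable("x"), Variable("x")) passes validation, while Application(Variable("f"), Variable("x")) is syntactically well-formed but is rejected by isClosed.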

When deciding how to generalize, it is sometimes helpful to think of an example function that would need to operate on the generalized structure. So, take some operation that you would want to perform on both bound and free trees. Take eta-reduction:
def tryEtaReduce(x: Expression): Option[Expression] =
x match {
case Lambda(v1, Application(f, v2: Variable)) if v1 == v2 => Some(f)
case _ => None
}
For the above function, a generalization like the following will work, although it has an obvious ugliness:
trait AST {
sealed trait Expression
type Scope
case class Variable(scope: Scope, name: String) extends Expression
case class Lambda(v: Variable, inner: Expression) extends Expression
case class Application(function: Expression, input: Expression) extends Expression
}
object BoundAST extends AST {
type Scope = Lambda
}
object FreeAST extends AST {
type Scope = Unit
}
trait ASTOps {
val ast: AST
import ast._
def tryEtaReduce(x: Expression): Option[Expression] =
x match {
case Lambda(v1, Application(f, v2: Variable)) if v1 == v2 =>
Some(f)
case _ =>
None
}
}
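To show how the generalized operations might be instantiated for one concrete language, here is a sketch that repeats the definitions above and fixes ast to FreeAST. The names FreeOps and EtaDemo are made up for illustration; pinning the type of ast to FreeAST.type is one way (an assumption, not the only option) to make FreeAST expressions acceptable to tryEtaReduce.

```scala
trait AST {
  sealed trait Expression
  type Scope
  case class Variable(scope: Scope, name: String) extends Expression
  case class Lambda(v: Variable, inner: Expression) extends Expression
  case class Application(function: Expression, input: Expression) extends Expression
}

object FreeAST extends AST {
  type Scope = Unit
}

trait ASTOps {
  val ast: AST
  import ast._
  def tryEtaReduce(x: Expression): Option[Expression] = x match {
    case Lambda(v1, Application(f, v2: Variable)) if v1 == v2 => Some(f)
    case _                                                    => None
  }
}

// Hypothetical instantiation: the singleton type makes
// FreeOps.ast.Expression the same type as FreeAST.Expression.
object FreeOps extends ASTOps {
  val ast: FreeAST.type = FreeAST
}

object EtaDemo {
  import FreeAST._
  val x: Variable = Variable((), "x")
  val f: Variable = Variable((), "f")
  val reducible: Expression   = Lambda(x, Application(f, x)) // \x. f x
  val irreducible: Expression = Lambda(x, Application(f, f)) // \x. f f
}
```

An analogous object with ast fixed to BoundAST would give the same operation for the bound language without duplicating tryEtaReduce.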

Related

Type-safety for Patternmatching on Parameters with Dependent Types in Scala

I am currently working with a type hierarchy which represents a tree and an accompanying type hierarchy that represents steps of root-to-node paths in that tree. There are different kinds of nodes so different steps can be taken at each node. Therefore, the node types have a type member which is set to a trait including all valid steps. Take the following example:
// Steps
sealed trait Step
sealed trait LeafStep extends Step
sealed trait HorizontalStep extends Step
sealed trait VerticalStep extends Step
object Left extends HorizontalStep
object Right extends HorizontalStep
object Up extends VerticalStep
object Down extends VerticalStep
// The Tree
sealed trait Tree {
type PathType <: Step
}
case class Leaf() extends Tree {
override type PathType = LeafStep
}
case class Horizontal(left: Tree, right: Tree) extends Tree {
override type PathType = HorizontalStep
}
case class Vertical(up: Tree, down: Tree) extends Tree {
override type PathType = VerticalStep
}
In this example, given a tree the path Seq(Up, Right) would tell us to go to the "right" child of the "up" child of the root (assuming the tree's nodes have the fitting types). Of course, navigating the tree involves a bunch of code using PartialFunctions that is not shown in the example. However, in that process I want to provide a type-safe callback which is notified for every step that is taken and its arguments include both the step and the respective tree nodes.
My current approach is a function def callback(from: Tree, to: Tree)(step: from.PathType). That works without problems on the caller side, but I'm running into an issue when actually implementing such a callback that works with the data passed to it.
def callback(from: Tree, to: Tree)(step: from.PathType) = {
from match {
case f: Horizontal => step match {
case Left => ??? // do useful stuff here
case Right => ??? // do other useful stuff here
}
case _ => ()
}
}
In the function, the compiler is not convinced that Left and Right are of type from.PathType. Of course, I could simply add .asInstanceOf[f.PathType] and the code seems to work with that. However, the outer match gives us that f is of type Horizontal. We know that PathType of Horizontal objects is HorizontalStep and since f is identical to from we also know that from.PathType is of that same type. Finally, we can check that both Left and Right extend HorizontalStep. So, the above code should always be type-safe even without the cast.
Is that reasoning something the Scala compiler just does not check or am I missing a case where the types could differ? Is there a better way to achieve my goal of a type-safe callback? I'm using Scala 2.12.15
I'm afraid I do not have a thorough explanation of why this does not typecheck. It seems that, for such dependently typed parameters, the compiler just isn't able to apply the knowledge gained from pattern matching to the other parameter (the one that was not explicitly matched).
Nevertheless, here is something that would work:
First, use a type parameter instead of a type member:
// (Steps as above)
// The Tree
sealed trait Tree[PathType <: Step]
// type alias to keep parameter lists simpler
// where we do not care..
type AnyTree = Tree[_ <: Step]
case class Leaf() extends Tree[LeafStep]
case class Horizontal(left: AnyTree, right: AnyTree) extends Tree[HorizontalStep]
case class Vertical(up: AnyTree, down: AnyTree) extends Tree[VerticalStep]
Then, this would fully typecheck:
def callback[S <: Step, F <: Tree[S]](from: F, to: AnyTree)(step: S) = {
def horizontalCallback(from: Horizontal)(step: HorizontalStep) = {
step match {
case Left => ??? // do useful stuff here
case Right => ??? // do other useful stuff here
}
}
from match {
case f: Horizontal =>
horizontalCallback(f)(step)
case _ =>
()
}
}
Confusingly, the compiler would not be able to properly check the pattern match on the step if one places it directly in the outer match like this (Edit: that is only true for 2.12 where this does not give a "match may not be exhaustive" warning - with 2.13 or 3.2, this checks fine):
def callback[S <: Step, F <: Tree[S]](from: F, to: AnyTree)(step: S) = {
from match {
case f: Horizontal =>
step match {
case Left => ??? // do useful stuff here
case Right => ??? // do other useful stuff here
}
case _ =>
()
}
}

Scala "or" in generic bounds

I know that I can write something like this:
case class Something[T <: Foo with Bar](state: T)
This accepts classes which have the traits (or class and trait) Foo and Bar. This is an AND example where it is needed to extend both Foo and Bar. Is there an option which allows me to pass classes extending Foo OR Bar to pattern match against them?
The use case is that I have multiple Classes with different behaviors which consume states which are of shared types:
trait FooState
trait BarState
trait BazState
case class Foo(state: FooState) // must not accept BarState or BazState
case class Bar(state: BarState) // must not accept FooState or BazState
case class Baz(state: BazState) // must not accept FooState or BarState
case class FooBar(state: FooState or BarState) // must not accept BazState
case class FooBaz(state: FooState or BazState) // must not accept BarState
case class BarBaz(state: BarState or BazState) // must not accept FooState
I know I can create another trait for every compound class, but this would force me to add it to everything that extends any of these previous traits.
Yes, you would usually use a typeclass to achieve what you want, and a context bound. Here's how:
trait Acceptable[T]
object Acceptable {
implicit val fooIsGood: Acceptable[Foo] = new Acceptable[Foo] {}
implicit val barIsGood: Acceptable[Bar] = new Acceptable[Bar] {}
}
case class Something[T : Acceptable](state: T)
And you can play with it to implement whatever functionality you want using this pattern. Achieving a real union type bound can be done with Either or co-products, but in most scenarios this may be simpler.
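Applied to the state traits from the question, the typeclass approach might look like the sketch below. FooOrBar and the States objects are hypothetical names. Making the evidence trait contravariant is an assumption added here so that subtypes of FooState or BarState (such as singleton objects) are accepted without spelling out the type argument.

```scala
trait FooState
trait BarState
trait BazState

// Evidence that T is one of the state types FooBar accepts.
// Contravariant, so evidence for FooState also covers its subtypes.
sealed trait FooOrBar[-T]
object FooOrBar {
  implicit val fooOk: FooOrBar[FooState] = new FooOrBar[FooState] {}
  implicit val barOk: FooOrBar[BarState] = new FooOrBar[BarState] {}
}

// The context bound requires an implicit FooOrBar[T] at each construction site.
final case class FooBar[T: FooOrBar](state: T)

// Hypothetical concrete states, for illustration only.
object States {
  object MyFoo extends FooState
  object MyBar extends BarState
  object MyBaz extends BazState
}
```

FooBar(States.MyFoo) and FooBar(States.MyBar) compile, while FooBar(States.MyBaz) is rejected at compile time because no FooOrBar instance applies. A state that extends both FooState and BarState would likely make the implicit search ambiguous, so this sketch assumes the state traits are not mixed together.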
One possible way to do this is to use the Either type:
case class FooBar(state: Either[FooState, BarState]) {
def someOperation() = {
state match {
case Left(fooState) => ???
case Right(barState) => ???
}
}
}
What you've described is a union type. Scala 2 does not support them as you've described them; they were planned for Dotty and are available in Scala 3, where the parameter can be written as state: FooState | BarState.
If you need more flexibility than that (more than two types for example) consider using a Coproduct from a functional programming library. Scalaz, Cats and Shapeless all expose them.

Pattern matching with generics

Given the following class pattern match:
clazz match {
case MyClass => someMethod[MyClass]
}
Is it possible to refer to MyClass in a generic way based on what the pattern match came up with? For example, if I have multiple subclasses of MyClass, can I write a simple pattern match to pass the matched type to someMethod:
clazz match {
case m <: MyClass => someMethod[m]
}
Unfortunately types are not really first class citizens in Scala. This means for example that you cannot do pattern matching on types. A lot of information is lost due to stupid type erasure inherited from the Java platform.
I don't know if there are any improvement requests for this, but this is one of the worst problems in my opinion, so someone should really come up with such a request.
The truth is you will need to pass around evidence parameters, at best in the form of implicit parameters.
The best I can think of goes in the line of
class PayLoad
trait LowPriMaybeCarry {
implicit def no[C] = new NoCarry[C]
}
object MaybeCarry extends LowPriMaybeCarry {
implicit def canCarry[C <: PayLoad](c: C) = new Carry[C]
}
sealed trait MaybeCarry[C]
final class NoCarry[C] extends MaybeCarry[C]
final class Carry[C <: PayLoad] extends MaybeCarry[C] {
type C <: PayLoad
}
class SomeClass[C <: PayLoad]
def test[C]( implicit mc: MaybeCarry[C]) : Option[SomeClass[_]] = mc match {
case c: Carry[_] => Some(new SomeClass[ c.C ])
case _ => None
}
but still I can't get the implicits to work:
test[String]
test[PayLoad] // ouch, not doin it
test[PayLoad](new Carry[PayLoad]) // sucks
So if you want to save yourself serious brain damage, I would forget about the project or look for another language. Maybe Haskell is better here? I'm still hoping that we can eventually match types, but my hopes are pretty low.
Maybe the guys from scalaz have come up with a solution, they pretty much exploited the type system of Scala to the limits.
Your code is not really clear, because at least in Java clazz is a typical name for variables of type java.lang.Class and its variations. I still believe that clazz is not an instance of Class but of your own class.
In Java and Scala, given an object o: AnyRef, you can get its class at runtime via o.getClass: Class[_] and, for instance, create instances of that class through the reflection API. However, type parameters are resolved at compile time, so you can't pass a class that is only known at runtime as a type argument. Either you use AnyRef all over the place as the type (which will work, I assume), or you use the reflection API if you have more advanced needs.
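One common form of the "evidence parameter" mentioned above is an implicit scala.reflect.ClassTag. The sketch below assumes a small hypothetical hierarchy (SubA, SubB) and a someMethod that only needs the runtime class; it shows that the type argument still has to be written out per case, because the match selects among statically known types.

```scala
import scala.reflect.ClassTag

class MyClass
class SubA extends MyClass
class SubB extends MyClass

object TypeEvidence {
  // The implicit ClassTag is the evidence: it carries T's runtime class.
  def someMethod[T <: MyClass](implicit tag: ClassTag[T]): String =
    tag.runtimeClass.getName

  // Each case names its type argument explicitly; the matched value's
  // runtime class cannot be turned into a type argument by itself.
  def dispatch(value: MyClass): String = value match {
    case _: SubA => someMethod[SubA]
    case _: SubB => someMethod[SubB]
    case _       => someMethod[MyClass]
  }
}
```

This does not make types first-class, but it avoids hand-written evidence classes when the runtime class is all the callee needs.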

What are the disadvantages to declaring Scala case classes?

If you're writing code that's using lots of beautiful, immutable data structures, case classes appear to be a godsend, giving you all of the following for free with just one keyword:
Everything immutable by default
Getters automatically defined
Decent toString() implementation
Compliant equals() and hashCode()
Companion object with unapply() method for matching
But what are the disadvantages of defining an immutable data structure as a case class?
What restrictions does it place on the class or its clients?
Are there situations where you should prefer a non-case class?
First the good bits:
Everything immutable by default
Yes, and can even be overridden (using var) if you need it
Getters automatically defined
Possible in any class by prefixing params with val
Decent toString() implementation
Yes, very useful, but doable by hand on any class if necessary
Compliant equals() and hashCode()
Combined with easy pattern-matching, this is the main reason that people use case classes
Companion object with unapply() method for matching
Also possible to do by hand on any class by using extractors
This list should also include the uber-powerful copy method, one of the best things to come to Scala 2.8
Then the bad, there are only a handful of real restrictions with case classes:
You can't define apply in the companion object using the same signature as the compiler-generated method
In practice, though, this is rarely a problem. Changing the behaviour of the generated apply method is guaranteed to surprise users and should be strongly discouraged. The only justification for doing so is to validate input parameters, a task best done in the main constructor body (which also makes the validation available when using copy).
You can't subclass
True, though it's still possible for a case class to itself be a descendant. One common pattern is to build up a class hierarchy of traits, using case classes as the leaf nodes of the tree.
It's also worth noting the sealed modifier. Any direct subclass of a trait with this modifier must be declared in the same file. When pattern-matching against instances of the trait, the compiler can then warn you if you haven't checked for all possible concrete subclasses. When combined with case classes, this can give you a very high level of confidence in your code if it compiles without warnings.
As a subclass of Product, case classes can't have more than 22 parameters
No real workaround, except to stop abusing classes with this many params :) (This limit applied up to Scala 2.10; it was lifted for case classes in Scala 2.11.)
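The point about validating in the main constructor body rather than in a custom apply can be illustrated with a short sketch. Percentage is a hypothetical example type; require is the standard Predef method and throws IllegalArgumentException when the condition fails.

```scala
// Validation lives in the primary constructor body, so it runs for
// apply, new, and copy alike.
final case class Percentage(value: Int) {
  require(value >= 0 && value <= 100, s"value must be in 0..100, got $value")
}
```

Percentage(50) succeeds, while both Percentage(-1) and Percentage(50).copy(value = 150) throw, which a validating apply in the companion alone would not guarantee for copy.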
Also...
One other restriction sometimes noted is that Scala doesn't (currently) support lazy params (like lazy vals, but as parameters). The workaround to this is to use a by-name param and assign it to a lazy val in the constructor. Unfortunately, by-name params don't mix with pattern matching, which prevents the technique being used with case classes as it breaks the compiler-generated extractor.
This is relevant if you want to implement highly-functional lazy data structures, and will hopefully be resolved with the addition of lazy params to a future release of Scala.
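The by-name workaround described above looks roughly like this on a plain (non-case) class. LazyCell and LazyCellDemo are hypothetical names used only for this sketch.

```scala
// A plain class: the by-name parameter is not evaluated at construction.
final class LazyCell[A](compute: => A) {
  // Evaluated at most once, on first access, then cached.
  lazy val value: A = compute
}

object LazyCellDemo {
  // Returns (evaluations before first access, evaluations after two accesses, value).
  def run(): (Int, Int, Int) = {
    var count = 0
    val cell = new LazyCell({ count += 1; 42 })
    val before = count
    val first = cell.value
    val second = cell.value
    (before, count, first + second - 42)
  }
}
```

Here the body runs zero times at construction and once in total after two accesses.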
One big disadvantage: a case class can't extend another case class. That's the restriction.
Other advantages you missed, listed for completeness: compliant serialization/deserialization, no need to use "new" keyword to create.
I prefer non-case classes for objects with mutable state, private state, or no state (e.g. most singleton components). Case classes for pretty much everything else.
I think the TDD principle applies here: do not over-design. When you declare something to be a case class, you are declaring a lot of functionality. That will decrease the flexibility you have in changing the class in the future.
For example, a case class has an equals method over the constructor parameters. You may not care about that when you first write your class, but later you may decide you want equality to ignore some of these parameters, or to do something a bit different. However, client code may have been written in the meantime that depends on case class equality.
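To make the equality point concrete, here is a small sketch contrasting the generated equals with a hand-written one. Account and AccountById are hypothetical types invented for this comparison.

```scala
// Case class: the generated equals compares every constructor parameter.
final case class Account(id: Int, displayName: String)

// Plain class: equality is whatever we define, here the id only.
final class AccountById(val id: Int, val displayName: String) {
  override def equals(other: Any): Boolean = other match {
    case that: AccountById => this.id == that.id
    case _                 => false
  }
  override def hashCode(): Int = id.hashCode
}
```

Account(1, "a") and Account(1, "b") are unequal, whereas two AccountById instances with the same id are equal regardless of displayName. Switching an existing case class to the second behaviour later would silently change results for clients that rely on the first.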
Are there situations where you should prefer a non-case class?
Martin Odersky gives us a good starting point in his course Functional Programming Principles in Scala (Lecture 4.6 - Pattern Matching) that we could use when we must choose between class and case class.
Chapter 7 of Scala By Example contains the same example.
Say, we want to write an interpreter for arithmetic expressions. To keep things simple initially, we restrict ourselves to just numbers and + operations. Such expressions can be represented as a class hierarchy, with an abstract base class Expr as the root, and two subclasses Number and Sum. Then, an expression 1 + (3 + 7) would be represented as
new Sum( new Number(1), new Sum( new Number(3), new Number(7)))
abstract class Expr {
def eval: Int
}
class Number(n: Int) extends Expr {
def eval: Int = n
}
class Sum(e1: Expr, e2: Expr) extends Expr {
def eval: Int = e1.eval + e2.eval
}
Furthermore, adding a new Prod class does not entail any changes to existing code:
class Prod(e1: Expr, e2: Expr) extends Expr {
def eval: Int = e1.eval * e2.eval
}
In contrast, adding a new method requires modifying all existing classes.
abstract class Expr {
def eval: Int
def print(): Unit
}
class Number(n: Int) extends Expr {
def eval: Int = n
def print(): Unit = Console.print(n)
}
class Sum(e1: Expr, e2: Expr) extends Expr {
def eval: Int = e1.eval + e2.eval
def print(): Unit = {
Console.print("(")
e1.print() // print each operand via its own print method
Console.print("+")
e2.print()
Console.print(")")
}
}
The same problem solved with case classes.
abstract class Expr {
def eval: Int = this match {
case Number(n) => n
case Sum(e1, e2) => e1.eval + e2.eval
}
}
case class Number(n: Int) extends Expr
case class Sum(e1: Expr, e2: Expr) extends Expr
Adding a new method is a local change.
abstract class Expr {
def eval: Int = this match {
case Number(n) => n
case Sum(e1, e2) => e1.eval + e2.eval
}
def print: Unit = this match {
case Number(n) => Console.print(n)
case Sum(e1,e2) => {
Console.print("(")
e1.print
Console.print("+")
e2.print
Console.print(")")
}
}
}
Adding a new Prod class potentially requires changing every pattern match.
abstract class Expr {
def eval: Int = this match {
case Number(n) => n
case Sum(e1, e2) => e1.eval + e2.eval
case Prod(e1,e2) => e1.eval * e2.eval
}
def print: Unit = this match {
case Number(n) => Console.print(n)
case Sum(e1,e2) => {
Console.print("(")
e1.print
Console.print("+")
e2.print
Console.print(")")
}
case Prod(e1,e2) => ...
}
}
Transcript from the videolecture 4.6 Pattern Matching
Both of these designs are perfectly fine and choosing between them is sometimes a matter of style, but then nevertheless there are some criteria that are important.
One criterion could be: are you more often creating new subclasses of expression, or are you more often creating new methods? So it's a criterion that looks at the future extensibility and the possible extension paths of your system.
If what you do is mostly creating new subclasses, then the object-oriented decomposition solution actually has the upper hand. The reason is that it's a very easy and very local change to just create a new subclass with an eval method, whereas in the functional solution you'd have to go back and change the code inside the eval method and add a new case to it.
On the other hand, if what you do is create lots of new methods, but the class hierarchy itself is kept relatively stable, then pattern matching is actually advantageous. Because, again, each new method in the pattern matching solution is just a local change, whether you put it in the base class or maybe even outside the class hierarchy. Whereas a new method such as show in the object-oriented decomposition would require a new implementation in each subclass. So there would be more parts that you have to touch.
So the problem of this extensibility in two dimensions, where you might want to add new classes to a hierarchy, or you might want to add new methods, or maybe both, has been named the expression problem.
Remember: we should use this as a starting point and not as the only criterion.
I am quoting this from the Scala Cookbook by Alvin Alexander, chapter 6: Objects.
This is one of the many things that I found interesting in this book.
To provide multiple constructors for a case class, it’s important to know what the case class declaration actually does.
case class Person (var name: String)
If you look at the code the Scala compiler generates for the case class example, you’ll see that it creates two output files, Person$.class and Person.class. If you disassemble Person$.class with the javap command, you’ll see that it contains an apply method, along with many others:
$ javap Person$
Compiled from "Person.scala"
public final class Person$ extends scala.runtime.AbstractFunction1 implements scala.ScalaObject,scala.Serializable{
public static final Person$ MODULE$;
public static {};
public final java.lang.String toString();
public scala.Option unapply(Person);
public Person apply(java.lang.String); // the apply method (returns a Person)
public java.lang.Object readResolve();
public java.lang.Object apply(java.lang.Object);
}
You can also disassemble Person.class to see what it contains. For a simple class like this, it contains an additional 20 methods; this hidden bloat is one reason some developers don’t like case classes.
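Since the generated apply lives in the companion object, one way to provide an additional "constructor" for a case class is to add another apply with a different signature there. The following is a minimal sketch of that idea; the no-argument overload and its placeholder name are assumptions for illustration, not taken from the quoted passage.

```scala
case class Person(var name: String)

object Person {
  // Extra factory method; its signature differs from the generated apply(String),
  // so it does not clash with it.
  def apply(): Person = new Person("<no name>")
}
```

Both Person("Al") and Person() then work without the new keyword.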

case class and traits

I want to create a special calculator. I think that case classes are a good idea for the operations:
sealed class Expr
case class add(op1:Int, op2:Int) extends Expr
case class sub(op1:Int, op2:Int) extends Expr
case class mul(op1:Int, op2:Int) extends Expr
case class div(op1:Int, op2:Int) extends Expr
case class sqrt(op:Int) extends Expr
case class neg(op:Int) extends Expr
/* ... */
Now I can use match-case for parsing input.
Maybe I should also use traits (e.g. trait Distributivity, trait Commutativity and so on). Is that possible? Is that a good idea?
Before you start adding traits of not-so-clear additional value, you should get the basics right. The way you do it now makes these classes not very useful, at least not when building a classical AST (or "parse tree"). Imagine 4 * (3+5). Before you can use the multiplication operation, you have to evaluate the addition first. That makes things complicated. What you usually want is the ability to write your formula "at once", e.g. Mul(4, Add(3, 5)). However, that won't work that way, as you can't get Ints or Doubles into your own class hierarchy. The usual solution is a wrapper class for numbers, say "Num". Then we have: Mul(Num(4), Add(Num(3), Num(5))). This might look complicated, but now you have "all at once" and you can do things like introducing constants and variables, simplification (e.g. Mul(Num(1), x) --> x), derivation...
To get this, you need something along the lines
sealed trait Expr {
def eval:Int
}
case class Num(n:Int) extends Expr {
def eval = n
}
case class Neg(e: Expr) extends Expr {
def eval = -e.eval
}
case class Add(e1: Expr, e2: Expr) extends Expr {
def eval = e1.eval + e2.eval
}
...
Now you can write a parser that turns "4*(3+5)" into Mul(Num(4), Add(Num(3), Num(5))), and get the result by calling eval on that expression.
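Filling in the elided parts, a complete version of this hierarchy might look like the sketch below. Mul and the Simplifier object are additions for illustration: Mul follows the pattern of Add, and simplify implements only the Mul(Num(1), x) --> x rule mentioned earlier (in both operand positions), applied bottom-up.

```scala
sealed trait Expr {
  def eval: Int
}
final case class Num(n: Int) extends Expr { def eval: Int = n }
final case class Neg(e: Expr) extends Expr { def eval: Int = -e.eval }
final case class Add(e1: Expr, e2: Expr) extends Expr { def eval: Int = e1.eval + e2.eval }
final case class Mul(e1: Expr, e2: Expr) extends Expr { def eval: Int = e1.eval * e2.eval }

// Hypothetical simplifier: rewrites children first, then drops a factor of 1.
object Simplifier {
  def simplify(e: Expr): Expr = e match {
    case n: Num    => n
    case Neg(x)    => Neg(simplify(x))
    case Add(a, b) => Add(simplify(a), simplify(b))
    case Mul(a, b) =>
      (simplify(a), simplify(b)) match {
        case (Num(1), x) => x
        case (x, Num(1)) => x
        case (x, y)      => Mul(x, y)
      }
  }
}
```

Because Expr is sealed, the compiler can check that simplify covers every node type, which is where the case class approach pays off once more operations are added.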
Scala already has a parsing library called parser combinators. For an example close to the code above, see http://jim-mcbeath.blogspot.com/2008/09/scala-parser-combinators.html