How can I make my immutable binary search-tree generic in Scala?

I am a newcomer to Scala, and I'm trying to develop my own immutable binary search tree.
First I developed a binary search tree that holds Int values in its nodes. After that, I decided to make the binary search tree generic.
When I compiled the code below, I got the following error message in the terminal.
trait GenericBST[+T] {
def add[TT >: T](x: T): GenericBST[TT] = this match {
case Empty => Branch(x, Empty, Empty)
case Branch(d, l, r) =>
if(d > x) Branch(d, l.add(x), r)
else if(d < x) Branch(d, l, r.add(x))
else this
}
}
case class Branch[+T](x: T, left: GenericBST[T], right: GenericBST[T]) extends GenericBST[T]
case object Empty extends GenericBST[Nothing]
error: value < is not a member of type parameter T
The error makes sense, but how can I fix it?
Please keep in mind that I am new to Scala, so a detailed explanation would help.

T can stand for any type, but in order to use > and < you need a type for which an ordering makes sense.
In Scala terms, you have to constrain the type parameter so that the method only accepts types for which an Ordering instance exists. You can use a context bound or, equivalently, require an implicit ord of type Ordering[B], where B is the element type accepted by add (TT in your code).
trait GenericBST[+A] {
def add[B >: A](x: B)(implicit ord: Ordering[B]): GenericBST[B] = {
import ord.mkOrderingOps
this match {
case Empty => Branch(x, Empty, Empty)
case Branch(e, l, r) =>
if (e > x) Branch(e, l.add(x), r)
else if (e < x) Branch(e, l, r.add(x))
else this
}
}
}
case class Branch[+A](x: A, left: GenericBST[A], right: GenericBST[A]) extends GenericBST[A]
case object Empty extends GenericBST[Nothing]
Importing ord.mkOrderingOps allows for the syntax
e > x
instead of
ord.gt(e, x)
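As a quick check of the definition above, here is a minimal, self-contained usage sketch. It is an illustration rather than part of the original answer: it calls ord.gt and ord.lt directly instead of importing mkOrderingOps, marks the trait sealed, wraps everything in a hypothetical BstSketch object, and adds a hypothetical toList helper (in-order traversal) so the contents can be inspected.

```scala
object BstSketch {
  sealed trait GenericBST[+A] {
    // Same logic as the answer, written with explicit Ordering calls.
    def add[B >: A](x: B)(implicit ord: Ordering[B]): GenericBST[B] = this match {
      case Empty => Branch(x, Empty, Empty)
      case Branch(e, l, r) =>
        if (ord.gt(e, x)) Branch(e, l.add(x), r)      // x is smaller: go left
        else if (ord.lt(e, x)) Branch(e, l, r.add(x)) // x is larger: go right
        else this                                     // x is already present
    }

    // Hypothetical helper (not in the original): in-order traversal.
    def toList: List[A] = this match {
      case Empty => Nil
      case Branch(e, l, r) => l.toList ::: (e :: r.toList)
    }
  }
  case class Branch[+A](x: A, left: GenericBST[A], right: GenericBST[A]) extends GenericBST[A]
  case object Empty extends GenericBST[Nothing]

  // The same tree type works for any element type that has an Ordering.
  val ints: GenericBST[Int] = Empty.add(5).add(2).add(8).add(2)
  val words: GenericBST[String] = Empty.add("pear").add("apple").add("fig")
}
```

Because the in-order traversal of a binary search tree visits elements in ascending order, toList is a convenient way to see that duplicates are ignored and that the ordering is respected.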
You could also use a context bound directly, but it would require some extra work to get the implicit ord in scope (and it's arguably less readable):
def add[B >: A : Ordering](x: B): GenericBST[B] = {
val ord = implicitly[Ordering[B]]
import ord.mkOrderingOps
...
}
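For completeness, a full version of the context-bound variant might look like the sketch below. It is only one possible way to write it: instead of importing mkOrderingOps it summons the instance with Ordering[B] (the apply method on the Ordering companion) and compares once per node. The contains method and the ContextBoundSketch wrapper object are hypothetical additions for illustration.

```scala
object ContextBoundSketch {
  sealed trait GenericBST[+A] {
    def add[B >: A : Ordering](x: B): GenericBST[B] = this match {
      case Empty => Branch(x, Empty, Empty)
      case Branch(e, l, r) =>
        // Ordering[B] summons the implicit instance required by the context bound.
        val c = Ordering[B].compare(x, e)
        if (c < 0) Branch(e, l.add(x), r)
        else if (c > 0) Branch(e, l, r.add(x))
        else this
    }

    // Hypothetical helper (not in the original): membership test.
    def contains[B >: A : Ordering](x: B): Boolean = this match {
      case Empty => false
      case Branch(e, l, r) =>
        val c = Ordering[B].compare(x, e)
        if (c < 0) l.contains(x) else if (c > 0) r.contains(x) else true
    }
  }
  case class Branch[+A](x: A, left: GenericBST[A], right: GenericBST[A]) extends GenericBST[A]
  case object Empty extends GenericBST[Nothing]

  val tree: GenericBST[Int] = Empty.add(3).add(1).add(4)
}
```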
Absolutely not relevant, but you might be wondering why I used A and B in my example, as opposed to T and TT. According to the official style guide:
For simple type parameters, a single upper-case letter (from the English alphabet) should be used, starting with A (this is different than the Java convention of starting with T)

Related

type stable parametric polymorphism

I don't understand why the following scala code doesn't compile:
sealed trait A
case class B() extends A {
def funcB: B = this
}
case class C() extends A {
def funcC: C = this
}
def f[T <: A](s:T): T = s match {
case s: B => s.funcB
case s: C => s.funcC
}
It works to replace f with
def f[T <: A](s:T): A = s match {
case s: B => s.funcB
case s: C => s.funcC
}
and then cast to the subtype when f is called, using asInstanceOf, for example. But I would like to be able to construct a function which unifies some previously defined methods, and have them be type stable. Can anyone please explain?
Also, note that the following f also compiles:
def f[T <: A](s:T): T = s match {
case s: B => s
case s: C => s
}
What makes it work?
In Scala 3, in particular, you could use match types:
scala> type Foo[T <: A] = T match {
| case B => B
| case C => C
| }
|
| def f[T <: A](s:T): Foo[T] = s match {
| case s: B => s.funcB
| case s: C => s.funcC
| }
def f[T <: A](s: T): Foo[T]
scala> f(B())
val res0: B = B()
scala> f(C())
val res1: C = C()
In general, for the solution to "return current type" problems see Scala FAQ How can a method in a superclass return a value of the “current” type?
Compile-time techniques such as type classes and match types can be seen as a kind of compile-time pattern matching: they instruct the compiler to reduce to the most specific, most informative type used at the call site, instead of falling back to a less informative upper-bound type.
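For Scala 2, where match types are not available, the type class approach mentioned above could be sketched as follows. The Stable type class, its instance names, and the TypeClassSketch wrapper object are hypothetical names chosen for this illustration; the idea is simply that the compiler picks the instance for the concrete T at the call site, so no run-time match is needed inside f.

```scala
object TypeClassSketch {
  sealed trait A
  case class B() extends A { def funcB: B = this }
  case class C() extends A { def funcC: C = this }

  // Hypothetical type class: one instance per subtype that f should support.
  trait Stable[T <: A] { def apply(t: T): T }
  object Stable {
    implicit val forB: Stable[B] = new Stable[B] { def apply(t: B): B = t.funcB }
    implicit val forC: Stable[C] = new Stable[C] { def apply(t: C): C = t.funcC }
  }

  // The instance is resolved at compile time, so the result type stays T.
  def f[T <: A](s: T)(implicit ev: Stable[T]): T = ev(s)

  val b: B = f(B())
  val c: C = f(C())
}
```

The trade-off is that f can only be called where the static type of the argument is one of the supported subtypes; a value typed merely as A has no instance and is rejected at compile time.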
Why does it not work?
The key concept is that parametric polymorphism is a kind of universal quantification: the definition must make sense to the compiler for all instantiations of the type parameters at call sites. Consider the type signature
def f[T <: A](s: T): T
which the compiler might interpret something like so
For all types T that are a subtype of A, then f should return that
particular subtype T.
hence the expression expr representing the body of f
def f[T <: A](s:T): T = expr
must have that particular type T. Now let's try to type our expr
s match {
case s: B => s.funcB
case s: C => s.funcC
}
The type of
case s: B => s.funcB
is B, and the type of
case s: C => s.funcC
is C. Given B and C, the compiler has to take the least upper bound of the two, which is A. But A is certainly not always T, hence the typecheck fails.
Now let's do the same exercise with
def f[T <: A](s: T): A
This specification means (and observe the "for all" again)
For all types T that are a subtype of A, then f should return
their supertype A.
Now let's type the method body expression
s match {
case s: B => s.funcB
case s: C => s.funcC
}
As before we arrive at types B and C, so the compiler takes the least upper bound, which is the supertype A. This is exactly the return type we specified, so the typecheck succeeds. However, despite succeeding, we have lost some type information at compile time: the compiler no longer considers everything that comes with the specific T passed in at the call site, only what is available via its supertype A. For example, if T has a member that does not exist in A, we will not be able to call it on the result.
What to avoid?
Regarding asInstanceOf, this is us telling the compiler to stop helping us because we will take the reins. Two groups of people tend to use it in Scala to make things work: the mad-scientist library authors and those transitioning from other, more dynamically typed languages. In most application-level code, however, it is considered bad practice.
It all comes down to our old friend (fiend?) the compile-time/run-time barrier. (And ne'er the twain shall meet.)
T is resolved at compile time at the call site. When the compiler sees f(B()) then T means B, and when the compiler sees f(C()) then T means C.
But match { case ... is resolved at run-time. The compiler can't know which case branch will be chosen. From the compiler's point of view all case options are equally likely. So if T is resolved to B but the code might take a C branch...well, the compiler can't allow that.
Looking at what does compile:
def f[T <: A](s:T): A = s match { //f() returns an A
case s: B => s.funcB //B is an A sub-type
case s: C => s.funcC //C is an A sub-type
} //OK, all is good
Your 2nd "also works" example does not compile for me.
To answer the question of why it does not work:
f returns the result of the expression s match {...}.
The type of that expression is A (sometimes it evaluates to a B, and sometimes to a C), not T as it is supposed to be. T is sometimes C and sometimes B, but the static type of s match {...} is never either of those. It is their common supertype, which is A.
Re. this:
s match {
case s: B => s
case s: C => s
}
The type of this expression is obviously T, because s is a T. It certainly does compile, despite what @jwvh might be saying :)

Scala: type inference issue

I have:
sealed trait Par[A]{def foo = ???}
case class Unit[A](v: () => A) extends Par[A]
case class Map2[L, R, A](parLeft: Par[L],
parRight: Par[R],
map: (L, R) => A) extends Par[A]
My problem is that when I pattern match on p: Par[A] to do something like this:
def getSequentially: A = this match {
case Par.Unit(f) => f()
case Par.Map2(a, b, f) => f(a.getSequentially, b.getSequentially)
}
L and R are inferred to be Any in IntelliJ's type inspector, and the getSequentially calls are highlighted in red with the warning: type mismatch, expected Nothing, actual Any, since f is expected to be of type (Nothing, Nothing) => A. The code nevertheless compiles and runs.
I think I understand what the problem is, and I should be able to solve it with existential types in the definition of Map2. The only problem is that the map parameter has a dependent type, so I don't know how to do that. Maybe I should use a variation of the Aux pattern?
My questions are, firstly, why it compiles and, secondly, whether there is a way of restructuring the type dependency so that it no longer issues a warning.
If you want to have existentials you can use typed patterns:
sealed trait Par[A]{
def getSequentially: A = this match {
case x: Par.Unit[A] => x.v()
case x: Par.Map2[_, _, A] => x.map(x.parLeft.getSequentially, x.parRight.getSequentially)
}
}
object Par {
case class Unit[A](v: () => A) extends Par[A]
case class Map2[L, R, A](parLeft: Par[L],
parRight: Par[R],
map: (L, R) => A) extends Par[A]
}
IntelliJ seems not to highlight this.
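A closely related variation, offered here only as a sketch, is to name the hidden types with lowercase type variable patterns (l and r below) instead of wildcards. That makes it explicit that parLeft, parRight and map all talk about the same two unknown types. The ParSketch wrapper object and the sample values are hypothetical.

```scala
object ParSketch {
  sealed trait Par[A] {
    def getSequentially: A = this match {
      case u: Par.Unit[A] => u.v()
      // l and r are type variables bound by the pattern; they stand for the
      // unknown L and R of this particular Map2 instance.
      case m: Par.Map2[l, r, A] =>
        m.map(m.parLeft.getSequentially, m.parRight.getSequentially)
    }
  }
  object Par {
    case class Unit[A](v: () => A) extends Par[A]
    case class Map2[L, R, A](parLeft: Par[L],
                             parRight: Par[R],
                             map: (L, R) => A) extends Par[A]
  }

  // Usage: combine two constant computations and run them sequentially.
  val sum: Par[Int] =
    Par.Map2(Par.Unit(() => 1), Par.Unit(() => 2), (a: Int, b: Int) => a + b)
  val result: Int = sum.getSequentially
}
```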

Restricting a method to a subset of values of the caller's type parameter

Summary: I want to add an instance method to instances of a parametrized type, but only for some values of the type parameter. Specifically, I have List[E], but I only want instances of List[List[_]] to have a flatten() method.
I am learning the basics of Scala and functional programming by following along with the exercises in Functional Programming in Scala by Chiusano & Bjarnason.
Suppose I have a type List[E] and a companion object List that has methods for working with instances of List[E].
sealed trait List[+E]
case object Nil extends List[Nothing]
case class Cons[+E](head: E, tail: List[E]) extends List[E]
object List {
def flatten[E](aListOfLists: List[List[E]]): List[E] = Nil
def foldLeft[E, F](aList: List[E])(acc: F)(f: (F, E) ⇒ F): F = acc
}
Now suppose I want to create analogous methods on List instances that simply forward the calls to the companion object. I would try to augment the trait definition as follows.
sealed trait List[+E] {
def foldLeft[F](acc: F)(f: (F, E) => F) = List.foldLeft(this)(acc)(f)
}
I run into a complication: List.foldLeft() works with any List[E], but List.flatten() expects a List[List[E]] argument. Thus, I only want instances of List[List[_]] to have this method. How can I add flatten() to the appropriate subset of List instances? How do I use Scala's type system to express this restriction?
We can build up what we need piece by piece. First we know that we need a type parameter for our flatten, since we don't otherwise have a way to refer to the inner element type:
sealed trait List[+E] {
def flatten[I] // ???
}
Next we need some way of establishing that our E is List[I]. We can't add constraints to E itself, since in many cases it won't be List[I] for any I, but we can require implicit evidence that this relationship must hold if we want to be able to call flatten:
sealed trait List[+E] {
def flatten[I](implicit ev: E <:< List[I]) = ???
}
Note that for reasons related to variance (and type inference) we need to use <:< instead of =:=.
Next we can add the return type, which we know must be List[I]:
sealed trait List[+E] {
def flatten[I](implicit ev: E <:< List[I]): List[I] = ???
}
Now we want to be able to call List.flatten on a List[List[I]]. Our ev allows us to convert values of type E into List[I], but we don't have E values, we just have a List[E]. There are a number of ways you could fix this, but I'll just go ahead and define a map method and use that:
sealed trait List[+E] {
def map[B](f: E => B): List[B] = this match {
case Nil => Nil
case Cons(h, t) => Cons(f(h), t.map(f))
}
def flatten[I](implicit ev: E <:< List[I]): List[I] = List.flatten(map(ev))
}
And then:
val l1 = Cons(1, Cons(2, Nil))
val l2 = Cons(3, Cons(4, Cons(5, Nil)))
val nested = Cons(l1, Cons(l2, Nil))
val flattened: List[Int] = nested.flatten
This won't actually work, since your List.flatten is broken, but it should when you fix it.
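To see the whole thing run, here is one way the pieces could be put together with a non-stub List.flatten. This is a sketch under assumptions: the foldRight and append helpers are hypothetical additions (they are not in the question), and everything is wrapped in a FlattenSketch object so the custom List, Nil and Cons do not clash with the standard library ones.

```scala
object FlattenSketch {
  sealed trait List[+E] {
    def map[B](f: E => B): List[B] = this match {
      case Nil => Nil
      case Cons(h, t) => Cons(f(h), t.map(f))
    }
    // Only callable when the compiler can prove that E is a List[I].
    def flatten[I](implicit ev: E <:< List[I]): List[I] = List.flatten(map(ev))
  }
  case object Nil extends List[Nothing]
  case class Cons[+E](head: E, tail: List[E]) extends List[E]

  object List {
    // Hypothetical helpers used to give flatten a real implementation.
    def foldRight[E, F](l: List[E], z: F)(f: (E, F) => F): F = l match {
      case Nil => z
      case Cons(h, t) => f(h, foldRight(t, z)(f))
    }
    def append[E](a: List[E], b: List[E]): List[E] =
      foldRight(a, b)((h, acc) => Cons(h, acc))
    def flatten[E](ll: List[List[E]]): List[E] =
      foldRight(ll, Nil: List[E])((l, acc) => append(l, acc))
  }

  val l1 = Cons(1, Cons(2, Nil))
  val l2 = Cons(3, Cons(4, Cons(5, Nil)))
  val nested = Cons(l1, Cons(l2, Nil))
  val flattened: List[Int] = nested.flatten
}
```

Calling flatten on a list whose elements are not lists, for example l1.flatten, is rejected at compile time because no evidence of type Int <:< List[I] can be found.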

Deriving type class instances for case classes with exactly one field

I'm working on a CSV parsing library (tabulate). It uses simple type classes for encoding / decoding: encoding, for example, is done with instances of CellEncoder (to encode a single cell) and RowEncoder (to encode entire rows).
Using shapeless, I've found it pretty straightforward to automatically derive the following type class instances:
RowEncoder[A] if A is a case class whose fields all have a CellEncoder.
RowEncoder[A] if A is an ADT whose alternatives all have a RowEncoder.
CellEncoder[A] if A is an ADT whose alternatives all have a CellEncoder.
The thing is, this last one turns out to be almost entirely useless in real life situations: an ADT's alternatives are almost always case classes, and I cannot derive a CellEncoder for a case class that has more than one field.
What I'd like to be able to do, however, is derive a CellEncoder for case classes that have a single field whose type has a CellEncoder. That would cover, for example, Either, scalaz's \/, cats' Xor...
This is what I have so far:
implicit def caseClass1CellEncoder[A, H](implicit gen: Generic.Aux[A, H :: HNil], c: CellEncoder[H]): CellEncoder[A] =
CellEncoder((a: A) => gen.to(a) match {
case h :: t => c.encode(h)
})
This works fine when used explicitly:
case class Bar(xs: String)
caseClass1CellEncoder[Bar, String]
res0: tabulate.CellEncoder[Bar] = tabulate.CellEncoder$$anon$2@7941904b
I can't however get it to work implicitly, the following fails:
implicitly[CellEncoder[Bar]]
>> could not find implicit value for parameter e: tabulate.CellEncoder[Test.this.Bar]
I've also tried the following, with no more success:
implicit def testEncoder[A, H, R <: H :: HNil](implicit gen: Generic.Aux[A, R], c: CellEncoder[H]): CellEncoder[A] =
CellEncoder((a: A) => gen.to(a) match {
case h :: t => c.encode(h)
})
Am I missing something? Is what I'm trying to do even possible?
It's a little tricky to get the H inferred correctly, but you can do it with a <:< instance:
import shapeless._
case class CellEncoder[A](encode: A => String)
implicit val stringCellEncoder: CellEncoder[String] = CellEncoder(identity)
implicit val intCellEncoder: CellEncoder[Int] = CellEncoder(_.toString)
case class Bar(xs: String)
implicit def caseClass1CellEncoder[A, R, H](implicit
gen: Generic.Aux[A, R],
ev: R <:< (H :: HNil),
c: CellEncoder[H]
): CellEncoder[A] = CellEncoder(
(a: A) => ev(gen.to(a)) match {
case h :: t => c.encode(h)
}
)
(I've made up a simple CellEncoder for the sake of a complete working example.)
This works because R can be inferred when the compiler is looking for a Generic.Aux[A, R] instance, and R can then guide the inference of H when the compiler looks for a value for ev.

Creating sum tree of binary tree scala

For a homework assignment I wrote some scala code in which I have the following classes and object (used for modeling a binary tree):
object Tree {
def fold[B](t: Tree, e: B, n: (Int, B, B) => B): B = t match {
case Node(value, l, r) => n(value,fold(l,e,n),fold(r,e,n))
case _ => e
}
def sumTree(t: Tree): Tree =
fold(t, Nil(), (a, b: Tree, c: Tree) => {
val left = b match {
case Node(value, _, _) => value
case _ => 0
}
val right = c match {
case Node(value, _, _) => value
case _ => 0
}
Node(a+left+right,b,c)
})
}
abstract case class Tree
case class Node(value: Int, left: Tree, right: Tree) extends Tree
case class Nil extends Tree
My question is about the sumTree function, which creates a new tree in which each node's value equals the sum of the values of its children plus its own value.
I find it rather ugly, and I wonder if there is a better way to do this. If I used top-down recursion this would be easier, but I could not come up with such a function.
I have to implement the fold function, with the signature shown in the code, and use it to calculate sumTree.
I have the feeling this can be implemented in a better way. Do you have any suggestions?
First of all, if I may say so, I believe you've done a very good job. I can suggest a couple of slight changes to your code:
abstract class Tree
case class Node(value: Int, left: Tree, right: Tree) extends Tree
case object Nil extends Tree
Tree doesn't need to be a case class. Besides, having a case class as a non-leaf class of the hierarchy (one that other case classes extend) is deprecated, because the automatically generated methods can behave erroneously.
Nil is a singleton and is best defined as a case object instead of a case class.
Additionally, consider marking the superclass Tree as sealed. sealed tells the compiler that the class can only be extended from within the same source file. This lets the compiler emit a warning whenever a match expression on a Tree is not exhaustive, in other words when it doesn't cover all possible cases.
sealed abstract class Tree
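To illustrate what sealed buys you, here is a small sketch with a hypothetical size helper (not part of the question) wrapped in a hypothetical SealedSketch object. Because Tree is sealed, the compiler knows that Node and Nil are the only possibilities; if one of the two cases below were removed, it would report that the match may not be exhaustive.

```scala
object SealedSketch {
  sealed abstract class Tree
  case class Node(value: Int, left: Tree, right: Tree) extends Tree
  case object Nil extends Tree

  // Hypothetical helper: counts the nodes of a tree.
  // Both subclasses are covered, so the match is exhaustive.
  def size(t: Tree): Int = t match {
    case Node(_, l, r) => 1 + size(l) + size(r)
    case Nil => 0
  }
}
```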
The next couple of improvements can be made to sumTree:
def sumTree(t: Tree) = {
// create a helper function to extract Tree value
val nodeValue: Tree=>Int = {
case Node(v,_,_) => v
case _ => 0
}
// parametrise fold with Tree to aid type inference further down the line
fold[Tree](t,Nil,(acc,l,r)=>Node(acc + nodeValue(l) + nodeValue(r) ,l,r))
}
The nodeValue helper function can also be defined as a method (the alternative notation I used above is possible because a sequence of cases in curly braces is treated as a function literal):
def nodeValue (t:Tree) = t match {
case Node(v,_,_) => v
case _ => 0
}
The next small improvement is parametrising the fold call with Tree (fold[Tree]). Because the Scala type inferencer works through the expression sequentially from left to right, telling it early that we're going to deal with Trees lets us omit type annotations in the function literal that is passed to fold further on.
So here is the full code including suggestions:
sealed abstract class Tree
case class Node(value: Int, left: Tree, right: Tree) extends Tree
case object Nil extends Tree
object Tree {
def fold[B](t: Tree, e: B, n: (Int, B, B) => B): B = t match {
case Node(value, l, r) => n(value,fold(l,e,n),fold(r,e,n))
case _ => e
}
def sumTree(t: Tree) = {
val nodeValue: Tree=>Int = {
case Node(v,_,_) => v
case _ => 0
}
fold[Tree](t,Nil,(acc,l,r)=>Node(acc + nodeValue(l) + nodeValue(r) ,l,r))
}
}
The recursion you came up with is the only possible direction that lets you traverse the tree and produce a modified copy of the immutable data structure. Any leaf nodes have to be created first before being added to the root, because individual nodes of the tree are immutable and all objects necessary to construct a node have to be known before the construction: leaf nodes need to be created before you can create root node.
As Vlad writes, your solution has about the only general shape you can have with such a fold.
Still there is a way to get rid of the node value matching, not only factor it out. And personally I would prefer it that way.
You use match because not every result you get from a recursive fold carries a sum with it. Yes, not every Tree can carry it, Nil has no place for a value, but your fold is not limited to Trees, is it?
So let's have:
case class TreePlus[A](value: A, tree: Tree)
Now we can fold it like this:
def sumTree(t: Tree) = fold[TreePlus[Int]](t, TreePlus(0, Nil), (v, l, r) => {
  val sum = v + l.value + r.value
  TreePlus(sum, Node(sum, l.tree, r.tree))
}).tree
Of course the TreePlus is not really needed as we have the canonical product Tuple2 in the standard library.
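A sketch of that Tuple2 variant, assuming the sealed Tree, Node and Nil definitions and the fold signature from the question (repeated here inside a hypothetical TupleFoldSketch object so the block is self-contained): the first component of each pair carries the subtree sum, the second the rebuilt tree.

```scala
object TupleFoldSketch {
  sealed abstract class Tree
  case class Node(value: Int, left: Tree, right: Tree) extends Tree
  case object Nil extends Tree

  def fold[B](t: Tree, e: B, n: (Int, B, B) => B): B = t match {
    case Node(value, l, r) => n(value, fold(l, e, n), fold(r, e, n))
    case Nil => e
  }

  // Each fold result is a pair: (sum of the subtree, rebuilt subtree).
  def sumTree(t: Tree): Tree =
    fold[(Int, Tree)](t, (0, Nil), (v, l, r) => {
      val sum = v + l._1 + r._1
      (sum, Node(sum, l._2, r._2))
    })._2
}
```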
Your solution is probably more efficient (certainly uses less stack), but here's a recursive solution, fwiw
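def sum(tree: Tree): Tree = tree match {
  case Nil => Nil
  case Node(a, b, c) =>
    // Sum the subtrees first, then build the new node from their totals.
    val left = sum(b)
    val right = sum(c)
    Node(a + total(left) + total(right), left, right)
}
// Value stored at the root of a tree, or 0 for an empty tree.
def total(tree: Tree): Int = tree match {
  case Nil => 0
  case Node(a, _, _) => a
}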
You've probably turned in your homework already, but I think it's still worth pointing out that the way your code (and the code in other people's answers) looks is a direct result of how you modeled the binary trees. If, instead of using an algebraic data type (Tree, Node, Nil), you had gone with a recursive type definition, you wouldn't have had to use pattern matching to decompose your binary trees. Here's my definition of a binary tree:
case class Tree[A](value: A, left: Option[Tree[A]], right: Option[Tree[A]])
As you can see there's no need for Node or Nil here (the latter is just glorified null anyway - you don't want anything like this in your code, do you?).
With such a definition, fold is essentially a one-liner:
def fold[A,B](t: Tree[A], z: B)(op: (A, B, B) => B): B =
op(t.value, t.left map (fold(_, z)(op)) getOrElse z, t.right map (fold(_, z)(op)) getOrElse z)
And sumTree is also short and sweet:
def sumTree(tree: Tree[Int]) = fold(tree, None: Option[Tree[Int]]) { (value, left, right) =>
Some(Tree(value + valueOf(left, 0) + valueOf(right, 0), left , right))
}.get
where valueOf helper is defined as:
def valueOf[A](ot: Option[Tree[A]], df: A): A = ot map (_.value) getOrElse df
No pattern matching needed anywhere - all because of a nice recursive definition of binary trees.
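Putting the pieces of this answer together, a self-contained sketch with a small worked example might look like this. The OptionTreeSketch wrapper object and the sample tree are hypothetical, and method-call syntax is used instead of infix notation purely for readability.

```scala
object OptionTreeSketch {
  case class Tree[A](value: A, left: Option[Tree[A]], right: Option[Tree[A]])

  // A missing child contributes the neutral value z.
  def fold[A, B](t: Tree[A], z: B)(op: (A, B, B) => B): B =
    op(t.value,
       t.left.map(fold(_, z)(op)).getOrElse(z),
       t.right.map(fold(_, z)(op)).getOrElse(z))

  def valueOf[A](ot: Option[Tree[A]], df: A): A = ot.map(_.value).getOrElse(df)

  def sumTree(tree: Tree[Int]): Tree[Int] = {
    val folded = fold(tree, None: Option[Tree[Int]]) { (value, left, right) =>
      Some(Tree(value + valueOf(left, 0) + valueOf(right, 0), left, right))
    }
    folded.get // safe here: the function literal always returns Some for the root
  }

  val sample = Tree(1, Some(Tree(2, None, None)), Some(Tree(3, None, None)))
  val summed = sumTree(sample)
}
```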