Exercise: implementing Stream in Scala

I am following the book Functional Programming in Scala, in particular the section where you implement a simple Stream trait and companion object. For reference, here is what we have so far in the companion object:
object Stream {
  def empty[A]: Stream[A] =
    new Stream[A] {
      def uncons = None
    }
  def cons[A](hd: => A, tl: => Stream[A]): Stream[A] =
    new Stream[A] {
      lazy val uncons = Some((hd, tl))
    }
  def apply[A](as: A*): Stream[A] =
    if (as.isEmpty)
      empty
    else
      cons(as.head, apply(as.tail: _*))
}
and the trait so far:
trait Stream[A] {
  import Stream._
  def uncons: Option[(A, Stream[A])]
  def toList: List[A] = uncons match {
    case None => Nil: List[A]
    case Some((a, as)) => a :: as.toList
  }
  def #::(a: => A) = cons(a, this)
  def take(n: Int): Stream[A] =
    if (n <= 0)
      empty
    else (
      uncons
      map { case (a, as) => a #:: (as take (n - 1)) }
      getOrElse empty
    )
}
The next exercise requires me to write an implementation for takeWhile, and I thought the following would do:
def takeWhile(f: A => Boolean): Stream[A] = (
  uncons
  map { case (a, as) => if (f(a)) (a #:: (as takeWhile f)) else empty }
  getOrElse empty
)
Unfortunately, it seems that I get a variance error that I am not able to track down:
error: type mismatch; found : Stream[_2] where type _2 <: A
required: Stream[A]
Note: _2 <: A, but trait Stream is invariant in type A.
You may wish to define A as +A instead. (SLS 4.5)
getOrElse empty
^
I could add a variance annotation, but before doing that I would like to understand what is going wrong here. Any suggestions?

This seems to be an issue with type inference, because it works if you explicitly specify the type of the subexpression uncons map { case (a, as) => if (f(a)) (a #:: (as takeWhile f)) else empty }.
def takeWhile(f: A => Boolean): Stream[A] = {
  val mapped: Option[Stream[A]] = uncons map {
    case (a, as) => if (f(a)) (a #:: (as takeWhile f)) else empty
  }
  mapped getOrElse empty
}

To add a bit to the other answer, the empty on this line:
map { case (a, as) => if (f(a)) (a #:: (as takeWhile f)) else empty }
is inferred as empty[Nothing], which means that the whole expression if (f(a)) (a #:: (as takeWhile f)) else empty is inferred as Stream[Foo <: A]. Since a Stream[A] is expected and Stream is invariant, you get an error.
So this gives us the cleanest way to fix this: just annotate empty:
map { case (a, as) => if (f(a)) (a #:: (as takeWhile f)) else empty[A] }
And then it compiles fine.
This does not happen with the original Stream because it is covariant, so either you actually want Stream.empty to be a Stream[Nothing] (just like Nil is a List[Nothing]), or you don't care.
Now, as to exactly why it is inferred as empty[Nothing] and not empty[A], this is probably hidden somewhere in SLS 6.26.4 "Local Type Inference", but this part cannot really be accused of being easy to read...
As a rule of thumb, always be suspicious whenever you call methods:
that have type parameters which can only be inferred from the expected return type (usually because they take no arguments),
AND whose expected return type is itself supposed to be inferred from somewhere else.
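The fix described above can be put together into a small self-contained sketch. It reuses the invariant Stream from the question, wrapped in a hypothetical object named InvariantStreamSketch so it does not clash with other definitions. The explicit result types on uncons are my own addition for readability, and cons is called directly instead of #::.

```scala
object InvariantStreamSketch {
  // Invariant in A, as in the question.
  trait Stream[A] {
    import Stream._

    def uncons: Option[(A, Stream[A])]

    def toList: List[A] = uncons match {
      case None          => Nil
      case Some((a, as)) => a :: as.toList
    }

    // Annotating empty[A] pins the type of both branches to Stream[A],
    // so nothing is left for the compiler to infer as Nothing.
    def takeWhile(f: A => Boolean): Stream[A] =
      uncons
        .map { case (a, as) => if (f(a)) cons(a, as.takeWhile(f)) else empty[A] }
        .getOrElse(empty[A])
  }

  object Stream {
    def empty[A]: Stream[A] =
      new Stream[A] { def uncons: Option[(A, Stream[A])] = None }

    def cons[A](hd: => A, tl: => Stream[A]): Stream[A] =
      new Stream[A] { lazy val uncons: Option[(A, Stream[A])] = Some((hd, tl)) }

    def apply[A](as: A*): Stream[A] =
      if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
  }
}
```

For example, InvariantStreamSketch.Stream(2, 4, 5, 6).takeWhile(_ % 2 == 0).toList yields List(2, 4).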

Related

Is it possible to pattern match on a by-name parameter without evaluating it?

Was playing with Lazy Structure Stream as below
import Stream._
sealed trait Stream[+A] {
..
def toList: List[A] = this match {
case Empty => Nil
case Cons(h, t) => println(s"${h()}::t().toList"); h()::t().toList
}
def foldRight[B](z: B) (f: ( A, => B) => B) : B = this match {
case Empty => println(s"foldRight of Empty return $z"); z
case Cons(h, t) => println(s"f(${h()}, t().foldRight(z)(f))"); f(h(), t().foldRight(z)(f))
}
..
}
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
object Stream {
def cons[A](h: => A, t: => Stream[A]): Stream[A] = {
lazy val hd = h
lazy val tl = t
Cons[A](() => hd, () => tl)
}
def empty[A]: Stream[A] = Empty
def apply[A](la: A*): Stream[A] = la match {
case list if list.isEmpty => empty[A]
case _ => cons(la.head, apply(la.tail:_*))
}
}
For a function takeWhile via foldRight, I initially wrote:
def takeWhileFoldRight_0(p: A => Boolean) : Stream[A] = {
foldRight(empty[A]) {
case (a, b) if p(a) => println(s"takeWhileFoldRight cons($a, b) with p(a) returns: cons($a, b)"); cons(a, b)
case (a, b) if !p(a) => println(s"takeWhileFoldRight cons($a, b) with !p(a) returns: empty[A]"); empty[A]
}
}
Which when called as:
Stream(4,5,6).takeWhileFoldRight_0(_%2 == 0).toList
results in the following trace:
f(4, t().foldRight(z)(f))
f(5, t().foldRight(z)(f))
f(6, t().foldRight(z)(f))
foldRight of Empty return Empty
takeWhileFoldRight cons(6, b) with p(a) returns: cons(6, b)
takeWhileFoldRight cons(5, b) with !p(a) returns: empty[A]
takeWhileFoldRight cons(4, b) with p(a) returns: cons(4, b)
4::t().toList
res2: List[Int] = List(4)
After a lot of questioning, I figured that it might be the unapply method in the pattern match that evaluates eagerly.
So I changed it to:
def takeWhileFoldRight(p: A => Boolean) : Stream[A] = {
foldRight(empty[A]) { (a, b) =>
if (p(a)) cons(a, b) else empty[A]
}
}
which when called as
Stream(4,5,6).takeWhileFoldRight(_%2 == 0).toList
results in the following trace:
f(4, t().foldRight(z)(f))
4::t().toList
f(5, t().foldRight(z)(f))
res1: List[Int] = List(4)
Hence my question:
Is there a way to recover the power of pattern matching when working with by-name parameters?
Said differently, can I match on by-name parameters without evaluating them eagerly?
Or do I have to resort to a set of ugly nested "if"s in that kind of scenario?
Take a closer look at this fragment:
def toList: List[A] = this match {
case Empty => Nil
case Cons(h, t) => println(s"${h()}::t().toList"); h()::t().toList
}
def foldRight[B](z: B) (f: ( A, => B) => B) : B = this match {
case Empty => println(s"foldRight of Empty return $z"); z
case Cons(h, t) => println(s"f(${h()}, t().foldRight(z)(f))"); f(h(), t().foldRight(z)(f))
}
..
}
Here h and t in Cons aren't evaluated by unapply - after all unapply returns () => X functions without calling them. But you do. Twice for each match - once for printing and once for passing the result on. And you aren't remembering the result, so any future fold, map, etc would evaluate the function anew.
Depending on what behavior you want to have you should either:
Calculate the results once, right after matching them:
case Cons(h, t) =>
val hResult = h()
val tResult = t()
println(s"${hResult}::tail.toList")
hResult :: tResult.toList
or
not use case class because it cannot memoize the result and you might need to memoize it:
class Cons[A](fHead: () => A, fTail: () => Stream[A]) extends Stream[A] {
lazy val head: A = fHead()
lazy val tail: Stream[A] = fTail()
// also override: toString, equals, hashCode, ...
}
object Cons {
def apply[A](head: => A, tail: => Stream[A]): Stream[A] =
new Cons(() => head, () => tail)
def unapply[A](stream: Stream[A]): Option[(A, Stream[A])] = stream match {
case cons: Cons[A] => Some((cons.head, cons.tail)) // matches on type, doesn't use unapply
case _ => None
}
}
If you understand what you're doing you could also create a case class with overridden apply and unapply (like above) but that is almost always a signal that you shouldn't use a case class in the first place (because most likely toString, equals, hashCode, etc would have nonsensical implementation).
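The second option above (a plain class with lazy vals) can be tried out in isolation. The following is a minimal sketch, not the answerer's full code: the wrapper object MemoizedConsSketch, the standalone toList helper, and the demo method with its evaluation counter are all hypothetical names added for illustration.

```scala
object MemoizedConsSketch {
  sealed trait Stream[+A]
  case object Empty extends Stream[Nothing]

  // A plain class rather than a case class, so head and tail can be memoized.
  final class Cons[+A](fHead: () => A, fTail: () => Stream[A]) extends Stream[A] {
    lazy val head: A = fHead()
    lazy val tail: Stream[A] = fTail()
  }

  object Cons {
    def apply[A](head: => A, tail: => Stream[A]): Stream[A] =
      new Cons(() => head, () => tail)

    // Note: this extractor forces head and tail of the matched cell,
    // but the lazy vals make sure each thunk runs at most once.
    def unapply[A](s: Stream[A]): Option[(A, Stream[A])] = s match {
      case c: Cons[A] => Some((c.head, c.tail))
      case _          => None
    }
  }

  def toList[A](s: Stream[A]): List[A] = s match {
    case Cons(h, t) => h :: toList(t)
    case _          => Nil
  }

  // Hypothetical check: counts how often the head expression is evaluated
  // when the same stream is traversed twice.
  def demo(): (List[Int], List[Int], Int) = {
    var evaluations = 0
    val s = Cons({ evaluations += 1; 1 }, Empty)
    val first = toList(s)
    val second = toList(s)
    (first, second, evaluations)
  }
}
```

Calling MemoizedConsSketch.demo() returns (List(1), List(1), 1): the stream is traversed twice, yet the head expression runs only once.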

How does Stream process elements incrementally?

I am trying to understand how Stream works and have the following Stream implementation:
sealed trait Stream[+A] {
  def toList: List[A] = {
    @annotation.tailrec
    def go(s: Stream[A], acc: List[A]): List[A] = s match {
      case Cons(h, t) => go(t(), h() :: acc)
      case _ => acc
    }
    go(this, List()).reverse
  }
  def foldRight[B](z: => B)(f: (A, => B) => B): B =
    this match {
      case Cons(h, t) => f(h(), t().foldRight(z)(f))
      case _ => z
    }
  def map[B](f: A => B): Stream[B] =
    this.foldRight(Stream.empty[B])((x, y) => Stream.cons(f(x), y))
  def filter(f: A => Boolean): Stream[A] =
    this.foldRight(Stream.empty[A])((h, t) => if (f(h)) Stream.cons(h, t) else t)
}
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
object Stream {
def cons[A](hd: => A, t1: => Stream[A]): Stream[A] = {
lazy val head = hd
lazy val tail = t1
Cons(() => head, () => tail)
}
def empty[A]: Stream[A] = Empty
def apply[A](as: A*): Stream[A] =
if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
}
and the code that is using Stream:
Stream(1,2,3,4).map((x) => {
println(x)
x + 10
}).filter((x) => {
println(x)
x % 2 == 0
}).toList
as output I've got:
1
11
2
12
3
13
4
14
res4: List[Int] = List(12, 14)
As you can see from the output, there is no intermediate result: the source elements are passed through one at a time. How is that possible?
I cannot imagine how it works.
Let's take a look at what the methods you used do on Stream:
map and filter are both implemented with foldRight. To make it clearer, let's inline foldRight inside map (the same can be done with filter), using the referential transparency principle:
def map[B](f: A => B) = this match {
case Cons(h, t) => Stream.cons(f(h()), t().map(f))
case _ => Empty
}
Now, where in this code is f evaluated? Never, since Stream.cons parameters are call-by-name, so we only give the description for the new stream, not its values.
Once you are convinced of this fact, you can easily see that the same will apply for filter, so we can move forward to toList.
It will evaluate each element in the Stream, putting the values in a List that will be reversed at the end.
But evaluating an element of the Stream which has been filtered and mapped is precisely reading the description of the values, so the actual functions are evaluated here. Hence the console output in order: first the map function is called then the filter function, for each element, one at a time (since we are now on the lazily mapped and filtered Stream).
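To observe this interleaving without relying on console output, here is a small sketch that records each call in a log. The Stream definitions are copied from the question; the wrapper object InterleavingSketch, the demo method and the log buffer are hypothetical additions for illustration.

```scala
import scala.collection.mutable.ListBuffer

object InterleavingSketch {
  sealed trait Stream[+A] {
    def toList: List[A] = {
      @annotation.tailrec
      def go(s: Stream[A], acc: List[A]): List[A] = s match {
        case Cons(h, t) => go(t(), h() :: acc)
        case _          => acc
      }
      go(this, List()).reverse
    }

    // The second argument of f is by-name, so the rest of the fold
    // is only computed if f actually uses it.
    def foldRight[B](z: => B)(f: (A, => B) => B): B = this match {
      case Cons(h, t) => f(h(), t().foldRight(z)(f))
      case _          => z
    }

    def map[B](f: A => B): Stream[B] =
      foldRight(Stream.empty[B])((x, y) => Stream.cons(f(x), y))

    def filter(f: A => Boolean): Stream[A] =
      foldRight(Stream.empty[A])((h, t) => if (f(h)) Stream.cons(h, t) else t)
  }
  case object Empty extends Stream[Nothing]
  case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]

  object Stream {
    def cons[A](hd: => A, t1: => Stream[A]): Stream[A] = {
      lazy val head = hd
      lazy val tail = t1
      Cons(() => head, () => tail)
    }
    def empty[A]: Stream[A] = Empty
    def apply[A](as: A*): Stream[A] =
      if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
  }

  // Runs the pipeline from the question and returns the result
  // together with the order in which the two functions were called.
  def demo(): (List[Int], List[String]) = {
    val log = ListBuffer.empty[String]
    val result = Stream(1, 2, 3, 4)
      .map { x => log += s"map $x"; x + 10 }
      .filter { x => log += s"filter $x"; x % 2 == 0 }
      .toList
    (result, log.toList)
  }
}
```

The log returned by demo() alternates between the two functions (map 1, filter 11, map 2, filter 12, and so on), which matches the console output shown in the question.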

Have to specify the parametric type

I have a Stream trait that contains the following methods:
sealed trait Stream[+A] {
  def takeWhile2(f: A => Boolean): Stream[A] =
    this.foldRight(Stream.empty[A])((x, y) => {
      if (f(x)) Stream.cons(x, y) else Stream.empty
    })
  @annotation.tailrec
  final def exists(p: A => Boolean): Boolean = this match {
    case Cons(h, t) => p(h()) || t().exists(p)
    case _ => false
  }
}
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
object Stream {
def cons[A](hd: => A, t1: => Stream[A]): Stream[A] = {
lazy val head = hd
lazy val tail = t1
Cons(() => head, () => tail)
}
def empty[A]: Stream[A] = Empty
def apply[A](as: A*): Stream[A] =
if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
}
Take a look at the body of takeWhile2: it calls the foldRight function.
When I pass Stream.empty instead of Stream.empty[A], I get a compiler error. Why?
That's because foldRight infers its type parameter from its first parameter list (i.e. its zero element).
Since this first element is Stream.empty, the type inferred is Stream[Nothing], and so it expects the second parameter to be a (A, Stream[Nothing]) => Stream[Nothing], which is clearly not the case.
The same issue is true with any fold operator on collections, Option, ...
That's because Stream.empty[A] pins the result type of the fold to Stream[A]. If you don't specify the type parameter, it defaults to Nothing, so the zero element is a Stream[Nothing]. The Stream[A] that is built when f(x) is true then doesn't match the Stream[Nothing] that foldRight expects to get back.
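The remark that the same issue shows up with any fold operator can be checked on the standard library's List. This is a minimal sketch; the object name FoldInferenceSketch and the value names are hypothetical.

```scala
object FoldInferenceSketch {
  val xs: List[Int] = List(1, 2, 3, 4)

  // Typed zero: B is fixed to List[Int] by the first parameter list.
  val evens1: List[Int] =
    xs.foldRight(List.empty[Int])((x, acc) => if (x % 2 == 0) x :: acc else acc)

  // Alternative: pass the type parameter explicitly and keep Nil as the zero.
  val evens2: List[Int] =
    xs.foldRight[List[Int]](Nil)((x, acc) => if (x % 2 == 0) x :: acc else acc)

  // With an untyped zero such as xs.foldRight(Nil)(...), Scala 2 infers B
  // from Nil alone, and the function body no longer type-checks.
}
```

Both variants give List(2, 4). Which one to prefer is mostly a matter of taste; the typed zero keeps the annotation next to the value it describes.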

scala Stream.takeWhile

I am implementing the takeWhile method of a Stream trait via foldRight.
My foldRight is the following:
trait Stream[+A] {
def foldRight[B](z: => B)(f: (A, => B) => B): B =
uncons.map(t => {
f(t._1, t._2.foldRight(z)(f))
}).getOrElse(z)
}
My takeWhile is
def takeWhile(p: A => Boolean): Stream[A] =
uncons.filter(t => p(t._1)).map(t => Stream.cons(t._1, t._2.takeWhile(p))).getOrElse(Stream.empty)
But I want it to be implemented via foldRight. Here is the code:
def takeWhileViaFoldRight(p: A => Boolean): Stream[A] =
foldRight(Stream.empty)((x, acc) => {
if (p(x)) Stream.cons(x, acc) else Stream.empty
})
But my x in Stream.cons expression is underlined red with the following error: type mismatch; found : x.type (with underlying type A) required: Nothing. I guess this is because foldRight start value is Stream.empty -- with no type A indicated hence considered to be Nothing. If this is the case -- how can I tell foldRight that its return value is A, not Nothing? If not -- what's the problem then?
Courtesy of jdevelop's comment:
foldRight(Stream.empty[A])
will do the trick.
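Putting the fix in context, here is a minimal sketch of the uncons-based trait with takeWhileViaFoldRight. The question does not show the companion object, so empty, cons and apply below are assumptions (modelled on the usual book-style definitions), as are the toList helper and the wrapper object TakeWhileFoldRightSketch.

```scala
object TakeWhileFoldRightSketch {
  trait Stream[+A] {
    def uncons: Option[(A, Stream[A])]

    def foldRight[B](z: => B)(f: (A, => B) => B): B =
      uncons.map(t => f(t._1, t._2.foldRight(z)(f))).getOrElse(z)

    // The typed zero fixes B to Stream[A]; the else branch can then
    // use a bare Stream.empty because its expected type is known.
    def takeWhileViaFoldRight(p: A => Boolean): Stream[A] =
      foldRight(Stream.empty[A])((x, acc) =>
        if (p(x)) Stream.cons(x, acc) else Stream.empty)

    // Assumed helper, only used to inspect the result.
    def toList: List[A] =
      foldRight(List.empty[A])((x, acc) => x :: acc)
  }

  // Assumed companion; not part of the original question.
  object Stream {
    def empty[A]: Stream[A] =
      new Stream[A] { def uncons: Option[(A, Stream[A])] = None }

    def cons[A](hd: => A, tl: => Stream[A]): Stream[A] =
      new Stream[A] { lazy val uncons: Option[(A, Stream[A])] = Some((hd, tl)) }

    def apply[A](as: A*): Stream[A] =
      if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
  }
}
```

For example, TakeWhileFoldRightSketch.Stream(2, 4, 5, 6).takeWhileViaFoldRight(_ % 2 == 0).toList gives List(2, 4).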

Correct encoding of this existential type in Scala?

I'm interested in encoding this Stream type from the Stream Fusion paper from Coutts et al. I'm exploring stream fusion in Scala, attempting to use macros in place of GHC's rewrite rules.
data Stream a = ∃s. Stream (s → Step a s) s
data Step a s = Done
| Yield a s
| Skip s
I've tried a few different approaches but I'm not sure how to encode the type of Stream in Scala such that both occurrences of S refer to the same type. The Step type was easy to write:
sealed abstract class Step[+A, +S]
case object Done extends Step[Nothing, Nothing]
case class Yield[A, S](a: A, s: S) extends Step[A, S]
case class Skip[S](s: S) extends Step[Nothing, S]
So far this type seems correct. I've used covariance so that a function of type A => A will work even if we receive a Yield and return a Done or Skip. Just like in Haskell.
My sticking point has been the signature of Stream. I've been attempting to define it as just a case class. The only signature that has worked so far uses an Exists type operator and a Tuple to preserve the equality of type S in both components, as below.
type Exists[P[_]] = P[T] forSome { type T }
case class Stream[A](t: Exists[({ type L[S] = (S => Step[A, S], S)})#L])
Is there a way to encode it such that the tuple is not needed? Something closer to Haskell's encoding (assuming an existential operator), like this:
case class Stream(∃ S. f: S => Step[A, S], s: S)
where each member can be a separate field.
It also occurs to me that I could encode this in an SML Module/Functor style like so:
trait Stream[A] {
type S <: AnyRef
val f: S => Step[A, S]
val s: S
}
object Stream {
def apply[A, S1 <: AnyRef](next: S1 => Step[A, S1], st: S1): Stream[A] = new Stream[A] {
type S = S1
val f = next
val s = st
}
def unapply[A](s: Stream[A]): Option[(s.f.type, s.s.type)] = Some(s.f, s.s)
}
but this is a little more complicated. I was hoping there is a clearer way that I am unaware of. Also, as I explored this path, I had to do a few things to satisfy the compiler, such as adding the AnyRef bound, and the unapply method doesn't work. It fails with this error message from scalac:
scala> res2 match { case Stream(next, s) => (next, s) }
<console>:12: error: error during expansion of this match (this is a scalac bug).
The underlying error was: type mismatch;
found : Option[(<unapply-selector>.f.type, <unapply-selector>.s.type)]
required: Option[(s.f.type, s.s.type)]
res2 match { case Stream(next, s) => (next, s) }
^
First off, Step looks perfect to me. As for Stream, I think you're on the right track with the abstract type. Here's what I came up with (including implementations of the remaining methods in section 2.1 of the Coutts paper):
abstract class Stream[A] {
protected type S
def next: S => Step[A, S]
def state: S
def map[B](f: A => B): Stream[B] = {
val next: S => Step[B, S] = this.next(_) match {
case Done => Done
case Skip(s) => Skip(s)
case Yield(a, s) => Yield(f(a), s)
}
Stream(next, state)
}
def unstream: List[A] = {
def unfold(s: S): List[A] = next(s) match {
case Done => List.empty
case Skip(s) => unfold(s)
case Yield(a, s) => a :: unfold(s)
}
unfold(state)
}
}
object Stream {
def apply[A, S0](n: S0 => Step[A, S0], s: S0) = new Stream[A] {
type S = S0
val next = n
val state = s
}
def apply[A](as: List[A]): Stream[A] = {
val next: List[A] => Step[A, List[A]] = {
case a :: as => Yield(a, as)
case Nil => Done
}
Stream(next, as)
}
def unapply[A](s: Stream[A]): Option[(s.S => Step[A, s.S], s.S)] =
Some((s.next, s.state))
}
A couple things to note:
My unapply has a dependent method type: it depends on the s.S. I think that might have been your stumbling block.
The unfold method in unstream is not tail-recursive.
The thing I'm still not really clear on myself is why it's important for S to be existential / hidden / whatever. If it's not, you could just write:
case class Stream[A, S](next: S => Step[A, S], state: S)
... but I assume there's a reason for it. That being said, I'm also not sure this approach actually hides S the way you want. But this is my story and I'm sticking to it.
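As a follow-up to the note that unfold is not tail-recursive, here is a minimal sketch of the same abstract-type encoding with an accumulator-based unstream. It is a variation on the code above, not the answerer's version: the wrapper object FusionSketch and the method names make and fromList are hypothetical (chosen to avoid overloading apply), and the type member S is left public for simplicity.

```scala
object FusionSketch {
  sealed abstract class Step[+A, +S]
  case object Done extends Step[Nothing, Nothing]
  final case class Yield[A, S](a: A, s: S) extends Step[A, S]
  final case class Skip[S](s: S) extends Step[Nothing, S]

  abstract class Stream[A] { self =>
    type S // the hidden state type; callers only see Stream[A]
    def next: S => Step[A, S]
    def state: S

    def map[B](f: A => B): Stream[B] = {
      val step: S => Step[B, S] = s0 => self.next(s0) match {
        case Done        => Done
        case Skip(s)     => Skip(s)
        case Yield(a, s) => Yield(f(a), s)
      }
      Stream.make(step, state)
    }

    // Accumulates in reverse and flips at the end, so the loop is a tail call.
    def unstream: List[A] = {
      @annotation.tailrec
      def loop(s: S, acc: List[A]): List[A] = next(s) match {
        case Done         => acc.reverse
        case Skip(s1)     => loop(s1, acc)
        case Yield(a, s1) => loop(s1, a :: acc)
      }
      loop(state, Nil)
    }
  }

  object Stream {
    def make[A, S0](n: S0 => Step[A, S0], s: S0): Stream[A] = new Stream[A] {
      type S = S0
      val next: S0 => Step[A, S0] = n
      val state: S0 = s
    }

    def fromList[A](as: List[A]): Stream[A] = {
      val step: List[A] => Step[A, List[A]] = {
        case a :: rest => Yield(a, rest)
        case Nil       => Done
      }
      make(step, as)
    }
  }
}
```

For example, FusionSketch.Stream.fromList(List(1, 2, 3)).map(_ + 1).unstream gives List(2, 3, 4).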