What is meant by not generating the answer lazily in this code? - scala

I came across the following code:
/*
Unlike `take`, `drop` is not incremental. That is, it doesn't generate the
answer lazily. It must traverse the first `n` elements of the stream eagerly.
*/
#annotation.tailrec
final def drop(n: Int): Stream[A] = this match {
case Cons(_, t) if n > 0 => t().drop(n - 1)
case _ => this
}
/*
`take` first checks if n==0. In that case we need not look at the stream at all.
*/
def take(n: Int): Stream[A] = this match {
case Cons(h, t) if n > 1 => cons(h(), t().take(n - 1))
case Cons(h, _) if n == 1 => cons(h(), empty)
case _ => empty
}
Could someone explain what is meant by the comment:
Unlike take, drop is not incremental. That is, it doesn't generate the
answer lazily. It must traverse the first n elements of the stream eagerly.
To me, it looks like both the drop and take functions have to traverse the first n elements of the stream eagerly? What is it about the drop function that causes the first n elements to be eagerly traversed?
(Full code context here: https://github.com/fpinscala/fpinscala/blob/master/answers/src/main/scala/fpinscala/laziness/Stream.scala)

The definition for Cons is:
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
Notice that the second parameter, t, takes a function (from Unit to Stream[A]), not the evaluation of that function. This is not evaluated until required, and hence is lazy, as is the take method that calls it.
Compare this to drop which calls t() itself rather than passing it into the Cons, forcing the immediate evaluation.

The key point is that cons is lazy. That is if the recursion is inside of cons, the recursion won't happen until the tail of the generated list is actually accessed. Whereas if the recursion is outside, it happens right away.
So drop is eager because the recursion is not inside a cons (or any other lazy construct).

Related

Why is this function not tail recursive?

I'm writing a simple contains method for a list like structure. I want it to be optimized for tail recursion but can't figure out why the compiler is complaining.
The Cons case is tail recursive but not the repeat case even though they're making the same call on the same data structure. Maybe I'm not understanding tail recursion properly. If someone could clear this up I would a grateful.
final def contains[B >: A](target: B): Boolean = this match{
case Empty => false
case Cons( h, t ) => h == target || t.contains( target )
case Repeat( _, l ) => l.contains( target )
}
A tail-recursive function is defined as one whose last statement is either returning a plain value or calling itself, and that has to be the only recursive call.
First, your function doesn't have recursive calls, because you are not calling contains function again. But, you are calling the contains method of another instance.
That can be solved, by moving the logic outside the class.
However, there is another common problem.
This: h == target || t.contains( target ) is not a tail-recursive call, since the last operation is not the call to contains but the or (||) executed with its result.
Here is how you may refactor it.
def contains[A](list: MyList[A])(target: A): Boolean = {
#annotation.tailrec
def loop(remaining: MyList[A]): Boolean =
remaining match {
case Empty => false
case Cons(h, t) => if (h == target) true else loop(remaining = t)
case Repeat(_, l) => loop(remaining = l)
}
loop(remaining = list)
}
If you still want the method on your class, you can forward it to call this helper function passing this as the initial value.

Is there a concept for 'fold with break' or 'find with accumulator' in functional programming?

Title says it all, really; iterating over collection while preserving state between loops and finishing iteration based on termination condition in addition to simply running out of elements may be the most common pattern to accomplish anything in imperative programming. It seems to me however like it's something functional gentleprogrammers agreed to not talk about, or at least I never encountered an idiom for it or a semi-standarized name such as with map, fold, reduce, etc.
I often use the followinig code in scala:
implicit class FoldWhile[T](private val items :Iterable[T]) extends AnyVal {
def foldWhile[A](start :A)(until :A=>Boolean)(op :(A, T)=>A) :A = {
if (until(start)) start
else {
var accumulator = start
items.find{ e => accumulator = op(accumulator, e); until(accumulator) }
accumulator
}
}
}
But it's ugly. Whenever I try a more declarative approach, I come with even longer and almost surely slower code, akin to:
Iterator.iterate((start, items.iterator)){
case (acc, i) if until(acc) => (acc, i)
case (acc, i) if i.hasNext => (op(acc, i.next()), i)
case x => x
}.dropWhile {
case (acc, i) => !until(acc) && i.hasNext
}.next()._1
(A more functional variant would use Lists or Streams, but iterators have arguably lesser overhead than converting items to a Stream, as default implementation for the latter uses an iterator underneath anyway).
My questions are:
1) Does this concept have a name in functional programming, and if so, what is the pattern associated with its implementation?
2) What would be the best (i.e. concise, generic, lazy, and with least overhead) way to implememnt it in scala?
This is frowned upon by scala purists, but you can use a return statement like this:
def foldWhile[A](zero: A)(until:A => Boolean)(op: (A,T) => A): A = items.fold(zero) {
case (a, b) if until(a) => return a
case (a,b) => op(a, b)
}
Or, if you are one of those frowning, and would like a purely functional solution without dirty imperative tricks, you can use something lazy, like an iterator or a stream:
items
.toStream // or .iterator - it doesn't really matter much in this case
.scanLeft(zero)(op)
.find(until)
The functional way of doing such things is via Tail Recursion:
implicit class FoldWhile[T](val items: Iterable[T]) extends AnyVal {
def foldWhile[A](zero: A)(until: A => Boolean)(op: (A, T) => A): A = {
#tailrec def loop(acc: A, remaining: Iterable[T]): A =
if (remaining.isEmpty || !until(acc)) acc else loop(op(acc, remaining.head), remaining.tail)
loop(zero, items)
}
}
Using recursion you can decide at each step if you want to proceed or not without using break and without any overhead, because tail recursions are converted to iterations from the compiler.
Also, pattern matching is often used to decompose sequences. For example, if you had a List you could do:
implicit class FoldWhile[T](val items: List[T]) extends AnyVal {
def foldWhile[A](zero: A)(until: A => Boolean)(op: (A, T) => A): A = {
#tailrec def loop(acc: A, remaining: List[T]): A = remaining match {
case Nil => acc
case _ if !until(acc) => acc
case h :: t => loop(op(acc, h), t)
}
loop(zero, items)
}
}
Scala has the #scala.annotation.tailrec annotation to force compilation to fail if the function you're annotating is not tail recursive. I suggest you use it as much as you can because it helps both to avoid errors and document the code.
The functional name for this is an Iteratee.
There are a bunch of references about this, but it's probably better to start from where the design ended up by reading the Pipes Tutorial and only if you're interested working backwards from there to see how it came from an early terminating left fold.
A right fold, when done lazily, can do early termination. In Haskell, for example, you can write the find function (return first element of list that satisfies predicate) with foldr:
find :: (a -> Bool) -> [a] -> Maybe a
find p = foldr (\a r -> if p a then Just a else r) Nothing
-- For reference:
foldr :: (a -> r -> r) -> r -> [a] -> r
foldr _ z [] = []
foldr f z (a:as) = f a (foldr f z as)
What happens when you try, say, find even [1..]? (Note that this is an infinite list!)
find even [1..]
= foldr (\a r -> if even a then Just a else r) Nothing [1..]
= if even 1
then Just 1
else foldr (\a r -> if even a then Just a else r) Nothing ([2..])
= if False
then Just 1
else foldr (\a r -> if even a then Just a else r) Nothing ([2..])
= foldr (\a r -> if even a then Just a else r) Nothing ([2..])
= if even 2
then Just 2
else foldr (\a r -> if even a then Just a else r) Nothing ([3..])
= if True
then Just 2
else foldr (\a r -> if even a then Just a else r) Nothing ([3..])
= Just 2
Laziness means that the function that we fold with (\a r -> if even a then Just a else r) gets to decide whether to force the r argument—the one whose evaluation requires us to recurse down the list—at all. So when even 2 evaluates to True, we pick the branch of the if ... then ... else ... that discards the result computed off the tail of the list—which means we never evaluate it. (It also runs in constant space as well. While programmers in eager functional languages learn to avoid foldr because of space and termination issues, those aren't always true in lazy languages!)
This of course hinges on the fact that Haskell is lazily evaluated, but it should nevertheless be possible to simulate this in an eager language like Scala—I do know it has a lazy val feature that might be usable for this. It looks like you'd need to write a lazyFold function that does a right fold, but the recursion happens inside a lazy value. You might still have problems with space usage, though.

Common pattern of scan() where I don't ultimately care about the state

I find myself constantly doing things like the following:
val adjustedActions = actions.scanLeft((1.0, null: CorpAction)){
case ((runningSplitAdj, _), action) => action match {
case Dividend(date, amount) =>
(runningSplitAdj, Dividend(date, amount * runningSplitAdj))
case s # Split(date, sharesForOne) =>
((runningSplitAdj * sharesForOne), s)
}
}
.drop(1).map(_._2)
Where I need to accumulate the runningSplitAdj, in this case, in order to correct the dividends in the actions list. Here, I use scan to maintain the state that I need in order to correct the actions, but in the end, I only need the actions. Hence, I need to use null for the initial action in the state, but in the end, drop that item and map away all the states.
Is there a more elegant way of structuring these? In the context of RxScala Observables, I actually made a new operator to do this (after some help from the RxJava mailing list):
implicit class ScanMappingObs[X](val obs: Observable[X]) extends AnyVal {
def scanMap[S,Y](f: (X,S) => (Y,S), s0: S): Observable[Y] = {
val y0: Y = null.asInstanceOf[Y]
// drop(1) because scan also emits initial state
obs.scan((y0, s0)){case ((y, s), x) => f(x, s)}.drop(1).map(_._1)
}
}
However, now I find myself doing it to Lists and Vectors too, so I wonder if there is something more general I can do?
The combinator you're describing (or at least something very similar) is often called mapAccum. Take the following simplified use of scanLeft:
val xs = (1 to 10).toList
val result1 = xs.scanLeft((1, 0.0)) {
case ((acc, _), i) => (acc + i, i.toDouble / acc)
}.tail.map(_._2)
This is equivalent to the following (which uses Scalaz's implementation of mapAccumLeft):
xs.mapAccumLeft[Double, Int](1, {
case (acc, i) => (acc + i, i.toDouble / acc)
})._2
mapAccumLeft returns a pair of the final state and a sequence of the results at each step, but it doesn't require you to specify a spurious initial result (that will just be ignored and then dropped), and you don't have to map over the entire collection to get rid of the state—you just take the second member of the pair.
Unfortunately mapAccumLeft isn't available in the standard library, but if you're looking for a name or for ideas about implementation, this is a place to start.

Functional way to loop over nested list

I was given a question to compare two trees. Something like below:
case class Node(elem:String, child:List[Node])
In order to compare each elements of the trees, I have following functions:
def compare(n1:Node, n2:Node): Boolean {
if(n1.elem == n2.elem){
return compare(n1.child, n2.child)
}
}
def compare(c1:List[Node], c2:List[Node]): Boolean {
while (c1.notEmpty) {
//filter, map etc call function above to compare the element recursively
}
}
Basically algorithm is for each elements in n1, we are checking for a match in n2. I was told that this is very imperative way and not functional way. What would be a functional way to achieve this behaviour. In other words, how do we remove while loop when comparing the list of children?
Consider zipping both lists and using forall which holds true only if each and every predicate it processes evaluates to true; for instance like this,
def compare(c1: List[Node], c2: List[Node]): Boolean =
(c1.sorted zip c2.sorted).forall(c => compare(c._1,c._2))
Note that forall will halt the evaluations as it encounters the first false. Cases of unequal length lists of nodes may be tackled with zipAll by defining an EmptyNode class for padding length differences; also both lists empty may compare to true.
Update
Sorted lists of nodes for soundness following comment by #JohnB.
If I understood your question correctly, you want to compare every element of the first list with every element of the second list. The following code achieves this. It gets rid of the while loop via a tail-recursion.
import scala.annotation.tailrec
def cmp(a:Int, b:Int) = a > b
#tailrec
def compare(xs: List[Int], ys: List[Int]): Boolean = xs match {
case Nil => true
case head :: tail if ys.forall(x => cmp(head, x)) => compare(tail, ys)
case _ => false
}

Mapping many Eithers to one Either with many

Say I have a monadic function in called processOne defined like this:
def processOne(input: Input): Either[ErrorType, Output] = ...
Given a list of "Inputs", I would like to return a corresponding list of "Outputs" wrapped in an Either:
def processMany(inputs: Seq[Input]): Either[ErrorType, Seq[Output]] = ...
processMany will call processOne for each input it has, however, I would like it to terminate the first time (if any) that processOne returns a Left, and return that Left, otherwise return a Right with a list of the outputs.
My question: what is the best way to implement processMany? Is it possible to accomplish this behavior using a for expression, or is it going to be necessary for me to iterate the list myself recursively?
With Scalaz 7:
def processMany(inputs: Seq[Input]): Either[ErrorType, Seq[Output]] =
inputs.toStream traverseU processOne
Converting inputs to a Stream[Input] takes advantage of the non-strict traverse implementation for Stream, i.e. gives you the short-circuiting behaviour you want.
By the way, you tagged this "monads", but traversal requires only an applicative functor (which, as it happens, is probably defined in terms of the monad for Either). For further reference, see the paper The Essence of the Iterator Pattern, or, for a Scala-based interpretation, Eric Torreborre's blog post on the subject.
The easiest with standard Scala, which doesn't evaluate more than is necessary, would probably be
def processMany(inputs: Seq[Input]): Either[ErrorType, Seq[Output]] = {
Right(inputs.map{ x =>
processOne(x) match {
case Right(r) => r
case Left(l) => return Left(l)
}
})
}
A fold would be more compact, but wouldn't short-circuit when it hit a left (it'd just keep carrying it along while you iterated through the entire input).
For now, I've decided to just solve this using recursion, as I am reluctant to add a dependency to a library (Scalaz).
(Types and names in my application have been changed here in order to appear more generic)
def processMany(inputs: Seq[Input]): Either[ErrorType, Seq[Output]] = {
import scala.annotation.tailrec
#tailrec
def traverse(acc: Vector[Output], inputs: List[Input]): Either[ErrorType, Seq[Output]] = {
inputs match {
case Nil => Right(acc)
case input :: more =>
processOne(input) match {
case Right(output) => traverse(acc :+ output, more)
case Left(e) => Left(e)
}
}
}
traverse(Vector[Output](), inputs.toList)
}