Use of Scala by-name parameters - scala

I am going through the book "Functional Programming in Scala" and have run across an example that I don't fully understand.
In the chapter on strictness/laziness the authors describe the construction of Streams and have code like this:
sealed trait Stream[+A]
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
object Stream {
def cons[A](hd: => A, tl: => Stream[A]) : Stream[A] = {
lazy val head = hd
lazy val tail = tl
Cons(() => head, () => tail)
}
...
}
The question I have is in the smart constructor (cons) where it calls the constructor for the Cons case class. The specific syntax being used to pass the head and tail vals doesn't make sense to me. Why not just call the constructor like this:
Cons(head, tail)
As I understand the syntax used it is forcing the creation of two Function0 objects that simply return the head and tail vals. How is that different from just passing head and tail (without the () => prefix) since the Cons case class is already defined to take these parameters by-name anyway? Isn't this redundant? Or have I missed something?

The difference is in => A not being equal to () => A.
The former is pass by name, and the latter is a function that takes no parameters and returns an A.
You can test this out in the Scala REPL.
scala> def test(x: => Int): () => Int = x
<console>:9: error: type mismatch;
found : Int
required: () => Int
def test(x: => Int): () => Int = x
^
Simply referencing x in my sample causes the parameter to be evaluated. In your sample, the code constructs a function which defers the evaluation of x.
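A minimal sketch of that distinction. The counter and the names twice and defer are made up for illustration and are not from the book. It shows that every reference to a by-name parameter re-runs the argument, while wrapping it in () => ... only defers it:

```scala
object ByNameVsThunk {
  // Hypothetical counter, used only to observe when the argument expression runs.
  var evaluations = 0

  def next(): Int = { evaluations += 1; evaluations }

  // By-name: each reference to x re-runs the argument expression.
  def twice(x: => Int): Int = x + x

  // Wrapping the by-name parameter in a function defers it without running it.
  def defer(x: => Int): () => Int = () => x
}
```

Calling ByNameVsThunk.twice(ByNameVsThunk.next()) runs next() twice, whereas ByNameVsThunk.defer(ByNameVsThunk.next()) runs nothing until the returned function is applied.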

First, you are assuming that => A and () => A are the same. However, they are not. For example, the => A can only be used in the context of passing parameters by-name - it is impossible to declare a val of type => A. As case class parameters are always vals (unless explicitly declared vars), it is clear why case class Cons[+A](h: => A, t: => Stream[A]) would not work.
Second, just wrapping a by-name parameter into a function with an empty parameter list is not the same as what the code above accomplishes: using lazy vals, it is ensured that both hd and tl are evaluated at most once. If the code read
Cons(() => hd, () => tl)
the original hd would be evaluated every time the h method (field) of a Cons object is invoked. Using a lazy val, hd is evaluated only the first time the h method of this Cons object is invoked, and the same value is returned in every subsequent invocation.
Demonstrating the difference in a stripped-down fashion in the REPL:
> def foo = { println("evaluating foo"); "foo" }
> val direct : () => String = () => foo
> direct()
evaluating foo
res6: String = foo
> direct()
evaluating foo
res7: String = foo
> val lzy : () => String = { lazy val v = foo; () => v }
> lzy()
evaluating foo
res8: String = foo
> lzy()
res9: String = foo
Note how the "evaluating foo" output in the second invocation of lzy() is gone, as opposed to the second invocation of direct().
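The same comparison can be sketched directly on the Stream type. Here consNoMemo, expensive, forceHead and the evaluations counter are hypothetical names added for the comparison; only cons follows the book:

```scala
object MemoDemo {
  sealed trait Stream[+A]
  case object Empty extends Stream[Nothing]
  case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]

  // The book's smart constructor: the lazy vals cache the first result.
  def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
    lazy val head = hd
    lazy val tail = tl
    Cons(() => head, () => tail)
  }

  // Hypothetical variant without caching, for comparison only.
  def consNoMemo[A](hd: => A, tl: => Stream[A]): Stream[A] =
    Cons(() => hd, () => tl)

  // Hypothetical counter to observe how often the head expression runs.
  var evaluations = 0
  def expensive(): Int = { evaluations += 1; 42 }

  // Calls the head function of a non-empty stream n times.
  def forceHead[A](s: Stream[A], n: Int): Unit = s match {
    case Cons(h, _) => (1 to n).foreach(_ => h())
    case Empty      => ()
  }
}
```

Forcing the head three times runs expensive() once for a stream built with cons, and three times for one built with consNoMemo.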

Note that the parameters of the method cons are by-name parameters (hd and tl). That means that if you call cons, the arguments will not be evaluated before you call cons; they will be evaluated later, at the moment you use them inside cons.
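Note that the Cons constructor takes two functions of type () => A and () => Stream[A] (that is, Function0 values), not by-name parameters. So its arguments will be evaluated before the constructor is called.
If you do Cons(head, tail) then head and tail will be evaluated, which means hd and tl will be evaluated. (As written, that call would not even type-check, because head is an A and not a () => A.)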
But the whole point here was to avoid calling hd and tl until necessary (when someone accesses h or t in the Cons object). So, you pass two anonymous functions to the Cons constructor; these functions will not be called until someone accesses h or t.
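One way to see why the deferral matters is an infinite stream, which can only be built if the tail is not evaluated eagerly. This is a sketch; from and firstN are hypothetical helpers, not definitions quoted from the book:

```scala
object DeferredTail {
  sealed trait Stream[+A]
  case object Empty extends Stream[Nothing]
  case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]

  def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
    lazy val head = hd
    lazy val tail = tl
    Cons(() => head, () => tail)
  }

  // Hypothetical helper: the infinite stream n, n + 1, n + 2, ...
  // The recursive call is a by-name argument, so it runs only when t() is called.
  def from(n: Int): Stream[Int] = cons(n, from(n + 1))

  // Hypothetical helper: force and collect at most `count` leading elements.
  def firstN[A](s: Stream[A], count: Int): List[A] = s match {
    case Cons(h, t) if count > 0 => h() :: firstN(t(), count - 1)
    case _                       => Nil
  }
}
```

If cons took its tail strictly, from(1) would recurse forever; with the by-name tail, DeferredTail.firstN(DeferredTail.from(1), 3) only builds the cells it visits.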

In def cons[A](hd: => A, tl: => Stream[A]) : Stream[A]
the type of hd is A, tl is Stream[A]
whereas in case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
h is of type Function0[A] and t of type Function0[Stream[A]]
given the type of hd is A, the smart constructor invokes the case class as
lazy val head = hd
lazy val tail = tl
Cons(() => head, () => tail) //it creates a function closure so that head is accessible within Cons for lazy evaluation

Related

Understand Stream scala interleaved transformations behavior

I'm reading and having fun with the examples and exercises in the book Functional Programming in Scala. I'm studying the strictness and laziness chapter, which covers Stream.
I can't understand the output produced by the following code excerpt:
sealed trait Stream[+A]{
def foldRight[B](z: => B)(f: (A, => B) => B): B =
this match {
case Cons(h,t) => f(h(), t().foldRight(z)(f))
case _ => z
}
def map[B](f: A => B): Stream[B] = foldRight(Stream.empty[B])((h,t) => {println(s"map h:$h"); Stream.cons(f(h), t)})
def filter(f:A=>Boolean):Stream[A] = foldRight(Stream.empty[A])((h,t) => {println(s"filter h:$h"); if(f(h)) Stream.cons(h,t) else t})
}
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
object Stream {
def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
lazy val head = hd
lazy val tail = tl
Cons(() => head, () => tail)
}
def empty[A]: Stream[A] = Empty
def apply[A](as: A*): Stream[A] =
if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
}
Stream(1,2,3,4,5,6).map(_+10).filter(_%2==0)
When I execute this code, I receive this output:
map h:1
filter h:11
map h:2
filter h:12
My questions are:
Why are the map and filter outputs interleaved?
Could you explain all steps involved from the Stream creation until the last step for obtaining this behavior?
Where are the other elements of the list that also pass the filter transformation, i.e. 4 and 6?
The key to understanding this behavior, I think, is in the signature of the foldRight.
def foldRight[B](z: => B)(f: (A, => B) => B): B = ...
Note that the 2nd argument, f, is a function that takes two parameters, an A and a by-name (lazy) B. Take away that laziness, f: (A, B) => B, and you not only get the expected method grouping (all the map() steps before all the filter() steps), they also come in reverse order with 6 processed first and 1 processed last, as you'd expect from a foldRight.
How does one little => perform all that magic? It basically says that the 2nd argument to f() is going to be held in reserve until it is required.
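To make the comparison concrete, here is a sketch that records the log lines in a buffer instead of printing them. foldRightStrict, of, runLazy and runStrict are hypothetical names added for the experiment, and a three-element stream is used to keep the output short:

```scala
import scala.collection.mutable.ListBuffer

object FoldOrder {
  sealed trait Stream[+A] {
    // Version from the question: the second argument of f is by-name.
    def foldRight[B](z: => B)(f: (A, => B) => B): B = this match {
      case Cons(h, t) => f(h(), t().foldRight(z)(f))
      case _          => z
    }
    // Hypothetical strict variant: the recursive fold runs before f is called.
    def foldRightStrict[B](z: B)(f: (A, B) => B): B = this match {
      case Cons(h, t) => f(h(), t().foldRightStrict(z)(f))
      case _          => z
    }
  }
  case object Empty extends Stream[Nothing]
  case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]

  def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
    lazy val head = hd
    lazy val tail = tl
    Cons(() => head, () => tail)
  }

  def of(as: Int*): Stream[Int] =
    if (as.isEmpty) Empty else cons(as.head, of(as.tail: _*))

  val log = ListBuffer.empty[String]
  val plusTen: Int => Int = _ + 10
  val isEven: Int => Boolean = _ % 2 == 0

  // map then filter, both built on the lazy foldRight.
  def runLazy(): List[String] = {
    log.clear()
    val mapped = of(1, 2, 3).foldRight(Empty: Stream[Int])((h, t) => { log += s"map $h"; cons(plusTen(h), t) })
    mapped.foldRight(Empty: Stream[Int])((h, t) => { log += s"filter $h"; if (isEven(h)) cons(h, t) else t })
    log.toList
  }

  // The same pipeline built on the strict variant.
  def runStrict(): List[String] = {
    log.clear()
    val mapped = of(1, 2, 3).foldRightStrict(Empty: Stream[Int])((h, t) => { log += s"map $h"; cons(plusTen(h), t) })
    mapped.foldRightStrict(Empty: Stream[Int])((h, t) => { log += s"filter $h"; if (isEven(h)) cons(h, t) else t })
    log.toList
  }
}
```

runLazy() yields the interleaved sequence map 1, filter 11, map 2, filter 12 and then stops, while runStrict() yields map 3, map 2, map 1 followed by filter 13, filter 12, filter 11.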
So, attempting to answer your questions.
Why are the map and filter outputs interleaved?
Because the evaluation of each map() and filter() step is delayed until the point when its value is requested.
Could you explain all steps involved from the Stream creation until the last step for obtaining this behavior?
Not really. That would take more time and SO answer space than I'm willing to contribute, but let's take just a few steps into the morass.
We start with a Stream, which looks like a series of Cons cells, each holding an Int and a reference to the next Cons, but that's not completely accurate. Each Cons really holds two functions: when invoked, the 1st produces an Int and the 2nd produces the next Cons.
Call map() and pass it the "+10" function. map() creates a new function: "Given h and t (both values), create a new Cons. The head function of the new Cons, when invoked, will be the "+10" function applied to the current head value. The new tail function will produce the t value as received." This new function is passed to foldRight.
foldRight receives the new function but the evaluation of the function's 2nd parameter will be delayed until it is needed. h() is called to retrieve the current head value; t() will be called to retrieve the current tail value, and a recursive call to foldRight will be made on it.
Call filter() and pass it the "isEven" function. filter() creates a new function: "Given h and t, create a new Cons if h passes the isEven test. If not then return t." That's the real t. Not a promise to evaluate its value later.
Where are the other elements of the list that also pass the filter transformation, i.e. 4 and 6?
They are still there waiting to be evaluated. We can force that evaluation by using pattern matching to extract the various Cons one by one.
val c0@Cons(_,_) = Stream(1,2,3,4,5,6).map(_+10).filter(_%2==0)
// **STDOUT**
//map h:1
//filter h:11
//map h:2
//filter h:12
c0.h() //res0: Int = 12
val c1@Cons(_,_) = c0.t()
// **STDOUT**
//map h:3
//filter h:13
//map h:4
//filter h:14
c1.h() //res1: Int = 14
val c2@Cons(_,_) = c1.t()
// **STDOUT**
//map h:5
//filter h:15
//map h:6
//filter h:16
c2.h() //res2: Int = 16
c2.t() //res3: Stream[Int] = Empty

Scala lazy evaluation and apply function

I'm following a book's example to implement a Stream class using lazy evaluation in Scala.
sealed trait Stream[+A]
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
object Stream {
def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
lazy val head = hd
lazy val tail = tl
Cons(() => head, () => tail)
}
def empty[A]: Stream[A] = Empty
def apply[A](as: A*): Stream[A] = {
if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
}
}
Then I used a simple function to test if it's working
def printAndReturn: Int = {
println("called")
1
}
Then I construct Stream like the following:
println(s"apply: ${
Stream(
printAndReturn,
printAndReturn,
printAndReturn,
printAndReturn
)
}")
The output is like this:
called
called
called
called
apply: Cons(fpinscala.datastructures.Stream$$$Lambda$7/1170794006@e580929,fpinscala.datastructures.Stream$$$Lambda$8/1289479439@4c203ea1)
Then I constructed Stream using cons:
println(s"cons: ${
cons(
printAndReturn,
cons(
printAndReturn,
cons(printAndReturn, Empty)
)
)
}")
The output is:
cons: Cons(fpinscala.datastructures.Stream$$$Lambda$7/1170794006@2133c8f8,fpinscala.datastructures.Stream$$$Lambda$8/1289479439@43a25848)
So here are two questions:
When constructing Stream using the apply function, all printAndReturn are evaluated. Is this because the recursive call to apply(as.head, ...) evaluates every head?
If the answer to the first question is true, then how do I change apply to make it not force evaluation?
No. If you put a breakpoint on the println you'll find that the method is actually being called when you first create the Stream. The line Stream(printAndReturn, ... actually calls your method however many times you put it there. Why? Consider the type signatures for cons and apply:
def cons[A](hd: => A, tl: => Stream[A]): Stream[A]
vs:
def apply[A](as: A*): Stream[A]
Note that the definition for cons has its parameters marked as => A. This is a by-name parameter. Declaring an input like this makes it lazy, delaying its evaluation until it is actually used. Hence your println is not called while you build the stream with cons; it would only run once the corresponding head is forced. Compare this to apply. There you're not using a by-name parameter, and therefore anything that gets passed in to that method is evaluated immediately.
Unfortunately there isn't a super easy way as of now. What you really want is something like def apply[A](as: (=>A)*): Stream[A] but unfortunately Scala does not support vararg by name parameters. See this answer for a few ideas on how to get around this. One way is to just wrap your function calls when creating the Stream:
Stream(
() => printAndReturn,
() => printAndReturn,
() => printAndReturn,
() => printAndReturn)
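This delays the evaluation, though with apply unchanged the result is a Stream of functions (Stream[() => Int]) rather than a Stream[Int].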
When you called
Stream(
printAndReturn,
printAndReturn,
printAndReturn,
printAndReturn
)
the apply in the companion object was invoked. Looking at the parameter type of apply, you will notice that it is strict, so the arguments are evaluated before being assigned to as. What as becomes is a Seq[Int] of already-computed values (varargs arrive as a Seq, not an Array).
For 2, you can define apply as
def apply[A](as: (() => A)*): Stream[A] =
if (as.isEmpty) empty else cons(as.head(), apply(as.tail: _*))
and as was suggested above, you need to pass the arguments as thunks themselves as in
println(s"apply: ${Stream(
() => printAndReturn,
() => printAndReturn,
() => printAndReturn,
() => printAndReturn
)}")
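A sketch that checks the thunk-based apply really defers the calls. countAndReturn (a println-free stand-in for printAndReturn), the calls counter and headOption are hypothetical additions:

```scala
object LazyApply {
  sealed trait Stream[+A]
  case object Empty extends Stream[Nothing]
  case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]

  def cons[A](hd: => A, tl: => Stream[A]): Stream[A] = {
    lazy val head = hd
    lazy val tail = tl
    Cons(() => head, () => tail)
  }

  def empty[A]: Stream[A] = Empty

  // Varargs of thunks, as suggested above. as.head() is a by-name argument
  // of cons, so the thunk is not run at this point.
  def apply[A](as: (() => A)*): Stream[A] =
    if (as.isEmpty) empty else cons(as.head(), apply(as.tail: _*))

  // Hypothetical counter standing in for the println side effect.
  var calls = 0
  def countAndReturn: Int = { calls += 1; 1 }

  // Hypothetical helper: forces only the first element, if any.
  def headOption[A](s: Stream[A]): Option[A] = s match {
    case Cons(h, _) => Some(h())
    case Empty      => None
  }
}
```

Building the stream with LazyApply(() => LazyApply.countAndReturn, ...) leaves calls at 0; only forcing the head increments it.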

Scala: default return type of Option.getOrElse(...)

The signature of the function getOrElse(...) of Scala's Option[+A] class is
final def getOrElse[B >: A](default: ⇒ B): B
If I use the example
val o1 = Option("Hi")
val o2: Option[String] = Option(null)
println(o1.getOrElse(() => "Else"))
println(o2.getOrElse(() => "Else"))
I get the output
Hi
<function0>
The Scala API says about getOrElse(...):
Returns the option's value if the option is nonempty, otherwise return the result of evaluating default.
But () => "Else" is not evaluated.
The result cannot be evaluated by using brackets:
o2.getOrElse(() => "Else")()
error: Object does not take parameters
o2.getOrElse( () => "Else")()
^
How can I evaluate the result and why it is not evaluated automatically?
Is default: ⇒ B the same as default: () ⇒ B ?
Is default: ⇒ B the same as default: () ⇒ B
No, the first is call-by-name and the second is a thunk. The type of a call-by-name parameter is the type of the parameter itself, whereas the type of a thunk is () => T, which is the same as Function0[T].
When you do o1.getOrElse(() => "Else") you are working with heterogeneous types, so Scala will find the least common supertype, which in this case is Any.
val orElse: Any = o1.getOrElse(() => "Else")
Consider this:
val e: Function0[String] = () => "Else"
Then you can write:
println(o1.getOrElse(e)) //Hi
println(o2.getOrElse(e)) //<function0>
println(o2.getOrElse(e())) //Else
println(o2.getOrElse((() => "Else")())) //Else
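Since default is already by-name, the usual way to get lazy behaviour is to pass the expression itself, with no () => wrapper. A small sketch, where fallback and the evaluations counter are hypothetical:

```scala
object GetOrElseDemo {
  // Hypothetical counter to observe whether the default expression runs.
  var evaluations = 0
  def fallback(): String = { evaluations += 1; "Else" }

  val present: Option[String] = Option("Hi")
  val absent: Option[String]  = Option(null)

  // The default is passed as a plain expression; because the parameter is
  // by-name, it runs only when the Option is empty.
  def fromPresent(): String = present.getOrElse(fallback())
  def fromAbsent(): String  = absent.getOrElse(fallback())
}
```

fromPresent() returns "Hi" without running fallback(), and fromAbsent() returns "Else" after running it once.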

Scala: what does this function definition mean?

In the following code snippet, what does (F: => T) mean?
def func1[T](arg1: Int, arg2: String)(F: => T): func2[T]
Thanks
F is the argument name; => T means it's a by-name parameter. It's basically equivalent to () => T with some syntactic sugar:
When invoking this method, the argument will have type T and will automatically be turned into () => T:
func1[String](0, x)(x + x) ===> func1[String](0, x)(() => x + x)
When implementing this method, each use of F turns into F(). So the value of type T will be recalculated each time.
Obviously, this is useful in one of two cases:
if F may not be needed;
if the value returned by F may change between different invocations.
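Both cases can be sketched with two small curried methods. whenEnabled, sample, tick and the counter are hypothetical names, not part of the question's code:

```scala
object ByNameUses {
  // Hypothetical counter; tick() returns a different value on every call.
  var evaluations = 0
  def tick(): Int = { evaluations += 1; evaluations }

  // Case 1: the argument may not be needed at all.
  def whenEnabled[T](enabled: Boolean)(f: => T): Option[T] =
    if (enabled) Some(f) else None

  // Case 2: the value may change between uses, so each use re-evaluates f.
  // List.fill itself takes its element by-name.
  def sample[T](times: Int)(f: => T): List[T] =
    List.fill(times)(f)
}
```

With enabled = false the argument is never run; sample(3)(ByNameUses.tick()) runs it three times and collects three different values.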

Help me understand this Scala code: scalaz IO Monad and implicits

This is a followup to this question.
Here's the code I'm trying to understand (it's from http://apocalisp.wordpress.com/2010/10/17/scalaz-tutorial-enumeration-based-io-with-iteratees/):
object io {
sealed trait IO[A] {
def unsafePerformIO: A
}
object IO {
def apply[A](a: => A): IO[A] = new IO[A] {
def unsafePerformIO = a
}
}
implicit val IOMonad = new Monad[IO] {
def pure[A](a: => A): IO[A] = IO(a)
def bind[A,B](a: IO[A], f: A => IO[B]): IO[B] = IO {
implicitly[Monad[Function0]].bind(() => a.unsafePerformIO,
(x:A) => () => f(x).unsafePerformIO)()
}
}
}
This code is used like this (I'm assuming an import io._ is implied)
def bufferFile(f: File) = IO { new BufferedReader(new FileReader(f)) }
def closeReader(r: Reader) = IO { r.close }
def bracket[A,B,C](init: IO[A], fin: A => IO[B], body: A => IO[C]): IO[C] = for { a <- init
c <- body(a)
_ <- fin(a) } yield c
def enumFile[A](f: File, i: IterV[String, A]): IO[IterV[String, A]] = bracket(bufferFile(f),
closeReader(_:BufferedReader),
enumReader(_:BufferedReader, i))
I'm now trying to understand the implicit val IOMonad definition. Here's how I understand it. This is a scalaz.Monad, so it needs to define pure and bind abstract values of the scalaz.Monad trait.
pure takes a value and turns it into a value contained in the "container" type. For example it could take an Int and return a List[Int]. This seems pretty simple.
bind takes a "container" type and a function that maps the type that the container holds to another type. The value that is returned is the same container type, but it's now holding a new type. An example would be taking a List[Int] and mapping it to a List[String] using a function that maps Ints to Strings. Is bind pretty much the same as map?
The implementation of bind is where I'm stuck. Here's the code:
def bind[A,B](a: IO[A], f: A => IO[B]): IO[B] = IO {
implicitly[Monad[Function0]].bind(() => a.unsafePerformIO,
(x:A) => () => f(x).unsafePerformIO)()
}
This definition takes IO[A] and maps it to IO[B] using a function that takes an A and returns an IO[B]. I guess to do this, it has to use flatMap to "flatten" the result (correct?).
The = IO { ... } is the same as
= new IO[B] {
def unsafePerformIO = implicitly[Monad[Function0]].bind(() => a.unsafePerformIO,
(x:A) => () => f(x).unsafePerformIO)()
}
I think?
the implicitly method looks for an implicit value (value, right?) that implements Monad[Function0]. Where does this implicit definition come from? I'm guessing this is from the implicit val IOMonad = new Monad[IO] {...} definition, but we're inside that definition right now and things get a little circular and my brain starts to get stuck in an infinite loop :)
Also, the first argument to bind (() => a.unsafePerformIO) seems to be a function that takes no parameters and returns a.unsafePerformIO. How should I read this? bind takes a container type as its first argument, so maybe () => a.unsafePerformIO resolves to a container type?
IO[A] is intended to represent an Action returning an A, where the result of the Action may depend on the environment (meaning anything: values of variables, the file system, the system time...) and the execution of the action may also modify the environment. Actually, the Scala type for an Action would be Function0. Function0[A] returns an A when called, and it is certainly allowed to depend on and modify the environment. IO is Function0 under another name, but it is intended to distinguish (tag?) those Function0 values which depend on the environment from the other ones, which are actually pure values (if you say f is a Function0[A] which always returns the same value, without any side effect, there is not much difference between f and its result). To be precise, it is not so much that functions tagged as IO must have side effects. It is that those not so tagged must have none. Note however that wrapping impure functions in IO is entirely voluntary; there is no way you will have a guarantee when you get a Function0 that it is pure. Using IO is certainly not the dominant style in Scala.
pure takes a value and turns it into a value contained in the
"container" type.
Quite right, but "container" may mean quite a lot of things. And the one returned by pure must be as light as possible; it must be the one that makes no difference. The point of lists is that they may have any number of values. The one returned by pure must have exactly one. The point of IO is that it depends on and affects the environment. The one returned by pure must do no such thing. So it is actually the pure Function0 () => a, wrapped in IO.
bind pretty much the same as map
Not so, bind is the same as flatMap. As you write, map would receive a function from Int to String, but here you have a function from Int to List[String].
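On the standard List the difference looks like this (a small sketch; the value names are made up for illustration):

```scala
object MapVsFlatMap {
  val numbers: List[Int] = List(1, 2)

  // map: the function returns a plain value.
  val mapped: List[String] = numbers.map(n => n.toString)

  // flatMap (bind): the function returns a container and the results are flattened.
  val bound: List[String] = numbers.flatMap(n => List(n.toString, n.toString + "!"))

  // map with a container-returning function leaves the nesting in place.
  val nested: List[List[String]] = numbers.map(n => List(n.toString, n.toString + "!"))
}
```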
Now, forget IO for a moment and consider what bind/flatMap would mean for an Action, that is for Function0.
Let's have
val askUserForLineNumber: () => Int = {...}
val readingLineAt: Int => Function0[String] = {i: Int => () => ...}
Now if we must combine, as bind/flatMap does, those items to get an action that returns a String, what it must be is pretty clear: ask the user for the line number, read that line and return it. That would be
val askForLineNumberAndReadIt= () => {
val lineNumber : Int = askUserForLineNumber()
val readingRequiredLine: Function0[String] = readingLineAt(lineNumber)
val lineContent= readingRequiredLine()
lineContent
}
More generically
def bind[A,B](a: Function0[A], f: A => Function0[B]) = () => {
val value = a()
val nextAction = f(value)
val result = nextAction()
result
}
And shorter:
def bind[A,B](a: Function0[A], f: A => Function0[B])
= () => {f(a())()}
So we know what bind must be for Function0, pure is clear too. We can do
object ActionMonad extends Monad[Function0] {
def pure[A](a: => A) = () => a
def bind[A,B](a: () => A, f: A => Function0[B]) = () => f(a())()
}
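The Function0 version can be tried without scalaz. In this sketch pure and bind are plain methods rather than members of a Monad instance, and the two actions are stand-ins that record to a hypothetical log instead of doing real console or file access:

```scala
import scala.collection.mutable.ListBuffer

object Function0Bind {
  // Hypothetical log standing in for real side effects.
  val log = ListBuffer.empty[String]

  def pure[A](a: => A): () => A = () => a
  def bind[A, B](a: () => A, f: A => (() => B)): () => B = () => f(a())()

  // Stand-ins for the actions in the text; fixed values replace user input and file reads.
  val askUserForLineNumber: () => Int = () => { log += "ask"; 2 }
  val readingLineAt: Int => (() => String) = i => () => { log += s"read $i"; s"line $i" }

  // Combining the two builds a new action; nothing has run yet.
  val askAndRead: () => String = bind(askUserForLineNumber, readingLineAt)
}
```

Nothing is logged until askAndRead() is applied; at that point the two actions run in order.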
Now, IO is Function0 in disguise. Instead of just doing a(), we must do a.unsafePerformIO. And to define one, instead of () => body, we write IO {body}
So there could be
object IOMonad extends Monad[IO] {
def pure[A](a: => A) = IO {a}
def bind[A,B](a: IO[A], f: A => IO[B]) = IO {f(a.unsafePerformIO).unsafePerformIO}
}
In my view, that would be good enough. But in fact it repeats the ActionMonad. The point in the code you refer to is to avoid that and reuse what is done for Function0 instead. One goes easily from IO to Function0 (with () => io.unsafePerformIO) as well as from Function0 to IO (with IO { action() }). If you have f: A => IO[B], you can also change that to f: A => Function0[B], just by composing with the IO to Function0 transform, so (x: A) => () => f(x).unsafePerformIO.
What happens here in the bind of IO is:
() => a.unsafePerformIO: turn a into a Function0
(x:A) => () => f(x).unsafePerformIO: turn f into an A => Function0[B]
implicitly[Monad[Function0]]: get the default monad for Function0, the very same as the ActionMonad above
bind(...): apply the bind of the Function0 monad to the arguments a and f that have just been converted to Function0
The enclosing IO{...}: convert the result back to IO.
(Not sure I like it much)
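The steps above can be sketched without scalaz. Here bindFunction0 is a hypothetical stand-in for implicitly[Monad[Function0]].bind, and log and program are made up to show that nothing runs until unsafePerformIO is called:

```scala
import scala.collection.mutable.ListBuffer

object IOSketch {
  trait IO[A] { def unsafePerformIO: A }
  object IO {
    def apply[A](a: => A): IO[A] = new IO[A] { def unsafePerformIO: A = a }
  }

  // Stand-in for the bind of the Function0 monad; scalaz is not used here.
  def bindFunction0[A, B](a: () => A, f: A => (() => B)): () => B = () => f(a())()

  // IO -> Function0, bind on Function0, run the result, wrap back into IO.
  def bind[A, B](a: IO[A], f: A => IO[B]): IO[B] = IO {
    bindFunction0[A, B](() => a.unsafePerformIO, (x: A) => () => f(x).unsafePerformIO)()
  }

  // Hypothetical log standing in for real side effects.
  val log = ListBuffer.empty[String]

  val program: IO[Int] =
    bind(IO { log += "first"; 20 }, (n: Int) => IO { log += "second"; n + 1 })
}
```

Constructing program logs nothing; IOSketch.program.unsafePerformIO runs both steps in order and returns the combined result.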