Here's the code from FPIS
object test2 {
//a naive IO monad
sealed trait IO[A] { self =>
def run: A
def map[B](f: A => B): IO[B] = new IO[B] { def run = f(self.run) }
def flatMap[B](f: A => IO[B]): IO[B] = {
println("calling IO.flatMap")
new IO[B] {
def run = {
println("calling run from flatMap result")
f(self.run).run
}
}
}
}
object IO {
def unit[A](a: => A): IO[A] = new IO[A] { def run = a }
def apply[A](a: => A): IO[A] = unit(a) // syntax for IO { .. }
}
//composer in question
def forever[A,B](a: IO[A]): IO[B] = {
lazy val t: IO[B] = a flatMap (_ => t)
t
}
def PrintLine(msg: String) = IO { println(msg) }
def say = forever(PrintLine("Still Going..")).run
}
test2.say prints thousands of "Still Going.." lines before the stack overflows, but I don't know exactly how that happens.
The output looks like this:
scala> test2.say
calling IO.flatMap //only once
calling run from flatMap result
Still Going..
calling run from flatMap result
Still Going..
... //repeating until stack overflows
When forever returns, is the lazy val t fully computed (cached)?
Also, the flatMap method seems to be called only once (I added print statements), which seems to contradict the recursive definition of forever. Why?
===========
Another thing I find interesting is that the B type in forever[A, B] could be anything. Scala actually can run with it being opaque.
I manually tried forever[Unit, Double], forever[Unit, String] etc and it all worked. This feels smart.
As the name suggests, forever makes the monadic instance a run forever. More precisely, it gives us an infinite chain of monadic operations.
Its value t is defined recursively as:
t = a flatMap (_ => t)
which expands to
t = a flatMap (_ => a flatMap (_ => t))
which expands to
t = a flatMap (_ => a flatMap (_ => a flatMap (_ => t)))
and so on.
Lazy gives us the ability to define something like this. If we removed the lazy part we would either get a "forward reference" error (in case the recursive value is contained within some method) or it would simply be initialized with a default value and not used recursively (if contained within a class, which makes it a class field with a behind-the-scenes getter and setter).
Demo:
val rec: Int = 1 + rec
println(rec) // prints 1, "rec" in the body is initialized to default value 0
def foo() = {
val rec: Int = 1 + rec // ERROR: forward reference extends over definition of value rec
println(rec)
}
However, this alone is not the reason why the whole stack overflow thing happens. There is another recursive part, and this one is actually responsible for the stack overflow. It is hidden here:
def run = {
println("calling run from flatMap result")
f(self.run).run
}
Method run ends up calling itself: self.run runs the original IO, f turns its result into the next IO, and the trailing .run runs that one. When we define it like this, nothing is evaluated on the spot because f hasn't been invoked yet; we are just stating that it will be invoked once run is invoked.
But when we create the value t in forever, we are creating an IO monad that flatMaps into itself (the function it provides to flatMap is "evaluate into yourself"). This will trigger the run and therefore the evaluation and invocation of f. We never really leave the flatMap context (hence only one printed statement for the flatMap part) because as soon as we try to flatMap, run starts evaluating the function f which returns the IO on which we call run which invokes the function f which returns the IO on which we call run which invokes the function f which returns the IO on which we call run...
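To see this concretely, here is a minimal sketch that swaps the println calls for counters and stops the loop after a few steps. The IO trait and forever follow the code in the question; ForeverDemo, countdown, Stop and demo are hypothetical helpers added only for this illustration.

```scala
object ForeverDemo {
  var flatMapCalls = 0 // how many times flatMap itself is entered
  var runCalls = 0     // how many times the run of flatMap's result is entered

  trait IO[A] { self =>
    def run: A
    def flatMap[B](f: A => IO[B]): IO[B] = {
      flatMapCalls += 1
      new IO[B] {
        def run: B = {
          runCalls += 1
          f(self.run).run // f returns t, so this re-enters the same run
        }
      }
    }
  }

  def forever[A, B](a: IO[A]): IO[B] = {
    lazy val t: IO[B] = a flatMap (_ => t)
    t
  }

  // Hypothetical: thrown to end the otherwise endless loop
  final class Stop extends RuntimeException

  // Hypothetical action that succeeds `limit` times and then aborts
  def countdown(limit: Int): IO[Unit] = new IO[Unit] {
    private var n = 0
    def run: Unit = { n += 1; if (n > limit) throw new Stop }
  }

  // Returns (flatMapCalls, runCalls) observed for one bounded run
  def demo(limit: Int): (Int, Int) = {
    flatMapCalls = 0
    runCalls = 0
    try forever[Unit, Nothing](countdown(limit)).run
    catch { case _: Stop => () }
    (flatMapCalls, runCalls)
  }
}
```

With demo(3), flatMap is entered once while run is entered four times: three successful steps plus the one that aborts. Each of those run calls is nested inside the previous one, which is where the stack depth comes from.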
I'd like to know when function forever returns, is the lazy val t fully computed (cached)?
Yes
If so then why need the lazy keyword?
It's no use in your case. It can be useful in situations like:
def repeat(n: Int): Seq[String] = {
lazy val expensive = "some expensive computation"
Seq.fill(n)(expensive)
// when n == 0, the 'expensive' computation is skipped
// when n >= 1, the 'expensive' computation runs only once
}
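The two comments in that snippet can be checked with a counter. This is a small sketch under the assumption that the "expensive" body is just a side-effecting block; LazyOnce, evaluations and evaluationsFor are hypothetical names used only here.

```scala
object LazyOnce {
  var evaluations = 0 // counts how often the "expensive" body actually runs

  def repeat(n: Int): Seq[String] = {
    lazy val expensive: String = { evaluations += 1; "some expensive computation" }
    Seq.fill(n)(expensive) // the element is by-name, but the lazy val caches its value
  }

  // Returns how many times the body ran for a given n
  def evaluationsFor(n: Int): Int = {
    evaluations = 0
    repeat(n)
    evaluations
  }
}
```

evaluationsFor(0) gives 0 and evaluationsFor(5) gives 1.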
The other thing I don't understand is that the flatMap method seems to
be called only once (I add print statements) which counters the
recursive definition of forever. Why?
It's not possible to comment until you provide a Minimal, Complete, and Verifiable example, as @Yuval Itzchakov said.
Updated 19/04/2017
Alright, I need to correct myself :-) In your case the lazy val is required due to the recursive reference back to itself.
To explain your observation, let's try to expand the forever(a).run call:
1. forever(a) expands to
2. { lazy val t = a flatMap(_ => t) }, which expands to
3. { lazy val t = new IO[B] { def run() = { ... t.run } } }
Because t is lazy, the flatMap call in step 2 and the new IO[B] in step 3 are evaluated only once and then cached for reuse.
On invoking run() on the IO from step 3, you start a recursion on t.run, which gives the output you observed.
Not exactly sure about your requirement, but a non-stack-blowing version of forever can be implemented like:
// requires: import scala.annotation.tailrec
def forever[A, B](a: IO[A]): IO[B] = {
new IO[B] {
@tailrec
final override def run: B = {
a.run
run
}
}
}
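As a rough check that the loop-based version runs in constant stack, here is a self-contained sketch. It uses a stripped-down IO trait rather than the one from the question, and Stop and stepsBeforeStop are hypothetical helpers that exist only to end the loop and count iterations.

```scala
import scala.annotation.tailrec

object StackSafeForever {
  trait IO[A] { def run: A }

  // Loop-based forever: @tailrec makes compilation fail unless the
  // recursive call can be turned into a jump.
  def forever[A, B](a: IO[A]): IO[B] = new IO[B] {
    @tailrec
    final def run: B = {
      a.run
      run
    }
  }

  // Hypothetical: thrown to end the otherwise endless loop
  final class Stop extends RuntimeException

  // Runs the action under forever until it aborts; returns the completed steps
  def stepsBeforeStop(limit: Int): Int = {
    var steps = 0
    val action = new IO[Unit] {
      def run: Unit = {
        if (steps >= limit) throw new Stop
        steps += 1
      }
    }
    try forever[Unit, Nothing](action).run
    catch { case _: Stop => () }
    steps
  }
}
```

stepsBeforeStop(1000000) completes a million iterations without a StackOverflowError, whereas the flatMap-based forever adds a stack frame per iteration.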
new IO[B] {
def run = {
println("calling run from flatMap result")
f(self.run).run
}
}
Now I get why the overflow occurs in the run method: the outer run invocation in def run actually points to def run itself.
The call stack looks like this:
f(self.run).run
|-----|--- println
|--- f(self.run).run
|-----|------println
|------f(self.run).run
|------ (repeating)
f(self.run) always points to the same evaluated, cached lazy val t object,
because f: _ => t simply returns t, which IS the UNIQUE newly created
IO[B] hosting the run method that we are calling and that immediately calls itself again.
That's why we see the print statements before the stack overflows.
However, it is still not clear to me how the lazy val makes this work.
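One way to look at the lazy val part in isolation: the function passed to flatMap only captures a way to get t later, so t is never read while it is still being constructed. The sketch below shows the same self-reference without any IO; LazyKnot, Node and selfLoop are hypothetical names for this illustration.

```scala
object LazyKnot {
  // A node that only asks for its successor when `next` is called
  final class Node(nextThunk: () => Node) {
    def next: Node = nextThunk()
  }

  def selfLoop(): Node = {
    // `() => n` is not evaluated during construction; by the time
    // `next` is called, the lazy val has already been initialized.
    lazy val n: Node = new Node(() => n)
    n
  }
}
```

Calling next on the result of selfLoop() returns the very same object, which mirrors f: _ => t returning the single cached t.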
Why didn't putStrLn in a flatMap, followed by a result statement, actually write to stdout?
object Mgr extends App {
def main1(args: Array[String]) = getStrLn.flatMap { s =>
putStrLn(s) // Why this did not write to console?
UIO.succeed(s)
}
override def run(args: List[String]): URIO[zio.ZEnv, Int] = main1(Array()).fold(_ => 1,
{ x =>
println(x) // only this line wrote to console, why?
0
})
}
Your problem is basically that you put two effects into a single flatMap.
By invoking putStrLn(s) you're not actually printing to the console; you're merely creating a description of the action that will print when your program is interpreted and run (when the run method is called). And because only the last value in your flatMap is returned (in your case UIO.succeed(s)), only that value is taken into account when the ZIO program is constructed.
You can fix your program by chaining both actions.
You can do it with *> operator:
def main1(args: Array[String]) = getStrLn.flatMap { s =>
putStrLn(s) *> UIO.succeed(s)
}
or you could just put the effects into separate flatMaps. But since you want to perform a side effect (printing the value) and then pass the value on unchanged, you can use the dedicated function tap:
def main1(args: Array[String]) = getStrLn.tap { s =>
putStrLn(s)
}.flatMap { s =>
UIO.succeed(s)
}
Your issue is also described (with other pitfalls) in this great article (look at the first point).
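The same pitfall can be reproduced without ZIO. The following is a minimal sketch with a hand-rolled IO, not ZIO's implementation; DiscardedEffect, the log buffer, and these versions of putStrLn, succeed and *> are stand-ins written for this illustration.

```scala
import scala.collection.mutable.ListBuffer

object DiscardedEffect {
  // Naive effect type: a description that does nothing until run() is called
  final case class IO[A](run: () => A) {
    def flatMap[B](f: A => IO[B]): IO[B] = IO(() => f(run()).run())
    def *>[B](that: IO[B]): IO[B] = flatMap(_ => that)
  }

  val log = ListBuffer.empty[String] // stand-in for the console

  def putStrLn(s: String): IO[Unit] = IO(() => { log += s; () })
  def succeed[A](a: A): IO[A] = IO(() => a)

  // The putStrLn description is built and then thrown away
  val broken: IO[String] = succeed("hi").flatMap { s =>
    putStrLn(s)
    succeed(s)
  }

  // Both descriptions are chained, so both run
  val fixed: IO[String] = succeed("hi").flatMap(s => putStrLn(s) *> succeed(s))

  // Returns what was "printed" by each variant
  def demo(): (List[String], List[String]) = {
    log.clear()
    broken.run()
    val afterBroken = log.toList
    log.clear()
    fixed.run()
    (afterBroken, log.toList)
  }
}
```

Running broken leaves the log empty, while running fixed records "hi" once.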
I am creating a function in Scala with Cats that does some I/O and that will be called by other parts of the code. I'm also learning Cats and I want my function to:
- be generic in its effect and use an F[_]
- run on a dedicated thread pool
- introduce async boundaries
I assume that all my functions are generic in F[_] up to the main method because I'm trying to follow these Cats guidelines.
But I struggle to make these constraints work using ContextShift or ExecutionContext. I have written a full example here, and this is an extract from it:
object ComplexOperation {
// Thread pool for ComplexOperation internal use only
val cs = IO.contextShift(
ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
)
// Complex operation that takes resources and time
def run[F[_]: Sync](input: String): F[String] =
for {
r1 <- Sync[F].delay(cs.shift) *> op1(input)
r2 <- Sync[F].delay(cs.shift) *> op2(r1)
r3 <- Sync[F].delay(cs.shift) *> op3(r2)
} yield r3
def op1[F[_]: Sync](input: String): F[Int] = Sync[F].delay(input.length)
def op2[F[_]: Sync](input: Int): F[Boolean] = Sync[F].delay(input % 2 == 0)
def op3[F[_]: Sync](input: Boolean): F[String] = Sync[F].delay(s"Complex result: $input")
}
This clearly doesn't abstract over effects as ComplexOperation.run needs a ContextShift[IO] to be able to introduce async boundaries. What is the right (or best) way of doing this?
Creating ContextShift[IO] inside ComplexOperation.run makes the function depend on IO which I don't want.
Moving the creation of a ContextShift[IO] on the caller will simply shift the problem: the caller is also generic in F[_] so how does it obtain a ContextShift[IO] to pass to ComplexOperation.run without explicitly depending on IO?
Remember that I don't want to use one global ContextShift[IO] defined at the topmost level but I want each component to decide for itself.
Should my ComplexOperation.run create the ContextShift[IO] or is it the responsibility of the caller?
Am I doing this right at least? Or am I going against standard practices?
So I took the liberty to rewrite your code, hope it helps:
import cats.effect._
object Functions {
def sampleFunction[F[_]: Sync : ContextShift](file: String, blocker: Blocker): F[String] = {
val handler: Resource[F, Int] =
Resource.make(
blocker.blockOn(openFile(file))
) { handler =>
blocker.blockOn(closeFile(handler))
}
handler.use(handler => doWork(handler))
}
private def openFile[F[_]: Sync](file: String): F[Int] = Sync[F].delay {
println(s"Opening file $file with handler 2")
2
}
private def closeFile[F[_]: Sync](handler: Int): F[Unit] = Sync[F].delay {
println(s"Closing file handler $handler")
}
private def doWork[F[_]: Sync](handler: Int): F[String] = Sync[F].delay {
println(s"Calculating the value on file handler $handler")
"The final value"
}
}
object Main extends IOApp {
override def run(args: List[String]): IO[ExitCode] = {
val result = Blocker[IO].use { blocker =>
Functions.sampleFunction[IO](file = "filePath", blocker)
}
for {
data <- result
_ <- IO(println(data))
} yield ExitCode.Success
}
}
You can see it running here.
So, what does this code do?
First, it creates a Resource for the file, since the close has to happen no matter what, even on failure.
It uses Blocker to run the open and close operations on a blocking thread pool (that is done using ContextShift).
Finally, in main, it creates a default Blocker instance for IO, uses it to call your function, and prints the result.
Feel free to ask any questions.
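For intuition only, here is a standard-library analogy of what "run this on a dedicated pool and always clean up" means. It uses Future and try/finally instead of cats-effect, so it is not generic in F[_] and is not how Blocker or Resource are implemented; BlockingPoolSketch, onDedicatedPool and the thread name are hypothetical.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingPoolSketch {
  // Hypothetical helper: evaluate `body` on a dedicated single-thread pool
  // and shut the pool down afterwards, whether or not `body` failed.
  def onDedicatedPool[A](body: => A): A = {
    val factory = new ThreadFactory {
      def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "blocking-pool-1")
        t.setDaemon(true)
        t
      }
    }
    val pool = Executors.newSingleThreadExecutor(factory)
    val ec = ExecutionContext.fromExecutor(pool)
    try Await.result(Future(body)(ec), 5.seconds) // acquire and use
    finally pool.shutdown()                       // release, even on failure
  }
}
```

Passing Thread.currentThread.getName as the body returns "blocking-pool-1", showing that the work ran on the dedicated thread rather than on the caller's.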
I wrote a simple callback (handler) function which I pass to an async API, and I want to wait for the result:
object Handlers {
val logger: Logger = Logger("Handlers")
implicit val cs: ContextShift[IO] =
IO.contextShift(ExecutionContext.Implicits.global)
class DefaultHandler[A] {
val response: IO[MVar[IO, A]] = MVar.empty[IO, A]
def onResult(obj: Any): Unit = {
obj match {
case obj: A =>
println(response.flatMap(_.tryPut(obj)).unsafeRunSync())
println(response.flatMap(_.isEmpty).unsafeRunSync())
case _ => logger.error("Wrong expected type")
}
}
def getResponse: A = {
response.flatMap(_.take).unsafeRunSync()
}
}
But for some reason both tryPut and isEmpty return true (when I manually call the onResult method), so when I call getResponse it sleeps forever.
This is my test:
class HandlersTest extends FunSuite {
test("DefaultHandler.test") {
val handler = new DefaultHandler[Int]
handler.onResult(3)
val response = handler.getResponse
assert(response != 0)
}
}
Can somebody explain why tryPut returns true, but nothing is put? And what is the right way to use MVar/channels in Scala?
IO[X] means that you have the recipe to create some X. So in your example, you are putting into one MVar and then asking another.
Here is how I would do it.
object Handlers {
trait DefaultHandler[A] {
def onResult(obj: Any): IO[Unit]
def getResponse: IO[A]
}
object DefaultHandler {
def apply[A : ClassTag]: IO[DefaultHandler[A]] =
MVar.empty[IO, A].map { response =>
new DefaultHandler[A] {
override def onResult(obj: Any): IO[Unit] = obj match {
case obj: A =>
for {
r1 <- response.tryPut(obj)
_ <- IO(println(r1))
r2 <- response.isEmpty
_ <- IO(println(r2))
} yield ()
case _ =>
IO(logger.error("Wrong expected type"))
}
override def getResponse: IO[A] =
response.take
}
}
}
}
The "unsafe" is sort of a hint, but every time you call unsafeRunSync, you should basically think of it as an entire new universe. Before you make the call, you can only describe instructions for what will happen, you can't actually change anything. During the call is when all the changes occur. Once the call completes, that universe is destroyed, and you can read the result but no longer change anything. What happens in one unsafeRunSync universe doesn't affect another.
You need to call it exactly once in your test code. That means your test code needs to look something like:
val test = for {
handler <- TestHandler.DefaultHandler[Int]
_ <- handler.onResult(3)
response <- handler.getResponse
} yield response
assert(test.unsafeRunSync() == 3)
Note this doesn't really buy you much over just using the MVar directly. I think you're trying to mix side effects inside IO and outside it, but that doesn't work. All the side effects need to be inside.
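The "recipe" and "separate universe" points can be reproduced with a few lines of plain Scala. This is a sketch with a naive IO and a hypothetical single-slot Cell standing in for MVar; it is not cats-effect code.

```scala
object RecipeNotValue {
  // Naive IO: a description that does nothing until unsafeRun() is called
  final class IO[A](thunk: () => A) {
    def unsafeRun(): A = thunk()
  }

  // Hypothetical single-slot cell, standing in for MVar
  final class Cell[A] {
    private var slot: Option[A] = None
    def tryPut(a: A): Boolean =
      if (slot.isEmpty) { slot = Some(a); true } else false
    def isEmpty: Boolean = slot.isEmpty
  }

  // Like MVar.empty[IO, A]: a recipe for a new cell, not a cell
  val response: IO[Cell[Int]] = new IO(() => new Cell[Int])

  // Each unsafeRun() builds a different cell, so the put is lost
  def separateRuns(): (Boolean, Boolean) = {
    val putOk = response.unsafeRun().tryPut(3)
    val stillEmpty = response.unsafeRun().isEmpty
    (putOk, stillEmpty)
  }

  // Running the recipe once and sharing the result behaves as expected
  def sharedCell(): (Boolean, Boolean) = {
    val cell = response.unsafeRun()
    val putOk = cell.tryPut(3)
    (putOk, cell.isEmpty)
  }
}
```

separateRuns() yields (true, true), which is the behaviour reported in the question, while sharedCell() yields (true, false).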
I'm trying to understand how best to deal with side effects in FP.
I implemented this rudimentary IO implementation:
trait IO[A] {
def run: A
}
object IO {
def unit[A](a: => A): IO[A] = new IO[A] { def run = a }
def loadFile(fileResourcePath: String) = IO.unit[List[String]]{
Source.fromResource(fileResourcePath).getLines.toList }
def printMessage(message: String) = IO.unit[Unit]{ println(message) }
def readLine(message:String) = IO.unit[String]{ StdIn.readLine() }
}
I have the following use case:
- load lines from log file
- parse each line to BusinessType object
- process each BusinessType object
- print process result
Case 1:
So Scala code may look like this
val load: String => List[String]
val parse: List[String] => List[BusinessType]
val process: List[BusinessType] => String
val output: String => Unit
Case 2:
I decide to use IO above:
val load: String => IO[List[String]]
val parse: IO[List[String]] => List[BusinessType]
val process: List[BusinessType] => IO[Unit]
val output: IO[Unit] => Unit
In case 1, load is impure because it reads from a file, and output is also impure because it writes the result to the console.
To be more functional I use case 2.
Questions:
- Aren't case 1 and 2 really the same thing?
- In case 2 aren't we just delaying the inevitable?
as the parse function will need to call the io.run
method and cause a side-effect?
- when they say "leave side-effects until the end of the world"
how does this apply to the example above? where is the
end of the world here?
Your IO monad seems to lack all the monad stuff, namely the part where you can flatMap over it to build bigger IO out of smaller IO. That way, everything stays "pure" until the call to run at the very end.
In case 2 aren't we just delaying the inevitable?
as the parse function will need call the io.run
method and cause a side effect?
No. The parse function should not call io.run. It should return another IO that you can then combine with its input IO.
when they say "leave side-effects until the end of the world"
how does this apply to the example above? where is the
end of the world here?
End of the world would be the last thing your program does. You only run once. The rest of your program "purely" builds one giant IO for that.
Something like
def load(): IO[Seq[String]]
def parse(data: Seq[String]): IO[Parsed] // returns IO, because has side-effects
def pureComputation(data: Parsed): Result // no side-effects, no need to use I/O
def output(data: Result): IO[Unit]
// combining effects is "pure", so the whole thing
// can be a `val` (or a `def` if it takes some input params)
val program: IO[Unit] = for {
data <- load() // use <- to sequence (flatMap) over IO
parsed <- parse(data)
result = pureComputation(parsed) // = instead of <-, no I/O here
_ <- output(result)
} yield ()
// only `run` at the end produces any effects
def main(): Unit = {
program.run
}
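A runnable variant of that outline, using the question's naive IO extended with map and flatMap, might look like the sketch below. The in-memory console buffer, the hard-coded input lines and the total computation are assumptions made so the example is self-contained; here parse is kept pure for simplicity.

```scala
import scala.collection.mutable.ListBuffer

object EndOfWorld {
  trait IO[A] { self =>
    def run: A
    def map[B](f: A => B): IO[B] = new IO[B] { def run: B = f(self.run) }
    def flatMap[B](f: A => IO[B]): IO[B] = new IO[B] { def run: B = f(self.run).run }
  }
  object IO {
    def unit[A](a: => A): IO[A] = new IO[A] { def run: A = a }
  }

  val console = ListBuffer.empty[String] // stand-in for stdout

  def load(): IO[List[String]] = IO.unit(List("a=1", "b=2")) // stand-in for a file read
  def parse(lines: List[String]): List[Int] = lines.map(_.split("=")(1).toInt) // pure
  def process(values: List[Int]): String = s"total=${values.sum}"              // pure
  def output(result: String): IO[Unit] = IO.unit { console += result; () }

  // Building the program performs no effects
  def program: IO[Unit] = for {
    lines <- load()
    _ <- output(process(parse(lines)))
  } yield ()

  // Returns (lines written after building, lines written after running)
  def demo(): (Int, List[String]) = {
    console.clear()
    val p = program
    val afterBuild = console.size
    p.run // the "end of the world": the only place effects happen
    (afterBuild, console.toList)
  }
}
```

demo() returns (0, List("total=3")): nothing is written while the program is assembled, and the single run call at the end produces the output.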
So, I read the article here about parallel comprehension. He gives the following code example:
// Make 3 parallel async calls
val fooFuture = WS.url("http://foo.com").get()
val barFuture = WS.url("http://bar.com").get()
val bazFuture = WS.url("http://baz.com").get()
for {
foo <- fooFuture
bar <- barFuture
baz <- bazFuture
} yield {
// Build a Result using foo, bar, and baz
Ok(...)
}
All fine so far, but I am in a situation where I don't know in advance how many WS.get() calls I need to make; I want it to be dynamic. So for instance:
val checks = Seq(callOne(param), callTwo(param))
Where the calls are:
def callOne(param: String): Future[Boolean] = {
// do something and return the Future with a true/false value
Future(true)
}
def callTwo(param: String): Future[Boolean] = {
// do something and return the Future with a true/false value
Future(false)
}
So, my question is: how should I react to the results of my sequence of WS calls (or database queries, for that matter) in a for-yield?
I have given two example calls, but I want the same code to be able to process anywhere from one to many calls in parallel and gather the results in the for-yield, to ultimately proceed to do other things.
Important: all calls should be carried out in parallel; the quickest ones will complete before the slow ones, regardless of the order in which they were fired.
Future.sequence is likely what you want.
Example usage:
val futures = List(WS.url("http://foo.com").get(), WS.url("http://bar.com").get())
Future.sequence(futures) // => Transforms a Seq[Future[_]] to Future[Seq[_]]
The future returned from Future.sequence will not be completed until all of the futures in the input sequence are completed.
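Applied to the callOne/callTwo shape from the question, a minimal sketch could look like this. The bodies of the two calls and the helpers allPass and demo are made up for illustration; only Future.sequence and the for-yield are the point.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceDemo {
  // Hypothetical checks; real ones would call a web service or a database
  def callOne(param: String): Future[Boolean] = Future(param.nonEmpty)
  def callTwo(param: String): Future[Boolean] = Future(param.length > 3)

  def allPass(param: String): Future[Boolean] = {
    // All futures are started here, before any result is awaited
    val checks = Seq(callOne(param), callTwo(param))
    for {
      results <- Future.sequence(checks) // Future[Seq[Boolean]]
    } yield results.forall(identity)
  }

  // Blocking only for demonstration purposes
  def demo(param: String): Boolean = Await.result(allPass(param), 5.seconds)
}
```

The number of entries in checks can vary at runtime; the for-yield receives them as one Seq in the original order, regardless of which future completed first.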
Bonus:
If your futures are heterogeneously typed, and you need to preserve those types, you can use an HList. I've written the following snippet, which takes an HList of futures and transforms it into a Future containing an HList of resolved values:
import shapeless._
import scala.concurrent.{ExecutionContext,Future}
object FutureHelpers {
object FutureReducer extends Poly2 {
import scala.concurrent.ExecutionContext.Implicits.global
implicit def f[A, B <: HList] = at[Future[A], Future[B]] { (f, resultFuture) =>
for {
result <- resultFuture
value <- f
} yield value :: result
}
}
// Like Future.sequence, but for HList
// hsequence(Future { 1 } :: Future { "string" } :: HNil)
// => Future { 1 :: "string" :: HNil }
def hsequence[T <: HList](hlist: T)(implicit
executor: ExecutionContext,
folder: RightFolder[T, Future[HNil], FutureReducer.type]) = {
hlist.foldRight(Future.successful[HNil](HNil))(FutureReducer)
}
}