Looking at an IO Monad example from Functional Programming in Scala, I created an SBT project to test out IO.scala:
def ReadLine: IO[String] = IO { readLine }
def PrintLine(msg: String): IO[Unit] = IO { println(msg) }
def converter: IO[Unit] = for {
_ <- PrintLine("Enter a temperature in degrees fahrenheit: ")
d <- ReadLine.map(_.toDouble)
_ <- PrintLine(fahrenheitToCelsius(d).toString)
} yield ()
But when I run console from SBT to access the above class from the REPL, this is what I tried:
scala> val echo = Util.ReadLine.flatMap(Util.PrintLine)
echo: common.I01.IO[Unit] = common.I01$IO$$anon$2@71c6b580
I'm expecting to be prompted for typing in text (via readLine), but I see, as I understand, simply an anonymous function/class.
How can I test out the above code?
Calling flatMap on ReadLine just produces an IO[Unit] value that has not been interpreted. At some point, you have to call IO#run (or IO#unsafePerformIO in scalaz) to make the side effects happen.
To preserve referential transparency, the general idea is to build up an IO[A] (where A is typically Unit) and at the "outermost" part of your program, call run on the value -- for example, from the main entry point of the application. That's not always easy/possible though depending on the environment you are running in -- e.g., some form of framework or container.
Because loss of referential transparency is generally considered a pretty serious disadvantage, it is common to defer running of the IO value as long as possible. Hence, it is common to say that IO is evaluated at the end of the universe.
In this case, the end of the universe is the REPL session, so try calling echo.run from the REPL.
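As a rough illustration of the point above, here is a minimal sketch of an IO type in the spirit of the book's example. The names EchoDemo, readLineStub and printLineStub are made up for this sketch, and the stubs record to a log instead of touching the console so the behaviour can be checked without stdin.

```scala
// Minimal IO in the spirit of the book's example (a sketch, not the book's full code).
trait IO[A] { self =>
  def run: A
  def map[B](f: A => B): IO[B] = new IO[B] { def run: B = f(self.run) }
  def flatMap[B](f: A => IO[B]): IO[B] = new IO[B] { def run: B = f(self.run).run }
}

object IO {
  // By-name argument: the body is stored, not evaluated.
  def apply[A](a: => A): IO[A] = new IO[A] { def run: A = a }
}

object EchoDemo {
  // Records what "happened" so the effects are observable without a console.
  val log = scala.collection.mutable.ListBuffer.empty[String]

  // Hypothetical stand-ins for ReadLine / PrintLine.
  def readLineStub: IO[String] = IO { log += "read"; "hello" }
  def printLineStub(msg: String): IO[Unit] = IO { log += s"print:$msg"; () }

  // Building the value performs no effects; only echo.run does.
  val echo: IO[Unit] = readLineStub.flatMap(printLineStub)
}
```

Evaluating EchoDemo.echo only gives back an object, just like in the REPL session from the question; the log stays empty until EchoDemo.echo.run is called.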
I'm new to functional programming and Scala, and I was checking out the Cats Effect framework and trying to understand what the IO monad does. So far what I've understood is that writing code in the IO block is just a description of what needs to be done and nothing happens until you explicitly run using the unsafe methods provided, and also a way to make code that performs side-effects referentially transparent by actually not running it.
I tried executing the snippet below just to try to understand what it means:
object Playground extends App {
var out = 10
var state = "paused"
def changeState(newState: String): IO[Unit] = {
state = newState
IO(println("Updated state."))
}
def x(string: String): IO[Unit] = {
out += 1
IO(println(string))
}
val tuple1 = (x("one"), x("two"))
for {
_ <- x("1")
_ <- changeState("playing")
} yield ()
println(out)
println(state)
}
And the output was:
13
paused
I don't understand why the assignment state = newState does not run, but the increment and assign expression out += 1 run. Am I missing something obvious on how this is supposed to work? I could really use some help. I understand that I can get this to run using the unsafe methods.
In your particular example, I think what is going on is that regular imperative Scala code is unaffected by the IO monad: it runs when it normally would under the rules of Scala.
When you run:
for {
_ <- x("1")
_ <- changeState("playing")
} yield ()
this immediately calls x. That has nothing to do with the IO monad; it's just how for comprehensions are defined. The first step is to evaluate the first statement so you can call flatMap on it.
As you observe, you never "run" the monadic result, so the argument to flatMap, the monadic continuation, is never invoked, resulting in no call to changeState. This is specific to the IO monad, as, e.g., the List monad's flatMap would have immediately invoked the function (unless it were an empty list).
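To make the desugaring concrete, here is a sketch that writes out the flatMap/map calls by hand. LazyIO is a hypothetical stand-in for cats.effect.IO that only stores a thunk (it is not the Cats Effect API), and x and changeState are simplified versions of the methods in the question.

```scala
// LazyIO is a hypothetical stand-in for an IO type: it only stores a thunk.
final class LazyIO[A](val unsafeRun: () => A) {
  def map[B](f: A => B): LazyIO[B] =
    new LazyIO[B](() => f(unsafeRun()))
  def flatMap[B](f: A => LazyIO[B]): LazyIO[B] =
    new LazyIO[B](() => f(unsafeRun()).unsafeRun())
}

object LazyIO {
  def apply[A](a: => A): LazyIO[A] = new LazyIO[A](() => a)
}

object Desugared {
  var out = 10
  var state = "paused"

  // The mutation sits outside the LazyIO block, so it runs when the method is called.
  def x(s: String): LazyIO[Unit] = { out += 1; LazyIO(()) }
  def changeState(s: String): LazyIO[Unit] = { state = s; LazyIO(()) }

  // Hand-written equivalent of: for { _ <- x("1"); _ <- changeState("playing") } yield ()
  val program: LazyIO[Unit] =
    x("1").flatMap(_ => changeState("playing").map(_ => ()))
}
```

Building program calls x once (out becomes 11), but changeState sits inside the function passed to flatMap, so state stays "paused" until program.unsafeRun() is called.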
I am reading chapter 13.2.1 and came across an example that handles IO input while getting rid of side effects:
object IO extends Monad[IO] {
def unit[A](a: => A): IO[A] = new IO[A] { def run = a }
def flatMap[A,B](fa: IO[A])(f: A => IO[B]) = fa flatMap f
def apply[A](a: => A): IO[A] = unit(a)
}
def ReadLine: IO[String] = IO { readLine }
def PrintLine(msg: String): IO[Unit] = IO { println(msg) }
def converter: IO[Unit] = for {
_ <- PrintLine("Enter a temperature in degrees Fahrenheit: ")
d <- ReadLine.map(_.toDouble)
_ <- PrintLine(fahrenheitToCelsius(d).toString)
} yield ()
I have a couple of questions regarding this piece of code:
In the unit function, what does def run = a really do?
In the ReadLine function, what does IO { readLine } really do? Will it really execute the println function or just return an IO type?
What does _ in the for comprehension mean (_ <- PrintLine("Enter a temperature in degrees Fahrenheit: ")) ?
Why does it remove the IO side effects? I saw that these functions still interact with inputs and outputs.
The definition of your IO is essentially as follows (map and flatMap omitted):
trait IO[A] { def run: A }
Following that definition, writing new IO[A] { def run = a } instantiates an anonymous class of your trait whose run method evaluates a when you call run on the resulting IO value. Because a is a by-name parameter, nothing is actually run at creation time.
Any object or class in Scala that defines an apply method can be called as ClassName(args); the compiler looks for an apply method on the object/class and rewrites the call to ClassName.apply(args). A more elaborate answer can be found here. Because the IO companion object possesses such a method:
def apply[A](a: => A): IO[A] = unit(a)
The expansion is allowed to happen. Thus we actually call IO.apply(readLine) instead.
_ has many overloaded uses in Scala. This occurrence means "I don't care about the value returned from PrintLine, discard it". It is so because the value returned is of type Unit, which we have nothing to do with.
It is not that the IO datatype removes the part of doing IO; it's that it defers it to a later point in time. We usually say IO runs at the "edges" of the application, in the main method. These interactions with the outside world will still occur, but since we encapsulate them inside IO, we can reason about them as values in our program, which brings a lot of benefit. For example, we can now compose side effects and depend on the success/failure of their execution. We can mock out these IO effects (using other data types such as Const), and we gain many other surprisingly nice properties.
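The by-name point from the first answer can be checked in isolation. This is a small sketch with made-up names (ByNameDemo, strict, deferred, effect); it only contrasts a by-value parameter with a by-name one.

```scala
object ByNameDemo {
  var evaluations = 0
  def effect(): Int = { evaluations += 1; 42 }

  // By-value: the argument is evaluated once, before the call.
  def strict[A](a: A): () => A = () => a

  // By-name: the argument is evaluated each time the returned thunk is invoked.
  def deferred[A](a: => A): () => A = () => a
}
```

Calling ByNameDemo.strict(ByNameDemo.effect()) bumps the counter immediately, while ByNameDemo.deferred(ByNameDemo.effect()) leaves it untouched until the returned thunk is applied. That is the same mechanism that keeps IO { readLine } from reading anything at construction time.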
The simplest way to look at the IO monad is as a small piece of program definition.
Thus:
This is the IO definition; the run method defines what the IO monad does. new IO[A] { def run = a } is the Scala way of creating an instance of a class and defining its run method.
There is a bit of syntactic sugar going on. IO { readLine } is the same as IO.apply { readLine } or IO.apply(readLine), where readLine is passed as a call-by-name argument of type => String. This calls the unit method of object IO, so it only creates an instance of the IO class that has not run yet.
Since IO is a monad, a for comprehension can be used. It binds the result of each monadic operation with syntax like result <- someMonad. To ignore the result, _ can be used, so _ <- someMonad reads as: execute the monad but ignore the result.
These methods are all IO definitions; they don't run anything, so there is no side effect. Side effects only appear when run is called on the IO value.
I'm trying to write a DSL for writing system tests in Scala. In this DSL I don't want to expose the fact that some operations might take place asynchronously (for instance, because they are implemented using the web service under test), or that errors might occur (because the web service might not be available, and we want the test to fail). In this answer this approach is discouraged, but I don't completely agree with this in the context of a DSL for writing tests. I think the DSL will get unnecessarily polluted by the introduction of these aspects.
To frame the question, consider the following DSL:
type Elem = String
sealed trait TestF[A]
// Put an element into the bag.
case class Put[A](e: Elem, next: A) extends TestF[A]
// Count the number of elements equal to "e" in the bag.
case class Count[A](e: Elem, withCount: Int => A) extends TestF[A]
def put(e: Elem): Free[TestF, Unit] =
Free.liftF(Put(e, ()))
def count(e: Elem): Free[TestF, Int] =
Free.liftF(Count(e, identity))
def test0 = for {
_ <- put("Apple")
_ <- put("Orange")
_ <- put("Pinneaple")
nApples <- count("Apple")
nPears <- count("Pear")
nBananas <- count("Banana")
} yield List(("Apple", nApples), ("Pears", nPears), ("Bananas", nBananas))
Now assume we want to implement an interpreter that makes use of our service under test to put and count the elements in the store. Since we make use of the network, I'd like the put operations to take place asynchronously. In addition, given that network errors or server errors can occur, I'd like the program to stop as soon as an error occurs. To give an idea of what I want to achieve, here is an example of mixing the different aspects in Haskell by means of monad transformers (that I cannot translate to Scala).
So my question is, which monad M would you use for an interpreter that satisfies the requirements above:
def interp[A](cmd: TestF[A]): M[A]
And in case M is a monad transformer, how would you compose them using the FP library of your choice (Cats, Scalaz)?
Task (scalaz or, better, fs2) should satisfy all of the requirements. It doesn't need a monad transformer, as it already has Either inside (Either for fs2, \/ for scalaz). It also has the fail-fast behavior you need, the same as a right-biased disjunction/Xor.
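The fail-fast behaviour mentioned above can be seen with the standard library's right-biased Either alone (Scala 2.12+). This sketch leaves out the asynchronous part entirely; Result, steps and the failing "Pear" case are invented for illustration and are not part of any Task API.

```scala
object FailFast {
  type Result[A] = Either[String, A]

  // Records which operations actually ran.
  val steps = scala.collection.mutable.ListBuffer.empty[String]

  def put(e: String): Result[Unit] = { steps += s"put $e"; Right(()) }

  // Hypothetical failure: counting "Pear" simulates a service error.
  def count(e: String): Result[Int] =
    if (e == "Pear") Left(s"service unavailable while counting $e")
    else { steps += s"count $e"; Right(1) }

  val test: Result[List[(String, Int)]] = for {
    _        <- put("Apple")
    nApples  <- count("Apple")
    nPears   <- count("Pear")   // fails here
    nBananas <- count("Banana") // never executed
  } yield List(("Apple", nApples), ("Pears", nPears), ("Bananas", nBananas))
}
```

FailFast.test is a Left carrying the error message, and steps contains only the two operations that ran before the failure. A Task gives the same short-circuiting, with laziness and asynchrony on top.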
Here are several implementations that are known to me:
Scalaz Task (original): slightly outdated docs and newer sources
FS2 Task: https://github.com/functional-streams-for-scala/fs2/blob/series/0.9/docs/guide.md It also provides interoperability (type classes) with scalaz and cats
Monix Task: https://monix.io/docs/2x/eval/task.html
"Cats" doesn't provide any Task or other IO-monad-related operations (no scalaz-effect analog at all) and recommends using either Monix or FS2.
Despite the absence of monad transformers, you still need some lifting when using Task:
from value to Task or
from Either to Task
But yes, it does seem simpler than monad transformers, especially given that monads hardly compose: to define a monad transformer you have to know some other details about your type besides its being a monad (usually it requires something like a comonad to extract the value).
Just for advertising purposes, I would also add that Task represents a stack-safe trampolined computation.
However, there are some projects focused on extended monadic composition, like Emm-monad: https://github.com/djspiewak/emm, so you can compose monad transformers with Future/Task, Either, Option, List and so on. But, IMO, it's still limited in comparison with Applicative composition: cats provides a universal Nested data type that makes it easy to compose any Applicative (you can find some examples in this answer). The only disadvantage is that it's hard to build a readable DSL using Applicative. Another alternative is the so-called "Freer monad": https://github.com/m50d/paperdoll, which basically provides better composition and lets you separate different effect layers into different interpreters.
For example, as there is no FutureT/TaskT transformer, you can't build effects like type E = Option |: Task |: Base (Option from Task), as such a flatMap would require extracting the value from the Future/Task.
In conclusion, I can say that in my experience Task really comes in handy for do-notation based DSLs: I had a complex external rule-like DSL for async computations, and when I decided to migrate it all to a Scala-embedded version, Task really helped. I literally converted the external DSL to Scala's for-comprehension. Another thing we considered was having some custom type, like ComputationRule, with a set of type classes defined over it along with conversions to Task/Future or whatever we need, but this was because we didn't use the Free monad explicitly.
You might not even need the Free monad here, assuming you don't need the ability to switch interpreters (which might be true for just system tests). In that case Task might be the only thing you need. It's lazy (in comparison with Future), truly functional and stack-safe:
trait DSL {
def put[E](e: E): Task[Unit]
def count[E](e: E): Task[Int]
}
object Implementation1 extends DSL {
...implementation
}
object Implementation2 extends DSL {
...implementation
}
//System-test script:
def test0(dsl: DSL) = {
import dsl._
for {
_ <- put("Apple")
_ <- put("Orange")
_ <- put("Pinneaple")
nApples <- count("Apple")
nPears <- count("Pear")
nBananas <- count("Banana")
} yield List(("Apple", nApples), ("Pears", nPears), ("Bananas", nBananas))
}
So you can switch implementation by passing different "interpreter" here:
test0(Implementation1).unsafeRun
test0(Implementation2).unsafeRun
Differences/Disadvantages (in comparison with http://typelevel.org/cats/datatypes/freemonad.html):
you are stuck with the Task type, so you can't collapse it to some other monad easily.
the implementation is resolved at runtime when you pass an instance of the DSL trait (instead of a natural transformation); you can easily abstract over it using eta-expansion: test0 _. Polymorphic methods (put, count) are naturally supported by Java/Scala, but polymorphic functions aren't, so it's easier to pass an instance of DSL containing T => Task[Unit] (for the put operation) than to build a synthetic polymorphic function DSLEntry[T] => Task[Unit] using the natural transformation DSLEntry ~> Task.
there is no explicit AST: instead of pattern matching inside a natural transformation, we use static dispatch (explicitly calling a method, which returns a lazy computation) inside the DSL trait.
Actually, you can even get rid of Task here:
trait DSL[F[_]] {
def put[E](e: E): F[Unit]
def count[E](e: E): F[Int]
}
def test0[M[_]: Monad](dsl: DSL[M]) = {...}
So here it might even become a matter of preference especially when you're not writing an open-source library.
Putting it all together:
import cats._
import cats.implicits._
trait DSL[F[_]] {
def put[E](e: E): F[Unit]
def count[E](e: E): F[Int]
}
def test0[M[_]: Monad](dsl: DSL[M]) = {
import dsl._
for {
_ <- put("Apple")
_ <- put("Orange")
_ <- put("Pinneaple")
nApples <- count("Apple")
nPears <- count("Pear")
nBananas <- count("Banana")
} yield List(("Apple", nApples), ("Pears", nPears), ("Bananas", nBananas))
}
object IdDsl extends DSL[Id] {
def put[E](e: E) = ()
def count[E](e: E) = 5
}
Note that Cats has a Monad defined for Id, so:
scala> test0(IdDsl)
res2: cats.Id[List[(String, Int)]] = List((Apple,5), (Pears,5), (Bananas,5))
simply works. Of course, you can choose Task/Future/Option or any combination if you prefer. As a matter of fact, you can use Applicative instead of Monad:
def test0[F[_]: Applicative](dsl: DSL[F]) =
dsl.count("Apple") |@| dsl.count("Pinapple apple pen") map {_ + _ }
scala> test0(IdDsl)
res8: cats.Id[Int] = 10
|@| is a parallel operator, so you can use cats.Validated instead of Xor. Be aware that |@| for Task isn't executed in parallel (at least in older scalaz versions): a parallel operator does not imply parallel computation. You can also use a combination of both:
import cats.syntax._
def test0[M[_]:Monad](d: DSL[M]) = {
for {
_ <- d.put("Apple")
_ <- d.put("Orange")
_ <- d.put("Pinneaple")
sum <- d.count("Apple") |@| d.count("Pear") |@| d.count("Banana") map {_ + _ + _}
} yield sum
}
scala> test0(IdDsl)
res18: cats.Id[Int] = 15
While doing a home project, I encountered an interesting effect which now seems obvious to me, but I still do not see a way around it.
This is the gist (I am using ScalaZ, but in Haskell the result would probably be the same):
def askAndReadResponse(question: String): IO[String] = {
putStrLn(question) >> readLn
}
def core: IO[String] = {
val answer: IO[String] = askAndReadResponse("enter something")
val cond: IO[Boolean] = answer map {_.length > 2}
IO.ioMonad.ifM(cond, answer, core)
}
When I try to get an input from core, askAndReadResponse is evaluated twice: once for evaluating the condition, and then again in ifM (so I get the message and readLn one more time than necessary).
What I need is just the validated value (to print it later, for instance).
Is there any elegant way to do this, in particular to pass the result of IO further without the preceding IO actions, namely avoiding executing askAndReadResponse twice?
You can sequence the effects using monadic binding with flatMap:
def core: IO[String] = askAndReadResponse("enter something").flatMap {
case response if response.length > 2 => response.point[IO]
case response => core
}
This lets you take the result of one computation (the user entering text after being prompted) and use it in subsequent computations (the calculation about whether to return or loop, and the result if returning).
ifM just isn't going to be useful in your case—it would only work here if your condition and your successful branch were independent computations.
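The difference can be reproduced without scalaz. In this sketch Eff is a hypothetical thunk-based effect type, Eff.ifM only mimics the shape of ifM (run the condition, then one of two independent actions), and ask simulates a user whose answers get one character longer on every prompt.

```scala
// Eff is a hypothetical stand-in for IO: it only stores a thunk.
final class Eff[A](val unsafeRun: () => A) {
  def map[B](f: A => B): Eff[B] = new Eff[B](() => f(unsafeRun()))
  def flatMap[B](f: A => Eff[B]): Eff[B] = new Eff[B](() => f(unsafeRun()).unsafeRun())
}

object Eff {
  def apply[A](a: => A): Eff[A] = new Eff[A](() => a)

  // Same shape as ifM: run cond, then run one of two independent actions.
  def ifM[A](cond: Eff[Boolean], ifTrue: => Eff[A], ifFalse: => Eff[A]): Eff[A] =
    cond.flatMap(b => if (b) ifTrue else ifFalse)
}

object AskDemo {
  var prompts = 0

  // Simulated user: successive answers are "a", "ab", "abc", ...
  def ask: Eff[String] = Eff { prompts += 1; "abcdef".take(prompts) }

  // The condition and the success branch each run `answer` separately.
  def coreIfM: Eff[String] = {
    val answer = ask
    Eff.ifM(answer.map(_.length > 2), answer, coreIfM)
  }

  // The response is bound once and reused for both the check and the result.
  def coreFlatMap: Eff[String] =
    ask.flatMap(r => if (r.length > 2) Eff(r) else coreFlatMap)
}
```

Starting from prompts = 0, coreIfM prompts four times and returns "abcd" rather than the "abc" that passed the check, whereas coreFlatMap prompts three times and returns the validated "abc".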
Why and how specifically is a Scala Future not a Monad; and would someone please compare it to something that is a Monad, like an Option?
The reason I'm asking is Daniel Westheide's The Neophyte's Guide to Scala Part 8: Welcome to the Future where I asked whether or not a Scala Future was a Monad, and the author responded that it wasn't, which threw me off base. I came here to ask for a clarification.
A summary first
Futures can be considered monads if you never construct them with effectful blocks (pure, in-memory computation), or if any effects generated are not considered as part of semantic equivalence (like logging messages). However, this isn't how most people use them in practice. For most people using effectful Futures (which includes most uses of Akka and various web frameworks), they simply aren't monads.
Fortunately, a library called Scalaz provides an abstraction called Task that doesn't have any problems with or without effects.
A monad definition
Let's review briefly what a monad is. A monad must be able to define at least these two functions:
def unit[A](block: => A)
: Future[A]
def bind[A, B](fa: Future[A])(f: A => Future[B])
: Future[B]
And these functions must satisfy three laws:
Left identity: bind(unit(a))(f) ≡ f(a)
Right identity: bind(m) { unit(_) } ≡ m
Associativity: bind(bind(m)(f))(g) ≡ bind(m) { x => bind(f(x))(g) }
These laws must hold for all possible values by definition of a monad. If they don't, then we simply don't have a monad.
There are other ways to define a monad that are more or less the same. This one is popular.
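Since the question asks for a comparison with Option, here is a small sketch that spot-checks the three laws for Option on a few sample values. It uses a strict unit rather than the by-name one above, and f, g and the sample inputs are arbitrary choices; passing checks on samples illustrate the laws, they do not prove them.

```scala
object OptionLaws {
  def unit[A](a: A): Option[A] = Some(a)
  def bind[A, B](m: Option[A])(f: A => Option[B]): Option[B] = m.flatMap(f)

  // Arbitrary sample functions for the checks.
  val f: Int => Option[Int] = x => if (x > 0) Some(x * 2) else None
  val g: Int => Option[String] = x => Some(x.toString)

  def leftIdentity(a: Int): Boolean =
    bind(unit(a))(f) == f(a)

  def rightIdentity(m: Option[Int]): Boolean =
    bind(m)(unit(_)) == m

  def associativity(m: Option[Int]): Boolean =
    bind(bind(m)(f))(g) == bind(m)(x => bind(f(x))(g))
}
```

Because Option values are plain data, == is a meaningful notion of equivalence here, and either side of each law can be substituted for the other without changing what the program does.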
Effects lead to non-values
Almost every usage of Future that I've seen uses it for asynchronous effects, input/output with an external system like a web service or a database. When we do this, a Future isn't even a value, and mathematical terms like monads only describe values.
This problem arises because Futures execute immediately upon data construction. This messes up the ability to substitute expressions with their evaluated values (which some people call "referential transparency"). This is one way to understand why Scala's Futures are inadequate for functional programming with effects.
Here's an illustration of the problem. If we have two effects:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits._
def twoEffects =
( Future { println("hello") },
Future { println("hello") } )
we will have two printings of "hello" upon calling twoEffects:
scala> twoEffects
hello
hello
scala> twoEffects
hello
hello
But if Futures were values, we should be able to factor out the common expression:
lazy val anEffect = Future { println("hello") }
def twoEffects = (anEffect, anEffect)
But this doesn't give us the same effect:
scala> twoEffects
hello
scala> twoEffects
The first call to twoEffects runs the effect and caches the result, so the effect isn't run the second time we call twoEffects.
With Futures, we end up having to think about the evaluation policy of the language. For instance, in the example above, the fact I use a lazy value rather than a strict one makes a difference in the operational semantics. This is exactly the kind of twisted reasoning functional programming is designed to avoid -- and it does it by programming with values.
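The same observation can be made measurable by counting effect runs instead of printing. This is a sketch with made-up names (FutureSharing, inlined, factored); it uses the global execution context and blocks with Await only to make the counts deterministic.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureSharing {
  // Two separately constructed Futures: the effect runs twice.
  def inlined(): Int = {
    val runs = new AtomicInteger(0)
    val pair = (Future { runs.incrementAndGet() }, Future { runs.incrementAndGet() })
    Await.result(pair._1.zip(pair._2), 5.seconds)
    runs.get()
  }

  // The "same" expression factored out into one value: the effect runs once.
  def factored(): Int = {
    val runs = new AtomicInteger(0)
    val shared = Future { runs.incrementAndGet() }
    val pair = (shared, shared)
    Await.result(pair._1.zip(pair._2), 5.seconds)
    runs.get()
  }
}
```

inlined() reports two runs and factored() reports one, so the refactoring changed the program's behaviour, which is exactly what substitution of equal values is not supposed to do.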
Without substitution, laws break
In the presence of effects, monad laws break. Superficially, the laws appear to hold for simple cases, but the moment we begin to substitute expressions with their evaluated values, we end up with the same problems we illustrated above. We simply can't talk about mathematical concepts like monads when we don't have values in the first place.
To put it bluntly, if you use effects with your Futures, saying they're monads is not even wrong because they aren't even values.
To see how monad laws break, just factor out your effectful Future:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits._
def unit[A]
(block: => A)
: Future[A] =
Future(block)
def bind[A, B]
(fa: Future[A])
(f: A => Future[B])
: Future[B] =
fa flatMap f
lazy val effect = Future { println("hello") }
Again, it will only run one time, but you need it to run twice -- once for the right-hand side of the law, and another for the left. I'll illustrate the problem for the right identity law:
scala> effect // RHS has effect
hello
scala> bind(effect) { unit(_) } // LHS doesn't
The implicit ExecutionContext
Without putting an ExecutionContext in implicit scope, we can't define either unit or bind in our monad. This is because the Scala API for Futures has these signatures:
object Future {
// what we need to define unit
def apply[T]
(body: ⇒ T)
(implicit executor: ExecutionContext)
: Future[T]
}
trait Future[T] {
// what we need to define bind
def flatMap[S]
(f: T ⇒ Future[S])
(implicit executor: ExecutionContext)
: Future[S]
}
As a "convenience" to the user, the standard library encourages users to define an execution context in implicit scope, but I think this is a huge hole in the API that just leads to defects. One scope of the computation may have one execution context defined while another scope can have another context defined.
Perhaps you can ignore the problem if you define an instance of unit and bind that pins both operations to a single context and use this instance consistently. But this is not what people do most of the time. Most of the time, people use Futures with for-yield comprehensions that become map and flatMap calls. To make for-yield comprehensions work, an execution context must be defined at some non-global implicit scope (because for-yield doesn't provide a way to specify additional parameters to the map and flatMap calls).
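To show what the comprehension turns into, here is a sketch of for { a <- fa; b <- fb } yield a + b written out by hand, with the execution context passed explicitly to each call. ExplicitContext, sum and demo are names invented for this example.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ExplicitContext {
  // Hand-written equivalent of: for { a <- fa; b <- fb } yield a + b
  // Each map/flatMap call takes the context as a second argument list.
  def sum(fa: Future[Int], fb: Future[Int])(ec: ExecutionContext): Future[Int] =
    fa.flatMap(a => fb.map(b => a + b)(ec))(ec)

  def demo(): Int = {
    val ec = ExecutionContext.global
    Await.result(sum(Future.successful(1), Future.successful(2))(ec), 5.seconds)
  }
}
```

The for-yield syntax has no slot for that second argument list, which is why an implicit context has to be in scope wherever the comprehension is written.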
To be clear, Scala lets you use lots of things with for-yield comprehensions that aren't actually monads, so don't believe that you have a monad just because it works with for-yield syntax.
A better way
There's a nice library for Scala called Scalaz that has an abstraction called scalaz.concurrent.Task. This abstraction doesn't run effects upon data construction as the standard library Future does. Furthermore, Task actually is a monad. We compose Task monadically (we can use for-yield comprehensions if we like), and no effects run while we're composing. We have our final program when we have composed a single expression evaluating to Task[Unit]. This ends up being our equivalent of a "main" function, and we can finally run it.
Here's an example illustrating how we can substitute Task expressions with their respective evaluated values:
import scalaz.concurrent.Task
import scalaz.IList
import scalaz.syntax.traverse._
def twoEffects =
IList(
Task delay { println("hello") },
Task delay { println("hello") }).sequence_
We will have two printings of "hello" upon calling twoEffects:
scala> twoEffects.run
hello
hello
And if we factor out the common effect,
lazy val anEffect = Task delay { println("hello") }
def twoEffects =
IList(anEffect, anEffect).sequence_
we get what we'd expect:
scala> twoEffects.run
hello
hello
In fact, it doesn't really matter whether we use a lazy value or a strict value with Task; we get hello printed out twice either way.
If you want to functionally program, consider using Task everywhere you may use Futures. If an API forces Futures upon you, you can convert the Future to a Task:
import concurrent.{ ExecutionContext, Future, Promise }
import util.Try
import scalaz.\/
import scalaz.concurrent.Task
def fromScalaDeferred[A]
(future: => Future[A])
(ec: ExecutionContext)
: Task[A] =
Task
.delay { unsafeFromScala(future)(ec) }
.flatMap(identity)
def unsafeToScala[A]
(task: Task[A])
: Future[A] = {
val p = Promise[A]
task.runAsync { res =>
res.fold(p failure _, p success _)
}
p.future
}
private def unsafeFromScala[A]
(future: Future[A])
(ec: ExecutionContext)
: Task[A] =
Task.async(
handlerConversion
.andThen { future.onComplete(_)(ec) })
private def handlerConversion[A]
: ((Throwable \/ A) => Unit)
=> Try[A]
=> Unit =
callback =>
{ t: Try[A] => \/ fromTryCatch t.get }
.andThen(callback)
The "unsafe" functions run the Task, exposing any internal effects as side-effects. So try not to call any of these "unsafe" functions until you've composed one giant Task for your entire program.
I believe a Future is a Monad, with the following definitions:
def unit[A](x: A): Future[A] = Future.successful(x)
def bind[A, B](m: Future[A])(fun: A => Future[B]): Future[B] = m.flatMap(fun)
Considering the three laws:
Left identity:
Future.successful(a).flatMap(f) is equivalent to f(a). Check.
Right identity:
m.flatMap(Future.successful _) is equivalent to m (minus some possible performance implications). Check.
Associativity
m.flatMap(f).flatMap(g) is equivalent to m.flatMap(x => f(x).flatMap(g)). Check.
Rebuttal to "Without substitution, laws break"
The meaning of equivalent in the monad laws, as I understand it, is you could replace one side of the expression with the other side in your code without changing the behavior of the program. Assuming you always use the same execution context, I think that is the case. In the example @sukant gave, it would have had the same issue if it had used Option instead of Future. I don't think the fact that the futures are evaluated eagerly is relevant.
As the other commenters have suggested, you are mistaken. Scala's Future type has the monadic properties:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits._
def unit[A](block: => A): Future[A] = Future(block)
def bind[A, B](fut: Future[A])(fun: A => Future[B]): Future[B] = fut.flatMap(fun)
This is why you can use for-comprehension syntax with futures in Scala.