So I was reading the "Scala with Cats" book, and there was this sentence which I'm going to quote down here:
Note that Scala’s Futures aren’t a great example of pure functional programming because they aren’t referentially transparent.
And also, an example is provided as follows:
val future1 = {
// Initialize Random with a fixed seed:
val r = new Random(0L)
// nextInt has the side-effect of moving to
// the next random number in the sequence:
val x = Future(r.nextInt)
for {
a <- x
b <- x
} yield (a, b)
}
val future2 = {
val r = new Random(0L)
for {
a <- Future(r.nextInt)
b <- Future(r.nextInt)
} yield (a, b)
}
val result1 = Await.result(future1, 1.second)
// result1: (Int, Int) = (-1155484576, -1155484576)
val result2 = Await.result(future2, 1.second)
// result2: (Int, Int) = (-1155484576, -723955400)
I mean, I think it's because of the fact that r.nextInt is never referentially transparent, right? Since identity(r.nextInt) would never be equal to identity(r.nextInt), does this mean that identity is not referentially transparent either (or the Identity monad, to make a better comparison with Future)? If the expression being calculated is RT, then the Future would also be RT:
def foo(): Int = 42
val x = Future(foo())
Await.result(x, ...) == Await.result(Future(foo()), ...) // true
So as far as I can reason about the example, almost every function and Monad type should be non-RT. Or is there something special about Future? I also read this question and its answers, yet couldn't find what I was looking for.
You are actually right and you are touching one of the pickiest points of FP; at least in Scala.
Technically speaking, Future on its own is RT. The important thing is that, unlike IO, it can't wrap non-RT things into an RT description. However, you can say the same of many other types, like List or Option; so why don't folks make a fuss about those?
Well, as with many things, the devil is in the details.
Unlike List or Option, Future is typically used with non-RT things, e.g. an HTTP request or a database query. Hence the emphasis folks put on showing that Future can't guarantee RT in those situations.
More importantly, there is only one reason to introduce Future into a codebase: concurrency (not to be confused with parallelism); otherwise, it would be the same as Try. Thus, controlling when and how those computations are executed is usually important.
Which is the reason why cats recommends the use of IO for all use cases of Future.
Note: You can find a similar discussion on this cats PR and its linked discussions: https://github.com/typelevel/cats/pull/4182
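To make the contrast with a lazy description concrete, here is a minimal sketch that reruns the book's Random example with a deferred wrapper instead of Future. Deferred is a hypothetical stand-in written only for this illustration; it is not cats-effect's IO and has none of its features. Because nothing runs until run() is called, sharing the value and inlining its definition should give the same pair.

```scala
import scala.util.Random

// Hypothetical stand-in for a lazy effect type; NOT cats-effect's IO.
final class Deferred[A](val run: () => A) {
  def map[B](f: A => B): Deferred[B] = new Deferred(() => f(run()))
  def flatMap[B](f: A => Deferred[B]): Deferred[B] =
    new Deferred(() => f(run()).run())
}

object Deferred {
  // By-name parameter: the expression is captured, not evaluated.
  def apply[A](a: => A): Deferred[A] = new Deferred(() => a)
}

object DeferredDemo {
  // Same shape as future1 in the question: one value, referenced twice.
  def shared(): (Int, Int) = {
    val r = new Random(0L)
    val x = Deferred(r.nextInt())
    (for {
      a <- x
      b <- x
    } yield (a, b)).run()
  }

  // Same shape as future2: the reference replaced by its definition.
  def inlined(): (Int, Int) = {
    val r = new Random(0L)
    (for {
      a <- Deferred(r.nextInt())
      b <- Deferred(r.nextInt())
    } yield (a, b)).run()
  }
}
```

With Future the two shapes disagree because the shared Future has already run once by the time it is referenced; with a deferred description each reference runs the captured expression again, so substitution is safe.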
So... referential transparency simply means that you should be able to replace a reference with the actual thing (and vice versa) without changing the overall semantics or behaviour, just like in mathematics.
So, let's say you have x = 4 and y = 5; then x + y, 4 + y, x + 5, and 4 + 5 are all the same thing, and can be replaced with each other whenever you want.
But... just look at following two things...
// Program 1: one Future, referenced twice
val f1 = Future { println("Hi") }
val f2 = f1
// Program 2: the reference replaced by its definition
val f1 = Future { println("Hi") }
val f2 = Future { println("Hi") }
You can try to run it. The behaviour of these two programs is not going to be the same.
Scala Futures are eagerly evaluated, which means that there is no way to write Future { println("Hi") } in your code without it starting to execute as a separate behaviour.
Keep in mind that this is not just linked to having side effects. Yes, the example which I used here with println was a side effect, but that was just to make the behaviour difference obvious to notice.
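If you'd rather check this than read console output, a small sketch along these lines counts how many times the Future body runs in each variant. The object and method names are made up for the illustration, and a counter replaces println so the difference can be asserted.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EagerFutureDemo {
  // One Future, referenced twice: the body runs once.
  def runShared(): Int = {
    val counter = new AtomicInteger(0)
    val f1 = Future { counter.incrementAndGet() }
    val f2 = f1
    Await.result(Future.sequence(Seq(f1, f2)), 5.seconds)
    counter.get()
  }

  // The reference replaced by its definition: the body runs twice.
  def runInlined(): Int = {
    val counter = new AtomicInteger(0)
    val f1 = Future { counter.incrementAndGet() }
    val f2 = Future { counter.incrementAndGet() }
    Await.result(Future.sequence(Seq(f1, f2)), 5.seconds)
    counter.get()
  }
}
```

runShared() reports one execution and runInlined() reports two, which is exactly the substitution that referential transparency says should not matter.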
Even if you use something to suspend the side effect inside the Future, you will end up with two suspended side effects instead of one. And once these suspended side effects are passed to the interpreter, the same action will happen twice.
In the following example, even though we suspend the print side effect by wrapping it in an IO, the expensive evaluation part of the program can still cause different behaviours, even if everything else in the universe is exactly the same in the two cases.
import cats.effect.IO
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
// cpu bound
// takes around 80 milliseconds
// we have only 1 core
def veryExpensiveComputation(input: Int): Int = ???
def impl1(): Unit = {
val f1 = Future {
val result = veryExpensiveComputation(10)
IO {
println(result)
result
}
}
val f2 = f1
val f3 = f1
val futures = Future.sequence(Seq(f1, f2, f3))
val ios = Await.result(futures, 100 milli)
}
def impl2(): Unit = {
val f1 = Future {
val result = veryExpensiveComputation(10)
IO {
println(result)
result
}
}
val f2 = Future {
val result = veryExpensiveComputation(10)
IO {
println(result)
result
}
}
val f3 = Future {
val result = veryExpensiveComputation(10)
IO {
println(result)
result
}
}
val futures = Future.sequence(Seq(f1, f2, f3))
val ios = Await.result(futures, 100 milli)
}
The first impl will cause only 1 expensive computation, but the second will trigger 3 expensive computations, so the second example will fail with a timeout.
If properly written with IO or ZIO (without Future), it will fail with a timeout in both implementations.
I'm fairly new to cats-effect, but I think I am getting a handle on it. But I have come to a situation where I want to memoize the result of an IO, and it's not doing what I expect.
The function I want to memoize transforms String => String, but the transformation requires a network call, so it is implemented as a function String => IO[String]. In a non-IO world, I'd simply save the result of the call, but the defining function doesn't actually have access to it, as it doesn't execute until later. And if I save the constructed IO[String], it won't actually help, as that IO would repeat the network call every time it's used. So instead, I try to use Async.memoize, which has the following documentation:
Lazily memoizes f. For every time the returned F[F[A]] is bound, the
effect f will be performed at most once (when the inner F[A] is bound
the first time).
What I expect from memoize is a function that only ever executes once for a given input, AND where the contents of the returned IO are only ever evaluated once; in other words, I expect the resulting IO to act as if it were IO.pure(result), except the first time. But that's not what seems to be happening. Instead, I find that while the called function itself only executes once, the contents of the IO are still evaluated every time - exactly as would occur if I tried to naively save and reuse the IO.
I constructed an example to show the problem:
def plus1(num: Int): IO[Int] = {
println("foo")
IO(println("bar")) *> IO(num + 1)
}
var fooMap = Map[Int, IO[IO[Int]]]()
def mplus1(num: Int): IO[Int] = {
val check = fooMap.get(num)
val res = check.getOrElse {
val plus = Async.memoize(plus1(num))
fooMap = fooMap + ((num, plus))
plus
}
res.flatten
}
println("start")
val call1 = mplus1(2)
val call2 = mplus1(2)
val result = (call1 *> call2).unsafeRunSync()
println(result)
println(fooMap.toString)
println("finish")
The output of this program is:
start
foo
bar
bar
3
Map(2 -> <function1>)
finish
Although the plus1 function itself only executes once (one "foo" printed), the output "bar" contained within the IO is printed twice, when I expect it to also print only once. (I have also tried flattening the IO returned by Async.memoize before storing it in the map, but that doesn't do much).
Consider the following examples.
Given the following helper methods:
def plus1(num: Int): IO[IO[Int]] = {
IO(IO(println("plus1")) *> IO(num + 1))
}
def mPlus1(num: Int): IO[IO[Int]] = {
Async.memoize(plus1(num).flatten)
}
Let's build a program that evaluates plus1(1) twice.
val program1 = for {
io <- plus1(1)
_ <- io
_ <- io
} yield {}
program1.unsafeRunSync()
This produces the expected output of printing plus1 twice.
If you do the same but use the mPlus1 method instead:
val program2 = for {
io <- mPlus1(1)
_ <- io
_ <- io
} yield {}
program2.unsafeRunSync()
It will print plus1 just once confirming that memoization is working.
The trick with memoization is that the outer IO returned by memoize must itself be evaluated only once for it to have the desired effect. Consider the following program, which highlights this.
val memIo = mPlus1(1)
val program3 = for {
io1 <- memIo
io2 <- memIo
_ <- io1
_ <- io2
} yield {}
program3.unsafeRunSync()
And it outputs plus1 twice, as io1 and io2 are memoized separately.
As for your example, foo is printed once because you store the value in a map when it is not found, and that happens only once. bar is printed every time the IO is evaluated because you lose the memoization effect by calling res.flatten: each flatten runs the outer IO again, which produces a fresh memoized inner IO.
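The F[F[A]] shape can be modelled without cats-effect to see why running the outer layer twice defeats the cache. The sketch below is only an illustration: Eff and Eff.memoize are hypothetical, synchronous stand-ins and are not the cats-effect implementation.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical mini effect type, for illustration only.
final class Eff[A](val unsafeRun: () => A) {
  def map[B](f: A => B): Eff[B] = new Eff(() => f(unsafeRun()))
  def flatMap[B](f: A => Eff[B]): Eff[B] =
    new Eff(() => f(unsafeRun()).unsafeRun())
}

object Eff {
  def apply[A](a: => A): Eff[A] = new Eff(() => a)

  // Running the OUTER Eff allocates a fresh cache; the INNER Eff
  // fills that cache the first time it runs and reuses it afterwards.
  def memoize[A](fa: Eff[A]): Eff[Eff[A]] = Eff {
    lazy val cached: A = fa.unsafeRun()
    Eff(cached)
  }
}

object MemoDemo {
  // Outer layer bound once, inner bound twice (like program2).
  def outerOnce(): Int = {
    val hits = new AtomicInteger(0)
    val mem = Eff.memoize(Eff(hits.incrementAndGet()))
    val program = for {
      io <- mem
      _ <- io
      _ <- io
    } yield ()
    program.unsafeRun()
    hits.get()
  }

  // Outer layer bound twice (like program3): two separate caches.
  def outerTwice(): Int = {
    val hits = new AtomicInteger(0)
    val mem = Eff.memoize(Eff(hits.incrementAndGet()))
    val program = for {
      io1 <- mem
      io2 <- mem
      _ <- io1
      _ <- io2
    } yield ()
    program.unsafeRun()
    hits.get()
  }
}
```

In this model the underlying effect runs once in outerOnce() and twice in outerTwice(), mirroring the difference between program2 and program3.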
Let's say I have a ListBuffer[Int] and I iterate it with a foreach loop, and each loop will modify this list from inside a Future (removing the current element), and will do something special when the list is empty. Example code:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.collection.mutable.ListBuffer
val l = ListBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
l.foreach(n => Future {
println(s"Processing $n")
Future {
l -= n
println(s"Removed $n")
if (l.isEmpty) println("List is empty!")
}
})
This is probably going to end very badly. I have more complex code with a similar structure and the same needs, but I do not know how to structure it to achieve the same functionality in a more reliable way.
The way you present your problem is really not in the functional paradigm that Scala is intended for.
What you seem to want is to run a list of asynchronous computations, do something at the end of each one, and do something else when all of them have finished. This is pretty simple if you use continuations, which are easy to express with the map and flatMap methods on Future.
val fa: Future[Int] = Future { 1 }
// will apply the function to the result when it becomes available
val fb: Future[Int] = fa.map(a => a + 1)
// will start another asynchronous computation using the result once it becomes available
val fc: Future[Int] = fa.flatMap(a => Future { a + 2 })
Once you have all this, you can easily do something when each of your Future completes (successfully):
val myFutures: List[Future[Int]] = ???
myFutures.map(futInt => futInt.map(int => int + 2))
Here, I will add 2 to each value I get from the different asynchronous computations in the List.
You can also choose to wait for all the Futures in your list to complete by using Future.sequence:
val myFutureList: Future[List[Int]] = Future.sequence(myFutures)
Once again, you get a Future, which will be resolved when all of the Futures inside the input list have resolved successfully, or will fail if any one of your Futures fails. You'll then be able to use map or flatMap on this new Future to use all the computed values at once.
So here's how I would write the code you proposed:
val l = 1 to 10
val processings: Seq[Future[Unit]] = l.map {n =>
Future(println(s"processing $n")).map {_ =>
println(s"finished processing $n")
}
}
val processingOver: Future[Unit] =
Future.sequence(processings).map { (lu: Seq[Unit]) =>
println(s"Finished processing ${lu.size} elements")
}
Of course, I would recommend having real functions rather than procedures (returning Unit), so that you have values to do something with. I used println so that the code produces the same kind of output as yours (except that the prints have a slightly different meaning, since we are not mutating anything anymore).
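As a sketch of that last recommendation, the version below returns values instead of printing. The process function and the doubling it performs are placeholders invented for the example; Future.traverse is the standard-library shortcut for mapping and then sequencing.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ProcessAll {
  // Placeholder for the real per-element work (an assumption).
  def process(n: Int): Future[Int] = Future(n * 2)

  // Runs process on every element and collects the results in order.
  def run(): List[Int] = {
    val all: Future[List[Int]] =
      Future.traverse((1 to 10).toList)(process)
    // Blocking here only so the example can return a plain value.
    Await.result(all, 5.seconds)
  }
}
```

No shared mutable list is needed: "everything is finished" is simply the moment the combined Future completes.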
I have two functions which return Futures. I'm trying to feed a modified result from first function into the other using a for-yield comprehension.
This approach works:
val schoolFuture = for {
ud <- userStore.getUserDetails(user.userId)
sid = ud.right.toOption.flatMap(_.schoolId)
s <- schoolStore.getSchool(sid.get) if sid.isDefined
} yield s
However I'm not happy with having the "if" in there, it seems that I should be able to use a map instead.
But when I try with a map:
val schoolFuture: Future[Option[School]] = for {
ud <- userStore.getUserDetails(user.userId)
sid = ud.right.toOption.flatMap(_.schoolId)
s <- sid.map(schoolStore.getSchool(_))
} yield s
I get a compile error:
[error] found : Option[scala.concurrent.Future[Option[School]]]
[error] required: scala.concurrent.Future[Option[School]]
[error] s <- sid.map(schoolStore.getSchool(_))
I've played around with a few variations, but haven't found anything attractive that works. Can anyone suggest a nicer comprehension and/or explain what's wrong with my 2nd example?
Here is a minimal but complete runnable example with Scala 2.10:
import concurrent.{Future, Promise}
case class User(userId: Int)
case class UserDetails(userId: Int, schoolId: Option[Int])
case class School(schoolId: Int, name: String)
trait Error
class UserStore {
def getUserDetails(userId: Int): Future[Either[Error, UserDetails]] = Promise.successful(Right(UserDetails(1, Some(1)))).future
}
class SchoolStore {
def getSchool(schoolId: Int): Future[Option[School]] = Promise.successful(Option(School(1, "Big School"))).future
}
object Demo {
import concurrent.ExecutionContext.Implicits.global
val userStore = new UserStore
val schoolStore = new SchoolStore
val user = User(1)
val schoolFuture: Future[Option[School]] = for {
ud <- userStore.getUserDetails(user.userId)
sid = ud.right.toOption.flatMap(_.schoolId)
s <- sid.map(schoolStore.getSchool(_))
} yield s
}
(Edited to give a correct answer!)
The key here is that Future and Option don't compose inside for because there aren't the correct flatMap signatures. As a reminder, for desugars like so:
for ( x0 <- c0; w1 = d1; x1 <- c1 if p1; ... ; xN <- cN) yield f
c0.flatMap{ x0 =>
val w1 = d1
c1.filter(x1 => p1).flatMap{ x1 =>
... cN.map(xN => f) ...
}
}
(where any if statement throws a filter into the chain--I've given just one example--and the equals statements just set variables before the next part of the chain). Since you can only flatMap other Futures, every statement c0, c1, ... except the last had better produce a Future.
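To see the desugaring on something you can run, here is a small sketch using Option, where every generator has the same container type. The values are arbitrary. Note that current compilers call withFilter rather than filter when the type provides it, which is what the hand-written version uses.

```scala
object DesugarDemo {
  // for-comprehension with a generator, a value definition and a guard
  val sugared: Option[Int] =
    for {
      x0 <- Option(1)
      w1 = x0 + 1
      x1 <- Option(w1 * 10) if x1 > 5
    } yield x0 + x1

  // A hand-written chain with the same meaning
  val desugared: Option[Int] =
    Option(1).flatMap { x0 =>
      val w1 = x0 + 1
      Option(w1 * 10).withFilter(x1 => x1 > 5).map(x1 => x0 + x1)
    }
}
```

Both expressions evaluate to Some(21). Swapping one of the Options for a Future breaks the chain, because flatMap on Option expects the function to return an Option.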
Now, getUserDetails and getSchool both produce Futures, but sid is an Option, so we can't put it on the right-hand side of a <-. Unfortunately, there's no clean out-of-the-box way to do this. If o is an option, we can
o.map(Future.successful).getOrElse(Future.failed(new Exception))
to turn an Option into an already-completed Future. So
for {
ud <- userStore.getUserDetails(user.userId) // RHS is a Future[Either[...]]
sid = ud.right.toOption.flatMap(_.schoolId) // RHS is an Option[Int]
fid <- sid.map(Future.successful).getOrElse(Future.failed(new Exception)) // RHS is Future[Int]
s <- schoolStore.getSchool(fid)
} yield s
will do the trick. Is that better than what you've got? Doubtful. But if you
implicit class OptionIsFuture[A](val option: Option[A]) extends AnyVal {
def future = option.map(Future.successful).getOrElse(Future.failed(new Exception))
}
then suddenly the for-comprehension looks reasonable again:
for {
ud <- userStore.getUserDetails(user.userId)
sid <- ud.right.toOption.flatMap(_.schoolId).future
s <- schoolStore.getSchool(sid)
} yield s
Is this the best way to write this code? Probably not; it relies upon converting a None into an exception simply because you don't know what else to do at that point. This is hard to work around because of the design decisions of Future; I'd suggest that your original code (which invokes a filter) is at least as good of a way to do it.
This answer to a similar question about Promise[Option[A]] might help. Just substitute Future for Promise.
I'm inferring the following types for getUserDetails and getSchool from your question:
getUserDetails: UserID => Future[Either[??, UserDetails]]
getSchool: SchoolID => Future[Option[School]]
Since you ignore the failure value from the Either, transforming it to an Option instead, you effectively have two values of type A => Future[Option[B]].
Once you've got a Monad instance for Future (there may be one in scalaz, or you could write your own as in the answer I linked), applying the OptionT transformer to your problem would look something like this:
for {
ud <- optionT(getUserDetails(user.userID) map (_.right.toOption))
sid <- optionT(Future.successful(ud.schoolID))
s <- optionT(getSchool(sid))
} yield s
Note that, to keep the types compatible, ud.schoolID is wrapped in an (already completed) Future.
The result of this for-comprehension would have type OptionT[Future, School]. You can extract a value of type Future[Option[School]] with the transformer's run method.
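If you don't want to pull in a library just for this, the idea behind the transformer can be sketched by hand. FutureOption below is a hypothetical, minimal wrapper written for this example (it is not scalaz's OptionT), and the stores are simplified stubs in which only user 1 has a school.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical minimal wrapper around Future[Option[A]].
final case class FutureOption[A](run: Future[Option[A]]) {
  def map[B](f: A => B)(implicit ec: ExecutionContext): FutureOption[B] =
    FutureOption(run.map(_.map(f)))

  def flatMap[B](f: A => FutureOption[B])(implicit ec: ExecutionContext): FutureOption[B] =
    FutureOption(run.flatMap {
      case Some(a) => f(a).run
      case None    => Future.successful(Option.empty[B]) // short-circuit on None
    })
}

object SchoolLookup {
  import ExecutionContext.Implicits.global

  case class UserDetails(userId: Int, schoolId: Option[Int])
  case class School(schoolId: Int, name: String)

  // Stubbed stores (assumptions): only user 1 has a school.
  def getUserDetails(userId: Int): Future[Either[String, UserDetails]] =
    Future.successful(Right(UserDetails(userId, if (userId == 1) Some(1) else None)))

  def getSchool(schoolId: Int): Future[Option[School]] =
    Future.successful(Some(School(schoolId, "Big School")))

  def schoolFor(userId: Int): Future[Option[School]] =
    (for {
      ud  <- FutureOption(getUserDetails(userId).map(_.toOption))
      sid <- FutureOption(Future.successful(ud.schoolId))
      s   <- FutureOption(getSchool(sid))
    } yield s).run
}
```

A missing school id then yields a successful Future containing None instead of a failed Future, which is the main behavioural difference from the filter-based version.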
What behavior would you like to occur in the case that the Option[School] is None? Would you like the Future to fail? With what kind of exception? Would you like it to never complete? (That sounds like a bad idea).
Anyways, the if clause in a for-expression desugars to a call to the filter method. The contract on Future#filter is thus:
If the current future contains a value which satisfies the predicate,
the new future will also hold that value. Otherwise, the resulting
future will fail with a NoSuchElementException.
But wait:
scala> None.get
java.util.NoSuchElementException: None.get
As you can see, None.get throws the exact same exception type.
Thus, getting rid of the if sid.isDefined should work, and this should return a reasonable result:
val schoolFuture = for {
ud <- userStore.getUserDetails(user.userId)
sid = ud.right.toOption.flatMap(_.schoolId)
s <- schoolStore.getSchool(sid.get)
} yield s
Keep in mind that schoolFuture can complete with a scala.util.Failure wrapping a NoSuchElementException. But you haven't described what other behavior you'd like.
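A quick way to convince yourself of the equivalence is to look at how each variant fails. The sketch below uses made-up values and a small helper, failureOf, that exists only for this example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FailureModes {
  // Waits for completion and returns the failure, if there was one.
  def failureOf[A](f: Future[A]): Option[Throwable] =
    Await.ready(f, 5.seconds).value.flatMap(_.failed.toOption)

  // A guard that does not hold, as with `if sid.isDefined` on a None.
  val viaFilter: Option[Throwable] =
    failureOf(Future.successful(1).filter(_ > 5))

  // Calling get on a None inside the Future.
  val viaGet: Option[Throwable] =
    failureOf(Future.successful(Option.empty[Int]).map(_.get))
}
```

Both values hold a NoSuchElementException, so a caller that recovers from that exception type handles either variant the same way.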
We've made a small wrapper on Future[Option[T]] called MaybeLater, which acts like a single monad (nobody has checked any of the monad laws, but there are map, flatMap, foreach, filter and so on). It behaves much like an async Option.
There is a lot of smelly code there, but maybe it will be useful at least as an example.
BTW: there are a lot of open questions (here, for example).
It's easier to use https://github.com/qifun/stateless-future or https://github.com/scala/async to do A-Normal-Form transform.
Blocking bad, async good, but is blocking within a future still blocking? This is something that I keep coming back to; consider the following pseudo-code:
def queryName(id:Id):Future[String]
def queryEveryonesNames:Future[Seq[String]] = {
val everyonesIds:Future[Seq[Id]] = getIds
val everyonesNames:Future[Seq[Future[String]]] = {
everyonesIds.map(seq => seq.map(id => queryName(id)))
}
// I'm trying to understand the impact of what I'll do below
everyonesNames.map(seq=>seq.map(fut=>blocking(fut, 1 s)))
}
queryEveryonesNames
In the last line I turned Future[Seq[Future[String]]] (notice future within future) into Future[Seq[String]] by blocking on the inner future.
Blocking on a future within a future feels redundant, at least here, yet having a future within a future feels redundant as well.
Can you propose a smarter way of getting rid of the inner future?
Do you think blocking on a future inside a future is bad? If so why and under what circumstances?
Yes, blocking inside a future is still blocking. You should avoid it, because resources (a thread) are tied up waiting for a result, even if that happens on another thread.
If I understood correctly, your question is how to convert Future[Seq[Future[String]]] into Future[Seq[String]] in a non-blocking way.
You can do that with for-comprehensions:
val in: Future[Seq[Future[String]]] = ??? // the nested future from the question
val m = for (a <- in) // a is Seq[Future[String]]
yield Future.sequence(a) // m is Future[Future[Seq[String]]]
val result = for (a <- m; b <- a) yield b // result is Future[Seq[String]]
EDIT:
Or just:
val result = in.flatMap(a => Future.sequence(a))
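Put together as something runnable, the non-blocking version of the question's queryEveryonesNames could look like the sketch below. getIds and queryName are stubs invented for the example, and Int stands in for the question's Id type.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FlattenDemo {
  // Stubs standing in for the real queries (assumptions).
  def getIds: Future[Seq[Int]] = Future(Seq(1, 2, 3))
  def queryName(id: Int): Future[String] = Future(s"name-$id")

  // flatMap removes the outer nesting; traverse turns
  // Seq[Future[String]] into Future[Seq[String]] without blocking.
  def queryEveryonesNames: Future[Seq[String]] =
    getIds.flatMap(ids => Future.traverse(ids)(queryName))
}
```

The only place a caller might block is at the very edge of the program, for example Await.result(FlattenDemo.queryEveryonesNames, 5.seconds) in a main method or a test.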