For comprehension - execute futures in order - scala

If I have the following for comprehension, the futures will be executed in the order f1, f2, f3:
val f = for {
  r1 <- f1
  r2 <- f2(r1)
  r3 <- f3(r2)
} yield r3
For this one however, all the futures are started at the same time:
val f = for {
  r1 <- f1
  r2 <- f2
  r3 <- f3
} yield ...
How can I enforce the order? (I want this order of execution: f1, f2, f3.)

It does matter what f1, f2, f3 are: a future starts executing as soon as it is created. In your first case, f2 must be a function returning a future, so that future only begins executing when the function is called, which happens when r1 becomes available.
If the second case is the same (f2 is a function), then the behavior will be the same as in the first case: your futures will be executed sequentially, one after the other.
But if you create the futures outside the for, and just assign them to variables f1, f2, f3, then by the time you get inside the comprehension, they are already running.
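To make the difference concrete, here is a minimal runnable sketch (slowTask is a hypothetical helper, not from the question); in the sequential version the "started" lines appear one at a time, while in the concurrent version they all appear at once:
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical helper: a future that logs when it starts.
def slowTask(name: String): Future[Int] = Future {
  println(s"$name started")
  Thread.sleep(500)
  1
}

// Sequential: each future is only created inside the previous flatMap.
val sequential = for {
  r1 <- slowTask("f1")
  r2 <- slowTask("f2") // created only after f1 completes
  r3 <- slowTask("f3") // created only after f2 completes
} yield r1 + r2 + r3
Await.result(sequential, 5.seconds)

// Concurrent: the vals start the futures immediately, before the for.
val g1 = slowTask("f1")
val g2 = slowTask("f2")
val g3 = slowTask("f3")
val concurrent = for {
  r1 <- g1
  r2 <- g2
  r3 <- g3
} yield r1 + r2 + r3
Await.result(concurrent, 5.seconds)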

Futures are eager constructs; that is, once created you cannot dictate when they get processed. If the Future already exists when you attempt to use it in a for comprehension, you've already lost the ability to sequence its execution order.
If you want to enforce ordering in a method that accepts Future arguments, then you'll need to wrap the evaluation in a thunk (a by-name parameter):
def foo(ft: => Future[Thing], f2: => Future[Thing]): Future[Other] = for {
  r1 <- ft
  r2 <- f2
} yield something(r1, r2)
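A minimal self-contained sketch of the same idea (Int stands in for Thing/Other, and the names here are illustrative, not from the answer); the println output shows that the second future is not even created until the first has completed:
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// By-name parameters: the Future blocks are not evaluated at the call site.
def ordered(ft: => Future[Int], f2: => Future[Int]): Future[Int] = for {
  r1 <- ft // the first future is created and started here
  r2 <- f2 // the second future is created only after r1 is available
} yield r1 + r2

val sum = ordered(
  Future { println("first started"); 1 },
  Future { println("second started"); 2 }
)
Await.result(sum, 5.seconds) // prints "first started" then "second started"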
If, on the other hand, you want to define the Future within a method body, then instead of a val use a def:
def foo() = {
  def f1 = Future { /* code here... */ }
  def f2 = Future { /* code here... */ }
  for {
    r1 <- f1
    r2 <- f2
  } yield something(r1, r2)
}

Running futures in parallel is the default behavior when they are created before the for comprehension; that is good when a few tasks can be processed in parallel without any blocking.
But if you want to preserve the processing order you have two ways: either send the result of the first task to the second, as in your example, or use the andThen combinator:
import scala.collection.mutable
import scala.util.Success

val allposts = mutable.Set[String]()

Future {
  session.getRecentPosts
} andThen {
  case Success(posts) => allposts ++= posts
} andThen {
  case _ =>
    clearAll()
    for (post <- allposts) render(post)
}


ZIO : How to compute only once?

I am using ZIO: https://github.com/zio/zio
in my build.sbt:
"dev.zio" %% "zio" % "1.0.0-RC9"
No matter what I try, my results are recomputed each time I need them:
val t = Task {
  println(s"Compute")
  12
}
val r = unsafeRun(for {
  tt1 <- t
  tt2 <- t
} yield {
  tt1 + tt2
})
println(r)
For this example, the log looks like:
Compute
Compute
24
I tried with Promise:
val p = for {
  p <- Promise.make[Nothing, Int]
  _ <- p.succeed {
    println(s"Compute - P")
    48
  }
  r <- p.await
} yield {
  r
}
val r = unsafeRun(for {
  tt1 <- p
  tt2 <- p
} yield {
  tt1 + tt2
})
And I get the same issue:
Compute - P
Compute - P
96
I tried with
val p = for {
  p <- Promise.make[Nothing, Int]
  _ <- p.succeed(48)
  r <- p.await
} yield {
  println(s"Compute - P")
  r
}
first, and I was thinking that maybe the pipeline would be re-executed but the value not recomputed, but it does not work either.
I would like to be able to compute my values asynchronously and to be able to reuse them.
I looked at How do I make a Scalaz ZIO lazy? but it does not work for me either.
ZIO has memoize, which should do essentially what you want. I don't have a way to test it just now, but it should work something like:
for {
  memoized <- t.memoize
  tt1 <- memoized
  tt2 <- memoized
} yield tt1 + tt2
Note that unless the second and third lines of your real code have some branching that might result in the Task never getting called, or getting called only once, this yields the same answer and side effects as the much simpler:
t.map(tt => tt + tt)
Does computing the results have side effects? If it doesn't you can just use a regular old lazy val, perhaps lifted into ZIO.
lazy val results = computeResults()
val resultsIO = ZIO.succeedLazy(results)
If it does have side effects, you can't really cache the results because that wouldn't be referentially transparent, which is the whole point of ZIO.
What you'll probably have to do is flatMap on your compute Task and write the rest of your program which needs the result of that computation inside that call to flatMap, threading the result value as a parameter through your function calls where necessary.
val compute = Task {
  println(s"Compute")
  12
}
compute.flatMap { result =>
  // the rest of your program
}
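A hedged sketch of what that threading might look like, reusing the compute Task above (restOfProgram is a made-up continuation, and unsafeRun is assumed to be available as in the question):
// Hypothetical continuation that needs the computed value in two places.
def restOfProgram(result: Int): Task[Int] =
  Task(result + result)

val program: Task[Int] =
  compute.flatMap(result => restOfProgram(result))

// "Compute" is printed once per run of `program`.
println(unsafeRun(program)) // 24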

Return Future[(Int,Int)] instead of (Future[Int],Future[Int])

I have the following futures:
def f1 = Future {1}
def f2 = Future {2}
I need the following code to return Future[(Int,Int)]:
val future = // function that returns a future
future.flatMap {
  val x1 = f1
  val x2 = f2
  (x1, x2) // This returns (Future[Int], Future[Int])
}
Instead of (Future[Int],Future[Int]) I need the function to return Future[(Int,Int)]. How to convert it?
I'm going to take issue with the (currently) accepted answer here. As per my comment, this is not really the correct way to zip together two futures. The correct way is simply this:
f1 zip f2
The other answer:
for (x <- f1; y <- f2) yield (x, y)
Whilst this will work, it is not parallel in the case that f2 is an expression yielding a future (as it is in this question). If this is the case, f2 will not be constructed until the first future has completed [1]. Whilst zip has been implemented in terms of flatMap in the same way, because its argument is strict, the second future is already running (subject to the execution context of course).
It's also more succinct!
[1] - this can be seen by observing that x, the value computed by f1, is in scope as f2 is constructed
Appendix
This can be demonstrated easily:
scala> import scala.concurrent._; import ExecutionContext.Implicits.global
import scala.concurrent._
import ExecutionContext.Implicits.global
scala> def createAndStartFuture(i: Int): Future[Int] = Future {
| println(s"starting $i in ${Thread.currentThread.getName} at ${java.time.Instant.now()}")
| Thread.sleep(20000L)
| i
| }
createAndStartFuture: (i: Int)scala.concurrent.Future[Int]
With this:
scala> for (x <- createAndStartFuture(1); y <- createAndStartFuture(2)) yield (x, y)
starting 1 in scala-execution-context-global-34 at 2017-05-05T10:29:47.635Z
res15: scala.concurrent.Future[(Int, Int)] = Future(<not completed>)
// Waits 20s
starting 2 in scala-execution-context-global-32 at 2017-05-05T10:30:07.636Z
But with zip
scala> createAndStartFuture(1) zip createAndStartFuture(2)
starting 1 in scala-execution-context-global-34 at 2017-05-05T10:30:45.434Z
starting 2 in scala-execution-context-global-32 at 2017-05-05T10:30:45.434Z
res16: scala.concurrent.Future[(Int, Int)] = Future(<not completed>)
This is a classic use case for a for comprehension.
for {
  x1 <- f1
  x2 <- f2
} yield (x1, x2)
This is equivalent to a flatMap followed by a map:
f1.flatMap(x1 => f2.map(x2 => (x1,x2)))
Since you will not reach the inside of f1.flatMap until x1 is ready, this code will execute sequentially.
If you wish for the futures to execute concurrently, you can instantiate the futures before the for comprehension.
val future1 = f1
val future2 = f2
for {
  x1 <- future1
  x2 <- future2
} yield (x1, x2)

How to dynamically generate parallel futures with for-yield

I have the following code:
val f1 = Future(genA1)
val f2 = Future(genA2)
val f3 = Future(genA3)
val f4 = Future(genA4)
val results: Future[Seq[A]] = for {
  a1 <- f1
  a2 <- f2
  a3 <- f3
  a4 <- f4
} yield Seq(a1, a2, a3, a4)
Now I have a requirement to optionally exclude a2; how do I modify the code? (Using map or flatMap is also acceptable.)
Furthermore, say I have M possible futures that need to be aggregated like above, and N of the M could be optionally excluded based on some flag (business logic); how should I handle that?
In question 1, I understand that you want to exclude one entry (e.g. a2) from the sequence given some logic, and in question 2, you want to suppress N entries from a total of M and have the future computed over the remaining results. We could generalize both cases to something like this:
// Using a map as a simple example, but 'generators' could be any structure that
// associates a key with the computation to create
val generators = Map('a' -> genA1, 'b' -> genA2, 'c' -> genA3, 'd' -> genA4)
...
// shouldAccept(k) => business logic to decide which computations should be executed
val selectedGenerators = generators.filter { case (k, v) => shouldAccept(k) }
// Create a collection of Futures from the selected computations
val futures = selectedGenerators.map { case (k, v) => Future(v) }
// Create a Future[Seq[_]] to have the result of computing all selected entries
val result = Future.sequence(futures)
In general, what I think you are looking for is Future.sequence, which takes a Seq[Future[_]] and produces a Future[Seq[_]], which is basically what you are doing "by hand" with the for-comprehension.
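A small self-contained sketch of this approach (the gen helper, the keys, and shouldInclude are made up for illustration):
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical generator functions standing in for genA1..genA4.
def gen(i: Int): Int = i * 10

// Hypothetical business logic: exclude the second computation.
def shouldInclude(key: String): Boolean = key != "a2"

val generators: Map[String, () => Int] =
  Map("a1" -> (() => gen(1)), "a2" -> (() => gen(2)),
      "a3" -> (() => gen(3)), "a4" -> (() => gen(4)))

// Keep only the selected computations, start each one as a Future,
// then flip Seq[Future[Int]] into Future[Seq[Int]].
val results: Future[Seq[Int]] =
  Future.sequence(
    generators.collect { case (k, f) if shouldInclude(k) => Future(f()) }.toSeq
  )

println(Await.result(results, 5.seconds)) // e.g. List(10, 30, 40), order may vary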

Most idiomatic way to mix synchronous, asynchronous, and parallel computation in a scala for comprehension of futures

Suppose I have 4 future computations to do. The first two can be done in parallel, but the third must be done after the first two (even though the values of the first two are not used in the third -- think of each computation as a command that performs some db operation). Finally, there is a 4th computation that must occur after all of the first 3. Additionally, there is a side effect that can be started after the first 3 complete (think of this as kicking off a periodic runnable). In code, this could look like the following:
for {
  _ <- async1 // not done in parallel with async2 :( is there
  _ <- async2 // any way of achieving this cleanly inside of for?
  _ <- async3
  _ = sideEffect // do I need "=" here??
  _ <- async4
} yield ()
The comments show my doubts about the quality of the code:
What's the cleanest way to do two operations in parallel in a for comprehension?
Is there a way to achieve this result without so many "_" characters (or assigning a named reference, at least in the case of sideEffect)?
What's the cleanest and most idiomatic way to do this?
You can use zip to combine two futures, including the result of zip itself. You'll end up with tuples holding tuples, but if you use infix notation for Tuple2 it is easy to take them apart. Below I define a synonym ~ for succinctness (this is what the parser combinator library does, except its ~ is a different class that behaves similarly to Tuple2).
As an alternative for _ = for the side effect, you can either move it into the yield, or combine it with the following statement using braces and a semicolon. I would still consider _ = to be more idiomatic, at least so far as having a side effecting statement in the for is idiomatic at all.
val ~ = Tuple2
for {
  a ~ b ~ c <- async1 zip
               async2 zip
               async3
  d <- { sideEffect; async4 }
} yield (a, b, c, d)
For comprehensions represent monadic operations, and monadic operations are sequenced. There's a superclass of monad, applicative, where computations don't depend on the results of prior computations and thus may be run in parallel.
Scalaz has a |@| operator for combining applicatives, so you can use (future1 |@| future2)(proc(_, _)) to dispatch two futures in parallel and then run "proc" on the result of both of them, as opposed to the sequential computation of for { a <- future1; b <- future2(a) } yield b (or just future1 flatMap future2).
There's already a method on stdlib Futures called .zip that combines Futures in parallel, and indeed the scalaz impl uses this: https://github.com/scalaz/scalaz/blob/scalaz-seven/core/src/main/scala/scalaz/std/Future.scala#L36
And .zip and for-comprehensions may be intermixed to have parallel and sequential parts, as appropriate.
So just using the stdlib syntax, your above example could be written as:
for {
  _ <- async1 zip async2
  _ <- async3
  _ = sideEffect
  _ <- async4
} yield ()
Alternatively, written without a for comprehension:
async1 zip async2 flatMap (_ => async3) flatMap { _ => sideEffect; async4 }
Just as an FYI, it's really simple to get two futures to run in parallel and still process them via a for-comprehension. The suggested solutions of using zip can certainly work, but I find that when I want to handle a couple of futures and do something when they are all done, and I have two or more that are independent of each other, I do something like this:
val f1 = async1
val f2 = async2
// First two futures are now running in parallel
for {
  r1 <- f1
  r2 <- f2
  _ <- async3
  _ = sideEffect
  _ <- async4
} yield {
  ...
}
Now the way the for comprehension is structured certainly waits on f1 before checking on the completion status of f2, but the logic behind these two futures is running at the same time. This is a little simpler than some of the suggestions but still might give you what you need.
Your code already looks well structured, minus computing the futures in parallel.
Use helper functions, ideally writing a code generator to print out helpers for all tuple cases.
As far as I know, you do need to either name the result or assign it to _.
Example code with helpers:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object Example {
  def run: Future[Unit] = {
    for {
      (a, b, c) <- par(
        Future.successful(1),
        Future.successful(2),
        Future.successful(3)
      )
      constant = 100
      (d, e) <- par(
        Future.successful(a + 10),
        Future.successful(b + c)
      )
    } yield {
      println(constant)
      println(d)
      println(e)
    }
  }

  def par[A, B](a: Future[A], b: Future[B]): Future[(A, B)] = {
    for {
      a <- a
      b <- b
    } yield (a, b)
  }

  def par[A, B, C](a: Future[A], b: Future[B], c: Future[C]): Future[(A, B, C)] = {
    for {
      a <- a
      b <- b
      c <- c
    } yield (a, b, c)
  }
}

Example.run
Example.run
Edit:
generated code for 1 to 20 futures: https://gist.github.com/nanop/c448db7ac1dfd6545967#file-parhelpers-scala
parPrinter script: https://gist.github.com/nanop/c448db7ac1dfd6545967#file-parprinter-scala

Scala's "for comprehension" with futures

I am reading through the Scala Cookbook (http://shop.oreilly.com/product/0636920026914.do)
There is an example related to Future use that involves a for comprehension.
So far my understanding of for comprehensions is that when used with a collection they will produce another collection of the same type. For example, if each futureX is of type Future[Int], the following should also be of type Future[Int]:
for {
  r1 <- future1
  r2 <- future2
  r3 <- future3
} yield (r1+r2+r3)
Could someone explain to me what exactly is happening when <- is used in this code?
I know that if it were a generator over a collection it would fetch each element by looping.
First, about the for comprehension. As has been answered on SO many times, it's an abstraction over a couple of monadic operations: map, flatMap, withFilter. When you use <-, scalac desugars that line into a monadic flatMap:
r <- monad becomes monad.flatMap(r => ... )
It looks like an imperative computation (which is what a monad is all about): you bind a computation result to r. The yield part is desugared into a map call. The result type depends on the type of the monad.
The Future trait has flatMap and map functions, so we can use a for comprehension with it. Your example can be desugared into the following code:
future1.flatMap(r1 => future2.flatMap(r2 => future3.map(r3 => r1 + r2 + r3) ) )
Parallelism aside
It goes without saying that if execution of future2 depends on r1 then you can't escape sequential execution, but if the future computations are independent, you have two choices. You can enforce sequential execution, or allow for parallel execution. You can't enforce the latter, as the execution context will handle this.
val res = for {
  r1 <- computationReturningFuture1(...)
  r2 <- computationReturningFuture2(...)
  r3 <- computationReturningFuture3(...)
} yield (r1+r2+r3)
will always run sequentially. It can be easily explained by the desugaring, after which the subsequent computationReturningFutureX calls are only invoked inside of the flatMaps, i.e.
computationReturningFuture1(...).flatMap(r1 =>
  computationReturningFuture2(...).flatMap(r2 =>
    computationReturningFuture3(...).map(r3 => r1 + r2 + r3)))
However this is able to run in parallel and the for comprehension aggregates the results:
val future1 = computationReturningFuture1(...)
val future2 = computationReturningFuture2(...)
val future3 = computationReturningFuture3(...)
val res = for {
  r1 <- future1
  r2 <- future2
  r3 <- future3
} yield (r1+r2+r3)
To elaborate on the existing answers, here is a simple example that demonstrates how the for comprehension works.
The functions are a bit lengthy, but they are worth looking into.
A function that gives us a range of integers:
scala> def createIntegers = Future {
         println("INT " + Thread.currentThread().getName + " Begin.")
         val returnValue = List.range(1, 256)
         println("INT " + Thread.currentThread().getName + " End.")
         returnValue
       }
createIntegers: scala.concurrent.Future[List[Int]]
A function that gives us a range of chars:
scala> import scala.collection.mutable.ListBuffer

scala> def createAsciiChars = Future {
         println("CHAR " + Thread.currentThread().getName + " Begin.")
         val returnValue = new ListBuffer[Char]
         for (i <- 1 to 256) {
           returnValue += i.toChar
         }
         println("CHAR " + Thread.currentThread().getName + " End.")
         returnValue
       }
createAsciiChars: scala.concurrent.Future[scala.collection.mutable.ListBuffer[Char]]
Using these function calls within the for comprehension:
scala> import scala.concurrent.Await, scala.concurrent.duration.Duration

scala> val result = for {
         i <- createIntegers
         s <- createAsciiChars
       } yield i.zip(s)
result: scala.concurrent.Future[List[(Int, Char)]] = Future(<not completed>)

scala> Await.result(result, Duration.Inf)
From the output below we can make out that the function calls are synchronous, i.e. the createAsciiChars call is not executed until createIntegers completes its execution.
INT scala-execution-context-global-27 Begin.
INT scala-execution-context-global-27 End.
CHAR scala-execution-context-global-28 Begin.
CHAR scala-execution-context-global-28 End.
Making the createAsciiChars and createIntegers calls outside the for comprehension (assigning them to vals first) gives asynchronous, concurrent execution.
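For example, a sketch reusing the two functions above (the exact thread names and interleaving of the Begin/End lines will vary):
// Both futures start as soon as the vals are defined, so the
// "INT ... Begin." and "CHAR ... Begin." lines can interleave.
val integersF = createIntegers
val asciiCharsF = createAsciiChars

val zipped = for {
  i <- integersF
  s <- asciiCharsF
} yield i.zip(s)

Await.result(zipped, Duration.Inf)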
It allows r1, r2, r3 to run in parallel, if possible. It may not be possible, depending on things like how many threads are available to execute Future computations, but by using this syntax you are asking for these computations to run in parallel where possible, with the yield executed once all of them have completed.