future built from by-name parameter not executed in parallel - scala

I am trying to build a new control structure that creates a thread for each of its arguments and runs them in parallel. The code seems fine when I build the two futures manually, one per input, because I see the fast thread finish before the slow thread.
Here is the output:
fast
slow
However, if I use List(a, b).map(f => Future {f}), the fast thread always runs only after the slow one is done. Here is the output:
slow
fast
Can someone explain this?
The code is pasted here:
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}
object ExecInParallel extends App {
  def run(a: => Unit, b: => Unit): Unit = {
    val executorService = Executors.newFixedThreadPool(2)
    implicit val executionContext =
      ExecutionContext.fromExecutorService(executorService)
    // af and bf are executed in parallel
    val af = Future(a)
    val bf = Future(b)
    // however, the following code is not parallel
    List(a, b).map(f => Future(f))
    Thread.sleep(3000)
    executorService.shutdown
  }
  run(
    {
      Thread.sleep(2000)
      println("slow")
    },
    {
      Thread.sleep(1000)
      println("fast")
    }
  )
}

That's because a and b are evaluated every time they are referenced in a non-by-name position, and the arguments of List(a, b) are not by-name. From https://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_name:
Call by name is an evaluation strategy where the arguments to a function are not evaluated before the function is called—rather,... then left to be evaluated whenever they appear in the function. If an argument is not used in the function body, the argument is never evaluated; if it is used several times, it is re-evaluated each time it appears.
Effectively that is equivalent to this code:
List({
Thread.sleep(2000)
println("slow")
},
{
Thread.sleep(1000)
println("fast")
}).map(f => Future(f))
Since the List constructor doesn't take by-name arguments, these values are evaluated before the list itself is even constructed.
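As a minimal sketch of the re-evaluation rule quoted above (the names ByNameCount, twice, build and demo are made up for illustration), a counter can show how often a by-name argument is actually evaluated:

```scala
object ByNameCount {
  // x is by-name, so the argument expression runs each time x is referenced.
  def twice(x: => Int): Int = x + x

  // Returns (result of twice, evaluations inside twice, evaluations while building a List).
  def demo(): (Int, Int, Int) = {
    var evals = 0
    val sum = twice { evals += 1; 10 }
    val insideTwice = evals

    // List's arguments are strict, so each reference to `a` evaluates it on the spot.
    def build(a: => Int): List[Int] = List(a, a)
    evals = 0
    val built = build { evals += 1; 10 }
    (sum, insideTwice, evals)
  }
}
```

Each reference to the by-name parameter runs the block again, which is why List(a, b) in the question runs both blocks on the calling thread before any Future exists.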

This happens because you first build a list from the two call-by-name values:
List(a, b)...
and the map operation is not executed until a and b have been fully computed.
Only when List(a, b) is ready are its elements wrapped in Futures:
List(a, b).map(f => Future(f))

Your by-name a and b are executed (sequentially) in List(a, b), before the Futures are constructed in map. If you check the inferred type of List(a, b), you'll see it's a List[Unit].
To achieve what you intended, you need a list of functions rather than a list of results.
The following will work:
List(a _, b _).map(f => Future(f()))
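A self-contained sketch of the same idea, assuming explicit thunks (`() => a`) in place of `a _`; the names ByNameParallel, runBoth and demo, as well as the shortened sleep times, are illustrative and not part of the original code:

```scala
import java.util.concurrent.Executors
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ByNameParallel {
  // Wraps each by-name argument in a thunk so it is only evaluated inside its Future.
  def runBoth(a: => Unit, b: => Unit): Unit = {
    val pool = Executors.newFixedThreadPool(2)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val futures = List(() => a, () => b).map(thunk => Future(thunk()))
      Await.result(Future.sequence(futures), 5.seconds)
      ()
    } finally pool.shutdown()
  }

  // Records the order in which the two blocks finish.
  def demo(): List[String] = {
    val order = ListBuffer.empty[String]
    def record(s: String): Unit = order.synchronized { order += s; () }
    runBoth(
      { Thread.sleep(600); record("slow") },
      { Thread.sleep(100); record("fast") }
    )
    order.synchronized(order.toList)
  }
}
```

With two pool threads both blocks start at roughly the same time, so the shorter sleep should finish first.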

Related

Is it possible to lift an effect in Scala ZIO into another effectful context?

I'm looking for a way to lazily compose two effects without first executing their results in Zio. My program takes the following form:
/**
* Returns a reference to an effectful singleton cron scheduler backed by akka
* See https://github.com/philcali/cronish for more info on the API
*/
def scheduled: UManaged[Ref[Scheduled]] = ???
def schedule[R, E, A](e: => ZIO[R, E, A], crondef: String) =
(for {
resource <- scheduled
task <- ZManaged.fromEffect(e) // I need to lift the underlying effect here, not access its result
} yield resource.modify(schedule => schedule(job(task), crondef.cron) -> schedule)).flattenM
def scheduleEffect[A](e: => A, description: String = "")(crondef: String) =
(for {
resource <- scheduled
} yield resource.modify(schedule => schedule(job(e), crondef.cron) -> schedule)).flattenM
// Program which schedules cron jobs to increment/decrement x and y, respectively
def run(args: List[String]): URIO[ZEnv, ExitCode] = {
var x = 0
var y = 100
(for {
_ <- Scheduler.schedule(UIO({ x += 1; println(x) }), "every second")
_ <- Scheduler.scheduleEffect({ y -= 1; println(y) }, "every second")
} yield ())
.provideCustomLayer(???)
.as(ExitCode.success)
.useForever
}
In this current formulation, the decrementing of y runs every second until the program terminates, while the incrementing of x only runs once. I know that ZIO provides a Schedule utility, but for legacy compatibility reasons I have to stick with the effectful singleton used by the Cronish library. Basically job takes a pass-by-reference effect of type A and suspends it in a CronTask for execution within the Scheduled singleton according to the schedule defined by crondef.
What I am wondering is if it is possible to compose the effects themselves, rather than their results in the context of ZIO? I've basically wrapped the legacy cron scheduler in ZIO data types to manage the concurrency properly, but I still need the suspended effect from other ZIO-signature methods in my code to be available for me to pass down into the scheduler.
I ultimately found the solution by reading through the source code for ZIO.effectAsyncM, specifically noting its reference to ZIO.runtime[R]:
/**
* Imports an asynchronous effect into a pure `ZIO` value. This formulation is
* necessary when the effect is itself expressed in terms of `ZIO`.
*/
def effectAsyncM[R, E, A](
register: (ZIO[R, E, A] => Unit) => ZIO[R, E, Any]
): ZIO[R, E, A] =
for {
p <- Promise.make[E, A]
r <- ZIO.runtime[R] // This line right here!
a <- ZIO.uninterruptibleMask { restore =>
val f = register(k => r.unsafeRunAsync_(k.to(p)))
restore(f.catchAllCause(p.halt)).fork *> restore(p.await)
}
} yield a
Though I was not able to find a direct reference to this method in the ZIO docs, the Scaladocs were clear enough:
Returns an effect that accesses the runtime, which can be used to (unsafely) execute tasks. This is useful for integration with legacy code that must call back into ZIO code.
With this, my new implementation works beautifully as follows:
def schedule[R, E, A](e: => ZIO[R, E, A], crondef: String) =
(for {
resource <- scheduled
runtime <- ZManaged.fromEffect(ZIO.runtime[R])
} yield resource.modify({
schedule => schedule(job(runtime.unsafeRun(e)), crondef.cron) -> schedule
})).flattenM
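The difference between scheduling an effect's result and scheduling the effect itself can be sketched without ZIO. Everything below is a hypothetical stand-in: Effect is not the ZIO type, and tick only imitates a legacy scheduler that re-evaluates a by-name job on every firing.

```scala
object LiftSketch {
  // Hypothetical stand-in for a lazily described effect (not ZIO's data type).
  final case class Effect[A](run: () => A)

  // Hypothetical stand-in for the legacy scheduler: re-evaluates `job` on every tick.
  def tick[A](times: Int)(job: => A): Unit = (1 to times).foreach(_ => job)

  // Returns (count when the result is scheduled, count when the effect is scheduled).
  def demo(): (Int, Int) = {
    var once = 0
    var everyTick = 0
    val incOnce = Effect(() => once += 1)
    val incEvery = Effect(() => everyTick += 1)

    // Running the effect first and handing over its result: later ticks only see a value.
    val result: Unit = incOnce.run()
    tick(3)(result)

    // Handing over an expression that runs the effect: every tick executes it again.
    tick(3)(incEvery.run())

    (once, everyTick)
  }
}
```

In this sketch the first counter ends at 1 and the second at 3, mirroring why x was incremented only once in the question while the version that calls unsafeRun inside job runs on every firing.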

What exactly is returned in a Scala Future?

I am new to Scala and am working on an application using Akka and futures.
I have a class (C) that processes a future and returns a List (L) of objects of type X. X has a field y, and it is this field I am interested in. (Note: L was previously converted from a ListBuffer type.)
In the Main Class that calls C the code to process the result is:
Snippet A:
result1.onComplete {
  result =>
    result.foreach(f = (o: List[X]) => {
      o.foreach(f = (x: X) => {
        println(x.y)
      })
    })
}
I can also use this code:
Snippet B:
result1 onSuccess{
case result =>
result.foreach(f = (o: X) => {
println(o.y)
})
}
Obviously, as onSuccess is deprecated, the first form may be preferred. The second form (Snippet B) I can understand, but I am baffled as to why, when using onComplete, I have to use nested 'foreach' calls to get the same result. Can anyone help? I don't yet have a clear conception of what 'result' (Try[T] => U) actually is. Thanks in advance.
If you take a look at the Future.onComplete Scaladoc, you will see that it receives a function from Try[T] (where T is the type of your future) to some type U.
"When this future is completed, either through an exception, or a value, apply the provided function".
"Note that the returned value of f will be discarded".
The Try is there to capture failures of the asynchronous computation. That's why you need the two foreach calls: the first one extracts the successful List from the Try, and the second one visits each element of the List.
You didn't need two in the case of onSuccess because, as the name says, that callback is only called if the Future completes successfully.
However, note that doing a foreach on a Try is a bad idea, since you are not handling the failure, and the code also becomes hard to read. Try this instead:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Success, Failure}
case class Result(data: Int)
val process: Future[List[Result]] =
Future.successful(List(Result(data = 5), Result(data = 3)))
process.onComplete {
case Success(results) =>
for (result <- results) {
println(result.data)
}
case Failure(ex) =>
println(s"Error: ${ex.getMessage}")
}
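To illustrate why foreach on a Try hides failures, here is a small sketch (TryForeach and collect are illustrative names) that gathers whatever the nested foreach calls actually visit:

```scala
import scala.collection.mutable.ListBuffer
import scala.util.{Failure, Success, Try}

object TryForeach {
  // Collects every element reached by the nested foreach calls.
  def collect(t: Try[List[Int]]): List[Int] = {
    val seen = ListBuffer.empty[Int]
    // The outer foreach runs only for Success; a Failure is silently skipped.
    t.foreach(list => list.foreach(seen += _))
    seen.toList
  }
}
```

For a Success the elements are visited as expected, while for a Failure nothing happens at all and the exception goes unnoticed, which is the case the pattern match on Success and Failure handles explicitly.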

Scalaz task firstCompletedOf

I have two scalaz.concurrent.Tasks which are performing a HTTP request to different servers.
I want to compose them in a manner similar to Future.firstCompletedOf, that is: run them both in parallel and get the result of the first one that successfully completes.
Unfortunately, Task.gatherUnordered does not do what I want, since it runs every task to completion before returning the result.
Not sure how to do it in scalaz.concurrent natively, but this one works for me:
import scalaz.Nondeterminism._
import scalaz.std.either.eitherInstance
import scalaz.syntax.bitraverse._
def race[A, B](t1: Task[A], t2: Task[B]): Task[A \/ B] = {
Nondeterminism[Task].choose(t1, t2).map {
_.bimap(_._1, _._2)
}
}
In fs2, the successor of scalaz.concurrent, it is fs2.async#race.
While using bimap is indeed correct, there's an alternative implementation:
import scalaz.concurrent.Task
import scalaz.Nondeterminism
def firstOf[A, B, C](ta: Task[A], tb: Task[B])(fa: A => C, fb: B => C): Task[C] =
Nondeterminism[Task].chooseAny(ta.map(fa), Seq(tb.map(fb))).map(_._1)
val task1 = Task { Thread.sleep(10000); 4 }
val task2 = Task { Thread.sleep(5000); "test" }
firstOf(task1, task2)(_.toString, identity).unsafePerformSync // test
Here I'm assuming that non-deterministic retrieval of results is used to obtain equivalent values whose exact computation time is unknown. So the function incorporates the conversions fa and fb to the common type, performed concurrently as part of each task. This is also useful when the conversion time itself is hard to predict, for example when extracting data from an HTTP response: the first result is selected after conversion. For simpler cases, a variant of the race function that performs the mapping in parallel can be derived from firstOf as follows:
def race[A, B](ta: Task[A], tb: Task[B]): Task[A \/ B] = firstOf(ta, tb)(-\/(_), \/-(_))
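For comparison, the standard-library behaviour the question refers to can be sketched with scala.concurrent.Future. The object name, the dedicated two-thread pool and the sleep times are illustrative choices; note also that Future.firstCompletedOf yields the first future to complete, whether it succeeded or failed.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FirstCompleted {
  def demo(): String = {
    // A dedicated pool with two threads so both futures really run side by side.
    val pool = Executors.newFixedThreadPool(2)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val slow = Future { Thread.sleep(800); "slow" }
      val fast = Future { Thread.sleep(100); "fast" }
      // Completes as soon as either future completes; the other keeps running.
      Await.result(Future.firstCompletedOf(List(slow, fast)), 5.seconds)
    } finally pool.shutdown()
  }
}
```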

For loop containing Scala Futures modifying a List

Let's say I have a ListBuffer[Int] and I iterate it with a foreach loop, and each loop will modify this list from inside a Future (removing the current element), and will do something special when the list is empty. Example code:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.collection.mutable.ListBuffer
val l = ListBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
l.foreach(n => Future {
println(s"Processing $n")
Future {
l -= n
println(s"Removed $n")
if (l.isEmpty) println("List is empty!")
}
})
This is probably going to end very badly. I have a more complex code with similar structure and same needs, but I do not know how to structure it so I can achieve same functionality in a more reliable way.
The way you present your problem does not really fit the functional paradigm that Scala is intended for.
What you seem to want is to run a list of asynchronous computations, do something at the end of each one, and do something else when all of them are finished. This is pretty simple if you use continuations, which are easy to implement with the map and flatMap methods on Future.
val fa: Future[Int] = Future { 1 }
// will apply the function to the result when it becomes available
val fb: Future[Int] = fa.map(a => a + 1)
// will start the asynchronous computation using the result once it becomes available
val fc: Future[Int] = fa.flatMap(a => Future { a + 2 })
Once you have all this, you can easily do something when each of your Future completes (successfully):
val myFutures: List[Future[Int]] = ???
myFutures.map(futInt => futInt.map(int => int + 2))
Here, I will add 2 to each value I get from the different asynchronous computations in the List.
You can also choose to wait for all the Futures in your list to complete by using Future.sequence:
val myFutureList: Future[List[Int]] = Future.sequence(myFutures)
Once again, you get a Future, which will be resolved once all of the Futures inside the input list have resolved successfully, or will fail as soon as one of your Futures fails. You'll then be able to use map or flatMap on this new Future to use all the computed values at once.
So here's how I would write the code you proposed:
val l = 1 to 10
val processings: Seq[Future[Unit]] = l.map {n =>
Future(println(s"processing $n")).map {_ =>
println(s"finished processing $n")
}
}
val processingOver: Future[Unit] =
Future.sequence(processings).map { (lu: Seq[Unit]) =>
println(s"Finished processing ${lu.size} elements")
}
Of course, I would recommend having real functions rather than procedures (returning Unit), so that you have values to do something with. I used println to have code that produces the same output as yours (except for the prints, which have a slightly different meaning, since we are not mutating anything anymore).
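A minimal sketch of that recommendation, with each Future returning a value instead of Unit (SequenceValues, squares and demo are illustrative names, and squaring stands in for whatever the real processing would be):

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SequenceValues {
  // One Future per element, each producing a value rather than a side effect.
  def squares(ns: Seq[Int])(implicit ec: ExecutionContext): Future[Seq[Int]] =
    Future.sequence(ns.map(n => Future(n * n)))

  def demo(): Int = {
    import scala.concurrent.ExecutionContext.Implicits.global
    // Combine all results once every Future has completed.
    val total: Future[Int] = squares(1 to 10).map(_.sum)
    Await.result(total, 5.seconds)
  }
}
```

Because nothing shared is mutated, the order in which the individual Futures finish does not matter.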

Why does the andThen of Future not chain the result?

The meaning of andThen that I learned from this answer is function composition.
Say that
f andThen g andThen h
is equal to
h(g(f(x)))
This implies that the function h will receive its input from g(f(x)).
But with andThen on Future, the closure passed to each subsequent andThen always receives the result of the original Future.
Future{
1
}.andThen{ case Success(x) =>
println(x) // print 1
Thread.sleep(2000)
x * 2
}.andThen{ case Success(x) =>
println(x) // print 1
Thread.sleep(2000)
x * 2
}
Compare this to:
val func: Function1[Int, Int] = { x: Int =>
x
}.andThen { y =>
println(y) // print 1
y * 2
}.andThen { z =>
println(z) // print 2
z * 2
}
func(1)
What is the reason for making every Future::andThen receive the same result from the original Future instead of chaining the Futures? I've observed that these chained andThen calls are executed sequentially, so the reason is probably not parallelism.
scala.concurrent.Future is designed as a compromise between two asynchronous approaches:
Object-oriented observer which allows binding of asynchronous handlers
Functional monad which offers rich functional composition capabilities.
Reading Future.andThen's docs:
Applies the side-effecting function to the result of this future, and
returns a new future with the result of this future.
So andThen most likely comes from the OOP universe. To get a result similar to Function1.andThen, you could use the map method:
Future(1).map {_ * 2}.map {_ * 2}
andThen differs from onComplete in one small way: the Future returned by andThen still carries the same result, but it waits until the supplied observer returns or throws. That's why the docs say:
This method allows one to enforce that the callbacks are executed in a
specified order.
Also note the third line from the docs:
Note that if one of the chained andThen callbacks throws an exception,
that exception is not propagated to the subsequent andThen callbacks.
Instead, the subsequent andThen callbacks are given the original value
of this future.
So it does nothing at all to the new Future's result; it cannot even spoil it with its own exception. andThen and onComplete are simply the sequential and parallel ways of binding observers, respectively.
Let me sum up this nice discussion.
Say we have tf: Future[T] =..., and two functions, f: T => U and g: U => V.
We can do vf: Future[V] = tf map f map g, which is the same as vf: Future[V] = tf map (f andThen g).
In another use case, having fp: PartialFunction[T, U] and gp: PartialFunction[U, V],
we can run tf1: Future[T] = tf andThen fp andThen gp - these partial functions will be called on the value that tf produces, with no outside effect - only side effects happen. This sequence waits for fp before calling gp.
Yet another future operation, onComplete, works like this: having f: Try[T] => U, the call tf onComplete f will call f even if the future ended with an error; the result of tf onComplete f is of type Unit.
Also, if your function f produces a Future, you will need to use flatMap.
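The contrast summarized above can be checked with a short sketch (AndThenVsMap, record and demo are illustrative names): the andThen callbacks both observe the original value and their return values are discarded, while map actually transforms the result.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Success

object AndThenVsMap {
  // Returns (result after two andThen calls, result after two maps, values seen by the callbacks).
  def demo(): (Int, Int, List[Int]) = {
    val seen = ListBuffer.empty[Int]
    def record(x: Int): Unit = seen.synchronized { seen += x; () }

    // Each callback's x * 2 is thrown away; the future keeps the original value.
    val viaAndThen: Future[Int] = Future(1)
      .andThen { case Success(x) => record(x); x * 2 }
      .andThen { case Success(x) => record(x); x * 2 }

    // map feeds each step's output into the next step.
    val viaMap: Future[Int] = Future(1).map(_ * 2).map(_ * 2)

    val a = Await.result(viaAndThen, 5.seconds)
    val b = Await.result(viaMap, 5.seconds)
    (a, b, seen.synchronized(seen.toList))
  }
}
```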