Does the map of Scala Future add any overhead?

Does the map of Scala Future add any overhead? - scala

Suppose I've got a function like this:
import scala.concurrent._
def plus2Future(fut: Future[Int])
(implicit ec: ExecutionContext): Future[Int] = fut.map(_ + 2)
Now I am using plus2Future to create a new Future:
import import scala.concurrent.ExecutionContext.Implicits.global
val fut1 = Future { 0 }
val fut2 = plus2Future(fut1)
Does function plus2 always run now on the same thread as fut1 ? I guess it does not.
Does using map in plus2Future add an overhead of thread context switching, creating a new Runnable etc. ?

Does using map in plus2Future add an overhead of thread context
switching, creating a new Runnable etc. ?
map, on the default implementation of Future (via DefaultPromise) is:
def map[S](f: T => S)(implicit executor: ExecutionContext): Future[S] = { // transform(f, identity)
val p = Promise[S]()
onComplete { v => p complete (v map f) }
p.future
}
Where onComplete creates a new CallableRunnable and eventually invokes dispatchOrAddCallback:
/** Tries to add the callback, if already completed, it dispatches the callback to be executed.
* Used by `onComplete()` to add callbacks to a promise and by `link()` to transfer callbacks
* to the root promise when linking two promises togehter.
*/
#tailrec
private def dispatchOrAddCallback(runnable: CallbackRunnable[T]): Unit = {
getState match {
case r: Try[_] => runnable.executeWithValue(r.asInstanceOf[Try[T]])
case _: DefaultPromise[_] => compressedRoot().dispatchOrAddCallback(runnable)
case listeners: List[_] => if (updateState(listeners, runnable :: listeners)) () else dispatchOrAddCallback(runnable)
}
}
Which dispatches the call to the underlying execution context. This means that it is up to the implementing ExecutionContext to decide how and where the future is ran, so one cannot give a deterministic answer of "It runs here or there". We can see that there is both an allocation of a Promise and a callback object.
In general, I wouldn't worry about this, unless I've benchmarked the code and found this to be a bottleneck.

No, it cannot (deterministically) use the same thread (unless the EC only has one thread) since an unbounded amount of time can elapse between the two lines of code.

Related

How to Promise.allSettled with Scala futures?

I have two scala futures. I want to perform an action once both are completed, regardless of whether they were completed successfully. (Additionally, I want the ability to inspect those results at that time.)
In Javascript, this is Promise.allSettled.
Does Scala offer a simple way to do this?
One last wrinkle, if it matters: I want to do this in a JRuby application.

You can use the transform method to create a Future that will always succeed and return the result or the error as a Try object.
def toTry[A](future: Future[A])(implicit ec: ExecutionContext): Future[Try[A]] =
future.transform(x => Success(x))
To combine two Futures into one, you can use zip:
def settle2[A, B](fa: Future[A], fb: Future[B])(implicit ec: ExecutionContext)
: Future[(Try[A], Try[B])] =
toTry(fa).zip(toTry(fb))
If you want to combine an arbitrary number of Futures this way, you can use Future.traverse:
def allSettled[A](futures: List[Future[A]])(implicit ec: ExecutionContext)
: Future[List[Try[A]]] =
Future.traverse(futures)(toTry(_))

Normally in this case we use Future.sequence to transform a collection of a Future into one single Future so you can map on it, but Scala short circuit the failed Future and doesn't wait for anything after that (Scala considers one failure to be a failure for all), which doesn't fit your case.
In this case you need to map failed ones to successful, then do the sequence, e.g.
val settledFuture = Future.sequence(List(future1, future2, ...).map(_.recoverWith { case _ => Future.unit }))
settledFuture.map(//Here it is all settled)
EDIT
Since the results need to be kept, instead of mapping to Future.unit, we map the actual result into another layer of Try:
val settledFuture = Future.sequence(
List(Future(1), Future(throw new Exception))
.map(_.map(Success(_)).recover(Failure(_)))
)
settledFuture.map(println(_))
//Output: List(Success(1), Failure(java.lang.Exception))
EDIT2
It can be further simplified with transform:
Future.sequence(listOfFutures.map(_.transform(Success(_))))

Perhaps you could use a concurrent counter to keep track of the number of completed Futures and then complete the Promise once all Futures have completed
def allSettled[T](futures: List[Future[T]]): Future[List[Future[T]]] = {
val p = Promise[List[Future[T]]]()
val length = futures.length
val completedCount = new AtomicInteger(0)
futures foreach {
_.onComplete { _ =>
if (completedCount.incrementAndGet == length) p.trySuccess(futures)
}
}
p.future
}
val futures = List(
Future(-11),
Future(throw new Exception("boom")),
Future(42)
)
allSettled(futures).andThen(println(_))
// Success(List(Future(Success(-11)), Future(Failure(java.lang.Exception: boom)), Future(Success(42))))
scastie

Multiple futures that may fail - returning both successes and failures?

I have a situation where I need to run a bunch of operations in parallel.
All operations have the same return value (say a Seq[String]).
Its possible that some of the operations may fail, and others successfully return results.
I want to return both the successful results, and any exceptions that happened, so I can log them for debugging.
Is there a built-in way, or easy way through any library (cats/scalaz) to do this, before I go and write my own class for doing this?
I was thinking of doing each operation in its own future, then checking each future, and returning a tuple of Seq[String] -> Seq[Throwable] where left value is the successful results (flattened / combined) and right is a list of any exceptions that occurred.
Is there a better way?

Using Await.ready, which you mention in a comment, generally loses most benefits from using futures. Instead you can do this just using the normal Future combinators. And let's do the more generic version, which works for any return type; flattening the Seq[String]s can be added easily.
def successesAndFailures[T](futures: Seq[Future[T]]): Future[(Seq[T], Seq[Throwable])] = {
// first, promote all futures to Either without failures
val eitherFutures: Seq[Future[Either[Throwable, T]]] =
futures.map(_.transform(x => Success(x.toEither)))
// then sequence to flip Future and Seq
val futureEithers: Future[Seq[Either[Throwable, T]]] =
Future.sequence(eitherFutures)
// finally, Seq of Eithers can be separated into Seqs of Lefts and Rights
futureEithers.map { seqOfEithers =>
val (lefts, rights) = seqOfEithers.partition(_.isLeft)
val failures = lefts.map(_.left.get)
val successes = rights.map(_.right.get)
(successes, failures)
}
}
Scalaz and Cats have separate to simplify the last step.
The types can be inferred by the compiler, they are shown just to help you see the logic.

Calling value on your Future returns an Option[Try[T]]. If the Future has not completed then the Option is None. If it has completed then it's easy to unwrap and process.
if (myFutr.isCompleted)
myFutr.value.map(_.fold( err: Throwable => //log the error
, ss: Seq[String] => //process results
))
else
// do something else, come back later

Sounds like a good use-case for the Try idiom (it's basically similar to the Either monad).
Example of usage from the doc:
import scala.util.{Success, Failure}
val f: Future[List[String]] = Future {
session.getRecentPosts
}
f onComplete {
case Success(posts) => for (post <- posts) println(post)
case Failure(t) => println("An error has occurred: " + t.getMessage)
}
It actually does a little bit more than what you asked because it is fully asynchronous. Does it fit your use-case?

I'd do it this way:
import scala.concurrent.{Future, ExecutionContext}
import scala.util.Success
def eitherify[A](f: Future[A])(implicit ec: ExecutionContext): Future[Either[Throwable, A]] = f.transform(tryResult => Success(tryResult.toEither))
def eitherifyF[A, B](f: A => Future[B])(implicit ec: ExecutionContext): A => Future[Either[Throwable, B]] = { a => eitherify(f(a)) }
// here we need some "cats" magic for `traverse` and `separate`
// instead of `traverse` you can use standard `Future.sequence`
// there is no analogue for `separate` in the standard library
import cats.implicits._
def myProgram[A, B](values: List[A], asyncF: A => Future[B])(implicit ec: ExecutionContext): Future[(List[Throwable], List[B])] = {
val appliedTransformations: Future[List[Either[Throwable, B]]] = values.traverse(eitherifyF(asyncF))
appliedTransformations.map(_.separate)
}

Understanding Scala Future andThen

I'm trying to use andThen method to register action stages for the future. Like this:
implicit val executionContext = ExecutionContext.global
val f = Future(0)
f.andThen {
case r => println(r.get + "1")
} andThen {
case r => println(r.get + "2")
}
Since this method executes asynchronously and does not produce a
return value, any non-fatal exceptions thrown will be reported to the
ExecutionContext .
What does it mean asynchronously? It means asynchronously to the future execution itself?

What does it mean asynchronously? It means asynchronously to the
future execution itself?
It means that the execution of andThen will happen on a given thread provided by the ExecutionContext after the originating Future[T] completes.
If we look at the implementation:
def andThen[U](pf: PartialFunction[Try[T], U])(implicit executor: ExecutionContext): Future[T] = {
val p = Promise[T]()
onComplete {
case r => try pf.applyOrElse[Try[T], Any](r, Predef.conforms[Try[T]]) finally p complete r
}
p.future
}
This method is a smart wrapper over Future.onComplete which is a side effecting method that returns Unit. It applies the given function, and returns the original Future[T] result. As the documentation states, it is useful when you want to guarantee the order of several onComplete callbacks registered on the future.

Get actual value in Future Scala [duplicate]

I am a newbie to scala futures and I have a doubt regarding the return value of scala futures.
So, generally syntax for a scala future is
def downloadPage(url: URL) = Future[List[Int]] {
}
I want to know how to access the List[Int] from some other method which calls this method.
In other words,
val result = downloadPage("localhost")
then what should be the approach to get List[Int] out of the future ?
I have tried using map method but not able to do this successfully.`

The case of Success(listInt) => I want to return the listInt and I am not able to figure out how to do that.
The best practice is that you don't return the value. Instead you just pass the future (or a version transformed with map, flatMap, etc.) to everyone who needs this value and they can add their own onComplete.
If you really need to return it (e.g. when implementing a legacy method), then the only thing you can do is to block (e.g. with Await.result) and you need to decide how long to await.

You need to wait for the future to complete to get the result given some timespan, here's something that would work:
import scala.concurrent.duration._
def downloadPage(url: URL) = Future[List[Int]] {
List(1,2,3)
}
val result = downloadPage("localhost")
val myListInt = result.result(10 seconds)
Ideally, if you're using a Future, you don't want to block the executing thread, so you would move your logic that deals with the result of your Future into the onComplete method, something like this:
result.onComplete({
case Success(listInt) => {
//Do something with my list
}
case Failure(exception) => {
//Do something with my error
}
})

I hope you already solved this since it was asked in 2013 but maybe my answer can help someone else:
If you are using Play Framework, it support async Actions (actually all Actions are async inside). An easy way to create an async Action is using Action.async(). You need to provide a Future[Result]to this function.
Now you can just make transformations from your Future[List[Int]] to Future[Result] using Scala's map, flatMap, for-comprehension or async/await. Here an example from Play Framework documentation.
import play.api.libs.concurrent.Execution.Implicits.defaultContext
def index = Action.async {
val futureInt = scala.concurrent.Future { intensiveComputation() }
futureInt.map(i => Ok("Got result: " + i))
}

You can do something like that. If The wait time that is given in Await.result method is less than it takes the awaitable to execute, you will have a TimeoutException, and you need to handle the error (or any other error).
import scala.concurrent._
import ExecutionContext.Implicits.global
import scala.util.{Try, Success, Failure}
import scala.concurrent.duration._
object MyObject {
def main(args: Array[String]) {
val myVal: Future[String] = Future { silly() }
// values less than 5 seconds will go to
// Failure case, because silly() will not be done yet
Try(Await.result(myVal, 10 seconds)) match {
case Success(extractedVal) => { println("Success Happened: " + extractedVal) }
case Failure(_) => { println("Failure Happened") }
case _ => { println("Very Strange") }
}
}
def silly(): String = {
Thread.sleep(5000)
"Hello from silly"
}
}

The best way I’ve found to think of a Future is a box that will, at some point, contain the thing that you want. The key thing with a Future is that you never open the box. Trying to force open the box will lead you to blocking and grief. Instead, you put the Future in another, larger box, typically using the map method.
Here’s an example of a Future that contains a String. When the Future completes, then Console.println is called:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
object Main {
def main(args:Array[String]) : Unit = {
val stringFuture: Future[String] = Future.successful("hello world!")
stringFuture.map {
someString =>
// if you use .foreach you avoid creating an extra Future, but we are proving
// the concept here...
Console.println(someString)
}
}
}
Note that in this case, we’re calling the main method and then… finishing. The string’s Future, provided by the global ExecutionContext, does the work of calling Console.println. This is great, because when we give up control over when someString is going to be there and when Console.println is going to be called, we let the system manage itself. In constrast, look what happens when we try to force the box open:
val stringFuture: Future[String] = Future.successful("hello world!")
val someString = Future.await(stringFuture)
In this case, we have to wait — keep a thread twiddling its thumbs — until we get someString back. We’ve opened the box, but we’ve had to commandeer the system’s resources to get at it.

It wasn't yet mentioned, so I want to emphasize the point of using Future with for-comprehension and the difference of sequential and parallel execution.
For example, for sequential execution:
object FuturesSequential extends App {
def job(n: Int) = Future {
Thread.sleep(1000)
println(s"Job $n")
}
val f = for {
f1 <- job(1)
f2 <- job(2)
f3 <- job(3)
f4 <- job(4)
f5 <- job(5)
} yield List(f1, f2, f3, f4, f5)
f.map(res => println(s"Done. ${res.size} jobs run"))
Thread.sleep(6000) // We need to prevent main thread from quitting too early
}
And for parallel execution (note that the Future are before the for-comprehension):
object FuturesParallel extends App {
def job(n: Int) = Future {
Thread.sleep(1000)
println(s"Job $n")
}
val j1 = job(1)
val j2 = job(2)
val j3 = job(3)
val j4 = job(4)
val j5 = job(5)
val f = for {
f1 <- j1
f2 <- j2
f3 <- j3
f4 <- j4
f5 <- j5
} yield List(f1, f2, f3, f4, f5)
f.map(res => println(s"Done. ${res.size} jobs run"))
Thread.sleep(6000) // We need to prevent main thread from quitting too early
}

Compose two Scala futures with callbacks, WITHOUT a third ExecutionContext

I have two methods, let's call them load() and init(). Each one starts a computation in its own thread and returns a Future on its own execution context. The two computations are independent.
val loadContext = ExecutionContext.fromExecutor(...)
def load(): Future[Unit] = {
Future
}
val initContext = ExecutionContext.fromExecutor(...)
def init(): Future[Unit] = {
Future { ... }(initContext)
}
I want to call both of these from some third thread -- say it's from main() -- and perform some other computation when both are finished.
def onBothComplete(): Unit = ...
Now:
I don't care which completes first
I don't care what thread the other computation is performed on, except:
I don't want to block either thread waiting for the other;
I don't want to block the third (calling) thread; and
I don't want to have to start a fourth thread just to set the flag.
If I use for-comprehensions, I get something like:
val loading = load()
val initialization = initialize()
for {
loaded <- loading
initialized <- initialization
} yield { onBothComplete() }
and I get Cannot find an implicit ExecutionContext.
I take this to mean Scala wants a fourth thread to wait for the completion of both futures and set the flag, either an explicit new ExecutionContext or ExecutionContext.Implicits.global. So it would appear that for-comprehensions are out.
I thought I might be able to nest callbacks:
initialization.onComplete {
case Success(_) =>
loading.onComplete {
case Success(_) => onBothComplete()
case Failure(t) => log.error("Unable to load", t)
}
case Failure(t) => log.error("Unable to initialize", t)
}
Unfortunately onComplete also takes an implicit ExecutionContext, and I get the same error. (Also this is ugly, and loses the error message from loading if initialization fails.)
Is there any way to compose Scala Futures without blocking and without introducing another ExecutionContext? If not, I might have to just throw them over for Java 8 CompletableFutures or Javaslang Vavr Futures, both of which have the ability to run callbacks on the thread that did the original work.
Updated to clarify that blocking either thread waiting for the other is also not acceptable.
Updated again to be less specific about the post-completion computation.

Why not just reuse one of your own execution contexts? Not sure what your requirements for those are but if you use a single thread executor you could just reuse that one as the execution context for your comprehension and you won't get any new threads created:
implicit val loadContext = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor)
If you really can't reuse them you may consider this as the implicit execution context:
implicit val currentThreadExecutionContext = ExecutionContext.fromExecutor(
(runnable: Runnable) => {
runnable.run()
})
Which will run futures on the current thread. However, the Scala docs explicitly recommends against this as it introduces nondeterminism in which thread runs the Future (but as you stated, you don't care which thread it runs on so this may not matter).
See Synchronous Execution Context for why this isn't advisable.
An example with that context:
val loadContext = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor)
def load(): Future[Unit] = {
Future(println("loading thread " + Thread.currentThread().getName))(loadContext)
}
val initContext = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor)
def init(): Future[Unit] = {
Future(println("init thread " + Thread.currentThread().getName))(initContext)
}
val doneFlag = new AtomicBoolean(false)
val loading = load()
val initialization = init()
implicit val currentThreadExecutionContext = ExecutionContext.fromExecutor(
(runnable: Runnable) => {
runnable.run()
})
for {
loaded <- loading
initialized <- initialization
} yield {
println("yield thread " + Thread.currentThread().getName)
doneFlag.set(true)
}
prints:
loading thread pool-1-thread-1
init thread pool-2-thread-1
yield thread main
Though the yield line may print either pool-1-thread-1 or pool-2-thread-1 depending on the run.

In Scala, a Future represents a piece of work to be executed async (i.e. concurrently to other units of work). An ExecutionContext represent a pool of threads for executing Futures. In other words, ExecutionContext is the team of worker who performs the actual work.
For efficiency and scalability, it's better to have big team(s) (e.g. single ExecutionContext with 10 threads to execute 10 Future's) rather than small teams (e.g. 5 ExecutionContext with 2 threads each to execute 10 Future's).
In your case if you want to limit the number of threads to 2, you can:
def load()(implicit teamOfWorkers: ExecutionContext): Future[Unit] = {
Future { ... } /* will use the teamOfWorkers implicitly */
}
def init()(implicit teamOfWorkers: ExecutionContext): Future[Unit] = {
Future { ... } /* will use the teamOfWorkers implicitly */
}
implicit val bigTeamOfWorkers = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(2))
/* All async works in the following will use
the same bigTeamOfWorkers implicitly and works will be shared by
the 2 workers (i.e. thread) in the team */
for {
loaded <- loading
initialized <- initialization
} yield doneFlag.set(true)
The Cannot find an implicit ExecutionContext error does not mean that Scala wants additional threads. It only means that Scala wants a ExecutionContext to do the work. And additional ExecutionContext does not necessarily implies additional 'thread', e.g. the following ExecutionContext, instead of creating new threads, will execute works in the current thread:
val currThreadExecutor = ExecutionContext.fromExecutor(new Executor {
override def execute(command: Runnable): Unit = command.run()
})