In Scala, can a program finish/exit if one of its Future vals is unfinished? - scala

A simple example to illustrate the problem:
1 - Here, does the program exit only after the future has completed?
def main(args: Array[String]): Unit = {
val future: Future[Unit] = myFunction()
}
2 - If not, should I add an Await to guarantee that the future terminates?
def main(args: Array[String]): Unit = {
val future: Future[Unit] = myFunction()
Await.result(future, Duration.Inf)
}

Reading this about Futures/Promises in Scala, the point is that the Future itself is not what provides concurrency.
What keeps the JVM from exiting is running non-daemon threads. So unless something in your code creates an additional non-daemon thread, the JVM exits as soon as main() returns.
Futures are a means of interacting with a value that becomes available at some later point in time. Look into your code base to determine what kind of threading is in place, for example whether some underlying thread pool executor is configured and what kind of threads it uses.
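As a small illustration of the point about threads, the sketch below checks whether the thread that runs a Future on scala.concurrent.ExecutionContext.Implicits.global is a daemon thread. The names DaemonCheck and runsOnDaemonThread are made up for this example, and the Await is there only so the result can be read back on the calling thread.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object DaemonCheck {
  // Runs a tiny Future on the global context and reports whether
  // the thread that executed it is a daemon thread.
  def runsOnDaemonThread(): Boolean =
    Await.result(Future(Thread.currentThread().isDaemon), 5.seconds)

  def main(args: Array[String]): Unit =
    println(s"global context uses daemon threads: ${runsOnDaemonThread()}")
}
```

If the check reports true, nothing in that pool keeps the JVM alive once main() returns, so a pending Future does not delay the exit.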

A Future is a value produced by a task that runs independently, usually on a different thread from the one that created it (say, main).
To answer your question: yes, the main thread will exit even if a future is still executing.
import scala.concurrent._
import ExecutionContext.Implicits.global
object TestFutures extends App{
def doSomeOtherTask = {
Thread.sleep(1000) //do some task of 1 sec
println("Completed some task by "+Thread.currentThread().getName)
}
def returnFuture : Future[Int]= Future{
println("Future task started "+Thread.currentThread().getName)
Thread.sleep(5000) //do some task which is 5 sec
println("Future task completed "+Thread.currentThread().getName)
5
}
val x = returnFuture //this takes 5 secs
doSomeOtherTask // ~ 1 sec job
println(x.isCompleted)
doSomeOtherTask // ~ 2 sec completed
println(x.isCompleted)
doSomeOtherTask // ~ 3 sec completed
println(x.isCompleted)
println("Future task is still pending and main thread have no more lines to execute")
}
Output:
Future task started scala-execution-context-global-11
Completed some task by main
false
Completed some task by main
false
Completed some task by main
false
Future task is still pending and main thread have no more lines to execute

Related

Twitter Futures - how are unused futures handled?

So I have an API in Scala that uses twitter.util.Future.
In my case, I want to create 2 futures, one of which is dependent on the result of the other and return the first future:
def apiFunc(): Future[Response]={
val future1 = getFuture1()
val future2 = future1 map {
resp => processFuture1Resp(resp)
}
future1
}
So in this case, future2 is never consumed and the api returns the result of future1 before future2 is completed.
My question is - will future2 run even though the api has returned?
Further, future1 should not be affected by future2. That is, the processing time for future2 should not be seen by any client that calls apiFunc(). You can think of processFuture1Resp as sending the results of getFuture1() to another service; the client calling apiFunc() doesn't care about that part and only wants future1 as quickly as possible.
My understanding is that futures spawn threads, but I am unsure whether that thread will be terminated after the main thread returns.
I guess a different way to ask this is: will a twitter.util.Future always be executed? Is there a way to fire and forget a twitter.util.Future?
If you want to chain two futures where one of them processes the result of the other (and return a future), you can use a for comprehension:
def getFuture1() = {
println("Getting future 1...")
Thread.sleep(1000)
Future.successful("42")
}
def processFuture1Resp(resp: String) = {
println(s"Result of future 1 is: ${resp}")
"END"
}
def apiFunc(): Future[String]={
for {
res1 <- getFuture1()
res2 <- Future(processFuture1Resp(res1))
} yield res2
}
def main(args: Array[String]): Unit = {
val finalResultOfFutures: Future[String] = apiFunc()
finalResultOfFutures
Thread.sleep(500)
}
This will print:
Getting future 1...
Result of future 1 is: 42
The value finalResultOfFutures will contain the result of chaining both futures, and you can be sure that the first future is executed before the second one. If you don't wait for finalResultOfFutures on the main thread (for example, by commenting out the final Thread.sleep in main), you will only see:
Getting future 1...
The main thread will finish before the second future has time to print anything.
Another (better) way to wait for the execution of the future would be something like this:
val maxWaitTime: FiniteDuration = Duration(5, TimeUnit.SECONDS)
Await.result(finalResultOfFutures, maxWaitTime)
Await.result blocks the main thread and waits up to the given duration for the result of the Future. If the Future is not ready by then, or completes with a failure, Await.result throws an exception.
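The three outcomes of Await.result described above can be seen side by side in a minimal sketch. The helper awaitAsTry and the object AwaitOutcomes are names invented for this illustration; wrapping the call in Try is just one way to turn the thrown exception into a value.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.util.Try

object AwaitOutcomes {
  // Wrap Await.result in Try so that a timeout or a failed Future
  // shows up as a Failure value instead of a thrown exception.
  def awaitAsTry[T](future: Future[T], atMost: FiniteDuration): Try[T] =
    Try(Await.result(future, atMost))

  def main(args: Array[String]): Unit = {
    val completed = Future.successful("42")
    val neverDone = Promise[String]().future // never completed, so waiting on it times out
    val failed    = Future.failed[String](new IllegalStateException("boom"))

    println(awaitAsTry(completed, 1.second))   // a Success holding the value
    println(awaitAsTry(neverDone, 200.millis)) // a Failure holding a TimeoutException
    println(awaitAsTry(failed, 1.second))      // a Failure holding the original exception
  }
}
```

Note that a timeout only stops the waiting; the Future itself is not cancelled.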
EDITED
Another option is to use Monix Tasks. This library lets you wrap actions (such as Futures) and gives you greater control over how and when they are executed. Since a Future may start executing right after its declaration, this can be quite handy. Example:
import monix.eval.Task
import monix.execution.Scheduler.Implicits.global
import scala.concurrent.Future
import scala.concurrent.duration._
def getFuture1() = {
println("Getting future 1...")
Thread.sleep(3000)
println("Future 1 finished")
Future.successful("42")
}
def processFuture1Resp(resp: Task[String]) = {
val resp1 = resp.runSyncUnsafe(2.seconds)
println(s"Future2: Result of future 1 is: ${resp1}")
}
def main(args: Array[String]): Unit = {
// Get a Task from future 1. A Task does not start its execution until "run" is called
val future1: Task[String] = Task.fromFuture(getFuture1())
// Here we can create the Future 2 that depends on Future 1 and execute it async. It will finish even though the main method ends.
val future2 = Task(processFuture1Resp(future1)).runAsyncAndForget
// Now the future 1 starts to be calculated. You can runAsyncAndForget here or just return the task so that the client
// can execute it later on
future1.runAsyncAndForget
}
This will print:
Getting future 1...
Future 1 finished
Future2: Result of future 1 is: 42
Process finished with exit code 0

Await for a Sequence of Futures with timeout without failing on TimeoutException

I have a sequence of Scala Futures of the same type.
After some limited time, I want to get a result for the entire sequence: some futures may have succeeded, some may have failed, and some may not have completed yet. The futures that have not completed should be considered failed.
I don't want to Await each future sequentially.
I did look at this question: Scala waiting for sequence of futures
and tried to use the solution from there, namely:
private def lift[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
futures.map(_.map { Success(_) }.recover { case t => Failure(t) })
def waitAll[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
Future.sequence(lift(futures))
val futures: Seq[Future[MyObject]] = ...
val segments = Await.result(waitAll(futures), waitTimeoutMillis millis)
but I'm still getting a TimeoutException, I guess because some of the futures haven't completed yet.
That answer also states:
Now Future.sequence(lifted) will be completed when every future is completed, and will represent successes and failures using Try.
But I want my Future to be completed after the timeout has passed, not when every future in the sequence has completed. What else can I do?
If I used raw Futures (rather than some IO monad that has this functionality built in, or some Akka utils made for exactly this), I would hack together a utility like:
// make each separate future timeout
object FutureTimeout {
// separate EC for waiting
private val timeoutEC: ExecutionContext = ...
private def timeout[T](delay: Long): Future[T] = Future {
blocking {
Thread.sleep(delay)
}
throw new Exception("Timeout")
}(timeoutEC)
def apply[T](fut: Future[T], delay: Long)(
implicit ec: ExecutionContext
): Future[T] = Future.firstCompletedOf(Seq(
fut,
timeout(delay)
))
}
and then
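Future.sequence(
futures
.map(FutureTimeout(_, delay))
// lift each Future's outcome into a Try, so one failure or timeout does not fail the whole sequence
.map(_.map(Success(_)).recover { case e => Failure(e) })
)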
Since each wrapped future completes no later than delay after it is created, we can collect them into one result right after that.
Remember, though, that no matter how you trigger a timeout, you have no guarantee that the timed-out Future stops executing. It could run on and on on some thread somewhere; you just wouldn't wait for its result. firstCompletedOf just makes this race more explicit.
Some other utilities (e.g. Cats Effect IO) allow you to cancel computations (which is used in races like this one), but the JVM cannot arbitrarily "kill" a running thread, so cancellation happens after one stage of the computation has completed and before the next one starts (e.g. between .maps or .flatMaps).
If you aren't afraid of adding external deps, there are other (and more reliable, as Thread.sleep is just a temporary ugly hack) ways of timing out a Future, like Akka utils. See also other questions like this.
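For comparison, here is a minimal sketch of a timeout that does not park a thread in Thread.sleep. It swaps in a java.util.concurrent.ScheduledExecutorService plus a Promise, using only the standard library. The names ScheduledTimeout, withTimeout and settleAll are hypothetical, and the single daemon scheduler thread is an assumption of this sketch rather than something from the answer above.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object ScheduledTimeout {
  // One daemon thread that only fires timeouts (assumption of this sketch).
  private val scheduler: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
      override def newThread(r: Runnable): Thread = {
        val t = new Thread(r, "future-timeout")
        t.setDaemon(true) // do not keep the JVM alive just for timeouts
        t
      }
    })

  // Completes with fut's outcome, or fails with TimeoutException after `after`,
  // whichever happens first. The original Future is NOT cancelled.
  def withTimeout[T](fut: Future[T], after: FiniteDuration)(implicit ec: ExecutionContext): Future[T] = {
    val p = Promise[T]()
    val task = scheduler.schedule(new Runnable {
      override def run(): Unit = {
        p.tryFailure(new TimeoutException(s"Timed out after $after"))
        ()
      }
    }, after.toMillis, TimeUnit.MILLISECONDS)
    fut.onComplete { result =>
      task.cancel(false) // the timeout is no longer needed
      p.tryComplete(result)
    }
    p.future
  }

  // Every element settles as Success or Failure no later than roughly `after`.
  def settleAll[T](futures: Seq[Future[T]], after: FiniteDuration)(implicit ec: ExecutionContext): Future[Seq[Try[T]]] =
    Future.sequence(
      futures.map(f => withTimeout(f, after).map[Try[T]](Success(_)).recover { case e => Failure(e) })
    )
}
```

Awaiting settleAll(futures, 500.millis) with a slightly longer Await timeout should then return a Seq[Try[T]] in which unfinished futures appear as Failure(TimeoutException).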
Here is a solution using Monix:
import monix.eval.Task
import monix.execution.Scheduler
import scala.concurrent.duration._
import scala.util.{Failure, Random, Success}
val timeoutScheduler = Scheduler.singleThread("timeout") //it's safe to use single thread here because timeout tasks are very fast
def sequenceDiscardTimeouts[T](tasks: Task[T]*): Task[Seq[T]] = {
Task
.parSequence(
tasks
.map(t =>
t.map(Success.apply) // Map to success so we can collect the value
.timeout(500.millis)
.executeOn(timeoutScheduler) //This is needed to run timeouts on a dedicated scheduler that won't be blocked by "blocking"/io work if you have any
.onErrorRecoverWith { ex =>
println("timed-out")
Task.pure(Failure(ex)) //It's assumed that any error is a timeout. It's possible to "catch" just timeout exception here
}
)
)
.map { res =>
res.collect { case Success(r) => r }
}
}
Testing code
implicit val mainScheduler = Scheduler.fixedPool(name = "main", poolSize = 10)
def slowTask(msg: String) = {
Task.sleep(Random.nextLong(1000).millis) //Sleep here to emulate a slow task
.map { _ =>
msg
}
}
val app = sequenceDiscardTimeouts(
slowTask("1"),
slowTask("2"),
slowTask("3"),
slowTask("4"),
slowTask("5"),
slowTask("6")
)
val started: Long = System.currentTimeMillis()
app.runSyncUnsafe().foreach(println)
println(s"Done in ${System.currentTimeMillis() - started} millis")
The output differs from run to run, but it should look like the following:
timed-out
timed-out
timed-out
3
4
5
Done in 564 millis
Please note the usage of two separate schedulers. This is to ensure that timeouts will fire even if the main scheduler is busy with business logic. You can test it by reducing poolSize for main scheduler.

Blocking Operation in Actor NOT Occupying All Default Dispatchers

I have been learning Akka actors recently. I read the documentation on dispatchers and am curious about blocking operations in an actor. The last topic in that document describes how to solve the problem, and I am trying to reproduce the example experiment from it.
Here is my code:
package dispatcher
import akka.actor.{ActorSystem, Props}
import com.typesafe.config.ConfigFactory
object Main extends App{
var config = ConfigFactory.parseString(
"""
|my-dispatcher{
|type = Dispatcher
|
|executor = "fork-join-executor"
|
|fork-join-executor{
|fixed-pool-size = 32
|}
|throughput = 1
|}
""".stripMargin)
// val system = ActorSystem("block", ConfigFactory.load("/Users/jiexray/IdeaProjects/ActorDemo/application.conf"))
val system = ActorSystem("block")
val actor1 = system.actorOf(Props(new BlockingFutureActor()))
val actor2 = system.actorOf(Props(new PrintActor()))
for(i <- 1 to 1000){
actor1 ! i
actor2 ! i
}
}
package dispatcher
import akka.actor.Actor
import scala.concurrent.{ExecutionContext, Future}
class BlockingFutureActor extends Actor{
override def receive: Receive = {
case i: Int =>
Thread.sleep(5000)
implicit val excutionContext: ExecutionContext = context.dispatcher
Future {
Thread.sleep(5000)
println(s"Blocking future finished ${i}")
}
}
}
package dispatcher
import akka.actor.Actor
class PrintActor extends Actor{
override def receive: Receive = {
case i: Int =>
println(s"PrintActor: ${i}")
}
}
I simply create an ActorSystem with the default dispatchers and all actors depend on those. The BlockingFutureActor has a blocking operation that is encapsulated in a Future. The PrintActor is merely printing a number instantly.
In the document's explanation, the default dispatchers will be occupied by Futures in the BlockingFutureActor, which leads to the message blocking of the PrintActor. The application gets stuck somewhere like:
> PrintActor: 44
> PrintActor: 45
Unfortunately, my code is not blocked. All outputs from PrintActor show up smoothly, but outputs from BlockingFutureActor come out like squeezing toothpaste. I tried to monitor the threads with IntelliJ's debugger and got:
[screenshot of the thread list]
You can see that only two dispatcher threads are sleeping (BlockingFutureActor makes this happen). The others are waiting, which means they are available for delivering new messages.
I have read an answer about blocking operation in Actor(page). It is quoted that "Dispatchers are, effectively, thread-pools. Separating the two guarantees that the slow, blocking operations don't starve the other. This approach, in general, is referred to as bulk-heading, because the idea is that if a part of the app fails, the rest remains responsive."
Does the default dispatcher reserve some threads for blocking operations, so that the system can still handle messages even when many blocking operations are asking for threads?
Can the experiment in the Akka documentation be reproduced? Is there something wrong with my configuration?
Thanks for your suggestions. Best Wishes.
The reason you see all 1000 print statements from the PrintActor before any print statements from the BlockingFutureActor is because of the first Thread.sleep call in the BlockingFutureActor's receive block. This Thread.sleep is the key difference between your code and the example in the official documentation:
override def receive: Receive = {
case i: Int =>
Thread.sleep(5000) // <----- this call is not in the example in the official docs
implicit val excutionContext: ExecutionContext = context.dispatcher
Future {
...
}
}
Remember that actors process one message at a time. The Thread.sleep(5000) basically simulates a message that takes at least five seconds to process. The BlockingFutureActor won't process another message until it's done processing the current message, even if it has hundreds of messages in its mailbox. While the BlockingFutureActor is processing that first Int message of value 1, the PrintActor has already finished processing all 1000 messages that were sent to it. To make this more clear, let's add a println statement:
override def receive: Receive = {
case i: Int =>
println(s"Entering BlockingFutureActor's receive: $i") // <-----
Thread.sleep(5000)
implicit val excutionContext: ExecutionContext = context.dispatcher
Future {
...
}
}
A sample output when we run the program:
Entering BlockingFutureActor's receive: 1
PrintActor: 1
PrintActor: 2
PrintActor: 3
...
PrintActor: 1000
Entering BlockingFutureActor's receive: 2
Entering BlockingFutureActor's receive: 3
Blocking future finished 1
...
As you can see, by the time the BlockingFutureActor actually begins to process the message 2, the PrintActor has already churned through all 1000 messages.
If you remove that first Thread.sleep, then you'll see messages dequeued from the BlockingFutureActor's mailbox more quickly, because the work is being "delegated" to a Future. Once the Future is created, the actor grabs the next message from its mailbox without waiting for the Future to complete. Below is a sample output without that first Thread.sleep (it won't be exactly the same every time you run it):
Entering BlockingFutureActor's receive: 1
PrintActor: 1
PrintActor: 2
...
PrintActor: 84
PrintActor: 85
Entering BlockingFutureActor's receive: 2
Entering BlockingFutureActor's receive: 3
Entering BlockingFutureActor's receive: 4
Entering BlockingFutureActor's receive: 5
PrintActor: 86
PrintActor: 87
...
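The bulkheading idea quoted in the question can be sketched without Akka, using plain thread pools. This is only an analogy under stated assumptions: it does not reproduce Akka's dispatcher behaviour, and the names Bulkhead and quickTaskLatencyMs, the pool size of 2, and the 500 ms sleeps are all invented for the illustration. It measures how long a trivial task waits when blocking tasks share its pool versus when they run on a pool of their own.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object Bulkhead {
  // Submits 4 blocking tasks (500 ms each) and then one trivial task.
  // Returns how many milliseconds pass before the trivial task runs.
  def quickTaskLatencyMs(separatePools: Boolean): Long = {
    val blockingPool = Executors.newFixedThreadPool(2)
    // With separatePools = false the quick task queues behind the blockers.
    val quickPool = if (separatePools) Executors.newFixedThreadPool(2) else blockingPool
    val blockingEc = ExecutionContext.fromExecutor(blockingPool)
    val quickEc = ExecutionContext.fromExecutor(quickPool)
    try {
      val start = System.nanoTime()
      (1 to 4).foreach(_ => Future(Thread.sleep(500))(blockingEc))
      val quick = Future(System.nanoTime())(quickEc)
      (Await.result(quick, 30.seconds) - start) / 1000000
    } finally {
      // shutdown() lets already submitted tasks finish, then releases the threads
      blockingPool.shutdown()
      quickPool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    println(s"shared pool:    ${quickTaskLatencyMs(separatePools = false)} ms")
    println(s"separate pools: ${quickTaskLatencyMs(separatePools = true)} ms")
  }
}
```

With a shared pool the quick task has to wait for two rounds of blockers, while with separate pools it should run almost immediately. The same reasoning is why blocking work is usually moved to a dedicated dispatcher.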

Compose two Scala futures with callbacks, WITHOUT a third ExecutionContext

I have two methods, let's call them load() and init(). Each one starts a computation in its own thread and returns a Future on its own execution context. The two computations are independent.
val loadContext = ExecutionContext.fromExecutor(...)
def load(): Future[Unit] = {
Future { ... }(loadContext)
}
val initContext = ExecutionContext.fromExecutor(...)
def init(): Future[Unit] = {
Future { ... }(initContext)
}
I want to call both of these from some third thread -- say it's from main() -- and perform some other computation when both are finished.
def onBothComplete(): Unit = ...
Now:
I don't care which completes first
I don't care what thread the other computation is performed on, except:
I don't want to block either thread waiting for the other;
I don't want to block the third (calling) thread; and
I don't want to have to start a fourth thread just to set the flag.
If I use for-comprehensions, I get something like:
val loading = load()
val initialization = init()
for {
loaded <- loading
initialized <- initialization
} yield { onBothComplete() }
and I get Cannot find an implicit ExecutionContext.
I take this to mean Scala wants a fourth thread to wait for the completion of both futures and set the flag, either an explicit new ExecutionContext or ExecutionContext.Implicits.global. So it would appear that for-comprehensions are out.
I thought I might be able to nest callbacks:
initialization.onComplete {
case Success(_) =>
loading.onComplete {
case Success(_) => onBothComplete()
case Failure(t) => log.error("Unable to load", t)
}
case Failure(t) => log.error("Unable to initialize", t)
}
Unfortunately onComplete also takes an implicit ExecutionContext, and I get the same error. (Also this is ugly, and loses the error message from loading if initialization fails.)
Is there any way to compose Scala Futures without blocking and without introducing another ExecutionContext? If not, I might have to just abandon them for Java 8 CompletableFutures or Vavr (formerly Javaslang) Futures, both of which can run callbacks on the thread that did the original work.
Updated to clarify that blocking either thread waiting for the other is also not acceptable.
Updated again to be less specific about the post-completion computation.
Why not just reuse one of your own execution contexts? Not sure what your requirements for those are but if you use a single thread executor you could just reuse that one as the execution context for your comprehension and you won't get any new threads created:
implicit val loadContext = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor)
If you really can't reuse them you may consider this as the implicit execution context:
implicit val currentThreadExecutionContext = ExecutionContext.fromExecutor(
(runnable: Runnable) => {
runnable.run()
})
This runs each callback on whichever thread submits it: the thread that completes the Future, or the calling thread if the Future is already complete. However, the Scala docs explicitly recommend against this, as it introduces nondeterminism in which thread runs the callback (but as you stated, you don't care which thread it runs on, so this may not matter).
See Synchronous Execution Context for why this isn't advisable.
An example with that context:
val loadContext = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor)
def load(): Future[Unit] = {
Future(println("loading thread " + Thread.currentThread().getName))(loadContext)
}
val initContext = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor)
def init(): Future[Unit] = {
Future(println("init thread " + Thread.currentThread().getName))(initContext)
}
val doneFlag = new AtomicBoolean(false)
val loading = load()
val initialization = init()
implicit val currentThreadExecutionContext = ExecutionContext.fromExecutor(
(runnable: Runnable) => {
runnable.run()
})
for {
loaded <- loading
initialized <- initialization
} yield {
println("yield thread " + Thread.currentThread().getName)
doneFlag.set(true)
}
prints:
loading thread pool-1-thread-1
init thread pool-2-thread-1
yield thread main
Though the yield line may print either pool-1-thread-1 or pool-2-thread-1 depending on the run.
In Scala, a Future represents a piece of work to be executed asynchronously (i.e. concurrently with other units of work). An ExecutionContext represents a pool of threads for executing Futures. In other words, the ExecutionContext is the team of workers who perform the actual work.
For efficiency and scalability, it's better to have big team(s) (e.g. a single ExecutionContext with 10 threads to execute 10 Futures) rather than small teams (e.g. 5 ExecutionContexts with 2 threads each to execute 10 Futures).
In your case if you want to limit the number of threads to 2, you can:
def load()(implicit teamOfWorkers: ExecutionContext): Future[Unit] = {
Future { ... } /* will use the teamOfWorkers implicitly */
}
def init()(implicit teamOfWorkers: ExecutionContext): Future[Unit] = {
Future { ... } /* will use the teamOfWorkers implicitly */
}
implicit val bigTeamOfWorkers = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(2))
/* All async works in the following will use
the same bigTeamOfWorkers implicitly and works will be shared by
the 2 workers (i.e. thread) in the team */
for {
loaded <- loading
initialized <- initialization
} yield doneFlag.set(true)
The Cannot find an implicit ExecutionContext error does not mean that Scala wants additional threads. It only means that Scala wants an ExecutionContext to do the work. An additional ExecutionContext does not necessarily imply an additional thread; e.g. the following ExecutionContext, instead of creating new threads, executes work on the current thread:
val currThreadExecutor = ExecutionContext.fromExecutor(new Executor {
override def execute(command: Runnable): Unit = command.run()
})
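On Scala 2.13 or later (an assumption of this sketch), the standard library also ships ExecutionContext.parasitic, which runs a callback on the thread that completes the Future, so a hand-written same-thread executor is not strictly needed. The sketch below combines two Futures with zipWith under that context. The object name ComposeWithoutExtraPool and the 200 ms sleeps are invented for the example, and the Await is there only to read the result back; the composition itself does not block.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ComposeWithoutExtraPool {
  // Returns the name of the thread that ran the "both complete" callback.
  def run(): String = {
    val loadPool = Executors.newSingleThreadExecutor()
    val initPool = Executors.newSingleThreadExecutor()
    val loadContext = ExecutionContext.fromExecutor(loadPool)
    val initContext = ExecutionContext.fromExecutor(initPool)
    try {
      val loading: Future[Unit] = Future(Thread.sleep(200))(loadContext)
      val initialization: Future[Unit] = Future(Thread.sleep(200))(initContext)
      // zipWith fires once both are done; parasitic runs the function on
      // whichever thread completes last, without a third thread pool.
      val both: Future[String] =
        loading.zipWith(initialization)((_, _) => Thread.currentThread().getName)(ExecutionContext.parasitic)
      Await.result(both, 5.seconds) // only for the demo, to observe the result
    } finally {
      loadPool.shutdown()
      initPool.shutdown()
    }
  }

  def main(args: Array[String]): Unit =
    println(s"onBothComplete ran on: ${run()}")
}
```

The printed name should be one of the two pool threads, which one depends on which Future finishes last. The same caveat applies as for any same-thread executor: a slow callback holds up the pool thread it runs on.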

Asynchronous in Failure

Why is the user's name not printed when I comment out the println("testing")?
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
object Future3 extends App {
val userFuture = Future(
User("Me")
)
val userNameFuture: Future[String] = userFuture map {
user => user.name
}
userNameFuture onSuccess {
case userName => println(s"user's name = $userName")
}
// println("testing")
}
case class User(name: String)
The reason is that the default ExecutionContext, global, executes your Future block on a daemon thread, and the JVM doesn't wait for daemon threads to complete once the main thread finishes. You can use Thread.sleep(1000), Await.result(userNameFuture, 1 second) or another blocking operation in the main thread to wait long enough for the future to complete.
Another way is to run the future on a non-daemon thread:
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}
object Future3 extends App {
implicit val executor = ExecutionContext
.fromExecutorService(Executors.newCachedThreadPool()) //not-daemon threads
val userFuture = Future(
User("Me")
)
val userNameFuture: Future[String] = userFuture map {
user => user.name
}
userNameFuture onSuccess {
case userName => println(s"user's name = $userName")
}
}
case class User(name: String)
Short answer
ExecutionContext.Implicits.global creates daemon threads (see scala.concurrent.impl.ExecutionContextImpl.DefaultThreadFactory in the Scala source code). These are threads the JVM will not wait for on exit (in your case, when the main routine stops). Thus the main routine finishes before userNameFuture, which runs on a daemon thread, has finished, and the JVM does not wait for the future's threads.
To prevent this from happening, either use non-daemon threads, e.g. by creating an implicit ExecutionContext like this:
implicit val ec = (scala.concurrent.ExecutionContext.fromExecutorService(Executors.newCachedThreadPool()))
or use
Await.result( userNameFuture, Duration.Inf )
in the main routine.
Attention: if you use the latter approach with both Await.result and the onSuccess callback, the main routine can still exit first without the user name being printed, because there is no guaranteed order between the Await returning and the callback running.
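One way around that ordering problem, sketched here under the assumption that the println may be moved into the Future chain, is to await the Future that contains the side effect instead of registering a separate callback. The names AwaitTheCallback and printedName are made up for this example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AwaitTheCallback {
  final case class User(name: String)

  // Prints the user's name inside the chain and returns it.
  def printedName(): String = {
    val userNameFuture: Future[String] = Future(User("Me")).map(_.name)
    // The println is now part of `printed`, so awaiting `printed`
    // also waits for the println to have happened.
    val printed: Future[String] = userNameFuture.map { userName =>
      println(s"user's name = $userName")
      userName
    }
    Await.result(printed, 5.seconds)
  }

  def main(args: Array[String]): Unit = {
    printedName()
    ()
  }
}
```

Because main blocks on printed rather than on userNameFuture, the output is produced before main can return, even though the global context uses daemon threads.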
Long answer
Have a look at the code
object F2 {
def main(args: Array[String]): Unit = {
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.Success
val userFuture = Future {
Thread.sleep(1000)
println( "userFuture runs on: " + Thread.currentThread().getName)
Thread.sleep(1000)
User("Me")
}
val userNameFuture: Future[String] = userFuture map {
user => {
Thread.sleep(2000)
println( "map runs on: " + Thread.currentThread().getName )
Thread.sleep(2000)
user.name
}
}
val p = Promise[Boolean]()
userNameFuture onSuccess {
case userName => {
println( "onSuccess runs on : " + Thread.currentThread().getName )
println(s"user's name = $userName")
p.complete(Success(true))
}
}
println( "main runs on: " + Thread.currentThread().getName )
println( "main is waiting (for promise to complete) .... ")
Await.result( p.future, Duration.Inf )
println( "main got promise fulfilled")
println( "main end ")
}
}
whose output is
main runs on: run-main-b
main is waiting (for promise to complete) ....
userFuture runs on: ForkJoinPool-1-worker-5
map runs on: ForkJoinPool-1-worker-5
onSuccess runs on : ForkJoinPool-1-worker-5
user's name = Me
main got promise fulfilled
main end
First, you can see that both userFuture and its map operation run on the ForkJoinPool, on daemon threads.
Second, main runs through first, printing "main is waiting (for promise to complete)", and waits there (only for demonstration purposes) for the promise to be fulfilled. If main didn't wait for the promise (try it yourself by commenting out the Await), the main routine would just print its first two lines and be done. As a result, the JVM would shut down (and you would never see the output of onSuccess).
Trick (for debugging) via SBT
In general, if you are using SBT and invoke the program via run, you can still see the output of daemon threads, because the JVM is not terminated when the program is started from within SBT.
So if you start via SBT run, you are soon back at the SBT prompt (because the main routine has finished), but the output of the threads (onSuccess) is still visible in SBT.