Practical use of futures? Ie, how to kill them? - scala

Futures are very convenient, but in practice, you may need some guarantees on their execution. For example, consider:
import scala.actors.Futures._
def slowFn(time: Int) = {
  Thread.sleep(time * 1000)
  println("%d second fn done".format(time))
}
val fs = List(future(slowFn(2)), future(slowFn(10)))
awaitAll(5000, fs: _*)
println("5 second expiration. Continuing.")
Thread.sleep(12000) // i.e. more calculations
println("done with everything")
The idea is to kick off some slow running functions in parallel. But we wouldn't want to hang forever if the functions executed by the futures don't return. So we use awaitAll() to put a timeout on the futures. But if you run the code, you see that the 5 second timer expires, but the 10 second future continues to run and returns later. The timeout doesn't kill the future; it just limits the join wait.
So how do you kill a future after a timeout period? It seems like futures can't be used in practice unless you're certain that they will return in a known amount of time. Otherwise, you run the risk of losing threads in the thread pool to non-terminating futures until there are none left.
So the questions are: How do you kill futures? What are the intended usage patterns for futures given these risks?

Futures are intended to be used in settings where you do need to wait for the computation to complete, no matter what. That's why they are described as being used for slow running functions. You want that function's result, but you have other stuff you can be doing meanwhile. In fact, you might have many futures, all independent of each other that you may want to run in parallel, while you wait until all complete.
The timer just provides a way to get partial results.
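As a rough illustration of "partial results", here is a minimal sketch written against the newer scala.concurrent API rather than scala.actors (that swap is my assumption; the question uses the older library). The names PartialResults, slowFn and completedWithin are made up for the example.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Success

object PartialResults {
  // Hypothetical slow function: sleeps, then returns its argument.
  def slowFn(millis: Int): Int = { Thread.sleep(millis.toLong); millis }

  // Wait up to `timeout` for all futures, then return whatever has succeeded so far.
  // Futures that are still running are NOT stopped; they are simply left out.
  def completedWithin[A](fs: List[Future[A]], timeout: FiniteDuration): List[A] = {
    try Await.ready(Future.sequence(fs), timeout)
    catch { case _: TimeoutException => () } // timing out is expected here
    fs.flatMap(_.value).collect { case Success(a) => a }
  }

  def main(args: Array[String]): Unit = {
    val fs = List(Future(slowFn(100)), Future(slowFn(3000)))
    // Only the fast future has a value after one second.
    println(completedWithin(fs, 1.second))
  }
}
```

The slow future keeps running in the background after the timeout, which is exactly the behaviour described in the question.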

I think the reason Future can't simply be "killed" is exactly the same as why java.lang.Thread.stop() is deprecated.
While Future is running, a Thread is required. In order to stop a Future without calling stop() on the executing Thread, application specific logic is needed: checking for an application specific flag or the interrupted status of the executing Thread periodically is one way to do it.
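A minimal sketch of the flag-checking approach, assuming the scala.concurrent API and a task that can be split into small slices. CooperativeCancel and cancellableWork are hypothetical names, and the sleep is only a stand-in for real work.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object CooperativeCancel {
  // Runs up to `steps` slices of work, checking the flag and the thread's
  // interrupted status before each slice. Returns the number of slices done.
  def cancellableWork(steps: Int, cancelled: AtomicBoolean): Int = {
    var done = 0
    while (done < steps && !cancelled.get() && !Thread.currentThread().isInterrupted) {
      Thread.sleep(10) // stand-in for one slice of real work
      done += 1
    }
    done
  }

  def main(args: Array[String]): Unit = {
    val cancelled = new AtomicBoolean(false)
    val f = Future(cancellableWork(1000, cancelled))
    Thread.sleep(100)
    cancelled.set(true) // ask the task to stop; nothing is killed
    println(Await.result(f, 1.second)) // far fewer than 1000 slices
  }
}
```

The task only stops because it checks the flag itself; a task that never checks would keep its thread busy no matter what the caller does.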

Related

Should I block on a Future - scala

According to Scala documentation, no blocking should be done on Future.
"As mentioned earlier, blocking on a future is strongly discouraged for the sake of performance and for the prevention of deadlocks. Callbacks and combinators on futures are a preferred way to use their results. However, blocking may be necessary in certain situations and is supported by the Futures and Promises API."
How can I ensure that all my Futures have completed (and their callbacks have finished) before my program exits? I usually use Await.result at the end of my main function to ensure that all Futures have completed.
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
object ConcurrencyExample extends App {
  val gpf = Future {some operations}
  val ccf = Future {some operations}
  val atbf = for {
    g <- gpf
    c <- ccf if c == true
  } yield {some operations}
  // Is it OK to use Await? If not, how do I ensure that all Futures have finished?
  Await.result(atbf, 1000.millis)
}
Questions
Is using Await wrong? My code doesn't wait for the Futures to finish otherwise.
If so, what is the alternative?
How do I ensure that the Future and its callback have completed before my main program exits?
Yes, you can use Await.result in your case.
You can use Await.result to keep the main thread alive until the futures complete.
Be careful with Await.result
Note: this applies to both Akka and Play apps.
Await.result should be used very carefully, and only when it is absolutely necessary.
Await.result blocks the calling thread until the future completes or the given duration elapses. A blocked thread wastes a precious computation resource, because it cannot do any useful work such as handling a new request or number crunching in an algorithm.
So avoid using Await.result as much as possible.
But when do we use Await.result?
Here is one typical use case for Await.result.
Let's say you have written a program whose main thread starts only asynchronous computations. Once those computations have been started, something has to stop the main thread from exiting until they finish; otherwise the program stops running and you never see their results.
When an application begins running, there is one non-daemon thread, whose job is to execute main(). The JVM will not exit on its own until all non-daemon threads have finished.
object Main {
  def main(args: Array[String]): Unit = {
    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._
    val f = Future {
      // do something
    }
    // block the main thread until f completes (or 10 seconds pass)
    Await.result(f, 10.seconds)
  }
}
Futures run on daemon threads (at least with the default global execution context), and daemon threads cannot stop the JVM from shutting down. So the JVM shuts down even if daemon threads are still running.
In the above case there is no other way except blocking the main thread until the computation f completes; otherwise the main thread exits and the computation stops.
In most cases you do not need Await.result, and simple Future composition using map and flatMap suffices.
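A small sketch of what composition instead of blocking can look like. fetchPrice and fetchTax are hypothetical asynchronous steps invented for the example; the only Await sits at the very edge of the program.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object Composition {
  // Hypothetical asynchronous steps.
  def fetchPrice(): Future[Int] = Future(40)
  def fetchTax(price: Int): Future[Int] = Future(price / 20)

  // No blocking here: the for-comprehension desugars to flatMap and map
  // and describes the combined result as a new Future.
  def total(): Future[Int] =
    for {
      price <- fetchPrice()
      tax   <- fetchTax(price)
    } yield price + tax

  def main(args: Array[String]): Unit = {
    // The single blocking call, used only to keep main alive.
    println(Await.result(total(), 5.seconds))
  }
}
```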
Risks of using Await.result (and blocking code in general)
Running out of threads in an event-based model
In an event-based model you will quickly run out of threads if you have blocking code that takes a long time to return. In Play Framework, any blocking call can degrade performance, and the app becomes extremely slow as it runs out of threads.
Running out of memory in non-event-based models
In thread-per-request models, problems arise when blocking calls take a long time to return.
Case 1: with a fixed thread pool, the application might run out of threads.
Case 2: with a dynamically growing thread pool, the application suffers from excessive context-switching overhead and can also run out of memory, because too many blocked threads are held in memory.
In all of these cases no useful work is done except waiting for some I/O or some other event.

How to create ScalaZ Task which is running asynchronously right after creation?

I need Scalaz Task (or some wrapper) which is already running, and can return value immediately if it is completed, or after some waiting if it is not. In terms of Future I could do it like this:
val f = myTask.get.started
This way I have a Future running asynchronously, which on f.run returns the result immediately when called after the computation is complete, or blocks for some time and waits for completion if it is not. However, this way I lose error handling.
How can I have a Task, without using Future, that is already running asynchronously before run or runAsync is called on it?
The intention of scalaz.Task is clear control over the execution, which makes referential transparency possible. If you want to fork off the Task, use:
val result = Task.fork(myTask)
and the task will run in its own threadpool as soon as you run it with one of the unsafe* methods.

play - how to wrap a blocking code with futures

I am trying to understand the difference between the 2 methods, in terms of functionality.
class MyService(blockService: BlockService) {
  def doSomething1(): Future[Boolean] = {
    // do
    // some non blocking
    // stuff
    val result = blockService.block()
    Future.successful(result)
  }
  def doSomething2(): Future[Boolean] = {
    Future {
      // do
      // some non blocking
      // stuff
      blockService.block()
    }
  }
}
To my understanding, the difference between the two is which thread actually gets blocked.
So if a thread thread_1 executes doSomething1, thread_1 is the one that is blocked, while if thread_1 executes doSomething2, a new thread thread_2 runs the body, and thread_2 is the one that is blocked.
Is this true?
If so, then is there really no preferred way to write this code? If I don't care which thread is eventually blocked, the end result is the same.
doSomething1 seems like a weird way to write this code; I would choose doSomething2.
Does that make sense?
Yes, doSomething1 and doSomething2 block different threads, and depending on your scenario, this is an important decision.
As @AndreasNeumann said, you can have different execution contexts in doSomething2. Imagine that the main execution context is the one receiving HTTP requests from your users. Blocking threads in this context is bad because you can easily exhaust the execution context and impact requests that have nothing to do with doSomething.
Play docs have a better explanation about the possible problems with having blocking code:
If you plan to write blocking IO code, or code that could potentially do a lot of CPU intensive work, you need to know exactly which thread pool is bearing that workload, and you need to tune it accordingly. Doing blocking IO without taking this into account is likely to result in very poor performance from Play framework, for example, you may see only a few requests per second being handled, while CPU usage sits at 5%. In comparison, benchmarks on typical development hardware (eg, a MacBook Pro) have shown Play to be able to handle workloads in the hundreds or even thousands of requests per second without a sweat when tuned correctly.
In your case, both methods are executed on Play's default thread pool. I suggest you take a look at the recommended best practices and see whether you need a different execution context. I also suggest reading the Akka docs about Dispatchers and Futures to gain a better understanding of how Futures are executed and of blocking versus non-blocking code.
This approach makes sense if you make use of a different execution context in the second method.
For example, you could have one for answering requests and another for blocking operations.
You would then use the normal Play execution context to keep your application running and responsive, and move blocking operations into a separate one.
def doSomething2(): Future[Boolean] = Future {
  blocking { blockService.block() }
}(mySpecialExecutionContextForBlockingOperations)
For a little more information: http://docs.scala-lang.org/overviews/core/futures.html#blocking
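To make the idea of a separate execution context concrete, here is a minimal standalone sketch. It assumes a plain fixed-size java.util.concurrent pool rather than anything Play-specific; BlockingPool, blockingEc, the pool size of 4 and the block() stand-in are all invented for the example.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object BlockingPool {
  // Assumed pool size; tune it to the actual blocking workload.
  val blockingEc: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))

  // Hypothetical stand-in for blockService.block(): blocks briefly,
  // then reports which thread it ran on.
  def block(): String = { Thread.sleep(50); Thread.currentThread().getName }

  // The blocking call runs on blockingEc, not on the caller's thread.
  def doSomething2(): Future[String] = Future(block())(blockingEc)

  def main(args: Array[String]): Unit = {
    println(Await.result(doSomething2(), 2.seconds)) // name of a pool thread
    blockingEc.shutdown() // the pool's non-daemon threads would otherwise keep the JVM alive
  }
}
```

Threads in the request-handling context stay free because only the dedicated pool ever waits on the blocking call.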
You are correct. I don't see a point in doSomething1. It simply complicates the interface for the caller while not providing the benefits of an asynchronous API.
Does BlockService perform a blocking operation?
Normally, as @Andreas notes, wrapping the call in blocking and running it on another thread is the meaningful approach.

Execution context without daemon threads for futures

I am having trouble with the JVM immediately exiting using various new applications I wrote which spawn threads through the Scala 2.10 Futures + Promises framework.
It seems that at least with the default execution context, even if I'm using blocking, e.g.
future { blocking { /* work */ }}
no non-daemon thread is launched, and therefore the JVM thinks it can immediately quit.
A stupid workaround is to launch a dummy Thread instance which is just waiting, but then I also need to make sure that this thread stops when the processes are done.
So how do I force them to run on non-daemon threads?
Looking at the default ExecutionContext attached to ExecutionContext.global, it's of the fork-join variety, and the ThreadFactory it uses sets the threads to daemon. If you want to work around this, you could use a different ExecutionContext, one you set up yourself. If you still want the FJP variety (and you probably do, as it scales the best), you should be able to look at what they are doing in ExecutionContextImpl via this link and create something similar. Or just use a cached thread pool via Executors.newCachedThreadPool: its threads are non-daemon, so the JVM won't shut down before your futures complete.
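A minimal sketch of the cached-thread-pool option. It relies on the default thread factory of java.util.concurrent.Executors creating non-daemon threads; NonDaemonContext and newContext are hypothetical names.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService, Future}

object NonDaemonContext {
  // An execution context backed by a cached pool of non-daemon threads.
  def newContext(): ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newCachedThreadPool())

  def main(args: Array[String]): Unit = {
    implicit val ec: ExecutionContextExecutorService = newContext()
    val f = Future {
      Thread.sleep(500)
      Thread.currentThread().isDaemon
    }
    // main returns right away, but the JVM stays alive until the pool is shut down.
    f.onComplete { result =>
      println(s"ran on a daemon thread: $result")
      ec.shutdown()
    }
  }
}
```

Remember to shut the pool down once the work is done, otherwise the process lingers until the idle pool threads time out.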
spawn processes
If this means processes and not just tasks, then scala.sys.process spawns non-daemon threads to run OS processes.
Otherwise, if you're creating a bunch of tasks, this is what Future.sequence helps with. Then just Await ready (Future sequence List(futures)) on the main thread.
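A short sketch of the Future.sequence plus Await.ready pattern, assuming the default global execution context. WaitForAll, task and runAll are hypothetical names, and the sleep stands in for real work.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object WaitForAll {
  // Hypothetical task: takes a while, then returns the square of n.
  def task(n: Int): Future[Int] = Future { Thread.sleep(100L * n); n * n }

  // Turns a List of Futures into one Future of a List (input order is kept).
  def runAll(ns: List[Int]): Future[List[Int]] = Future.sequence(ns.map(task))

  def main(args: Array[String]): Unit = {
    val all = runAll(List(1, 2, 3))
    Await.ready(all, 10.seconds) // keeps the main thread alive until every task is done
    println(all.value)
  }
}
```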

In Scala, does Futures.awaitAll terminate the thread on timeout?

So I'm writing a mini timeout library in scala, it looks very similar to the code here: How do I get hold of exceptions thrown in a Scala Future?
The function I execute is either going to complete successfully, or block forever, so I need to make sure that on a timeout the executing thread is cancelled.
Thus my question is: On a timeout, does awaitAll terminate the underlying actor, or just let it keep running forever?
One alternative that I'm considering is to use the java Future library to do this as there is an explicit cancel() method one can call.
[Disclaimer - I'm new to Scala actors myself]
As I read it, scala.actors.Futures.awaitAll waits until the list of futures are all resolved OR until the timeout. It will not Future.cancel, Thread.interrupt, or otherwise attempt to terminate a Future; you get to come back later and wait some more.
java.util.concurrent.Future.cancel may be suitable, but be aware that your code may need to participate in effecting the cancel operation; it doesn't necessarily come for free. Future.cancel cancels a task that is scheduled but not yet started. For a running task it interrupts the thread [setting a flag that can be checked], which the task may or may not acknowledge. Review Thread.interrupt and Thread.isInterrupted(). Your long-running task would normally check whether it is being interrupted (in your code) and self-terminate. Various methods (e.g. Thread.sleep, Object.wait and others) respond to the interrupt by throwing InterruptedException. You need to review and understand that mechanism to ensure your code will meet your needs within those constraints. See this.
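A minimal sketch of that mechanism using java.util.concurrent directly from Scala. InterruptOnCancel and cancelBlockedTask are hypothetical names, and the long sleep stands in for a call that would otherwise block forever.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object InterruptOnCancel {
  // Returns true if the blocked task observed the interrupt sent by cancel(true).
  def cancelBlockedTask(): Boolean = {
    val pool = Executors.newSingleThreadExecutor()
    val started = new CountDownLatch(1)
    val finished = new CountDownLatch(1)
    val interrupted = new AtomicBoolean(false)

    val task = pool.submit(new Runnable {
      def run(): Unit = {
        started.countDown()
        try Thread.sleep(60000) // stand-in for a call that blocks "forever"
        catch { case _: InterruptedException => interrupted.set(true) }
        finally finished.countDown()
      }
    })

    started.await()                     // make sure the task is really running
    task.cancel(true)                   // true = interrupt the thread running the task
    finished.await(5, TimeUnit.SECONDS) // give the task time to react
    pool.shutdown()
    interrupted.get()
  }

  def main(args: Array[String]): Unit =
    println(cancelBlockedTask())
}
```

This only works because Thread.sleep responds to interruption; a task stuck in code that ignores interrupts would keep running after cancel(true) returns.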