I am having trouble with the JVM immediately exiting using various new applications I wrote which spawn threads through the Scala 2.10 Futures + Promises framework.
It seems that at least with the default execution context, even if I'm using blocking, e.g.
future { blocking { /* work */ }}
no non-daemon thread is launched, and therefore the JVM thinks it can immediately quit.
A stupid workaround is to launch a dummy Thread instance which is just waiting, but then I also need to make sure that this thread stops when the processes are done.
So how do I force them to run on non-daemon threads?
Looking at the default ExecutionContext behind ExecutionContext.global, it is of the fork-join variety, and the ThreadFactory it uses creates daemon threads. To work around this, you could use a different ExecutionContext, one you set up yourself. If you still want the FJP variety (and you probably do, as it scales the best), you should be able to look at what they are doing in ExecutionContextImpl via this link and create something similar. Or just use a cached thread pool via Executors.newCachedThreadPool; its threads are non-daemon, so the JVM won't shut down before your futures complete.
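As a rough sketch of the cached-thread-pool option, something like this should work. The object and method names are made up for illustration; only Executors.newCachedThreadPool and ExecutionContext.fromExecutorService are real library calls.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object NonDaemonPool {
  // Runs one future on a cached thread pool and reports whether its thread is a daemon.
  def ranOnDaemonThread(): Boolean = {
    // The default thread factory used by Executors creates non-daemon threads.
    val pool = Executors.newCachedThreadPool()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try Await.result(Future(Thread.currentThread.isDaemon), 5.seconds)
    finally pool.shutdown() // otherwise idle workers keep the JVM alive for about a minute
  }
}
```

The price of non-daemon threads is that you have to shut the pool down yourself once the work is done, otherwise the JVM lingers until the idle workers time out.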
spawn processes
If this means processes and not just tasks, then scala.sys.process spawns non-daemon threads to run OS processes.
Otherwise, if you're creating a bunch of tasks, this is what Future.sequence helps with. Then just call Await.ready(Future.sequence(futures), Duration.Inf) on the main thread, where futures is the list of futures you created.
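A minimal sketch of that pattern, assuming the tasks are simple computations (the squares helper is invented for the example, and it uses a finite timeout rather than waiting forever):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object WaitForAll {
  // Starts one task per input and blocks the caller until every task has finished.
  def squares(inputs: List[Int]): List[Int] = {
    val futures: List[Future[Int]] = inputs.map(n => Future(n * n))
    // Turns List[Future[Int]] into a single Future[List[Int]], keeping the input order.
    val all: Future[List[Int]] = Future.sequence(futures)
    Await.result(all, 10.seconds)
  }
}
```

Because the main thread is blocked inside Await, it stays alive (and so does the JVM) until all the tasks are done, even though they run on daemon threads.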
Related
According to Scala documentation, no blocking should be done on Future.
"As mentioned earlier, blocking on a future is strongly discouraged for the sake of performance and for the prevention of deadlocks. Callbacks and combinators on futures are a preferred way to use their results. However, blocking may be necessary in certain situations and is supported by the Futures and Promises API."
How can I ensure that all my Futures have completed (and their callbacks finished) before my program exits? I usually use Await.result at the end of my main function to ensure that all Futures have completed.
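import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ConcurrencyExample extends App {
  val gpf = Future { /* some operations */ }
  val ccf = Future { /* some operations yielding a Boolean */ true }
  val atbf = for {
    g <- gpf
    c <- ccf if c
  } yield { /* some operations */ }
  // is it OK to use Await? If not, how do I ensure that all Futures have finished?
  Await.result(atbf, 1000.millis)
}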
Questions
Is using Await wrong? My code doesn't wait for Futures to finish otherwise
If so, what is the alternative?
How do I ensure that the Future and its callback have completed before my main program exits?
Yes, you can use Await.result in your case.
You can use Await.result to keep the main thread alive until the futures complete.
Be careful with Await.result
Note that this applies to both Akka and Play apps.
Await.result should be used very carefully, and only when it is absolutely necessary.
Await.result blocks the thread it is called on until the future completes or the given duration elapses. A blocked thread wastes precious computation resources because it cannot do any useful work in the meantime, like handling a new request or crunching numbers in an algorithm.
So, avoid using Await.result as much as possible.
But when do we use it (Await.result)?
Here is one of the typical use cases for Await.result.
Let's say you have written a program with a main thread, and all the computation inside the main thread is asynchronous. Once you start the asynchronous computation from the main thread, someone has to stop the main thread from exiting until the asynchronous computation finishes. If not, the program stops running and you never see the result of the asynchronous computation.
When an application begins running, there is one non-daemon thread, whose job is to execute main(). The JVM will not exit on its own until all non-daemon threads have completed.
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object Main {
  def main(args: Array[String]): Unit = {
    val f = Future {
      // do something
    }
    // stop main thread till f completes
    Await.result(f, 10.seconds)
  }
}
With the default execution context, Future runs on daemon threads, and daemon threads cannot stop the JVM from shutting down. So the JVM shuts down even if daemon threads are still running.
In the above case there is no other way except blocking the main thread until the computation f completes; otherwise the main thread exits and the computation stops.
In most of the cases you do not need to use Await.result and simple Future composition using map and flatMap would suffice.
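To make that concrete, here is a small sketch of composing futures with flatMap and map, with a single Await at the very edge of the program. The fetchPrice and applyDiscount steps are hypothetical stand-ins for real asynchronous calls.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object Composition {
  // Hypothetical asynchronous steps, used only for illustration.
  def fetchPrice(item: String): Future[Int] = Future(item.length * 10)
  def applyDiscount(price: Int): Future[Int] = Future(price - 5)

  // No thread blocks here: each step is scheduled when the previous one completes.
  def discountedLabel(item: String): Future[String] =
    fetchPrice(item)
      .flatMap(applyDiscount)
      .map(p => s"$item costs $p")

  def main(args: Array[String]): Unit =
    // The only blocking call, needed just to keep main alive until the result is in.
    println(Await.result(discountedLabel("book"), 5.seconds))
}
```

In a server (Akka, Play) you would normally return the composed Future to the framework instead of calling Await at all.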
Risks of using Await.result (In general all blocking code)
Running out of threads in event based model
In an event-based model you will quickly run out of threads if you have blocking code that takes a long time to return. In Play Framework, any blocking call can decrease the performance of the application, and the app will become dead slow as it runs out of threads.
Running out of memory in non-event based models
In thread-per-request models, the problem appears when you have blocking calls that take a long time to return.
case 1: If you have a fixed thread pool, the application might run out of threads.
case 2: If you have a dynamically growing thread pool, your application will suffer from too much context-switching overhead and will also run out of memory because of too many blocked threads in memory.
In all of these cases no useful work is done except waiting for some IO or some other event.
I have an actor that uses ProcessBuilder to execute an external process:
def act {
  while (true) {
    receive {
      case param: String => {
        val filePaths = Seq("/tmp/file1","/tmp/file2")
        val fileList = new ByteArrayInputStream(filePaths.mkString("\n").getBytes())
        val output = s"myExecutable.sh ${param}" #< fileList !!<
        doSomethingWith(output)
      }
    }
  }
}
I run hundreds of these actors in parallel. Sometimes, for an unknown reason, the execution of the process (!!) never returns. It hangs forever, and that specific actor cannot handle new messages. Is there any way to set up a timeout for this process to return, and to retry if it is exceeded?
What could be the reason for these executions to hold forever? Because these commands are not supposed to last more than a few milliseconds.
Edit 1:
Two important facts that I observed:
This problem does not occur on Mac OS X, only on Linux
When I don't use ByteArrayInputStream as input for the execution, the program does not hang
I have an actor that uses ProcessBuilder to execute an external process: ... I run hundreds of these actors in parallel ...
That's some very heavy processing happening in parallel just to achieve a few millisecs of work in each case. Concurrent processing mechanisms rank as follows (from worst to best in terms of resource-usage, scalability and performance):
process = heavy-weight
thread = medium-weight (dozens of threads can execute within a single process space)
actor = light-weight (dozens of actors can execute by leveraging a single shared thread or multiple shared threads)
Concurrently spawning many processes takes significant operating system resources for process creation and termination. In extreme cases, the O/S overhead to start and end processes could consume hundreds or thousands of times more CPU and memory resources than the actual job execution. That's why the thread model was created (and the more efficient actor model). Think of your current processing as doing 'CGI-like' non-scalable O/S-stressing processing from within your extremely scalable actors: that's an anti-pattern. It doesn't take much to stress some operating systems to the point of breakage, and this could be what is happening.
Also, if the files being read are very large in size, it would be best for scalability and reliability to limit the number of processes that concurrently read files on the same disk. It might be OK for up to 10 processes to read concurrently, I doubt it would be OK for 100.
How should an Actor invoke an external program?
Of course, if you converted your logic in myExecutable.sh into Scala, you would not need to create processes at all. Achieving scalability, performance and reliability would be more straightforward.
Assuming this is not possible/desirable, you should limit the total number of processes created and you should reuse them across different Actors / requests over time.
First solution option: (1) create a pool of processes that are reused (say size 10) (2) create actors (say 100) that communicate to/from the processes via ProcessIO (3) if all processes are busy with processing, then it is OK/appropriate that Actors block until one becomes available. The issue with this option: complexity; the 100 actors must do work to interact with the process pool and the actors themselves add little value when the processes are the bottle-neck.
Better solution option: (1) create a limited number of actors (say 10) (2) have each actor create 1 private long-running process (i.e. no pool as such) (3) have each actor communicate to/from via ProcessIO, blocking if the process is busy. Issue: still not as simple as possible; actors interact poorly with blocking processes.
Best solution option: (1) no actors, a simple for-loop from your main thread will achieve the same benefits as actors (2) create a limited number of processes (10) (3) via the for-loop, interact with each process in turn using ProcessIO (if busy, block or skip to the next iteration)
Is there any way to setup a timeout for this process to return, and if it exceeds retry?
Indeed there is. One of the most powerful features of actors is the ability for some actors to spawn other actors and to act as supervisor of them (receiving failure or timeout messages, from which they can recover/restart). With 'native scala actors' this is done via rudimentary programming, generating your own checks and timeout messages. But I won't cover that because the Akka approaches are more powerful and simpler. Plus the next major Scala release (2.11) will use Akka as the supported actor model, with 'native scala actors' deprecated.
Here's an example Akka supervising actor with programmatic timeout/restart (not compiled/tested). Of course, this is not useful if you go with the 3rd solution option:
import akka.actor.{Actor, ActorRef, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.{Escalate, Restart, Resume, Stop}
import scala.concurrent.duration._

// Message types assumed for this sketch; Worker is the child actor that runs the job.
case class WorkRequest(sender: ActorRef, jobReq: String)
case class WorkResponse(req: WorkRequest, jobResp: String)
case class WorkTimeout(req: WorkRequest)

class Supervisor extends Actor {
  import context.dispatcher // execution context needed by the scheduler

  override val supervisorStrategy =
    OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
      case _: ArithmeticException      => Resume   // the failing child keeps its state and continues
      case _: NullPointerException     => Restart  // the failing child is restarted
      case _: IllegalArgumentException => Stop     // the failing child is stopped
      case _: Exception                => Escalate // the failure is passed to this supervisor's parent
    }

  var worker = context.actorOf(Props[Worker]) // creates a supervised child actor
  var pendingRequests = Set.empty[WorkRequest]

  def receive = {
    case req: WorkRequest =>
      pendingRequests += req
      worker ! req
      context.system.scheduler.scheduleOnce(10.seconds, self, WorkTimeout(req))
    case resp @ WorkResponse(req, _) =>
      pendingRequests -= req
      req.sender ! resp
    case WorkTimeout(req) =>
      if (pendingRequests.contains(req)) {
        // replace the unresponsive worker (ActorRef has no restart method)
        context.stop(worker)
        worker = context.actorOf(Props[Worker])
        // resend all pending requests
        pendingRequests.foreach(worker ! _)
      }
  }
}
A word of caution: this approach to actor supervision will not overcome poor architecture & design. If you start with suitable process/thread/actor design to meet your requirements, then supervision will promote reliability. But if you start with poor design, then there's a risk that using 'brute-force' recovery from O/S-level failures could exacerbate your problems - making process reliability worse or even causing the machine to crash.
I don't have enough info to reproduce the issue, so I can't diagnose it exactly, but here's how I'd go about diagnosing it if I were in your shoes. The basic approach is a differential diagnosis - identify possible causes, and tests that would prove or rule them out.
The first thing I'd do is to validate that the myExecutable.sh process spawned by the application is actually terminating.
If the process isn't terminating, then this is part of the problem, so we need to understand why. One thing we could do is to run something other than myExecutable.sh. You suggested that ByteArrayInputStream may be part of the problem, which suggests that myExecutable.sh is getting bad input on stdin. If that's the case, then you could instead run a script that simply logs its input to a file, which would show this. If the input is invalid, then ByteArrayInputStream is providing bad data for some reason - thread safety and unicode are the obvious culprits, but looking at the actual bad data should give you a clue. If the input is valid, then it's a bug in myExecutable.sh.
If the process is terminating, then the problem is somewhere else. My first guesses would be that it's either related to actor scheduling (actor libraries typically use ForkJoin for execution, which is great, but doesn't deal well with blocking code), or a bug in the scala.sys.process library (wouldn't be unprecedented - I had to drop scala.sys.process from a project I was working on because of a memory leak).
Looking at the stack trace for a hung thread should give you some clues (VisualVM is your friend), as you should be able to see what's waiting. You can then find the relevant code in the OpenJDK or Scala standard library source code. Where you go from there depends on what you find.
Can you not fire off this process and its handling in a future and use a timed wait against it?
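Something along these lines, as a sketch only. It assumes the command can be started with scala.sys.process's run, which returns a Process handle that can be destroyed; the TimedProcess object and runWithTimeout name are made up, and feeding stdin from a ByteArrayInputStream is left out to keep it short.

```scala
import scala.concurrent.{Await, Future, TimeoutException, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.sys.process._

object TimedProcess {
  // Runs cmd and returns its stdout, or None if it did not exit within the timeout.
  def runWithTimeout(cmd: Seq[String], timeout: FiniteDuration): Option[String] = {
    val out = new StringBuilder
    val logger = ProcessLogger(
      line => out.synchronized { out.append(line).append('\n') }, // collect stdout
      _ => ()                                                     // ignore stderr
    )
    val proc = cmd.run(logger) // starts the process without blocking
    val exit = Future(blocking(proc.exitValue())) // the blocking wait happens off the caller's thread
    try {
      Await.result(exit, timeout)
      Some(out.synchronized(out.toString))
    } catch {
      case _: TimeoutException =>
        proc.destroy() // kill the hung process so it does not pile up
        None
    }
  }
}
```

A caller could retry when it gets None back. Note that this only limits the damage of a hang; it does not explain why the process hangs in the first place.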
I don't think we can figure it out without knowing myExecutable.sh or doSomethingWith.
When it hangs, try killing all the myExecutable.sh processes.
If it helps, you should inspect the myExecutable.sh.
If it does not help, you should inspect the doSomethingWith function.
I'm currently researching threads in the context of the operating system and I'm unsure if a thread is a set sequence of instructions that can be repeatedly executed or if it is filled and replaced with new instructions by the user or the operating system.
Thanks a bundle!
-Tom
I'm not quite sure what you mean - the compiled instructions for a program are stored in memory and are not changed at runtime (at least for languages which are not JIT-compiled).
A thread is an entirely separate concept from the code itself. A thread gives you the ability to be running at "two places at once" in the code. At a conceptual level, a thread is simply a container for the context that you need at any point in the execution of some code. This means that each thread has a call stack and a set of registers (which are either actually stored in the registers of a processor if the thread is running, or elsewhere if the thread is paused).
Almost all thread libraries work such that a new thread will execute some user-defined function and will then exit. This function can be long-running, just like main() (which is the function executed by the first thread in your process).
If the threads are supported by the OS (ie they are not "green threads"/"fibers") they will exit by calling an OS API which tells the OS it can deallocate any data it has which is associated with that thread.
Sometimes, abstractions are built on top of this mechanism such that a thread or pool of threads will execute a function which simply loops over a queue of tasks to run, but the fundamental mechanism is the same. However, these abstractions are provided by user libraries built on top of the OS threading mechanisms, not by the OS itself.
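To illustrate that last point, here is a toy version of such an abstraction on the JVM: one thread whose function is nothing more than a loop over a task queue. This is only a sketch with invented names (TinyWorker, runAll); real thread pools are more involved.

```scala
import java.util.concurrent.LinkedBlockingQueue

object TinyWorker {
  // Runs every task on one worker thread and returns the results in submission order.
  def runAll(tasks: List[() => Int]): List[Int] = {
    val queue = new LinkedBlockingQueue[Option[() => Int]]()
    val results = new LinkedBlockingQueue[Int]()

    // The thread's function is just a loop that pulls tasks until it sees None.
    val worker = new Thread(new Runnable {
      def run(): Unit = {
        var running = true
        while (running) queue.take() match {
          case Some(task) => results.put(task())
          case None       => running = false
        }
      }
    })
    worker.start()

    tasks.foreach(t => queue.put(Some(t)))
    queue.put(None) // sentinel: lets the loop end so the thread's function returns
    worker.join()   // once the function returns, the thread exits
    List.fill(tasks.size)(results.take())
  }
}
```

The instructions never change; the thread simply keeps executing the same loop, and what varies is the data (the tasks) it pulls from the queue.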
So I'm writing a mini timeout library in scala, it looks very similar to the code here: How do I get hold of exceptions thrown in a Scala Future?
The function I execute is either going to complete successfully, or block forever, so I need to make sure that on a timeout the executing thread is cancelled.
Thus my question is: On a timeout, does awaitAll terminate the underlying actor, or just let it keep running forever?
One alternative that I'm considering is to use the java Future library to do this as there is an explicit cancel() method one can call.
[Disclaimer - I'm new to Scala actors myself]
As I read it, scala.actors.Futures.awaitAll waits until the list of futures are all resolved OR until the timeout. It will not Future.cancel, Thread.interrupt, or otherwise attempt to terminate a Future; you get to come back later and wait some more.
Future.cancel may be suitable, but be aware that your code may need to participate in effecting the cancel operation; it doesn't necessarily come for free. Future.cancel cancels a task that is scheduled but not yet started. For a running task it interrupts the thread [setting a flag that can be checked], and the task may or may not acknowledge the interrupt. Review Thread.interrupt and Thread.isInterrupted(). Your long-running task would normally check whether it is being interrupted (your code) and self-terminate. Various methods (e.g. Thread.sleep, Object.wait and others) respond to the interrupt by throwing InterruptedException. You need to review and understand that mechanism to ensure your code will meet your needs within those constraints. See this.
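Here is a small sketch of that mechanism using java.util.concurrent from Scala. The task cooperates only in the sense that Thread.sleep throws InterruptedException; the CancelDemo object and its method are made up for the example.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object CancelDemo {
  // Starts a task that would sleep for a minute, cancels it, and reports
  // whether the task observed the interrupt.
  def cancelSleepingTask(): Boolean = {
    val pool = Executors.newSingleThreadExecutor()
    val started = new CountDownLatch(1)
    val done = new CountDownLatch(1)
    val sawInterrupt = new AtomicBoolean(false)

    val task = pool.submit(new Runnable {
      def run(): Unit =
        try {
          started.countDown()
          Thread.sleep(60000) // throws InterruptedException when the thread is interrupted
        } catch {
          case _: InterruptedException => sawInterrupt.set(true)
        } finally done.countDown()
    })

    started.await()   // make sure the task is running, not merely queued
    task.cancel(true) // true = interrupt the thread that is running the task
    done.await(5, TimeUnit.SECONDS)
    pool.shutdown()
    sawInterrupt.get
  }
}
```

If the task body were a tight loop that never checks Thread.isInterrupted() and never calls an interruptible method, cancel(true) would mark the future as cancelled but the loop would keep running.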
Futures are very convenient, but in practice, you may need some guarantees on their execution. For example, consider:
import scala.actors.Futures._
def slowFn(time: Int) = {
  Thread.sleep(time * 1000)
  println("%d second fn done".format(time))
}
val fs = List( future(slowFn(2)), future(slowFn(10)) )
awaitAll(5000, fs:_*)
println("5 second expiration. Continuing.")
Thread.sleep(12000) // ie more calculations
println("done with everything")
The idea is to kick off some slow running functions in parallel. But we wouldn't want to hang forever if the functions executed by the futures don't return. So we use awaitAll() to put a timeout on the futures. But if you run the code, you see that the 5 second timer expires, but the 10 second future continues to run and returns later. The timeout doesn't kill the future; it just limits the join wait.
So how do you kill a future after a timeout period? It seems like futures can't be used in practice unless you're certain that they will return in a known amount of time. Otherwise, you run the risk of losing threads in the thread pool to non-terminating futures until there are none left.
So the questions are: How do you kill futures? What are the intended usage patterns for futures given these risks?
Futures are intended to be used in settings where you do need to wait for the computation to complete, no matter what. That's why they are described as being used for slow running functions. You want that function's result, but you have other stuff you can be doing meanwhile. In fact, you might have many futures, all independent of each other that you may want to run in parallel, while you wait until all complete.
The timer just provides a way to get partial results.
I think the reason Future can't simply be "killed" is exactly the same as why java.lang.Thread.stop() is deprecated.
While Future is running, a Thread is required. In order to stop a Future without calling stop() on the executing Thread, application specific logic is needed: checking for an application specific flag or the interrupted status of the executing Thread periodically is one way to do it.
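A minimal sketch of the application-specific flag approach with scala.concurrent futures, assuming the work can be split into small steps (StoppableTask and countUntilStopped are illustrative names only):

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object StoppableTask {
  // Counts upward in small steps until `stop` is set, then returns how far it got.
  def countUntilStopped(stop: AtomicBoolean): Future[Long] = Future {
    var n = 0L
    while (!stop.get) { // the task checks the flag itself; nothing kills it from outside
      n += 1
      Thread.sleep(1)
    }
    n
  }

  def main(args: Array[String]): Unit = {
    val stop = new AtomicBoolean(false)
    val task = countUntilStopped(stop)
    Thread.sleep(100)
    stop.set(true) // request cancellation; the task notices on its next check
    println(s"stopped after ${Await.result(task, 5.seconds)} steps")
  }
}
```

This only works if the task gets back to the check regularly. A task stuck inside a single blocking call never sees the flag, which is the case where interrupts or killing an external process are needed instead.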