Execution Context not getting shut down - Scala

I am running into a strange problem where the execution context is not getting shut down. I have tried it with and without awaiting.
import java.util.concurrent.{ExecutorService, Executors, TimeUnit}
import scala.concurrent.ExecutionContext

val customExecutor: ExecutorService =
  Executors.newFixedThreadPool(serviceConfig.serviceConf.numberOfThreads)

implicit val customExecutionContext: ExecutionContext =
  ExecutionContext.fromExecutorService(
    Executors.newFixedThreadPool(10)
  )

futureCall() map { result =>
  customExecutor.shutdown()
  customExecutor.awaitTermination(60, TimeUnit.SECONDS)
}

You are waiting for termination from inside a task running on the executor, but the executor can't shut down while one of its threads is still running, so it's never going to happen. Move the awaitTermination call outside the map function.
However, the real problem is that you are creating two ExecutorServices and stopping the wrong one. If you just pass in the service that you created, it terminates as expected:
import java.util.concurrent.{ExecutorService, Executors, TimeUnit}
import scala.concurrent.ExecutionContext

val customExecutor: ExecutorService =
  Executors.newFixedThreadPool(10)

implicit val customExecutionContext: ExecutionContext =
  ExecutionContext.fromExecutorService(customExecutor)

futureCall() map { result =>
  customExecutor.shutdown()
}

customExecutor.awaitTermination(5, TimeUnit.SECONDS)
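As an aside, ExecutionContext.fromExecutorService returns an ExecutionContextExecutorService, which is itself an ExecutorService, so a single object is enough and can be shut down directly. A minimal sketch along those lines (futureCall here is just a stand-in for whatever produces the Future):

import java.util.concurrent.{Executors, TimeUnit}
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService, Future}

// The wrapper returned by fromExecutorService exposes shutdown/awaitTermination itself.
implicit val ec: ExecutionContextExecutorService =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(10))

def futureCall(): Future[Int] = Future(42) // stand-in for the real call

futureCall() map { result =>
  println(result)
  ec.shutdown() // request shutdown once the work is done
}
ec.awaitTermination(5, TimeUnit.SECONDS)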

Related

Scala futures - how to end on completion?

I've inherited some code from an ex-coworker where he started using futures (in Scala) to process some data in Databricks.
I split it into chunks that complete in a similar time period. However, there is no output, and I know the code isn't using onSuccess or Await or anything like that.
The thing is, the code finishes running (without returning output), but the block in Databricks keeps executing until the Thread.sleep() part.
I'm new to Scala and futures and am not sure how I can just exit the notebook once all the futures finish running (should I just use dbutils.notebook.exit() after the future blocks?).
Code is below:
import scala.concurrent.{Future, blocking, Await}
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext
import com.databricks.WorkflowException

val numNotebooksInParallel = 15
// If you create too many notebooks in parallel the driver may crash when you submit all of the jobs at once.
// This code limits the number of parallel notebooks.
implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(numNotebooksInParallel))
val ctx = dbutils.notebook.getContext()
// The simplest interface we can have, but it doesn't
// have protection against submitting too many notebooks in parallel at once
println("starting parallel jobs... hang tight")

Future {
  process("pro", "bseg")
  process("prc", "bkpf")
  process("prc", "bseg")
  process("pr4", "bkpf")
  process("pr4", "bseg")
  println("done with future1")
}

Future {
  process("pr5", "bkpf")
  process("pr5", "bseg")
  process("pri", "bkpf")
  process("pri", "bseg")
  process("pr9", "bkpf")
  println("done with future2")
}

Future {
  process("pr9", "bseg")
  process("prl", "bkpf")
  process("prl", "bseg")
  process("pro", "bkpf")
  println("done with future3")
}

println("finished futures - yay! :)")
Thread.sleep(5 * 60 * 60 * 1000)
println("thread timed out after 5 hrs... hope it all finished.")
One would typically save the futures as values:
val futs = Seq(
  Future {
    process("pro", "bseg")
    // and so on
  }
  // then the other futures
)
and then operate on the futures:
import scala.concurrent.Await
import scala.concurrent.duration._

Await.result(Future.sequence(futs), 5.hours)
The future returned by Future.sequence fails as soon as any of the futures fails (although the others keep running), and succeeds once they have all succeeded. If you want to wait for all of them even if some fail, you can recover each future before sequencing:
Await.result(
  Future.sequence(futs.map(_.recover { case _ => () })),
  5.hours
)
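Applied to the notebook above, a minimal sketch could look like the following (dbutils.notebook.exit is the Databricks-specific call mentioned in the question; whether you need it at the end depends on how the notebook is invoked):

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

val futs = Seq(
  Future {
    process("pro", "bseg")
    process("prc", "bkpf")
    // ... rest of the first chunk
    println("done with future1")
  },
  Future {
    process("pr5", "bkpf")
    // ... rest of the second chunk
    println("done with future2")
  },
  Future {
    process("pr9", "bseg")
    // ... rest of the third chunk
    println("done with future3")
  }
)

// Block the notebook's driver thread until every chunk has finished (or 5 hours pass),
// instead of sleeping for a fixed amount of time.
Await.result(Future.sequence(futs.map(_.recover { case _ => () })), 5.hours)
dbutils.notebook.exit("done")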

Handle SIGTERM in akka-http

The current (10.1.3) Akka HTTP docs:
https://doc.akka.io/docs/akka-http/current/server-side/graceful-termination.html
talk about graceful termination, using this code sample:
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route
import akka.stream.ActorMaterializer
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

implicit val system = ActorSystem()
implicit val dispatcher = system.dispatcher
implicit val materializer = ActorMaterializer()

val routes = get {
  complete("Hello world!")
}

val binding: Future[Http.ServerBinding] =
  Http().bindAndHandle(routes, "127.0.0.1", 8080)

// ...
// once ready to terminate the server, invoke terminate:
val onceAllConnectionsTerminated: Future[Http.HttpTerminated] =
  Await.result(binding, 10.seconds)
    .terminate(hardDeadline = 3.seconds)

// once all connections are terminated,
// - you can invoke coordinated shutdown to tear down the rest of the system:
onceAllConnectionsTerminated.flatMap { _ ⇒
  system.terminate()
}
I am wondering at what point this gets called; the comment states:
// once ready to terminate the server
What does this mean exactly, i.e. who/what determines the server is ready to terminate?
Do I have to put the shutdown code above in some hook function somewhere so that it is invoked on Akka HTTP receiving a SIGTERM?
I’ve tried putting this into the shutdown hook:
CoordinatedShutdown(system).addCancellableJvmShutdownHook {
  // once ready to terminate the server, invoke terminate:
  val onceAllConnectionsTerminated: Future[Http.HttpTerminated] =
    Await.result(binding, 10.seconds)
      .terminate(hardDeadline = 3.seconds)

  // once all connections are terminated,
  // - you can invoke coordinated shutdown to tear down the rest of the system:
  onceAllConnectionsTerminated.flatMap { _ ⇒
    system.terminate()
  }
}
But requests in progress are terminated immediately when a SIGTERM is sent (via kill), rather than being allowed to complete.
I also found a slightly different way of shutdown from https://github.com/akka/akka-http/issues/1210#issuecomment-338825745:
CoordinatedShutdown(system).addTask(
  CoordinatedShutdown.PhaseServiceUnbind, "http_shutdown") { () =>
  bind.flatMap(_.unbind).flatMap { _ =>
    Http().shutdownAllConnectionPools
  }.map { _ =>
    Done
  }
}
Maybe I should be using this to handle SIGTERM? I'm not sure.
Thanks!
Resolution taken from this answer here:
https://discuss.lightbend.com/t/graceful-termination-on-sigterm-using-akka-http/1619
CoordinatedShutdown(system).addTask(
  CoordinatedShutdown.PhaseServiceUnbind, "http_shutdown") { () =>
  bind.flatMap(_.terminate(hardDeadline = 1.minute)).map { _ =>
    Done
  }
}
For me, the main part was to increase akka.coordinated-shutdown.default-phase-timeout, as processing the in-flight requests took longer than the 5-second default. You can also increase the timeout for just that one phase. I had the following message in my logs:
Coordinated shutdown phase [service-unbind] timed out after 5000 milliseconds
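For reference, a minimal sketch of raising those timeouts when creating the actor system (the 30-second value is arbitrary, and the setting names are taken from the Akka reference configuration as I understand it):

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// Give in-flight requests more time to finish during coordinated shutdown.
val config = ConfigFactory.parseString(
  """
    |akka.coordinated-shutdown.default-phase-timeout = 30 s
    |akka.coordinated-shutdown.phases.service-unbind.timeout = 30 s
  """.stripMargin).withFallback(ConfigFactory.load())

implicit val system = ActorSystem("server", config)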

akka-http queries do not run in parallel

I am very new to akka-http and am having trouble getting queries on the same route to run in parallel.
I have a route that may return the result very quickly (if cached) or not (heavy CPU multithreaded computations). I would like to run these queries in parallel: if a short one arrives after a long one with heavy computation, I do not want the second call to wait for the first to finish.
However, it seems that these queries do not run in parallel if they are on the same route (they do run in parallel if they are on different routes).
I can reproduce it in a basic project:
Calling the server 3 times in parallel (with 3 Chrome tabs on http://localhost:8080/test) causes the responses to arrive at 3.0 s, 6.0 s and 9.0 s respectively, so I assume the queries do not run in parallel.
Running on a 6-core (with HT) machine on Windows 10 with JDK 8.
build.sbt
name := "akka-http-test"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "com.typesafe.akka" %% "akka-http-experimental" % "2.4.11"
AkkaHttpTest.scala
import java.util.concurrent.Executors
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer
import scala.concurrent.{ExecutionContext, Future}

object AkkaHttpTest extends App {

  implicit val actorSystem = ActorSystem("system") // no application.conf here
  implicit val executionContext =
    ExecutionContext.fromExecutor(Executors.newFixedThreadPool(6))
  implicit val actorMaterializer = ActorMaterializer()

  val route = path("test") {
    onComplete(slowFunc()) { slowFuncResult =>
      complete(slowFuncResult)
    }
  }

  def slowFunc()(implicit ec: ExecutionContext): Future[String] = Future {
    Thread.sleep(3000)
    "Waited 3s"
  }

  Http().bindAndHandle(route, "localhost", 8080)

  println("server started")
}
What am I doing wrong here? Thanks for your help.
EDIT: Thanks to @Ramon J Romero y Vigil, I added the Future wrapping, but the problem still persists:
def slowFunc()(implicit ec: ExecutionContext): Future[String] = Future {
  Thread.sleep(3000)
  "Waited 3.0s"
}

val route = path("test") {
  onComplete(slowFunc()) { slowFuncResult =>
    complete(slowFuncResult)
  }
}
Tried with the default thread pool, with the one defined above in the config file, and with a fixed thread pool (6 threads).
It seems that the onComplete directive still waits for the future to complete and then blocks the route (for the same connection).
Same problem with the Flow trick:
import akka.http.scaladsl.model.{HttpRequest, HttpResponse, StatusCodes}
import akka.stream.scaladsl.Flow

val parallelism = 10

val reqFlow =
  Flow[HttpRequest].filter(_.getUri().path().equalsIgnoreCase("/test"))
    .mapAsync(parallelism)(_ => slowFunc())
    .map(str => HttpResponse(status = StatusCodes.OK, entity = str))

Http().bindAndHandle(reqFlow, ...)
Thanks for your help
Each IncomingConnection is handled by the same Route, therefore when you "call the server 3 times in parallel" you are likely using the same Connection and therefore the same Route.
The Route is handling all 3 incoming HttpRequest values in an akka-stream fashion, i.e. the Route is composed of multiple stages, but each stage can only process 1 element at any given time. In your example, the "complete" stage of the stream will call Thread.sleep for each incoming Request and process each Request one at a time.
To get multiple concurrent requests handled at the same time you should establish a unique connection for each request.
A client-side connection pool can be created similarly to the examples in the documentation:
import akka.http.scaladsl.Http

// the type parameter is a correlation value carried alongside each request
val connPoolFlow = Http().newHostConnectionPool[Int]("localhost", 8080)
This can then be integrated into a stream that makes the requests:
import akka.http.scaladsl.model.HttpRequest
import akka.stream.scaladsl.{Sink, Source}

val request = HttpRequest(uri = "/test")

val reqStream =
  Source(1 to 3)
    .map(i => request -> i) // the pool flow expects (HttpRequest, T) pairs
    .via(connPoolFlow)
    .to(Sink.foreach { case (respTry, _) => println(respTry) })
    .run()
Route Modification
If you want each HttpRequest to be processed in parallel then you can use the same Route to do so but you must spawn off Futures inside of the Route and use the onComplete directive:
def slowFunc()(implicit ec: ExecutionContext): Future[String] = Future {
  Thread.sleep(1500)
  "Waited 1.5s"
}

val route = path("test") {
  onComplete(slowFunc()) { slowFuncResult =>
    complete(slowFuncResult)
  }
}
One thing to be aware of: if you don't specify a different ExecutionContext for your sleeping function, then the same thread pool that serves the routes will be used for the sleeping, and you may exhaust the available threads this way. You should probably use a separate ExecutionContext for the sleeping, as sketched below.
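For example, a minimal sketch of handing the blocking call its own pool (the pool size of 6 is an arbitrary choice here):

import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

// A dedicated pool for the blocking work, so the dispatcher running the routes
// keeps its threads free to accept new requests.
val blockingEc: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(6))

def slowFunc(): Future[String] = Future {
  Thread.sleep(1500)
  "Waited 1.5s"
}(blockingEc) // run on the dedicated pool instead of the route's dispatcher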
Flow Based
One other way to handle the HttpRequests is with a stream Flow:
import akka.http.scaladsl.model.{HttpRequest, HttpResponse, StatusCodes}
import akka.stream.scaladsl.Flow

val parallelism = 10

val reqFlow =
  Flow[HttpRequest].filter(_.getUri().path().equalsIgnoreCase("/test"))
    .mapAsync(parallelism)(_ => slowFunc())
    .map(str => HttpResponse(status = StatusCodes.OK, entity = str))

Http().bindAndHandle(reqFlow, ...)
In case this is still relevant, or for future readers: the answer is in the Http().bindAndHandle documentation:
/**
* Convenience method which starts a new HTTP server...
* ...
* The number of concurrently accepted connections can be configured by overriding
* the `akka.http.server.max-connections` setting....
* ...
*/
def bindAndHandle(...
Use the akka.http.server.max-connections setting to control the number of concurrently accepted connections.
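A minimal sketch of raising that limit when the server starts (2048 is an arbitrary value; this assumes the usual Typesafe Config API):

import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// Accept more concurrent connections before new ones are queued.
val config = ConfigFactory.parseString("akka.http.server.max-connections = 2048")
  .withFallback(ConfigFactory.load())

implicit val actorSystem = ActorSystem("system", config)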

The Future is not complete?

import java.io.File
import scala.concurrent.Future
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Sink

object Executor extends App {
  implicit val system = ActorSystem()
  implicit val materializer = ActorMaterializer()
  implicit val ec = system.dispatcher

  import akka.stream.io._

  val file = new File("res/AdviceAnimals.tsv")

  import akka.stream.io.Implicits._

  val foreach: Future[Long] = SynchronousFileSource(file)
    .to(Sink.outputStream(() => System.out))
    .run()

  foreach onComplete { v =>
    println(s"the foreach is ${v.get}") // this will not be printed
  }
}
But if I change the Sink.outputStream(() => System.out) to Sink.ignore, the println(s"the foreach is ${v.get}") does print.
Can somebody explain why?
You are not waiting for the stream to complete; instead, your main method (the body of Executor) completes, and once the main method is done the JVM shuts down.
What you want to do is block that thread and not exit the app before the future completes.
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object Executor extends App {
  // ...your stuff with streams...
  val yourFuture: Future[Long] = ???

  val result = Await.result(yourFuture, 5.seconds)
  println(s"the foreach is $result")

  // stop the actor system (or it will keep the app alive)
  system.terminate()
}
Coincidentally, I created almost the same app for testing/playing with Akka Streams.
Could the imported implicits cause the problem?
This app works fine for me:
object PrintAllInFile extends App {
  val file = new java.io.File("data.txt")

  implicit val system = ActorSystem("test")
  implicit val mat = ActorMaterializer()
  implicit val ec = system.dispatcher

  SynchronousFileSource(file)
    .to(Sink.outputStream(() => System.out))
    .run()
    .onComplete(_ => system.shutdown())
}
Note the stopping of the ActorSystem in the 'onComplete'. Otherwise the app will not exit.

Why does the future example not work?

I am reading the Akka Scala documentation, and there is an example (p. 171, bottom):
// imports added for compilation
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

class Some {
}

object Some {
  def main(args: Array[String]) {
    // Create a sequence of Futures
    val futures = for (i <- 1 to 1000) yield Future(i * 2)
    val futureSum = Future.fold(futures)(0)(_ + _)
    futureSum foreach println
  }
}
I ran it, but nothing happened. I mean that there was nothing in the console output. What is wrong?
You don't wait for the future to complete, so you create a race between the program exiting and the futures completing and running their side effect. On your machine the future seems to lose the race; on the machines of the commenters who say "it works", the future wins the race.
You can use Await to block on a future and wait for it to complete. This is something you should only be doing "at the ends of the world"; you should very rarely actually be using Await...
// imports added for compilation
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global
import scala.concurrent.duration._ // for the "1.second" syntax
import scala.concurrent.Await

class Some {
}

object Some {
  def main(args: Array[String]) {
    // Create a sequence of Futures
    val futures = for (i <- 1 to 1000) yield Future(i * 2)
    val futureSum = Future.fold(futures)(0)(_ + _)
    // we map instead of foreach, to make sure that the side-effect is part of the future,
    // and we "await" for the future to complete (for 1 second)
    Await.result(futureSum map println, 1.second)
  }
}
As others have stated, the issue is the race condition where the futures are competing with the program terminating. The JVM has a concept of daemon threads. It waits for non-daemon threads to terminate but not daemon threads. So if you want to wait for threads to complete, use non-daemon threads.
Threads for Scala futures are created through an implicit scala.concurrent.ExecutionContext. The one you use (import ExecutionContext.Implicits.global) starts daemon threads. However, it is possible to use non-daemon threads, and if you use an ExecutionContext with non-daemon threads the JVM will wait for them, which in your case is the behaviour you want. Naively:
import scala.concurrent.Future
import scala.concurrent.ExecutionContextExecutor
import scala.concurrent.ExecutionContext

class MyExecutionContext extends ExecutionContext {
  override def execute(runnable: Runnable) = {
    val t = new Thread(runnable)
    t.setDaemon(false)
    t.start()
  }

  override def reportFailure(t: Throwable) = t.printStackTrace
}

object Some {
  implicit lazy val context: ExecutionContext = new MyExecutionContext

  def main(args: Array[String]) {
    // Create a sequence of Futures
    val futures = for (i <- 1 to 1000) yield Future(i * 2)
    val futureSum = Future.fold(futures)(0)(_ + _)
    futureSum foreach println
  }
}
Be careful using the above ExecutionContext in production, because it doesn't use a thread pool and can create an unbounded number of threads. But the message is: you can control everything about the threads behind Futures through an ExecutionContext. Explore the various Scala and Akka contexts to find what you need, or, if nothing suits, write your own.
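For instance, a minimal sketch using a bounded pool of non-daemon threads instead (Executors.newFixedThreadPool creates non-daemon threads by default, so the JVM stays alive until the pool is shut down; the pool size of 4 is arbitrary):

import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

object SumWithBoundedPool {
  // Non-daemon threads: the JVM will not exit until shutdown() is called.
  private val pool = Executors.newFixedThreadPool(4)
  implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)

  def main(args: Array[String]): Unit = {
    val futures = for (i <- 1 to 1000) yield Future(i * 2)
    val futureSum = Future.fold(futures)(0)(_ + _)

    futureSum foreach { sum =>
      println(sum)
      pool.shutdown() // let the JVM exit once the result has been printed
    }
  }
}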
Either of the following statements at the end of the main function would meet your need. As the above answers said, you have to allow the futures to complete: the main thread is different from the future threads, and if it finishes first, it terminates before they do.
Thread.sleep(500) // ... simple (but fragile) solution
Await.result(futureSum, Duration(500, MILLISECONDS)) // requires scala.concurrent.Await and scala.concurrent.duration._ for Duration and MILLISECONDS