Cats Effect: which thread pool to use for Non-Blocking IO? - scala

From this tutorial https://github.com/slouc/concurrency-in-scala-with-ce#threading
async operations are divided into 3 groups and require significantly different thread pools to run on:
Non-blocking asynchronous operations:
Bounded pool with a very low number of threads (maybe even just one), with a very high priority. These threads will basically just sit idle most of the time and keep polling whether there is a new async IO notification. Time that these threads spend processing a request directly maps into application latency, so it's very important that no other work gets done in this pool apart from receiving notifications and forwarding them to the rest of the application.
Bounded pool with a very low number of threads (maybe even just one), with a very high priority. These threads will basically just sit idle most of the time and keep polling whether there is a new async IO notification. Time that these threads spend processing a request directly maps into application latency, so it's very important that no other work gets done in this pool apart from receiving notifications and forwarding them to the rest of the application.
Blocking asynchronous operations:
Unbounded cached pool. Unbounded because blocking operation can (and will) block a thread for some time, and we want to be able to serve other I/O requests in the meantime. Cached because we could run out of memory by creating too many threads, so it’s important to reuse existing threads.
CPU-heavy operations:
Fixed pool in which number of threads equals the number of CPU cores. This is pretty straightforward. Back in the day the "golden rule" was number of threads = number of CPU cores + 1, but "+1" was coming from the fact that one extra thread was always reserved for I/O (as explained above, we now have separate pools for that).
In my Cats Effect application, I use Scala Future-based ReactiveMongo lib to access MongoDB, which does NOT block threads when talking with MongoDB, e.g. performs non-blocking IO.
It needs execution context.
Cats effect provides default execution context IOApp.executionContext
My question is: which execution context should I use for non-blocking io?
IOApp.executionContext?
But, from IOApp.executionContext documemntation:
Provides a default ExecutionContext for the app.
The default on top of the JVM is lazily constructed as a fixed thread pool based on the number available of available CPUs (see PoolUtils).
Seems like this execution context falls into 3rd group I listed above - CPU-heavy operations (Fixed pool in which number of threads equals the number of CPU cores.),
and it makes me think that IOApp.executionContext is not a good candidate for non-blocking IO.
Am I right and should I create a separate context with a fixed thread pool (1 or 2 threads) for non-blocking IO (so it will fall into the first group I listed above - Non-blocking asynchronous operations: Bounded pool with a very low number of threads (maybe even just one), with a very high priority.)?
Or is IOApp.executionContext designed for both CPU-bound and Non-Blocking IO operations?
The function I use to convert Scala Future to F and excepts execution context:
def scalaFutureToF[F[_]: Async, A](
future: => Future[A]
)(implicit ec: ExecutionContext): F[A] =
Async[F].async { cb =>
future.onComplete {
case Success(value) => cb(Right(value))
case Failure(exception) => cb(Left(exception))
}
}

In Cats Effect 3, each IOApp has a Runtime:
final class IORuntime private[effect] (
val compute: ExecutionContext,
private[effect] val blocking: ExecutionContext,
val scheduler: Scheduler,
val shutdown: () => Unit,
val config: IORuntimeConfig,
private[effect] val fiberErrorCbs: FiberErrorHashtable = new FiberErrorHashtable(16)
)
You will almost always want to keep the default values and not fiddle around declaring your own runtime, except in perhaps tests or educational examples.
Inside your IOApp you can then access the compute pool via:
runtime.compute
If you want to execute a blocking operation, then you can use the blocking construct:
blocking(IO(println("foo!"))) >> IO.unit
This way, you're telling the CE3 runtime that this operation could be blocking and hence should be dispatched to a dedicated pool. See here.
What about CE2? Well, it had similar mechanisms but they were very clunky and also contained quite a few surprises. Blocking calls, for example, were scheduled using Blocker which then had to be somehow summoned out of thin air or threaded through the whole app, and thread pool definitions were done using the awkward ContextShift. If you have any choice in the matter, I highly recommend investing some effort into migrating to CE3.
Fine, but what about Reactive Mongo?
ReactiveMongo uses Netty (which is based on Java NIO API). And Netty has its own thread pool. This is changed in Netty 5 (see here), but ReactiveMongo seems to still be on Netty 4 (see here).
However, the ExecutionContext you're asking about is the thread pool that will perform the callback. This can be your compute pool.
Let's see some code. First, your translation method. I just changed async to async_ because I'm using CE3, and I added the thread printline:
def scalaFutureToF[F[_]: Async, A](future: => Future[A])(implicit ec: ExecutionContext): F[A] =
Async[F].async_ { cb =>
future.onComplete {
case Success(value) => {
println(s"Inside Callback: [${Thread.currentThread.getName}]")
cb(Right(value))
}
case Failure(exception) => cb(Left(exception))
}
}
Now let's pretend we have two execution contexts - one from our IOApp and another one that's going to represent whatever ReactiveMongo uses to run the Future. This is the made-up ReactiveMongo one:
val reactiveMongoContext: ExecutionContext =
ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1))
and the other one is simply runtime.compute.
Now let's define the Future like this:
def myFuture: Future[Unit] = Future {
println(s"Inside Future: [${Thread.currentThread.getName}]")
}(reactiveMongoContext)
Note how we are pretending that this Future runs inside ReactiveMongo by passing the reactiveMongoContext to it.
Finally, let's run the app:
override def run: IO[Unit] = {
val myContext: ExecutionContext = runtime.compute
scalaFutureToF(myFuture)(implicitly[Async[IO]], myContext)
}
Here's the output:
Inside Future: [pool-1-thread-1]
Inside Callback: [io-compute-6]
The execution context we provided to scalaFutureToF merely ran the callback. Future itself ran on our separate thread pool that represents ReactiveMongo's pool. In reality, you will have no control over this pool, as it's coming from within ReactiveMongo.
Extra info
By the way, if you're not working with the type class hierarchy (F), but with IO values directly, then you could use this simplified method:
def scalaFutureToIo[A](future: => Future[A]): IO[A] =
IO.fromFuture(IO(future))
See how this one doesn't even require you to pass an ExecutionContext - it simply uses your compute pool. Or more specifically, it uses whatever is defined as def executionContext: F[ExecutionContext] for the Async[IO], which turns out to be the compute pool. Let's check:
override def run: IO[Unit] = {
IO.executionContext.map(ec => println(ec == runtime.compute))
}
// prints true
Last, but not least:
If we really had a way of specifying which thread pool ReactiveMongo's underlying Netty should be using, then yes, in that case we should definitely use a separate pool. We should never be providing our runtime.compute pool to other runtimes.

Related

interrupt scala parallel collection

Is there any way to interrupt a parallel collection computation in Scala?
Example:
val r = new Runnable {
override def run(): Unit = {
(1 to 3).par.foreach { _ => Thread.sleep(5000000) }
}
}
val t = new Thread(r)
t.start()
Thread.sleep(300) // let them spin up
t.interrupt()
I'd expect t.interrupt to interrupt all threads spawned by par, but this is not happening, it keeps spinning inside ForkJoinTask.externalAwaitDone. Looks like that method clears the interrupted status and keeps waiting for the spawned threads to finish.
This is Scala 2.12
The thread that you t.start() is responsible just for starting parallel computations and to wait and gather the result.
It is not connected to threads that compute operations. Usually, it runs on default forkJoinPool that independent from the thread that submits computation tasks.
If you want to interrupt the computation, you can use custom execution back-end (like manually created forkJoinPool or a threadPool), and then shut it down. You can read about that here.
Or you can provide a callback from the computation.
But all those approaches are not so good for such a case.
If you producing a production solution or your case is complex and critical for the app, you probably should use something that has cancellation by design. Like Monix.Task or CancellableFuture.
Or at least use Future and cancel it with workarounds.

Cats-effect and asynchronous IO specifics

For few days I have been wrapping my head around cats-effect and IO. And I feel I have some misconceptions about this effect or simply I missed its point.
First of all - if IO can replace Scala's Future, how can we create an async IO task? Using IO.shift? Using IO.async? Is IO.delay sync or async? Can we make a generic async task with code like this Async[F].delay(...)? Or async happens when we call IO with unsafeToAsync or unsafeToFuture?
What's the point of Async and Concurrent in cats-effect? Why they are separated?
Is IO a green thread? If yes, why is there a Fiber object in cats-effect? As I understand the Fiber is the green thread, but docs claim we can think of IOs as green threads.
I would appreciate some clarifing on any of this as I have failed comprehending cats-effect docs on those and internet was not that helpfull...
if IO can replace Scala's Future, how can we create an async IO task
First, we need to clarify what is meant as an async task. Usually async means "does not block the OS thread", but since you're mentioning Future, it's a bit blurry. Say, if I wrote:
Future { (1 to 1000000).foreach(println) }
it would not be async, as it's a blocking loop and blocking output, but it would potentially execute on a different OS thread, as managed by an implicit ExecutionContext. The equivalent cats-effect code would be:
for {
_ <- IO.shift
_ <- IO.delay { (1 to 1000000).foreach(println) }
} yield ()
(it's not the shorter version)
So,
IO.shift is used to maybe change thread / thread pool. Future does it on every operation, but it's not free performance-wise.
IO.delay { ... } (a.k.a. IO { ... }) does NOT make anything async and does NOT switch threads. It's used to create simple IO values from synchronous side-effecting APIs
Now, let's get back to true async. The thing to understand here is this:
Every async computation can be represented as a function taking callback.
Whether you're using API that returns Future or Java's CompletableFuture, or something like NIO CompletionHandler, it all can be converted to callbacks. This is what IO.async is for: you can convert any function taking callback to an IO. And in case like:
for {
_ <- IO.async { ... }
_ <- IO(println("Done"))
} yield ()
Done will be only printed when (and if) the computation in ... calls back. You can think of it as blocking the green thread, but not OS thread.
So,
IO.async is for converting any already asynchronous computation to IO.
IO.delay is for converting any completely synchronous computation to IO.
The code with truly asynchronous computations behaves like it's blocking a green thread.
The closest analogy when working with Futures is creating a scala.concurrent.Promise and returning p.future.
Or async happens when we call IO with unsafeToAsync or unsafeToFuture?
Sort of. With IO, nothing happens unless you call one of these (or use IOApp). But IO does not guarantee that you would execute on a different OS thread or even asynchronously unless you asked for this explicitly with IO.shift or IO.async.
You can guarantee thread switching any time with e.g. (IO.shift *> myIO).unsafeRunAsyncAndForget(). This is possible exactly because myIO would not be executed until asked for it, whether you have it as val myIO or def myIO.
You cannot magically transform blocking operations into non-blocking, however. That's not possible neither with Future nor with IO.
What's the point of Async and Concurrent in cats-effect? Why they are separated?
Async and Concurrent (and Sync) are type classes. They are designed so that programmers can avoid being locked to cats.effect.IO and can give you API that supports whatever you choose instead, such as monix Task or Scalaz 8 ZIO, or even monad transformer type such as OptionT[Task, *something*]. Libraries like fs2, monix and http4s make use of them to give you more choice of what to use them with.
Concurrent adds extra things on top of Async, most important of them being .cancelable and .start. These do not have a direct analogy with Future, since that does not support cancellation at all.
.cancelable is a version of .async that allows you to also specify some logic to cancel the operation you're wrapping. A common example is network requests - if you're not interested in results anymore, you can just abort them without waiting for server response and don't waste any sockets or processing time on reading the response. You might never use it directly, but it has it's place.
But what good are cancelable operations if you can't cancel them? Key observation here is that you cannot cancel an operation from within itself. Somebody else has to make that decision, and that would happen concurrently with the operation itself (which is where the type class gets its name). That's where .start comes in. In short,
.start is an explicit fork of a green thread.
Doing someIO.start is akin to doing val t = new Thread(someRunnable); t.start(), except it's green now. And Fiber is essentially a stripped down version of Thread API: you can do .join, which is like Thread#join(), but it does not block OS thread; and .cancel, which is safe version of .interrupt().
Note that there are other ways to fork green threads. For example, doing parallel operations:
val ids: List[Int] = List.range(1, 1000)
def processId(id: Int): IO[Unit] = ???
val processAll: IO[Unit] = ids.parTraverse_(processId)
will fork processing all IDs to green threads and then join them all. Or using .race:
val fetchFromS3: IO[String] = ???
val fetchFromOtherNode: IO[String] = ???
val fetchWhateverIsFaster = IO.race(fetchFromS3, fetchFromOtherNode).map(_.merge)
will execute fetches in parallel, give you first result completed and automatically cancel the fetch that is slower. So, doing .start and using Fiber is not the only way to fork more green threads, just the most explicit one. And that answers:
Is IO a green thread? If yes, why is there a Fiber object in cats-effect? As I understand the Fiber is the green thread, but docs claim we can think of IOs as green threads.
IO is like a green thread, meaning you can have lots of them running in parallel without overhead of OS threads, and the code in for-comprehension behaves as if it was blocking for the result to be computed.
Fiber is a tool for controlling green threads explicitly forked (waiting for completion or cancelling).

Scala : Futures with map, flatMap for IO/CPU bound tasks

I am aware that in Java 8 it is not a good idea to start long-running tasks in stream framework's filter, map, etc. methods, as there is no way to tune the underlying fork-join pool and it can cause latency problems and starvings.
Now, my question would be, is there any problem like that with Scala? I tried to google it, but I guess I just can't put this question in a google-able sentence.
Let's say I have a list of objects and I want to save them into a database using a forEach, would that cause any issues? I guess this would not be an issue in Scala, as functional transformations are fundamental building blocks of the language, but anyway...
If you don't see any kind of I/O operations then using futures can be an overhead.
def add(x: Int, y: Int) = Future { x + y }
Executing purely CPU-bound operations in a Future constructor will make your logic slower to execute, not faster. Mapping and flatmapping over them, can increase add fuel to this problem.
In case you want to initialize a Future with a constant/simple calculation, you can use Future.successful().
But all blocking I/O, including SQL queries makes sence be wrapped inside a Future with blocking
E.g :
Future {
DB.withConnection { implicit connection =>
val query = SQL("select * from bar")
query()
}
}
Should be done as,
import scala.concurrent.blocking
Future {
blocking {
DB.withConnection { implicit connection =>
val query = SQL("select * from bar")
query()
}
}
This blocking notifies the thread pool that this task is blocking. This allows the pool to temporarily spawn new workers as needed. This is done to prevent starvation in blocking applications.
The thread pool(by default the scala.concurrent.ExecutionContext.global pool) knows when the code in a blocking is completed.(Since it's a fork join thread pool)
Therefore it will remove the spare worker threads as they completes, and the pool will shrink back down to its expected size with time(Number of cores by default).
But this scenario can also backfire if there is not enough memory to expand the thread pool.
So for your scenario, you can use
images.foreach(i => {
import scala.concurrent.blocking
Future {
blocking {
DB.withConnection { implicit connection =>
val query = SQL("insert into .........")
query()
}
}
})
If you're doing a lot of blocking I/O, then it's a good practice to create a separate thread-pool/execution context and execute all blocking calls in that pool.
References :
scala-best-practices
demystifying-the-blocking-construct-in-scala-futures
Hope this helps.

Akka HTTP: Blocking in a future blocks the server

I am trying to use Akka HTTP to basic authenticate my request.
It so happens that I have an external resource to authenticate through, so I have to make a rest call to this resource.
This takes some time, and while it's processing, it seems the rest of my API is blocked, waiting for this call.
I have reproduced this with a very simple example:
// used dispatcher:
implicit val system = ActorSystem()
implicit val executor = system.dispatcher
implicit val materializer = ActorMaterializer()
val routes =
(post & entity(as[String])) { e =>
complete {
Future{
Thread.sleep(5000)
e
}
}
} ~
(get & path(Segment)) { r =>
complete {
"get"
}
}
If I post to the log endpoint, my get endpoint is also stuck waiting for the 5 seconds, which the log endpoint dictated.
Is this expected behaviour, and if is, how do I make blocking operations without blocking my entire API?
What you observe is expected behaviour – yet of course it's very bad. Good that known solutions and best practices exist to guard against it. In this answer I'd like to spend some time to explain the issue short, long, and then in depth – enjoy the read!
Short answer: "don't block the routing infrastructure!", always use a dedicated dispatcher for blocking operations!
Cause of the observed symptom: The problem is that you're using context.dispatcher as the dispatcher the blocking futures execute on. The same dispatcher (which is in simple terms just a "bunch of threads") is used by the routing infrastructure to actually handle the incoming requests – so if you block all available threads, you end up starving the routing infrastructure. (A thing up for debate and benchmarking is if Akka HTTP could protect from this, I'll add that to my research todo-list).
Blocking must be treated with special care to not impact other users of the same dispatcher (which is why we make it so simple to separate execution onto different ones), as explained in the Akka docs section: Blocking needs careful management.
Something else I wanted to bring to attention here is that one should avoid blocking APIs at all if possible - if your long running operation is not really one operation, but a series thereof, you could have separated those onto different actors, or sequenced futures. Anyway, just wanted to point out – if possible, avoid such blocking calls, yet if you have to – then the following explains how to properly deal with those.
In-depth analysis and solutions:
Now that we know what is wrong, conceptually, let's have a look what exactly is broken in the above code, and how the right solution to this problem looks like:
Colour = thread state:
turquoise – SLEEPING
orange - WAITING
green - RUNNABLE
Now let's investigate 3 pieces of code and how the impact the dispatchers, and performance of the app. To force this behaviour the app has been put under the following load:
[a] keep requesting GET requests (see above code in initial question for that), it's not blocking there
[b] then after a while fire 2000 POST requests, which will cause the 5second blocking before returning the future
1) [bad] Dispatcher behaviour on bad code:
// BAD! (due to the blocking in Future):
implicit val defaultDispatcher = system.dispatcher
val routes: Route = post {
complete {
Future { // uses defaultDispatcher
Thread.sleep(5000) // will block on the default dispatcher,
System.currentTimeMillis().toString // starving the routing infra
}
}
}
So we expose our app to [a] load, and you can see a number of akka.actor.default-dispatcher threads already - they're handling the requests – small green snippet, and orange meaning the others are actually idle there.
Then we start the [b] load, which causes blocking of these threads – you can see an early thread "default-dispatcher-2,3,4" going into blocking after being idle before. We also observe that the pool grows – new threads are started "default-dispatcher-18,19,20,21..." however they go into sleeping immediately (!) – we're wasting precious resource here!
The number of the such started threads depends on the default dispatcher configuration, but likely will not exceed 50 or so. Since we just fired 2k blocking ops, we starve the entire threadpool – the blocking operations dominate such that the routing infra has no thread available to handle the other requests – very bad!
Let's do something about it (which is an Akka best practice incidentally – always isolate blocking behaviour like shown below):
2) [good!] Dispatcher behaviour good structured code/dispatchers:
In your application.conf configure this dispatcher dedicated for blocking behaviour:
my-blocking-dispatcher {
type = Dispatcher
executor = "thread-pool-executor"
thread-pool-executor {
// in Akka previous to 2.4.2:
core-pool-size-min = 16
core-pool-size-max = 16
max-pool-size-min = 16
max-pool-size-max = 16
// or in Akka 2.4.2+
fixed-pool-size = 16
}
throughput = 100
}
You should read more in the Akka Dispatchers documentation, to understand the various options here. The main point though is that we picked a ThreadPoolExecutor which has a hard limit of threads it keeps available for the blocking ops. The size settings depend on what your app does, and how many cores your server has.
Next we need to use it, instead of the default one:
// GOOD (due to the blocking in Future):
implicit val blockingDispatcher = system.dispatchers.lookup("my-blocking-dispatcher")
val routes: Route = post {
complete {
Future { // uses the good "blocking dispatcher" that we configured,
// instead of the default dispatcher – the blocking is isolated.
Thread.sleep(5000)
System.currentTimeMillis().toString
}
}
}
We pressure the app using the same load, first a bit of normal requests and then we add the blocking ones. This is how the ThreadPools will behave in this case:
So initially the normal requests are easily handled by the default dispatcher, you can see a few green lines there - that's actual execution (I'm not really putting the server under heavy load, so it's mostly idle).
Now when we start issuing the blocking ops, the my-blocking-dispatcher-* kicks in, and starts up to the number of configured threads. It handles all the Sleeping in there. Also, after a certain period of nothing happening on those threads, it shuts them down. If we were to hit the server with another bunch of blocking the pool would start new threads that will take care of the sleep()-ing them, but in the meantime – we're not wasting our precious threads on "just stay there and do nothing".
When using this setup, the throughput of the normal GET requests was not impacted, they were still happily served on the (still pretty free) default dispatcher.
This is the recommended way of dealing with any kind of blocking in reactive applications. It often is referred to as "bulkheading" (or "isolating") the bad behaving parts of an app, in this case the bad behaviour is sleeping/blocking.
3) [workaround-ish] Dispatcher behaviour when blocking applied properly:
In this example we use the scaladoc for scala.concurrent.blocking method which can help when faced with blocking ops. It generally causes more threads to be spun up to survive the blocking operations.
// OK, default dispatcher but we'll use `blocking`
implicit val dispatcher = system.dispatcher
val routes: Route = post {
complete {
Future { // uses the default dispatcher (it's a Fork-Join Pool)
blocking { // will cause much more threads to be spun-up, avoiding starvation somewhat,
// but at the cost of exploding the number of threads (which eventually
// may also lead to starvation problems, but on a different layer)
Thread.sleep(5000)
System.currentTimeMillis().toString
}
}
}
}
The app will behave like this:
You'll notice that A LOT of new threads are created, this is because blocking hints at "oh, this'll be blocking, so we need more threads". This causes the total time we're blocked to be smaller than in the 1) example, however then we have hundreds of threads doing nothing after the blocking ops have finished... Sure, they will eventually be shut down (the FJP does this), but for a while we'll have a large (uncontrolled) amount of threads running, in contrast to the 2) solution, where we know exactly how many threads we're dedicating for the blocking behaviours.
Summing up: Never block the default dispatcher :-)
The best practice is to use the pattern shown in 2), to have a dispatcher for the blocking operations available, and execute them there.
Discussed Akka HTTP version: 2.0.1
Profiler used: Many people have asked me in response to this answer privately what profiler I used to visualise the Thread states in the above pics, so adding this info here: I used YourKit which is an awesome commercial profiler (free for OSS), though you can achieve the same results using the free VisualVM from OpenJDK.
Strange, but for me everything works fine (no blocking). Here is code:
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route
import akka.stream.ActorMaterializer
import scala.concurrent.Future
object Main {
implicit val system = ActorSystem()
implicit val executor = system.dispatcher
implicit val materializer = ActorMaterializer()
val routes: Route = (post & entity(as[String])) { e =>
complete {
Future {
Thread.sleep(5000)
e
}
}
} ~
(get & path(Segment)) { r =>
complete {
"get"
}
}
def main(args: Array[String]) {
Http().bindAndHandle(routes, "0.0.0.0", 9000).onFailure {
case e =>
system.shutdown()
}
}
}
Also you can wrap you async code into onComplete or onSuccess directive:
onComplete(Future{Thread.sleep(5000)}){e}
onSuccess(Future{Thread.sleep(5000)}){complete(e)}

Futures for blocking calls in Scala

The Akka documentation says:
you may be tempted to just wrap the blocking call inside a Future and work with that instead, but this strategy is too simple: you are quite likely to find bottlenecks or run out of memory or threads when the application runs under increased load.
They suggest the following strategies:
Do the blocking call within a Future, ensuring an upper bound on the number of such calls at any point in time (submitting an unbounded number of tasks of this nature will exhaust your memory or thread limits).
Do the blocking call within a Future, providing a thread pool with an upper limit on the number of threads which is appropriate for the hardware on which the application runs.
Do you know about any implementation of those strategies?
Futures are run within execution contexts. This is obvious from the Future API: any call which involves attaching some callbacks to a future or to build a future from an arbitrary computation or from another future requires an implicitly available ExecutionContext object. So you can control the concurrency setup for your futures by tuning the ExecutionContext in which they run.
For instance, to implement the second strategy you can do something like
import scala.concurrent.ExecutionContext
import java.util.concurrent.Executors
import scala.concurrent.future
object Main extends App {
val ThreadCount = 10
implicit val executionContext = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(ThreadCount))
val f = future {
println(s"Hello ! I'm running in an execution context with $ThreadCount threads")
}
}
Akka itself implements all this, you can wrap your blocking calls into Actors and then use dispatchers to control execution thread pools.