Right way of handling multiple future callbacks using threadpool in Scala - scala

I am trying to do a very simple thing and want to understand the right way of doing it. I need to periodically make some REST API calls to a separate service and then process the results asynchronously. I am using the actor system's default scheduler to schedule the HTTP requests and have created a separate thread pool to handle the Future callbacks. Since there is no dependency between the requests and the responses, I thought a separate thread pool for handling the Future callbacks should be fine.
Is there some problem with this approach?
I read the Scala docs, which say there is some issue here (though I am not clear on what it is).
In general, what is the recommended way of handling these scenarios?
implicit val system = ActorSystem("my-actor-system") // define an actor system
implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(10)) // create a thread pool

// periodically do some work using the actor system's scheduler
system.scheduler.scheduleWithFixedDelay(5.seconds, 5.seconds)(new Runnable {
  override def run(): Unit = {
    val urls = getUrls() // get list of urls
    val futureResults = urls.map(entry => getData[MyData](entry)) // get data for each url
    // combine the per-url futures so that a single callback can be registered
    Future.sequence(futureResults) onComplete {
      case Success(res) => // do something with the results
      case Failure(e) => // do something with the error
    }
  }
})

def getData[T](url: String): Future[Option[Future[T]]] = {
  val responseFuture: Future[HttpResponse] = execute(url)
  // the transformation runs on the actor system's dispatcher, passed explicitly
  // so that it does not clash with the implicit `ec` defined above
  responseFuture.map { result =>
    // transform the response and return data in format T
    ???
  }(system.dispatcher)
}

Whether a separate thread pool makes sense really depends on the use case. If the service integration is critical and is expected to consume a lot of resources, then a separate thread pool may make sense; otherwise, just using the default one should be fine. Feel free to refer to Levi's answer for a more in-depth discussion of this part.
Regarding "job scheduling in an actor system", I think Akka Streams is a perfect fit here. I give an example below. Feel free to refer to the blog post https://blog.colinbreck.com/rethinking-streaming-workloads-with-akka-streams-part-i/ to see how many things Akka Streams can simplify for you.
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.duration._
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object Timer {
  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("Timer")
    // default thread pool
    implicit val ec: ExecutionContext = system.dispatcher
    // uncomment the line below (and remove the one above) if a custom thread pool is needed
    // also make sure you read https://doc.akka.io/docs/akka/current/dispatchers.html#setting-the-dispatcher-for-an-actor
    // to define the custom thread pool
    // implicit val ec: ExecutionContext = system.dispatchers.lookup("my-custom-dispatcher")
    Source
      .tick(5.seconds, 5.seconds, ())
      .mapConcat(_ => getUrls()) // re-evaluate getUrls() on every tick
      .mapAsync(1)(url => fetch(url))
      .runWith(Sink.seq) // the resulting Future completes only when the stream terminates
      .onComplete {
        case Success(responses) =>
          // handle responses
        case Failure(ex) =>
          // handle exceptions
      }
  }

  def getUrls(): Seq[String] = ???
  def fetch(url: String): Future[Response] = ???
  case class Response(body: String)
}

In addition to Yik San Chan's answer above (especially regarding using Akka Streams), I'd also point out that what exactly you're doing in the .onComplete block is quite relevant to the choice of which ExecutionContext to use for the onComplete callback.
In general, if what you're doing in the callback involves blocking I/O, it's probably best to do it in a thread pool which is large relative to the number of cores (note that each thread on the JVM reserves about 1MB of memory for its stack by default, so it's probably not a great idea to use an ExecutionContext that spawns an unbounded number of threads; a fixed pool of about 10x your core count is probably OK).
Otherwise, it's probably OK to use an ExecutionContext with a threadpool roughly equal in size to the number of cores: the default Akka dispatcher is such an ExecutionContext. The only real reason to consider not using the Akka dispatcher, in my experience/opinion, is if the callback is going to occupy the CPU for a long time. The phenomenon known as "thread starvation" can occur in that scenario, with adverse impacts on performance and cluster stability (if using, e.g. Akka Cluster or health-checks). In such a scenario, I'd tend to use a dispatcher with fewer threads than cores and consider configuring the default dispatcher with fewer threads than the default (while the kernel's scheduler can and will manage more threads ready-to-run than cores, there are strong arguments for not letting it do so).
In an onComplete callback (in comparison to the various transformation methods on Future like map/flatMap and friends), since all you can do is side-effect, it's probably more likely than not that you're doing blocking I/O.
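To make the split above concrete, here is a minimal sketch using only the standard library. The names CallbackPools, saveBlocking and process are hypothetical, and the 10x multiplier is just the rule of thumb mentioned above, not a tuned value. Cheap transformations use whatever implicit ExecutionContext the caller supplies, while the blocking step is pinned to a bounded pool by passing that pool explicitly.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService, Future}

object CallbackPools {
  private val cores = Runtime.getRuntime.availableProcessors()

  // Bounded pool reserved for blocking work (assumption: 10x the core count).
  val blockingEc: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(cores * 10))

  // Hypothetical stand-in for a blocking side effect such as a JDBC write.
  def saveBlocking(value: Int): String = {
    Thread.sleep(50) // simulate blocking I/O
    s"saved-$value"
  }

  // The cheap step runs on the caller's implicit context;
  // the blocking step is explicitly scheduled on blockingEc.
  def process(input: Future[Int])(implicit cpuEc: ExecutionContext): Future[String] =
    input
      .map(_ * 2)
      .map(saveBlocking)(blockingEc)
}
```

onComplete takes its ExecutionContext in the same way, so a blocking callback can be registered with future.onComplete(handler)(CallbackPools.blockingEc). Since the fixed pool uses non-daemon threads, call CallbackPools.blockingEc.shutdown() when the application stops.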

Related

flink increase parallelism of async operation

We have an AsyncFunction in which the async operation is done using the Akka HTTP client:
class Foo[A, B] extends AsyncFunction[A, B] {
  val akkaConfig = ConfigFactory.load()
  implicit lazy val executor: ExecutionContext = ExecutionContext.fromExecutor(Executors.directExecutor())
  implicit lazy val system = ActorSystem("MyActorSystem", akkaConfig)
  implicit lazy val materializer = ActorMaterializer()

  def postReq(uriStr: String, str: String): Future[HttpResponse] = {
    Http().singleRequest(HttpRequest(
      method = HttpMethods.POST,
      uri = uriStr,
      entity = HttpEntity(ContentTypes.`application/json`, str))
    )
  }

  override def asyncInvoke(input: A, resultFuture: ResultFuture[B]): Unit = {
    val resultFutureRequested: Future[HttpResponse] = postReq(...)
    // the rest of the class ...
Questions:
1. If I want to increase the parallelism of the HTTP requests, should I do it through the Akka config, or is there a way to configure it via flink-conf.yaml?
2. Since Flink uses Akka as well, is this the correct way to create the ActorSystem and the ExecutionContext?
As for the first question, you have three different settings that can affect performance and the number of requests actually executed:
1. Parallelism. This causes Flink to create multiple instances of your AsyncFunction, including multiple instances of your HTTP client.
2. The number of concurrent requests in the function itself. When you call orderedWait or unorderedWait, you should provide the capacity, which limits the number of concurrent requests.
3. The actual settings of your HTTP client.
As you can see, points 2 and 3 are connected: Flink can limit the number of possible concurrent requests, so changes to your HTTP client settings may sometimes have no effect, because the number of requests is bounded by Flink itself.
How to increase the throughput of your AsyncFunction depends on the case. You need to remember that AsyncFunction is called in a single thread. This basically means that if the response time of the service you are calling is long, in-flight requests will simply pile up waiting for responses, and the only way out is to increase the parallelism. Generally, however, changing the settings of the HTTP client and the capacity of the function should allow you to obtain better throughput.
As for the second question, I don't see an issue with creating multiple ActorSystems. You can see a similar question answered here.

Akka Stream from within a Spark Job to write into kafka

To be as efficient as possible when writing data back into Kafka, I am interested in using Akka Streams to write my RDD partitions back into Kafka.
The problem is that I need a way to create an actor system per executor, and not per partition, which would be ridiculous: one could end up with 8 actor systems on one node in one JVM. Having a stream per partition, however, is fine.
Has anyone already done that?
My understanding is that an actor system can't be serialized, and hence can't be sent as a broadcast variable, which would be per executor.
If anyone has worked out and tested a solution to this, would you please share it?
Otherwise I can always fall back to https://index.scala-lang.org/benfradet/spark-kafka-writer/spark-kafka-0-10-writer/0.3.0?target=_2.11 but I am not sure it is the most efficient way.
You can always define a global lazy val with an actor system:
object Execution {
  implicit lazy val actorSystem: ActorSystem = ActorSystem()
  implicit lazy val materializer: Materializer = ActorMaterializer()
}
Then you just import it in any of the classes where you want to use Akka Streams:
import Execution._

val stream: DStream[...] = ...
stream.foreachRDD { rdd =>
  ...
  rdd.foreachPartition { records =>
    val (queue, done) = Source.queue(...)
      .via(Producer.flow(...))
      .toMat(Sink.ignore)(Keep.both)
      .run() // implicitly pulls `Execution.materializer` from scope,
             // which in turn will initialize `Execution.actorSystem`
    ... // push records to the queue
    // wait until the stream is completed
    Await.result(done, 10.minutes)
  }
}
The above is kind of pseudocode but I think it should convey the general idea.
This way the system is going to be initialized on every executor JVM only once when it is needed. Additionally you can make the actor system "daemonic" in order for it to shut down automatically when the JVM finishes:
object Execution {
  private lazy val config = ConfigFactory.parseString("akka.daemonic = on")
    .withFallback(ConfigFactory.load())
  implicit lazy val actorSystem: ActorSystem = ActorSystem("system", config)
  implicit lazy val materializer: Materializer = ActorMaterializer()
}
We're doing this in our Spark jobs and it works flawlessly.
This works without any kind of broadcast variables, and, naturally, can be used in all kinds of Spark jobs, streaming or otherwise. Because the system is defined in a singleton object, it is guaranteed to be initialized only once per JVM instance (modulo various classloader shenanigans, but it doesn't really matter in the context of Spark), therefore even if some of the partitions get placed onto the same JVM (maybe in different threads), it will only initialize the actor system one time. lazy val ensures the thread-safety of the initialization, and ActorSystem is thread-safe, so this won't cause problems in this regard as well.
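The once-per-JVM guarantee can be checked in isolation. The following is a minimal sketch using only the standard library: Resource is a hypothetical stand-in for the ActorSystem, and the futures play the role of partitions that land on the same JVM and touch the lazy val concurrently.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object LazyInitDemo {
  // Counts how many times the lazy initializer body actually runs.
  val initCount = new AtomicInteger(0)

  // Hypothetical stand-in for an expensive resource such as an ActorSystem.
  final class Resource(val id: Int)

  object Execution {
    lazy val resource: Resource = {
      Thread.sleep(100) // widen the window in which threads could race
      new Resource(initCount.incrementAndGet())
    }
  }

  // Simulates n partitions on one JVM accessing the resource at the same time
  // and returns the id each of them observed.
  def touchFromManyThreads(n: Int): Seq[Int] = {
    val ids = Future.traverse((1 to n).toList)(_ => Future(Execution.resource.id))
    Await.result(ids, 10.seconds)
  }
}
```

Every caller sees the same Resource instance, and initCount stays at 1 no matter how many threads race to read the lazy val first.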

Scala : Futures with map, flatMap for IO/CPU bound tasks

I am aware that in Java 8 it is not a good idea to start long-running tasks in the stream framework's filter, map, etc. methods, as there is no way to tune the underlying fork-join pool and it can cause latency problems and starvation.
Now, my question would be, is there any problem like that with Scala? I tried to google it, but I guess I just can't put this question in a google-able sentence.
Let's say I have a list of objects and I want to save them into a database using a forEach, would that cause any issues? I guess this would not be an issue in Scala, as functional transformations are fundamental building blocks of the language, but anyway...
If you don't see any kind of I/O operations then using futures can be an overhead.
def add(x: Int, y: Int) = Future { x + y }
Executing purely CPU-bound operations in a Future constructor will make your logic slower to execute, not faster. Mapping and flatMapping over them only adds fuel to the problem.
In case you want to initialize a Future with a constant/simple calculation, you can use Future.successful().
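As a small illustration of the difference (the object and method names here are made up for the example), Future.apply hands the computation to the execution context, while Future.successful evaluates its argument on the calling thread and returns a Future that is already completed:

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object CheapFutures {
  // Schedules a task on the execution context just to add two numbers.
  def addScheduled(x: Int, y: Int): Future[Int] = Future { x + y }

  // Adds on the calling thread and wraps the value in an already-completed Future.
  def addDirect(x: Int, y: Int): Future[Int] = Future.successful(x + y)
}
```

CheapFutures.addDirect(1, 2).isCompleted is true as soon as the call returns, whereas the Future from addScheduled may still be pending at that point.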
But all blocking I/O, including SQL queries, should be wrapped inside a Future with blocking.
E.g.:
Future {
  DB.withConnection { implicit connection =>
    val query = SQL("select * from bar")
    query()
  }
}
should be done as:
import scala.concurrent.blocking

Future {
  blocking {
    DB.withConnection { implicit connection =>
      val query = SQL("select * from bar")
      query()
    }
  }
}
This blocking call notifies the thread pool that the task is blocking, which allows the pool to temporarily spawn new workers as needed. This is done to prevent starvation in blocking applications.
The thread pool (by default the scala.concurrent.ExecutionContext.global pool) knows when the code inside a blocking block has completed, since it is a fork-join thread pool.
It will therefore remove the spare worker threads as they complete, and the pool will shrink back down to its expected size over time (the number of cores by default).
But this can also backfire if there is not enough memory to expand the thread pool.
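The effect of blocking can be observed with a rough experiment. The sketch below is only an illustration under assumptions: BlockingDemo and peakConcurrency are hypothetical names, Thread.sleep stands in for a blocking call, and the numbers you see depend on your core count and Scala version. It runs a batch of sleeping tasks on the global pool and records the peak number that were running at the same time.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object BlockingDemo {
  // Runs `tasks` sleeping jobs on the global pool and returns the highest
  // number of jobs that were observed running simultaneously.
  def peakConcurrency(tasks: Int, useBlocking: Boolean): Int = {
    val running = new AtomicInteger(0)
    val peak = new AtomicInteger(0)

    def job(): Unit = {
      val now = running.incrementAndGet()
      peak.accumulateAndGet(now, (a, b) => math.max(a, b))
      Thread.sleep(200) // stand-in for a blocking call
      running.decrementAndGet()
    }

    val all = Future.traverse((1 to tasks).toList) { _ =>
      Future { if (useBlocking) blocking(job()) else job() }
    }
    Await.result(all, 60.seconds)
    peak.get
  }
}
```

With more tasks than cores, the peak without blocking is capped by the pool's parallelism, while with blocking the pool is allowed to add workers, so the peak may be higher.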
So for your scenario, you can use
import scala.concurrent.blocking

images.foreach { i =>
  Future {
    blocking {
      DB.withConnection { implicit connection =>
        val query = SQL("insert into .........")
        query()
      }
    }
  }
}
If you're doing a lot of blocking I/O, then it's a good practice to create a separate thread-pool/execution context and execute all blocking calls in that pool.
References :
scala-best-practices
demystifying-the-blocking-construct-in-scala-futures
Hope this helps.

Is there an overhead because of nesting Futures

I wrote this code
package com.abhi

import scala.concurrent._
import scala.concurrent.ExecutionContext.Implicits.global

object FutureNesting extends App {
  def measure(future: => Future[Unit]): Future[Long] = {
    val start = System.currentTimeMillis()
    val ec = implicitly[ExecutionContext]
    val t = future
    t map { case _ =>
      val end = System.currentTimeMillis()
      end - start
    }
  }
  measure(Future { Thread.sleep(10000) }) onSuccess { case a => println(a) }
  scala.io.StdIn.readLine()
}
So how many threads am I using in this code? The broader question is: what is the impact of nesting futures inside futures?
So I ran the application above and observed it using Visual VM. This is what I saw
So the application launched 2 threads ForkJoinPool-1-worker-5 and ForkJoinPool-2-worker-3. However it launches the same 2 threads even if I remove the nesting. So I am not sure what is the overhead because of nesting the futures like above.
Edit: Some people said it depends on the type of thread pool (ForkJoin etc.).
I don't know what type of pool Akka HTTP or Spray uses. I planned to use a code snippet similar to the one above in a Spray web service. The idea was to measure the performance of the web service using Futures.
In your case, you are using a wrapper over a thread pool (ForkJoinPool from java.util.concurrent). Of course, all Futures are executed in it.
import scala.concurrent.ExecutionContext.Implicits.global
To inspect it, you must instantiate the pool yourself as an implicit value instead of importing it, like this:
implicit val ec: ExecutionContext
And then use the ForkJoinPool method getActiveThreadCount().
Second approach:
You can open a profiler (such as JProfiler from ej-technologies, or jvisualvm, which is bundled with the JDK) and watch meta information, including thread parameters such as their count, activity, memory usage, etc.
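The first approach might look like the following minimal sketch. It assumes you are free to create the pool yourself; PoolInspection and sampleActiveThreads are hypothetical names, and the parallelism of 4 is arbitrary. Keeping a reference to the ForkJoinPool is what makes getActiveThreadCount and getPoolSize available.

```scala
import java.util.concurrent.ForkJoinPool
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object PoolInspection {
  // Create the pool explicitly so that we keep a handle for querying it.
  val pool = new ForkJoinPool(4) // parallelism of 4 is an arbitrary choice
  implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)

  // Starts n sleeping futures, samples the pool while they are in flight,
  // and returns (active thread estimate, pool size after completion).
  def sampleActiveThreads(n: Int): (Int, Int) = {
    val work = Future.traverse((1 to n).toList)(_ => Future(Thread.sleep(300)))
    Thread.sleep(100) // give the workers time to pick up the tasks
    val active = pool.getActiveThreadCount
    Await.result(work, 10.seconds)
    (active, pool.getPoolSize)
  }
}
```

Note that getActiveThreadCount returns an estimate, so treat the sampled value as indicative rather than exact.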

Does Scala Futures/ExecutionContext have something like C#'s ConfigureAwait

C#'s Tasks have ConfigureAwait(false) for libraries to prevent synchronization to (for example) the UI-thread which is not always necessary:
http://msdn.microsoft.com/en-us/magazine/hh456402.aspx
In .NET I believe there can only be one SynchronizationContext, so it's clear on which thread pool a Task should execute its continuation.
For a library, when you can't assume whether the user is in a web request (in .NET, HttpContext.Current.Items flows), on the command line (normal multithreaded), or in XAML/Windows Forms (single UI thread), it's almost always better to use ConfigureAwait(false), so the awaiter knows it can just execute the continuation on whatever thread is being used to call it (this is only bad if you run blocking code in the library, which could lead to thread starvation on the thread pool where the initial workload was started; let's assume we don't do that).
The point is that, from a library perspective, you don't want to use a thread from the caller's thread pool to synchronize a continuation; you just want the continuation to run on whatever thread is available. This saves a context switch and keeps the load off the UI thread, for example.
In Scala, for each operation (namely map) on Futures, you need an ExecutionContext (passed implicitly). This makes managing thread pools incredibly easy, which I like a lot more than .NET's somewhat strange TaskFactory instances (which nobody seems to use; people just use the default TaskFactory).
My question is, does Scala have the same problem as .NET in respect to context switches being sometimes unnecessary, and if so, is there a way, similar to ConfigureAwait, to fix this?
Concrete example I'm finding in Scala where I wonder about this:
def trace[T](message: => String)(block: => Future[T]): Future[T] = {
  if (!logger.isTraceEnabled) block
  else {
    val startedAt = System.currentTimeMillis()
    block.map { result =>
      val timeTaken = System.currentTimeMillis() - startedAt
      logger.trace(s"$message took ${timeTaken}ms")
      result
    }
  }
}
I'm using play and I generally import play's default, implicit ExecutionContext.
The map on block needs to run on an execution context.
If I wrote this piece of Scala in a library, I would add an implicit executionContext parameter:
def trace[T](message: => String)(block: => Future[T])(implicit executionContext: ExecutionContext): Future[T] = {
instead of importing Play's default ExecutionContext in the library.