Scala ExecutionContext for REST API Calls - scala

In my application which is a HTTP service that exposes several API's that can be consumed by other services, I have a situation in which I have to call 2 different external services which would be a messaging service and another REST service.
I understand that for these I/O bound operations, it is a good practice to use a separate thread pool or ExecutionContext. I'm using the following to create a configuration for the custom ExecutionContext in my application.conf:
execution-context {
fork-join-executor {
parallelism-max = 10
}
}
I have a couple of questions:
Is this going to create 10 dedicated threads?
How do I know the size of the parallelism-max?
Say if I'm going to use this execution context to make REST API calls, how should I size this?

Is this going to create 10 dedicated threads?
Close, but not exactly.
As you can read from Akka documentation, three properties, parallelism-min, parallelism-factor and parallelism-max are used to calculate parallelism parameter that is then supplied to underlying ForkJoinPool. The formula is parallelism = clamp(parallelism-min, ceil(available processors * factor), parallelism-max).
Now about parallelism. As you can read from the docs, it roughly corresponds to the number of "hot" threads, but under some circumstances additional threads might be spawned. Namely, when some threads are blocked inside ManagedBlocking. Read that answer for additional details.
How do I know the size of the parallelism-max
It depends on your use case. If you block one thread per task, how many simultaneous task executions do you expect?
Say if I'm going to use this execution context to make REST API calls, how should I size this?
Again, how many simultaneous requests you want to make? If you are going to block your threads, and you expect to have large number of simultaneous http calls, and you want them to be processed as soon as possible, you want large thread pool.
However, if you application makes that many http requests, why not use existent library. Libraries like ApacheHttpClient allow you to configure parallelism in terms of http connections or connection per host.
Also for making http calls from actors, it's natural to use non-blocking http clients, like netty-based AsyncHttpClient. It also has thread pool inside (obviously), but it is fixed and any number of simultaneous connections are handled by this fixed amount of threads in non-blocking way.

Related

Need tips on how to achieve a thread-safe queue in fs2 (Scala)

I need to implement a microservice that loads a ton of data into memory at startup and makes that data available via HTTP GET.
I have been looking at fs2 as an option to make the data available to the web layer via an fs2.Queue.
My concern is that if I use the synchronous queue from fs2, my performance of serving the data might be affected negatively because of the blocking nature of the synchronous queue (on enqueue operation).
Is this a valid concern?
Also, which Queue abstractions (in fs2) are thread-safe? ie: can I pass any Queue around to multiple threads and can they all safely take items out of the queue without more than one of them taking the same element out of the queue?
EDIT:
Use case: 10Mil records served by the Stream -> many workers (threads) picking work from the Stream via a HTTP endpoint (GET)

Concurrency, how to create an efficient actor setup?

Alright so I have never done intense concurrent operations like this before, theres three main parts to this algorithm.
This all starts with a Vector of around 1 Million items.
Each item gets processed in 3 main stages.
Task 1: Make an HTTP Request, Convert received data into a map of around 50 entries.
Task 2: Receive the map and do some computations to generate a class instance based off the info found in the map.
Task 3: Receive the class and generate/add to multiple output files.
I initially started out by concurrently running task 1 with 64K entries across 64 threads (1024 entries per thread.). Generating threads in a for loop.
This worked well and was relatively fast, but I keep hearing about actors and how they are heaps better than basic Java threads/Thread pools. I've created a few actors etc. But don't know where to go from here.
Basically:
1. Are actors the right way to achieve fast concurrency for this specific set of tasks. Or is there another way I should go about it.
2. How do you know how many threads/actors are too many, specifically in task one, how do you know what the limit is on number of simultaneous connections is (Im on mac). Is there a golden rue to follow? How many threads vs how large per thread pool? And the actor equivalents?
3. Is there any code I can look at that implements actors for a similar fashion? All the code Im seeing is either getting an actor to print hello world, or super complex stuff.
1) Actors are a good choice to design complex interactions between components since they resemble "real life" a lot. You can see them as different people sending each other requests, it is very natural to model interactions. However, they are most powerful when you want to manage changing state in your application, which does not seem to be the case for you. You can achieve fast concurrency without actors. Up to you.
2) If none of your operations is blocking the best rule is amount of threads = amount of CPUs. If you use a non blocking HTTP client, and NIO when writing your output files then you should be fully non-blocking on IOs and can just safely set the thread count for your app to the CPU count on your machine.
3) The documentation on http://akka.io is very very good and comprehensive. If you have no clue how to use the actor model I would recommend getting a book - not necessarily about Akka.
1) It sounds like most of your steps aren't stateful, in which case actors add complication for no real benefit. If you need to coordinate multiple tasks in a mutable way (e.g. for generating the output files) then actors are a good fit for that piece. But the HTTP fetches should probably just be calls to some nonblocking HTTP library (e.g. spray-client - which will in fact use actors "under the hood", but in a way that doesn't expose the statefulness to you).
2) With blocking threads you pretty much have to experiment and see how many you can run without consuming too many resources. Worry about how many simultaneous connections the remote system can handle rather than hitting any "connection limits" on your own machine (it's possible you'll hit the file descriptor limit but if so best practice is just to increase it). Once you figure that out, there's no value in having more threads than the number of simultaneous connections you want to make.
As others have said, with nonblocking everything you should probably just have a number of threads similar to the number of CPU cores (I've also heard "2x number of CPUs + 1", on the grounds that that ensures there will always be a thread available whenever a CPU is idle).
With actors I wouldn't worry about having too many. They're very lightweight.
If you have really no expierience in Akka try to start with something simple like doing a one-to-one actor-thread rewriting of your code. This will be easier to grasp how things work in akka.
Spin two actors at the begining one for receiving requests and one for writting to the output file. Then when request is received create an actor in request-receiver actor that will do the computation and send the result to the writting actor.

How to tune Play Framework application with proper threadpools?

I am working with Play Framework (Scala) version 2.3. From the docs:
You can’t magically turn synchronous IO into asynchronous by wrapping it in a Future. If you can’t change the application’s architecture to avoid blocking operations, at some point that operation will have to be executed, and that thread is going to block. So in addition to enclosing the operation in a Future, it’s necessary to configure it to run in a separate execution context that has been configured with enough threads to deal with the expected concurrency.
This has me a bit confused on how to tune my webapp. Specifically, since my app has a good amount of blocking calls: a mix of JDBC calls, and calls to 3rd party services using blocking SDKs, what is the strategy for configuring the execution context and determining the number of threads to provide? Do I need a separate execution context? Why can't I simply configure the default pool to have a sufficient amount of threads (and if I do this, why would I still need to wrap the calls in a Future?)?
I know this ultimately will depend on the specifics of my app, but I'm looking for some guidance on the strategy and approach. The play docs preach the use of non-blocking operations everywhere but in reality the typical web-app hitting a sql database has many blocking calls, and I got the impression from reading the docs that this type of app will perform far from optimally with the default configurations.
[...] what is the strategy for configuring the execution context and
determining the number of threads to provide
Well, that's the tricky part which depends on your individual requirements.
First of all, you probably should choose a basic profile from the docs (pure asynchronous, highly synchronous or many specific thread pools)
The second step is to fine-tune your setup by profiling and benchmarking your application
Do I need a separate execution context?
Not necessarily. But it makes sense to use separate execution contexts if you want to trigger all your blocking IO-calls at once and not in a sequential way (so database call B does not have to wait until database call A is finished).
Why can't I simply configure the default pool to have a sufficient
amount of threads (and if I do this, why would I still need to wrap
the calls in a Future?)?
You can, check the docs:
play {
akka {
akka.loggers = ["akka.event.slf4j.Slf4jLogger"]
loglevel = WARNING
actor {
default-dispatcher = {
fork-join-executor {
parallelism-min = 300
parallelism-max = 300
}
}
}
}
}
With this approach, you basically are turning Play into a one-thread-per-request-model. This is not the idea behind Play, but if you're doing a lot of blocking IO calls, it's the simplest approach. In this case, you don't need to wrap your database calls in a Future.
To put it in a nutshell, you basically have three ways to go:
Only use (IO-)technologies whose API calls are non-blocking and asynchronous. This allows you to use a small threadpool / default execution context which suits the nature of Play
Turn Play into a one-thread-per-request Framework by drastically increasing the default execution context. No futures needed, just call your blocking database as always
Create specific execution contexts for your blocking IO-calls and gain fine-grained control of what you are doing
Firstly, before diving in and refactoring your app, you should determine whether this is actually a problem for you. Run some benchmarks (gatling is superb) and do a few profiles with something like JProfiler. If you can live with the current performance then happy days.
The ideal is to use a reactive driver which would return you a future that then gets passed all the way back to your controller. Unfortunately async is still an Open ticket for slick. Interacting with REST APIs can be made reactive using the PlayWS library, but if you have to go via a library that your 3rd party provides then you're stuck.
So, assuming that none of these are feasible and that you do need to improve performance, the question is what benefit would Play's suggestion have? I think what they're getting at here is that it's useful to partition your threads into those that block and those that can make use of asynchronous techniques.
If, for instance, only some proportion of your requests are long and blocking then with a single thread pool you risk all threads being used for the blocking operations. Your controller would then not be able to handle any new requests, irrespective of whether that request needs to call a blocking service. If you can allocate enough threads that this never happens then no problem.
If, on the other hand, you are hitting your limit for threads then by using two pools you can keep your fast, non-blocking requests snappy. You would have one pool servicing requests in your controller and calling into services which return futures. Some of these futures would actually be performing work using a separate pool of threads, but only for the blocking operations. If there is any portion of your app which could be made reactive, then your controller could take advantage of this while isolating the controller from the blocking operations.

Using Scala Akka framework for blocking CLI calls

I'm relatively new to Akka & Scala, but I would like to use Akka as a generic framework to pull together information from various web tools, and cli commands.
I understand the general principal that in an Actor model, it is highly desirable not to have the actors block. And in the case of the http requests, there are async http clients (such as Spray) that means that I can handle the requests asynchronously within the Actor framework.
However, I'm unsure what is the best approach when combining actors with existing blocking API calls such as the scala ProcessBuilder/ProcessIO libraries. In terms of issuing these CLI commands I expect a relatively small amount of concurrency, e.g. perhaps executing a max of 10 concurrent CLI invocations on a 12 core machine.
Is it better to have a single actor managing these CLI commands, farming the actual work off to Futures that are created as needed? Or would it be cleaner just to maintain a set of separate actors backed by a PinnedDispatcher? Or something else?
From the Akka documentation ( http://doc.akka.io/docs/akka/snapshot/general/actor-systems.html#Blocking_Needs_Careful_Management ):
"
Blocking Needs Careful Management
In some cases it is unavoidable to do blocking operations, i.e. to put a thread to sleep for an indeterminate time, waiting for an external event to occur. Examples are legacy RDBMS drivers or messaging APIs, and the underlying reason in typically that (network) I/O occurs under the covers. When facing this, you may be tempted to just wrap the blocking call inside a Future and work with that instead, but this strategy is too simple: you are quite likely to find bottle-necks or run out of memory or threads when the application runs under increased load.
The non-exhaustive list of adequate solutions to the “blocking problem” includes the following suggestions:
Do the blocking call within an actor (or a set of actors managed by a router [Java, Scala]), making sure to configure a thread pool which is either dedicated for this purpose or sufficiently sized.
Do the blocking call within a Future, ensuring an upper bound on the number of such calls at any point in time (submitting an unbounded number of tasks of this nature will exhaust your memory or thread limits).
Do the blocking call within a Future, providing a thread pool with an upper limit on the number of threads which is appropriate for the hardware on which the application runs.
Dedicate a single thread to manage a set of blocking resources (e.g. a NIO selector driving multiple channels) and dispatch events as they occur as actor messages.
The first possibility is especially well-suited for resources which are single-threaded in nature, like database handles which traditionally can only execute one outstanding query at a time and use internal synchronization to ensure this. A common pattern is to create a router for N actors, each of which wraps a single DB connection and handles queries as sent to the router. The number N must then be tuned for maximum throughput, which will vary depending on which DBMS is deployed on what hardware."

How to limit concurrency when using actors in Scala?

I'm coming from Java, where I'd submit Runnables to an ExecutorService backed by a thread pool. It's very clear in Java how to set limits to the size of the thread pool.
I'm interested in using Scala actors, but I'm unclear on how to limit concurrency.
Let's just say, hypothetically, that I'm creating a web service which accepts "jobs". A job is submitted with POST requests, and I want my service to enqueue the job then immediately return 202 Accepted — i.e. the jobs are handled asynchronously.
If I'm using actors to process the jobs in the queue, how can I limit the number of simultaneous jobs that are processed?
I can think of a few different ways to approach this; I'm wondering if there's a community best practice, or at least, some clearly established approaches that are somewhat standard in the Scala world.
One approach I've thought of is having a single coordinator actor which would manage the job queue and the job-processing actors; I suppose it could use a simple int field to track how many jobs are currently being processed. I'm sure there'd be some gotchyas with that approach, however, such as making sure to track when an error occurs so as to decrement the number. That's why I'm wondering if Scala already provides a simpler or more encapsulated approach to this.
BTW I tried to ask this question a while ago but I asked it badly.
Thanks!
I'd really encourage you to have a look at Akka, an alternative Actor implementation for Scala.
http://www.akkasource.org
Akka already has a JAX-RS[1] integration and you could use that in concert with a LoadBalancer[2] to throttle how many actions can be done in parallell:
[1] http://doc.akkasource.org/rest
[2] http://github.com/jboner/akka/blob/master/akka-patterns/src/main/scala/Patterns.scala
You can override the system properties actors.maxPoolSize and actors.corePoolSize which limit the size of the actor thread pool and then throw as many jobs at the pool as your actors can handle. Why do you think you need to throttle your reactions?
You really have two problems here.
The first is keeping the thread pool used by actors under control. That can be done by setting the system property actors.maxPoolSize.
The second is runaway growth in the number of tasks that have been submitted to the pool. You may or may not be concerned with this one, however it is fully possible to trigger failure conditions such as out of memory errors and in some cases potentially more subtle problems by generating too many tasks too fast.
Each worker thread maintains a dequeue of tasks. The dequeue is implemented as an array that the worker thread will dynamically enlarge up to some maximum size. In 2.7.x the queue can grow itself quite large and I've seen that trigger out of memory errors when combined with lots of concurrent threads. The max dequeue size is smaller 2.8. The dequeue can also fill up.
Addressing this problem requires you control how many tasks you generate, which probably means some sort of coordinator as you've outlined. I've encountered this problem when the actors that initiate a kind of data processing pipeline are much faster than ones later in the pipeline. In order control the process I usually have the actors later in the chain ping back actors earlier in the chain every X messages, and have the ones earlier in the chain stop after X messages and wait for the ping back. You could also do it with a more centralized coordinator.