Suppose I have a simple HTTP client with a naïve method like this:
def httpGet(url: String)(implicit ec: ExecutionContext): Future[String] = Future {
io.Source.fromURL(url).mkString
}
Suppose also that I call it against a server that imposes a rate limit and a concurrency limit. I think these limits should be implemented in the ExecutionContext.
The concurrency limit is simply the number of threads of the java.util.concurrent.Executor backing the ExecutionContext, and the rate limiting should be part of the ExecutionContext too. So we can write a new class extending ExecutionContext that implements the rate and concurrency limits, and use an instance of this class to invoke httpGet.
class ExecutionContextWithRateAndConcurrencyLimits(
numThreads: Int, // for concurrency limit
rate: Rate // requests per time unit
) extends ExecutionContext { ... }
val ec = new ExecutionContextWithRateAndConcurrencyLimits(
numThreads = 100,
rate = Rate(1000, 1.second)
)
val fut = httpGet(url)(ec)
Would you agree that rate and concurrency limits should be implemented in the ExecutionContext? Do you know of any implementation like that?
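To make the idea concrete, here is a rough, untested sketch of how such a class might work internally (this is not an existing library; Rate is just the helper case class from the snippet above, and the draining strategy is only illustrative):
import java.util.concurrent.{Executors, TimeUnit}
import scala.concurrent.ExecutionContext
import scala.concurrent.duration._

case class Rate(n: Int, per: FiniteDuration)

class ExecutionContextWithRateAndConcurrencyLimits(
  numThreads: Int, // concurrency limit = size of the backing pool
  rate: Rate       // at most rate.n task starts per rate.per
) extends ExecutionContext {
  private val pool      = Executors.newFixedThreadPool(numThreads)
  private val scheduler = Executors.newSingleThreadScheduledExecutor()
  private val queue     = new java.util.concurrent.LinkedBlockingQueue[Runnable]()

  // Every rate.per, move at most rate.n queued tasks onto the worker pool.
  private val drain = new Runnable {
    def run(): Unit = {
      var moved = 0
      var next: Runnable = queue.poll()
      while (next != null) {
        pool.execute(next)
        moved += 1
        next = if (moved < rate.n) queue.poll() else null
      }
    }
  }
  scheduler.scheduleAtFixedRate(drain, 0, rate.per.toMillis, TimeUnit.MILLISECONDS)

  def execute(runnable: Runnable): Unit = { queue.offer(runnable); () }
  def reportFailure(cause: Throwable): Unit = cause.printStackTrace()
}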
Related
I have a project which makes HTTP calls to two separate APIs. The calls to both of these APIs need to be rate limited separately. I started with the calls to one of the APIs, and I'm trying to use a custom ExecutionContext to achieve this. Here's my application.conf:
play.modules.enabled += "playtest.PlayTestModule"
my-context {
fork-join-executor {
parallelism-min = 10
parallelism-max = 10
}
}
This is the scala class I'm using to test if it works:
@Singleton
class MyWsClient #Inject() (client: WSClient, akkaSystem: ActorSystem) {
val myExecutionContext: ExecutionContext = akkaSystem.dispatchers.lookup("my-context")
val i = new AtomicInteger(0)
def doThing: Future[Int] = {
Future {
println(i.incrementAndGet)
println("Awaiting")
Await.result(client.url("http://localhost:9000/test").get, Duration.Inf)
println("Done")
i.decrementAndGet
1
}(myExecutionContext)
}
}
However, no matter what I try, the number of parallel calls exceeds the limits I set in the application.conf. But it gets even stranger, because if I replace the line
Await.result(client.url("http://localhost:9000/test").get, Duration.Inf)
with
Thread.sleep(1000)
the limits ARE respected and the rate is properly limited.
What am I doing wrong and how can I fix it? If there is another way of rate limiting with the scala-ws library I would love to hear it.
I understand you want to keep using scala-ws, but what about something that does not rely on using a specific ExecutionContext?
If you agree with that, here's an idea: you create a RateLimitedWSClient component, which you inject into your controllers instead of WSClient. This component should be a singleton and support a single method def rateLimit[R](rateLimitClass: String)(request: WSClient => Future[R]): Future[R]. The rateLimitClass specifies which rate limit to apply to the current request, since you said you need to rate-limit requests to different APIs differently. The request function should be obvious.
Now my suggestion for the implementation is to use a simple akka-stream that will pipe your requests through the actual WSClient while rate-limiting using the throttle flow-stage (https://doc.akka.io/docs/akka/current/scala/stream/stages-overview.html#throttle):
val client: WSClient = ??? // injected into the component
// component initialization, for example create one flow per API
val queue =
Source
.queue[(Promise[_], WSClient => Future[_])](...) // keep this materialized value
.throttle(...)
.map { case (promise, request) =>
promise.completeWith(request(client))
}
.to(Sink.ignore)
.run() // You have to get the materialized queue out of here!
def rateLimit[R](rateLimitClass: String)(request: WSClient => Future[R]): Future[R] = {
val result = Promise[R]()
// select which queue to use based on rateLimitClass
if (rateLimitClass == "API1")
queue.offer(result -> request)
else ???
result.future
}
The above is only rough code, I hope you get the idea. You can of course choose something other than a queue, or, if you keep the queue, you have to decide how to handle overflow...
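Usage from a controller could then look roughly like this (again untested; ProxyController, callApi1 and the URL are made up for illustration, and RateLimitedWSClient is the component sketched above):
import javax.inject.Inject
import play.api.libs.ws.WSResponse
import play.api.mvc.{AbstractController, ControllerComponents}
import scala.concurrent.ExecutionContext

class ProxyController @Inject() (rateLimited: RateLimitedWSClient, cc: ControllerComponents)
                                (implicit ec: ExecutionContext)
    extends AbstractController(cc) {

  // One call site per rate-limit class; the string must match whatever the
  // RateLimitedWSClient was configured with ("API1" here).
  def callApi1 = Action.async {
    rateLimited.rateLimit[WSResponse]("API1") { ws =>
      ws.url("https://api-one.example.com/resource").get()
    }.map(resp => Status(resp.status)(resp.body))
  }
}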
I have started using the MongoDB Scala driver in our Scala akka-http project, and it has been a great help; the case class support in v2.0.0 in particular is very nifty. I'm trying to wrap my head around how to use the driver with a non-default execution context using observeOn.
Due to the nature of our Java library dependencies, I'm using blocking calls to get the results from MongoDB, as shown here: Helpers. I've modified the results and headResult functions from the MongoDB Helpers slightly, using observeOn as below, but I'm noticing a strange race condition that I don't know how to resolve.
trait ImplicitObservable[C] {
val observable: Observable[C]
val converter: (C) => String
def headResult()(implicit executionContext: ExecutionContext) = Await.result(observable.observeOn(executionContext).head(), Duration(10, TimeUnit.SECONDS))
def results()(implicit executionContext: ExecutionContext): List[C] = Await.result(observable.observeOn(executionContext).toFuture(), Duration(20, TimeUnit.SECONDS)).toList
}
The results function doesn't return all the records I'm expecting, and the behavior is different every time, except when I use the Akka PinnedDispatcher, which allows only one thread. Since it's a blocking operation, I would like to use a non-default Akka dispatcher so that it will not block my HTTP requests. I'd really appreciate it if someone could help me with this.
// looking up the dispatcher
val executionContext = system.dispatchers.lookup("mongo-dispatcher")
# application.conf
mongo-dispatcher {
type = Dispatcher
executor = "thread-pool-executor"
thread-pool-executor {
fixed-pool-size = 100
}
throughput = 1
}
My sample database client code:
def getPersonById(personId: String): PersonCaseClass = {
personCollection.find[PersonCaseClass](equal("_id", personId)).first().headResult()
}
I'm working on a Finagle HTTP application where services were implemented without taking advantage of Futures and accessing Redis via a third-party lib. Such services have the following form:
class SampleOldService extends Service[Request, Response] {
def apply(req: Request): Future[Response] = {
val value: Int = getValueFromRedis()
val response: Response = buildResponse(value)
Future.value(response)
}
}
(They are much more complex than this -- the point here is that they are synchronous.)
At some point we began developing new services with Futures and also using the Finagle Redis API. Redis calls are encapsulated in a Store class. New services have the following form:
class SampleNewService extends Service[Request, Response] {
def apply(req: Request): Future[Response] = {
val value: Future[Int] = Store.getValue()
val response: Future[Response] = value map buildResponse
response
}
}
(They are much more complex than this -- the point here is that they are asynchronous.)
We began refactoring the old services to also take advantage of asynchronicity and Futures. We want to do this incrementally, without having to fully re-implement them at once.
The first step was to try to use the new Store class, with code like this:
class SampleOldService extends Service[Request, Response] {
def apply(req: Request): Future[Response] = {
val valueFuture: Future[Int] = Store.getValue()
val value: Int = Await.result(valueFuture)
val response: Response = buildResponse(value)
Future.value(response)
}
}
However, it proved to be catastrophic, because under heavy load the requests to the old services get stuck at the Await.result() call. The new asynchronous services show no issue.
The problem seems to be related to exhaustion of thread and/or future pools. We have found several solutions for making synchronous calls (which perform I/O) from asynchronous code by using custom pools (such as FuturePool), but not the other way around, which is our case.
So, what is the recommended way of calling asynchronous code (which performs I/O) from synchronous code in Finagle?
The easiest thing you can do is wrap your synchronous calls in a thread pool that returns a Future. Twitter's util-core provides the FuturePool utility to achieve exactly that.
Something like this (untested code):
import com.twitter.util.FuturePool
val future = FuturePool.unboundedPool {
val result = myBlockingCall.await()
result
}
You can use a FuturePool, which runs futures on top of a cached thread pool, but why do that when you can have the service return a Promise and set its value when the Future from the Store class completes?
val p: Promise[Response] = Promise[Response]()
val value: Future[Int] = Store.getValue()
value onSuccess {x =>
val result: Response = buildResponse(x)
p.setValue(result)
}
p
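An untested sketch of what that looks like dropped into the old service shape (Store and buildResponse as in the question), also completing the promise on failure so errors are not silently swallowed:
import com.twitter.finagle.Service
import com.twitter.finagle.http.{Request, Response}
import com.twitter.util.{Future, Promise}

class SampleOldService extends Service[Request, Response] {
  def apply(req: Request): Future[Response] = {
    val p = Promise[Response]()
    val value: Future[Int] = Store.getValue()
    value.onSuccess { x => p.setValue(buildResponse(x)) } // happy path
    value.onFailure { e => p.setException(e) }            // propagate errors
    p
  }
}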
I'm writing a close() method for my ScheduledExecutorService-based timer:
override def close(implicit executionContext: ExecutionContext): Future[Unit] = {
val p = Promise[Unit]
executionContext.execute(new Runnable() {
override def run() = {
blocking {
p complete Try {
executor.shutdown()
//OK for default global execution context
//As we marked this code as blocking, additional thread
//will be used on that so no threadpool starvation
executor.awaitTermination(1, TimeUnit.DAYS)
}
}
}
})
p.future
}
But if I implement the ExecutionContext myself, this code will block one of the pool's threads, because I haven't found any way to hook into that blocking context.
So, the question: is it possible to create my own ExecutionContext that properly handles scala.concurrent.blocking?
Of course it's possible, it's just far from trivial. You would need to create an ExecutionContext that creates threads that mix in BlockContext which requires the following method:
def blockOn[T](thunk: => T)(implicit permission: CanAwait): T
blocking(thunk) will eventually lead to calling blockOn(thunk), and blockOn should figure out if the ExecutionContext has reached starvation and needs to do something or not. scala.concurrent.ExecutionContext.Implicits.global does it this way, but as you can see it uses a ForkJoinPool to do the heavy-lifting, and the implementation of that is thousands of lines of code.
Keep in mind that whether you use ExecutionContext.Implicits.global or your own ExecutionContext, a thread will still be blocked by your code. The only difference is that the former spawns another thread to handle the fact that too many are blocked. Creating your own is likely to introduce some dangerous bugs, though, as a lot of care has to be taken to avoid deadlocks or spawning too many threads.
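For what it's worth, here is a bare-bones, untested illustration of the mechanics (the object and factory names are made up): a fixed pool whose threads mix in BlockContext, with a deliberately naive blockOn that just runs the thunk, whereas a real implementation would start or release a compensating thread there:
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{BlockContext, CanAwait, ExecutionContext}

object BlockingAwareContext {
  def apply(numThreads: Int): ExecutionContext = {
    val factory = new ThreadFactory {
      def newThread(r: Runnable): Thread =
        new Thread(r) with BlockContext {
          // Invoked when code running on this thread enters scala.concurrent.blocking { ... }
          override def blockOn[T](thunk: => T)(implicit permission: CanAwait): T = {
            // Naive: just run the thunk. A serious implementation would compensate
            // for the blocked thread here (e.g. grow the pool) to avoid starvation.
            thunk
          }
        }
    }
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(numThreads, factory))
  }
}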
EDIT: clarification of intent:
I have a (5-10 second) Scala computation that aggregates some data from many AWS S3 objects at a given point in time. I want to make this information available through a REST API. I'd also like to update this information every minute or so for new objects that have been written to this bucket in the interim. The summary itself will be a large JSON blob, and I can save a bunch of AWS calls if I cache the results of my S3 API calls from the previous updates (since these objects are immutable).
I'm currently writing this Spray.io based REST service in Scala. I'd like the REST server to continue serving 'stale' data even if a computation is currently taking place. Then once the computation is finished, I'd like to atomically start serving requests of the new data snapshot.
My initial idea was to have two actors, one doing the Spray routing and serving, and the other handling the long running computation and feeding the most recent cached result to the routing actor:
class MyCompute extends Actor {
var myvar = 1.0 // will eventually be several megabytes of state
import context.dispatcher
// [ALTERNATIVE A]:
// def compute() = this.synchronized { Thread.sleep(3000); myvar += 1.0 }
// [ALTERNATIVE B]:
// def compute() = { Thread.sleep(3000); this.synchronized { myvar += 1.0 }}
def compute() = { Thread.sleep(3000); myvar += 1.0 }
def receive = {
case "compute" => {
compute() // BAD: blocks this thread!
// [FUTURE]:
Future(compute()) // BAD: Not threadsafe
}
case "retrieve" => {
sender ! myvar
// [ALTERNATIVE C]:
// sender ! this.synchronized { myvar }
}
}
}
class MyHttpService(val dataService:ActorRef) extends HttpServiceActor {
implicit val timeout = Timeout(1 seconds)
import context.dispatcher
def receive = runRoute {
path("ping") {
get {
complete {
(dataService ? "retrieve").map(_.toString).mapTo[String]
}
}
} ~
path("compute") {
post {
complete {
dataService ! "compute"
"computing.."
}
}
}
}
}
object Boot extends App {
implicit val system = ActorSystem("spray-sample-system")
implicit val timeout = Timeout(1 seconds)
val dataService = system.actorOf(Props[MyCompute], name="MyCompute")
val httpService = system.actorOf(Props(classOf[MyHttpService], dataService), name="MyRouter")
val cancellable = system.scheduler.schedule(0 milliseconds, 5000 milliseconds, dataService, "compute")
IO(Http) ? Http.Bind(httpService, system.settings.config.getString("app.interface"), system.settings.config.getInt("app.port"))
}
As things are written, everything is safe, but when passed a "compute" message, the MyCompute actor will block its thread and not be able to serve "retrieve" requests from the MyHttpService actor.
Some alternatives:
akka.agent
The akka.agent.Agent looks like it is designed to handle this problem nicely (replacing the MyCompute actor with an Agent), except that it seems to be designed for simpler updates of state: in reality, MyCompute will have multiple pieces of state (some of which are multi-megabyte data structures), and using the sendOff functionality would seemingly rewrite all of that state on every update, applying a lot of unnecessary GC pressure.
Synchronization
The [FUTURE] code above solves the blocking problem, but if I'm reading the Akka docs correctly, it would not be thread-safe. Would adding a synchronized block as in [ALTERNATIVE A] solve this? I would imagine that synchronizing only the actual update to the state, as in [ALTERNATIVE B], would also suffice, and that I would likewise have to do the same for reads of the state, as in [ALTERNATIVE C]?
Spray-cache
The spray-cache pattern seems to be built with a web serving use case in mind (small cached objects available with a key), so I'm not sure if it applies here.
Futures with pipeTo
I've seen examples of wrapping a long running computation in a Future and then piping that back to the same actor with pipeTo to update internal state.
The problem with this is: what if I want to update the mutable internal state of my actor during the long running computation?
Does anyone have any thoughts or suggestions for this use case?
tl;dr:
I want my actor to update internal, mutable state during a long running computation without blocking. Ideas?
So let the MyCompute actor create a Worker actor for each computation:
A "compute" comes to MyCompute
It remembers the sender and spawns the Worker actor. It stores the Worker and the Sender in Map[Worker, Sender]
Worker does the computation. On finish, Worker sends the result to MyCompute
MyCompute updates the result, retrieves the orderer of it from the Map[Worker, Sender] using the completed Worker as the key. Then it sends the result to the orderer, and then it terminates the Worker.
Whenever you have blocking in an actor, you spawn a dedicated actor to handle it. Whenever you need to use another thread or a Future in an actor, you spawn a dedicated actor. Whenever you need to abstract away any complexity in an actor, you spawn another actor.
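A compact, untested sketch of those steps in classic Akka (the Worker, DoWork and WorkDone names are made up; myvar, "compute" and "retrieve" match the question's actor):
import akka.actor.{Actor, ActorRef, Props}

case object DoWork
case class WorkDone(value: Double)

class Worker extends Actor {
  def receive = {
    case DoWork =>
      Thread.sleep(3000)            // the long-running computation happens here
      sender() ! WorkDone(1.0)
  }
}

class MyCompute extends Actor {
  var myvar = 1.0                                 // mutable state stays inside this actor
  var pending = Map.empty[ActorRef, ActorRef]     // Worker -> original sender

  def receive = {
    case "compute" =>
      val worker = context.actorOf(Props[Worker])
      pending += worker -> sender()
      worker ! DoWork
    case WorkDone(v) =>
      myvar += v                                  // updated on the actor's own thread: no races
      pending.get(sender()).foreach(_ ! myvar)    // reply to whoever ordered the computation
      pending -= sender()
      context.stop(sender())                      // terminate the finished Worker
    case "retrieve" =>
      sender() ! myvar
  }
}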