Calling asynchronous/Future code from synchronous code in Finagle - scala

I'm working on a Finagle HTTP application where services were implemented without taking advantage of Futures, accessing Redis via a third-party lib. Such services have the following form:
class SampleOldService extends Service[Request, Response] {
  def apply(req: Request): Future[Response] = {
    val value: Int = getValueFromRedis()
    val response: Response = buildResponse(value)
    Future.value(response)
  }
}
(They are much more complex than this -- the point here is that they are synchronous.)
At some point we began developing new services with Futures and also using the Finagle Redis API. Redis calls are encapsulated in a Store class. New services have the following form:
class SampleNewService extends Service[Request, Response] {
  def apply(req: Request): Future[Response] = {
    val value: Future[Int] = Store.getValue()
    val response: Future[Response] = value map buildResponse
    response
  }
}
(They are much more complex than this -- the point here is that they are asynchronous.)
We began refactoring the old services to also take advantage of asynchronicity and Futures. We want to do this incrementally, without having to fully re-implement them at once.
The first step was to try to use the new Store class, with code like this:
class SampleOldService extends Service[Request, Response] {
  def apply(req: Request): Future[Response] = {
    val valueFuture: Future[Int] = Store.getValue()
    val value: Int = Await.result(valueFuture)
    val response: Response = buildResponse(value)
    Future.value(response)
  }
}
However, it proved catastrophic: under heavy load, requests to the old services get stuck at the Await.result() call. The new asynchronous services show no issue.
The problem seems to be related to exhaustion of a thread and/or future pool. We have found several solutions for calling synchronous code (which performs I/O) from asynchronous code by using custom pools (such as FuturePool), but not the other way around, which is our case.
So, what is the recommended way of calling asynchronous code (which performs I/O) from synchronous code in Finagle?

The easiest thing you can do is wrap your synchronous calls in a thread pool that returns a Future. Twitter's util-core provides the FuturePool utility to achieve exactly that.
Something like this (untested code):
import com.twitter.util.FuturePool

val future = FuturePool.unboundedPool {
  val result = myBlockingCall.await()
  result
}

You can use FuturePool, which runs futures on top of a cached thread pool, but why do that when you can have the service return a Promise and set the value of the Promise when the Future from the store class completes?
val p: Promise[Response] = Promise[Response]()
val value: Future[Int] = Store.getValue()
value onSuccess { x =>
  val result: Response = buildResponse(x)
  p.setValue(result)
}
p
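For reference, the Promise wiring above is equivalent to a direct map over the Future, exactly as in the question's SampleNewService. A minimal sketch, reusing the hypothetical Store.getValue() and buildResponse from the question:

class SampleService extends Service[Request, Response] {
  def apply(req: Request): Future[Response] =
    Store.getValue() map buildResponse
}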

Related

Play Framework async controller blocks subsequent calls for the same controller

My goal is to do some database queries from the async controller, then return the answer.
I'm playing with the example project, for now just simulating the DB queries with a sleep, but what I noticed is that whatever I do, the REST interface won't even start the sleep of the second query until the first one finishes.
E.g.: If I call the REST interface from one tab in the browser, then 1 second later again from an another tab, I'd expect that the second one gets the reply too in 10 seconds, but actually it's 19.
It doesn't seem to use the "database-io" pool either:
1: application-akka.actor.default-dispatcher-2
2: application-akka.actor.default-dispatcher-5
My code:
@Singleton
class AsyncController @Inject()(cc: ControllerComponents, actorSystem: ActorSystem) extends AbstractController(cc) {

  implicit val executionContext = actorSystem.dispatchers.lookup("database-io")

  def message = Action.async {
    getFutureMessage().map { msg => Ok(msg) }
  }

  private def getFutureMessage(): Future[String] = {
    val defaultThreadPool = Thread.currentThread().getName
    println(s"1: $defaultThreadPool")
    val promise: Promise[String] = Promise[String]()
    actorSystem.scheduler.scheduleOnce(0.seconds) {
      val blockingPool = Thread.currentThread().getName
      println(s"2: $blockingPool")
      Thread.sleep(10000)
      promise.success("Hi!")
    }(actorSystem.dispatcher)
    promise.future
  }
}
There could be two reasons for this behavior:
You are using development mode (1 thread), or your production configuration is configured for only one thread.
The browser blocks the second request until it receives the response from the first (you mention calling the REST interface from two tabs in the same browser). Try doing the same from different browsers.
You need to avoid blocking code. Basically:
You call a method that returns a Future.
You map into it.
You recover any failure the Future result might bring.
Let's say I have:
def userAge(userId: String): Future[Int] = ???
Then you map into it:
userAge(userId).map { age =>
  ??? // everything is ok
}.recover {
  case e: Throwable => ??? // Do something when it fails
}
Note that if you have more than one call, the outer map becomes a flatMap, because you want Future[...] and not Future[Future[...]].
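For example, a sketch of chaining two calls, assuming a second hypothetical method userName(userId: String): Future[String]:

def userName(userId: String): Future[String] = ??? // hypothetical, for illustration

def userSummary(userId: String): Future[String] =
  userAge(userId).flatMap { age =>   // flatMap on the outer call avoids Future[Future[String]]
    userName(userId).map { name =>
      s"$name is $age years old"
    }
  }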

Trying to do Scala WS calls in a given ExecutionContext

I have a project which does HTTP calls to two separate APIs. The calls to both of these APIs need to be rate-limited separately. I started with the calls to one of the APIs, and I'm trying to use a custom ExecutionContext to achieve this. Here's my application.conf:
play.modules.enabled += "playtest.PlayTestModule"

my-context {
  fork-join-executor {
    parallelism-min = 10
    parallelism-max = 10
  }
}
This is the scala class I'm using to test if it works:
@Singleton
class MyWsClient @Inject() (client: WSClient, akkaSystem: ActorSystem) {

  val myExecutionContext: ExecutionContext = akkaSystem.dispatchers.lookup("my-context")
  val i = new AtomicInteger(0)

  def doThing: Future[Int] = {
    Future {
      println(i.incrementAndGet)
      println("Awaiting")
      Await.result(client.url("http://localhost:9000/test").get, Duration.Inf)
      println("Done")
      i.decrementAndGet
      1
    }(myExecutionContext)
  }
}
However, no matter what I try, the number of parallel calls exceeds the limits I set in the application.conf. But it gets even stranger, because if I replace the line
Await.result(client.url("http://localhost:9000/test").get, Duration.Inf)
with
Thread.sleep(1000)
the limits ARE respected and the rate is properly limited.
What am I doing wrong and how can I fix it? If there is another way of rate limiting with the scala-ws library I would love to hear it.
I understand you want to keep using scala-ws, but what about something that does not rely on a specific ExecutionContext?
If you agree with that, here's an idea... You create a RateLimitedWSClient component, which you will inject into your controllers instead of WSClient. This component should be a singleton, and support a single method def rateLimit[R](rateLimitClass: String)(request: WSClient => Future[R]). The rateLimitClass is meant to specify which rate limit to apply to the current request, as you said you need to rate-limit requests to different APIs differently. The request function should be obvious.
Now my suggestion for the implementation is to use a simple akka-stream that will pipe your requests through the actual WSClient while rate-limiting using the throttle flow-stage (https://doc.akka.io/docs/akka/current/scala/stream/stages-overview.html#throttle):
val client: WSClient = ??? // injected into the component

// component initialization, for example create one flow per API
val queue =
  Source
    .queue[(Promise[_], WSClient => Future[_])](...) // keep this materialized value
    .throttle(...)
    .map { case (promise, request) =>    // the tuple must be destructured with `case`
      promise.completeWith(request(client))
    }
    .to(Sink.ignore)
    .run() // You have to get the materialized queue out of here!

def rateLimit[R](rateLimitClass: String)(request: WSClient => Future[R]): Future[R] = {
  val result = Promise[R]()
  // select which queue to use based on rateLimitClass
  if (rateLimitClass == "API1")
    queue.offer(result -> request)
  else ???
  result.future
}
The above is only rough code; I hope you get the idea. You can of course choose something other than a queue, or if you keep the queue, you have to decide how to handle overflows...
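As a concrete starting point, here is a minimal runnable sketch of the throttle stage on its own. The rates, buffer size, and overflow strategy are assumptions, and it assumes Akka 2.6, where an implicit ActorSystem provides the materializer:

import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import akka.stream.{OverflowStrategy, ThrottleMode}
import scala.concurrent.duration._

object ThrottleSketch extends App {
  implicit val system: ActorSystem = ActorSystem("throttle-sketch")

  // Buffer up to 100 pending elements; emit at most 10 per second.
  val queue = Source
    .queue[Int](bufferSize = 100, OverflowStrategy.dropNew)
    .throttle(elements = 10, per = 1.second, maximumBurst = 10, ThrottleMode.Shaping)
    .to(Sink.foreach(n => println(s"processed $n")))
    .run()

  (1 to 30).foreach(queue.offer)
}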

Implementing a WebSocketClient in Scala controller

I have a Scala controller in which I am calling an external webservice using the WS API of the Play! framework, which returns JSON. The same API is now to be called using a WebSocketClient, as every connection should be made using a WebSocket instead of HTTP. So normal Actions in the controller are converted to WebSocket functions; however, I am not able to call a WebSocket function from within the Scala code. I have searched the web for a solution in several places, but didn't find one anywhere. How can this be done: calling a WebSocket function and fetching its JSON using a WebSocketClient in Scala code, or in other words, consuming a WebSocket from within Scala code? I found a question on SO similar to mine, but no one has answered it. I want to know whether it's possible in the Play framework or not.
Consume a WebSocket connection using Scala and Play
Edit: I am implementing the following code:
val c = new AsyncHttpClient()
val webSocketClient = c.prepareGet("ws://0.0.0.0:9000/testSocket")
  .execute(new WebSocketUpgradeHandler.Builder()
    .addWebSocketListener(new WebSocketTextListener {
      override def onMessage(s: String): Unit = {}

      override def onOpen(webSocket: websocket.WebSocket): Unit = {
        webSocket.sendTextMessage("test")
      }

      override def onFragment(s: String, b: Boolean): Unit = {}

      override def onError(throwable: Throwable): Unit = {}

      override def onClose(webSocket: websocket.WebSocket): Unit = {
        latch.countDown()
      }
    }).build())
  .get()

val result = webSocketClient.sendTextMessage("true")
println("================================" + result)
The result variable is not printed on the console; a JSON parser exception is thrown instead.
Update: My WebSocket connection at ws://0.0.0.0:9000/testSocket, which is inside a Scala controller of a different project, is as follows:
def sockeTest = WebSocket.tryAccept[JsValue] { request =>
  // Some database computation here which generated a Future[JsValue] value in futureJsonVariable.
  futureJsonVariable.map { json =>
    val in = Iteratee.ignore[JsValue]
    val out = Enumerator(json).andThen(Enumerator.eof)
    Right((in, out))
  } recover {
    case err => Left(InternalServerError(err.getMessage))
  }
}
Update2: One last thing I would like to ask: we invoke a WebSocket connection using webSocket.sendMessage("test".getBytes()), which gives us the WebSocket's response in the overridden onMessage() method. I want to know how we can wait until a WebSocket response is received, so that we can perform the required computations with the response data. I tried returning a Future[JsValue] from inside the onMessage() method, but that is invalid. So how can we make webSocket.sendMessage("test".getBytes()) wait, so that further code is executed once the WebSocket response arrives?
Play doesn't support WebSocket client connections. The best option is probably to use AsyncHttpClient; this is the library that Play's WS API is built on, so it will already be on your classpath. Instructions for accessing WebSockets with it are here:
https://github.com/AsyncHttpClient/async-http-client
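Regarding Update2 (waiting for the WebSocket response): rather than blocking, the usual pattern is to complete a Promise from the callback and let callers map over the resulting Future. A rough sketch against the same listener API used above:

import scala.concurrent.Promise
import scala.concurrent.ExecutionContext.Implicits.global

val firstMessage = Promise[String]()

val listener = new WebSocketTextListener {
  override def onMessage(s: String): Unit = {
    firstMessage.trySuccess(s) // fulfil the Future with the first response
  }
  override def onOpen(webSocket: websocket.WebSocket): Unit = {
    webSocket.sendTextMessage("test")
  }
  override def onFragment(s: String, b: Boolean): Unit = {}
  override def onError(throwable: Throwable): Unit = {
    firstMessage.tryFailure(throwable)
  }
  override def onClose(webSocket: websocket.WebSocket): Unit = {}
}

// Further code runs when the response arrives, without blocking:
firstMessage.future.foreach { response => println(s"got: $response") }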

Non-blocking updates of mutable state with Akka Actors

EDIT: clarification of intent:
I have a (5-10 second) Scala computation that aggregates some data from many AWS S3 objects at a given point in time. I want to make this information available through a REST API. I'd also like to update this information every minute or so, for new objects that have been written to the bucket in the interim. The summary itself will be a large JSON blob, and I can save a bunch of AWS calls if I cache the results of my S3 API calls from the previous updates (since these objects are immutable).
I'm currently writing this Spray.io based REST service in Scala. I'd like the REST server to continue serving 'stale' data even if a computation is currently taking place. Then once the computation is finished, I'd like to atomically start serving requests of the new data snapshot.
My initial idea was to have two actors, one doing the Spray routing and serving, and the other handling the long running computation and feeding the most recent cached result to the routing actor:
class MyCompute extends Actor {
  var myvar = 1.0 // will eventually be several megabytes of state
  import context.dispatcher

  // [ALTERNATIVE A]:
  // def compute() = this.synchronized { Thread.sleep(3000); myvar += 1.0 }
  // [ALTERNATIVE B]:
  // def compute() = { Thread.sleep(3000); this.synchronized { myvar += 1.0 }}
  def compute() = { Thread.sleep(3000); myvar += 1.0 }

  def receive = {
    case "compute" => {
      compute() // BAD: blocks this thread!
      // [FUTURE]:
      Future(compute()) // BAD: Not threadsafe
    }
    case "retrieve" => {
      sender ! myvar
      // [ALTERNATIVE C]:
      // sender ! this.synchronized { myvar }
    }
  }
}

class MyHttpService(val dataService: ActorRef) extends HttpServiceActor {
  implicit val timeout = Timeout(1 seconds)
  import context.dispatcher

  def receive = runRoute {
    path("ping") {
      get {
        complete {
          (dataService ? "retrieve").map(_.toString).mapTo[String]
        }
      }
    } ~
    path("compute") {
      post {
        complete {
          dataService ! "compute"
          "computing.."
        }
      }
    }
  }
}

object Boot extends App {
  implicit val system = ActorSystem("spray-sample-system")
  implicit val timeout = Timeout(1 seconds)

  val dataService = system.actorOf(Props[MyCompute], name = "MyCompute")
  val httpService = system.actorOf(Props(classOf[MyHttpService], dataService), name = "MyRouter")

  val cancellable = system.scheduler.schedule(0 milliseconds, 5000 milliseconds, dataService, "compute")

  IO(Http) ? Http.Bind(httpService, system.settings.config.getString("app.interface"), system.settings.config.getInt("app.port"))
}
As things are written, everything is safe, but when passed a "compute" message, the MyCompute actor will block its thread and be unable to serve "retrieve" requests from the MyHttpService actor.
Some alternatives:
akka.agent
The akka.agent.Agent looks like it is designed to handle this problem nicely (replacing the MyCompute actor with an Agent), except that it seems to be designed for simpler updates of state: in reality, MyCompute will have multiple bits of state (some of which are several-megabyte data structures), and using the sendOff functionality would seemingly rewrite all of that state every time, applying a lot of unnecessary GC pressure.
Synchronization
The [FUTURE] code above solves the blocking problem, but if I'm reading the Akka docs correctly, it would not be threadsafe. Would adding a synchronized block as in [ALTERNATIVE A] solve this? I would also imagine that I only have to synchronize the actual update to the state, as in [ALTERNATIVE B]. Would I also have to do the same for reads of the state, as in [ALTERNATIVE C]?
Spray-cache
The spray-cache pattern seems to be built with a web serving use case in mind (small cached objects available with a key), so I'm not sure if it applies here.
Futures with pipeTo
I've seen examples of wrapping a long running computation in a Future and then piping that back to the same actor with pipeTo to update internal state.
The problem with this is: what if I want to update the mutable internal state of my actor during the long running computation?
Does anyone have any thoughts or suggestions for this use case?
tl;dr:
I want my actor to update internal, mutable state during a long running computation without blocking. Ideas?
So let the MyCompute actor create a Worker actor for each computation:
A "compute" comes to MyCompute
It remembers the sender and spawns a Worker actor, storing the Worker and the sender in a Map[Worker, Sender].
The Worker does the computation. On finishing, the Worker sends the result to MyCompute.
MyCompute updates its state with the result and retrieves the original sender from the Map[Worker, Sender], using the completed Worker as the key. It then sends the result to that sender and terminates the Worker.
Whenever you have blocking in an Actor, you spawn a dedicated actor to handle it. Whenever you need to use another thread or Future in an Actor, you spawn a dedicated actor. Whenever you need to abstract any complexity in an Actor, you spawn another actor.
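A rough sketch of that worker-per-computation pattern (Worker, Compute, and ComputeResult are illustrative names, not from the question):

import akka.actor.{Actor, ActorRef, Props}

case object Compute
case class ComputeResult(value: Double)

class Worker extends Actor {
  def receive = {
    case Compute =>
      Thread.sleep(3000)            // the long-running work, isolated in this actor
      sender() ! ComputeResult(1.0)
  }
}

class MyCompute extends Actor {
  var myvar = 1.0                              // mutable state stays actor-local
  var pending = Map.empty[ActorRef, ActorRef]  // Worker -> original sender

  def receive = {
    case Compute =>
      val worker = context.actorOf(Props[Worker])
      pending += worker -> sender()
      worker ! Compute
    case ComputeResult(v) =>
      myvar += v                               // safe: only this actor touches myvar
      pending.get(sender()).foreach(_ ! myvar) // reply to the original requester
      pending -= sender()
      context.stop(sender())                   // terminate the finished Worker
  }
}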

Call Redis (or other db) from within Spray Route

I am trying to figure out the best way to establish a Redis Pool and then make calls to Redis from within a Spray route. I want to make sure that I can use the connection pool for Redis connections. What would be the best way to instantiate the pool and use it within my spray routes? Is there a better way to establish a "global" pool that can be used? Should I create an actor instead and use that to make the redis calls? I am obviously a bit ignorant here.
Crude Redis Client:
object RedisClient {
  val pool = new JedisPool(new JedisPoolConfig(), "localhost")

  def getValue(key: String): String = {
    val jedis = pool.getResource()
    try {
      // returns the Redis value for the key
      jedis.get(key)
    } finally {
      jedis.close() // return the connection to the pool
    }
  }
}
Route that ends up calling a function that uses the Redis Client
trait DemoService extends HttpService {

  val messageApiRouting =
    path("summary" / Segment / Segment) { (dataset, timeslice) =>
      onComplete(getSummary(dataset, timeslice)) {
        case Success(value) => complete(s"The result was $value")
        case Failure(ex)    => complete(s"An error occurred: ${ex.getMessage}")
      }
    }

  def getSummary(dataset: String, timeslice: String): Future[String] = Future {
    val key = dataset + timeslice
    RedisClient.getValue(key)
  }
}
As far as I know, the Jedis client is blocking rather than non-blocking/async, so you may not get all the benefits of using Spray if you use a blocking client. I would suggest looking at Rediscala.
Second, I would delegate the actual interaction to another actor that wraps a RedisClient talking to your Redis instance/cluster.
Finally, you can complete a Spray route by giving it a Future. This essentially means that your entire pipeline will be async and non-blocking.
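A rough sketch tying those points together with Rediscala (its API is assumed from the project README; it needs an implicit ActorSystem, and get already returns a Future, so the route's onComplete never blocks):

import akka.actor.ActorSystem
import redis.RedisClient
import scala.concurrent.Future

implicit val system = ActorSystem("redis-demo")
import system.dispatcher

val redis = RedisClient("localhost", 6379)

// Replaces the blocking Jedis call: the Future completes when Redis replies.
def getSummary(dataset: String, timeslice: String): Future[String] =
  redis.get[String](dataset + timeslice).map(_.getOrElse("not found"))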
NOTE: Redis itself is still single-threaded, and I don't think there is any way around that, AFAIK.
In general, you should use a reactive driver if possible (e.g., Slick, ReactiveMongo).