I am trying to figure out the best way to establish a Redis Pool and then make calls to Redis from within a Spray route. I want to make sure that I can use the connection pool for Redis connections. What would be the best way to instantiate the pool and use it within my spray routes? Is there a better way to establish a "global" pool that can be used? Should I create an actor instead and use that to make the redis calls? I am obviously a bit ignorant here.
Crude Redis Client:
object RedisClient {
val pool = new JedisPool(new JedisPoolConfig(), "localhost")
def getValue(key: String): String= {
try{
val jedis = pool.getResource()
//returns redis value
jedis.get(key)
}
}
}
Route that ends up calling a function that uses the Redis Client
trait DemoService extends HttpService {
val messageApiRouting =
path("summary" / Segment / Segment) { (dataset, timeslice) =>
onComplete(getSummary(dataset, timeslice)) {
case Success(value) => complete(s"The result was $value")
case Failure(ex) => complete(s"An error occurred: ${ex.getMessage}")
}
}
def getSummary(dataset: String, timeslice: String): Future[String] = Future {
val key = dataset + timeslice
RedisClient.getValue(key)
}
}
As far as I know the Jedis client is not non-blocking and async. So you may not get all the benefits of using Spray if you use a blocking client. I would suggest looking at Rediscala.
Second I would delegate the actual interaction to another actor which has a RedisClient interacting with your Redis instance/cluster.
Finally, you can complete a Spray route by giving it a Future. This essentially means that your entire pipeline will be async and non-blocking.
NOTE: Redis is still single threaded and I don't think there is anyway around that AFAIK.
In general, you should use a reactive driver if possible (e.g., Slick, ReactiveMongo )
Related
I have a project which does HTTP calls to two seperate API's. The calls to both of these API's need to be rate limited separately. I started with the calls to one of the API's and I'm trying to use a custom ExecutionContext to achieve this. Here's my application.conf:
play.modules.enabled += "playtest.PlayTestModule"
my-context {
fork-join-executor {
parallelism-min = 10
parallelism-max = 10
}
}
This is the scala class I'm using to test if it works:
#Singleton
class MyWsClient #Inject() (client: WSClient, akkaSystem: ActorSystem) {
val myExecutionContext: ExecutionContext = akkaSystem.dispatchers.lookup("my-context")
val i = new AtomicInteger(0)
def doThing: Future[Int] = {
Future {
println(i.incrementAndGet)
println("Awaiting")
Await.result(client.url("http://localhost:9000/test").get, Duration.Inf)
println("Done")
i.decrementAndGet
1
}(myExecutionContext)
}
}
However, no matter what I try, the number of parallel calls exceeds the limits I set in the application.conf. But it gets even stranger, because if I replace the line
Await.result(client.url("http://localhost:9000/test").get, Duration.Inf)
with
Thread.sleep(1000)
the limits ARE respected and the rate is properly limited.
What am I doing wrong and how can I fix it? If there is another way of rate limiting with the scala-ws library I would love to hear it.
I understand you want to keep using scala-ws ok, but what about something not relying on using specific ExecutionContext?
If you agree with that here's an idea... You create a RateLimitedWSClient component, which you will inject into your controllers instead of WSClient. This component should be a singleton, and support a single method def rateLimit[R](rateLimitClass: String)(request: WSClient => Future[R]). The rateLimitClass is meant to specify which ratelimit to apply to the current request, as you said you need to rate-limit requests to different API differently. The request function should be obvious.
Now my suggestion for the implementation is to use a simple akka-stream that will pipe your requests through the actual WSClient while rate-limiting using the throttle flow-stage (https://doc.akka.io/docs/akka/current/scala/stream/stages-overview.html#throttle):
val client: WSClient = ??? // injected into the component
// component initialization, for example create one flow per API
val queue =
Source
.queue[(Promise[_], WSClient => Future[_])](...) // keep this materialized value
.throttle(...)
.map { (promise, request) =>
promise.completeWith(request(client))
}
.to(Sink.ignore)
.run() // You have to get the materialized queue out of here!
def rateLimit[R](rateLimitClass: String)(request: WSClient => Future[R]): Future[R] = {
val result = Promise.empty[R]
// select which queue to use based on rateLimitClass
if (rateLimitClass == "API1")
queue.offer(result -> request)
else ???
result.future
}
The above is only rough code, I hope you get the idea. You can of course choose something else that a queue, or if you keep the queue, you have to decide how to handle overflows...
I'm working on a Finagle HTTP application where services were implemented without taking advantage of Futures and accessing Redis via a third-party lib. Such services have the following form:
class SampleOldService extends Service[Request, Response] {
def apply(req: Request): Future[Response] = {
val value: Int = getValueFromRedis()
val response: Response = buildResponse(value)
Future.value(response)
}
}
(They are much more complex than this -- the point here is that they are synchronous.)
At some point we began developing new services with Futures and also using the Finagle Redis API. Redis calls are encapsulated in a Store class. New services have the following form:
class SampleNewService extends Service[Request, Response] {
def apply(req: Request): Future[Response] = {
val value: Future[Int] = Store.getValue()
val response: Future[Response] = value map buildResponse
response
}
}
(They are much more complex than this -- the point here is that they are asynchronous.)
We began refactoring the old services to also take advantage of asynchronicity and Futures. We want to do this incrementally, without having to fully re-implement them at once.
The first step was to try to use the new Store class, with code like this:
class SampleOldService extends Service[Request, Response] {
def apply(req: Request): Future[Response] = {
val valueFuture: Future[Int] = Store.getValue()
val value: Int = Await.result(valueFuture)
val response: Response = buildResponse(value)
Future.value(response)
}
}
However, it proved to be catastrophic, because on heavy loads the requests to the old services are stuck at the Await.result() call. The new asynchronous services show no issue.
The problem seems to be related to exhaustion of thread and/or future pools. We have found several solutions on how to do synchronous calls (which perform I/O) from asynchronous calls by using custom pools (such as FuturePool), but not the other way round, which is our case.
So, what is the recommended way of calling asynchronous code (which perform I/O) from synchronous code in Finagle?
The easiest thing you can do is wrap your synchronous calls with a Thread Pool that return a Future. Twitter's util-core provides the FuturePool utility to achieve exactly that.
Something like that (untested code):
import com.twitter.util.FuturePool
val future = FuturePool.unboundedPool {
val result = myBlockingCall.await()
result
}
You can use FuturePool which are futures that run on top of a cached threadpool, but why do that when you can, have the service return a promise and set the value of the promise when you complete the future from the store class.
val p: Promise[Response] = Promise[Response]()
val value: Future[Int] = Store.getValue()
value onSuccess {x =>
val result: Response = buildResponse(x)
p.setValue(result)
}
p
In an attempt to get out of nested callbacks hell, at least for readability, I am using Scala futures in my vertx application.
I have a simple verticle handling HTTP requests. Upon receiving a request, the verticle calls a method doing async stuff and returning a Future. On future completion, HTTP response is sent to the client:
class HttpVerticle extends Verticle with VertxAccess {
val routeMatcher = RouteMatcher()
routeMatcher.get("/test", request => {
println(Thread.currentThread().getName)
// Using scala default ExecutionContext
import scala.concurrent.ExecutionContext.Implicits.global
val future = getTestFilePath()
future.onComplete {
case Success(filename) => {
println(Thread.currentThread().getName)
request.response().sendFile(filename)
}
case Failure(_) => request.response().sendFile("/default.txt")
}
})
def getTestFilePath(): Future[String] = {
// Some async stuff returning a Future
}
}
I noticed that using the usual (at least for me) ExecutionContext, the thread executing the completion of the future is not part of the vertx pool (this is what the prinln statement is for). The first prinln outputs vert.x-eventloop-thread-4 whereas the second outputs ForkJoinPool-1-worker-5.
Then, I supposed I had to use instead the vertx execution context:
class HttpVerticle extends Verticle with VertxAccess {
val routeMatcher = RouteMatcher()
routeMatcher.get("/test", request => {
println(Thread.currentThread().getName)
// Using vertx execution context
implicit val ec: ExecutionContext = VertxExecutionContext.fromVertxAccess(this)
val future = getTestFilePath()
future.onComplete {
case Success(filename) => {
println(Thread.currentThread().getName)
request.response().sendFile(filename)
}
case Failure(_) => request.response().sendFile("/default.txt")
}
})
def getTestFilePath(): Future[String] = {
// Some async stuff returning a Future
}
}
With this, the first and second println will output vert.x-eventloop-thread-4.
Note that this is a minimal example. In my real application code, I have multiple nested callbacks and thus chained futures.
My questions are:
Should I use the vertx execution context with all my futures in verticles ?
Same question for worker verticles.
If the answers to the above questions are yes, is there a case where, in a vertx application, I should not use the vertx application context ?
Note: I am using vertx 2.1.5 with lang-scala 1.0.0.
I got an answer from Lars Timm on the vert.x Google user group :
Yes, you should use the Vertx specific execution context. That ensures the futures are run on the correct thread/event loop. I haven't tried it with worker verticles but I don't see why it shouldn't work there also.
By the way, you should consider using the 1.0.1-M1 instead of 1.0.0. As far as I remember an error was fixed in the ExecutionContext in that version.
You also don't have to import the VertxExecutionContext. It's automatically done when you inherit from Verticle/VertxAccess.
If I need to write an integration test involving HTTP request via spray-can, how can I make sure that spray-can is using CallingThreadDispatcher?
Currently the following actor will print None
class Processor extends Actor {
override def receive = {
case Msg(n) =>
val response = (IO(Http) ? HttpRequest(GET, Uri("http://www.google.com"))).mapTo[HttpResponse]
println(response.value)
}
}
How can I make sure that the request is being performed on the same thread as the test (resulting in a synchronous request)?
It seems like strange way to do integration internal-testing as you don't mock the "Google", so is more like integration external-testing and synchronous TestActorRef doesn't much fit here. The requirement to control threads inside spray is also pretty tricky. However, if you really need that for http-request - it's possible. In general case, you have to setup several dispatchers in your application.conf:
"manager-dispatcher" (from Http.scala) to dispatch your IO(Http) ? req
"host-connector-dispatcher" to use it by HttpHostConnector(or ProxyHttpHostConnector) which actually dispatch your request
"settings-group-dispatcher" for Http.Connect
They all are decribed in Configuration Section of spray documentation. And they all are pointing to "akka.actor.default-dispatcher" (see Dispatchers), so you can change all of them by changing this one.
The problem here is that calling thread is not guaranteed to be your thread actually, so it will NOT help much with your tests. Just imagine if some of actors registers some handler, responding to your message:
//somewhere in spray...
case r#Request => registerHandler(() => {
...
sender ! response
})
The response may be sent from another thread, so response.value may still be None in current. Actually, the response will be sent from the listening thread of underlying socket library, indepently from your test's thread. Simply saying, request may be sent in one (your) thread, but the response is received in another.
If you really really need to block here, I would recommend you to move such code samples (like IO(Http) ? HttpRequest) out and mock them in any convinient way inside your tests. Smtng like that:
trait AskGoogle {
def okeyGoogle = IO(Http) ? HttpRequest(GET, Uri("http://www.google.com"))).mapTo[HttpResponse]
}
trait AskGoogleMock extends AskGoogle {
def okeyGoogle = Await.result(super.okeyGoogle, timeout)
}
class Processor extends Actor with AskGoogle {
override def receive = {
case Msg(n) =>
val response = okeyGoogle
println(response.value)
}
}
val realActor = system.actorOf(Props[Processor])
val mockedActor = TestActorRef[Processor with AskGoogleMock]
By the way, you can mock IO(HTTP) with another TestActorRef to the custom actor, which will do the outside requests for you - it should require minimal code changes if you have a big project.
I am doing some Http request processing using Spray. For a request I spin up an actor and send the payload to the actor for processing and after the actor is done working on the payload, I call context.stop(self) on the actor to wind the actor down.The idea is to prevent oversaturation of actors on the physical machine.
This is how I have things set up..
In httphandler.scala, I have the route set up as follows:
path("users"){
get{
requestContext => {
val userWorker = actorRefFactory.actorOf(Props(new UserWorker(userservice,requestContext)))
userWorker ! getusers //get user is a case object
}
}
} ~ path("users"){
post{
entity(as[UserInfo]){
requestContext => {
userInfo => {
val userWorker = actorRefFactory.actorOf(Props(new UserWorker(userservice,requestContext)))
userWorker ! userInfo
}
}
}
}
}
My UserWorker actor is defined as follows:
trait RouteServiceActor extends Actor{
implicit val system = context.system
import system.dispatcher
def processRequest[responseModel:ToResponseMarshaller](requestContex:RequestContext)(processFunc: => responseModel):Unit = {
Future{
processFunc
} onComplete {
case Success(result) => {
requestContext.complete(result)
}
case Failure(error) => requestContext.complete(error)
}
}
}
class UserWorker(userservice: UserServiceComponent#UserService,requestContext:RequestContext) extends RouteServiceActor{
def receive = {
case getusers => processRequest(requestContext){
userservice.getAllUsers
}
context.stop(self)
}
case userInfo:UserInfo => {
processRequest(requestContext){
userservice.createUser(userInfo)
}
context.stop(self)
}
}
My first question is, am I handling the request in a true asynchronous fashion? What are some of the pitfalls with my code?
My second question is how does the requestContext.complete work? Since the original request processing thread is no longer there, how does the requestContext send the result of the computation back to the client.
My third question is that since I am calling context.stop(self) after each of my partial methods, is it possible that I terminate the worker while it is in the midst of processing a different message.
What I mean is that while the Actor receives a message to process getusers, the same actor is done processing UserInfo and terminates the Actor before it can get to the "getusers" message. I am creating new actors upon every request, but is it possible that under the covers, the actorRefFactory provides a reference to a previously created actor, instead of a new one?
I am pretty confused by all the abstractions and it would be great if somebody could break it down for me.
Thanks
1) Is the request handled asynchronously? Yes, it is. However, you don't gain much with your per-request actors if you immediately delegate the actual processing to a future. In this simple case a cleaner way would be to write your route just as
path("users") {
get {
complete(getUsers())
}
}
def getUsers(): Future[Users] = // ... invoke userservice
Per-request-actors make more sense if you also want to make route-processing logic run in parallel or if handling the request has more complex requirements, e.g. if you need to query things from multiple service in parallel or need to keep per-request state while some background services are processing the request. See https://github.com/NET-A-PORTER/spray-actor-per-request for some information about this general topic.
2) How does requestContext.complete work? Behind the scenes it sends the HTTP response to the spray-can HTTP connection actor as a normal actor message "tell". So, basically the RequestContext just wraps an ActorRef to the HTTP connection which is safe to use concurrently.
3) Is it possible that "the worker" is terminated by context.stop(self)? I think there's some confusion about how things are scheduled behind the scenes. Of course, you are terminating the actor with context.stop but that just stops the actor but not any threads (as threads are managed completely independently from actor instances in Akka). As you didn't really make use of an actor's advantages, i.e. encapsulating and synchronizing access to mutable state, everything should work (but as said in 1) is needlessly complex for this use case). The akka documentation has lots of information about how actors, futures, dispatchers, and ExecutionContexts work together to make everything work.
In addition to jrudolph answer your spray routing structure shouldn't even compile, cause in your post branch you don't explicitly specify a requestContext. This structure can be simplified a bit to this:
def spawnWorker(implicit ctx: RequestContext): ActorRef = {
actorRefFactory actorOf Props(new UserWorker(userservice, ctx))
}
lazy val route: Route = {
path("users") { implicit ctx =>
get {
spawnWorker ! getUsers
} ~
(post & entity(as[UserInfo])) {
info => spawnWorker ! info
}
}
}
The line info => spawnWorker ! info can be also simplified to spawnWorker ! _.
Also there is an important point concerning explicit ctx declaration and complete directive. If you explicitly declared ctx in your route, you can't use complete directive, you have to explicitly write ctx.complete(...), link on this issue