My http request becomes null inside an Akka future - scala

My server application uses Scalatra, with json4s, and Akka.
Most of the requests it receives are POSTs, and they return immediately to the client with a fixed response. The actual responses are sent asynchronously to a server socket at the client. To do this, I need to getRemoteAddr from the http request. I am trying with the following code:
case class MyJsonParams(foo:String, bar:Int)
class MyServices extends ScalatraServlet {
implicit val formats = DefaultFormats
post("/test") {
withJsonFuture[MyJsonParams]{ params =>
// code that calls request.getRemoteAddr goes here
// sometimes request is null and I get an exception
println(request)
}
}
def withJsonFuture[A](closure: A => Unit)(implicit mf: Manifest[A]) = {
contentType = "text/json"
val params:A = parse(request.body).extract[A]
future{
closure(params)
}
Ok("""{"result":"OK"}""")
}
}
The intention of the withJsonFuture function is to move some boilerplate out of my route processing.
This sometimes works (prints a non-null value for request) and sometimes request is null, which I find quite puzzling. I suspect that I must be "closing over" the request in my future. However, the error also happens with controlled test scenarios when there are no other requests going on. I would imagine request to be immutable (maybe I'm wrong?)
In an attempt to solve the issue, I have changed my code to the following:
case class MyJsonParams(foo:String, bar:Int)
class MyServices extends ScalatraServlet {
implicit val formats = DefaultFormats
post("/test") {
withJsonFuture[MyJsonParams]{ (addr, params) =>
println(addr)
}
}
def withJsonFuture[A](closure: (String, A) => Unit)(implicit mf: Manifest[A]) = {
contentType = "text/json"
val addr = request.getRemoteAddr()
val params:A = parse(request.body).extract[A]
future{
closure(addr, params)
}
Ok("""{"result":"OK"}""")
}
}
This seems to work. However, I really don't know if it is still includes any bad concurrency-related programming practice that could cause an error in the future ("future" meant in its most common sense = what lies ahead :).

Scalatra is not so well suited for asynchronous code. I recently stumbled on the very same problem as you.
The problem is that scalatra tries to make the code as declarative as possible by exposing a dsl that removes as much fuss as possible, and in particular does not require you to explicitly pass data around.
I'll try to explain.
In your example, the code inside post("/test") is an anonymous function. Notice that it does not take any parameter, not even the current request object.
Instead, scalatra will store the current request object inside a thread local value just before it calls your own handler, and you can then get it back through ScalatraServlet.request.
This is the classical Dynamic Scope pattern. It has the advantage that you can write many utility methods that access the current request and call them from your handlers, without explicitly passing the request.
Now, the problem comes when you use asynchronous code, as you do.
In your case, the code inside withJsonFuture executes on another thread than the original thread that the handler was initially called (it will execute on a thread from the ExecutionContext's thread pool).
Thus when accessing the thread local, you are accessing a totally distinct instance of the thread local variable.
Simply put, the classical Dynamic Scope pattern is no fit in an asynchronous context.
The solution here is to capture the request at the very start of your handler, and then exclusively reference that:
post("/test") {
val currentRequest = request
withJsonFuture[MyJsonParams]{ params =>
// code that calls request.getRemoteAddr goes here
// sometimes request is null and I get an exception
println(currentRequest)
}
}
Quite frankly, this is too easy to get wrong IMHO, so I would personally avoid using Scalatra altogether if you are in an synchronous context.

I don't know Scalatra, but it's fishy that you are accessing a value called request that you do not define yourself. My guess is that it is coming as part of extending ScalatraServlet. If that's the case, then it's probably mutable state that it being set (by Scalatra) at the start of the request and then nullified at the end. If that's happening, then your workaround is okay as would be assigning request to another val like val myRequest = request before the future block and then accessing it as myRequest inside of the future and closure.

I do not know scalatra but at first glance, the withJsonFuture function returns an OK but also creates a thread via the future { closure(addr, params) } call.
If that latter thread is run after the OK is processed, the response has been sent and the request is closed/GCed.
Why create a Future to run you closure ?
if withJsonFuture needs to return a Future (again, sorry, I do not know scalatra), you should wrap the whole body of that function in a Future.

Try to put with FutureSupport on your class declaration like this
class MyServices extends ScalatraServlet with FutureSupport {}

Related

Scala Dynamic Variable problem with Akka-Http asynchronous requests

My application is using Akka-Http to handle requests. I would like to somehow pass incoming extracted request context (fe. token) from routes down to other methods calls without explicitly passing them (because that would require modyfing a lot of code).
I tried using Dynamic Variable to handle this but it seems to not work as intended.
Example how I used and tested this:
object holding Dynamic Variable instance:
object TestDynamicContext {
val dynamicContext = new DynamicVariable[String]("")
}
routes wrapper for extracting and setting request context (token)
private def wrapper: Directive0 = {
Directive { routes => ctx =>
val newValue = UUID.randomUUID().toString
TestDynamicContext.dynamicContext.withValue(newValue) {
routes(())(ctx)
}
}
I expected to all calls of TestDynamicContext.dynamicContext.value for single request under my wrapper to return same value defined in said wrapper but thats not the case. I verified this by generating for each request separate UUID and passing it explicitly down the method calls - but for single request TestDynamicContext.dynamicContext.value sometimes returns different values.
I think its worth mentioning that some operations underneath use Futures and I though that this may be the issue but solution proposed in this thread did not solve this for me: https://stackoverflow.com/a/49256600/16511727.
If somebody has any suggestions how to handle this (not necessarily using Dynamic Variable) I would be very grateful.

How should I get this value through DDD to async code in Akka HTTP

I'm trying to write an Akka HTTP webservice using domain-driven design but I'm struggling to pass technical data received by the webservice to the code doing the work inside a Future, namely a correlationId sent by the client to my webservice.
My understanding of DDD is that as a choice of implementation, the correlationId shouldn't appear in the domain:
package domain
trait MyRepository {
def startWork(): Future[Unit]
}
To make it available to my implementation code without it being a parameter of the method, I'm thinking of using thread-local storage like org.slf4j.MDC or a ThreadLocal. However, I don't know if that would work or if several calls to the webservice would be handled by the same thread and overwrite the value.
Here is the implementation:
package infra
class MyRepository(implicit executor: ExecutionContextExecutor) extends domain.MyRepository {
override def startWork(): Future[Unit] = {
Future {
val correlationId = ??? // MDC.get("correlationId") ?
log(s"The correlationId is $correlationId")
}
}
}
And the route in my webservice:
val repo = new infra.MyRepository()
val route = path("my" / "path") {
post {
parameter('correlationId) { correlationId =>
??? // MDC.put("correlationId", correlationId) ?
onComplete(repo.startWork()) {
complete(HttpResponse(StatusCodes.OK))
}
}
}
}
My question is twofold:
Is my general design sound and good DDD?
Would using org.slf4j.MDC or a ThreadLocal work or is there a better / more Akka-friendly way to implement it?
Thread-locals (including MDC, though a Lightbend subscription includes tooling to propagate the MDC alongside messages and futures) in Akka are generally a poor idea because it's generally not guaranteed that a given task (a Future in this case, or the actor handling a sent message) will execute on the same thread as the thread that requested the task (and in the specific case where that task is performing [likely-blocking] interactions with an external service/DB (implied by the use of Future {}), you pretty much don't want that to happen). Further, even if the task ends up executing on the same thread that requested the task, it's somewhat unlikely that no other task which could have mutated the MDC/thread-local would've executed in the meantime.
I myself don't see a problem with passing the correlation ID as an argument to startWork: you've already effectively exposed it by passing it through the HTTP endpoint.

download a file with play web api (async)

I am trying to download a file using play api framework. Since all the data access layer has already been implemented with Futures I would like to get download to work with async action as well. However, the following piece of code does not work. And by not working I mean that the file sent to the client is not the same as the file on server.
val sourcePath = "/tmp/sample.pdf"
def downloadAsync = Action.async {
Future.successful(Ok.sendFile(new java.io.File(sourcePath)))
}
However, this piece works:
def download = Action {
Ok.sendFile(new java.io.File(sourcePath))
}
Any suggestion on how I can get the async method to work?
You actually don't need to use Action.async here, since Ok.sendFile is non-blocking already. From the docs:
Play actions are asynchronous by default. For instance, in the controller code below, the { Ok(...) } part of the code is not the method body of the controller. It is an anonymous function that is being passed to the Action object’s apply method, which creates an object of type Action. Internally, the anonymous function that you wrote will be called and its result will be enclosed in a Future.
def echo = Action { request =>
Ok("Got request [" + request + "]")
}
Note: Both Action.apply and Action.async create Action objects that are handled internally in the same way. There is a single kind of Action, which is asynchronous, and not two kinds (a synchronous one and an asynchronous one). The .async builder is just a facility to simplify creating actions based on APIs that return a Future, which makes it easier to write non-blocking code.
In other words, at this specific case, don't worry about wrapping your Result into a Future and just return Ok.sendFile.
Finally, both versions works as expected to me (the file was properly delivered). Maybe you are having another problem not related to how you have declared your actions.

How can I perform session based logging in Play Framework

We are currently using the Play Framework and we are using the standard logging mechanism. We have implemented a implicit context to support passing username and session id to all service methods. We want to implement logging so that it is session based. This requires implementing our own logger. This works for our own logs but how do we do the same for basic exception handling and logs as a result. Maybe there is a better way to capture this then with implicits or how can we override the exception handling logging. Essentially, we want to get as many log messages to be associated to the session.
It depends if you are doing reactive style development or standard synchronous development:
If standard synchronous development (i.e. no futures, 1 thread per request) - then I'd recommend you just use MDC, which adds values onto Threadlocal for logging. You can then customise the output in logback / log4j. When you get the username / session (possibly in a Filter or in your controller), you can then set the values there and then and you do not need to pass them around with implicits.
If you are doing reactive development you have a couple options:
You can still use MDC, except you'd have to use a custom Execution Context that effectively copies the MDC values to the thread, since each request could in theory be handled by multiple threads. (as described here: http://code.hootsuite.com/logging-contextual-info-in-an-asynchronous-scala-application/)
The alternative is the solution which I tend to use (and close to what you have now): You could make a class which represents MyAppRequest. Set the username, session info, and anything else, on that. You can continue to pass it around as an implicit. However, instead of using Action.async, you make your own MyAction class which an be used like below
myAction.async { implicit myRequest => //some code }
Inside the myAction, you'd have to catch all Exceptions and deal with future failures, and do the error handling manually instead of relying on the ErrorHandler. I often inject myAction into my Controllers and put common filter functionality in it.
The down side of this is, it is just a manual method. Also I've made MyAppRequest hold a Map of loggable values which can be set anywhere, which means it had to be a mutable map. Also, sometimes you need to make more than one myAction.async. The pro is, it is quite explicit and in your control without too much ExecutionContext/ThreadLocal magic.
Here is some very rough sample code as a starter, for the manual solution:
def logErrorAndRethrow(myrequest:MyRequest, x:Throwable): Nothing = {
//log your error here in the format you like
throw x //you can do this or handle errors how you like
}
class MyRequest {
val attr : mutable.Map[String, String] = new mutable.HashMap[String, String]()
}
//make this a util to inject, or move it into a common parent controller
def myAsync(block: MyRequest => Future[Result] ): Action[AnyContent] = {
val myRequest = new MyRequest()
try {
Action.async(
block(myRequest).recover { case cause => logErrorAndRethrow(myRequest, cause) }
)
} catch {
case x:Throwable =>
logErrorAndRethrow(myRequest, x)
}
}
//the method your Route file refers to
def getStuff = myAsync { request:MyRequest =>
//execute your code here, passing around request as an implicit
Future.successful(Results.Ok)
}

Actor-based webservice - How to do it properly?

In the past few months, me and my colleagues have successfully built a server-side system for dispatching push notifications to iPhone devices. Basically, a user registers for these notifications via a RESTful webservice (Spray-Server, recently updated to use Spray-can as the HTTP layer), and the logic schedules one or multiple messages for dispatch in the future, using Akka's scheduler.
This system, as we built it, simply works: it can handle hundreds, maybe even thousands of HTTP requests a second, and can send out notifications at a rate of 23,000 per second - possibly even more if we reduce log output, add multiple notification sender actors (and thus more connections with Apple), and there might be some optimization to be done in the Java library we use (java-apns).
This question is about how to do it Right(tm). My colleague, much more knowledgeable about Scala and actor-based systems in general, noted how the application isn't a 'pure' actor-based system - and he's right. What I'm wondering now is how to do it Right.
At the moment, we have a single Spray HttpService actor, not subclassed, that is initialized with a set of directives that outlines our HTTP service logic. Currently, very much simplified, we have directives like this:
post {
content(as[SomeBusinessObject]) { businessObject => request =>
// store the business object in a MongoDB back-end and wait for the ID to be
// returned; we want to send this back to the user.
val businessObjectId = persister !! new PersistSchedule(businessObject)
request.complete("/businessObject/%s".format(businessObjectId))
}
}
Now, if I get this right, 'waiting for a response' from an actor is a no-no in actor-based programming (plus the !! is deprecated). What I believe is the 'correct' way to do it is to pass the request object over to the persister actor in a message, and have it call request.complete as soon as it's received a generated ID from the back-end.
I have rewritten one of the routes in my application to do just this; in the message that is sent to the actor, the request object / reference is also sent. This seems to work like it's supposed to:
content(as[SomeBusinessObject]) { businessObject => request =>
persister ! new PersistSchedule(request, businessObject)
}
My main concern here is that we seem to pass the request object to the 'business logic', in this case the persister. The persister now gets additional responsibility, i.e. call request.complete, and knowledge about what system it runs in, i.e. that it's part of a webservice.
What would be the correct way to handle a situation like this, so that the persister actor becomes unaware of it being part of a http service, and doesn't need to know how to output the generated ID?
I'm thinking that the request should still be passed to the persister actor, but instead of the persister actor calling request.complete, it sends a message back to the HttpService actor (a SchedulePersisted(request, businessObjectId) message), which simply calls request.complete("/businessObject/%s".format(businessObjectId)). Basically:
def receive = {
case SchedulePersisted(request, businessObjectId) =>
request.complete("/businessObject/%s".format(businessObjectId))
}
val directives = post {
content(as[SomeBusinessObject]) { businessObject => request =>
persister ! new PersistSchedule(request, businessObject)
}
}
Am I on the right track with this approach?
A smaller secondary spray-server specific question, is it okay to subclass HttpService and override the receive method, or will I break things that way? (I have no clue about subclassing actors, or how to pass unrecognized messages to the 'parent' actor)
Final question, is passing the request object / reference around in actor messages that may pass throughout the entire application an okay approach, or is there a better way to 'remember' what request should be sent a response after flowing the request through the application?
In regards to your first question, yes, you are on the right track. (Although I would also like to see some alternative ways to handle this sort of issue).
One suggestion I have is to insulate the persister actor from knowing about requests at all. You can pass the request as an Any type. Your matcher in your service code can automagically cast the cookie back into a Request.
case class SchedulePersisted(businessObjectId: String, cookie: Any)
// in your actor
override def receive = super.receive orElse {
case SchedulePersisted(businessObjectId, request: Request) =>
request.complete("/businessObject/%s".format(businessObjectId))
}
In regards to your second question, actor classes are really no different than regular classes. But you do need to make sure you call the superclass's receive method, so that it can handle its own messages. I had some other ways of doing this in my original answer, but I think I prefer chaining partial functions like this:
class SpecialHttpService extends HttpService {
override def receive = super.receive orElse {
case SpecialMessage(x) =>
// handle special message
}
}
You could also use the produce directive. It allows you to decouple the actual marshalling from the request completion:
get {
produce(instanceOf[Person]) { personCompleter =>
databaseActor ! ShowPersonJob(personCompleter)
}
}
The produce directive in this example extracts a function Person => Unit that you can use to complete the request transparently deep within the business logic layer, which should not be aware of spray.
https://github.com/spray/spray/wiki/Marshalling-Unmarshalling