How should I get this value through DDD to async code in Akka HTTP

I'm trying to write an Akka HTTP webservice using domain-driven design but I'm struggling to pass technical data received by the webservice to the code doing the work inside a Future, namely a correlationId sent by the client to my webservice.
My understanding of DDD is that as a choice of implementation, the correlationId shouldn't appear in the domain:
package domain

import scala.concurrent.Future

trait MyRepository {
  def startWork(): Future[Unit]
}
To make it available to my implementation code without it being a parameter of the method, I'm thinking of using thread-local storage like org.slf4j.MDC or a ThreadLocal. However, I don't know if that would work or if several calls to the webservice would be handled by the same thread and overwrite the value.
Here is the implementation:
package infra

import scala.concurrent.{ExecutionContextExecutor, Future}

class MyRepository(implicit executor: ExecutionContextExecutor) extends domain.MyRepository {
  override def startWork(): Future[Unit] = {
    Future {
      val correlationId = ??? // MDC.get("correlationId") ?
      log(s"The correlationId is $correlationId")
    }
  }
}
And the route in my webservice:
val repo = new infra.MyRepository()
val route = path("my" / "path") {
  post {
    parameter('correlationId) { correlationId =>
      ??? // MDC.put("correlationId", correlationId) ?
      onComplete(repo.startWork()) { _ =>
        // onComplete passes the Try result to the inner route; it is ignored here
        complete(HttpResponse(StatusCodes.OK))
      }
    }
  }
}
My question is twofold:
Is my general design sound and good DDD?
Would using org.slf4j.MDC or a ThreadLocal work or is there a better / more Akka-friendly way to implement it?

Thread-locals are generally a poor idea in Akka. That includes MDC, although a Lightbend subscription includes tooling to propagate the MDC alongside messages and futures. There is generally no guarantee that a given task (a Future in this case, or an actor handling a sent message) will execute on the same thread that requested it. In the specific case where the task performs likely-blocking interactions with an external service or DB (implied by the use of Future {}), you pretty much don't want it to run on the requesting thread. Further, even if the task does end up executing on the thread that requested it, it's somewhat unlikely that no other task which could have mutated the MDC/thread-local will have executed in the meantime.
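To make the first failure mode concrete, here is a minimal sketch in plain Scala (no Akka, no SLF4J). It assumes a bare java.lang.ThreadLocal as a stand-in for the MDC; ThreadLocalDemo and readFromOtherThread are hypothetical names used only for this illustration.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ThreadLocalDemo {
  // Assumption: a plain ThreadLocal standing in for MDC-style storage.
  private val correlationId = new ThreadLocal[String]

  // Sets the value on the calling thread, then reads it inside a Future
  // that runs on a separate single-thread pool.
  def readFromOtherThread(id: String): Option[String] = {
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      correlationId.set(id)                          // written on the caller's thread
      val seen = Future(Option(correlationId.get())) // read on the pool's thread
      Await.result(seen, 5.seconds)
    } finally {
      correlationId.remove()
      pool.shutdown()
    }
  }
}
```

Calling ThreadLocalDemo.readFromOtherThread("abc-123") returns None: the pool thread has its own, empty copy of the thread-local, so the value set by the caller is not visible inside the Future.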
I myself don't see a problem with passing the correlation ID as an argument to startWork: you've already effectively exposed it by passing it through the HTTP endpoint.
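A minimal sketch of the explicit-argument version, using only the Scala standard library. The names ExplicitCorrelationId, LoggingRepository and demo are hypothetical, and the method returns the log line as a String (instead of Unit) purely so the result can be inspected.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ExplicitCorrelationId {
  // Domain-side contract: the correlation ID is an ordinary argument.
  trait MyRepository {
    def startWork(correlationId: String): Future[String]
  }

  // Infrastructure-side implementation (hypothetical name).
  final class LoggingRepository()(implicit ec: ExecutionContext) extends MyRepository {
    override def startWork(correlationId: String): Future[String] =
      Future {
        // The closure captures the argument, so the value is the same
        // regardless of which thread ends up running this block.
        s"The correlationId is $correlationId"
      }
  }

  // Small driver that blocks for the result; for illustration only.
  def demo(id: String): String = {
    import ExecutionContext.Implicits.global
    Await.result(new LoggingRepository().startWork(id), 5.seconds)
  }
}
```

Because the ID travels inside the closure rather than in thread-bound storage, concurrent requests cannot overwrite each other's value.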

Related

How to avoid saving state in Scala class?

I am currently writing a client application in Scala which makes HTTP requests to an API. In this client application I have implemented a service connector which encapsulates all API-related logic. Before making API calls, I want to authenticate the user, but I want to abstract this process away. This means that my actor would only call the service connector to initiate the API call, something like this:
class MessageProcessor(credentials: Credentials) extends Actor with ActorLogging {
  implicit val ec = context.dispatcher

  override def receive: Receive = {
    case sendMsg: SendMessage =>
      log.info(s"Sending message ${sendMsg.body}.")
      sendMessage(sendMsg)
  }

  def sendMessage(msg: SendMessage) = {
    ServiceConnector.sendMessage(credentials, msg).map { result =>
      // My result
    }
  }
}

object MessageProcessor {
  def props(credentials: Credentials) = Props(classOf[MessageProcessor], credentials)
}
In my service connector, I want to somehow save "the Scala way" the JWT token and if I am not yet authenticated, send an authentication request before making the actual API call.
How can I code such a service in an immutable manner with Futures in mind?
I thought about creating additional actors and just sending messages around with the token, but is this really necessary?

You need multiple Akka states to do it the proper "Scala way". I'm not completely sure how your API works, but the following example shows a basic approach. In its first state, it authenticates before sending the message. Once the authentication is confirmed, it sends the message. All following messages are immediately sent. If the authentication is lost somehow, you can also add a logout or timeout case that switches back to the first receive state.
class MessageProcessor(credentials: Credentials) extends Actor with ActorLogging {
  implicit val ec = context.dispatcher

  override def receive: Receive = {
    case sendMsg: SendMessage =>
      log.info("Authenticating...")
      sender ! NotAuthenticated(credentials) // or authenticate somehow
      context become waitingAuthentication(sendMsg)
  }

  def waitingAuthentication(sendMsg: SendMessage): Receive = {
    case _: AuthenticationConfirmation =>
      log.info(s"Sending message ${sendMsg.body}.")
      sendMessage(sendMsg)
      context become authenticated
  }

  def authenticated: Receive = {
    case sendMsg: SendMessage =>
      log.info(s"Sending message ${sendMsg.body}.")
      sendMessage(sendMsg)
  }
}
It's just an example and doesn't consider all cases (e.g., a SendMessage during waitingAuthentication, therefore you would need a queue of SendMessages). If multiple actors need to know the authentication state or the credentials, you need to broadcast to them if you don't want a bottleneck actor that handles and verifies all messages. In that case, all of them would also need multiple authentication states as described above.
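The missing queue can be sketched as a pure state machine in plain Scala, without Akka, to show the transitions only. Everything here (AuthStateMachine, Input, Effect, step, run) is a hypothetical illustration, not part of the code above; inside a real actor, Akka's Stash trait is another way to hold messages back until authentication completes.

```scala
object AuthStateMachine {
  final case class SendMessage(body: String)

  // Inputs the processor can receive.
  sealed trait Input
  final case class Send(msg: SendMessage) extends Input
  case object AuthenticationConfirmed extends Input

  // Side effects the processor asks its environment to perform.
  sealed trait Effect
  case object RequestAuthentication extends Effect
  final case class Deliver(msg: SendMessage) extends Effect

  // The three states mirror receive, waitingAuthentication and authenticated.
  sealed trait State
  case object Unauthenticated extends State
  final case class WaitingAuthentication(pending: Vector[SendMessage]) extends State
  case object Authenticated extends State

  def step(state: State, input: Input): (State, Vector[Effect]) =
    (state, input) match {
      case (Unauthenticated, Send(m)) =>
        (WaitingAuthentication(Vector(m)), Vector(RequestAuthentication))
      case (WaitingAuthentication(q), Send(m)) =>
        (WaitingAuthentication(q :+ m), Vector.empty) // queue instead of dropping
      case (WaitingAuthentication(q), AuthenticationConfirmed) =>
        (Authenticated, q.map(Deliver(_)))            // flush in arrival order
      case (Authenticated, Send(m)) =>
        (Authenticated, Vector(Deliver(m)))
      case (other, _) =>
        (other, Vector.empty)                         // ignore unexpected input
    }

  // Folds a sequence of inputs, collecting every effect that was requested.
  def run(inputs: Seq[Input]): (State, Vector[Effect]) =
    inputs.foldLeft((Unauthenticated: State, Vector.empty[Effect])) {
      case ((s, acc), in) =>
        val (next, effects) = step(s, in)
        (next, acc ++ effects)
    }
}
```

Feeding it two sends, a confirmation and a third send yields one authentication request followed by the three deliveries in order, so nothing sent during waitingAuthentication is lost.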

Actors Are Designed For State
This seems like a false exercise. The entire point of Actors is that they can warehouse state. From the documentation specifically on state:
The good news is that Akka actors conceptually each have their own
light-weight thread, which is completely shielded from the rest of the
system. This means that instead of having to synchronize access using
locks you can just write your actor code without worrying about
concurrency at all.
Therefore, "the Scala way" is just to keep it in a variable within the actor.
One of the reasons to use Actors instead of Futures is so you can maintain state. Why choose Actors and then dismiss one of their primary advantages?
Your question is the equivalent of "how do I use a HashMap without doing any hashing (in a Scala way)?"

Lightweight eventing Plain Futures or Akka

I have a few use cases, as follows.
1) A createUser API call is made from the front end. Once this call succeeds, meaning the data has been saved to the db successfully, success is returned to the front end. The API contract between front end and backend ends there.
2) The backend now needs to generate and fire a CreateUser event, which creates the user in a third-party app (for the sake of example, say it creates the user in an external entitlement system). This is a fully asynchronous background process: the client is neither aware of it nor waiting for its success or failure. But all calls to this CreateUser event must be logged, along with their failure or success, for auditing and remediation (in case of failure) purposes.
The first approach is to design Future-based async APIs for these async events (the rest of the app uses Futures and async heavily), and to log incoming events and the success/failure of each result to the db.
The second approach is to use Akka and have an individual actor for each of these events (CreateUser is one example), which may look something like this:
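import akka.actor.Actor
import scala.concurrent.{Future, blocking}
import scala.util.{Failure, Success}

class CreateUserActor extends Actor {
  import context.dispatcher // ExecutionContext for Future and onComplete

  def receive = {
    case CreateUserEvent(user, role) =>
      val originalSender = sender
      val res = Future {
        blocking {
          //persist CreateUserEvent to db
          SomeService.createUser(user, role)
        }
      }
      res onComplete {
        case Success(u) => //persist success to db
        case Failure(e) => //persist failure to db
      }
  }
}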
The third approach is to use Akka Persistence, so that persisting events happens out of the box through event-sourcing journaling. However, the second persistence step, recording each event's success or failure, would still be manual (I'd have to write code for it). Although this third approach may look promising, it may not pay off: we would rely on Akka Persistence for persisting events, the second requirement of persisting each event's success/failure would still be manual, and we would have one more store to maintain (the persisted journal etc.), so I'm not sure we're buying much here.
The second approach will require writing persistence code for both cases (incoming events and the results of the events).
The first approach may not look very promising.
Although it may sound like it, I didn't intend to ask an opinion-based question; I'm looking for genuine advice, with pros and cons, on the approaches mentioned or on anything else that may fit well here.
FYI: This particular application is a play application running on a play server so using Actors isn't an issue.

Since this is a Play application you could use the Akka event stream to publish events without needing a reference to the backend worker actor.
For example, with the following in actors/Subscriber.scala:
package actors

import akka.actor.Actor
import model._

class Subscriber extends Actor {
  context.system.eventStream.subscribe(self, classOf[DomainEvent])

  def receive = {
    case event: DomainEvent =>
      println("Received DomainEvent: " + event)
  }
}
... and something like this in model/events.scala:
package model

trait DomainEvent

case class TestEvent(message: String) extends DomainEvent
... your controller could publish a TestEvent like this:
object Application extends Controller {
  import akka.actor.Props
  import play.libs.Akka

  Akka.system.actorOf(Props(classOf[actors.Subscriber])) // Create the backend actor

  def index = Action {
    Akka.system.eventStream.publish(model.TestEvent("message")) // publish an event
    Ok(views.html.index("Hi!"))
  }
}
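For comparison, the Future-only first approach from the question can be sketched with the standard library alone. This is a rough illustration under stated assumptions: AuditLog is an in-memory stand-in for the audit table, and AuditedEvents, fire and the entry types are hypothetical names.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}
import scala.util.control.NonFatal

object AuditedEvents {
  sealed trait AuditEntry
  final case class Received(event: String) extends AuditEntry
  final case class Succeeded(event: String) extends AuditEntry
  final case class Failed(event: String, reason: String) extends AuditEntry

  // Assumption: in-memory stand-in; a real version would write to the database.
  final class AuditLog {
    private var entries = Vector.empty[AuditEntry]
    def record(entry: AuditEntry): Unit = synchronized { entries = entries :+ entry }
    def all: Vector[AuditEntry] = synchronized { entries }
  }

  // Records the incoming event, runs the call, then records its outcome.
  // The returned Future keeps the original result and completes after the
  // outcome has been recorded.
  def fire(event: String, log: AuditLog)(call: => Future[Unit])(
      implicit ec: ExecutionContext): Future[Unit] = {
    log.record(Received(event))
    // Turn a synchronous exception from `call` into a failed Future.
    val started: Future[Unit] =
      try call catch { case NonFatal(e) => Future.failed(e) }
    started.transform { result =>
      result match {
        case Success(_) => log.record(Succeeded(event))
        case Failure(e) => log.record(Failed(event, String.valueOf(e.getMessage)))
      }
      result
    }
  }
}
```

A controller could call fire after the database write succeeds and return to the client without waiting on the resulting Future; the trade-off against the actor-based approaches is that retries and supervision would have to be written by hand.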

Spray.io - delegate processing to another actor(s)

I am implementing a REST service using the Spray.io framework. The service must receive "search" queries, process them and send the results back to the client(s). The code that performs the search lives in a separate actor, SearchActor, so after receiving a (JSON) query from the user, I forward it (using the ask pattern) to my SearchActor. What I don't really understand is how I should implement the interaction between the spray.io route actor and my SearchActor.
I see several variants here, but which one is more correct, and why?
1. Create one instance of SearchActor at startup and send every request to this actor
2. For every request, create a new instance of SearchActor
3. Create a pool of SearchActor actors at startup and send requests to this pool

You're not forced to use the ask pattern. In fact, every ask allocates a temporary internal actor (not a thread) plus a timeout to receive the reply, and this is probably not what you want. I would recommend that you use a tell instead. You do this by spawning a new actor for each request (less expensive than a thread) that has the RequestContext as one of its constructor fields. You will use this context to send the response back, typically with its complete method.
Example code:
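import akka.actor.{Actor, Props}
import spray.routing.{HttpService, RequestContext}

class RESTActor extends Actor with HttpService {
  def actorRefFactory = context
  def receive = runRoute(route)

  val route = path("mypath") {
    post {
      entity(as[SearchActor.Search]) { search => ctx =>
        // one short-lived actor per request, created with the RequestContext
        actorRefFactory.actorOf(Props(new SearchActor(ctx))) ! search
      }
    }
  }
}

// Search and Result are assumed to be defined in the SearchActor companion
// object, with a marshaller for Result in scope.
class SearchActor(ctx: RequestContext) extends Actor {
  import SearchActor._
  def receive = {
    case msg: Search => //... search process
    case msg: Result => ctx.complete(msg) // sends back reply
  }
}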

Variant #1 is out of the question beyond the initial implementation: you will want to scale out, so a single blocking actor is bad.
Variants #2 and #3 are not very different: creating a new actor is cheap and has minimal overhead. As your actors may die often (e.g. when the backend is not available), I would say that #2 is the way to go.
Concrete implementation idea is shown at http://techblog.net-a-porter.com/2013/12/ask-tell-and-per-request-actors/

My http request becomes null inside an Akka future

My server application uses Scalatra, with json4s, and Akka.
Most of the requests it receives are POSTs, and they return immediately to the client with a fixed response. The actual responses are sent asynchronously to a server socket at the client. To do this, I need to getRemoteAddr from the http request. I am trying with the following code:
case class MyJsonParams(foo: String, bar: Int)

class MyServices extends ScalatraServlet {
  implicit val formats = DefaultFormats

  post("/test") {
    withJsonFuture[MyJsonParams] { params =>
      // code that calls request.getRemoteAddr goes here
      // sometimes request is null and I get an exception
      println(request)
    }
  }

  def withJsonFuture[A](closure: A => Unit)(implicit mf: Manifest[A]) = {
    contentType = "text/json"
    val params: A = parse(request.body).extract[A]
    future {
      closure(params)
    }
    Ok("""{"result":"OK"}""")
  }
}
The intention of the withJsonFuture function is to move some boilerplate out of my route processing.
This sometimes works (prints a non-null value for request) and sometimes request is null, which I find quite puzzling. I suspect that I must be "closing over" the request in my future. However, the error also happens with controlled test scenarios when there are no other requests going on. I would imagine request to be immutable (maybe I'm wrong?)
In an attempt to solve the issue, I have changed my code to the following:
case class MyJsonParams(foo: String, bar: Int)

class MyServices extends ScalatraServlet {
  implicit val formats = DefaultFormats

  post("/test") {
    withJsonFuture[MyJsonParams] { (addr, params) =>
      println(addr)
    }
  }

  def withJsonFuture[A](closure: (String, A) => Unit)(implicit mf: Manifest[A]) = {
    contentType = "text/json"
    val addr = request.getRemoteAddr()
    val params: A = parse(request.body).extract[A]
    future {
      closure(addr, params)
    }
    Ok("""{"result":"OK"}""")
  }
}
This seems to work. However, I really don't know if it still includes any bad concurrency-related programming practice that could cause an error in the future ("future" meant in its most common sense = what lies ahead :).

Scalatra is not so well suited for asynchronous code. I recently stumbled on the very same problem as you.
The problem is that Scalatra tries to make the code as declarative as possible by exposing a DSL that removes as much fuss as possible, and in particular does not require you to explicitly pass data around.
I'll try to explain.
In your example, the code inside post("/test") is an anonymous function. Notice that it does not take any parameter, not even the current request object.
Instead, Scalatra stores the current request object inside a thread-local value just before it calls your handler, and you can then get it back through ScalatraServlet.request.
This is the classical Dynamic Scope pattern. It has the advantage that you can write many utility methods that access the current request and call them from your handlers, without explicitly passing the request.
Now, the problem comes when you use asynchronous code, as you do.
In your case, the closure passed to withJsonFuture executes on a different thread from the one the handler was originally called on (it executes on a thread from the ExecutionContext's thread pool).
Thus when accessing the thread local, you are accessing a totally distinct instance of the thread local variable.
Simply put, the classical Dynamic Scope pattern is no fit in an asynchronous context.
The solution here is to capture the request at the very start of your handler, and then exclusively reference that:
post("/test") {
  val currentRequest = request // captured on the handler's own thread
  withJsonFuture[MyJsonParams] { params =>
    // use currentRequest (e.g. currentRequest.getRemoteAddr) here,
    // never the thread-local request
    println(currentRequest)
  }
}
Quite frankly, this is too easy to get wrong IMHO, so I would personally avoid using Scalatra altogether if you are in an asynchronous context.

I don't know Scalatra, but it's fishy that you are accessing a value called request that you do not define yourself. My guess is that it is coming as part of extending ScalatraServlet. If that's the case, then it's probably mutable state that is being set (by Scalatra) at the start of the request and then nullified at the end. If that's happening, then your workaround is okay, as would be assigning request to another val, like val myRequest = request, before the future block and then accessing it as myRequest inside the future and closure.

I do not know Scalatra, but at first glance the withJsonFuture function returns an OK and also schedules work on another thread via the future { closure(addr, params) } call.
If that work runs after the OK has been processed, the response has already been sent and the request has been closed/GCed.
Why create a Future to run your closure?
If withJsonFuture needs to return a Future (again, sorry, I do not know Scalatra), you should wrap the whole body of that function in a Future.

Try to put with FutureSupport on your class declaration like this
class MyServices extends ScalatraServlet with FutureSupport {}

Actor-based webservice - How to do it properly?

In the past few months, my colleagues and I have successfully built a server-side system for dispatching push notifications to iPhone devices. Basically, a user registers for these notifications via a RESTful webservice (Spray-Server, recently updated to use Spray-can as the HTTP layer), and the logic schedules one or multiple messages for dispatch in the future, using Akka's scheduler.
This system, as we built it, simply works: it can handle hundreds, maybe even thousands of HTTP requests a second, and can send out notifications at a rate of 23,000 per second - possibly even more if we reduce log output, add multiple notification sender actors (and thus more connections with Apple), and there might be some optimization to be done in the Java library we use (java-apns).
This question is about how to do it Right(tm). My colleague, much more knowledgeable about Scala and actor-based systems in general, noted how the application isn't a 'pure' actor-based system - and he's right. What I'm wondering now is how to do it Right.
At the moment, we have a single Spray HttpService actor, not subclassed, that is initialized with a set of directives that outlines our HTTP service logic. Currently, very much simplified, we have directives like this:
post {
  content(as[SomeBusinessObject]) { businessObject => request =>
    // store the business object in a MongoDB back-end and wait for the ID to be
    // returned; we want to send this back to the user.
    val businessObjectId = persister !! new PersistSchedule(businessObject)
    request.complete("/businessObject/%s".format(businessObjectId))
  }
}
Now, if I get this right, 'waiting for a response' from an actor is a no-no in actor-based programming (plus the !! is deprecated). What I believe is the 'correct' way to do it is to pass the request object over to the persister actor in a message, and have it call request.complete as soon as it's received a generated ID from the back-end.
I have rewritten one of the routes in my application to do just this; in the message that is sent to the actor, the request object / reference is also sent. This seems to work like it's supposed to:
content(as[SomeBusinessObject]) { businessObject => request =>
  persister ! new PersistSchedule(request, businessObject)
}
My main concern here is that we seem to pass the request object to the 'business logic', in this case the persister. The persister now gets additional responsibility, i.e. call request.complete, and knowledge about what system it runs in, i.e. that it's part of a webservice.
What would be the correct way to handle a situation like this, so that the persister actor becomes unaware of it being part of a http service, and doesn't need to know how to output the generated ID?
I'm thinking that the request should still be passed to the persister actor, but instead of the persister actor calling request.complete, it sends a message back to the HttpService actor (a SchedulePersisted(request, businessObjectId) message), which simply calls request.complete("/businessObject/%s".format(businessObjectId)). Basically:
def receive = {
  case SchedulePersisted(request, businessObjectId) =>
    request.complete("/businessObject/%s".format(businessObjectId))
}

val directives = post {
  content(as[SomeBusinessObject]) { businessObject => request =>
    persister ! new PersistSchedule(request, businessObject)
  }
}
Am I on the right track with this approach?
A smaller secondary spray-server specific question, is it okay to subclass HttpService and override the receive method, or will I break things that way? (I have no clue about subclassing actors, or how to pass unrecognized messages to the 'parent' actor)
Final question, is passing the request object / reference around in actor messages that may pass throughout the entire application an okay approach, or is there a better way to 'remember' what request should be sent a response after flowing the request through the application?

In regards to your first question, yes, you are on the right track. (Although I would also like to see some alternative ways to handle this sort of issue).
One suggestion I have is to insulate the persister actor from knowing about requests at all. You can pass the request as an opaque "cookie" of type Any, which the persister simply hands back with its reply. The matcher in your service code can then automagically cast the cookie back into a Request.
case class SchedulePersisted(businessObjectId: String, cookie: Any)

// in your actor
override def receive = super.receive orElse {
  case SchedulePersisted(businessObjectId, request: Request) =>
    request.complete("/businessObject/%s".format(businessObjectId))
}
In regards to your second question, actor classes are really no different than regular classes. But you do need to make sure you call the superclass's receive method, so that it can handle its own messages. I had some other ways of doing this in my original answer, but I think I prefer chaining partial functions like this:
class SpecialHttpService extends HttpService {
  override def receive = super.receive orElse {
    case SpecialMessage(x) =>
      // handle special message
  }
}

You could also use the produce directive. It allows you to decouple the actual marshalling from the request completion:
get {
  produce(instanceOf[Person]) { personCompleter =>
    databaseActor ! ShowPersonJob(personCompleter)
  }
}
The produce directive in this example extracts a function Person => Unit that you can use to complete the request transparently deep within the business logic layer, which should not be aware of spray.
https://github.com/spray/spray/wiki/Marshalling-Unmarshalling
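The decoupling idea behind this can be sketched in plain Scala, independent of spray: the business layer receives only a Person => Unit callback. CompleterSketch, handle and demo are hypothetical names, and handle merely stands in for whatever databaseActor would do with the job.

```scala
object CompleterSketch {
  final case class Person(name: String)

  // The job carries only a callback; nothing here mentions HTTP or spray.
  final case class ShowPersonJob(complete: Person => Unit)

  // Hypothetical business-layer handler standing in for databaseActor.
  def handle(job: ShowPersonJob): Unit =
    job.complete(Person("Ada"))

  // The web layer decides what "completing" means; here it just stores the value.
  def demo(): Option[Person] = {
    var response: Option[Person] = None
    handle(ShowPersonJob(p => response = Some(p)))
    response
  }
}
```

The handler can be exercised in a test with any function of the right shape, which is the main benefit of passing a completer instead of the request itself.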