Actor-based webservice - How to do it properly? - scala

In the past few months, my colleagues and I have successfully built a server-side system for dispatching push notifications to iPhone devices. Basically, a user registers for these notifications via a RESTful webservice (Spray-Server, recently updated to use Spray-can as the HTTP layer), and the logic schedules one or multiple messages for dispatch in the future, using Akka's scheduler.
This system, as we built it, simply works: it can handle hundreds, maybe even thousands of HTTP requests a second, and can send out notifications at a rate of 23,000 per second - possibly even more if we reduce log output, add multiple notification sender actors (and thus more connections with Apple), and optimize the Java library we use (java-apns).
This question is about how to do it Right(tm). My colleague, much more knowledgeable about Scala and actor-based systems in general, noted how the application isn't a 'pure' actor-based system - and he's right. What I'm wondering now is how to do it Right.
At the moment, we have a single Spray HttpService actor, not subclassed, that is initialized with a set of directives that outlines our HTTP service logic. Currently, very much simplified, we have directives like this:
post {
  content(as[SomeBusinessObject]) { businessObject => request =>
    // store the business object in a MongoDB back-end and wait for the ID to be
    // returned; we want to send this back to the user.
    val businessObjectId = persister !! new PersistSchedule(businessObject)
    request.complete("/businessObject/%s".format(businessObjectId))
  }
}
Now, if I get this right, 'waiting for a response' from an actor is a no-no in actor-based programming (plus the !! is deprecated). What I believe is the 'correct' way to do it is to pass the request object over to the persister actor in a message, and have it call request.complete as soon as it's received a generated ID from the back-end.
I have rewritten one of the routes in my application to do just this; in the message that is sent to the actor, the request object / reference is also sent. This seems to work like it's supposed to:
content(as[SomeBusinessObject]) { businessObject => request =>
  persister ! new PersistSchedule(request, businessObject)
}
My main concern here is that we seem to pass the request object to the 'business logic', in this case the persister. The persister now gets additional responsibility, i.e. call request.complete, and knowledge about what system it runs in, i.e. that it's part of a webservice.
What would be the correct way to handle a situation like this, so that the persister actor becomes unaware of it being part of a http service, and doesn't need to know how to output the generated ID?
I'm thinking that the request should still be passed to the persister actor, but instead of the persister actor calling request.complete, it sends a message back to the HttpService actor (a SchedulePersisted(request, businessObjectId) message), which simply calls request.complete("/businessObject/%s".format(businessObjectId)). Basically:
def receive = {
  case SchedulePersisted(request, businessObjectId) =>
    request.complete("/businessObject/%s".format(businessObjectId))
}
val directives = post {
  content(as[SomeBusinessObject]) { businessObject => request =>
    persister ! new PersistSchedule(request, businessObject)
  }
}
Am I on the right track with this approach?
A smaller, secondary, spray-server-specific question: is it okay to subclass HttpService and override the receive method, or will I break things that way? (I have no clue about subclassing actors, or how to pass unrecognized messages to the 'parent' actor.)
Final question: is passing the request object / reference around in actor messages that may pass throughout the entire application an okay approach, or is there a better way to 'remember' which request should be sent a response after the request has flowed through the application?

In regards to your first question, yes, you are on the right track. (Although I would also like to see some alternative ways to handle this sort of issue).
One suggestion I have is to insulate the persister actor from knowing about requests at all. You can pass the request as an opaque "cookie" of type Any, which the persister simply echoes back in its reply. The matcher in your service code can then automagically cast the cookie back into a Request.
case class SchedulePersisted(businessObjectId: String, cookie: Any)
// in your actor
override def receive = super.receive orElse {
  case SchedulePersisted(businessObjectId, request: Request) =>
    request.complete("/businessObject/%s".format(businessObjectId))
}
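The opaque-cookie idea can be sketched in plain Scala, without actors, so that it stays self-contained. `FakeRequest`, `Persister.persist`, `HttpLayer.handleReply` and the way the ID is generated are hypothetical stand-ins for illustration, not spray or Akka APIs; the `PersistSchedule` shape with a cookie field is also an assumption.

```scala
// Hypothetical stand-in for the HTTP request object; not a spray type.
final class FakeRequest {
  var response: Option[String] = None
  def complete(body: String): Unit = response = Some(body)
}

// Messages exchanged between the HTTP layer and the persister.
final case class PersistSchedule(businessObject: String, cookie: Any)
final case class SchedulePersisted(businessObjectId: String, cookie: Any)

object Persister {
  // The persister never inspects the cookie; it only echoes it back.
  // The ID scheme here is made up purely for the example.
  def persist(msg: PersistSchedule): SchedulePersisted =
    SchedulePersisted("id-" + msg.businessObject.length, msg.cookie)
}

object HttpLayer {
  // Only the HTTP layer knows that the cookie is really a request.
  def handleReply(reply: Any): Unit = reply match {
    case SchedulePersisted(id, request: FakeRequest) =>
      request.complete("/businessObject/%s".format(id))
    case _ => () // not a reply this layer understands
  }
}
```

The typed pattern `request: FakeRequest` is what performs the cast: a reply whose cookie is of some other type simply falls through to the default case.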
In regards to your second question, actor classes are really no different than regular classes. But you do need to make sure you call the superclass's receive method, so that it can handle its own messages. I had some other ways of doing this in my original answer, but I think I prefer chaining partial functions like this:
class SpecialHttpService extends HttpService {
  override def receive = super.receive orElse {
    case SpecialMessage(x) =>
      // handle special message
  }
}
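To see what `orElse` does in isolation, here is a small sketch using only the standard library. The names are made up, and the handlers return a String (an actor's receive returns Unit) purely so the result can be inspected.

```scala
object ReceiveChaining {
  // Assumption for the sketch: handlers return a String so results are observable.
  type Handler = PartialFunction[Any, String]

  // Plays the role of the superclass's receive.
  val baseReceive: Handler = {
    case n: Int => s"base handled $n"
  }

  // Extra cases, consulted only when baseReceive is not defined for the message.
  val specialReceive: Handler = {
    case s: String => s"special handled $s"
  }

  // The chained handler: try the base cases first, then the special ones.
  val combined: Handler = baseReceive orElse specialReceive
}
```

Messages that neither handler matches remain undefined for the combined function (`combined.isDefinedAt(2.0)` is false), which is how unrecognized messages keep flowing to whatever default handling exists.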

You could also use the produce directive. It allows you to decouple the actual marshalling from the request completion:
get {
  produce(instanceOf[Person]) { personCompleter =>
    databaseActor ! ShowPersonJob(personCompleter)
  }
}
The produce directive in this example extracts a function Person => Unit that you can use to complete the request transparently deep within the business logic layer, which should not be aware of spray.
https://github.com/spray/spray/wiki/Marshalling-Unmarshalling
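The decoupling that a `Person => Unit` completer gives you can be sketched without spray at all. In the sketch below, `PersonStore`, `HttpSide.completerFor` and the text rendering are hypothetical; in a real service the function would come from the produce directive rather than being built by hand.

```scala
final case class Person(name: String)

// Business logic: knows about Person, knows nothing about HTTP.
object PersonStore {
  def showPerson(id: Int, complete: Person => Unit): Unit =
    complete(Person("person-" + id))
}

object HttpSide {
  // Hypothetical stand-in for "marshal the value and complete the request".
  def completerFor(sink: StringBuilder): Person => Unit =
    person => {
      sink.append("Person: " + person.name)
      ()
    }
}
```

The business layer only ever sees a function, so it can be tested by passing any `Person => Unit`, for example one that appends to a buffer.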

Related

How should I get this value through DDD to async code in Akka HTTP

I'm trying to write an Akka HTTP webservice using domain-driven design but I'm struggling to pass technical data received by the webservice to the code doing the work inside a Future, namely a correlationId sent by the client to my webservice.
My understanding of DDD is that as a choice of implementation, the correlationId shouldn't appear in the domain:
package domain
trait MyRepository {
  def startWork(): Future[Unit]
}
To make it available to my implementation code without it being a parameter of the method, I'm thinking of using thread-local storage like org.slf4j.MDC or a ThreadLocal. However, I don't know if that would work or if several calls to the webservice would be handled by the same thread and overwrite the value.
Here is the implementation:
package infra
class MyRepository(implicit executor: ExecutionContextExecutor) extends domain.MyRepository {
  override def startWork(): Future[Unit] = {
    Future {
      val correlationId = ??? // MDC.get("correlationId") ?
      log(s"The correlationId is $correlationId")
    }
  }
}
And the route in my webservice:
val repo = new infra.MyRepository()
val route = path("my" / "path") {
  post {
    parameter('correlationId) { correlationId =>
      ??? // MDC.put("correlationId", correlationId) ?
      onComplete(repo.startWork()) { _ => // onComplete hands the Try result to its inner route
        complete(HttpResponse(StatusCodes.OK))
      }
    }
  }
}
My question is twofold:
1) Is my general design sound and good DDD?
2) Would using org.slf4j.MDC or a ThreadLocal work or is there a better / more Akka-friendly way to implement it?
Thread-locals (including MDC, though a Lightbend subscription includes tooling to propagate the MDC alongside messages and futures) are generally a poor idea in Akka, because there is no guarantee that a given task (a Future in this case, or an actor handling a sent message) will execute on the same thread that requested it. In the specific case where the task performs likely-blocking interactions with an external service or DB (as the use of Future {} implies), you pretty much don't want it to run on the requesting thread anyway. Further, even if the task does end up executing on the thread that requested it, it's somewhat unlikely that no other task which could have mutated the MDC/thread-local executed there in the meantime.
I myself don't see a problem with passing the correlation ID as an argument to startWork: you've already effectively exposed it by passing it through the HTTP endpoint.
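A small standard-library sketch of the difference, assuming a dedicated single-thread pool so that the Future body is guaranteed to run on a thread other than the caller's. The object and method names are made up for illustration.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object CorrelationDemo {
  // A plain thread-local; its value is null on every thread until set there.
  private val current = new ThreadLocal[String]

  // Reads the thread-local on whichever thread runs the Future body.
  def viaThreadLocal()(implicit ec: ExecutionContext): Future[Option[String]] =
    Future(Option(current.get()))

  // The ID is captured by the closure, so the executing thread does not matter.
  def viaParameter(correlationId: String)(implicit ec: ExecutionContext): Future[Option[String]] =
    Future(Option(correlationId))

  // Returns (value seen through the thread-local, value seen through the parameter).
  def run(): (Option[String], Option[String]) = {
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      current.set("abc-123") // set on the calling thread only
      val lost = Await.result(viaThreadLocal(), 5.seconds)
      val kept = Await.result(viaParameter("abc-123"), 5.seconds)
      (lost, kept)
    } finally {
      current.remove()
      pool.shutdown()
    }
  }
}
```

Under these assumptions the thread-local read comes back empty while the explicitly passed ID survives, which is the argument for making the correlation ID a parameter (or part of an explicit context object).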

How to avoid saving state in Scala class?

I am currently writing a client application in Scala which makes HTTP requests to an API. In this client application I have implemented a service connector which encapsulates all API-related logic. Before making API calls, I want to authenticate the user, but I want to abstract this process away. This means that my actor would only call the service connector to initiate the API call, something like this:
class MessageProcessor(credentials: Credentials) extends Actor with ActorLogging {
  implicit val ec = context.dispatcher
  override def receive: Receive = {
    case sendMsg: SendMessage =>
      log.info(s"Sending message ${sendMsg.body}.")
      sendMessage(sendMsg)
  }
  def sendMessage(msg: SendMessage) = {
    ServiceConnector.sendMessage(credentials, msg).map { result =>
      // My result
    }
  }
}
object MessageProcessor {
  def props(credentials: Credentials) = Props(classOf[MessageProcessor], credentials)
}
In my service connector, I want to store the JWT token "the Scala way" and, if I am not yet authenticated, send an authentication request before making the actual API call.
How can I code such a service in an immutable manner with Futures in mind?
I thought about creating additional actors and just sending messages around with the token, but is this really necessary?
You need multiple actor states (receive behaviours) to do it the proper "Scala way". I'm not completely sure how your API works, but the following example shows a basic approach. In its first state, the actor authenticates before sending the message. Once the authentication is confirmed, it sends the message. All following messages are sent immediately. If the authentication is lost somehow, you can also add a logout or timeout case that switches back to the first receive state.
class MessageProcessor(credentials: Credentials) extends Actor with ActorLogging {
  implicit val ec = context.dispatcher
  override def receive: Receive = {
    case sendMsg: SendMessage =>
      log.info(s"Authenticating...")
      sender ! NotAuthenticated(credentials) // or authenticate somehow
      context become waitingAuthentication(sendMsg)
  }
  def waitingAuthentication(sendMsg: SendMessage): Receive = {
    case _: AuthenticationConfirmation =>
      log.info(s"Sending message ${sendMsg.body}.")
      sendMessage(sendMsg)
      context become authenticated
  }
  def authenticated: Receive = {
    case sendMsg: SendMessage =>
      log.info(s"Sending message ${sendMsg.body}.")
      sendMessage(sendMsg)
  }
}
It's just an example and doesn't consider all cases (e.g., a SendMessage arriving during waitingAuthentication, for which you would need a queue of SendMessages). If multiple actors need to know the authentication state or the credentials, you need to broadcast to them if you don't want a bottleneck actor that handles and verifies all messages. In that case, all of them would also need multiple authentication states as described above.
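The queueing case can be sketched as a pure state transition, independent of Akka. All names below (`AuthFlow`, `step`, the `Effect` values) are hypothetical; inside an actor, each returned state would correspond to a `context become` call and each effect to a message send or API call.

```scala
object AuthFlow {
  final case class SendMessage(body: String)

  // The three states from the example, with a queue added to the middle one.
  sealed trait State
  case object Unauthenticated extends State
  final case class Authenticating(pending: Vector[SendMessage]) extends State
  final case class Authenticated(token: String) extends State

  sealed trait Input
  final case class Send(msg: SendMessage) extends Input
  final case class AuthConfirmed(token: String) extends Input

  // What the caller should do as a result of a transition.
  sealed trait Effect
  case object RequestToken extends Effect
  final case class Deliver(msg: SendMessage, token: String) extends Effect

  // Pure transition: current state + input => next state + effects to run.
  def step(state: State, input: Input): (State, Vector[Effect]) = (state, input) match {
    case (Unauthenticated, Send(m)) =>
      (Authenticating(Vector(m)), Vector(RequestToken))
    case (Authenticating(queue), Send(m)) =>
      (Authenticating(queue :+ m), Vector.empty) // queue until the token arrives
    case (Authenticating(queue), AuthConfirmed(token)) =>
      (Authenticated(token), queue.map(Deliver(_, token))) // flush in arrival order
    case (Authenticated(token), Send(m)) =>
      (state, Vector(Deliver(m, token)))
    case (_, AuthConfirmed(_)) =>
      (state, Vector.empty) // unexpected confirmation, ignored
  }
}
```

Because the token lives only inside the `Authenticated` state value, nothing is mutated; an actor holding the current `State` in a single field (or in its behaviour) gets the same effect.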
Actors Are Designed For State
This seems like a false exercise. The entire point of Actors is that they can warehouse state. From the documentation specifically on state:
The good news is that Akka actors conceptually each have their own
light-weight thread, which is completely shielded from the rest of the
system. This means that instead of having to synchronize access using
locks you can just write your actor code without worrying about
concurrency at all.
Therefore, "the scala way" is just to keep it in a variable within Actors.
One of the reasons to use Actors instead of Futures is so you can maintain state. Why choose Actors and then dismiss one of their primary advantages???
Your question is the equivalent of "how do I use HashMap without doing any hashing (in a scala way)?"

Lightweight eventing Plain Futures or Akka

I have a few use cases, as follows.
1) A createUser API call is made via the front end. Once this call succeeds, meaning the data is saved to the db successfully, success is returned to the front end. The API contract between front end and backend ends there.
2) Now the backend needs to generate and fire a CreateUser event, which creates the user in a third-party app (for the sake of example, say it creates the user in an external entitlement system). This is a fully asynchronous background process: the client is neither aware of it nor waiting for its success or failure. But all calls to this CreateUser event must be logged, along with their failure or success, for auditing and remediation (in case of failure) purposes.
The first approach is to design Future-based async APIs for these async events (the rest of the app uses Futures and async heavily), and to log incoming events and the success/failure of the result into the db.
The second approach is to use Akka and have an individual actor for each of these events (CreateUser being one example), which may look something like:
class CreateUserActor extends Actor {
  def receive = {
    case CreateUserEvent(user, role) =>
      val originalSender = sender
      val res = Future {
        blocking {
          //persist CreateUserEvent to db
          SomeService.createUser(user, role)
        }
      }
      res onComplete {
        case Success(u) => //persist success to db
        case Failure(e) => //persist failure to db
      }
  }
}
The third approach is to use Akka Persistence, so that persisting events happens out of the box with event-sourced journaling. However, persisting the event's success or failure would still be manual (we write code for it). Though this third approach may look promising, it may not pay off: we would be relying on Akka Persistence for persisting events, the second requirement of persisting success/failure is still manual, and we would have to maintain one more storage (the persisted journal etc.), so I'm not sure we're buying much here.
The second approach requires writing persistence code for both cases (incoming events and the results of the events).
The first approach may not look very promising.
I didn't intend to create a question that sounds opinion-based; I'm trying to gather genuine advice, with pros and cons, on the approaches mentioned or on anything else that may fit well here.
FYI: this particular application is a Play application running on a Play server, so using actors isn't an issue.
Since this is a Play application you could use the Akka event stream to publish events without needing a reference to the backend worker actor.
For example, with the following in actors/Subscriber.scala:
package actors
import akka.actor.Actor
import model._
class Subscriber extends Actor {
  context.system.eventStream.subscribe(self, classOf[DomainEvent])
  def receive = {
    case event: DomainEvent =>
      println("Received DomainEvent: " + event)
  }
}
... and something like this in model/events.scala:
package model
trait DomainEvent
case class TestEvent(message: String) extends DomainEvent
... your controller could publish a TestEvent like this:
object Application extends Controller {
  import akka.actor.Props
  import play.libs.Akka
  Akka.system.actorOf(Props(classOf[actors.Subscriber])) // Create the backend actor
  def index = Action {
    Akka.system.eventStream.publish(model.TestEvent("message")) // publish an event
    Ok(views.html.index("Hi!"))
  }
}
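For comparison, the Future-based approach from the question (log the incoming event, run the work, log the outcome) can be sketched with the standard library alone. `AuditedEvents`, `audited` and the in-memory `auditLog` are hypothetical; a real implementation would write to the audit table instead of a buffer.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}
import scala.util.control.NonFatal

object AuditedEvents {
  // In-memory stand-in for the audit table (assumption for the sketch).
  val auditLog: ArrayBuffer[String] = ArrayBuffer.empty[String]

  private def record(line: String): Unit = auditLog.synchronized {
    auditLog += line
    ()
  }

  // Records the incoming event, runs the work, then records its outcome.
  def audited[A](eventName: String)(work: => Future[A])(implicit ec: ExecutionContext): Future[A] = {
    record(s"$eventName received")
    val result: Future[A] =
      try work
      catch { case NonFatal(e) => Future.failed(e) } // work threw before returning a Future
    result.andThen {
      case Success(_) => record(s"$eventName succeeded")
      case Failure(e) => record(s"$eventName failed: ${e.getMessage}")
    }
  }
}
```

A call would look like `audited("CreateUser")(callEntitlementSystem(user, role))`, where `callEntitlementSystem` is whatever Future-returning call the application already has. The returned Future completes only after the outcome has been recorded, so the caller can still chain remediation on failure.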

Spray.io - delegate processing to another actor(s)

I am implementing a REST service using the Spray.io framework. The service must receive "search" queries, process them, and send the results back to the client(s). The code that performs the search lives in a separate actor, SearchActor, so after receiving a (JSON) query from the user, I forward it (using the ask pattern) to my SearchActor. What I don't really understand is how I should implement the interaction between the spray.io route actor and my SearchActor.
I see several variants here, but which one is more correct, and why?
1) Create one instance of SearchActor at startup and send every request to this actor
2) For every request create a new instance of SearchActor
3) Create a pool of SearchActor actors at startup and send requests to this pool
You're not forced to use the ask pattern. In fact, each ask allocates a temporary internal actor plus a timeout for the request, and this is probably not what you want. I would recommend that you use tell instead. You do this by spawning a new actor for each request (far less expensive than a thread) that has the RequestContext as one of its constructor fields. You will use this context to send the response back, typically with its complete method.
Example code.
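import akka.actor.{Actor, Props}
import spray.routing.{HttpService, RequestContext}
// Search and Result are your own message types; (un)marshallers for them are assumed to be in scope.
class RESTActor extends Actor with HttpService {
  def actorRefFactory = context
  val route = (path("mypath") & post) {
    entity(as[Search]) { search => ctx =>
      // one short-lived actor per request, holding the RequestContext
      context.actorOf(Props(new SearchActor(ctx))) ! search
    }
  }
  def receive = runRoute(route)
}
class SearchActor(ctx: RequestContext) extends Actor {
  def receive = {
    case msg: Search => //... search process
    case msg: Result =>
      ctx.complete(msg) // sends back reply
      context.stop(self) // this request is done
  }
}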
Variant #1 is out of the question after the initial implementation: you will want to scale out, so a single blocking actor is bad.
Variants #2 and #3 are not very different: creating a new actor is cheap and has minimal overhead. As your actors may die often (e.g. when the backend is not available), I would say that #2 is the way to go.
A concrete implementation idea is shown at http://techblog.net-a-porter.com/2013/12/ask-tell-and-per-request-actors/

Any way of appending to the act method in scala?

First off, I am new to Scala:
I am writing a logging facility in Scala that will simply be a class that extends the Actor class. This way a user can just extend this class and the logging features will be available. Basically, I want to write to a log file every time an actor that extends this class sends or receives a message. For clarification every actor will have its own log file which can be collated later. I am taking a Lamport clocks style approach to ordering the events by having each Actor (who extends this class) have their own time variable that gets updated on a message send-receive and the actor will compare the current time variable (simply a positive integer) with the sender's and update its time variable with the greater of the two.
For now I chose to make it a simple method like
sendMessage(recipient, message)
for sending messages. This just logs to the file that the actor is about to send a message to X.
Now, the part that I am stumped on is doing logging when receiving messages. When an actor gets a message I simply want to log this event in a format like
myLogFile.writeLine(self.toString+": Got a message from "+X+" at time: "+messageSendTime+", processed the message at "+Math.max(myCurrTime+1, messageSendTime+1))
However I need to know who sent this message, unless I force upon the user to include this info (namely the sender's name, time variable, etc) in the messages themselves, it gets hard(er). Is there any way to get the reference of the actual sender? I want this to work with remote actors as well. The only way I can think of is if I append to the act method that the user defines in his/her class with some extra case statements like:
def act {
  case => // the user's case statements
  ...
  //somehow I append these statements to the end for the Logger class's use
  case (LoggerClassRegisterInboundMessage, message, timeStamp) =>
    InboundMessagesMap.put(timeStamp, message)
}
By having this functionality I can do all the logging "behind the scenes" with these hidden messages being sent whenever the user sends a message. However this only works if the sender also uses the Logging facility. So a more general question is: is there a way in Scala to get the name/toString of a sender in Scala regardless of the sender's class?
I'm actually OK with going with the assumption that every class that sends messages will extend the Logger class. So if anyone knows how to append to the act like or something similar to the above example I will be equally grateful!
As it was said in the comments, Akka is the way to go. It's so much more powerful than the current Scala Actor API which will become deprecated with 2.10 anyway.
But, to attack your specific problem, you could create a trait for actors which support logging, in a way similar to this (I don't know if this actually works, but you can try it):
trait LoggingActor extends Actor {
  override def receive[R](pf: PartialFunction[Any, R]): R = {
    //we are appending to the partial function pf a case to handle messages with logging:
    val loggingPf = pf orElse {
      case (LoggerClassRegisterInboundMessage, message, timeStamp) => {
        //do something with this log message.
        message //returning the unwrapped result afterwards
      }
    }
    super.receive(loggingPf)
  }
  //overriding the send as well
  override def !(msg: Any): Unit = {
    //Wrap it in a logging message (sent as a single tuple)
    super.!((LoggerClassRegisterInboundMessage, msg, getTimestamp()))
  }
}
And you would create your actors with something like this:
val myActor = new MyActor with LoggingActor
Hope it helps!
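The Lamport-style time update described in the question is independent of the actor library and can be sketched on its own. `Lamport.Clock` and its method names are hypothetical; the receive rule is the question's `max(myCurrTime+1, messageSendTime+1)`, and counting a send as an event that advances the clock is the usual convention, assumed here.

```scala
object Lamport {
  // Immutable logical clock; `time` is the positive-integer counter from the question.
  final case class Clock(time: Long = 0L) {
    // Sending counts as an event: advance the clock, then stamp the outgoing message.
    def send: (Clock, Long) = {
      val next = Clock(time + 1)
      (next, next.time)
    }

    // Receiving: take the larger of the local and the sender's time, then advance by one.
    def receive(messageSendTime: Long): Clock =
      Clock(math.max(time, messageSendTime) + 1)
  }
}
```

Each actor would keep one `Clock`, attach the stamp returned by `send` to every outgoing message, and write both the received stamp and the result of `receive` to its own log file, so the per-actor logs can be collated later.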