I am implementing a REST service using the Spray.io framework. The service must receive "search" queries, process them, and send the results back to the client(s). The code that performs the searching is located in a separate actor, SearchActor, so after receiving a (JSON) query from the user, I re-send it (using the ask pattern) to my SearchActor. But what I don't really understand is how I should implement the interaction between the spray.io route actor and my SearchActor.
I see several variants here, but which one is more correct, and why?
1. Create one instance of SearchActor at startup and send every request to this actor.
2. For every request, create a new instance of SearchActor.
3. Create a pool of SearchActor actors at startup and send requests to this pool.
You're not forced to use the ask pattern. In fact, it allocates a temporary actor and a Future for each of your requests, and this is probably not what you want. I would recommend that you use a tell instead. You do this by spawning a new actor for each request (less expensive than a thread), with the RequestContext in one of its constructor fields. You will use this context to send the response back, typically with its complete method.
Example code:
import akka.actor.{Actor, Props}
import spray.routing.{HttpService, RequestContext}

class RESTActor extends Actor with HttpService {
  def actorRefFactory = context

  val route = path("mypath") {
    post {
      entity(as[SearchActor.Search]) { search => ctx =>
        // spawn a per-request actor that owns the RequestContext
        context.actorOf(Props(new SearchActor(ctx))) ! search
      }
    }
  }

  def receive = runRoute(route)
}

class SearchActor(ctx: RequestContext) extends Actor {
  import SearchActor._

  def receive = {
    case msg: Search => // ... perform the search, eventually producing a Result
    case msg: Result => ctx.complete(msg) // sends the reply back to the client
  }
}
Variant #1 is out of the question beyond an initial implementation - you will want to scale out, and a single blocking actor is a bottleneck.
Variants #2 and #3 are not very different - creating a new actor is cheap, with minimal overhead. Since your actors may die often (e.g., when the backend is not available), I would say that #2 is the way to go.
Concrete implementation idea is shown at http://techblog.net-a-porter.com/2013/12/ask-tell-and-per-request-actors/
I'm trying to write an Akka HTTP webservice using domain-driven design, but I'm struggling to pass technical data received by the webservice to the code doing the work inside a Future - namely, a correlationId sent by the client to my webservice.
My understanding of DDD is that as a choice of implementation, the correlationId shouldn't appear in the domain:
package domain
trait MyRepository {
def startWork(): Future[Unit]
}
To make it available to my implementation code without it being a parameter of the method, I'm thinking of using thread-local storage like org.slf4j.MDC or a ThreadLocal. However, I don't know if that would work or if several calls to the webservice would be handled by the same thread and overwrite the value.
Here is the implementation:
package infra
class MyRepository(implicit executor: ExecutionContextExecutor) extends domain.MyRepository {
override def startWork(): Future[Unit] = {
Future {
val correlationId = ??? // MDC.get("correlationId") ?
log(s"The correlationId is $correlationId")
}
}
}
And the route in my webservice:
val repo = new infra.MyRepository()
val route = path("my" / "path") {
post {
parameter('correlationId) { correlationId =>
??? // MDC.put("correlationId", correlationId) ?
onComplete(repo.startWork()) { _ =>
  complete(HttpResponse(StatusCodes.OK))
}
}
}
}
My question is twofold:
Is my general design sound and good DDD?
Would using org.slf4j.MDC or a ThreadLocal work or is there a better / more Akka-friendly way to implement it?
Thread-locals (including the MDC, though a Lightbend subscription includes tooling to propagate the MDC alongside messages and futures) are generally a poor idea in Akka, because it's generally not guaranteed that a given task (a Future in this case, or the actor handling a sent message) will execute on the same thread as the thread that requested the task. In the specific case where that task performs likely-blocking interactions with an external service or DB (implied by the use of Future {}), you pretty much don't want that to happen anyway. Further, even if the task does end up executing on the same thread that requested it, it's rather unlikely that no other task which could have mutated the MDC/thread-local executed in the meantime.
I myself don't see a problem with passing the correlation ID as an argument to startWork: you've already effectively exposed it by passing it through the HTTP endpoint.
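A minimal sketch of that alternative, reusing the question's names (the widened signature is the only change being proposed):

package domain

import scala.concurrent.Future

trait MyRepository {
  // the correlationId is now an explicit argument instead of ambient thread-local state
  def startWork(correlationId: String): Future[Unit]
}

The route then simply threads the query parameter through: onComplete(repo.startWork(correlationId)) { _ => complete(HttpResponse(StatusCodes.OK)) }.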
I am currently writing a client application in Scala which makes HTTP requests to an API. In this client application I have implemented a service connector that encapsulates all API-related logic. Before making API calls, I want to authenticate the user, but I want to abstract this process away, meaning my actor would only call the service connector to initiate the API call, something like this:
class MessageProcessor(credentials: Credentials) extends Actor with ActorLogging {
implicit val ec = context.dispatcher
override def receive: Receive = {
case sendMsg: SendMessage =>
log.info(s"Sending message ${sendMsg.body}.")
sendMessage(sendMsg)
}
def sendMessage(msg: SendMessage) = {
ServiceConnector.sendMessage(credentials, msg).map { result =>
// My result
}
}
}
object MessageProcessor {
def props(credentials: Credentials) = Props(classOf[MessageProcessor], credentials)
}
In my service connector, I want to somehow save the JWT token "the Scala way", and if I am not yet authenticated, send an authentication request before making the actual API call.
How can I code such a service in an immutable manner with Futures in mind?
I thought about creating additional actors and just sending messages around with the token, but is this really necessary?
You need multiple receive states in your actor (via context.become) to do this the proper "Scala way". I'm not completely sure how your API works, but the following example shows the basic approach. In its first state, it authenticates before sending the message. Once the authentication is confirmed, it sends the pending message. All subsequent messages are sent immediately. If the authentication is lost somehow, you can also add a logout or timeout case that switches back to the first receive state.
class MessageProcessor(credentials: Credentials) extends Actor with ActorLogging {
implicit val ec = context.dispatcher
override def receive: Receive = {
case sendMsg: SendMessage =>
log.info(s"Authenticating...")
sender ! NotAuthenticated(credentials) // or authenticate somehow
context become waitingAuthentication(sendMsg)
}
def waitingAuthentication(sendMsg: SendMessage): Receive = {
case _: AuthenticationConfirmation =>
log.info(s"Sending message ${sendMsg.body}.")
sendMessage(sendMsg)
context become authenticated
}
def authenticated: Receive = {
case sendMsg: SendMessage =>
log.info(s"Sending message ${sendMsg.body}.")
sendMessage(sendMsg)
}
}
It's just an example and doesn't cover all cases (e.g., a SendMessage arriving during waitingAuthentication - you would need a queue of SendMessages for that; see the Stash sketch below). If multiple actors need to know the authentication state or the credentials, you need to broadcast to them, unless you want a bottleneck actor that handles and verifies all messages. In that case, each of them would also need the multiple authentication states described above.
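A minimal sketch of that queueing, using Akka's built-in Stash to buffer messages that arrive before authentication is confirmed (the message types are assumed from the question; the authentication call itself is elided):

import akka.actor.{Actor, ActorLogging, Stash}

class MessageProcessor(credentials: Credentials) extends Actor with ActorLogging with Stash {
  override def receive: Receive = {
    case sendMsg: SendMessage =>
      log.info("Authenticating...")
      // trigger authentication somehow, then wait for the confirmation
      stash()                          // keep this message for later
      context become waitingAuthentication
  }

  def waitingAuthentication: Receive = {
    case _: AuthenticationConfirmation =>
      unstashAll()                     // replay every stashed SendMessage
      context become authenticated
    case _: SendMessage => stash()     // buffer messages arriving in the meantime
  }

  def authenticated: Receive = {
    case sendMsg: SendMessage => // send it immediately, as in the question's sendMessage
  }
}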
Actors Are Designed For State
This seems like a false exercise. The entire point of Actors is that they can warehouse state. From the documentation specifically on state:
The good news is that Akka actors conceptually each have their own light-weight thread, which is completely shielded from the rest of the system. This means that instead of having to synchronize access using locks you can just write your actor code without worrying about concurrency at all.
Therefore, "the scala way" is just to keep it in a variable within Actors.
One of the reasons to use Actors instead of Futures is so you can maintain state. Why choose Actors and then dismiss one of their primary advantages???
Your question is the equivalent of "how do I use HashMap without doing any hashing (in a scala way)?"
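For completeness, a minimal sketch of that variable-in-an-actor idea (TokenAcquired is a hypothetical message type; SendMessage is the question's message; the API calls themselves are elided):

import akka.actor.Actor

case class TokenAcquired(jwt: String) // hypothetical message carrying the JWT

class ServiceConnectorActor extends Actor {
  private var token: Option[String] = None // mutable state, safe inside the actor

  def receive = {
    case TokenAcquired(jwt) =>
      token = Some(jwt)
    case msg: SendMessage =>
      token match {
        case Some(jwt) => // call the API with the cached JWT
        case None      => // authenticate first, then handle msg
      }
  }
}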
I have a few use cases, as follows:
1) A createUser API call is made via the front end. Once this call succeeds, meaning the data is saved to the db successfully, return success to the front end. The API contract between front end and backend ends there.
2) Now the backend needs to generate and fire a CreateUser event, which creates the user in a third-party app (for the sake of example, say it creates the user in an external entitlement system). This is a fully asynchronous, background-type process; the client is neither aware of it nor waiting for this call's success or failure. But all of these CreateUser events must be logged, along with their success or failure, for auditing and remediation (in case of failure) purposes.
The first approach is to design Future-based async APIs for these async events (the rest of the app uses Futures and async code heavily), and log incoming events and the success/failure of each result to the db.
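A rough sketch of what that first approach could look like (AuditLog and its methods are hypothetical names; User, Role, and SomeService.createUser are assumed from the snippet below):

import scala.concurrent.{ExecutionContext, Future, blocking}
import scala.util.{Failure, Success}

// hypothetical audit sink; the requirement only says "persist to db"
trait AuditLog {
  def event(user: User, role: Role): Unit
  def success(user: User): Unit
  def failure(user: User, e: Throwable): Unit
}

def handleCreateUser(user: User, role: Role, audit: AuditLog)
                    (implicit ec: ExecutionContext): Future[Unit] = {
  audit.event(user, role)                              // log the incoming event
  Future(blocking(SomeService.createUser(user, role))) // fire the background call
    .andThen {
      case Success(_) => audit.success(user)           // persist success to db
      case Failure(e) => audit.failure(user, e)        // persist failure to db
    }
    .map(_ => ())
}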
The second approach is to use Akka and have an individual actor for each of these events (e.g., CreateUser is one example), which may look something like:
import akka.actor.Actor
import scala.concurrent.{Future, blocking}
import scala.util.{Failure, Success}

class CreateUserActor extends Actor {
  import context.dispatcher // ExecutionContext for the Future

  def receive = {
    case CreateUserEvent(user, role) =>
      val originalSender = sender() // capture before the Future runs
      val res = Future {
        blocking {
          // persist CreateUserEvent to db
          SomeService.createUser(user, role)
        }
      }
      res onComplete {
        case Success(u) => // persist success to db
        case Failure(e) => // persist failure to db
      }
  }
}
The third approach is to use Akka Persistence, so persisting the events happens out of the box via event-sourcing journaling. However, persisting each event's success or failure is still manual (we'd write code for it). Though this third approach may look promising, it may not pay off well: we'd be relying on Akka Persistence only for persisting the incoming events, the second requirement (persisting each event's success/failure) remains manual, and we'd now have to maintain one more store (the persisted journal, etc.), so I'm not sure we're buying much here.
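For reference, a minimal sketch of what the third approach's journaling could look like (the persistenceId and event handling are illustrative only, not a full design):

import akka.persistence.PersistentActor

class CreateUserEventActor extends PersistentActor {
  override def persistenceId: String = "create-user-events"

  override def receiveCommand: Receive = {
    case evt: CreateUserEvent =>
      persist(evt) { e =>
        // the incoming event is journaled out of the box at this point;
        // persisting the downstream call's success or failure is still manual
      }
  }

  override def receiveRecover: Receive = {
    case _: CreateUserEvent => // rebuild state on recovery if needed
  }
}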
The second approach requires writing persisting code for both cases (incoming events and the results of the events).
The first approach may not look very promising.
Although it may sound like it, I didn't intend to create an "opinion-based" question; I'm genuinely looking for advice, with pros and cons, on the mentioned approaches, or anything else that may fit well here.
FYI: this particular application is a Play application running on a Play server, so using actors isn't an issue.
Since this is a Play application you could use the Akka event stream to publish events without needing a reference to the backend worker actor.
For example, with the following in actors/Subscriber.scala:
package actors
import akka.actor.Actor
import model._
class Subscriber extends Actor {
context.system.eventStream.subscribe(self, classOf[DomainEvent])
def receive = {
case event: DomainEvent =>
println("Received DomainEvent: " + event)
}
}
... and something like this in model/events.scala:
package model
trait DomainEvent
case class TestEvent(message: String) extends DomainEvent
... your controller could publish a TestEvent like this:
object Application extends Controller {
  import akka.actor.Props
  import play.api.Play.current
  import play.api.libs.concurrent.Akka

  Akka.system.actorOf(Props(classOf[actors.Subscriber])) // create the backend actor

  def index = Action {
    Akka.system.eventStream.publish(model.TestEvent("message")) // publish an event
    Ok(views.html.index("Hi!"))
  }
}
In the past few months, my colleagues and I have successfully built a server-side system for dispatching push notifications to iPhone devices. Basically, a user registers for these notifications via a RESTful webservice (Spray-Server, recently updated to use Spray-can as the HTTP layer), and the logic schedules one or more messages for dispatch in the future, using Akka's scheduler.
This system, as we built it, simply works: it can handle hundreds, maybe even thousands, of HTTP requests a second, and can send out notifications at a rate of 23,000 per second - possibly even more if we reduce log output or add multiple notification sender actors (and thus more connections with Apple); there may also be some optimization to be done in the Java library we use (java-apns).
This question is about how to do it Right(tm). My colleague, much more knowledgeable about Scala and actor-based systems in general, noted how the application isn't a 'pure' actor-based system - and he's right. What I'm wondering now is how to do it Right.
At the moment, we have a single Spray HttpService actor, not subclassed, that is initialized with a set of directives that outlines our HTTP service logic. Currently, very much simplified, we have directives like this:
post {
content(as[SomeBusinessObject]) { businessObject => request =>
// store the business object in a MongoDB back-end and wait for the ID to be
// returned; we want to send this back to the user.
val businessObjectId = persister !! new PersistSchedule(businessObject)
request.complete("/businessObject/%s".format(businessObjectId))
}
}
Now, if I get this right, 'waiting for a response' from an actor is a no-no in actor-based programming (plus the !! is deprecated). What I believe is the 'correct' way to do it is to pass the request object over to the persister actor in a message, and have it call request.complete as soon as it's received a generated ID from the back-end.
I have rewritten one of the routes in my application to do just this; in the message that is sent to the actor, the request object / reference is also sent. This seems to work like it's supposed to:
content(as[SomeBusinessObject]) { businessObject => request =>
persister ! new PersistSchedule(request, businessObject)
}
My main concern here is that we seem to pass the request object to the 'business logic', in this case the persister. The persister now gets additional responsibility, i.e. call request.complete, and knowledge about what system it runs in, i.e. that it's part of a webservice.
What would be the correct way to handle a situation like this, so that the persister actor becomes unaware of it being part of a http service, and doesn't need to know how to output the generated ID?
I'm thinking that the request should still be passed to the persister actor, but instead of the persister actor calling request.complete, it sends a message back to the HttpService actor (a SchedulePersisted(request, businessObjectId) message), which simply calls request.complete("/businessObject/%s".format(businessObjectId)). Basically:
def receive = {
case SchedulePersisted(request, businessObjectId) =>
request.complete("/businessObject/%s".format(businessObjectId))
}
val directives = post {
content(as[SomeBusinessObject]) { businessObject => request =>
persister ! new PersistSchedule(request, businessObject)
}
}
Am I on the right track with this approach?
A smaller, secondary, spray-server-specific question: is it okay to subclass HttpService and override the receive method, or will I break things that way? (I have no clue about subclassing actors, or how to pass unrecognized messages to the 'parent' actor.)
Final question: is passing the request object/reference around in actor messages that may pass throughout the entire application an okay approach, or is there a better way to 'remember' which request should receive a response as it flows through the application?
In regards to your first question, yes, you are on the right track. (Although I would also like to see some alternative ways to handle this sort of issue).
One suggestion I have is to insulate the persister actor from knowing about requests at all. You can pass the request as an Any type. Your matcher in your service code can automagically cast the cookie back into a Request.
case class SchedulePersisted(businessObjectId: String, cookie: Any)
// in your actor
override def receive = super.receive orElse {
case SchedulePersisted(businessObjectId, request: Request) =>
request.complete("/businessObject/%s".format(businessObjectId))
}
In regards to your second question, actor classes are really no different than regular classes. But you do need to make sure you call the superclass's receive method, so that it can handle its own messages. I had some other ways of doing this in my original answer, but I think I prefer chaining partial functions like this:
class SpecialHttpService extends HttpService {
override def receive = super.receive orElse {
case SpecialMessage(x) =>
// handle special message
}
}
You could also use the produce directive. It allows you to decouple the actual marshalling from the request completion:
get {
produce(instanceOf[Person]) { personCompleter =>
databaseActor ! ShowPersonJob(personCompleter)
}
}
The produce directive in this example extracts a function Person => Unit that you can use to complete the request transparently deep within the business logic layer, which should not be aware of spray.
https://github.com/spray/spray/wiki/Marshalling-Unmarshalling
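The actor on the business side then only sees a plain function to call. A hedged sketch (the case-class shape of ShowPersonJob and the lookup helper are my assumptions, not from the wiki page):

import akka.actor.Actor

case class ShowPersonJob(completer: Person => Unit) // shape assumed from the snippet above

class DatabaseActor extends Actor {
  def receive = {
    case ShowPersonJob(completer) =>
      val person = lookupPerson() // hypothetical database lookup
      completer(person)           // completes the HTTP request without touching any spray types
  }

  private def lookupPerson(): Person = ??? // stand-in for the real database call
}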
I implemented in Java what I call a "foldable queue": a LinkedBlockingQueue used by an ExecutorService. The idea is that each task has a unique id; if a task with the same id is already in the queue when another task is submitted, the new task is not added. The Java code looks like this:
public final class FoldablePricingQueue extends LinkedBlockingQueue<Runnable> {
@Override
public boolean offer(final Runnable runnable) {
if (contains(runnable)) {
return true; // rejected, but true not to throw an exception
} else {
return super.offer(runnable);
}
}
}
Threads have to be pre-started, but this is a minor detail. I have an abstract class that implements Runnable and takes a unique id... this id is the one used by the contains check.
I would like to implement the same logic using Scala and Akka (Actors).
I would need to have access to the mailbox, and I think I would need to override the ! method and check the mailbox for the event... has anyone done this before?
This is essentially how the Akka dispatcher already works: a given actor's mailbox can only exist once in the dispatcher's task queue at any time.
Look at:
https://github.com/jboner/akka/blob/master/akka-actor/src/main/scala/akka/dispatch/Dispatcher.scala#L143
https://github.com/jboner/akka/blob/master/akka-actor/src/main/scala/akka/dispatch/Dispatcher.scala#L198
This is implemented very cheaply using an atomic boolean, so there is no need to traverse the queue (a sketch of the idea follows below).
Also, by the way, your Java queue is broken, since it doesn't override put, add, or offer(E, long, TimeUnit).
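A simplified illustration of that scheduling trick (a sketch of the idea only, not Akka's actual code):

import java.util.concurrent.{ConcurrentLinkedQueue, Executor}
import java.util.concurrent.atomic.AtomicBoolean

// The mailbox schedules itself on the executor at most once at a time,
// so duplicate scheduling is prevented without traversing any queue.
final class IdempotentMailbox(executor: Executor, process: Any => Unit) extends Runnable {
  private val messages  = new ConcurrentLinkedQueue[Any]
  private val scheduled = new AtomicBoolean(false)

  def enqueue(msg: Any): Unit = {
    messages.offer(msg)
    if (scheduled.compareAndSet(false, true)) executor.execute(this)
  }

  override def run(): Unit = {
    var msg = messages.poll()
    while (msg != null) { process(msg); msg = messages.poll() }
    scheduled.set(false)
    // a message may have arrived after the last poll; re-schedule if so
    if (!messages.isEmpty && scheduled.compareAndSet(false, true)) executor.execute(this)
  }
}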
Maybe you could do this with two actors: a facade and a worker. Clients send jobs to the facade. The facade forwards them to the worker and remembers them in its internal state, a Set of queuedJobs. When it receives a job that is already queued, it simply discards it. Each time the worker starts processing a job (or completes it, whichever suits you), it sends a StartingOn(job) message to the facade, which removes the job from queuedJobs.
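A rough sketch of that facade (Job and StartingOn are assumed message shapes):

import akka.actor.{Actor, ActorRef}

case class Job(id: String) // carries the unique task id
case class StartingOn(job: Job)

class Facade(worker: ActorRef) extends Actor {
  private var queuedJobs = Set.empty[String]

  def receive = {
    case job: Job if queuedJobs(job.id) => // already queued: fold it away
    case job: Job =>
      queuedJobs += job.id
      worker forward job
    case StartingOn(job) =>                // worker picked it up; allow re-submission
      queuedJobs -= job.id
  }
}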
The proposed design doesn't make sense. The closest thing to a Runnable would be an actor. Sure, you can keep track of them in a list, and not add one if it is already there. Such lists are kept by routing actors, which can be built from the ready-made parts provided by Akka, or from a basic actor using the forward method.
You can't look into another actor's mailbox, and overriding ! makes no sense. What you do is you send all your messages to a routing actor, and that routing actor forwards them to a proper destination.
Naturally, since it receives these messages, it can do any logic at that point.