Akka tracking when the actors finished - scala

I have following code that traverses through a list of people and calls a callback for each of them in class1.
def syncPeople(callback: Person => _) = Future {
person.findAll(criteria).foldLeft(0L) { (count, obj) =>
callback(obj)
count + 1
}
}
Callback and the call to syncPeople is in class2 and looks similar to this
def getActor(person: Person):ActorRef = {
if(person.isMale) maleActor
else femaleActor
}
def process(person: Person): Unit = {
val workActor = getActor(person)
workActor ! person
} //The actor does the actual work and may be quite intense
def syncPeople(process)
Now, I want to track the total time taken to sync all people. ie when the last workActor completes the work. I am using a third Actor: MonitorActor to keep track of start and end times. The MaleActor, FemaleActor can send messages to this when they process an individual
Whats the best way to keep track of this spawned processes?
I explored
Future.sequence // but the class sending the workActor the message is not an actor. so the future does not receive the message
keeping track of personIds when they finish, but without using a var, to accumulate the received messages in MonitorActor its not possible implement this.. and using var is not preferred way of doing things
What could be other ways of implementing this

Funny, I'm working on a very similar problem to this at the moment. The solution I would suggest is using akka-fsm which keeps track of state.
Essentially in something outside of your state object, do something like generate a Long that represents an id:
def getId(): Long = System.currentTimeMillis() / 1000L
The state object when implemented correctly is immutable, so you just keep reusing this id throughout the transaction.
I know this answer is missing a lot of the implementation details but I'm still working on the implementation myself in my own code. Hopefully after reading about akka-fsm a bit and playing with it, this answer will make sense?

Don't demonize mutable state, it's SHARED mutable state, which causes the most of the problems. You don't have shared mutable state inside an actor, because you always talk to actorRefs and the actors process only one message at a time (no race conditions and other evil stuff). What I'm saying is, it's ok to use a var (unless you spawn some futures inside the actor, which mutate the var, because then your are back to SHARING mutable state). FSM is another solution as #devnulled suggested, but it sounds more like an overkill for your use case.

Related

Continuous state events with State Monad

Consider the following state machine:
class AudioRecorder {
private var currentState = Idle
def buttonTapped() = currentState match {
case Idle =>
currentState = Recording
// start recording...
case Recording =>
currentState = Stopped
// stop recording...
}
}
It works but is stateful and ugly.
Unfortunately, I have many situations where I need to deal with this kind of state machine with continuous events, especially in UI setting.
It seems that State Monad is a solution to this but so far through my learning, it is only useful when you can actually layout all the sequences of state events upfront so that you can connect them all through flatMap but not when the state event is continuous and non-deterministic as in this case (when user taps) - but please correct me if I'm wrong.
Although I'm not too sure if I'm asking the right question but,
Is there a better way to model this kind of state machine that changes its behavior with continuous & non-deterministic events?
I've modelled this in server-side with Actor Model using akka but I haven't seen anyone using actor model in UI setting.
Also it's worth noting that AudioRecorder cannot be re-created on every buttonTapped event to return a new instance (which could make the problem with state go away) since it retains many other states that are too expensive to be re-created on every signal.
Using FRP with rxscala you could encapsulate your mutable object and hide it.
It could look like the following (uncompiled code):
class FrpRecorder(button: Observable[Unit]) {
private val recorder = new AudioRecorder
button.subscribe(_ => recorder.buttonTapped())
val recordingLength: Observable[Double] =
interval("100ms").map(_ => recorder.recordingLength)
}

Can it be safe to share a var?

My application has a class ApplicationUsers that has no mutable members. Upon creation of instances, it reads the entire user database (relatively small) into an immutable collection. It has a number of methods to query the data.
I am now faced with the problem of having to create new users (or modify some of their attributes). My current idea is to use an Akka actor that, at a high level, would look like this:
class UserActor extends Actor{
var users = new ApplicationUsers
def receive = {
case GetUsers => sender ! users
case SomeMutableOperation => {
PerformTheChangeOnTheDatabase() // does not alter users (which is immutable)
users = new ApplicationUsers // reads the database from scratch into a new immutable instance
}
}
}
Is this safe? My reasoning is that it should be: whenever users is changed by SomeMutableOperation any other threads making use of previous instances of users already have a handle to an older version, and should not be affected. Also, any GetUsers request will not be acted upon until a new instance is not safely constructed.
Is there anything I am missing? Is my construct safe?
UPDATE: I probably should be using Agents to do this, but the question is still holds: is the above safe?
You are doing it exactly right: have immutable data types and reference them via var within the actor. This way you can freely share the data and mutability is confined to the actor. The only thing to watch out for is if you reference the var from a closure which is executed outside of the actor (e.g. in a Future transformation or a Props instance). In such a case you need to make a stack-local copy:
val currentUsers = users
other ? Process(users) recoverWith { case _ => backup ? Process(currentUsers) }
In the first case you just grab the value—which is fine—but asking the backup happens from a different thread, hence the need for val currentUsers.
Looks fine to me. You don't seem to need Agents here.

Any way of appending to the act method in scala?

First off, I am new to Scala:
I am writing a logging facility in Scala that will simply be a class that extends the Actor class. This way a user can just extend this class and the logging features will be available. Basically, I want to write to a log file every time an actor that extends this class sends or receives a message. For clarification every actor will have its own log file which can be collated later. I am taking a Lamport clocks style approach to ordering the events by having each Actor (who extends this class) have their own time variable that gets updated on a message send-receive and the actor will compare the current time variable (simply a positive integer) with the sender's and update its time variable with the greater of the two.
For now I chose to make it a simple method like
sendMessage(recipient, message)
For sending messages. This will just log to the file that the actor is going to send a message to X.
Now, the part that I am stumped on is doing logging when receiving messages. When an actor gets a message I simply want to log this event in a format like
myLogFile.writeLine(self.ToString+": Got a message from "+X+" at time: "+messageSendTime+", processed the message at" +Math.max(myCurrTime+1, messageSendTime+1))
However I need to know who sent this message, unless I force upon the user to include this info (namely the sender's name, time variable, etc) in the messages themselves, it gets hard(er). Is there any way to get the reference of the actual sender? I want this to work with remote actors as well. The only way I can think of is if I append to the act method that the user defines in his/her class with some extra case statements like:
def act {
case => // the user's case statements
...
//somehow I append these statements to the end for the Logger class's use
case (LoggerClassRegisterInboundMessage, message, timeStamp)
InboundMessagesMap.put(timeStamp, message)
}
By having this functionality I can do all the logging "behind the scenes" with these hidden messages being sent whenever the user sends a message. However this only works if the sender also uses the Logging facility. So a more general question is: is there a way in Scala to get the name/toString of a sender in Scala regardless of the sender's class?
I'm actually OK with going with the assumption that every class that sends messages will extend the Logger class. So if anyone knows how to append to the act like or something similar to the above example I will be equally grateful!
As it was said in the comments, Akka is the way to go. It's so much more powerful than the current Scala Actor API which will become deprecated with 2.10 anyway.
But, to attack your specific problem, you could create a trait for actors which support logging, in a way similar to this (I don't know if this actually works, but you can try it):
trait LoggingActor extends Actor {
override def receive[R](pf: PartialFunction[Any, R]): R = {
//we are appending to the partial function pf a case to handle messages with logging:
val loggingPf = pf orElse {
case (LoggerClassRegisterInboundMessage, message, timeStamp) => {
//do somthing with this log message.
message //returning the unwrapped result afterwards
}
}
super.receive(loggingPf)
}
//overriding the send as well
override def !(msg: Any): Unit {
//Wrap it in a logging message
super ! (LoggerClassRegisterInboundMessage, msg, getTimestamp())
}
}
And you would create your actors with something like this:
val myActor = new MyActor with LoggingActor
Hope it helps !

Akka actors + Play! 2.0 Scala Edition best practices for spawning and bundling actor instances

My app gets a new instance of Something via an API call on a stateless controller. After I do my mission critical stuff (like saving it to my Postgres database and committing the transaction) I would now like to do a bunch of fire-and-forget operations.
In my controller I send the model instance to the post-processor:
import _root_.com.eaio.uuid.UUID
import akka.actor.Props
// ... skip a bunch of code
play.api.libs.concurrent.Akka.system.actorOf(
Props[MySomethingPostprocessorActor],
name = "somethingActor"+new UUID().toString()
) ! something
The MySomethingPostprocessorActor actor looks like this:
class MySomethingPostprocessorActor extends Actor with ActorLogging {
def receive = {
case Something(thing, alpha, beta) => try {
play.api.libs.concurrent.Akka.system.actorOf(
Props[MongoActor],
name = "mongoActor"+new UUID().toString()
) ! Something(thing, alpha, beta)
play.api.libs.concurrent.Akka.system.actorOf(
Props[PubsubActor],
name = "pubsubActor"+new UUID().toString()
) ! Something(thing, alpha, beta)
// ... and so forth
} catch {
case e => {
log.error("MySomethingPostprocessorActor error=[{}]", e)
}
}
}
}
So, here's what I'm not sure about:
I know Actor factories are discouraged as per the warning on this page. My remedy for this is to name each actor instance with a unique string provided by UUID, to get around the your-actor-is-not-unique errors:
play.core.ActionInvoker$$anonfun$receive$1$$anon$1:
Execution exception [[InvalidActorNameException:
actor name somethingActor is not unique!]]
Is there a better way to do the above, i.e. instead of giving everything a unique name? All examples in the Akka docs I encountered give actors a static name, which is a bit misleading.
(any other comments are welcome too, e.g. the if the bundling pattern I use is frowned upon, etc)
As far as I'm aware the name paramater is optional.
This may or may not be the case with Akka + Play (haven't checked). When working with standalone actor systems though, you usually only name an actor when you need that reference for later.
From the sounds of it you're tossing out these instances after using them, so you could probably skip the naming step.
Better yet, you could probably save the overhead of creating each actor instance by just wrapping your operations in Futures and using callbacks if need be: http://doc.akka.io/docs/akka/2.0.3/scala/futures.html

Actor-based webservice - How to do it properly?

In the past few months, me and my colleagues have successfully built a server-side system for dispatching push notifications to iPhone devices. Basically, a user registers for these notifications via a RESTful webservice (Spray-Server, recently updated to use Spray-can as the HTTP layer), and the logic schedules one or multiple messages for dispatch in the future, using Akka's scheduler.
This system, as we built it, simply works: it can handle hundreds, maybe even thousands of HTTP requests a second, and can send out notifications at a rate of 23,000 per second - possibly even more if we reduce log output, add multiple notification sender actors (and thus more connections with Apple), and there might be some optimization to be done in the Java library we use (java-apns).
This question is about how to do it Right(tm). My colleague, much more knowledgeable about Scala and actor-based systems in general, noted how the application isn't a 'pure' actor-based system - and he's right. What I'm wondering now is how to do it Right.
At the moment, we have a single Spray HttpService actor, not subclassed, that is initialized with a set of directives that outlines our HTTP service logic. Currently, very much simplified, we have directives like this:
post {
content(as[SomeBusinessObject]) { businessObject => request =>
// store the business object in a MongoDB back-end and wait for the ID to be
// returned; we want to send this back to the user.
val businessObjectId = persister !! new PersistSchedule(businessObject)
request.complete("/businessObject/%s".format(businessObjectId))
}
}
Now, if I get this right, 'waiting for a response' from an actor is a no-no in actor-based programming (plus the !! is deprecated). What I believe is the 'correct' way to do it is to pass the request object over to the persister actor in a message, and have it call request.complete as soon as it's received a generated ID from the back-end.
I have rewritten one of the routes in my application to do just this; in the message that is sent to the actor, the request object / reference is also sent. This seems to work like it's supposed to:
content(as[SomeBusinessObject]) { businessObject => request =>
persister ! new PersistSchedule(request, businessObject)
}
My main concern here is that we seem to pass the request object to the 'business logic', in this case the persister. The persister now gets additional responsibility, i.e. call request.complete, and knowledge about what system it runs in, i.e. that it's part of a webservice.
What would be the correct way to handle a situation like this, so that the persister actor becomes unaware of it being part of a http service, and doesn't need to know how to output the generated ID?
I'm thinking that the request should still be passed to the persister actor, but instead of the persister actor calling request.complete, it sends a message back to the HttpService actor (a SchedulePersisted(request, businessObjectId) message), which simply calls request.complete("/businessObject/%s".format(businessObjectId)). Basically:
def receive = {
case SchedulePersisted(request, businessObjectId) =>
request.complete("/businessObject/%s".format(businessObjectId))
}
val directives = post {
content(as[SomeBusinessObject]) { businessObject => request =>
persister ! new PersistSchedule(request, businessObject)
}
}
Am I on the right track with this approach?
A smaller secondary spray-server specific question, is it okay to subclass HttpService and override the receive method, or will I break things that way? (I have no clue about subclassing actors, or how to pass unrecognized messages to the 'parent' actor)
Final question, is passing the request object / reference around in actor messages that may pass throughout the entire application an okay approach, or is there a better way to 'remember' what request should be sent a response after flowing the request through the application?
In regards to your first question, yes, you are on the right track. (Although I would also like to see some alternative ways to handle this sort of issue).
One suggestion I have is to insulate the persister actor from knowing about requests at all. You can pass the request as an Any type. Your matcher in your service code can automagically cast the cookie back into a Request.
case class SchedulePersisted(businessObjectId: String, cookie: Any)
// in your actor
override def receive = super.receive orElse {
case SchedulePersisted(businessObjectId, request: Request) =>
request.complete("/businessObject/%s".format(businessObjectId))
}
In regards to your second question, actor classes are really no different than regular classes. But you do need to make sure you call the superclass's receive method, so that it can handle its own messages. I had some other ways of doing this in my original answer, but I think I prefer chaining partial functions like this:
class SpecialHttpService extends HttpService {
override def receive = super.receive orElse {
case SpecialMessage(x) =>
// handle special message
}
}
You could also use the produce directive. It allows you to decouple the actual marshalling from the request completion:
get {
produce(instanceOf[Person]) { personCompleter =>
databaseActor ! ShowPersonJob(personCompleter)
}
}
The produce directive in this example extracts a function Person => Unit that you can use to complete the request transparently deep within the business logic layer, which should not be aware of spray.
https://github.com/spray/spray/wiki/Marshalling-Unmarshalling