Is it possible to spawn actors into stream? - scala

So i'm using Akka Typed, and want to spawn actor for each message into some stream, according documentation it seems not possible:
Warning: This method is not thread-safe and must not be accessed from threads otherthan the ordinary actor message processing thread, such as [[scala.concurrent.Future]] callbacks.
def spawn[U](behavior: Behavior[U], name: String, props: Props = Props.empty): ActorRef[U]
Example:
Behaviors.receiveMessage {
case StartConsume =>
context.log.info("Starting consume messages")
val source: Source[Int, NotUsed] = Source(1 to 10)
source.runForeach(x => context.spawn(Test(x), "Test"))
Behaviors.same
}
Are there some other ways to do this?

Since the stream will materialize into a different actor, it's virtually certain that you can't close over an ActorContext in the stream (if it happens to execute in the same thread as the enclosing actor last ran on, it won't blow up), e.g for spawning a child.
As alternatives:
If you don't particularly care that the spawned actors be children of this actor (e.g. in classic, you'd be using system.actorOf), you could have the guardian actor (the one with the behavior from spawning the ActorSystem) spawn the actors: you can either roll your own protocol to do such spawning or have that guardian implement SpawnProtocol. You can then send the appropriate message to context.system, but note that you'll need to user context.system.unsafeUpcast to the protocol you're using. Since you should have control over the guardian's protocol, that's unlikely to fail, but the compiler won't really help you.
If you do want the spawned actors to be children, and you also want the spawns to be asynchronous, the best way to accomplish this is probably through an internal message that results in just spawning the actor. Then in the stream you just send those messages to yourself.
If you don't want the spawns to be asynchronous (which it should be noted, the approach of spawning them in a stream is), then just call spawn in the message processing thread without being in a stream.

Related

Scala Akka how to create a Typed ActorContext in route

I am pretty new in Scala Akka . Say I am spawning a Configuration child actor,
object Configuration {
def apply(): Behavior[ConfigurationMessage] = Behaviors.setup(context => new Configuration(context))
}
Now I need same context ActorContext[ConfigurationMessage] in my HTTP router to do some operation.
How can I create the same ActorContext there
The ActorContext cannot be used outside of the actor it's associated with, including in an HTTP router. Any ActorContext which leaks out of an actor (e.g. by sending it as a message) will, by design, throw an exception and not do anything for most operations if it's used outside of its actor.
The only operations on the ActorContext which could possibly be used outside of the associated actor are:
context.ask and friends can be just as easily replaced with a Future-returning ask on the target RecipientRef with the message send occurring in a foreach callback on the future.
context.executionContext: can just as easily use system.executionContext (which will typically be the same) or by looking up via dispatchers
context.pipeToSelf is probably best done as a send in a foreach callback on the future
context.scheduleOnce is better done using the system scheduler directly
context.self is kind of pointless, as you'd have to have the ActorRef already in order to leak the ActorContext
context.system is likewise pointless, as you already have the system

How does akka update a mutable state? [duplicate]

This question already has an answer here:
Does Akka onReceive method execute concurrently?
(1 answer)
Closed 1 year ago.
I read Akka documentation, but I do not understand:
class MyActor extends Actor {
private var _state = 0
override def receive: Receive = {
case x: Int =>
if (x != _state) {
println(s"---------> fail: ${x} and ${_state}")
}
_state = x + 1
}
}
implicit val system = ActorSystem("my-system")
val ref = system.actorOf(Props[MyActor], "my-actor")
(0 to 10000).foreach { x =>
ref ! x
}
I have a _state variable which is not #volatile and not atomic but at the same time _state is always correct, if I do changes with !-method.
How does Akka protect and update the internal state of actors?
Akka is an implementation of the Actor Model of computation. One of the (arguably the) key guarantees made by the actor model is that an actor only ever processes a single message at a time. Simply by virtue of having _state be private to an actor, you get a concurrency guarantee that's at least as strong as if you had every method of an object be #synchronized, with the added bonus of the ! operation to send a message being non-blocking.
Under the hood, a rough (simplified in a few places, but the broad strokes are accurate) outline of how it works and how the guarantee is enforced is:
Using the Props the ActorSystem constructs an instance of MyActor, places the only JVM reference to that instance inside an ActorCell (I'm told this terminology, as well as that the deep internals of Akka are "the dungeon", is inspired by the early development team for Akka being based in an office in what was previously a jail in Uppsala, Sweden), and keys that ActorCell with my-actor. In the meantime (technically this happens after system.actorOf has already returned the ActorRef), it constructs an ActorRef to allow user code to refer to the actor.
Inside the ActorCell, the receive method is called and the resulting PartialFunction[Any, Unit] (which has a type synonym of Receive) is saved in a field of the ActorCell which corresponds to the actor's behavior.
The ! operation on the ActorRef (at least for a local ActorRef) resolves which dispatcher is responsible for the actor and hands the message to the dispatcher. The dispatcher then enqueues the message into the mailbox of the ActorCell corresponding to my-actor (this is done in a thread-safe way).
If there is no task currently scheduled to process messages from the actor's mailbox, such a task is enqueued to the dispatcher's execution context to dequeue some (configurable) number of messages from the ActorCell's mailbox and process them, one-at-a-time, in a loop. After that loop, if there are more messages to process, another such task will be enqueued. Processing a message consists of passing it to the Receive stored in the ActorCell's behavior field (this mechanism allows the context.become pattern for changing behavior).
It's the last bit that provides the core of the guarantee that only one thread is ever invoking the Receive logic.
This is the Classic model for Akka Actors. If you are just learning actors then you should use Typed Actors because that is the supported model going forwards.
With typed actors, the actor system holds the state for each actor, not the actor itself. When an actor needs to process a message the actor system will pass the current state to the actor. The actor will return the new state back to the actor system when it has finished processing the message.
The typed model avoids all synchronisation issues because it does not use any external state, it only uses the state that is passed to it. And it does not modify any external state, it just returns a modified state value.
If you must use Classic actors then you can implement the same model using context.become rather than a var.

Is sending futures in Akka messages OK?

I'm working on implementing a small language to send tasks to execution and control execution flow. After the sending a task to my system, the user gets a future (on which it can call a blocking get() or flatMap() ). My question is: is it OK to send futures in Akka messages?
Example: actor A sends a message Response to actor B and Response contains a future among its fields. Then at some point A will fulfill the promise from which the future was created. After receiving the Response, B can call flatMap() or get() at any time.
I'm asking because Akka messages should be immutable and work even if actors are on different JVMs. I don't see how my example above can work if actors A and B are on different JVMs. Also, are there any problems with my example even if actors are on same JVM?
Something similar is done in the accepted answer in this stackoverflow question. Will this work if actors are on different JVMs?
Without remoting it's possible, but still not advisable. With remoting in play it won't work at all.
If your goal is to have an API that returns Futures, but uses actors as the plumbing underneath, one approach could be that the API creates its own actor internally that it asks, and then returns the future from that ask to the caller. The actor spawned by the API call is guaranteed to be local to the API instance and can communicate with the rest of the actor system via the regular tell/receive mechanism, so that there are no Futures sent as messages.
class MyTaskAPI(actorFactory: ActorRefFactory) {
def doSomething(...): Future[SomethingResult] = {
val taskActor = actorFactory.actorOf(Props[MyTaskActor])
taskActor ? DoSomething(...).mapTo[SomethingResult]
}
}
where MyTaskActor receives the DoSomething, captures the sender, sends out the request for task processince and likely becomes a receiving state for SomethingResult which finally responds to the captured sender and stops itself. This approach creates two actors per request, one explicitly, the MyTaskActor and one implicitly, the handler of the ask, but keeps all state inside of actors.
Alternately, you could use the ActorDSL to create just one actor inline of doSomething and use a captured Promise for completion instead of using ask:
class MyTaskAPI(system: System) {
def doSomething(...): Future[SomethingResult] = {
val p = Promise[SomethingResult]()
val tmpActor = actor(new Act {
become {
case msg:SomethingResult =>
p.success(msg)
self.stop()
}
}
system.actorSelection("user/TaskHandler").tell(DoSomething(...), tmpActor)
p.future
}
}
This approach is a bit off the top of my head and it does use a shared value between the API and the temp actor, which some might consider a smell, but should give an idea how to implement your workflow.
If you're asking if it's possible, then yes, it's possible. Remote actors are basically interprocess communication. If you set everything up on both machines to a state where both can properly handle the future, then it should be good. You don't give any working example so I can't really delve deeper into it.

Scala how to use akka actors to handle a timing out operation efficiently

I am currently evaluating javascript scripts using Rhino in a restful service. I wish for there to be an evaluation time out.
I have created a mock example actor (using scala 2.10 akka actors).
case class Evaluate(expression: String)
class RhinoActor extends Actor {
override def preStart() = { println("Start context'"); super.preStart()}
def receive = {
case Evaluate(expression) ⇒ {
Thread.sleep(100)
sender ! "complete"
}
}
override def postStop() = { println("Stop context'"); super.postStop()}
}
Now I run use this actor as follows:
def run {
val t = System.currentTimeMillis()
val system = ActorSystem("MySystem")
val actor = system.actorOf(Props[RhinoActor])
implicit val timeout = Timeout(50 milliseconds)
val future = (actor ? Evaluate("10 + 50")).mapTo[String]
val result = Try(Await.result(future, Duration.Inf))
println(System.currentTimeMillis() - t)
println(result)
actor ! PoisonPill
system.shutdown()
}
Is it wise to use the ActorSystem in a closure like this which may have simultaneous requests on it?
Should I make the ActorSystem global, and will that be ok in this context?
Is there a more appropriate alternative approach?
EDIT: I think I need to use futures directly, but I will need the preStart and postStop. Currently investigating.
EDIT: Seems you don't get those hooks with futures.
I'll try and answer some of your questions for you.
First, an ActorSystem is a very heavy weight construct. You should not create one per request that needs an actor. You should create one globally and then use that single instance to spawn your actors (and you won't need system.shutdown() anymore in run). I believe this covers your first two questions.
Your approach of using an actor to execute javascript here seems sound to me. But instead of spinning up an actor per request, you might want to pool a bunch of the RhinoActors behind a Router, with each instance having it's own rhino engine that will be setup during preStart. Doing this will eliminate per request rhino initialization costs, speeding up your js evaluations. Just make sure you size your pool appropriately. Also, you won't need to be sending PoisonPill messages per request if you adopt this approach.
You also might want to look into the non-blocking callbacks onComplete, onSuccess and onFailure as opposed to using the blocking Await. These callbacks also respect timeouts and are preferable to blocking for higher throughput. As long as whatever is way way upstream waiting for this response can handle the asynchronicity (i.e. an async capable web request), then I suggest going this route.
The last thing to keep in mind is that even though code will return to the caller after the timeout if the actor has yet to respond, the actor still goes on processing that message (performing the evaluation). It does not stop and move onto the next message just because a caller timed out. Just wanted to make that clear in case it wasn't.
EDIT
In response to your comment about stopping a long execution there are some things related to Akka to consider first. You can call stop the actor, send a Kill or a PosionPill, but none of these will stop if from processing the message that it's currently processing. They just prevent it from receiving new messages. In your case, with Rhino, if infinite script execution is a possibility, then I suggest handling this within Rhino itself. I would dig into the answers on this post (Stopping the Rhino Engine in middle of execution) and setup your Rhino engine in the actor in such a way that it will stop itself if it has been executing for too long. That failure will kick out to the supervisor (if pooled) and cause that pooled instance to be restarted which will init a new Rhino in preStart. This might be the best approach for dealing with the possibility of long running scripts.

Easiest way to do idle processing in a Scala Actor?

I have a scala actor that does some work whenever a client requests it. When, and only when no client is active, I would like the Actor to do some background processing.
What is the easiest way to do this? I can think of two approaches:
Spawn a new thread that times out and wakes up the actor periodically. A straight forward approach, but I would like to avoid creating another thread (to avoid the extra code, complexity and overhead).
The Actor class has a reactWithin method, which could be used to time out from the actor itself. But the documentation says the method doesn't return. So, I am not sure how to use it.
Edit; a clarification:
Assume that the background task can be broken down into smaller units that can be independently processed.
Ok, I see I need to put my 2 cents. From the author's answer I guess the "priority receive" technique is exactly what is needed here. It is possible to find discussion in "Erlang: priority receive question here at SO". The idea is to accept high priority messages first and to accept other messages only in absence of high-priority ones.
As Scala actors are very similar to Erlang, a trivial code to implement this would look like this:
def act = loop {
reactWithin(0) {
case msg: HighPriorityMessage => // process msg
case TIMEOUT =>
react {
case msg: HighPriorityMessage => // process msg
case msg: LowPriorityMessage => // process msg
}
}
}
This works as follows. An actor has a mailbox (queue) with messages. The receive (or receiveWithin) argument is a partial function and Actor library looks for a message in a mailbox which can be applied to this partial function. In our case it would be an object of HighPriorityMessage only. So, if Actor library finds such a message, it applies our partial function and we are processing a message of high priority. Otherwise, reactWithin with timeout 0 calls our partial function with argument TIMEOUT and we immediately try to process any possible message from the queue (as it waits for a message we cannot exclude a possiblity to get HighPriorityMessage).
It sounds like the problem you describe is not well suited to the actor sub-system. An Actor is designed to sequentially process its message queue:
What should happen if the actor is performing the background work and a new task arrives?
An actor can only find out about this is it is continuously checking its mailbox as it performs the background task. How would you implement this (i.e. how would you code the background tasks as a unit of work so that the actor could keep interrupting and checking the mailbox)?
What should happen if the actor has many background tasks in its mailbox in front of the main task?
Do these background tasks get thrown away, or sent to another actor? If the latter, how can you prevent CPU time being given to that actor to perform the tasks?
All in all, it sounds much more like you need to explore some grid-style software that can run in the background (like Data Synapse)!
Just after asking this question I tried out some completely whacky code and it seems to work fine. I am not sure though if there is a gotcha in it.
import scala.actors._
object Idling
object Processor extends Actor {
start
import Actor._
def act() = {
loop {
// here lie dragons >>>>>
if (mailboxSize == 0) this ! Idling
// <<<<<<
react {
case msg:NormalMsg => {
// do the normal work
reply(answer)
}
case Idling=> {
// do the idle work in chunks
}
case msg => println("Rcvd unknown message:" + msg)
}
}
}
}
Explanation
Any code inside the argument of loop but before the call to react seems to get called when the Actor is about to wait for a message. I am sending a Idling message to self here. In the handler for this message I ensure that the mailbox-size is 0, before doing the processing.