Future[Source] pipeTo an Actor - scala

There are two local actors (remoting is not used). The actors have been simplified for the example:
import akka.NotUsed
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.pipe
import akka.stream.scaladsl.Source
import scala.concurrent.Future

class ProcessorActor extends Actor {
  override def receive: Receive = {
    case src: Source[Int, NotUsed] =>
      // TODO processing of `src` here
  }
}

class FrontendActor extends Actor {
  import context.dispatcher // ExecutionContext for the Future and for pipeTo

  val processor = context.system.actorOf(Props[ProcessorActor])
  ...
  override def receive: Receive = {
    case "Hello" =>
      val f: Future[Source[Int, NotUsed]] = Future(Source(1 to 100))
      f pipeTo processor
  }
}

// entry point:
val frontend = system.actorOf(Props[FrontendActor])
frontend ! "Hello"
Thus the FrontendActor sends a Source to the ProcessorActor. In the above example it works successfully.
Is such an approach okay?

It's unclear what your concern is.
Sending a Source from one actor to another actor on the same JVM is fine. Because inter-actor communication on the same JVM, as the documentation states, "is simply done via reference passing," there is nothing unusual about your example.¹ Essentially what is going on is that a reference to a Source is passed to ProcessorActor once the Future is completed. A Source is an object that defines part of a stream; you can send a Source from one actor to another actor locally just as you can any JVM object.
(However, once you cross the boundary of a single JVM, you have to deal with serialization.)
¹ A minor, tangential observation: FrontendActor calls context.system.actorOf(Props[ProcessorActor]), which creates a top-level actor. Typically, top-level actors are created in the main program, not within an actor.
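For illustration, here is a minimal sketch of that arrangement; it assumes FrontendActor is changed to take the processor's ActorRef as a constructor parameter (a hypothetical change, not part of the original example):

class FrontendActor(processor: ActorRef) extends Actor {
  // ... receive as before, but using the injected `processor` reference ...
}

// entry point: both actors are created at the top level, in the main program
val processor = system.actorOf(Props[ProcessorActor], "processor")
val frontend  = system.actorOf(Props(new FrontendActor(processor)), "frontend")
frontend ! "Hello"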

Yes, this is OK, but it does not work quite the way you describe. FrontendActor does not send a Future[Source]; it just sends a Source.
From the docs:
pipeTo installs an onComplete-handler on the future to affect the submission of the result to another actor.
In other words, pipeTo means "send the result of this Future to the actor when it becomes available".
Note that this will work even if remoting is being used because the Future is resolved locally and is not sent over the wire to a remote actor.
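For intuition, f pipeTo processor behaves roughly like the hand-written completion handler below (a simplified sketch, assuming an implicit ExecutionContext such as context.dispatcher is in scope):

import akka.actor.Status
import scala.util.{Failure, Success}

// roughly what `f pipeTo processor` does (simplified):
f.onComplete {
  case Success(source) => processor ! source             // the completed value is sent, not the Future
  case Failure(ex)     => processor ! Status.Failure(ex) // failures are delivered as Status.Failure
}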

Related

Akka Classic Ask pattern. How does it match asks with responses?

I'm a newbie with Akka Actors, and I am learning about the Ask pattern. I am looking at the following example from Alvin Alexander:
import akka.actor.{Actor, ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

case object AskNameMessage

class TestActor extends Actor {
  def receive = {
    case AskNameMessage => // respond to the "ask" request
      sender ! "Fred"
    case _ => println("that was unexpected")
  }
}
...
val myActor = system.actorOf(Props[TestActor], name = "myActor")

// (1) this is one way to "ask" another actor
implicit val timeout = Timeout(5 seconds)
val future = myActor ? AskNameMessage
val result = Await.result(future, timeout.duration).asInstanceOf[String]
println(result)
(Yes, I know that Await.result isn't generally the best practice, but this is just a simple example.)
So from what I can tell, the only thing you need to do to implement the "askee" actor to service an Ask operation is to send a message back to the "asker" via the Tell operator, and that will be turned into a future on the "asker" side as a response to the Ask. Seems simple enough.
My question is this:
When the response comes back, how does Akka know that this particular message is the response to a certain Ask message?
In the example above, the "Fred" message doesn't contain any specific routing information that specifies that it's the response to a particular Ask operation. Does it just assume that the next message that the asker receives from that askee is the answer to the Ask? If that's the case, then what if one actor sends multiple Ask operations to the same askee? Wouldn't the responses likely get jumbled, causing random responses to be mapped to the wrong Asks?
Or, what if the asker is also receiving other types of messages from the same askee actor that are unrelated to these Ask messages? Couldn't the Asks receive response messages of the wrong type?
Just to be clear, I'm asking about Akka Classic, not Typed.
For every Ask message sent to an actor, Akka creates a proxy ActorRef whose sole responsibility is to process one single message. This temporary "actor" is initialized with a Promise, which it completes when it receives the reply.
The source code for it is found here, but the main details are:
private[akka] final class PromiseActorRef private (
    val provider: ActorRefProvider,
    val result: Promise[Any],
    ....

val alreadyCompleted = !result.tryComplete(promiseResult)
Now, it should be clear that the Ask pattern is backed by an independent, unique asker actor for every message sent to the receiving askee.
The askee knows the actor reference of the sender (the asker) of every message it receives, via the method context.sender(). Thus, it just needs to use this ActorRef to send the response back to the asker.
Finally, this all avoids race conditions, given that an actor only processes one message at a time. This excludes any possibility of retrieving the "wrong" asker via context.sender().
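To make this concrete, here is a minimal sketch that reuses the TestActor and AskNameMessage from the question; nothing else is assumed:

import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._

implicit val timeout: Timeout = Timeout(5.seconds)

// Each `?` creates its own temporary PromiseActorRef, so each reply completes
// only the future belonging to the ask it answers; concurrent asks to the same
// askee cannot get their responses mixed up.
val first  = (myActor ? AskNameMessage).mapTo[String]
val second = (myActor ? AskNameMessage).mapTo[String]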

How does akka update a mutable state? [duplicate]

This question already has an answer here:
Does Akka onReceive method execute concurrently?
(1 answer)
Closed 1 year ago.
I read the Akka documentation, but I do not understand the following:
import akka.actor.{Actor, ActorSystem, Props}

class MyActor extends Actor {
  private var _state = 0

  override def receive: Receive = {
    case x: Int =>
      if (x != _state) {
        println(s"---------> fail: ${x} and ${_state}")
      }
      _state = x + 1
  }
}

implicit val system = ActorSystem("my-system")
val ref = system.actorOf(Props[MyActor], "my-actor")

(0 to 10000).foreach { x =>
  ref ! x
}
I have a _state variable which is not @volatile and not atomic, but at the same time _state is always correct if I make changes with the ! method.
How does Akka protect and update the internal state of actors?
Akka is an implementation of the Actor Model of computation. One of the (arguably the) key guarantees made by the actor model is that an actor only ever processes a single message at a time. Simply by virtue of having _state be private to an actor, you get a concurrency guarantee that's at least as strong as if you had every method of an object be synchronized, with the added bonus of the ! operation to send a message being non-blocking.
Under the hood, a rough (simplified in a few places, but the broad strokes are accurate) outline of how it works and how the guarantee is enforced is:
- Using the Props, the ActorSystem constructs an instance of MyActor, places the only JVM reference to that instance inside an ActorCell (I'm told this terminology, as well as that the deep internals of Akka are "the dungeon", is inspired by the early development team for Akka being based in an office in what was previously a jail in Uppsala, Sweden), and keys that ActorCell with my-actor. In the meantime (technically this happens after system.actorOf has already returned the ActorRef), it constructs an ActorRef to allow user code to refer to the actor.
- Inside the ActorCell, the receive method is called and the resulting PartialFunction[Any, Unit] (which has a type synonym of Receive) is saved in a field of the ActorCell which corresponds to the actor's behavior.
- The ! operation on the ActorRef (at least for a local ActorRef) resolves which dispatcher is responsible for the actor and hands the message to the dispatcher. The dispatcher then enqueues the message into the mailbox of the ActorCell corresponding to my-actor (this is done in a thread-safe way).
- If there is no task currently scheduled to process messages from the actor's mailbox, such a task is enqueued to the dispatcher's execution context to dequeue some (configurable) number of messages from the ActorCell's mailbox and process them, one-at-a-time, in a loop. After that loop, if there are more messages to process, another such task will be enqueued. Processing a message consists of passing it to the Receive stored in the ActorCell's behavior field (this mechanism allows the context.become pattern for changing behavior).
It's the last bit that provides the core of the guarantee that only one thread is ever invoking the Receive logic.
This is the Classic model for Akka Actors. If you are just learning actors then you should use Typed Actors because that is the supported model going forwards.
With typed actors, the actor system holds the state for each actor, not the actor itself. When an actor needs to process a message the actor system will pass the current state to the actor. The actor will return the new state back to the actor system when it has finished processing the message.
The typed model avoids these synchronisation issues because it does not use any external state; it only uses the state that is passed to it. And it does not modify any external state; it just returns a modified state value.
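As a rough sketch of what that looks like with Akka Typed's functional style (the message type Int mirrors the question's example; this is an illustration, not the question's code):

import akka.actor.typed.{ActorSystem, Behavior}
import akka.actor.typed.scaladsl.Behaviors

// the current state is captured in the behavior itself
def counting(state: Int): Behavior[Int] =
  Behaviors.receiveMessage { x =>
    if (x != state) println(s"---------> fail: $x and $state")
    counting(x + 1) // "return" the new state as the next behavior
  }

val system = ActorSystem(counting(0), "my-system")
(0 to 10000).foreach(system ! _)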
If you must use Classic actors then you can implement the same model using context.become rather than a var.
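A minimal sketch of that Classic variant, replacing the var from the question with context.become:

import akka.actor.Actor

class MyActor extends Actor {
  override def receive: Receive = withState(0)

  private def withState(state: Int): Receive = {
    case x: Int =>
      if (x != state) println(s"---------> fail: $x and $state")
      context.become(withState(x + 1)) // swap in a behavior that holds the new state
  }
}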

How can I gather state information from a set of actors using only the actorSystem?

I'm creating an actor system, which has a list of actors representing some kind of session state.
These session are created by a factory actor (which might, in the future, get replaced by a router, if performance requires that - this should be transparent to the rest of the system, however).
Now I want to implement an operation where I get some state information from each of my currently existing session actors.
I have no explicit session list, as I want to rely on the actor system "owning" the sessions. I tried to use the actor system to look up the current session actors. The problem is that I did not find a "get all actor refs with this naming pattern" method. I tried to use the "/" operator on the system, followed by resolveOne - but got lost in a maze of future types.
The basic idea I had was:
- Send a message to all current session actors (as given to my by my ActorSystem).
- Wait for a response from them (preferably by using just the "ask" pattern; the method calling this broadcast request/response is just a monitoring/debugging method, so blocking is no problem here).
- And then collect the responses into a result.
After a death match against Scala's type system I had to give up for now.
Is there really no way of doing something like this?
If I understand the question correctly, then I can offer up a couple of ways you can accomplish this (though there are certainly others).
Option 1
In this approach, there will be an actor that is responsible for waking up periodically and sending a request to all session actors to get their current stats. That actor will use ActorSelection with a wildcard to accomplish that goal. A rough outline of the code for this approach is as follows:
import akka.actor.{Actor, ActorSystem, Props}
import scala.concurrent.duration._

case class SessionStats(foo: Int, bar: Int)
case object GetSessionStats

class SessionActor extends Actor {
  def receive = {
    case GetSessionStats =>
      println(s"${self.path} received a request to get stats")
      sender ! SessionStats(1, 2)
  }
}

case object GatherStats

class SessionStatsGatherer extends Actor {
  // poll the session actors every 5 seconds
  context.system.scheduler.schedule(5 seconds, 5 seconds, self, GatherStats)(context.dispatcher)

  def receive = {
    case GatherStats =>
      println("Waking up to gather stats")
      val sel = context.system.actorSelection("/user/session*")
      sel ! GetSessionStats
    case SessionStats(f, b) =>
      println(s"got session stats from ${sender.path}, values are $f and $b")
  }
}
Then you could test this code with the following:
val system = ActorSystem("test")
system.actorOf(Props[SessionActor], "session-1")
system.actorOf(Props[SessionActor], "session-2")
system.actorOf(Props[SessionStatsGatherer])
Thread.sleep(10000)
system.actorOf(Props[SessionActor], "session-3")
So with this approach, as long as we use a naming convention, we can use an actor selection with a wildcard to always find all of the session actors even though they are constantly coming (starting) and going (stopping).
Option 2
A somewhat similar approach, but in this one, we use a centralized actor to spawn the session actors and act as a supervisor to them. This central actor also contains the logic to periodically poll for stats, but since it's the parent, it does not need an ActorSelection and can instead just use its children list. That would look like this:
case object SpawnSession

class SessionsManager extends Actor {
  context.system.scheduler.schedule(5 seconds, 5 seconds, self, GatherStats)(context.dispatcher)

  var sessionCount = 1

  def receive = {
    case SpawnSession =>
      val session = context.actorOf(Props[SessionActor], s"session-$sessionCount")
      println(s"Spawned session: ${session.path}")
      sessionCount += 1
      sender ! session
    case GatherStats =>
      println("Waking up to get session stats")
      context.children foreach (_ ! GetSessionStats)
    case SessionStats(f, b) =>
      println(s"got session stats from ${sender.path}, values are $f and $b")
  }
}
And could be tested as follows:
val system = ActorSystem("test")
val manager = system.actorOf(Props[SessionsManager], "manager")
manager ! SpawnSession
manager ! SpawnSession
Thread.sleep(10000)
manager ! SpawnSession
Now, these examples are extremely trivialized, but hopefully they paint a picture for how you could go about solving this issue with either ActorSelection or a management/supervision dynamic. And a bonus is that ask is not needed in either and also no blocking.
There have been many additional changes in this project, so my answer/comments have been delayed quite a bit :-/
First, the session stats gathering should not be periodic, but on request. My original idea was to "mis-use" the actor system as my map of all existing session actors, so that I would not need a supervisor actor knowing all sessions.
This goal has proven elusive: session actors depend on shared state, so the session creator must watch sessions anyway.
This makes Option 2 the obvious answer here: the session creator has to watch all children anyway.
The most vexing hurdle with Option 1 was how to determine when all (current) answers are there. I wanted the statistics request to take a snapshot of all currently existing actor names, query them, and ignore failures (if a session dies before it can be queried, it can be ignored here); the statistics request is only a debugging tool, i.e. something like a "best effort".
The actor selection API tangled me up in a thicket of futures (I am a Scala/Akka newbie), so I gave up on this route.
Option 2 is therefore better suited to my needs.
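For reference, an on-request variant of Option 2 can be sketched with ask, Future.sequence and pipe. GetAllStats is a hypothetical request message; SessionStats and GetSessionStats are taken from the answer above:

import akka.actor.Actor
import akka.pattern.{ask, pipe}
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._

case object GetAllStats // hypothetical "give me all current stats" request

class SessionsManager extends Actor {
  import context.dispatcher
  implicit val timeout: Timeout = Timeout(3.seconds)

  def receive = {
    case GetAllStats =>
      // ask every child session for its stats and pipe the collected replies
      // back to whoever asked the manager
      val replies: Future[List[SessionStats]] =
        Future.sequence(context.children.toList.map(c => (c ? GetSessionStats).mapTo[SessionStats]))
      replies pipeTo sender()
    // ... SpawnSession handling as in Option 2 ...
  }
}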

Scala how to use akka actors to handle a timing out operation efficiently

I am currently evaluating JavaScript scripts using Rhino in a RESTful service. I wish for there to be an evaluation timeout.
I have created a mock example actor (using Scala 2.10 Akka actors).
import akka.actor.Actor

case class Evaluate(expression: String)

class RhinoActor extends Actor {
  override def preStart() = { println("Start context"); super.preStart() }

  def receive = {
    case Evaluate(expression) ⇒
      Thread.sleep(100) // stand-in for the actual Rhino evaluation
      sender ! "complete"
  }

  override def postStop() = { println("Stop context"); super.postStop() }
}
Now I use this actor as follows:
import akka.actor.{ActorSystem, PoisonPill, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.Try

def run {
  val t = System.currentTimeMillis()

  val system = ActorSystem("MySystem")
  val actor = system.actorOf(Props[RhinoActor])

  implicit val timeout = Timeout(50 milliseconds)
  val future = (actor ? Evaluate("10 + 50")).mapTo[String]
  val result = Try(Await.result(future, Duration.Inf))

  println(System.currentTimeMillis() - t)
  println(result)

  actor ! PoisonPill
  system.shutdown()
}
Is it wise to use the ActorSystem in a closure like this which may have simultaneous requests on it?
Should I make the ActorSystem global, and will that be ok in this context?
Is there a more appropriate alternative approach?
EDIT: I think I need to use futures directly, but I will need the preStart and postStop. Currently investigating.
EDIT: Seems you don't get those hooks with futures.
I'll try and answer some of your questions for you.
First, an ActorSystem is a very heavyweight construct. You should not create one per request that needs an actor. You should create one globally and then use that single instance to spawn your actors (and you won't need system.shutdown() anymore in run). I believe this covers your first two questions.
Your approach of using an actor to execute javascript here seems sound to me. But instead of spinning up an actor per request, you might want to pool a bunch of the RhinoActors behind a Router, with each instance having its own Rhino engine that will be set up during preStart. Doing this will eliminate per-request Rhino initialization costs, speeding up your js evaluations. Just make sure you size your pool appropriately. Also, you won't need to be sending PoisonPill messages per request if you adopt this approach.
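For example, a pooled setup could look roughly like this (a sketch assuming Akka 2.3+'s RoundRobinPool; older versions used withRouter instead, and the pool size of 5 is an arbitrary placeholder):

import akka.actor.Props
import akka.routing.RoundRobinPool

// one system-wide pool; each routee is a RhinoActor with its own Rhino engine,
// initialized once in preStart rather than per request
val rhinoPool = system.actorOf(RoundRobinPool(5).props(Props[RhinoActor]), "rhino-pool")

// requests then go to the pool instead of a freshly created actor:
// val future = (rhinoPool ? Evaluate("10 + 50")).mapTo[String]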
You also might want to look into the non-blocking callbacks onComplete, onSuccess and onFailure as opposed to using the blocking Await. These callbacks also respect timeouts and are preferable to blocking for higher throughput. As long as whatever is way way upstream waiting for this response can handle the asynchronicity (i.e. an async capable web request), then I suggest going this route.
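A sketch of the callback style, assuming an ExecutionContext is in scope (e.g. the system's dispatcher):

import scala.util.{Failure, Success}
import system.dispatcher // ExecutionContext for the callback

val future = (actor ? Evaluate("10 + 50")).mapTo[String]
future.onComplete {
  case Success(result) => println(result)                                  // reply arrived within the timeout
  case Failure(ex)     => println(s"evaluation failed or timed out: $ex")  // e.g. an AskTimeoutException
}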
The last thing to keep in mind is that even though code will return to the caller after the timeout if the actor has yet to respond, the actor still goes on processing that message (performing the evaluation). It does not stop and move onto the next message just because a caller timed out. Just wanted to make that clear in case it wasn't.
EDIT
In response to your comment about stopping a long execution, there are some things related to Akka to consider first. You can call stop on the actor, or send it a Kill or a PoisonPill, but none of these will stop it from processing the message that it's currently processing. They just prevent it from receiving new messages. In your case, with Rhino, if infinite script execution is a possibility, then I suggest handling this within Rhino itself. I would dig into the answers on this post (Stopping the Rhino Engine in middle of execution) and set up your Rhino engine in the actor in such a way that it will stop itself if it has been executing for too long. That failure will kick out to the supervisor (if pooled) and cause that pooled instance to be restarted, which will init a new Rhino in preStart. This might be the best approach for dealing with the possibility of long running scripts.

Does this Scala actor block when creating new actor in a handler?

I have the following piece of code:
actor {
  loop {
    react {
      case SomeEvent =>
        // I want to submit a piece of work to a queue and then send a response
        // when that is finished. However, I don't want *this* actor to block.
        val params = "Some args"
        val f: Future[Any] = myQueue.submitWork(params)
        actor {
          // await here
          val response = f.get
          publisher ! response
        }
    }
  }
}
As I understood it, the outer actor would not block on f.get because that is actually being performed by a separate actor (the one created inside the SomeEvent handler).
Is this correct?
Yes, that is correct. Your outer actor will simply create an actor and suspend (wait for its next message). However, be very careful of this kind of thing. The inner actor is started automatically on the scheduler, to be handled by a thread. That thread will block on that Future (that looks like a java.util.concurrent.Future to me). If you do this enough times, you can run into starvation problems where all available threads are blocking on Futures. An actor is basically a work queue, so you should use those semantics instead.
Here's a version of your code, using the Scalaz actors library. This library is much simpler and easier to understand than the standard Scala actors (the source is literally a page and a half). It also leads to much terser code:
actor {(e: SomeEvent) => promise { ... } to publisher }
This version is completely non-blocking.