Is it OK to use blocking actor messages when they are wrapped in a future? - scala

My current application is based on Akka 1.1. It has multiple ProjectAnalysisActors, each responsible for handling the analysis tasks for a specific project. The analysis is started when such an actor receives a generic start message. After finishing one step it sends itself a message with the next step, as long as one is defined. The executing code basically looks as follows:
sealed trait AnalysisEvent {
  def run(project: Project): Future[Any]
  def nextStep: AnalysisEvent = null
}

case class StartAnalysis() extends AnalysisEvent {
  override def run ...
  override def nextStep: AnalysisEvent = new FirstStep
}

case class FirstStep() extends AnalysisEvent {
  override def run ...
  override def nextStep: AnalysisEvent = new SecondStep
}

case class SecondStep() extends AnalysisEvent {
  ...
}

class ProjectAnalysisActor(project: Project) extends Actor {
  def receive = {
    case event: AnalysisEvent =>
      val future = event.run(project)
      future.onComplete { f =>
        self ! event.nextStep
      }
  }
}
I have some difficulty implementing the run methods for each analysis step. At the moment I create a new future within each run method. Inside this future I send all follow-up messages to the different subsystems. Some of them are non-blocking fire-and-forget messages, but some of them return a result which should be stored before the next analysis step is started.
At the moment a typical run method looks as follows:
def run(project: Project): Future[Any] = {
  Future {
    progressActor ! typicalFireAndForget(project.name)
    val calcResult = (calcActor1 !! doCalcMessage(project)).getOrElse(...)
    val p: Project = ... // created updated project using calcResult
    val result = (storage !! updateProjectInformation(p)).getOrElse(...)
    result
  }
}
Since those blocking messages should be avoided, I'm wondering if this is the right way. Does it make sense to use them in this use case or should I still avoid it? If so, what would be a proper solution?

Apparently the only purpose of the ProjectAnalysisActor is to chain future calls. Second, the run methods also seem to wait on results in order to continue the computation.
So I think you can try refactoring your code to use Future Composition, as explained here: http://akka.io/docs/akka/1.1/scala/futures.html
def run(project: Project): Future[Any] = {
  progressActor ! typicalFireAndForget(project.name)
  for (
    calcResult <- calcActor1 !!! doCalcMessage(project);
    p = ... // created updated project using calcResult
    result <- storage !!! updateProjectInformation(p)
  ) yield result
}
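For what it's worth, the for-comprehension above is just syntactic sugar for flatMap/map on the futures returned by !!!, so no thread is blocked while waiting for the replies. A rough desugared sketch (using a hypothetical updateProject helper in place of the elided project-update code) could look like this:
def run(project: Project): Future[Any] = {
  progressActor ! typicalFireAndForget(project.name)
  (calcActor1 !!! doCalcMessage(project)).flatMap { calcResult =>
    // updateProject is a hypothetical helper standing in for the elided update logic
    val p: Project = updateProject(project, calcResult)
    storage !!! updateProjectInformation(p)
  }
}
The actor itself can keep reacting to the composed future exactly as before (future.onComplete { _ => self ! event.nextStep }), so the receive loop never blocks.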

Related

Gracefully shutdown different supervisor actors without duplicating code

I have an API that creates actor A (at runtime). Then, A creates actor B (at runtime as well).
I have another API that creates actor C (different from actor A, with no common code between them), and C creates actor D.
I want to gracefully shut down A and C once B and D have finished processing their messages (A and C do not necessarily run together; they are unrelated).
Sending a PoisonPill to A/C is not good enough, because the children (B/D) would simply be stopped via context.stop and would not be able to finish their tasks.
I understand I need to implement a new type of message.
What I don't understand is how to build an infrastructure so that both A and C know how to respond to this message without duplicating the same receive method in both.
The solution I found was to create a new trait that extends Actor and override the unhandled method.
The code looks like this:
object SuicideActor {
  // case objects: the receive blocks below match on the object itself
  case object PleaseKillYourself
  case object IKilledMyself
}

trait SuicideActor extends Actor {
  override def unhandled(message: Any): Unit = message match {
    case PleaseKillYourself =>
      Logger.debug(s"Actor ${self.path} received PleaseKillYourself - stopping children and aborting...")
      val livingChildren = context.children.size
      if (livingChildren == 0) {
        endLife()
      } else {
        context.children.foreach(_ ! PleaseKillYourself)
        context become waitForChildren(livingChildren)
      }
    case _ => super.unhandled(message)
  }

  protected[crystalball] def waitForChildren(livingChildren: Int): Receive = {
    case IKilledMyself =>
      val remaining = livingChildren - 1
      if (remaining == 0) { endLife() }
      else { context become waitForChildren(remaining) }
  }

  private def endLife(): Unit = {
    context.parent ! IKilledMyself
    context stop self
  }
}
But this sounds a bit hacky... Is there a better (non-hacky) solution?
I think designing your own termination procedure is unnecessary and can potentially cause headaches for future maintainers of your code, since it is nonstandard Akka behaviour that has to be understood all over again.
If A and C are unrelated and can terminate independently, the following options are possible. To avoid any confusion, I'll use just actors A and B in my explanations.
Option 1.
Actor A uses context.watch on the newly created actor B and reacts to the Terminated message in its receive method. Actor B calls context.stop(context.self) when it is done with its task; this generates a Terminated event that is handled by actor A, which can clean up its state if needed and then terminate too.
Check these docs for more details.
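A minimal sketch of this option (class and message names are illustrative, not from the question):
import akka.actor.{Actor, ActorRef, Props, Terminated}

class A extends Actor {
  // create B and watch it, so A receives Terminated(b) once B stops itself
  val b: ActorRef = context.actorOf(Props[B], "b")
  context.watch(b)

  def receive = {
    case Terminated(`b`) =>
      // optional clean-up goes here, then A terminates as well
      context.stop(self)
  }
}

class B extends Actor {
  def receive = {
    case "work" =>
      // ... finish the task, then stop; this triggers the Terminated message in A
      context.stop(self)
  }
}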
Option 2.
Actor B calls context.stop(context.parent) to terminate the parent directly when it is done with its own task. This option does not allow the parent to react and perform additional clean-up tasks if needed.
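The corresponding sketch for this option replaces the child from the previous sketch: instead of stopping itself, it stops the parent directly:
class B extends Actor {
  def receive = {
    case "work" =>
      // ... finish the task, then stop the parent directly; stopping the parent
      // also stops B, and no parent-side clean-up is possible
      context.stop(context.parent)
  }
}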
Finally, sharing this logic between actors A and C can be done with a trait in the way you did, but the logic is very small and having a little duplicated code is not always a bad thing.
So it took me a bit, but I found my answer.
I implemented the Reaper pattern.
The SuicideActor creates a dedicated Reaper actor when it has finished its block. The Reaper watches all of the SuicideActor's children and, once they have all terminated, it sends a PoisonPill to the SuicideActor and to itself.
The SuicideActor code is:
trait SuicideActor extends Actor {
def killSwitch(block: => Unit): Unit = {
block
Logger.info(s"Actor ${self.path.name} is commencing suicide sequence...")
context become PartialFunction.empty
val children = context.children
val reaper = context.system.actorOf(ReaperActor.props(self), s"ReaperFor${self.path.name}")
reaper ! Reap(children.toSeq)
}
override def postStop(): Unit = Logger.debug(s"Actor ${self.path.name} is dead.")
}
And the Reaper is:
object ReaperActor {
  case class Reap(underWatch: Seq[ActorRef])

  def props(supervisor: ActorRef): Props = {
    Props(new ReaperActor(supervisor))
  }
}

class ReaperActor(supervisor: ActorRef) extends Actor {
  override def preStart(): Unit = Logger.info(s"Reaper for ${supervisor.path.name} started")
  override def postStop(): Unit = Logger.info(s"Reaper for ${supervisor.path.name} ended")

  override def receive: Receive = {
    case Reap(underWatch) =>
      if (underWatch.isEmpty) {
        killLeftOvers
      } else {
        underWatch.foreach(context.watch)
        context become reapRemaining(underWatch.size)
        underWatch.foreach(_ ! PoisonPill)
      }
  }

  def reapRemaining(livingActorsNumber: Int): Receive = {
    case Terminated(_) =>
      val remainingActorsNumber = livingActorsNumber - 1
      if (remainingActorsNumber == 0) {
        killLeftOvers
      } else {
        context become reapRemaining(remainingActorsNumber)
      }
  }

  private def killLeftOvers = {
    Logger.debug(s"All children of ${supervisor.path.name} are dead killing supervisor")
    supervisor ! PoisonPill
    self ! PoisonPill
  }
}
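For illustration, a hypothetical actor mixing in the trait would trigger the sequence through killSwitch (the WorkerSupervisor and SomeWorker names below are made up):
class WorkerSupervisor extends SuicideActor {
  val worker = context.actorOf(Props[SomeWorker], "worker")

  def receive = {
    case "shutdown" =>
      // run the final piece of work, then hand all children over to the Reaper,
      // which lets them drain their mailboxes before the PoisonPill takes effect
      killSwitch {
        worker ! "finishUp"
      }
  }
}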

Akka-Streams ActorPublisher does not receive any Request messages

I am trying to continuously read the wikipedia IRC channel using this lib: https://github.com/implydata/wikiticker
I created a custom Akka Publisher, which will be used in my system as a Source.
Here are some of my classes:
class IrcPublisher() extends ActorPublisher[String] {
  import scala.collection._

  var queue: mutable.Queue[String] = mutable.Queue()

  override def receive: Actor.Receive = {
    case Publish(s) =>
      println(s"->MSG, isActive = $isActive, totalDemand = $totalDemand")
      queue.enqueue(s)
      publishIfNeeded()
    case Request(cnt) =>
      println("Request: " + cnt)
      publishIfNeeded()
    case Cancel =>
      println("Cancel")
      context.stop(self)
    case _ =>
      println("Hm...")
  }

  def publishIfNeeded(): Unit = {
    while (queue.nonEmpty && isActive && totalDemand > 0) {
      println("onNext")
      onNext(queue.dequeue())
    }
  }
}

object IrcPublisher {
  case class Publish(data: String)
}
I am creating all these objects like so:
def createSource(wikipedias: Seq[String]) {
  val dataPublisherRef = system.actorOf(Props[IrcPublisher])
  val dataPublisher = ActorPublisher[String](dataPublisherRef)

  val listener = new MessageListener {
    override def process(message: Message) = {
      dataPublisherRef ! Publish(Jackson.generate(message.toMap))
    }
  }

  val ticker = new IrcTicker(
    "irc.wikimedia.org",
    "imply",
    wikipedias map (x => s"#$x.wikipedia"),
    Seq(listener)
  )

  ticker.start()                // if I comment this...
  Thread.currentThread().join() // ... and this I get Request(...)

  Source.fromPublisher(dataPublisher)
}
So the problem I am facing is with this Source object. Although this setup works well with other sources (for example, from a local file), the ActorPublisher doesn't receive any Request() messages.
If I comment out the two marked lines, I can see that my actor receives the Request(count) message from my flow. Otherwise all messages are pushed into the queue but never reach my flow (I only see the MSG messages printed).
I think it's something with multithreading/synchronization here.
I am not familiar enough with wikiticker to solve your problem as given. One question I would have is: why is it necessary to join to the current thread?
However, I think you have overcomplicated the usage of Source. It would be easier for you to work with the stream as a whole rather than create a custom ActorPublisher.
You can use Source.actorRef to materialize a stream into an ActorRef and work with that ActorRef. This lets Akka handle the enqueueing/dequeueing of the buffer while you focus on the "business logic".
Say, for example, your entire stream is only to filter lines above a certain length and print them to the console. This could be accomplished with:
def dispatchIRCMessages(actorRef: ActorRef) = {
  val ticker =
    new IrcTicker("irc.wikimedia.org",
      "imply",
      wikipedias map (x => s"#$x.wikipedia"),
      Seq(new MessageListener {
        override def process(message: Message) =
          // the stream below is a Source of String elements, so send the raw
          // String instead of wrapping it in Publish
          actorRef ! Jackson.generate(message.toMap)
      }))
  ticker.start()
  Thread.currentThread().join()
}

//these variables control the buffer behavior
val bufferSize = 1024
val overFlowStrategy = akka.stream.OverflowStrategy.dropHead
val minMessageSize = 32

//no need for a custom Publisher/Queue
val streamRef =
  Source.actorRef[String](bufferSize, overFlowStrategy)
    .via(Flow[String].filter(_.size > minMessageSize))
    .to(Sink.foreach[String](println))
    .run()

dispatchIRCMessages(streamRef)
The dispatchIRCMessages function has the added benefit that it works with any ActorRef, so you aren't tied to streams/publishers only.
Hopefully this solves your underlying problem...
I think the main problem is Thread.currentThread().join(). This line will 'hang' the current thread, because the thread is waiting for itself to die. Please read https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.html#join-long- .
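If that diagnosis is right, a minimal fix is to start the ticker and return the Source without joining; this sketch assumes the IrcTicker keeps running on its own background threads. Note that the original createSource also used procedure syntax (no '='), so the Source it built was silently discarded:
def createSource(wikipedias: Seq[String]) = {
  val dataPublisherRef = system.actorOf(Props[IrcPublisher])
  val dataPublisher = ActorPublisher[String](dataPublisherRef)

  val listener = new MessageListener {
    override def process(message: Message) =
      dataPublisherRef ! Publish(Jackson.generate(message.toMap))
  }

  val ticker = new IrcTicker(
    "irc.wikimedia.org",
    "imply",
    wikipedias map (x => s"#$x.wikipedia"),
    Seq(listener))

  ticker.start()                      // start the IRC client in the background...
  Source.fromPublisher(dataPublisher) // ...and return the Source without blocking this thread
}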

Scala Akka Consumer/Producer: Return Value

Problem Statement
Assume I have a file with sentences that is processed line by line. In my case, I need to extract Named Entities (Persons, Organizations, ...) from these lines. Unfortunately, the tagger is quite slow. Therefore, I decided to parallelize the computation, such that lines can be processed independently of each other and the result is collected in a central location.
Current Approach
My current approach uses a single-producer, multiple-consumer setup. I'm relatively new to Akka, but I think my problem description fits its capabilities well. Let me show you some code:
Producer
The Producer reads the file line by line and sends each line to the consumers. Once all lines have been processed, it propagates the collected result back to WordCount.
class Producer(consumers: ActorRef) extends Actor with ActorLogging {
  var master: Option[ActorRef] = None
  var result = immutable.List[String]()
  var totalLines = 0
  var linesProcessed = 0

  override def receive = {
    case StartProcessing() => {
      master = Some(sender)
      Source.fromFile("sent.txt", "utf-8").getLines.foreach { line =>
        consumers ! Sentence(line)
        totalLines += 1
      }
    }
    case SentenceProcessed(list) => {
      linesProcessed += 1
      result :::= list
      //If we are done, we can propagate the result to the creator and stop;
      //stopping earlier would lose the remaining SentenceProcessed replies
      if (linesProcessed == totalLines) {
        master.map(_ ! result)
        context.stop(self)
      }
    }
    case _ => log.error("message not recognized")
  }
}
Consumer
class Consumer extends Actor with ActorLogging {
  def tokenize(line: String): Seq[String] = {
    line.split(" ").map(_.toLowerCase)
  }

  override def receive = {
    case Sentence(sent) => {
      //Assume: This is representative for the extensive computation method
      val tokens = tokenize(sent)
      sender() ! SentenceProcessed(tokens.toList)
    }
    case _ => log.error("message not recognized")
  }
}
WordCount (Master)
class WordCount extends Actor {
  val consumers = context.actorOf(Props[Consumer].
    withRouter(FromConfig()).
    withDispatcher("consumer-dispatcher"), "consumers")
  val producer = context.actorOf(Props(new Producer(consumers)), "producer")

  context.watch(consumers)
  context.watch(producer)

  def receive = {
    case Terminated(`producer`) => consumers ! Broadcast(PoisonPill)
    case Terminated(`consumers`) => context.system.shutdown
  }
}

object WordCount {
  def getActor() = new WordCount

  def getConfig(routerType: String, dispatcherType: String)(numConsumers: Int) = s"""
    akka.actor.deployment {
      /WordCount/consumers {
        router = $routerType
        nr-of-instances = $numConsumers
        dispatcher = consumer-dispatcher
      }
    }
    consumer-dispatcher {
      type = $dispatcherType
      executor = "fork-join-executor"
    }"""
}
The WordCount actor is responsible for creating the other actors. When the consumers are finished, the Producer sends a message with all tokens to its master. But how do I propagate that message further up, and how do I accept and wait for it? The architecture with the third WordCount actor might be wrong.
Main Routine
case class Run(name: String, actor: () => Actor, config: (Int) => String)

object Main extends App {
  val run = Run("push_implementation", WordCount.getActor _, WordCount.getConfig("balancing-pool", "Dispatcher") _)

  def execute(run: Run, numConsumers: Int) = {
    val config = ConfigFactory.parseString(run.config(numConsumers))
    val system = ActorSystem("Counting", ConfigFactory.load(config))
    val startTime = System.currentTimeMillis
    system.actorOf(Props(run.actor()), "WordCount")
    /*
      How to get the result here?!
    */
    system.awaitTermination
    System.currentTimeMillis - startTime
  }

  execute(run, 4)
}
Problem
As you can see, the actual problem is propagating the result back to the Main routine. Can you tell me how to do this properly? Another question is how to wait for the result until the consumers are finished. I had a brief look at the Akka Future documentation, but the whole system is a little overwhelming for beginners. Something like val future = actor ? message seems suitable, but I am not sure how to do this. Also, using the WordCount actor adds complexity. Maybe it is possible to come up with a solution that doesn't need this actor?
Consider using the Akka Aggregator Pattern. That takes care of the low-level primitives (watching actors, poison pill, etc). You can focus on managing state.
Your call to system.actorOf() returns an ActorRef, but you're not using it. You should ask that actor for results. Something like this:
// needs: import akka.pattern.ask, akka.util.Timeout, scala.concurrent.Await and scala.concurrent.duration._
implicit val timeout = Timeout(5 seconds)
val wCount = system.actorOf(Props(run.actor()), "WordCount")
val answer = Await.result(wCount ? "sent.txt", timeout.duration)
This means your WordCount class needs a receive method that accepts a String message. That section of code should aggregate the results and tell the sender(), like this:
class WordCount extends Actor {
  def receive: Receive = {
    case filename: String =>
      // do all of your code here, using filename
      sender() ! results
  }
}
Also, rather than blocking on the results with Await above, you can apply some techniques for handling Futures.
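For example, a non-blocking sketch (assuming the WordCount actor replies with a List[String]) could look like this:
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._
import scala.util.{Failure, Success}

implicit val timeout = Timeout(30.seconds)
import system.dispatcher // ExecutionContext for the callback

(wCount ? "sent.txt").mapTo[List[String]].onComplete {
  case Success(tokens) =>
    println(s"Got ${tokens.size} tokens")
    system.shutdown() // terminate the actor system once the result has been handled
  case Failure(ex) =>
    ex.printStackTrace()
    system.shutdown()
}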

akka Actor selection without race condition

I have a pool of futures, and each future works with the same Akka actor system - some actors in the system should be global, some are used only in one future.
val longFutures = for (i <- 0 until 2) yield Future {
  val p: Page = PhantomExecutor(isDebug = true)
  Await.result(p.open("http://www.stackoverflow.com/"), timeout = 10.seconds)
}
PhantomExecutor tries to use one shared global actor (a simple increment counter) via system.actorSelection:
def selectActor[T <: Actor : ClassTag](system: ActorSystem, name: String) = {
  val timeout = Timeout(0.1 seconds)
  val myFutureStuff = system.actorSelection("akka://" + system.name + "/user/" + name)
  val aid: ActorIdentity = Await.result(
    myFutureStuff.ask(Identify(1))(timeout).mapTo[ActorIdentity],
    0.1 seconds)
  aid.ref match {
    case Some(cacher) =>
      cacher
    case None =>
      system.actorOf(Props[T], name)
  }
}
But in a concurrent environment this approach does not work because of a race condition: two futures can both fail to find the actor and then both try to create it under the same name.
I know only one solution for this problem - create the global actors before splitting into futures. But this means that I can't encapsulate a lot of hidden work away from the top-level library user.
You're right that making sure the global actors are initialized first is the right approach. Can't you tie them to a companion object and reference them from there, so you know they will only ever be initialized once? If you really can't go with such an approach, then you could try something like this to look up or create the actor. It is similar to your code, but it includes logic to go back through the lookup/create step (recursively) if the race condition is hit, up to a maximum number of attempts:
def findOrCreateActor[T <: Actor : ClassTag](system: ActorSystem, name: String, maxAttempts: Int = 5): ActorRef = {
  import system.dispatcher
  val timeout = 0.1 seconds

  def doFindOrCreate(depth: Int = 0): ActorRef = {
    if (depth >= maxAttempts)
      throw new RuntimeException(s"Can not create actor with name $name and reached max attempts of $maxAttempts")

    val selection = system.actorSelection(s"/user/$name")
    val fut = selection.resolveOne(timeout).map(Some(_)).recover {
      case ex: ActorNotFound => None
    }
    val refOpt = Await.result(fut, timeout)

    refOpt match {
      case Some(ref) => ref
      case None => util.Try(system.actorOf(Props[T], name)).getOrElse(doFindOrCreate(depth + 1))
    }
  }

  doFindOrCreate()
}
As written, the retry logic fires for any exception when creating the actor, so you might want to narrow that down (probably via another recover combinator) so that it only recurses on an InvalidActorNameException, but you get the idea.
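A sketch of that narrowing, replacing the case None branch above (the only race we expect is another caller winning the name, which surfaces as InvalidActorNameException):
case None =>
  util.Try(system.actorOf(Props[T], name)).recover {
    // another future created the actor with this name first; retry the lookup
    case _: InvalidActorNameException => doFindOrCreate(depth + 1)
  }.get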
You may want to consider creating a manager actor that takes care of creating the "counter" actors. This way you ensure that counter-actor creation requests are serialized.
object CounterManagerActor {
  case class SelectActorRequest(name: String)
  case class SelectActorResponse(name: String, actorRef: ActorRef)
}

class CounterManagerActor extends Actor {
  def receive = {
    case SelectActorRequest(name) => {
      sender() ! SelectActorResponse(name, selectActor(name))
    }
  }

  private def selectActor(name: String) = {
    // a slightly modified version of the original selectActor() method
    ???
  }
}
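For illustration, the futures could then resolve their counter through the manager with an ask; the counterFor helper below is hypothetical:
import akka.actor.ActorRef
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration._

// every creation request goes through the single manager actor, so the
// "lookup failed, now two callers create the same name" race disappears
def counterFor(manager: ActorRef, name: String): ActorRef = {
  implicit val timeout = Timeout(1.second)
  val response = Await.result(
    (manager ? CounterManagerActor.SelectActorRequest(name))
      .mapTo[CounterManagerActor.SelectActorResponse],
    timeout.duration)
  response.actorRef
}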

Initializing an actor before being able to handle some other messages

I have an actor which creates another one:
class MyActor1 extends Actor {
  val a2 = context.actorOf(Props(new MyActor2(123)))
}
The second actor must initialize (bootstrap) itself once it is created, and only after that should it be able to do any other work.
class MyActor2(a: Int) extends Actor {
  //initialized (bootstrapped) itself, potentially a long operation
  //how?
  val initValue = // get from a server

  //handle incoming messages
  def receive = {
    case "job1" => // do some job but after it's initialized (bootstrapped) itself
  }
}
So the very first thing MyActor2 must do is initialize itself. That might take some time because it is a request to a server. Only after it finishes successfully should it be able to handle incoming messages through receive; before that, it must not.
Of course, the request to the server must be asynchronous (preferably using Future, not async/await or other high-level stuff like AsyncHttpClient). I know how to use Future, though; that's not the problem.
How do I ensure that?
p.s. My guess is that it must send a message to itself first.
You could use the become method to change the actor's behavior after initialization:
class MyActor2(a: Int) extends Actor {
  server ! GetInitializationData

  def initialize(d: InitializationData) = ???

  //handle incoming messages
  val initialized: Receive = {
    case "job1" => // do some job but after it's initialized (bootstrapped) itself
  }

  def receive = {
    case d: InitializationData =>
      initialize(d)
      context become initialized
  }
}
Note that such an actor will drop all messages it receives before initialization. You'll have to preserve these messages manually, for instance using Stash:
class MyActor2(a: Int) extends Actor with Stash {
  ...

  def receive = {
    case d: InitializationData =>
      initialize(d)
      unstashAll()
      context become initialized
    case _ => stash()
  }
}
If you don't want to use a var for the initialization data, you could create the initialized behavior from the InitializationData like this:
class MyActor2(a: Int) extends Actor {
  server ! GetInitializationData

  //handle incoming messages
  def initialized(intValue: Int, strValue: String): Receive = {
    case "job1" => // use `intValue` and `strValue` here
  }

  def receive = {
    case InitializationData(intValue, strValue) =>
      context become initialized(intValue, strValue)
  }
}
I don't know whether the proposed solution is a good idea. It seems awkward to me to send an initialization message. Actors have a lifecycle and offer some hooks. When you have a look at the API, you will discover the preStart hook.
Therefore I propose the following:
When the actor is created, its preStart hook runs; there you issue your server request, which returns a future.
While the future is not completed, all incoming messages are stashed.
When the future completes, use context.become to switch to your real/normal receive method.
After the become, you unstash everything.
Here is a rough sketch of the code (bad solution, see real solution below):
class MyActor2(a: Int) extends Actor with Stash {
  override def preStart(): Unit = {
    import context.dispatcher // execution context for the future callback
    val future = // do your necessary server request (should return a future)
    future onSuccess {
      case _ =>
        context.become(normalReceive)
        unstashAll()
    }
  }

  def receive = initialReceive

  def initialReceive: Receive = {
    case _ => stash()
  }

  def normalReceive: Receive = {
    // your normal Receive logic
  }
}
UPDATE: Improved solution according to Senia's feedback - the become/unstashAll calls no longer run inside the future callback (which executes on a different thread than the actor); instead the completion is signalled with a message to self:
class MyActor2(a: Int) extends Actor with Stash {
  override def preStart(): Unit = {
    import context.dispatcher // execution context for the future callback
    val future = // do your necessary server request (should return a future)
    future onSuccess {
      case _ => self ! InitializationDone
    }
  }

  def receive = initialReceive

  def initialReceive: Receive = {
    case InitializationDone =>
      context.become(normalReceive)
      unstashAll()
    case _ => stash()
  }

  def normalReceive: Receive = {
    // your normal Receive logic
  }

  case object InitializationDone
}