akka Actor selection without race condition - scala

I have a futures pool , and each future works with the same akka Actor System - some Actors in system should be global, some are used only in one future.
val longFutures = for (i <- 0 until 2 ) yield Future {
val p:Page = PhantomExecutor(isDebug=true)
Await.result( p.open("http://www.stackoverflow.com/") ,timeout = 10.seconds)
}
PhantomExecutor tryes to use one shared global actor (simple increment counter) using system.actorSelection
def selectActor[T <: Actor : ClassTag](system:ActorSystem,name:String) = {
val timeout = Timeout(0.1 seconds)
val myFutureStuff = system.actorSelection("akka://"+system.name+"/user/"+name)
val aid:ActorIdentity = Await.result(myFutureStuff.ask(Identify(1))(timeout).mapTo[ActorIdentity],
0.1 seconds)
aid.ref match {
case Some(cacher) =>
cacher
case None =>
system.actorOf(Props[T],name)
}
}
But in concurrent environment this approach does not work because of race condition.
I know only one solution for this problem - create global actors before splitting to futures. But this means that I can't encapsulate alot of hidden work from top library user.

You're right in that making sure the global actors are initialized first is the right approach. Can't you tie them to a companion object and reference them from there so you know they will only ever be initialized one time? If you really can't go with such an approach then you could try something like this to lookup or create the actor. It is similar to your code but it include logic to go back through the lookup/create logic (recursively) if the race condition is hit (only up to a max number of times):
def findOrCreateActor[T <: Actor : ClassTag](system:ActorSystem, name:String, maxAttempts:Int = 5):ActorRef = {
import system.dispatcher
val timeout = 0.1 seconds
def doFindOrCreate(depth:Int = 0):ActorRef = {
if (depth >= maxAttempts)
throw new RuntimeException(s"Can not create actor with name $name and reached max attempts of $maxAttempts")
val selection = system.actorSelection(s"/user/$name")
val fut = selection.resolveOne(timeout).map(Some(_)).recover{
case ex:ActorNotFound => None
}
val refOpt = Await.result(fut, timeout)
refOpt match {
case Some(ref) => ref
case None => util.Try(system.actorOf(Props[T],name)).getOrElse(doFindOrCreate(depth + 1))
}
}
doFindOrCreate()
}
Now the retry logic would fire for any exception when creating the actor, so you might want to further specify that (probably via another recover combinator) to only recurse when it gets an InvalidActorNameException, but you get the idea.

You may want to consider creating a manager actor that would take care about creating "counter" actors. This way you would ensure that counter actor creation requests are serialized.
object CounterManagerActor {
case class SelectActorRequest(name : String)
case class SelectActorResponse(name : String, actorRef : ActorRef)
}
class CounterManagerActor extends Actor {
def receive = {
case SelectActorRequest(name) => {
sender() ! SelectActorResponse(name, selectActor(name))
}
}
private def selectActor(name : String) = {
// a slightly modified version of the original selectActor() method
???
}
}

Related

akka - get list of ActorRef from ActorSelection

ActorSelection support wildcard and it allows to send a message to all actors that match the selection:
context.actorSelection("../*") ! msg
Is there a way to retrieve the list of ActorRef that match the selection from the ActorSelection?
From the documentation I can see that there is a resolveOne function but not (for example) a resolveList.
Update
The reason why I want the list of ActorRef is that I would like to use the ask operator on all the actors that match the selection.
Just to be clear here is an example of what I would like to do:
object MyActor {
case object AskAllActors
case object GetActorInfo
}
class MyActor extends Actor {
import MyActor._
def receive = {
case AskAllActors =>
val actors: List[ActorRef] = context.actorSelection("../*").resolveList()
val result: List[Future[String]] = actors.map { a => (a ? GetActorInfo).mapTo[String] }
Future.sequence(result).map { result =>
// do something with result: List[String]
}
}
}
The ability to get a list of the form
val timeout : FiniteDuration = 10 seconds
val actorList = actorSelection.resolveMany(timeout)
does not exist. I suspect the reason is because once the timeout expires there is a chance that an Iterable, of non-zero length, is returned but it would be impossible to know if the Iterable is comprehensive. Some Actors may not have had enough time to respond. With resolveOne the issue does not exist, the first responding ActorRef is the result.
Based on the documentation it looks like you could use the Identify message to have all of the Actors specified in an actorSelection respond with identification:
class Follower extends Actor {
val identifyId = 1
context.actorSelection("../*") ! Identify(identifyId)
def receive = {
case ActorIdentity(`identifyId`, Some(ref)) => ...
}
}

Akka-Streams ActorPublisher does not receive any Request messages

I am trying to continuously read the wikipedia IRC channel using this lib: https://github.com/implydata/wikiticker
I created a custom Akka Publisher, which will be used in my system as a Source.
Here are some of my classes:
class IrcPublisher() extends ActorPublisher[String] {
import scala.collection._
var queue: mutable.Queue[String] = mutable.Queue()
override def receive: Actor.Receive = {
case Publish(s) =>
println(s"->MSG, isActive = $isActive, totalDemand = $totalDemand")
queue.enqueue(s)
publishIfNeeded()
case Request(cnt) =>
println("Request: " + cnt)
publishIfNeeded()
case Cancel =>
println("Cancel")
context.stop(self)
case _ =>
println("Hm...")
}
def publishIfNeeded(): Unit = {
while (queue.nonEmpty && isActive && totalDemand > 0) {
println("onNext")
onNext(queue.dequeue())
}
}
}
object IrcPublisher {
case class Publish(data: String)
}
I am creating all this objects like so:
def createSource(wikipedias: Seq[String]) {
val dataPublisherRef = system.actorOf(Props[IrcPublisher])
val dataPublisher = ActorPublisher[String](dataPublisherRef)
val listener = new MessageListener {
override def process(message: Message) = {
dataPublisherRef ! Publish(Jackson.generate(message.toMap))
}
}
val ticker = new IrcTicker(
"irc.wikimedia.org",
"imply",
wikipedias map (x => s"#$x.wikipedia"),
Seq(listener)
)
ticker.start() // if I comment this...
Thread.currentThread().join() //... and this I get Request(...)
Source.fromPublisher(dataPublisher)
}
So the problem I am facing is this Source object. Although this implementation works well with other sources (for example from local file), the ActorPublisher don't receive Request() messages.
If I comment the two marked lines I can see, that my actor has received the Request(count) message from my flow. Otherwise all messages will be pushed into the queue, but not in my flow (so I can see the MSG messages printed).
I think it's something with multithreading/synchronization here.
I am not familiar enough with wikiticker to solve your problem as given. One question I would have is: why is it necessary to join to the current thread?
However, I think you have overcomplicated the usage of Source. It would be easier for you to work with the stream as a whole rather than create a custom ActorPublisher.
You can use Source.actorRef to materialize a stream into an ActorRef and work with that ActorRef. This allows you to utilize akka code to do the enqueing/dequeing onto the buffer while you can focus on the "business logic".
Say, for example, your entire stream is only to filter lines above a certain length and print them to the console. This could be accomplished with:
def dispatchIRCMessages(actorRef : ActorRef) = {
val ticker =
new IrcTicker("irc.wikimedia.org",
"imply",
wikipedias map (x => s"#$x.wikipedia"),
Seq(new MessageListener {
override def process(message: Message) =
actorRef ! Publish(Jackson.generate(message.toMap))
}))
ticker.start()
Thread.currentThread().join()
}
//these variables control the buffer behavior
val bufferSize = 1024
val overFlowStrategy = akka.stream.OverflowStrategy.dropHead
val minMessageSize = 32
//no need for a custom Publisher/Queue
val streamRef =
Source.actorRef[String](bufferSize, overFlowStrategy)
.via(Flow[String].filter(_.size > minMessageSize))
.to(Sink.foreach[String](println))
.run()
dispatchIRCMessages(streamRef)
The dispatchIRCMessages has the added benefit that it will work with any ActorRef so you aren't required to only work with streams/publishers.
Hopefully this solves your underlying problem...
I think the main problem is Thread.currentThread().join(). This line will 'hang' current thread because this thread is waiting for himself to die. Please read https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.html#join-long- .

Akka: The order of responses

My demo app is simple. Here is an actor:
class CounterActor extends Actor {
#volatile private[this] var counter = 0
def receive: PartialFunction[Any, Unit] = {
case Count(id) ⇒ sender ! self ? Increment(id)
case Increment(id) ⇒ sender ! {
counter += 1
println(s"req_id=$id, counter=$counter")
counter
}
}
}
The main app:
sealed trait ActorMessage
case class Count(id: Int = 0) extends ActorMessage
case class Increment(id: Int) extends ActorMessage
object CountingApp extends App {
// Get incremented counter
val future0 = counter ? Count(1)
val future1 = counter ? Count(2)
val future2 = counter ? Count(3)
val future3 = counter ? Count(4)
val future4 = counter ? Count(5)
// Handle response
handleResponse(future0)
handleResponse(future1)
handleResponse(future2)
handleResponse(future3)
handleResponse(future4)
// Bye!
exit()
}
My handler:
def handleResponse(future: Future[Any]): Unit = {
future.onComplete {
case Success(f) => f.asInstanceOf[Future[Any]].onComplete {
case x => x match {
case Success(n) => println(s" -> $n")
case Failure(t) => println(s" -> ${t.getMessage}")
}
}
case Failure(t) => println(t.getMessage)
}
}
If I run the app I'll see the next output:
req_id=1, counter=1
req_id=2, counter=2
req_id=3, counter=3
req_id=4, counter=4
req_id=5, counter=5
-> 4
-> 1
-> 5
-> 3
-> 2
The order of handled responses is random. Is it normal behaviour? If no, how can I make it ordered?
PS
Do I need volatile var in the actor?
PS2
Also, I'm looking for some more convenient logic for handleResponse, because matching here is very ambiguous...
normal behavior?
Yes, this is absolutely normal behavior.
Your Actor is receiving the Count increments in the order you sent them but the Futures are being completed via submission to an underlying thread pool. It is that indeterminate ordering of Future-thread binding that is resulting in the out of order println executions.
how can I make it ordered?
If you want ordered execution of Futures then that is synonymous with synchronous programming, i.e. no concurrency at all.
do I need volatile?
The state of an Actor is only accessible within an Actor itself. That is why users of the Actor never get an actual Actor object, they only get an ActorRef, e.g. val actorRef = actorSystem actorOf Props[Actor] . This is partially to ensure that users of Actors never have the ability to change an Actor's state except through messaging. From the docs:
The good news is that Akka actors conceptually each have their own
light-weight thread, which is completely shielded from the rest of the
system. This means that instead of having to synchronize access using
locks you can just write your actor code without worrying about
concurrency at all.
Therefore, you don't need volatile.
more convenient logic
For more convenient logic I would recommend Agents, which are a kind of typed Actor with a simpler message framework. From the docs:
import scala.concurrent.ExecutionContext.Implicits.global
import akka.agent.Agent
val agent = Agent(5)
val result = agent()
val result = agent.get
agent send 7
agent send (_ + 1)
Reads are synchronous but instantaneous. Writes are asynch. This means any time you do a read you don't have to worry about Futures because the internal value returns immediately. But definitely read the docs because there are more complicated tricks you can play with the queueing logic.
Real trouble in your approach is not asynchronous nature but overcomplicated logic.
And despite pretty answer from Ramon which I +1d, yes there is way to ensure order in some parts of akka. As we can read from the doc there is message ordering per sender–receiver pair guarantee.
It means that for each one-way channel of two actors there is guarantee, that messages will be delivered in order they been sent.
But there is no such guarantee for Future task accomplishment order which you are using to handle answers. And sending Future from ask as message to original sender is way strange.
Thing you can do:
redefine your Increment as
case class Increment(id: Int, requester: ActorRef) extends ActorMessage
so handler could know original requester
modify CounterActor's receive as
def receive: Receive = {
case Count(id) ⇒ self ! Increment(id, sender)
case Increment(id, snd) ⇒ snd ! {
counter += 1
println(s"req_id=$id, counter=$counter")
counter
}
}
simplify your handleResponse to
def handleResponse(future: Future[Any]): Unit = {
future.onComplete {
case Success(n: Int) => println(s" -> $n")
case Failure(t) => println(t.getMessage)
}
}
Now you can probably see that messages are received back in the same order.
I said probably because handling still occures in Future.onComplete so we need another actor to ensure the order.
Lets define additional message
case object StartCounting
And actor itself:
class SenderActor extends Actor {
val counter = system.actorOf(Props[CounterActor])
def receive: Actor.Receive = {
case n: Int => println(s" -> $n")
case StartCounting =>
counter ! Count(1)
counter ! Count(2)
counter ! Count(3)
counter ! Count(4)
counter ! Count(5)
}
}
In your main you can now just write
val sender = system.actorOf(Props[SenderActor])
sender ! StartCounting
And throw away that handleResponse method.
Now you definitely should see your message handling in the right order.
We've implemented whole logic without single ask, and that's good.
So magic rule is: leave handling responses to actors, get only final results from them via ask.
Note there is also forward method but this creates proxy actor so message ordering will be broken again.

Scala Akka Consumer/Producer: Return Value

Problem Statement
Assume I have a file with sentences that is processed line by line. In my case, I need to extract Named Entities (Persons, Organizations, ...) from these lines. Unfortunately, the tagger is quite slow. Therefore, I decided to parallelize the computation, such that lines could be processed independent from each other and the result is collected in a central location.
Current Approach
My current approach comprises the usage of a single producer multiple consumer concept. However, I'm relative new to Akka, but I think my problem description fits well into its capabilities. Let me show you some code:
Producer
The Producer reads the file line by line and sends it to the Consumer. If it reaches the total line limit, it propagates the result back to WordCount.
class Producer(consumers: ActorRef) extends Actor with ActorLogging {
var master: Option[ActorRef] = None
var result = immutable.List[String]()
var totalLines = 0
var linesProcessed = 0
override def receive = {
case StartProcessing() => {
master = Some(sender)
Source.fromFile("sent.txt", "utf-8").getLines.foreach { line =>
consumers ! Sentence(line)
totalLines += 1
}
context.stop(self)
}
case SentenceProcessed(list) => {
linesProcessed += 1
result :::= list
//If we are done, we can propagate the result to the creator
if (linesProcessed == totalLines) {
master.map(_ ! result)
}
}
case _ => log.error("message not recognized")
}
}
Consumer
class Consumer extends Actor with ActorLogging {
def tokenize(line: String): Seq[String] = {
line.split(" ").map(_.toLowerCase)
}
override def receive = {
case Sentence(sent) => {
//Assume: This is representative for the extensive computation method
val tokens = tokenize(sent)
sender() ! SentenceProcessed(tokens.toList)
}
case _ => log.error("message not recognized")
}
}
WordCount (Master)
class WordCount extends Actor {
val consumers = context.actorOf(Props[Consumer].
withRouter(FromConfig()).
withDispatcher("consumer-dispatcher"), "consumers")
val producer = context.actorOf(Props(new Producer(consumers)), "producer")
context.watch(consumers)
context.watch(producer)
def receive = {
case Terminated(`producer`) => consumers ! Broadcast(PoisonPill)
case Terminated(`consumers`) => context.system.shutdown
}
}
object WordCount {
def getActor() = new WordCount
def getConfig(routerType: String, dispatcherType: String)(numConsumers: Int) = s"""
akka.actor.deployment {
/WordCount/consumers {
router = $routerType
nr-of-instances = $numConsumers
dispatcher = consumer-dispatcher
}
}
consumer-dispatcher {
type = $dispatcherType
executor = "fork-join-executor"
}"""
}
The WordCount actor is responsible for creating the other actors. When the Consumer is finished the Producer sends a message with all tokens. But, how to propagate the message again and also accept and wait for it? The architecture with the third WordCount actor might be wrong.
Main Routine
case class Run(name: String, actor: () => Actor, config: (Int) => String)
object Main extends App {
val run = Run("push_implementation", WordCount.getActor _, WordCount.getConfig("balancing-pool", "Dispatcher") _)
def execute(run: Run, numConsumers: Int) = {
val config = ConfigFactory.parseString(run.config(numConsumers))
val system = ActorSystem("Counting", ConfigFactory.load(config))
val startTime = System.currentTimeMillis
system.actorOf(Props(run.actor()), "WordCount")
/*
How to get the result here?!
*/
system.awaitTermination
System.currentTimeMillis - startTime
}
execute(run, 4)
}
Problem
As you see, the actual problem is to propagate the result back to the Main routine. Can you tell me how to do this in a proper way? The question is also how to wait for the result until the consumers are finished? I had a brief look into the Akka Future documentation section, but the whole system is a little bit overwhelming for beginners. Something like var future = message ? actor seems suitable. Not sure, how to do this. Also using the WordCount actor causes additional complexity. Maybe it is possible to come up with a solution that doesn't need this actor?
Consider using the Akka Aggregator Pattern. That takes care of the low-level primitives (watching actors, poison pill, etc). You can focus on managing state.
Your call to system.actorOf() returns an ActorRef, but you're not using it. You should ask that actor for results. Something like this:
implicit val timeout = Timeout(5 seconds)
val wCount = system.actorOf(Props(run.actor()), "WordCount")
val answer = Await.result(wCount ? "sent.txt", timeout.duration)
This means your WordCount class needs a receive method that accepts a String message. That section of code should aggregate the results and tell the sender(), like this:
class WordCount extends Actor {
def receive: Receive = {
case filename: String =>
// do all of your code here, using filename
sender() ! results
}
}
Also, rather than blocking on the results with Await above, you can apply some techniques for handling Futures.

on demand actor get or else create

I can create actors with actorOf and look them with actorFor. I now want to get an actor by some id:String and if it doesnt exist, I want it to be created. Something like this:
def getRCActor(id: String):ActorRef = {
Logger.info("getting actor %s".format(id))
var a = system.actorFor(id)
if(a.isTerminated){
Logger.info("actor is terminated, creating new one")
return system.actorOf(Props[RC], id:String)
}else{
return a
}
}
But this doesn't work as isTerminated is always true and I get actor name 1 is not unique! exception for the second call. I guess I am using the wrong pattern here. Can someone help how to achieve this? I need
Create actors on demand
Lookup actors by id and if not present create them
Ability to destroy on, as I don't know if I will need it again
Should I use a Dispatcher or Router for this?
Solution
As proposed I use a concrete Supervisor that holds the available actors in a map. It can be asked to provide one of his children.
class RCSupervisor extends Actor {
implicit val timeout = Timeout(1 second)
var as = Map.empty[String, ActorRef]
def getRCActor(id: String) = as get id getOrElse {
val c = context actorOf Props[RC]
as += id -> c
context watch c
Logger.info("created actor")
c
}
def receive = {
case Find(id) => {
sender ! getRCActor(id)
}
case Terminated(ref) => {
Logger.info("actor terminated")
as = as filterNot { case (_, v) => v == ref }
}
}
}
His companion object
object RCSupervisor {
// this is specific to Playframework (Play's default actor system)
var supervisor = Akka.system.actorOf(Props[RCSupervisor])
implicit val timeout = Timeout(1 second)
def findA(id: String): ActorRef = {
val f = (supervisor ? Find(id))
Await.result(f, timeout.duration).asInstanceOf[ActorRef]
}
...
}
I've not been using akka for that long, but the creator of the actors is by default their supervisor. Hence the parent can listen for their termination;
var as = Map.empty[String, ActorRef]
def getRCActor(id: String) = as get id getOrElse {
val c = context actorOf Props[RC]
as += id -> c
context watch c
c
}
But obviously you need to watch for their Termination;
def receive = {
case Terminated(ref) => as = as filterNot { case (_, v) => v == ref }
Is that a solution? I must say I didn't completely understand what you meant by "terminated is always true => actor name 1 is not unique!"
Actors can only be created by their parent, and from your description I assume that you are trying to have the system create a non-toplevel actor, which will always fail. What you should do is to send a message to the parent saying “give me that child here”, then the parent can check whether that currently exists, is in good health, etc., possibly create a new one and then respond with an appropriate result message.
To reiterate this extremely important point: get-or-create can ONLY ever be done by the direct parent.
I based my solution to this problem on oxbow_lakes' code/suggestion, but instead of creating a simple collection of all the children actors I used a (bidirectional) map, which might be beneficial if the number of child actors is significant.
import play.api._
import akka.actor._
import scala.collection.mutable.Map
trait ResponsibleActor[K] extends Actor {
val keyActorRefMap: Map[K, ActorRef] = Map[K, ActorRef]()
val actorRefKeyMap: Map[ActorRef, K] = Map[ActorRef, K]()
def getOrCreateActor(key: K, props: => Props, name: => String): ActorRef = {
keyActorRefMap get key match {
case Some(ar) => ar
case None => {
val newRef: ActorRef = context.actorOf(props, name)
//newRef shouldn't be present in the map already (if the key is different)
actorRefKeyMap get newRef match{
case Some(x) => throw new Exception{}
case None =>
}
keyActorRefMap += Tuple2(key, newRef)
actorRefKeyMap += Tuple2(newRef, key)
newRef
}
}
}
def getOrCreateActorSimple(key: K, props: => Props): ActorRef = getOrCreateActor(key, props, key.toString)
/**
* method analogous to Actor's receive. Any subclasses should implement this method to handle all messages
* except for the Terminate(ref) message passed from children
*/
def responsibleReceive: Receive
def receive: Receive = {
case Terminated(ref) => {
//removing both key and actor ref from both maps
val pr: Option[Tuple2[K, ActorRef]] = for{
key <- actorRefKeyMap.get(ref)
reref <- keyActorRefMap.get(key)
} yield (key, reref)
pr match {
case None => //error
case Some((key, reref)) => {
actorRefKeyMap -= ref
keyActorRefMap -= key
}
}
}
case sth => responsibleReceive(sth)
}
}
To use this functionality you inherit from ResponsibleActor and implement responsibleReceive. Note: this code isn't yet thoroughly tested and might still have some issues. I ommited some error handling to improve readability.
Currently you can use Guice dependency injection with Akka, which is explained at http://www.lightbend.com/activator/template/activator-akka-scala-guice. You have to create an accompanying module for the actor. In its configure method you then need to create a named binding to the actor class and some properties. The properties could come from a configuration where, for example, a router is configured for the actor. You can also put the router configuration in there programmatically. Anywhere you need a reference to the actor you inject it with #Named("actorname"). The configured router will create an actor instance when needed.