I am trying to continuously read the wikipedia IRC channel using this lib: https://github.com/implydata/wikiticker
I created a custom Akka Publisher, which will be used in my system as a Source.
Here are some of my classes:
class IrcPublisher() extends ActorPublisher[String] {
import scala.collection._
var queue: mutable.Queue[String] = mutable.Queue()
override def receive: Actor.Receive = {
case Publish(s) =>
println(s"->MSG, isActive = $isActive, totalDemand = $totalDemand")
queue.enqueue(s)
publishIfNeeded()
case Request(cnt) =>
println("Request: " + cnt)
publishIfNeeded()
case Cancel =>
println("Cancel")
context.stop(self)
case _ =>
println("Hm...")
}
def publishIfNeeded(): Unit = {
while (queue.nonEmpty && isActive && totalDemand > 0) {
println("onNext")
onNext(queue.dequeue())
}
}
}
object IrcPublisher {
case class Publish(data: String)
}
I am creating all this objects like so:
def createSource(wikipedias: Seq[String]) {
val dataPublisherRef = system.actorOf(Props[IrcPublisher])
val dataPublisher = ActorPublisher[String](dataPublisherRef)
val listener = new MessageListener {
override def process(message: Message) = {
dataPublisherRef ! Publish(Jackson.generate(message.toMap))
}
}
val ticker = new IrcTicker(
"irc.wikimedia.org",
"imply",
wikipedias map (x => s"#$x.wikipedia"),
Seq(listener)
)
ticker.start() // if I comment this...
Thread.currentThread().join() //... and this I get Request(...)
Source.fromPublisher(dataPublisher)
}
So the problem I am facing is this Source object. Although this implementation works well with other sources (for example from local file), the ActorPublisher don't receive Request() messages.
If I comment the two marked lines I can see, that my actor has received the Request(count) message from my flow. Otherwise all messages will be pushed into the queue, but not in my flow (so I can see the MSG messages printed).
I think it's something with multithreading/synchronization here.
I am not familiar enough with wikiticker to solve your problem as given. One question I would have is: why is it necessary to join to the current thread?
However, I think you have overcomplicated the usage of Source. It would be easier for you to work with the stream as a whole rather than create a custom ActorPublisher.
You can use Source.actorRef to materialize a stream into an ActorRef and work with that ActorRef. This allows you to utilize akka code to do the enqueing/dequeing onto the buffer while you can focus on the "business logic".
Say, for example, your entire stream is only to filter lines above a certain length and print them to the console. This could be accomplished with:
def dispatchIRCMessages(actorRef : ActorRef) = {
val ticker =
new IrcTicker("irc.wikimedia.org",
"imply",
wikipedias map (x => s"#$x.wikipedia"),
Seq(new MessageListener {
override def process(message: Message) =
actorRef ! Publish(Jackson.generate(message.toMap))
}))
ticker.start()
Thread.currentThread().join()
}
//these variables control the buffer behavior
val bufferSize = 1024
val overFlowStrategy = akka.stream.OverflowStrategy.dropHead
val minMessageSize = 32
//no need for a custom Publisher/Queue
val streamRef =
Source.actorRef[String](bufferSize, overFlowStrategy)
.via(Flow[String].filter(_.size > minMessageSize))
.to(Sink.foreach[String](println))
.run()
dispatchIRCMessages(streamRef)
The dispatchIRCMessages has the added benefit that it will work with any ActorRef so you aren't required to only work with streams/publishers.
Hopefully this solves your underlying problem...
I think the main problem is Thread.currentThread().join(). This line will 'hang' current thread because this thread is waiting for himself to die. Please read https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.html#join-long- .
Related
I have an external (that is, I cannot change it) Java API which looks like this:
public interface Sender {
void send(Event e);
}
I need to implement a Sender which accepts each event, transforms it to a JSON object, collects some number of them into a single bundle and sends over HTTP to some endpoint. This all should be done asynchronously, without send() blocking the calling thread, with some fixed-size buffer and dropping new events if the buffer is full.
With akka-streams this is quite simple: I create a graph of stages (which uses akka-http to send HTTP requests), materialize it and use the materialized ActorRef to push new events to the stream:
lazy val eventPipeline = Source.actorRef[Event](Int.MaxValue, OverflowStrategy.fail)
.via(CustomBuffer(bufferSize)) // buffer all events
.groupedWithin(batchSize, flushDuration) // group events into chunks
.map(toBundle) // convert each chunk into a JSON message
.mapAsyncUnordered(1)(sendHttpRequest) // send an HTTP request
.toMat(Sink.foreach { response =>
// print HTTP response for debugging
})(Keep.both)
lazy val (eventsActor, completeFuture) = eventPipeline.run()
override def send(e: Event): Unit = {
eventsActor ! e
}
Here CustomBuffer is a custom GraphStage which is very similar to the library-provided Buffer but tailored to our specific needs; it probably does not matter for this particular question.
As you can see, interacting with the stream from non-stream code is very simple - the ! method on the ActorRef trait is asynchronous and does not need any additional machinery to be called. Each event which is sent to the actor is then processed through the entire reactive pipeline. Moreover, because of how akka-http is implemented, I even get connection pooling for free, so no more than one connection is opened to the server.
However, I cannot find a way to do the same thing with FS2 properly. Even discarding the question of buffering (I will probably need to write a custom Pipe implementation which does additional things that we need) and HTTP connection pooling, I'm still stuck with a more basic thing - that is, how to push the data to the reactive stream "from outside".
All tutorials and documentation that I can find assume that the entire program happens inside some effect context, usually IO. This is not my case - the send() method is invoked by the Java library at unspecified times. Therefore, I just cannot keep everything inside one IO action, I necessarily have to finalize the "push" action inside the send() method, and have the reactive stream as a separate entity, because I want to aggregate events and hopefully pool HTTP connections (which I believe is naturally tied to the reactive stream).
I assume that I need some additional data structure, like Queue. fs2 does indeed have some kind of fs2.concurrent.Queue, but again, all documentation shows how to use it inside a single IO context, so I assume that doing something like
val queue: Queue[IO, Event] = Queue.unbounded[IO, Event].unsafeRunSync()
and then using queue inside the stream definition and then separately inside the send() method with further unsafeRun calls:
val eventPipeline = queue.dequeue
.through(customBuffer(bufferSize))
.groupWithin(batchSize, flushDuration)
.map(toBundle)
.mapAsyncUnordered(1)(sendRequest)
.evalTap(response => ...)
.compile
.drain
eventPipeline.unsafeRunAsync(...) // or something
override def send(e: Event) {
queue.enqueue(e).unsafeRunSync()
}
is not the correct way and most likely would not even work.
So, my question is, how do I properly use fs2 to solve my problem?
Consider the following example:
import cats.implicits._
import cats.effect._
import cats.effect.implicits._
import fs2._
import fs2.concurrent.Queue
import scala.concurrent.ExecutionContext
import scala.concurrent.duration._
object Answer {
type Event = String
trait Sender {
def send(event: Event): Unit
}
def main(args: Array[String]): Unit = {
val sender: Sender = {
val ec = ExecutionContext.global
implicit val cs: ContextShift[IO] = IO.contextShift(ec)
implicit val timer: Timer[IO] = IO.timer(ec)
fs2Sender[IO](2)
}
val events = List("a", "b", "c", "d")
events.foreach { evt => new Thread(() => sender.send(evt)).start() }
Thread sleep 3000
}
def fs2Sender[F[_]: Timer : ContextShift](maxBufferedSize: Int)(implicit F: ConcurrentEffect[F]): Sender = {
// dummy impl
// this is where the actual logic for batching
// and shipping over the network would live
val consume: Pipe[F, Event, Unit] = _.evalMap { event =>
for {
_ <- F.delay { println(s"consuming [$event]...") }
_ <- Timer[F].sleep(1.seconds)
_ <- F.delay { println(s"...[$event] consumed") }
} yield ()
}
val suspended = for {
q <- Queue.bounded[F, Event](maxBufferedSize)
_ <- q.dequeue.through(consume).compile.drain.start
sender <- F.delay[Sender] { evt =>
val enqueue = for {
wasEnqueued <- q.offer1(evt)
_ <- F.delay { println(s"[$evt] enqueued? $wasEnqueued") }
} yield ()
enqueue.toIO.unsafeRunAsyncAndForget()
}
} yield sender
suspended.toIO.unsafeRunSync()
}
}
The main idea is to use a concurrent Queue from fs2. Note, that the above code demonstrates that neither the Sender interface nor the logic in main can be changed. Only an implementation of the Sender interface can be swapped out.
I don't have much experience with exactly that library but it should look somehow like that:
import cats.effect.{ExitCode, IO, IOApp}
import fs2.concurrent.Queue
case class Event(id: Int)
class JavaProducer{
new Thread(new Runnable {
override def run(): Unit = {
var id = 0
while(true){
Thread.sleep(1000)
id += 1
send(Event(id))
}
}
}).start()
def send(event: Event): Unit ={
println(s"Original producer prints $event")
}
}
class HackedProducer(queue: Queue[IO, Event]) extends JavaProducer {
override def send(event: Event): Unit = {
println(s"Hacked producer pushes $event")
queue.enqueue1(event).unsafeRunSync()
println(s"Hacked producer pushes $event - Pushed")
}
}
object Test extends IOApp{
override def run(args: List[String]): IO[ExitCode] = {
val x: IO[Unit] = for {
queue <- Queue.unbounded[IO, Event]
_ = new HackedProducer(queue)
done <- queue.dequeue.map(ev => {
println(s"Got $ev")
}).compile.drain
} yield done
x.map(_ => ExitCode.Success)
}
}
We can create a bounded queue that will consume elements from sender and make them available to fs2 stream processing.
import cats.effect.IO
import cats.effect.std.Queue
import fs2.Stream
trait Sender[T]:
def send(e: T): Unit
object Sender:
def apply[T](bufferSize: Int): IO[(Sender[T], Stream[IO, T])] =
for
q <- Queue.bounded[IO, T](bufferSize)
yield
val sender: Sender[T] = (e: T) => q.offer(e).unsafeRunSync()
def stm: Stream[IO, T] = Stream.eval(q.take) ++ stm
(sender, stm)
Then we'll have two ends - one for Java worlds, to send new elements to Sender. Another one - for stream processing in fs2.
class TestSenderQueue:
#Test def testSenderQueue: Unit =
val (sender, stream) = Sender[Int](1)
.unsafeRunSync()// we have to run it preliminary to make `sender` available to external system
val processing =
stream
.map(i => i * i)
.evalMap{ ii => IO{ println(ii)}}
sender.send(1)
processing.compile.toList.start//NB! we start processing in a separate fiber
.unsafeRunSync() // immediately right now.
sender.send(2)
Thread.sleep(100)
(0 until 100).foreach(sender.send)
println("finished")
Note that we push data in the current thread and have to run fs2 in a separate thread (.start).
I have two actors in my system. Talker and Conversation. Conversation consists in two talkers (by now). When a Talker wants to join a conversation I should check if conversation exists (another talker has created it) and if it not, create it. I have this code in a method of my Talker actor:
def getOrCreateConversation(conversationId: UUID): ActorRef = {
// #TODO try to get conversation actor by conversationId
context.actorSelection("user/conversation/" + conversationId.toString)
// #TODO if it not exists... create it
context.actorOf(Conversation.props(conversationId), conversationId.toString)
}
As you can see, when I create my converastion actor with actorOf I'm passing as a second argument the conversationId. I do this for easy searching this actor... Is it the correct way to do this?
Thank you
edited
Thanks to #Arne I've finally did this:
class ConversationRouter extends Actor with ActorLogging {
def receive = {
case ConversationEnv(conversationId, msg) =>
val conversation = findConversation(conversationId) match {
case None => createNewConversation(conversationId)
case Some(x) => x
}
conversation forward msg
}
def findConversation(conversationId: UUID): Option[ActorRef] = context.child(conversationId.toString)
def createNewConversation(conversationId: UUID): ActorRef = {
context.actorOf(Conversation.props(conversationId), conversationId.toString)
}
}
And the test:
class ConversationRouterSpec extends ChatUnitTestCase("ConversationRouterSpec") {
trait ConversationRouterSpecHelper {
val conversationId = UUID.randomUUID()
var newConversationCreated = false
def conversationRouterWithConversation(existingConversation: Option[ActorRef]) = {
val conversationRouterRef = TestActorRef(new ConversationRouter {
override def findConversation(conversationId: UUID) = existingConversation
override def createNewConversation(conversationId: UUID) = {
newConversationCreated = true
TestProbe().ref
}
})
conversationRouterRef
}
}
"ConversationRouter" should {
"create a new conversation when a talker join it" in new ConversationRouterSpecHelper {
val nonExistingConversationOption = None
val conversationRouterRef = conversationRouterWithConversation(nonExistingConversationOption)
conversationRouterRef ! ConversationEnv(conversationId, Join(conversationId))
newConversationCreated should be(right = true)
}
"not create a new conversation if it already exists" in new ConversationRouterSpecHelper {
val existingConversation = Option(TestProbe().ref)
val conversationRouterRef = conversationRouterWithConversation(existingConversation)
conversationRouterRef ! ConversationEnv(conversationId, Join(conversationId))
newConversationCreated should be(right = false)
}
}
}
Determining the existence of an actor cannot be done synchronously. So you have a couple of choices. The first two are more conceptual in nature to illustrate doing asynchronous lookups, but I offer them more for reference about the asynchronous nature of actors. The third is likely the correct way of doing things:
1. Make the function return a Future[ActorRef]
def getOrCreateConversation(conversationId: UUID): Unit {
context.actorSelection(s"user/conversation/$conversationId")
.resolveOne()
.recover { case _:Exception =>
context.actorOf(Conversation.props(conversationId),conversationId.toString)
}
}
2. Make it Unit and have it send the ActorRef back to your current actor
Pretty much the same as the above, but now you we pipe the future back the current actor, so that the resolved actor can be dealt with in the context of the calling actor's receive loop:
def getOrCreateConversation(conversationId: UUID): Unit {
context.actorSelection(s"user/conversation/$conversationId")
.resolveOne()
.recover { case _:Exception =>
context.actorOf(Conversation.props(conversationId),conversationId.toString)
}.pipeTo(self)
}
3. Create a router actor that you send your Id'ed messages to and it creates/resolves the child and forwards the message
I say that this is likely the correct way, since your goal seems to be cheap lookup at a specific named path. The example you give makes the assumption that the function is always called from within the actor at path /user/conversation otherwise the context.actorOf would not create the child at /user/conversation/{id}/.
Which is to say that you have a router pattern on your hands and the child you create is already known to the router in its child collection. This pattern assumes you have an envelope around any conversation message, something like this:
case class ConversationEnv(id: UUID, msg: Any)
Now all conversation messages get sent to the router instead of to the conversation child directly. The router can now look up the child in its child collection:
def receive = {
case ConversationEnv(id,msg) =>
val conversation = context.child(id.toString) match {
case None => context.actorOf(Conversation.props(id),id.toString)
case Some(x) => x
}
conversation forward msg
}
The additional benefit is that your router is also the conversation supervisor, so if the conversation child dies, it can deal with it. Not exposing the child ActorRef to the outside world also has the benefit that you could have it die when idle and have it get re-created on the next message receipt, etc.
Problem Statement
Assume I have a file with sentences that is processed line by line. In my case, I need to extract Named Entities (Persons, Organizations, ...) from these lines. Unfortunately, the tagger is quite slow. Therefore, I decided to parallelize the computation, such that lines could be processed independent from each other and the result is collected in a central location.
Current Approach
My current approach comprises the usage of a single producer multiple consumer concept. However, I'm relative new to Akka, but I think my problem description fits well into its capabilities. Let me show you some code:
Producer
The Producer reads the file line by line and sends it to the Consumer. If it reaches the total line limit, it propagates the result back to WordCount.
class Producer(consumers: ActorRef) extends Actor with ActorLogging {
var master: Option[ActorRef] = None
var result = immutable.List[String]()
var totalLines = 0
var linesProcessed = 0
override def receive = {
case StartProcessing() => {
master = Some(sender)
Source.fromFile("sent.txt", "utf-8").getLines.foreach { line =>
consumers ! Sentence(line)
totalLines += 1
}
context.stop(self)
}
case SentenceProcessed(list) => {
linesProcessed += 1
result :::= list
//If we are done, we can propagate the result to the creator
if (linesProcessed == totalLines) {
master.map(_ ! result)
}
}
case _ => log.error("message not recognized")
}
}
Consumer
class Consumer extends Actor with ActorLogging {
def tokenize(line: String): Seq[String] = {
line.split(" ").map(_.toLowerCase)
}
override def receive = {
case Sentence(sent) => {
//Assume: This is representative for the extensive computation method
val tokens = tokenize(sent)
sender() ! SentenceProcessed(tokens.toList)
}
case _ => log.error("message not recognized")
}
}
WordCount (Master)
class WordCount extends Actor {
val consumers = context.actorOf(Props[Consumer].
withRouter(FromConfig()).
withDispatcher("consumer-dispatcher"), "consumers")
val producer = context.actorOf(Props(new Producer(consumers)), "producer")
context.watch(consumers)
context.watch(producer)
def receive = {
case Terminated(`producer`) => consumers ! Broadcast(PoisonPill)
case Terminated(`consumers`) => context.system.shutdown
}
}
object WordCount {
def getActor() = new WordCount
def getConfig(routerType: String, dispatcherType: String)(numConsumers: Int) = s"""
akka.actor.deployment {
/WordCount/consumers {
router = $routerType
nr-of-instances = $numConsumers
dispatcher = consumer-dispatcher
}
}
consumer-dispatcher {
type = $dispatcherType
executor = "fork-join-executor"
}"""
}
The WordCount actor is responsible for creating the other actors. When the Consumer is finished the Producer sends a message with all tokens. But, how to propagate the message again and also accept and wait for it? The architecture with the third WordCount actor might be wrong.
Main Routine
case class Run(name: String, actor: () => Actor, config: (Int) => String)
object Main extends App {
val run = Run("push_implementation", WordCount.getActor _, WordCount.getConfig("balancing-pool", "Dispatcher") _)
def execute(run: Run, numConsumers: Int) = {
val config = ConfigFactory.parseString(run.config(numConsumers))
val system = ActorSystem("Counting", ConfigFactory.load(config))
val startTime = System.currentTimeMillis
system.actorOf(Props(run.actor()), "WordCount")
/*
How to get the result here?!
*/
system.awaitTermination
System.currentTimeMillis - startTime
}
execute(run, 4)
}
Problem
As you see, the actual problem is to propagate the result back to the Main routine. Can you tell me how to do this in a proper way? The question is also how to wait for the result until the consumers are finished? I had a brief look into the Akka Future documentation section, but the whole system is a little bit overwhelming for beginners. Something like var future = message ? actor seems suitable. Not sure, how to do this. Also using the WordCount actor causes additional complexity. Maybe it is possible to come up with a solution that doesn't need this actor?
Consider using the Akka Aggregator Pattern. That takes care of the low-level primitives (watching actors, poison pill, etc). You can focus on managing state.
Your call to system.actorOf() returns an ActorRef, but you're not using it. You should ask that actor for results. Something like this:
implicit val timeout = Timeout(5 seconds)
val wCount = system.actorOf(Props(run.actor()), "WordCount")
val answer = Await.result(wCount ? "sent.txt", timeout.duration)
This means your WordCount class needs a receive method that accepts a String message. That section of code should aggregate the results and tell the sender(), like this:
class WordCount extends Actor {
def receive: Receive = {
case filename: String =>
// do all of your code here, using filename
sender() ! results
}
}
Also, rather than blocking on the results with Await above, you can apply some techniques for handling Futures.
I have a futures pool , and each future works with the same akka Actor System - some Actors in system should be global, some are used only in one future.
val longFutures = for (i <- 0 until 2 ) yield Future {
val p:Page = PhantomExecutor(isDebug=true)
Await.result( p.open("http://www.stackoverflow.com/") ,timeout = 10.seconds)
}
PhantomExecutor tryes to use one shared global actor (simple increment counter) using system.actorSelection
def selectActor[T <: Actor : ClassTag](system:ActorSystem,name:String) = {
val timeout = Timeout(0.1 seconds)
val myFutureStuff = system.actorSelection("akka://"+system.name+"/user/"+name)
val aid:ActorIdentity = Await.result(myFutureStuff.ask(Identify(1))(timeout).mapTo[ActorIdentity],
0.1 seconds)
aid.ref match {
case Some(cacher) =>
cacher
case None =>
system.actorOf(Props[T],name)
}
}
But in concurrent environment this approach does not work because of race condition.
I know only one solution for this problem - create global actors before splitting to futures. But this means that I can't encapsulate alot of hidden work from top library user.
You're right in that making sure the global actors are initialized first is the right approach. Can't you tie them to a companion object and reference them from there so you know they will only ever be initialized one time? If you really can't go with such an approach then you could try something like this to lookup or create the actor. It is similar to your code but it include logic to go back through the lookup/create logic (recursively) if the race condition is hit (only up to a max number of times):
def findOrCreateActor[T <: Actor : ClassTag](system:ActorSystem, name:String, maxAttempts:Int = 5):ActorRef = {
import system.dispatcher
val timeout = 0.1 seconds
def doFindOrCreate(depth:Int = 0):ActorRef = {
if (depth >= maxAttempts)
throw new RuntimeException(s"Can not create actor with name $name and reached max attempts of $maxAttempts")
val selection = system.actorSelection(s"/user/$name")
val fut = selection.resolveOne(timeout).map(Some(_)).recover{
case ex:ActorNotFound => None
}
val refOpt = Await.result(fut, timeout)
refOpt match {
case Some(ref) => ref
case None => util.Try(system.actorOf(Props[T],name)).getOrElse(doFindOrCreate(depth + 1))
}
}
doFindOrCreate()
}
Now the retry logic would fire for any exception when creating the actor, so you might want to further specify that (probably via another recover combinator) to only recurse when it gets an InvalidActorNameException, but you get the idea.
You may want to consider creating a manager actor that would take care about creating "counter" actors. This way you would ensure that counter actor creation requests are serialized.
object CounterManagerActor {
case class SelectActorRequest(name : String)
case class SelectActorResponse(name : String, actorRef : ActorRef)
}
class CounterManagerActor extends Actor {
def receive = {
case SelectActorRequest(name) => {
sender() ! SelectActorResponse(name, selectActor(name))
}
}
private def selectActor(name : String) = {
// a slightly modified version of the original selectActor() method
???
}
}
I can create actors with actorOf and look them with actorFor. I now want to get an actor by some id:String and if it doesnt exist, I want it to be created. Something like this:
def getRCActor(id: String):ActorRef = {
Logger.info("getting actor %s".format(id))
var a = system.actorFor(id)
if(a.isTerminated){
Logger.info("actor is terminated, creating new one")
return system.actorOf(Props[RC], id:String)
}else{
return a
}
}
But this doesn't work as isTerminated is always true and I get actor name 1 is not unique! exception for the second call. I guess I am using the wrong pattern here. Can someone help how to achieve this? I need
Create actors on demand
Lookup actors by id and if not present create them
Ability to destroy on, as I don't know if I will need it again
Should I use a Dispatcher or Router for this?
Solution
As proposed I use a concrete Supervisor that holds the available actors in a map. It can be asked to provide one of his children.
class RCSupervisor extends Actor {
implicit val timeout = Timeout(1 second)
var as = Map.empty[String, ActorRef]
def getRCActor(id: String) = as get id getOrElse {
val c = context actorOf Props[RC]
as += id -> c
context watch c
Logger.info("created actor")
c
}
def receive = {
case Find(id) => {
sender ! getRCActor(id)
}
case Terminated(ref) => {
Logger.info("actor terminated")
as = as filterNot { case (_, v) => v == ref }
}
}
}
His companion object
object RCSupervisor {
// this is specific to Playframework (Play's default actor system)
var supervisor = Akka.system.actorOf(Props[RCSupervisor])
implicit val timeout = Timeout(1 second)
def findA(id: String): ActorRef = {
val f = (supervisor ? Find(id))
Await.result(f, timeout.duration).asInstanceOf[ActorRef]
}
...
}
I've not been using akka for that long, but the creator of the actors is by default their supervisor. Hence the parent can listen for their termination;
var as = Map.empty[String, ActorRef]
def getRCActor(id: String) = as get id getOrElse {
val c = context actorOf Props[RC]
as += id -> c
context watch c
c
}
But obviously you need to watch for their Termination;
def receive = {
case Terminated(ref) => as = as filterNot { case (_, v) => v == ref }
Is that a solution? I must say I didn't completely understand what you meant by "terminated is always true => actor name 1 is not unique!"
Actors can only be created by their parent, and from your description I assume that you are trying to have the system create a non-toplevel actor, which will always fail. What you should do is to send a message to the parent saying “give me that child here”, then the parent can check whether that currently exists, is in good health, etc., possibly create a new one and then respond with an appropriate result message.
To reiterate this extremely important point: get-or-create can ONLY ever be done by the direct parent.
I based my solution to this problem on oxbow_lakes' code/suggestion, but instead of creating a simple collection of all the children actors I used a (bidirectional) map, which might be beneficial if the number of child actors is significant.
import play.api._
import akka.actor._
import scala.collection.mutable.Map
trait ResponsibleActor[K] extends Actor {
val keyActorRefMap: Map[K, ActorRef] = Map[K, ActorRef]()
val actorRefKeyMap: Map[ActorRef, K] = Map[ActorRef, K]()
def getOrCreateActor(key: K, props: => Props, name: => String): ActorRef = {
keyActorRefMap get key match {
case Some(ar) => ar
case None => {
val newRef: ActorRef = context.actorOf(props, name)
//newRef shouldn't be present in the map already (if the key is different)
actorRefKeyMap get newRef match{
case Some(x) => throw new Exception{}
case None =>
}
keyActorRefMap += Tuple2(key, newRef)
actorRefKeyMap += Tuple2(newRef, key)
newRef
}
}
}
def getOrCreateActorSimple(key: K, props: => Props): ActorRef = getOrCreateActor(key, props, key.toString)
/**
* method analogous to Actor's receive. Any subclasses should implement this method to handle all messages
* except for the Terminate(ref) message passed from children
*/
def responsibleReceive: Receive
def receive: Receive = {
case Terminated(ref) => {
//removing both key and actor ref from both maps
val pr: Option[Tuple2[K, ActorRef]] = for{
key <- actorRefKeyMap.get(ref)
reref <- keyActorRefMap.get(key)
} yield (key, reref)
pr match {
case None => //error
case Some((key, reref)) => {
actorRefKeyMap -= ref
keyActorRefMap -= key
}
}
}
case sth => responsibleReceive(sth)
}
}
To use this functionality you inherit from ResponsibleActor and implement responsibleReceive. Note: this code isn't yet thoroughly tested and might still have some issues. I ommited some error handling to improve readability.
Currently you can use Guice dependency injection with Akka, which is explained at http://www.lightbend.com/activator/template/activator-akka-scala-guice. You have to create an accompanying module for the actor. In its configure method you then need to create a named binding to the actor class and some properties. The properties could come from a configuration where, for example, a router is configured for the actor. You can also put the router configuration in there programmatically. Anywhere you need a reference to the actor you inject it with #Named("actorname"). The configured router will create an actor instance when needed.