Akka: how to know that all child actors have finished their job

I created a Master actor and child actors (created from the Master using a router).
The Master receives a job, splits it into small tasks, and sends them to the child actors (the routees).
The problem I am trying to solve is how to properly notify the Master when the child actors have finished their job.
In some tutorials (the Pi approximation and an example from the Scala in Action book), the Master actor, after receiving a response from a child, compares the size of the initial array of tasks with the number of results received:
if (receivedResultsFromChildren.size == initialTasks.size) {
  // this means the children have finished their job
}
But I think this is a bad approach: if a child actor throws an exception, it will not send a result back to the sender (the Master), so this condition will never evaluate to true.
So how do I properly notify the Master that all children have finished their jobs?
I think one option is to send Broadcast(PoisonPill) to the children and then listen for the Terminated(router) message (using so-called DeathWatch). Is that an acceptable solution?
If using Broadcast(PoisonPill) is better, should I register a supervisor strategy that stops a routee in case of an exception? As far as I know, if an exception occurs the routee will be restarted, which means the Master actor will never receive Terminated(router). Is that correct?

In Akka this is actually quite simple.
The successful children can send an ordinary reply message to the parent actor. Unexpected failures in child actors can be caught in the parent's supervisor strategy and handled appropriately (e.g. by restarting the actor, or by stopping it and removing it from the set of actors being waited for).
So it could look something like this:
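import akka.actor.{ActorRef, OneForOneStrategy}
import akka.actor.SupervisorStrategy.Stop

// Inside the Master actor:
var waitingFor = Set.empty[ActorRef]

override def preStart() = ??? // Start the children with their subtasks (and record them in waitingFor)

override def supervisorStrategy = OneForOneStrategy() {
  case _ =>
    // In the decider, sender() is the child that failed
    waitingFor -= sender()
    if (waitingFor.isEmpty) ??? // processing finished
    Stop
}

override def receive = {
  case Reply =>
    waitingFor -= sender()
    if (waitingFor.isEmpty) ??? // processing finished
}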
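The bookkeeping behind the waitingFor set can be tried out without Akka. The following is only a minimal sketch under that assumption: CompletionTracker, onReply and onFailure are hypothetical names, and plain worker ids stand in for ActorRefs. The point it illustrates is that a failed worker is removed from the set just like a successful one, so the "all done" condition is always reached.

```scala
// Hypothetical, Akka-free model of the "waitingFor" bookkeeping.
// Worker ids stand in for ActorRefs.
final class CompletionTracker[Id](workers: Set[Id]) {
  private var waitingFor: Set[Id] = workers
  private var failed: Set[Id] = Set.empty

  /** A worker replied normally. Returns true once nothing is outstanding. */
  def onReply(id: Id): Boolean = {
    waitingFor -= id
    isDone
  }

  /** A worker failed and was stopped. It still counts as finished. */
  def onFailure(id: Id): Boolean = {
    waitingFor -= id
    failed += id
    isDone
  }

  def isDone: Boolean = waitingFor.isEmpty
  def failures: Set[Id] = failed
}

object CompletionTrackerDemo {
  def main(args: Array[String]): Unit = {
    val tracker = new CompletionTracker(Set("a", "b", "c"))
    tracker.onReply("a")
    tracker.onFailure("b") // "b" threw an exception and was stopped
    tracker.onReply("c")
    println(s"done=${tracker.isDone}, failed=${tracker.failures}")
  }
}
```

In an actor, onReply would correspond to the Reply case in receive and onFailure to the supervisor strategy's decider.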


Spark throws Not Serializable Exception inside a foreachRDD operation

I'm trying to implement the observer pattern using Scala and Spark Streaming. The idea is that whenever I receive a record from the stream (from Kafka), I notify the observers by calling the method notifyObservers inside the closure. Here's the code:
def onMessageConsumed() = {
  stream.foreachRDD(rdd => {
    rdd.foreach(consumerRecord => {
      val record = new Record[T](consumerRecord.topic(),
        consumerRecord.value())
      // notify observers with the record to compute
      notifyObservers(record)
    })
  })
}
The stream is provided by the Kafka utils.
The method notifyObservers is defined in an abstract class, following the rules of the pattern.
I think the error is related to the fact that methods can't be serialized.
Am I thinking correctly? And if so, what kind of solution should I follow?
Thanks
Yes, the classes used in code that is sent to the executors (the body of foreach, etc.) must implement the Serializable interface. Because notifyObservers is an instance method, calling it inside the closure captures the enclosing object, so that object must be serializable as well.
Also, if your notification code requires a connection to some resource, you need to wrap foreach in foreachPartition, something like this:
stream.foreachRDD(rdd => {
  rdd.foreachPartition(rddPartition => {
    // set up connection to external component
    rddPartition.foreach(consumerRecord => {
      val record = new Record[T](consumerRecord.topic(),
        consumerRecord.value())
      notifyObservers(record)
    })
    // close connection to external component
  })
})
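Why calling a method inside the closure triggers the exception can be reproduced without Spark. The sketch below is an illustration only: Notifier and ClosureCheck are hypothetical names, and canSerialize merely approximates the Java-serialization check Spark performs before shipping a task. It contrasts a closure that calls an instance method (and so captures `this`) with one that copies the needed value into a local val first.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for the observable class; deliberately NOT Serializable.
class Notifier(val prefix: String) {
  def notifyObservers(record: String): String = s"$prefix:$record"

  // Calls an instance method, so the closure captures `this`.
  def capturingThis: String => String = record => notifyObservers(record)

  // Copies the needed field into a local val, so only a String is captured.
  def capturingLocal: String => String = {
    val p = prefix
    record => s"$p:$record"
  }
}

object ClosureCheck {
  // Attempts plain Java serialization of the given object.
  def canSerialize(f: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(f)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

Under these assumptions, ClosureCheck.canSerialize(new Notifier("x").capturingThis) fails while the capturingLocal variant succeeds. The two usual remedies are therefore to make the enclosing class Serializable, or to keep the closure from referencing it at all.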

Rx.NET Subject OnNext exception is losing downstream observers

Disclaimer: I am a newbie to Rx.NET.
I want to understand the best way to consume events from the server using Rx.NET. Currently, I have a consumer class that contains an Rx Subject, which delegates each consumed update to downstream consumers as follows:
Event Listener/Processor:
public IObservable<IUpdate> UpdateStream => _subject?.AsObservable();

try
{
    // ... processing ...
    _subject.OnNext(update); // update is the variable
}
catch (Exception ex)
{
    _subject.OnError(ex);
}
Downstream-subscribers:
public void Subscribe()
{
    _eventListener.UpdateStream.Subscribe(
        update =>
        {
            _fooProcessor.Process(update);
        },
        ex =>
        {
            // log
            Subscribe(); // an effort to resubscribe lost subscription
        },
        () =>
        {
            // log completion (optional)...
        });
}
I have noticed that the subject throws an exception in OnNext ("an item with the same key has already been added"), at which point the subject's HasObservers property is false (in other words, the downstream subscription list is lost). The OnError line is hit, but the downstream subscribers do not get notified (because the subscription is lost).
I tried using Observable.FromEventPattern to listen to the consuming event and create the observable to be consumed by downstream subscribers, but that did not work either (I could not determine the point of failure in this case).
Is there a pattern for resubscribing from downstream consumers (in different DLLs) in such cases?
Appreciate any help.
Thanks!
I found that the downstream subscriber was throwing an exception, which caused the subscription to be dropped. This is not an issue now.
Thanks. Related question: "How to handle exceptions in OnNext when using ObserveOn?"

How to handle failures when one of the router actors throws an exception

Say I have an email sender actor that is responsible for sending out emails.
I want to limit the number of actors created for sending emails, so I created a router.
class EmailSender extends Actor {
  val router = context.actorOf(RoundRobinRouter(4).props(EmailSender.props))
  def receive = {
    case SendEmail(emailTo: String, ..) =>
      router ! SendEmailMessage(emailTo,.. )
    case ...
  }
}
I want to understand 2 things here:
If sending an email fails in one of the routee actors, how will EmailSender get notified, and will I get the exact email that failed?
If the email sending fails within the Routee actors, the default supervision strategy is to restart the actor.
So you should be able to hook into the preRestart method and see which message caused the EmailSender to fail.
class EmailSender extends Actor {
  def receive = {
    case SendEmail(emailTo: String, ..) =>
      router ! SendEmailMessage(emailTo,.. )
    case ...
  }
  override def preRestart(reason: Throwable, message: Option[Any]): Unit = {
    // log the failed message, or send it to a failed-message processor
    if (message.isDefined) {
      failedEmailsProcessor ! message.get
    }
  }
}
Note: If the supervision strategy is AllForOneStrategy, then preRestart is called for all the child actors. So it's better to use OneForOneStrategy here, so that preRestart is called only for the failed actor.
In the class that contains the router, you put the emails into a list of pending emails.
When an email handed to EmailSender has been sent out successfully, it sends back a success message and the router class removes the email from the pending list; on a failure message, the router class can try one more time, depending on your business logic.
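The pending-list idea can be sketched independently of Akka. This is a minimal illustration, not an Akka API: PendingEmails, its method names and the maxAttempts retry budget are all assumptions made for the example, and emails are identified by a plain string id.

```scala
// Hypothetical bookkeeping for the class that owns the router.
final class PendingEmails(maxAttempts: Int) {
  // email id -> number of send attempts made so far
  private var attempts: Map[String, Int] = Map.empty

  /** Record that an email was handed to the router. */
  def sent(id: String): Unit =
    attempts = attempts.updated(id, attempts.getOrElse(id, 0) + 1)

  /** Success reply: the email is no longer pending. */
  def succeeded(id: String): Unit =
    attempts = attempts - id

  /** Failure reply: true if the caller should resend, false if it gives up. */
  def shouldRetry(id: String): Boolean = {
    val retry = attempts.getOrElse(id, 0) < maxAttempts
    if (!retry) attempts = attempts - id // budget exhausted, stop tracking
    retry
  }

  def pending: Set[String] = attempts.keySet
}
```

The owning actor would call sent before forwarding a message to the router, succeeded on a success reply, and shouldRetry on a failure reply to decide whether to forward the same email again.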

Scala, Swing: thread issue with the Event Dispatch Thread (actors)

I have a Scala class inheriting from SimpleSwingApplication. This class defines a window (with def top = new MainFrame) and instantiates an actor. The actor's code is simple:
class Deselectionneur extends Actor {
  def act() {
    while (true) {
      receive {
        case a: sTable => {
          Thread.sleep(3000)
          a.peer.changeSelection(0, 0, false, false)
          a.peer.changeSelection(0, 0, true, false)
        }
      }
    }
  }
}
The main class also uses Substance, an API for GUI customization (no more ugly Swing controls with it!).
The actor is called when the mouse leaves a given Swing table; it then deselects all the rows of the table.
The actor behaves very well, but each time it is called I get this error message:
org.pushingpixels.substance.api.UiThreadingViolationException: State tracking must be done on Event Dispatch Thread
Do you know how I can get rid of this error message?
You need to move the GUI update onto the EDT.
Something like this (I haven't compiled it):
case a: sTable => {
  Thread.sleep(3000) // sleep on the actor's thread; sleeping on the EDT would freeze the GUI
  scala.swing.Swing.onEDT {
    a.peer.changeSelection(0, 0, false, false)
    a.peer.changeSelection(0, 0, true, false)
  }
}
Some background on EDT can be found here: http://docs.oracle.com/javase/tutorial/uiswing/concurrency/initial.html

Using Akka with Scalatra

My goal is to build a highly concurrent backend for my widgets. I'm currently exposing the backend as a web service, which receives requests to run a specific widget (using Scalatra), fetches the widget's code from the DB and runs it in an actor (using Akka), which then replies with the results. So imagine I'm doing something like:
get("/run/:id") {
  ...
  val actor = Actor.actorOf("...").start
  val result = actor !! (("Run",id), 10000)
  ...
}
Now I believe this is not the best concurrent solution and I should somehow combine listening for requests and running widgets in one actor implementation. How would you design this for maximum concurrency? Thanks.
You can start your actors in an Akka boot file or in your own ServletContextListener so that they are started without being tied to a servlet.
Then you can look them up with the Akka registry.
Actor.registry.actorFor[MyActor] foreach { _ !! (("Run",id), 10000) }
Apart from that, there is no real integration between Akka and Scalatra at the moment.
So for now the best you can do is to use blocking requests to a bunch of actors.
I'm not sure, but I wouldn't necessarily spawn an actor for each request; I would rather have a pool of widget actors to which you can send those requests. If you use a supervisor hierarchy, then you can use a supervisor to resize the pool if it is too big or too small.
class MyContextListener extends ServletContextListener {
  def contextInitialized(sce: ServletContextEvent) {
    val factory = SupervisorFactory(
      SupervisorConfig(
        OneForOneStrategy(List(classOf[Exception]), 3, 1000),
        Supervise(actorOf[WidgetPoolSupervisor], Permanent) :: Nil))
  }
  def contextDestroyed(sce: ServletContextEvent) {
    Actor.registry.shutdownAll()
  }
}
class WidgetPoolSupervisor extends Actor {
  self.faultHandler = OneForOneStrategy(List(classOf[Exception]), 3, 1000)
  override def preStart() {
    (1 to 5) foreach { _ =>
      self.spawnLink[MyWidgetProcessor]
    }
    Scheduler.schedule(self, 'checkPoolSize, 5, 5, TimeUnit.MINUTES)
  }
  protected def receive = {
    case 'checkPoolSize => {
      // implement logic that checks how quickly the actors respond and, if
      // it takes too long, adds some actors to the pool.
      // as a bonus you can keep downsizing the actor pool until it reaches 1
      // or until the responses start coming back too late.
    }
  }
}
class ScalatraApp extends ScalatraServlet {
  get("/run/:id") {
    // the !! construct should not appear anywhere else in your code except
    // in the scalatra action. You don't want to block anywhere else, but in a
    // scalatra action it's ok as the web request itself is synchronous too and needs
    // to wait for the full response to have come back anyway.
    // flatMap (rather than foreach) keeps the reply so that getOrElse can use it
    Actor.registry.actorFor[MyWidgetProcessor] flatMap {
      _ !! (("Run", id), 10000)
    } getOrElse {
      throw new HeyIExpectedAResultException()
    }
  }
}
Please regard the code above as pseudocode that happens to look like Scala; I just wanted to illustrate the concept.