I am spawning a small number of actors to fetch, process and save RSS feed items to a database. This is done through a main method of an object running on cron. I create these actors and dole out jobs to them as they complete the previous job assigned to them. My main class spawns a single actor, the one that doles out jobs to a pool of actors. Eventually the main method seems to hang. It doesn't exit, but execution halts on all the actors. My CTO believes the main is exiting before the actors complete their work and leaving them, but I am not convinced that's the case. I receive no success exit on main (no exit at all).
Essentially, I'm wondering how to debug these actors and what could cause this to happen. Will main exit before the actors have completed their execution (and if it does, does that matter)? From what I can tell, actors using receive are mapped 1-to-1 to threads, correct? Code is below. Please ask any follow-up questions; help is greatly appreciated. I know I may not have provided sufficient detail. I'm new to Scala and actors and will update as needed.
object ActorTester {
val poolSize = 10
var pendingQueue :Set[RssFeed] = RssFeed.pendingQueue
def main(args :Array[String]) {
val manager = new SpinnerManager(poolSize, pendingQueue)
manager.start
}
}
case object Stop
class SpinnerManager(poolSize :Int = 1, var pendingQueue :Set[RssFeed]) extends Actor {
val pool = new Array[Spinner](poolSize)
override def start() :Actor = {
for (i <- 0 to (poolSize - 1)) {
val spinner = new Spinner(i)
spinner.start()
pool(i) = spinner
}
super.start
}
def act() {
for {
s <- pool
if (!pendingQueue.isEmpty)
} {
s ! pendingQueue.head
pendingQueue = pendingQueue.tail
}
while(true) {
receive {
case id :Int => {
if (!pendingQueue.isEmpty) {
pool(id) ! pendingQueue.head
pendingQueue = pendingQueue.tail
} else if ((true /: pool) { (done, s) => {
if (s.getState != Actor.State.Runnable) {
val exited = future {
s ! Stop
done && true
}
exited()
} else {
done && false
}
}}) {
exit
}
}
}
}
}
}
class Spinner(id :Int) extends Actor {
def act() {
while(true) {
receive {
case dbFeed :RssFeed => {
//process rss feed
//this has multiple network requests, to the original blogs, bing image api
//our instance of solr - some of these spawn their own actors
sender ! id
}
case Stop => exit
}
}
}
}
For one thing, you're making a tiny but important mistake when you fold left to determine whether all Spinner actors have "terminated" or not. Each branch of the if should evaluate to done && true or done && false respectively, but currently you just say true or false without regard to done.
For example, imagine having 4 Spinner actors where the first and second ones were Runnable, the third one not, and the fourth one Runnable again. In that case the result of your fold left would be true in spite of the fact that the third actor hasn't finished yet. If you were using a logical &&, you'd get the correct result.
This is possibly also what causes your application to hang.
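To make the point about the accumulator concrete, here is a minimal sketch that does not use the actor library at all. The State type and the function names are hypothetical stand-ins for the actor states in the question. A fold whose function ignores the accumulator reflects only the last element, whereas combining with && (or simply using forall) takes every element into account.

```scala
object FoldCheck {
  // Hypothetical stand-in for the actor states checked in the question.
  sealed trait State
  case object Runnable extends State
  case object Suspended extends State

  // Buggy variant: the accumulator is ignored, so only the last element decides.
  def allIdleIgnoringAccumulator(states: Seq[State]): Boolean =
    states.foldLeft(true) { (_, s) => s != Runnable }

  // Fixed variant: every element is combined with the running result.
  def allIdle(states: Seq[State]): Boolean =
    states.foldLeft(true) { (done, s) => done && s != Runnable }

  // Equivalent and shorter: forall expresses the same check directly.
  def allIdleForall(states: Seq[State]): Boolean =
    states.forall(_ != Runnable)
}
```

With the states Suspended, Runnable, Suspended, the buggy variant reports that everything is idle because the last element is idle, while the other two correctly report that one actor is still busy.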
EDIT: There was also an issue with a race condition. The following code works now; I hope it helps. Anyway, I was wondering: doesn't Scala's actor implementation automatically make use of worker threads?
import actors.Actor
import scala.collection.mutable.Queue
case class RssFeed()
case object Stop
class Spinner(id: Int) extends Actor {
def act() {
loop {
react {
case dbFeed: RssFeed => {
// Process RSS feed
sender ! id
}
case Stop => exit()
}
}
}
}
class SpinnerManager(poolSize: Int, pendingQueue: Queue[RssFeed]) extends Actor {
val pool = Array.tabulate(poolSize)(new Spinner(_).start())
def act() {
for (s <- pool; if (!pendingQueue.isEmpty)) {
pendingQueue.synchronized {
s ! pendingQueue.dequeue()
}
}
loop {
react {
case id: Int => pendingQueue.synchronized {
if (!pendingQueue.isEmpty) {
Console println id
pool(id) ! pendingQueue.dequeue()
} else {
if (pool forall (_.getState != Actor.State.Runnable)) {
pool foreach (_ ! Stop)
exit()
}
}
}
}
}
}
}
object ActorTester {
def main(args: Array[String]) {
val poolSize = 10
val pendingQueue: Queue[RssFeed] = Queue.tabulate(100)(_ => RssFeed())
new SpinnerManager(poolSize, pendingQueue).start()
}
}
After several days of debugging, I've solved this issue. fotNelton's code suggestions were very helpful, so I've given him a vote. However, they didn't address the problem itself. What I found is that when this runs from a main method and the parent actors exit before their child actors, the program hangs forever and never exits, still holding all of its memory.
While handling an RSS feed, a Fetcher would spawn actors and send them messages to do work involving network requests. These actors need to complete their work before the parent actor quits. The Fetcher wouldn't wait for them to finish, though: once it had sent the message, it would just move on. So it would tell the manager it was finished before its child actors had finished all their work.
To deal with this, one option would be to use futures and wait until the actors are done (pretty slow). My solution was to create services accessible via URL (POST to a service that has an actor waiting to react). The service responds right away and sends a message to its own actor. Thus the actors can quit once they send the request to the service and don't need to spawn any other actors.
object FeedFetcher {
val poolSize = 10
var pendingQueue :Queue[RssFeed] = RssFeed.pendingQueue
def main(args :Array[String]) {
new FetcherManager(poolSize, pendingQueue).start
}
}
case object Stop
class FetcherManager(poolSize :Int = 1, var pendingQueue :Queue[RssFeed]) extends Actor {
val pool = new Array[Fetcher](poolSize)
var numberProcessed = 0
override def start() :Actor = {
for (i <- 0 to (poolSize - 1)) {
val fetcher = new Fetcher(i)
fetcher.start()
pool(i) = fetcher
}
super.start
}
def act() {
for {
f <- pool
if (!pendingQueue.isEmpty)
} {
pendingQueue.synchronized {
f ! pendingQueue.dequeue
}
}
loop {
reactWithin(10000L) {
case id :Int => pendingQueue.synchronized {
numberProcessed = numberProcessed + 1
if (!pendingQueue.isEmpty) {
pool(id) ! pendingQueue.dequeue
} else if ((true /: pool) { (done, f) => {
if (f.getState == Actor.State.Suspended) {
f ! Stop
done && true
} else if (f.getState == Actor.State.Terminated) {
done && true
} else {
false
}
}}) {
pool foreach { f => {
println(f.getState)
}}
println("Processed " + numberProcessed + " feeds total.")
exit
}
}
case TIMEOUT => {
if (pendingQueue.isEmpty) {
println("Manager just woke up from timeout with all feeds assigned.")
pool foreach { f => {
if (f.getState == Actor.State.Suspended) {
println("Sending Stop to Fetcher " + f.id)
f ! Stop
}
}}
println("Checking state of all Fetchers for termination.")
if ((true /: pool) { (done, f) => {
done && (f.getState == Actor.State.Terminated)
}}) {
exit
}
}
}
}
}
}
}
class Fetcher(val id :Int) extends Actor {
var feedsIveDone = 0
def act() {
loop {
react {
case dbFeed :RssFeed => {
println("Fetcher " + id + " starting feed")
//process rss feed here
feedsIveDone = feedsIveDone + 1
sender ! id
}
case Stop => {
println(id + " exiting")
println(feedsIveDone)
exit
}
}
}
}
}
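The answer above mentions, as an alternative, using futures and waiting until the child work is done. A minimal sketch of that idea follows, using scala.concurrent rather than the actor library. The fetch function and its return value are hypothetical placeholders for the network requests made while processing a feed, and the 30 second timeout is an arbitrary assumption.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object WaitForChildren {
  // Hypothetical stand-in for one network request made while processing a feed.
  def fetch(url: String): Future[Int] = Future { url.length }

  // Starts all child tasks, then blocks until every one of them has finished.
  // Only after this returns would the worker report "done" to its manager.
  def processFeed(urls: Seq[String]): Int = {
    val pending: Seq[Future[Int]] = urls.map(fetch)
    val all: Future[Seq[Int]] = Future.sequence(pending)
    Await.result(all, 30.seconds).sum
  }
}
```

Blocking like this ties up the calling thread for the duration of the slowest child task, which is consistent with the answer's remark that this option is slow.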
I'm using Akka to dynamically create actors and destroy them when they're finished with a particular job. I've got a handle on actor creation; however, stopping the actors keeps them in memory regardless of how I've terminated them. Eventually this causes an out-of-memory exception, despite the fact that I should only have a handful of active actors at any given time.
I've used:
self.tell(PoisonPill, self)
and:
context.stop(self)
to try and destroy the actors. Any ideas?
Edit: Here's a bit more to flesh out what I'm trying to do. The program opens up and spawns ten actors.
val system = ActorSystem("system")
(1 to 10) foreach { x =>
Entity.count += 1
system.actorOf(Props[Entity], name = Entity.count.toString())
}
Here's the code for the Entity:
class Entity () extends Actor {
Entity.entities += this
val id = Entity.count
import context.dispatcher
val tick = context.system.scheduler.schedule(0 millis, 100 millis, self, "update")
def receive = {
case "update" => {
Entity.entities.foreach(that => collide(that))
}
}
override def postStop() = tick.cancel()
def collide(that:Entity) {
if (!this.isBetterThan(that)) {
destroyMe()
spawnNew()
}
}
def isBetterThan(that: Entity) :Boolean = {
//computationally intensive logic
}
private def destroyMe(){
Entity.entities.remove(Entity.entities.indexOf(this))
self.tell(PoisonPill, self)
//context.stop(self)
}
private def spawnNew(){
val system = ActorSystem("system")
Entity.count += 1
system.actorOf(Props[Entity], name = Entity.count.toString())
}
}
object Entity {
val entities = new ListBuffer[Entity]()
var count = 0
}
Thanks @AmigoNico, you pointed me in the right direction. It turns out that neither
self.tell(PoisonPill, self)
nor
context.stop(self)
worked for timely Actor disposal; I switched the line to:
system.stop(self)
and everything works as expected.
I need an actor to stop one of its children so that I can possibly create a new actor with the same name (a UUID?).
I've got an ActorSystem with one Actor child. And this child creates new actors with context.actorOf and context.watch. When I try to stop one of these using context.stop, I observe that its postStop method is called as expected, but no matter how long I wait (seconds... minutes...), it never sends back the Terminated message to its creator (and watching) actor.
I read this in the Akka documentation:
Since stopping an actor is asynchronous, you cannot immediately reuse the name of the child you just stopped; this will result in an InvalidActorNameException. Instead, watch the terminating actor and create its replacement in response to the Terminated message which will eventually arrive.
I don't mind waiting for normal termination, but I really need actors to eventually terminate when asked to. Am I missing something? Should I create actors directly from the system instead of from an actor?
EDIT:
Here is my code :
object MyApp extends App {
def start() = {
val system = ActorSystem("MySystem")
val supervisor = system.actorOf(Supervisor.props(), name = "Supervisor")
}
override def main(args: Array[String]) {
start()
}
}
object Supervisor {
def props(): Props = Props(new Supervisor())
}
case class Supervisor() extends Actor {
private var actor: ActorRef = null
start()
def newActor(name: String): ActorRef = {
try {
actor = context.actorOf(MyActor.props(name), name)
context.watch(actor)
} catch {
case iane: InvalidActorNameException =>
println(name + " not terminated yet.")
null
}
}
def terminateActor() {
if (actor != null) context.stop(actor)
actor = null
}
def start() {
while (true) {
// do something
terminateActor()
newActor("new name possibly same name as a previously terminated one")
Thread.sleep(5000)
}
}
override def receive = {
case Terminated(x) => println("Received termination confirmation: " + x)
case _ => println("Unexpected message.")
}
override def postStop = {
println("Supervisor called postStop().")
}
}
object MyActor {
def props(name: String): Props = Props(new MyActor(name))
}
case class MyActor(name: String) extends Actor {
run()
def run() = {
// do something
}
override def receive = {
case _ => ()
}
override def postStop {
println(name + " called postStop().")
}
}
EDIT²: As mentioned by @DanGetz, there should be no need to call Thread.sleep in an Akka actor. What I needed here was a periodic routine, which can be done using the Akka context scheduler. See: http://doc.akka.io/docs/akka/2.3.3/scala/howto.html#scheduling-periodic-messages . Instead, I was blocking the actor in an infinite loop, which prevented it from using its asynchronous mechanisms (messages). I changed the title since the problem did not actually involve actor termination.
It's hard to gauge exactly what you want now that the question has changed a bit, but I'm going to take a stab anyway. Below you will find a modified version of your code that shows both periodic scheduling of a task (one that kicks off the child termination process) and watching a child, creating a new one with the same name only when we are sure the previous one has stopped. If you run the code below, every 5 seconds you should see it kill the child and wait for the termination message before starting a new one with the exact same name. I hope this is what you were looking for:
object Supervisor {
val ChildName = "foo"
def props(): Props = Props(new Supervisor())
case class TerminateChild(name:String)
}
case class Supervisor() extends Actor {
import Supervisor._
import scala.concurrent.duration._
import context._
//Start child upon creation of this actor
newActor(ChildName)
override def preStart(): Unit = {
//Schedule regular job to run every 5 seconds
context.system.scheduler.schedule(5 seconds, 5 seconds, self, TerminateChild(ChildName))
}
def newActor(name: String): ActorRef = {
val child = context.actorOf(MyActor.props(name), name)
watch(child)
println(s"created child for name $name")
child
}
def terminateActor(name:String) = context.child(name).foreach{ ref =>
println(s"terminating child for name $name")
context stop ref
}
override def receive = {
case TerminateChild(name) =>
terminateActor(name)
case Terminated(x) =>
println("Received termination confirmation: " + x)
newActor(ChildName)
case _ => println("Unexpected message.")
}
override def postStop = {
println("Supervisor called postStop().")
}
}
I'm testing how a new Actor I'm working on handles unexpected messages. I'd like to assert that it throws a GibberishException in these cases. Here's the test and the implementation so far:
Test:
"""throw a GibberishException for unrecognized messages""" in {
//define a service that creates gibberish-speaking repositories
val stubs = new svcStub(
actorOf(new Actor{
def receive = { case _ => {
self.channel ! "you're savage with the cabbage"
}
}
})
)
val model = actorOf(new HomeModel(stubs.svc,stubs.store))
val supervisor = Supervisor(
SupervisorConfig(
OneForOneStrategy(List(classOf[Exception]), 3, 1000),
Supervise(model,Permanent) :: Nil
)
)
try{
intercept[GibberishException] {
supervisor.start
model !! "plan"
}
} finally {
supervisor.shutdown
}
stubs.store.plan should equal (null)
stubs.svcIsOpen should be (false)
}
Implementation:
class HomeModel(service: PlanService, store: HomeStore)
extends Actor {
private val loaderRepo = service.getRepo()
private var view: Channel[Any] = null
override def postStop() = {
service.close()
}
def receive = {
case "plan" => {
view=self.channel
loaderRepo ! LoadRequest()
}
case p: Plan => {
store.plan=p
view ! store.plan
}
case other => throw new GibberishException(other)
}
}
However, when I run the test, the exception details reach the Supervisor I established, but I don't know how to do anything with them (like log them or test their type). I'd like to be able to get the exception details from the supervisor so I can rethrow and intercept them in my test. Outside of a test method, I could imagine this being useful if you wanted to report the nature of an exception in the UI of a running app. Is there a way to get this from the Supervisor when it happens?
Changing the OneForOneStrategy to handle only GibberishException should solve it.
I would like to drop messages from an actor's mailbox if it becomes too full. For example, if the queue size reaches 1000 messages, the oldest one should be deleted.
You cannot work with the mailbox directly, but you can implement the Message Expiration pattern on top of the existing library.
Send a creation date with every message:
case class ExpirableMessage(msg: String, createdAt: Long)
Scan the mailbox with reactWithin(0), and filter out expired messages:
react{
case msg: ExpirableMessage =>
// handle the message
// clean the mailbox with nested react
reactWithin(0){
case ExpirableMessage(_, createdAt) if(System.currentTimeMillis - createdAt > INTERVAL) =>
case TIMEOUT =>
}
}
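The expiry rule itself can be looked at in isolation from any actor. The following is a minimal sketch under stated assumptions: it models the mailbox as an ordinary mutable queue ordered by creation time, and the names MessageExpiry, dropExpired and maxAgeMillis are hypothetical, not part of the actor library.

```scala
import scala.collection.mutable

object MessageExpiry {
  final case class ExpirableMessage(msg: String, createdAt: Long)

  // Removes messages older than maxAgeMillis from the front of the queue
  // (assumed to be ordered oldest first) and returns how many were removed.
  def dropExpired(queue: mutable.Queue[ExpirableMessage],
                  now: Long,
                  maxAgeMillis: Long): Int = {
    var dropped = 0
    while (queue.nonEmpty && now - queue.head.createdAt > maxAgeMillis) {
      queue.dequeue()
      dropped += 1
    }
    dropped
  }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the rule easy to check with fixed timestamps.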
You can also reify an actor's queue on the heap and throttle its utilization by using a proxy actor. Then you can write something like the following:
// adder actor with a bounded queue size of 4
val adder = boundActor(4) {
loop {
react {
case x: Int => reply(x*2)
}
}
}
// test the adder
actor {
for (i <- 1 to 10) {
adder !! (i, { case answer: Int => println("Computed " + i + " -> " + answer) })
}
}
Here is the implementation of boundActor. Note that a bounded actor must always reply to its sender; otherwise there is no way to track its queue size, and the bounded actor will freeze, refusing to accept any further messages.
object ActorProxy extends scala.App {
import scala.actors._
import scala.actors.Actor._
import scala.collection.mutable._
/**
* Accepts an actor and a message queue size, and
* returns a proxy that drops messages if the queue
* size of the target actor exceeds the given queue size.
*/
def boundActorQueue(target: Actor, maxQueueLength: Int) = actor {
val queue = new Queue[Tuple2[Any, OutputChannel[Any]]]
var lastMessageSender: Option[OutputChannel[Any]] = None
def replyHandler(response: Any) {
if (lastMessageSender.get != null) lastMessageSender.get ! response
if (queue.isEmpty) {
lastMessageSender = None
} else {
val (message, messageSender) = queue.dequeue
forwardMessage(message, messageSender)
}
}
def forwardMessage(message: Any, messageSender: OutputChannel[Any]) = {
lastMessageSender = Some(messageSender)
target !! (message, { case response => replyHandler(response) })
}
loop {
react {
case message =>
if (lastMessageSender == None) {
forwardMessage(message, sender)
} else {
queue.enqueue((message, sender))
// Restrict the queue size
if (queue.length > maxQueueLength) {
val dropped = queue.dequeue
println("!!!!!!!! Dropped message " + dropped._1)
}
}
}
}
}
// Helper method
def boundActor(maxQueueLength: Int)(body: => Unit): Actor = boundActorQueue(actor(body), maxQueueLength)
}
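The drop-oldest policy that the proxy applies to its queue can also be sketched on its own, without actors. This is only an illustration of the policy asked for in the question (discard the oldest message once the limit is exceeded); the class name DropOldestBuffer and its methods are hypothetical.

```scala
import scala.collection.mutable

// A FIFO buffer that never holds more than `capacity` items.
final class DropOldestBuffer[A](capacity: Int) {
  require(capacity > 0, "capacity must be positive")
  private val queue = mutable.Queue.empty[A]

  // Adds an item; if the buffer overflows, removes and returns the oldest one.
  def offer(item: A): Option[A] = {
    queue.enqueue(item)
    if (queue.length > capacity) Some(queue.dequeue()) else None
  }

  // Removes and returns the oldest item, if any.
  def poll(): Option[A] =
    if (queue.isEmpty) None else Some(queue.dequeue())

  def size: Int = queue.length
}
```

Returning the dropped item from offer lets the caller log or count discarded messages, much as the proxy above prints each message it drops.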
I was writing a little test program to try out some things with Remote Actors that I was going to need in a Scala project.
The basic goal was to write a test application with one server that could handle a bunch of clients and, more importantly, clients that can send multiple messages at the same time (like pings, requests for updates and user-induced requests for data).
What I came up with was this:
Brief overview: the client starts three different actors, which in turn start further actors in while loops with different delays in order to simulate fairly random messages.
import scala.actors.remote.RemoteActor
import scala.actors.remote.Node
import scala.actors.Actor
trait Request
trait Response
case object WhoAmI extends Request
case class YouAre(s:String) extends Response
case object Ping extends Request
case object Pong extends Response
case class PrintThis(s:String) extends Request
case object PrintingDone extends Response
object Server {
def main(args: Array[String]) {
val server = new Server
server.start
}
}
class Server extends Actor {
RemoteActor.alive(12345)
RemoteActor.register('server, this)
var count:Int = 0
def act() {
while(true) {
receive {
case WhoAmI => {
count += 1
sender ! YouAre(count.toString)
}
case Ping => sender ! Pong
case PrintThis(s) => {
println(s)
sender ! PrintingDone
}
case x => println("Got a bad request: " + x)
}
}
}
}
object Act3 extends scala.actors.Actor {
def act = {
var i = 0
Thread.sleep(900)
while (i <= 12) {
i += 1
val a = new Printer
a.start
Thread.sleep(900)
}
}
}
class Printer extends scala.actors.Actor {
def act = {
val server = RemoteActor.select(Node("localhost",12345), 'server)
server ! PrintThis("gagagagagagagagagagagagaga")
receive {
case PrintingDone => println("yeah I printed")
case _ => println("got something bad from printing")
}
}
}
object Act2 extends scala.actors.Actor {
def act = {
var i = 0
while (i < 10) {
i+=1
val a = new Pinger
a.start
Thread.sleep(700)
}
}
}
class Pinger extends scala.actors.Actor {
def act = {
val server = RemoteActor.select(Node("localhost",12345), 'server)
server ! Ping
receive {
case Pong => println("so I pinged and it fits")
case x => println("something wrong with ping. Got " + x)
}
}
}
object Act extends scala.actors.Actor {
def act = {
var i = 0
while(i < 10) {
i+=1
val a = new SayHi
a.start()
Thread.sleep(200)
}
}
}
class SayHi extends scala.actors.Actor {
def act = {
val server = RemoteActor.select(Node("localhost",12345), 'server)
server ! "Hey!"
}
}
object Client {
def main(args: Array[String]) {
Act.start()
//Act2.start()
Act3.start()
}
}
The problem is that things don't run as smoothly as I'd expect them to:
When I start only one of the client actors (by commenting out the others, as I did with Act2 in Client), things usually, but not always, go well. If I start two or more actors, the printouts quite often appear in bulk (meaning nothing happens for a while and then the printouts appear in quick succession). Also, the client sometimes terminates and sometimes doesn't.
These may not be the biggest problems, but they're enough to make me feel quite uncomfortable. I did a lot of reading on Actors and Remote Actors, but I find the available info rather lacking.
I tried adding exit statements wherever it seemed fit, but that didn't help.
Has anybody got an idea what I'm doing wrong? Any general tricks here? Some dos and don'ts?
My guess is that your issues stem from blocking your actor's threads by using receive and Thread.sleep. Blocking operations consume threads in the actors' thread pool, which can prevent other actors from executing until new threads are added to the pool. This question may provide some additional insight.
You can use loop, loopWhile, react, and reactWithin to rewrite many of your actors to use non-blocking operations. For example:
import scala.actors.TIMEOUT
object Act extends scala.actors.Actor {
def act = {
var i = 0
loopWhile(i < 10) {
reactWithin(200) { case TIMEOUT =>
i+=1
val a = new SayHi
a.start()
}
}
}
}
Of course, you can eliminate some boilerplate by writing your own control construct:
def doWithin(msec: Long)(f: => Unit) = reactWithin(msec) { case TIMEOUT => f }
def repeat(times: Int)(f: => Unit) = {
var i = 0
loopWhile(i < times) {
i+=1 // increment first: statements placed after a react inside f never run
f
}
}
This would allow you to write
repeat(10) {
doWithin(200) {
(new SayHi).start
}
}
You could try the Akka actor framework instead: http://akkasource.org/