scala actors: drop messages if queue is too long? - scala

I would like to drop messages from an actor's mailbox if it becomes too full. For example, if the queue size reaches 1000 messages, the oldest one should be deleted.

You cannot work with the mailbox directly, but you can implement the Message Expiration pattern on top of the existing library.
Send a creation date with every message:
case class ExpirableMessage(msg: String, createdAt: Long)
Scan the mailbox with reactWithin(0), and filter out expired messages:
react{
case msg: ExpirableMessage =>
// handle the message
// clean the mailbox with nested react
reactWithin(0){
case ExpirableMessage(_, createdAt) if(currentTimeMillis - createdAt > INTERVAL) =>
case TIMEOUT =>
}
}
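The expiry check itself can be looked at separately from the actor. The following is only a minimal sketch in plain Scala: IntervalMs, isExpired and dropExpired are hypothetical names chosen for illustration, and the 5-second limit is an assumption.

```scala
// Same shape as the message class above.
final case class ExpirableMessage(msg: String, createdAt: Long)

object ExpirationSketch {
  // Assumed maximum age of a message, in milliseconds.
  val IntervalMs: Long = 5000L

  // A message is expired once it is strictly older than IntervalMs at time `now`.
  def isExpired(m: ExpirableMessage, now: Long): Boolean =
    now - m.createdAt > IntervalMs

  // Keep only the messages that are still fresh, preserving their order.
  def dropExpired(pending: Seq[ExpirableMessage], now: Long): Seq[ExpirableMessage] =
    pending.filterNot(isExpired(_, now))
}
```

Passing the current time in as a parameter, instead of calling currentTimeMillis inside the check, makes the rule easy to test.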

You can also reify an actor's queue on the heap and throttle its utilization by using a proxy actor. Then you can write something like the following:
// adder actor with a bounded queue size of 4
val adder = boundActor(4) {
loop {
react {
case x: Int => reply(x*2)
}
}
}
// test the adder
actor {
for (i <- 1 to 10) {
adder !! (i, { case answer: Int => println("Computed " + i + " -> " + answer) })
}
}
Here is the implementation of boundActor. Note that an actor wrapped by boundActor must always reply to its sender; otherwise there is no way to track its queue size, and the bounded actor will freeze, refusing to accept any further messages.
object ActorProxy extends scala.App {
import scala.actors._
import scala.actors.Actor._
import scala.collection.mutable._
/**
* Accepts an actor and a message queue size, and
* returns a proxy that drops messages if the queue
* size of the target actor exceeds the given queue size.
*/
def boundActorQueue(target: Actor, maxQueueLength: Int) = actor {
val queue = new Queue[Tuple2[Any, OutputChannel[Any]]]
var lastMessageSender: Option[OutputChannel[Any]] = None
def replyHandler(response: Any) {
lastMessageSender.foreach(_ ! response)
if (queue.isEmpty) {
lastMessageSender = None
} else {
val (message, messageSender) = queue.dequeue
forwardMessage(message, messageSender)
}
}
def forwardMessage(message: Any, messageSender: OutputChannel[Any]) = {
lastMessageSender = Some(messageSender)
target !! (message, { case response => replyHandler(response) })
}
loop {
react {
case message =>
if (lastMessageSender == None) {
forwardMessage(message, sender)
} else {
queue.enqueue((message, sender))
// Restrict the queue size
if (queue.length > maxQueueLength) {
val dropped = queue.dequeue
println("!!!!!!!! Dropped message " + dropped._1)
}
}
}
}
}
// Helper method
def boundActor(maxQueueLength: Int)(body: => Unit): Actor = boundActorQueue(actor(body), maxQueueLength)
}
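The drop-oldest policy the proxy applies to its queue can also be shown on its own. This is a rough sketch in plain Scala, not part of scala.actors; DropOldestBuffer, offer and poll are hypothetical names used only for illustration.

```scala
import scala.collection.mutable

// Hypothetical helper: a FIFO buffer that evicts its oldest element
// once more than maxSize elements are held.
final class DropOldestBuffer[A](maxSize: Int) {
  require(maxSize > 0, "maxSize must be positive")
  private val queue = mutable.Queue.empty[A]

  // Enqueue an element; return the evicted (oldest) element, if any.
  def offer(a: A): Option[A] = {
    queue.enqueue(a)
    if (queue.size > maxSize) Some(queue.dequeue()) else None
  }

  // Remove and return the oldest remaining element, if any.
  def poll(): Option[A] =
    if (queue.isEmpty) None else Some(queue.dequeue())

  def size: Int = queue.size
}
```

Returning the evicted element from offer lets the caller log or count dropped messages, much like the println in the proxy above.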

Related

how to watch akka actor and capture its context in the Terminated message receive

I want to supervise child actors using context.watch(childActor) API
I see that when the child actor has an uncaught exception, indeed the receive message "Terminated" is called on the parent.
But since I am creating many children actors, each with different contexts and messages, I don't know which particular child actor (and surrounding context) actually failed.
For example:
class ParentActor extends Actor {
override def receive = {
case "delegateWorkToChildrenActors" => {
(0 to 100) foreach {
i =>
val child = context.actorOf(Props(new ChildActor(i)), "child-actor-" + i)
context.watch(child)
val message = SomeSortOfComplexMessageWithManyParameters(...)
val res = child ? message
res.onComplete {
x: Try[Any] => x match {
case Failure(exception: Throwable) => // can only get timeout exception here
case Success(value) => // continue as usual
}
}
}
}
case m @ Terminated(actor) => {
// a child actor was terminated but what is its context ? i.e. which message did it try to handle ?
}
}
override val supervisorStrategy = OneForOneStrategy(maxNrOfRetries = 1, withinTimeRange = 1 minute) {
case m @ _ => {
Stop // Assume that the actor was stopped or crashed etc.
}
}
}
class ChildActor(number : Int) extends Actor {
override def receive: Receive = {
case SomeSortOfComplexMessageWithManyParameters => {
// might have thrown some sort of exception here...
// the parent which is monitoring it receives the "Terminated" message
}
}
}
How can I get the message that the child actor worked upon?

Update state in actor from within a future

Consider the following code sample:
class MyActor (httpClient: HttpClient) extends Actor {
var canSendMore = true
override def receive: Receive = {
case PayloadA(name: String) => send(urlA)
case PayloadB(name: String) => send(urlB)
}
def send(url: String){
if (canSendMore)
httpClient.post(url).map(response => canSendMore = response.canSendMore)
else {
Thread.sleep(5000) //this will be done in a more elegant way, it's just for the example.
httpClient.post(url).map(response => canSendMore = response.canSendMore)
}
}
}
Each message handling will result in an async http request. (post return value is a Future[Response])
My problem is that I want to safely update canSendMore (at the moment there is a race condition).
BTW, I must somehow update it in the same thread, or at least before any other message is processed by this actor.
Is this possible?
You can use a become + stash combination to keep stashing messages while the HTTP request future is in progress.
case object FreeToProcess
case class PayloadA(name: String)
class MyActor (httpClient: HttpClient) extends Actor with Stash {
def canProcessReceive: Receive = {
case PayloadA(name: String) => {
// become an actor which just stashes messages
context.become(canNotProcessReceive, discardOld = false)
httpClient.post(urlA).onComplete({
case Success(x) => {
// Use your result
self ! FreeToProcess
}
case Failure(e) => {
// Use your failure
self ! FreeToProcess
}
})
}
}
def canNotProcessReceive: Receive = {
case FreeToProcess => {
// replay stash to mailbox
unstashAll()
// start processing messages
context.unbecome()
}
case msg => {
stash()
}
}
}
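To see what the stash does to message ordering, here is a toy, single-threaded model in plain Scala. It is not the Akka API: StashingProcessor, receive and freeToProcess are hypothetical names, and completion of the HTTP request is simulated by calling freeToProcess by hand.

```scala
import scala.collection.mutable

// Toy model of "become + stash": at most one message is in flight at a time,
// everything else waits in arrival order.
final class StashingProcessor {
  private var busy = false
  private val stash = mutable.Queue.empty[String]
  val processed = mutable.ArrayBuffer.empty[String]

  // While a request is in flight, incoming messages are stashed.
  def receive(msg: String): Unit =
    if (busy) stash.enqueue(msg)
    else { busy = true; processed += msg }

  // Stands in for the FreeToProcess signal sent when the future completes.
  def freeToProcess(): Unit = {
    busy = false
    // Replay the stash in arrival order. The first replayed message makes the
    // processor busy again, so the remaining ones are stashed once more.
    val pending = stash.toList
    stash.clear()
    pending.foreach(receive)
  }
}
```

The point of the model is that the state (busy here, canSendMore in the question) only changes between messages, never concurrently with one.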

Apache Spark Receiver Scheduling

I have implemented a receiver that is supposed to connect to a WebSocket stream and get the messages for processing. Here is the implementation that I have done so far:
class WebSocketReader (wsConfig: WebSocketConfig, stringMessageHandler: String => Option[String],
storageLevel: StorageLevel) extends Receiver[String] (storageLevel) {
// TODO: avoid using a var
private var wsClient: WebSocketClient = _
def sendRequest(isRequest: Boolean, msgCount: Int) = {
while (isRequest) {
wsClient.send(msgCount.toString)
Thread.sleep(1000)
}
}
// TODO: avoid using Synchronization...
private def connect(): Unit = {
Try {
wsClient = createWsClient
} match {
case Success(_) =>
wsClient.connect().map {
case result if result.isSuccess =>
sendRequest(true, 10)
case _ =>
connect()
}
case Failure(ex) =>
// TODO: how to signal a failure so that it is tried the next time....
ex.printStackTrace()
}
}
def onStart(): Unit = {
new Thread(getClass.getSimpleName) {
override def run() { connect() }
}.start()
}
override def onStop(): Unit =
if (wsClient != null) wsClient.disconnect()
private def createWsClient = {
new DefaultHookupClient(new HookupClientConfig(new URI(wsConfig.wsUrl))) {
override def receive: Receive = {
case Disconnected(_) =>
// TODO: use Logging framework, try reconnecting....
println(s"the web socket is disconnected")
case TextMessage(message) =>
stringMessageHandler(message).foreach(store)
case JsonMessage(jsValue) =>
stringMessageHandler(jsValue.toString).foreach(store)
}
}
}
}
How is this Receiver being run? Does this Receiver run on the worker nodes or on the driver node? Is this way of sleeping a thread a correct approach?
The reason why I want to do this is that the server that is exposing the WebSocket end point would need a count on the messages that I want to receive. Say if I ask the server for 100 messages, it would give me 100 messages and so on. So I need a way to periodically schedule this request to the server. Currently, I'm using the Thread.sleep mechanism. Is this advisable? What could be the alternative?
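One possible alternative to a while/sleep loop is a scheduled executor from java.util.concurrent. The sketch below is only an illustration under assumptions: PeriodicRequester is a hypothetical helper, and request stands in for something like wsClient.send(msgCount.toString).

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, ScheduledFuture, TimeUnit}

// Hypothetical helper: runs `request` once per period on a background thread
// instead of blocking a thread in a sleep loop.
final class PeriodicRequester(request: () => Unit, periodMs: Long) {
  private val scheduler: ScheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor()

  // Note: if `request` throws, scheduleAtFixedRate suppresses later runs,
  // so the callback should handle its own exceptions.
  def start(): ScheduledFuture[_] =
    scheduler.scheduleAtFixedRate(
      new Runnable { def run(): Unit = request() },
      0L, periodMs, TimeUnit.MILLISECONDS)

  // Intended to be called from onStop() so no further requests are sent.
  def stop(): Unit = { scheduler.shutdownNow(); () }
}
```

Unlike the loop in sendRequest, this can be cancelled, because stop() shuts the scheduler down.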

Akka persistentChannel does not delete message from Journal upon confirm

I am writing a piece of code that uses PersistentChannel to send a message to an actor that does some IO. Upon completion it confirms the ConfirmablePersistent message.
The documentation says that upon confirmation the message is deleted by a PersistentChannel. But in my case the messages stay in the journal without being deleted.
My requirement is that as soon as I get a successful result for the IO, or the deadline has passed, the persisted message should be deleted from the journal.
class IOWorker(config: Config, ref: ActorRef)
extends Actor with ActorLogging {
import IOWorker._
val channel = context.actorOf(PersistentChannel.props(
PersistentChannelSettings(redeliverInterval = 1.minute,
pendingConfirmationsMax = 1,pendingConfirmationsMin = 0)))
val doIOActor = context.actorOf(DOIOActor(config))
def receive = {
case payload @ (msg, deadline)=>
channel ! Deliver(Persistent(payload), doIOActor.path)
}
}
object DOIOActor {
def apply(config: Config) = Props(classOf[DOIOActor], config)
}
class DOIOActor(config: Config) extends Actor
with ActorLogging {
def receive = {
case p @ ConfirmablePersistent(payload, sequenceNr, redeliveries) =>
payload match {
case (msg, deadline: Deadline) =>
deadline.hasTimeLeft match {
case false => p.confirm()
case true =>
sender ! SAVED(msg)
Try{DOIO}
match
{
case Success(v) =>
sender ! SUCCESS(msg)
p.confirm()
case Failure(doioException) =>
log.warning(s"Could not complete DOIO. $doioException")
throw doioException
}
}
}
}
def DOIO(ftpClient: FTPClient, destination: String, file: AISData) = {
SOMEIOTASK match {
case true => log.info(s"Storing file to $destination.")
case false =>
throw new Exception(s"Could not DOIO to destination $destination")
}
}
}
Deletions are performed asynchronously by most journal implementations, as discussed on the mailing list.

Scala actors left hanging

I am spawning a small number of actors to fetch, process and save RSS feed items to a database. This is done through a main method of an object running on cron. I create these actors and dole out jobs to them as they complete the previous job assigned to them. My main class spawns a single actor, the one that doles out jobs to a pool of actors. Eventually the main method seems to hang. It doesn't exit, but execution halts on all the actors. My CTO believes the main is exiting before the actors complete their work and leaving them, but I am not convinced that's the case. I receive no success exit on main (no exit at all).
Essentially I'm wondering how to debug these actors, and what could cause this to happen. Will main exit before the actors have completed their execution (and if it does, does that matter)? From what I can tell, actors using receive are mapped 1-to-1 to threads, correct? Code is below. Please ask any follow-up questions; help is greatly appreciated. I know I may not have provided sufficient detail. I'm new to Scala and actors and will update as needed.
object ActorTester {
val poolSize = 10
var pendingQueue :Set[RssFeed] = RssFeed.pendingQueue
def main(args :Array[String]) {
val manager = new SpinnerManager(poolSize, pendingQueue)
manager.start
}
}
case object Stop
class SpinnerManager(poolSize :Int = 1, var pendingQueue :Set[RssFeed]) extends Actor {
val pool = new Array[Spinner](poolSize)
override def start() :Actor = {
for (i <- 0 to (poolSize - 1)) {
val spinner = new Spinner(i)
spinner.start()
pool(i) = spinner
}
super.start
}
def act() {
for {
s <- pool
if (!pendingQueue.isEmpty)
} {
s ! pendingQueue.head
pendingQueue = pendingQueue.tail
}
while(true) {
receive {
case id :Int => {
if (!pendingQueue.isEmpty) {
pool(id) ! pendingQueue.head
pendingQueue = pendingQueue.tail
} else if ((true /: pool) { (done, s) => {
if (s.getState != Actor.State.Runnable) {
val exited = future {
s ! Stop
done && true
}
exited()
} else {
done && false
}
}}) {
exit
}
}
}
}
}
}
class Spinner(id :Int) extends Actor {
def act() {
while(true) {
receive {
case dbFeed :RssFeed => {
//process rss feed
//this has multiple network requests, to the original blogs, bing image api
//our instance of solr - some of these spawn their own actors
sender ! id
}
case Stop => exit
}
}
}
}
For one thing, you're making a small but important mistake when folding left to determine whether all Spinner actors have "terminated". Each branch of the if should evaluate to done && true or done && false respectively, but currently you just return true or false without regard to done.
For example, imagine having 4 Spinner actors where the first and second ones were Runnable, the third one not, and the fourth one Runnable again. In that case the result of your fold would be true in spite of the fact that the third actor hasn't finished yet, because only the last actor's state decides the outcome. If you combined each result with done using a logical &&, you'd get the correct result.
This is possibly also what causes your application to hang.
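The difference is easy to see on plain booleans. In this small sketch, finished is made-up data standing in for the per-actor state checks (true meaning "this worker is done"), and FoldSketch is just an illustrative name.

```scala
object FoldSketch {
  // Made-up check results: the third worker is still busy.
  val finished: Seq[Boolean] = Seq(true, true, false, true)

  // Buggy: ignores the accumulator, so only the last element counts.
  val buggy: Boolean = finished.foldLeft(true)((_, f) => f)

  // Correct: every element is combined with the accumulator.
  val allDone: Boolean = finished.foldLeft(true)((done, f) => done && f)

  // Equivalent and shorter.
  val allDoneForall: Boolean = finished.forall(identity)
}
```

Here buggy is true while allDone and allDoneForall are false, since one worker has not finished.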
EDIT: There also was an issue wrt a race condition. The following code works now, hope it helps. Anyway, I was wondering, doesn't Scala's actor implementation automatically make use of worker threads?
import actors.Actor
import scala.collection.mutable.Queue
case class RssFeed()
case object Stop
class Spinner(id: Int) extends Actor {
def act() {
loop {
react {
case dbFeed: RssFeed => {
// Process RSS feed
sender ! id
}
case Stop => exit()
}
}
}
}
class SpinnerManager(poolSize: Int, pendingQueue: Queue[RssFeed]) extends Actor {
val pool = Array.tabulate(poolSize)(new Spinner(_).start())
def act() {
for (s <- pool; if (!pendingQueue.isEmpty)) {
pendingQueue.synchronized {
s ! pendingQueue.dequeue()
}
}
loop {
react {
case id: Int => pendingQueue.synchronized {
if (!pendingQueue.isEmpty) {
Console println id
pool(id) ! pendingQueue.dequeue()
} else {
if (pool forall (_.getState != Actor.State.Runnable)) {
pool foreach (_ ! Stop)
exit()
}
}
}
}
}
}
}
object ActorTester {
def main(args: Array[String]) {
val poolSize = 10
val pendingQueue: Queue[RssFeed] = Queue.tabulate(100)(_ => RssFeed())
new SpinnerManager(poolSize, pendingQueue).start()
}
}
So after several days of debugging I've solved this issue. fotNelton's code suggestions were very helpful in doing so, so I've given him a vote. However, they didn't address the problem itself.
What I've found is that when this runs from a main method and the parent actors exit before their child actors, the program hangs forever and never exits, still holding all of its memory. In the process of handling the RSS feed, a Fetcher would spawn actors and send them messages to do things involving network requests. These actors need to complete their work before the parent actor quits. The Fetcher wouldn't wait for these actors to finish, though; once it had sent the message it would just move on. So it would tell the manager it was finished before its child actors had finished all their work.
To deal with this, one option would be to use futures and wait until the actors are done (pretty slow). My solution was to create services accessible via URL (POST to a service that has an actor waiting to react). The service responds right away and sends a message to its own actor. Thus the actors can quit once they send the request to the service, and don't need to spawn any other actors.
object FeedFetcher {
val poolSize = 10
var pendingQueue :Queue[RssFeed] = RssFeed.pendingQueue
def main(args :Array[String]) {
new FetcherManager(poolSize, pendingQueue).start
}
}
case object Stop
class FetcherManager(poolSize :Int = 1, var pendingQueue :Queue[RssFeed]) extends Actor {
val pool = new Array[Fetcher](poolSize)
var numberProcessed = 0
override def start() :Actor = {
for (i <- 0 to (poolSize - 1)) {
val fetcher = new Fetcher(i)
fetcher.start()
pool(i) = fetcher
}
super.start
}
def act() {
for {
f <- pool
if (!pendingQueue.isEmpty)
} {
pendingQueue.synchronized {
f ! pendingQueue.dequeue
}
}
loop {
reactWithin(10000L) {
case id :Int => pendingQueue.synchronized {
numberProcessed = numberProcessed + 1
if (!pendingQueue.isEmpty) {
pool(id) ! pendingQueue.dequeue
} else if ((true /: pool) { (done, f) => {
if (f.getState == Actor.State.Suspended) {
f ! Stop
done && true
} else if (f.getState == Actor.State.Terminated) {
done && true
} else {
false
}
}}) {
pool foreach { f => {
println(f.getState)
}}
println("Processed " + numberProcessed + " feeds total.")
exit
}
}
case TIMEOUT => {
if (pendingQueue.isEmpty) {
println("Manager just woke up from timeout with all feeds assigned.")
pool foreach { f => {
if (f.getState == Actor.State.Suspended) {
println("Sending Stop to Fetcher " + f.id)
f ! Stop
}
}}
println("Checking state of all Fetchers for termination.")
if ((true /: pool) { (done, f) => {
done && (f.getState == Actor.State.Terminated)
}}) {
exit
}
}
}
}
}
}
}
class Fetcher(val id :Int) extends Actor {
var feedsIveDone = 0
def act() {
loop {
react {
case dbFeed :RssFeed => {
println("Fetcher " + id + " starting feed")
//process rss feed here
feedsIveDone = feedsIveDone + 1
sender ! id
}
case Stop => {
println(id + " exiting")
println(feedsIveDone)
exit
}
}
}
}