Akka actors' state not monitored properly - scala

Monitoring an Akka actor's state is said to be possible through .underlyingActor.
In the example below, ActorWithState keeps an Int counter, and ActorWithStateTest tests incrementing and decrementing it. The increment step always passes. The decrement, however, does not seem to work: the second assert always fails with the error message below. What is wrong?
[Error message of this test]
[info] ActorWithStateTest
[info] - should validate counter incrementation and decrementation *** FAILED ***
[info] 1 did not equal 0, Expected counter to be 0 after 'Terminated' message
[Actor with counter]
object ActorWithState {
  case object Inc
}
class ActorWithState(snooper: ActorRef) extends Actor with ActorLogging {
  import ActorWithState.Inc
  var counter = 0
  def receive = {
    case Inc => counter += 1
    case Terminated(ref) => counter -= 1
  }
}
[ActorTest shall test counter's behaviour]
class ActorWithStateTest extends TestKit(ActorSystem("SimpleTestSpec")) with WordSpecLike {
  "Test" should {
    "validate counter incrementation and decrementation" in {
      val aws = TestActorRef(Props(classOf[ActorWithState], testActor), name = "aws")
      val awsA: ActorWithState = aws.underlyingActor
      // tell aws to increment its counter
      aws ! ActorWithState.Inc
      // this assert always passes
      assert(awsA.counter == 1, ", Expected counter to be 1 after 'Inc' message")
      // triggers a 'Terminated' message being sent to aws
      val tp = TestProbe()
      aws watch tp.ref
      system stop tp.ref
      // [EDIT] the following assert passes if some - time consuming - processing is added here
      // the following assert does NOT pass, WHY?
      assert(awsA.counter == 0, ", Expected counter to be 0 after 'Terminated' message")
    }
  }
}
[akka-actor: 2.3.2, akka-testkit: 2.3.2, scalatest: 2.0]

As mentioned in my comment, the issue is that stopping an actor is asynchronous: your assertion runs before the Terminated message reaches the actor you are testing. One quick and dirty fix is to introduce a second probe that also watches the stopped actor, and to wait on that probe first. Once the second watcher has received Terminated, your actor should have received it as well, and you can perform your assertion. This worked for me:
val tp = TestProbe()
val tp2 = TestProbe()
aws watch tp.ref
tp2 watch tp.ref
system stop tp.ref
tp2.expectTerminated(tp.ref)
assert(awsA.counter == 0, ", Expected counter to be 0 after 'Terminated' message")

Related

Stop Flink Kafka consumer task programmatically

I'm using Kafka consumer with Flink 1.9 (in Scala 2.12), and facing the following problem (similar to this question): the consumer should stop fetching data (and finish the task) when no new messages are received for a specific amount of time (since the stream is potentially infinite, so there is no "end-of-stream" message in the topic itself).
I've tried to use ProcessFunction which calls consumer.close(), but this did not help (consumer continues to run). Throwing an exception in ProcessFunction kills the job completely, which is not what I want (since the job consists of several stages, which are canceled after throwing an exception). Here is my ProcessFunction:
class TimeOutFunction(
    // delay after which an alert flag is thrown
    val timeOut: Long,
    consumer: FlinkKafkaConsumer[Row]
) extends ProcessFunction[Row, Row] {
  // state to remember the last timer set
  private var lastTimer: ValueState[Long] = _
  override def open(conf: Configuration): Unit = {
    // set up timer state
    val lastTimerDesc = new ValueStateDescriptor[Long]("lastTimer", classOf[Long])
    lastTimer = getRuntimeContext.getState(lastTimerDesc)
  }
  override def processElement(value: Row, ctx: ProcessFunction[Row, Row]#Context, out: Collector[Row]): Unit = {
    // get current time and compute timeout time
    val currentTime = ctx.timerService.currentProcessingTime
    val timeoutTime = currentTime + timeOut
    // register timer for timeout time
    ctx.timerService.registerProcessingTimeTimer(timeoutTime)
    // remember timeout time
    lastTimer.update(timeoutTime)
    // pass the event through
    out.collect(value)
  }
  override def onTimer(timestamp: Long, ctx: ProcessFunction[Row, Row]#OnTimerContext, out: Collector[Row]): Unit = {
    // check if this was the last timer we registered
    if (timestamp == lastTimer.value) {
      // it was, so no data was received afterwards.
      // stop the consumer.
      consumer.close()
    }
  }
}
The isEndOfStream() method on a deserialization schema is also no good, since it requires nextElement (and my case is kind of vice-versa, since the stream should stop when there is no next element for some time).
So, is there a way to do this (preferably without subclassing FlinkKafkaConsumer and/or using reflection)?

Omitting all Scala Actor messages except the last

I want to omit all messages of the same type except the last one:
def receive = {
case Message(type:MessageType, data:Int) =>
// remove previous and get only last message of passed MessageType
}
for example when I send:
actor ! Message(MessageType.RUN, 1)
actor ! Message(MessageType.RUN, 2)
actor ! Message(MessageType.FLY, 1)
then I want to receive only:
Message(MessageType.RUN, 2)
Message(MessageType.FLY, 1)
Of course, this only applies if they are sent very quickly or under high CPU load.
You could wait a very short amount of time, storing the most recent messages that arrive, and then process only those most recent ones. This can be accomplished by sending messages to yourself, and scheduleOnce. See the second example under the Akka HowTo: Common Patterns, Scheduling Periodic Messages. Instead of scheduling ticks whenever the last tick ends, you can wait until new messages arrive. Here's an example of something like that:
import scala.concurrent.duration._ // provides 1.millis
case class ProcessThis(msg: Message)
case object ProcessNow
var onHold = Map.empty[MessageType, Message]
var timer: Option[Cancellable] = None
def receive = {
  case msg @ Message(t, _) =>
    // a newer message replaces any held message of the same type
    onHold += t -> msg
    if (timer.isEmpty) {
      import context.dispatcher
      timer = Some(context.system.scheduler.scheduleOnce(1.millis, self, ProcessNow))
    }
  case ProcessNow =>
    timer foreach { _.cancel() }
    timer = None
    for (m <- onHold.values) self ! ProcessThis(m)
    onHold = Map.empty
  case ProcessThis(Message(t, data)) =>
    // really process the message
}
Incoming Messages are not actually processed right away, but are stored in a Map that keeps only the last of each MessageType. On the ProcessNow tick message, they are really processed.
You can change the length of time you wait (set to 1 millisecond in my example) to strike a balance between responsiveness (the time from a message arriving to its response) and efficiency (CPU or other resources used or held up).
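The keep-only-the-latest step can also be looked at on its own, without the actor or the timer. Here is a minimal sketch, assuming a Message case class with a messageType field and a two-value MessageType enumeration as stand-ins for the question's types. The Coalesce object and latestPerType are hypothetical names, not part of Akka.

```scala
import scala.collection.mutable

// Illustrative stand-ins for the question's types (assumed, not from the original post).
object MessageType extends Enumeration {
  val RUN, FLY = Value
}
final case class Message(messageType: MessageType.Value, data: Int)

object Coalesce {
  // Keeps the most recent message of each type: a later message overwrites
  // an earlier one of the same type. Types keep their first-arrival order.
  def latestPerType(batch: Seq[Message]): List[Message] = {
    val onHold = mutable.LinkedHashMap.empty[MessageType.Value, Message]
    batch.foreach(m => onHold(m.messageType) = m)
    onHold.values.toList
  }
}
```

Passing the three messages from the question, Message(RUN, 1), Message(RUN, 2) and Message(FLY, 1), yields Message(RUN, 2) followed by Message(FLY, 1).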
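type is not a good name for a field (it is a reserved word in Scala), so let's use messageType instead. This code should do what you want:
var lastMessage: Option[Message] = None
def receive = {
  case m: Message =>
    // when the type changes, the previous message was the last of its run
    if (lastMessage.exists(_.messageType != m.messageType)) {
      // do something with lastMessage.get
    }
    // note: the newest message stays here until a message of another type arrives
    lastMessage = Some(m)
}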

How can one verify messages sent to self are delivered when testing Akka actors?

I have an Actor that is similar to the following Actor in function.
case class SupervisingActor() extends Actor {
  protected val processRouter = //round robin router to remote workers
  override def receive = {
    case StartProcessing => { //sent from main or someplace else
      for (some specified number of process actions) {
        processRouter ! WorkInstructions
      }
    }
    case ProcessResults(resultDetails) => { //sent from the remote workers when they complete their work
      //do something with the results
      if (all of the results have been received) {
        //*********************
        self ! EndProcess //This is the line in question
        //*********************
      }
    }
    case EndProcess => {
      //do some reporting
      //shutdown the ActorSystem
    }
  }
}
How can I verify the EndProcess message is sent to self in tests?
I'm using scalatest 2.0.M4, Akka 2.0.3 and Scala 2.9.2.
An actor sending to itself is very much an intimate detail of how that actor performs a certain function, hence I would rather test the effect of that message than whether or not that message has been delivered. I'd argue that sending to self is the same as having a private helper method on an object in classical OOP: you also do not test whether that one is invoked, you test whether the right thing happened in the end.
As a side note: you could implement your own message queue type (see https://doc.akka.io/docs/akka/snapshot/mailboxes.html#creating-your-own-mailbox-type) and have that allow the inspection or tracing of message sends. The beauty of this approach is that it can be inserted purely by configuration into the actor under test.
In the past, I have overridden the implementation for ! so that I could add debug/logging. Just call super.! when you're done, and be extra careful not to do anything that would throw an exception.
I had the same issue with an FSM actor. I tried setting up a custom mailbox as per the accepted answer but could not get it working in a few minutes. I also attempted to override the tell operator as per another answer, but that was not possible because self is a final val. Eventually I just replaced:
self ! whatever
with:
sendToSelf(whatever)
and added that method into the actor as:
// test can override this
protected def sendToSelf(msg: Any): Unit = {
  self ! msg
}
Then, in the test, I overrode the method to capture the self-sent message and sent it back into the FSM to complete the work:
@transient var sent: Seq[Any] = Seq.empty
val fsm = TestFSMRef(new MyActor(x,yz) {
  override def sendToSelf(msg: Any): Unit = {
    sent = sent :+ msg
  }
})
// yes this is clunky but it works
var wait = 100
while (sent.isEmpty && wait > 0) {
  Thread.sleep(10)
  wait = wait - 10
}
fsm ! sent.head

How can I retrieve the first-completed Actor in a group of Actors in Scala?

I have a moderate number of long-running Actors and I wish to write a synchronous function that returns the first one of these that completes. I can do it with a spin-wait on futures, e.g.:
while (!fs.exists(f => f.isSet)) {
  Thread.sleep(100)
}
val completeds = fs.filter(f => f.isSet)
completeds.head
but that seems very "un-Actor-y".
The scala.actors.Futures class has two methods awaitAll() and awaitEither() that seem awfully close; if there were an awaitAny() I'd jump on it. Am I missing a simple way to do this or is there a common pattern that is applicable?
A more "actorish" way of waiting for completion is to create an actor in charge of handling completed results (let's call it ResultHandler).
Instead of replying, workers send their answers to ResultHandler in a fire-and-forget manner. The latter continues processing results while the other workers complete their jobs.
The key for me was the discovery that every (?) Scala object is, implicitly, an Actor, so you can use Actor.receive { } to block. Here is my source code:
import scala.actors._
import scala.actors.Actor._
//Top-level class that wants to return the first-completed result from some long-running actors
class ConcurrentQuerier() {
  //Synchronous function; perhaps fulfilling some legacy interface
  def synchronousQuery : String = {
    //Instantiate and start the monitoring Actor
    val progressReporter = new ProgressReporter(self) //All (?) objects are Actors
    progressReporter.start()
    //Instantiate the long-running Actors, giving each a handle to the monitor
    val lrfs = List(
      new LongRunningFunction(0, 2000, progressReporter),
      new LongRunningFunction(1, 2500, progressReporter),
      new LongRunningFunction(3, 1500, progressReporter),
      new LongRunningFunction(4, 1495, progressReporter),
      new LongRunningFunction(5, 1500, progressReporter),
      new LongRunningFunction(6, 5000, progressReporter)
    )
    //Start 'em
    lrfs.foreach { lrf =>
      lrf.start()
    }
    println("All actors started...")
    val start = System.currentTimeMillis()
    /*
    This blocks until it receives a String in the Inbox.
    Who sends the string? A: the progressReporter, which is monitoring the LongRunningFunctions
    */
    val s = receive {
      case s:String => s
    }
    println("Received " + s + " after " + (System.currentTimeMillis() - start) + " ms")
    s
  }
}
/*
An Actor that reacts to a message that is a tuple ("COMPLETED", someResult) and sends the
result to this Actor's owner. Not strictly necessary (the LongRunningFunctions could post
directly to the owner's mailbox), but I like the idea that monitoring is important enough
to deserve its own object
*/
class ProgressReporter(val owner : Actor) extends Actor {
  def act() = {
    println("progressReporter awaiting news...")
    react {
      case ("COMPLETED", s) =>
        println("progressReporter received a completed signal " + s)
        owner ! s
      case s =>
        println("Unexpected message: " + s)
        act()
    }
  }
}
/*
Some long running function
*/
class LongRunningFunction(val id : Int, val timeout : Int, val supervisor : Actor) extends Actor {
  def act() = {
    //Do the long-running query
    val s = longRunningQuery()
    println(id.toString + " finished, sending results")
    //Send the results back to the monitoring Actor (the progressReporter)
    supervisor ! ("COMPLETED", s)
  }
  def longRunningQuery() : String = {
    println("Starting Agent " + id + " with timeout " + timeout)
    Thread.sleep(timeout)
    "Query result from agent " + id
  }
}
val cq = new ConcurrentQuerier()
//I don't think the Actor semantics guarantee that the result is absolutely, positively the first to have posted the "COMPLETED" message
println("Among the first to finish was : " + cq.synchronousQuery)
Typical results look like:
scala ActorsNoSpin.scala
progressReporter awaiting news...
All actors started...
Starting Agent 1 with timeout 2500
Starting Agent 5 with timeout 1500
Starting Agent 3 with timeout 1500
Starting Agent 4 with timeout 1495
Starting Agent 6 with timeout 5000
Starting Agent 0 with timeout 2000
4 finished, sending results
progressReporter received a completed signal Query result from agent 4
Received Query result from agent 4 after 1499 ms
Among the first to finish was : Query result from agent 4
5 finished, sending results
3 finished, sending results
0 finished, sending results
1 finished, sending results
6 finished, sending results
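For comparison, here is a minimal sketch of the same "first result wins" idea using scala.concurrent, which provides Future.firstCompletedOf. It swaps in futures for the scala.actors code above rather than reproducing it. The FirstCompleted object, the query helper and the delays are hypothetical and only stand in for real long-running work.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FirstCompleted {
  // Stand-in for a long-running query (hypothetical helper).
  def query(id: Int, delayMs: Long): Future[String] = Future {
    // `blocking` tells the global pool this task sleeps, so it may add threads.
    blocking { Thread.sleep(delayMs) }
    "Query result from agent " + id
  }

  // Blocks the caller until one of the queries has completed and
  // returns that query's result; the other queries keep running.
  def synchronousQuery(): String = {
    val fs = List(query(0, 300), query(1, 150), query(2, 20))
    Await.result(Future.firstCompletedOf(fs), 10.seconds)
  }
}
```

As with the actor version, which query's result comes back depends on scheduling; the call only guarantees that some completed result is returned, or that it times out after the given duration.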

Scala program exiting before the execution and completion of all Scala Actor messages being sent. How to stop this?

I am sending my Scala Actor its messages from a for loop. The Scala actor is receiving the messages and getting to the job of processing them. The actors are processing CPU- and disk-intensive tasks such as unzipping and storing files. I deduced that the Actor part is working fine by putting a delay, Thread.sleep(200), in my message-passing code in the for loop.
for (e <- entries) {
  MyActor ! new MyJob(e)
  Thread.sleep(100)
}
Now, my problem is that the program exits with code 0 as soon as the for loop finishes execution, thus preventing my Actors from finishing their jobs. How do I get around this? This may be a real n00b question. Any help is highly appreciated!
Edit 1:
This solved my problem for now:
while (MyActor.getState != Actor.State.Terminated)
  Thread.sleep(3000)
Is this the best I can do?
Assume you have one actor whose work you want to wait for. To avoid sleeping, you can create a SyncVar and wait in the main thread for it to be set:
import scala.concurrent.SyncVar
val sv = new SyncVar[Boolean]
// start the actor
actor {
  // do something
  sv.set(true)
}
sv.take
The main thread will wait until some value is assigned to sv, and then be woken up.
If there are multiple actors, then you can either have multiple SyncVars, or do something like this:
class Ref(var count: Int)
val numactors = 50
val cond = new Ref(numactors)
// start your actors
for (i <- 0 until numactors) actor {
  // do something
  cond.synchronized {
    cond.count -= 1
    cond.notify()
  }
}
cond.synchronized {
  while (cond.count != 0) cond.wait()
}
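The counter with wait/notify above is essentially what java.util.concurrent.CountDownLatch packages up. Here is a minimal sketch, assuming plain threads stand in for the actors; LatchDemo and runWorkers are hypothetical names, and the "job" is just an increment.

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicInteger

object LatchDemo {
  // Starts n workers and blocks until every one of them has finished.
  // Returns how many jobs were completed.
  def runWorkers(n: Int): Int = {
    val done = new CountDownLatch(n)
    val completed = new AtomicInteger(0)
    for (_ <- 0 until n) {
      val worker = new Thread(new Runnable {
        def run(): Unit = {
          try completed.incrementAndGet() // stands in for the real job
          finally done.countDown()        // always signal, even if the job throws
        }
      })
      worker.start()
    }
    done.await() // the main thread sleeps here until the count reaches zero
    completed.get()
  }
}
```

Calling LatchDemo.runWorkers(50) returns 50 once all workers are done, without any polling or sleeping in the main thread.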