Akka/Scala detect closed TCP connection? - scala

I'm trying to write a TCP server using Akka and Scala which instantiates an actor when a client connects and stops it when the client disconnects. I have a TCP binding actor,
class Server extends Actor {
  import Tcp._
  import context.system

  IO(Tcp) ! Bind(self, new InetSocketAddress("localhost", 9595))

  def receive = {
    case Bound(localAddress) =>
      println("Server Bound")
      println(localAddress)
    case CommandFailed(_: Bind) => context stop self
    case Connected(remote, local) =>
      val handler = context.system.actorOf(Props[ConnHandler])
      val connection = sender()
      connection ! Register(handler)
  }
}
The above instantiates a TCP listener on localhost:9595 and registers handler actors to each connection.
I then have the receive def in my ConnHandler class defined as follows, abbreviated with ... where the code behaves correctly.
case received => { ... }
case PeerClosed => {
  println("Stopping")
  // Other actor stopping, cleanup.
  context stop self
}
(See http://doc.akka.io/docs/akka/snapshot/scala/io-tcp.html for the docs I used to write this - it uses more or less the same code in the PeerClosed case)
However, when I close the socket client-side, the actor remains running, and no "Stopping" message is printed.
I have no non-Windows machines nearby to test this on. I believe this is down to running on Windows: after googling around, I found a still-open bug - https://github.com/akka/akka/issues/17122 - which refers to some Close events being missed on Windows-based systems.
Have I made a silly error in my code, or would this be part of the bug linked above?
I could add a case for Received(data) that closes the connection, but a disconnect caused by a network failure would still leave the server in an unrecoverable state, requiring the application to be restarted: a secondary, shared actor would be left believing the client is still connected, so the server would reject further connections from that user.
Edit:
I've worked around this by adding a watchdog timer actor whose action fires after some period of inactivity. The ConnHandler actor resets the watchdog timer every time an event happens on the connection. Although not ideal, it does what I want.
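For what it's worth, Akka's built-in idle timeout can play this watchdog role without a separate timer actor, since any incoming message resets it automatically. A minimal sketch (the 30-second window is an arbitrary assumption):

```scala
import akka.actor.{Actor, ReceiveTimeout}
import akka.io.Tcp
import scala.concurrent.duration._

class ConnHandler extends Actor {
  import Tcp._

  // Any incoming message resets this timer; if the connection goes
  // quiet for 30 seconds we assume the peer is gone.
  context.setReceiveTimeout(30.seconds)

  def receive = {
    case Received(data) =>
      // ... normal handling ...
    case ReceiveTimeout =>
      println("Connection idle too long; stopping")
      context stop self
    case PeerClosed =>
      println("Stopping")
      context stop self
  }
}
```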

Edit 12/9/2016:
It turns out ConnHandler will receive a PeerClosed message even when a client unexpectedly disconnects.

Related

Future[Source] pipeTo an Actor

There are two local actors (remoting is not used). The actors are simplified for the example:
class ProcessorActor extends Actor {
  override def receive: Receive = {
    case src: Source[Int, NotUsed] =>
      // TODO processing of `src` here
  }
}

class FrontendActor extends Actor {
  val processor = context.system.actorOf(Props[ProcessorActor])
  ...
  override def receive: Receive = {
    case "Hello" =>
      val f: Future[Source[Int, NotUsed]] = Future(Source(1 to 100))
      f pipeTo processor
  }
}
// entry point:
val frontend = system.actorOf(Props[FrontendActor])
frontend ! "Hello"
Thus the FrontendActor sends a Source to the ProcessorActor. In the above example it works successfully.
Is such approach okay?
It's unclear what your concern is.
Sending a Source from one actor to another actor on the same JVM is fine. Because inter-actor communication on the same JVM, as the documentation states, "is simply done via reference passing," there is nothing unusual about your example.¹ Essentially what is going on is that a reference to a Source is passed to ProcessorActor once the Future is completed. A Source is an object that defines part of a stream; you can send a Source from one actor to another actor locally just as you can any JVM object.
(However, once you cross the boundary of a single JVM, you have to deal with serialization.)
¹ A minor, tangential observation: FrontendActor calls context.system.actorOf(Props[ProcessorActor]), which creates a top-level actor. Typically, top-level actors are created in the main program, not within an actor.
Yes, this is OK, but does not work quite how you describe it. FrontendActor does not send Future[Source], it just sends Source.
From the docs:
pipeTo installs an onComplete-handler on the future to affect the submission of the result to another actor.
In other words, pipeTo means "send the result of this Future to the actor when it becomes available".
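To make that concrete, here is a hand-rolled stand-in for the pattern using plain Scala futures, no Akka (pipeToLike and receiver are illustrative names, not Akka API):

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

// A mailbox stand-in for the target actor.
val received = Promise[Any]()
def receiver(msg: Any): Unit = received.trySuccess(msg)

// Roughly what pipeTo does: install an onComplete handler that forwards
// the *result* of the future to the target - never the Future itself.
def pipeToLike[T](f: Future[T])(target: Any => Unit): Unit =
  f.onComplete {
    case Success(value) => target(value)
    case Failure(e)     => target(e) // Akka wraps this in Status.Failure
  }

pipeToLike(Future(21 * 2))(receiver)
val delivered = Await.result(received.future, 2.seconds)
// delivered is the plain value 42, not a Future
```

This is why ProcessorActor's receive can pattern-match on Source[Int, NotUsed] directly: by the time the message arrives, the Future has already been unwrapped.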
Note that this will work even if remoting is being used because the Future is resolved locally and is not sent over the wire to a remote actor.

Why doesn't Play framework close the Akka Stream?

An actor initializes an Akka stream which connects to a websocket. This is done by using a Source.actorRef, to which messages can be sent; they are then processed by the webSocketClientFlow and consumed by a Sink.foreach. This can be seen in the following code (derived from the akka docs):
class TestActor @Inject()(implicit ec: ExecutionContext) extends Actor with ActorLogging {
  final implicit val system: ActorSystem = ActorSystem()
  final implicit val materializer: ActorMaterializer = ActorMaterializer()

  def receive = {
    case _ =>
  }

  // Consume the incoming messages from the websocket.
  val incoming: Sink[Message, Future[Done]] =
    Sink.foreach[Message] {
      case message: TextMessage.Strict =>
        println(message.text)
      case misc => println(misc)
    }

  // Source through which we can send messages to the websocket.
  val outgoing: Source[TextMessage, ActorRef] =
    Source.actorRef[TextMessage.Strict](bufferSize = 10, OverflowStrategy.fail)

  // Flow to use (note: not re-usable!)
  val webSocketFlow = Http().webSocketClientFlow(WebSocketRequest("wss://ws-feed.gdax.com"))

  // Materialize the stream.
  val ((ws, upgradeResponse), closed) =
    outgoing
      .viaMat(webSocketFlow)(Keep.both)
      .toMat(incoming)(Keep.both) // also keep the Future[Done]
      .run()

  // Check whether the server has accepted the websocket request.
  val connected = upgradeResponse.flatMap { upgrade =>
    if (upgrade.response.status == StatusCodes.SwitchingProtocols) {
      Future.successful(Done)
    } else {
      throw new RuntimeException(s"Failed: ${upgrade.response.status}")
    }
  }

  // When the connection has been established.
  connected.onComplete(println)

  // When the stream has closed.
  closed.onComplete {
    case Success(_) => println("Test Websocket closed gracefully")
    case Failure(e) => log.error("Test Websocket closed with an error\n", e)
  }
}
When the Play framework recompiles, it closes the TestActor but does not close the Akka stream. Only when the websocket times out is the stream closed.
Does this mean that I need to close the stream manually, for example by sending the actor created with Source.actorRef a PoisonPill in TestActor's postStop hook?
Note: I also tried to inject the Materializer and the ActorSystem, i.e.:
@Inject()(implicit ec: ExecutionContext, implicit val mat: Materializer, implicit val system: ActorSystem)
When Play recompiles, the stream is closed, but an error is also produced:
[error] a.a.ActorSystemImpl - Websocket handler failed with
Processor actor [Actor[akka://application/user/StreamSupervisor-62/flow-0-0-ignoreSink#989719582]]
terminated abruptly
In your first example, you're creating an actor system in your actor. You should not do that - actor systems are expensive; creating one means starting thread pools, starting schedulers, etc. Plus, you're never shutting it down, which means you've got a much bigger problem than the stream not shutting down: you have a resource leak, since the thread pools created by the actor system are never shut down.
So essentially, every time you receive a WebSocket connection, you're creating a new actor system with a new set of thread pools, and you're never shutting them down. In production, with even a small load (a few requests per second), your application is going to run out of memory within a few minutes.
In general in Play, you should never create your own actor system; have one injected instead. From within an actor you don't even need to have it injected, because it automatically is - context.system gives you access to the actor system that created the actor. Likewise with materializers: these aren't as heavyweight, but if you create one per connection and never shut it down, you could also run out of memory, so you should have it injected.
So when you do have it injected, you get an error. This is hard to avoid, though not impossible. The difficulty is that Akka itself can't really know automatically what order things need to be shut down in to close gracefully: should it shut your actor down first, so that it can shut the streams down gracefully, or should it shut the streams down, so that they can notify your actor that they are shut down and it can respond accordingly?
Akka 2.5 has a solution for this, a managed shutdown sequence, where you can register things to be shutdown before the Actor system starts killing things in a somewhat random order:
https://doc.akka.io/docs/akka/2.5/scala/actors.html#coordinated-shutdown
You can use this in combination with Akka streams kill switches to shutdown your streams gracefully before the rest of the application is shut down.
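As a sketch of the kill-switch route, reusing the names outgoing, webSocketFlow and incoming from the question (Akka Streams 2.5 API assumed, not tested against this exact code):

```scala
import akka.http.scaladsl.model.ws.Message
import akka.stream.KillSwitches
import akka.stream.scaladsl.Keep

// Materialize a kill switch into the middle of the stream.
val (((ws, upgradeResponse), killSwitch), closed) =
  outgoing
    .viaMat(webSocketFlow)(Keep.both)
    .viaMat(KillSwitches.single[Message])(Keep.both)
    .toMat(incoming)(Keep.both)
    .run()

// Later - e.g. in the actor's postStop, or from a CoordinatedShutdown
// task - complete the stream deliberately instead of waiting for a timeout:
killSwitch.shutdown()
```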
But generally, the shutdown errors are fairly benign, so if it were me I wouldn't worry about them.

Identifying an Actor using an ActorSelection

I'm writing an Actor that should watch another Actor; let's call the latter one the target. My Actor should stop itself once its target is stopped. For this target I only have an ActorSelection. To watch it, I obviously need an ActorRef, so I figured I should send the ActorSelection an Identify message; when it replies back with ActorIdentity I would have its ActorRef. So far so good, but I can't get it to work.
Here's the spec:
// Arrange
val probe = TestProbe()
val target = TestProbe().ref
val sut = system.actorOf(MyActor.props(system.actorSelection(target.path)), "watch-target")
probe watch sut
// Act
target ! PoisonPill
// Assert
probe.expectTerminated(sut)
And the implementation (an FSM, details skipped):
log.debug("Asking target selection {} to identify itself; messageId={}", selection.toString(), messageId)
selection ! Identify(messageId)

when(Waiting) {
  case Event(ActorIdentity(`messageId`, Some(ref)), Queue(q)) =>
    log.info("Received identity for remote target: {}", ref)
    context.watch(ref)
    goto(NextState) using TargetFound(ref)
  case Event(ActorIdentity(`messageId`, None), Queue(q)) =>
    log.error("Could not find requested target {}", selection.toString())
    stop()
}
initialize()
Now, when I run my test, it is green because the system under test is indeed stopped. But the problem is it stops itself because it can't find its target using the aforementioned steps. The log file says:
Asking target selection ActorSelection[Anchor(akka://default/), Path(/system/testProbe-3)] to identify itself; messageId=871823258
Could not find requested target ActorSelection[Anchor(akka://default/), Path(/system/testProbe-3)]
Am I missing something obvious here? Maybe a TestProbe should not reveal its real identity? I even tried by instantiating a dummy Actor as target but the results are the same. Any clue?
Turns out the answer is actually very simple: the test runs so fast that, before MyActor sends the Identify message to the selection, the actor behind the selection has already received its PoisonPill and been killed.
Adding a little Thread.sleep() before sending that PoisonPill fixed the issue.
The target actor is getting terminated before the identify request is being made. This is because Akka only guarantees order when sending messages between a given pair of actors.
If you add a Thread.sleep above the following line, the identify request should succeed.
Thread.sleep(100)
// Act
target ! PoisonPill
Note that there may be better ways to code the test - sleeping the thread is not ideal.
Your watching actor should also handle the Terminated message of the target actor, as described here.
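A minimal sketch of that last point, with illustrative names rather than the question's FSM:

```scala
import akka.actor.{Actor, ActorRef, Terminated}

// Once the identify handshake has produced an ActorRef,
// watch it and stop ourselves when it terminates.
class TargetWatcher(target: ActorRef) extends Actor {
  context.watch(target)

  def receive = {
    case Terminated(`target`) =>
      // The watched actor is gone; follow it down.
      context stop self
  }
}
```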

How can I gather state information from a set of actors using only the actorSystem?

I'm creating an actor system, which has a list of actors representing some kind of session state.
These session are created by a factory actor (which might, in the future, get replaced by a router, if performance requires that - this should be transparent to the rest of the system, however).
Now I want to implement an operation where I get some state information from each of my currently existing session actors.
I have no explicit session list, as I want to rely on the actor system "owning" the sessions. I tried to use the actor system to look up the current session actors. The problem is that I did not find a "get all actor refs with this naming pattern" method. I tried to use the "/" operator on the system, followed by resolveOne - but got lost in a maze of future types.
The basic idea I had was:
- Send a message to all current session actors (as given to my by my ActorSystem).
- Wait for a response from them (preferably using just the "ask" pattern - the method calling this broadcast request/response is just a monitoring resp. debugging method, so blocking is no problem here).
- And then collect the responses into a result.
After a death match against Scala's type system I had to give up for now.
Is there really no way of doing something like this?
If I understand the question correctly, then I can offer up a couple of ways you can accomplish this (though there are certainly others).
Option 1
In this approach, there will be an actor that is responsible for waking up periodically and sending a request to all session actors to get their current stats. That actor will use ActorSelection with a wildcard to accomplish that goal. A rough outline of the code for this approach is as follows:
case class SessionStats(foo: Int, bar: Int)
case object GetSessionStats

class SessionActor extends Actor {
  def receive = {
    case GetSessionStats =>
      println(s"${self.path} received a request to get stats")
      sender ! SessionStats(1, 2)
  }
}

case object GatherStats

class SessionStatsGatherer extends Actor {
  context.system.scheduler.schedule(5 seconds, 5 seconds, self, GatherStats)(context.dispatcher)

  def receive = {
    case GatherStats =>
      println("Waking up to gather stats")
      val sel = context.system.actorSelection("/user/session*")
      sel ! GetSessionStats
    case SessionStats(f, b) =>
      println(s"got session stats from ${sender.path}, values are $f and $b")
  }
}
Then you could test this code with the following:
val system = ActorSystem("test")
system.actorOf(Props[SessionActor], "session-1")
system.actorOf(Props[SessionActor], "session-2")
system.actorOf(Props[SessionStatsGatherer])
Thread.sleep(10000)
system.actorOf(Props[SessionActor], "session-3")
So with this approach, as long as we use a naming convention, we can use an actor selection with a wildcard to always find all of the session actors even though they are constantly coming (starting) and going (stopping).
Option 2
A somewhat similar approach, but in this one, we use a centralized actor to spawn the session actors and act as a supervisor to them. This central actor also contains the logic to periodically poll for stats, but since it's the parent, it does not need an ActorSelection and can instead just use its children list. That would look like this:
case object SpawnSession

class SessionsManager extends Actor {
  context.system.scheduler.schedule(5 seconds, 5 seconds, self, GatherStats)(context.dispatcher)

  var sessionCount = 1

  def receive = {
    case SpawnSession =>
      val session = context.actorOf(Props[SessionActor], s"session-$sessionCount")
      println(s"Spawned session: ${session.path}")
      sessionCount += 1
      sender ! session
    case GatherStats =>
      println("Waking up to get session stats")
      context.children foreach (_ ! GetSessionStats)
    case SessionStats(f, b) =>
      println(s"got session stats from ${sender.path}, values are $f and $b")
  }
}
And could be tested as follows:
val system = ActorSystem("test")
val manager = system.actorOf(Props[SessionsManager], "manager")
manager ! SpawnSession
manager ! SpawnSession
Thread.sleep(10000)
manager ! SpawnSession
Now, these examples are extremely trivialized, but hopefully they paint a picture of how you could go about solving this issue with either ActorSelection or a management/supervision dynamic. As a bonus, ask is not needed in either approach, and there is no blocking.
There have been many additional changes in this project, so my answer/comments have been delayed quite a bit :-/
First, the session stats gathering should not be periodical, but on request. My original idea was to "mis-use" the actor system as my map of all existing session actors, so that I would not need a supervisor actor knowing all sessions.
This goal has proved elusive - session actors depend on shared state, so the session creator must watch the sessions anyway.
This makes Option 2 the obvious answer here - the session creator has to watch all of its children anyway.
The most vexing hurdle with option 1 was determining when all (current) answers have arrived. I wanted the statistics request to take a snapshot of all currently existing actor names, query them, and ignore failures (if a session dies before it can be queried, it can be ignored) - the statistics request is only a debugging tool, i.e. something like a "best effort".
The actor selection API tangled me up in a thicket of futures (I am a Scala/Akka newbie), so I gave up on this route.
Option 2 is therefore better suited to my needs.

Scala how to use akka actors to handle a timing out operation efficiently

I am currently evaluating javascript scripts using Rhino in a restful service. I want evaluations to be subject to a timeout.
I have created a mock example actor (using scala 2.10 akka actors).
case class Evaluate(expression: String)

class RhinoActor extends Actor {
  override def preStart() = { println("Start context"); super.preStart() }

  def receive = {
    case Evaluate(expression) =>
      Thread.sleep(100)
      sender ! "complete"
  }

  override def postStop() = { println("Stop context"); super.postStop() }
}
Now I use this actor as follows:
def run {
  val t = System.currentTimeMillis()
  val system = ActorSystem("MySystem")
  val actor = system.actorOf(Props[RhinoActor])
  implicit val timeout = Timeout(50 milliseconds)
  val future = (actor ? Evaluate("10 + 50")).mapTo[String]
  val result = Try(Await.result(future, Duration.Inf))
  println(System.currentTimeMillis() - t)
  println(result)
  actor ! PoisonPill
  system.shutdown()
}
Is it wise to use the ActorSystem in a closure like this which may have simultaneous requests on it?
Should I make the ActorSystem global, and will that be ok in this context?
Is there a more appropriate alternative approach?
EDIT: I think I need to use futures directly, but I will need the preStart and postStop. Currently investigating.
EDIT: Seems you don't get those hooks with futures.
I'll try and answer some of your questions for you.
First, an ActorSystem is a very heavy weight construct. You should not create one per request that needs an actor. You should create one globally and then use that single instance to spawn your actors (and you won't need system.shutdown() anymore in run). I believe this covers your first two questions.
Your approach of using an actor to execute javascript here seems sound to me. But instead of spinning up an actor per request, you might want to pool a bunch of the RhinoActors behind a Router, with each instance having its own Rhino engine that is set up during preStart. Doing this will eliminate per-request Rhino initialization costs, speeding up your js evaluations. Just make sure you size your pool appropriately. Also, you won't need to be sending PoisonPill messages per request if you adopt this approach.
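A sketch of the pooled setup, assuming Akka's round-robin pool router (the pool size of 5 is an arbitrary choice; size it for your workload):

```scala
import akka.actor.{ActorSystem, Props}
import akka.routing.RoundRobinPool

val system = ActorSystem("MySystem")

// Five RhinoActor instances behind one router ref; each instance can
// initialize its own Rhino engine in preStart and keep it for its lifetime.
val rhinoPool = system.actorOf(RoundRobinPool(5).props(Props[RhinoActor]), "rhino-pool")

// Callers just ask the router; no per-request actor creation or PoisonPill:
// val future = (rhinoPool ? Evaluate("10 + 50")).mapTo[String]
```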
You also might want to look into the non-blocking callbacks onComplete, onSuccess and onFailure as opposed to using the blocking Await. These callbacks also respect timeouts and are preferable to blocking for higher throughput. As long as whatever is way way upstream waiting for this response can handle the asynchronicity (i.e. an async capable web request), then I suggest going this route.
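As a sketch, the blocking Await in the question's run method could be replaced with a callback along these lines (names reused from the question's code):

```scala
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._
import scala.util.{Failure, Success}
import system.dispatcher // execution context for the callback

implicit val timeout = Timeout(50.milliseconds)

val future = (actor ? Evaluate("10 + 50")).mapTo[String]
future.onComplete {
  case Success(result) => println(result)                     // completed in time
  case Failure(e)      => println(s"failed or timed out: $e") // e.g. AskTimeoutException
}
// Control returns immediately; no thread is parked in Await.result.
```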
The last thing to keep in mind is that even though code will return to the caller after the timeout if the actor has yet to respond, the actor still goes on processing that message (performing the evaluation). It does not stop and move onto the next message just because a caller timed out. Just wanted to make that clear in case it wasn't.
EDIT
In response to your comment about stopping a long execution, there are some things related to Akka to consider first. You can stop the actor, or send it a Kill or a PoisonPill, but none of these will stop it from processing the message that it's currently processing. They just prevent it from receiving new messages. In your case, with Rhino, if infinite script execution is a possibility, then I suggest handling this within Rhino itself. I would dig into the answers on this post (Stopping the Rhino Engine in middle of execution) and set up your Rhino engine in the actor in such a way that it will stop itself if it has been executing for too long. That failure will kick out to the supervisor (if pooled) and cause that pooled instance to be restarted, which will init a new Rhino in preStart. This might be the best approach for dealing with the possibility of long running scripts.