How to set a socket read timeout in IOManager - scala

In Akka, IOManager "is the recommended entry point to creating sockets for performing IO." I'm looking at the API and wondering how to set a read timeout. Of course, I can just schedule a message to fire in n seconds and close the socket, but by then the actor may already have read all the data and be busy processing it. So it's not truly a read timeout. Any ideas how to do this? Or must I somehow introduce some state to my actor?

Ok, Derek Williams gave me a hint on akka-user. Here's the code, just in case anyone else needs to do something similar.
When we accept a new client, we set a timer for 5 seconds to close the connection.
def receive = {
  case IO.NewClient(server) =>
    val socket = server.accept()
    val readTimeout = context.system.scheduler.scheduleOnce(5 seconds, self, Timeout(socket))
    state(socket) flatMap (_ => MyServer.processRequest(socket, readTimeout))
  case IO.Read(socket, bytes) =>
    state(socket)(IO Chunk bytes)
  case IO.Closed(socket, cause) =>
    state(socket)(IO EOF None)
    state -= socket
  case Timeout(socket) =>
    socket.close()
}
And to cancel the timeout after we've read, we call cancel() on the Cancellable schedule.
object MyServer {
  def processRequest(socket: IO.SocketHandle, readTimeout: Cancellable): IO.Iteratee[Unit] =
    for {
      request <- readRequest
    } yield {
      // The request has been read in full, so the read timeout no longer applies.
      readTimeout.cancel()
      val response = ... // build the response from `request` (elided in the original)
      socket write ByteString(response).compact
      socket.close()
    }
}

Related

Scala Akka Typed - send request inside behavior with ask

I'm kinda new to akka typed and I was trying to send a message which requires an answer within a given time.
I found the request-response pattern with ask, which seemed interesting, but is there a way to use it inside an already defined Behaviors.receive?
The idea here is to call nextPlayerTurn each time a player answers or after a timeout.
override def refereeTurn(): Behavior[Msg] = Behaviors.receive {
  case (_, msg: GuessMsg) =>
    if(currentPlayer.isDefined && currentPlayer.get == msg.getSender) {
      controller ! msg
    } else {
      println("Player tried to guess after Timeout")
    }
    Behaviors.same
  case (context, msg: ReceivedResponseMsg) =>
    if(currentPlayer.isDefined && currentPlayer.get == msg.getSender)
      nextPlayerTurn(context)
    Behaviors.same
  ...
}
...
/**
 * Tells a player to start their turn and sets a timer that defines how long the player has to make a guess.
 * If no guess is made in time, sends that player an end-of-turn message, fails the promise of their turn
 * and lets the next player take their turn.
 */
override def nextPlayerTurn(ctx: ActorContext[Msg]): Unit = {
  implicit val timeout: Timeout = Timeout.timeout
  currentPlayer = Option(turnManager.nextPlayer)
  ctx.ask[Msg,Msg](currentPlayer.get, ref => YourTurnMsg(ref)) {
    case Success(msg: GuessMsg) => println("\n SUCCESS"); msg
    case Failure(_) => println(currentPlayer.get +" didn't guess in time"); TurnEnd(currentPlayer.get)
    case _ => TurnEnd(currentPlayer.get)
  }
}
In this case, after the YourTurnMsg is sent, the player is supposed to respond with a GuessMsg, which should stop the timer. This never happens: the GuessMsg is matched by the case inside the refereeTurn behavior instead of by the Success case, so the ask always ends in a Failure after the timeout.
Did I get the wrong idea about the ask pattern, and should I just make a new behavior with a timer?
If you want to use the ask pattern then the code that handles the result needs to send a message to the main actor rather than trying to do any processing directly. You can send a different message based on the result or just send the raw result and process it in the actor, but you must not do anything that depends on actor state in that code because it could be run on a different thread.
But ask is not cheap so in this case it seems better to just set a timer and see which message comes back first.
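The timer-based alternative could look roughly like the sketch below. It assumes the Akka Typed 2.6 `Behaviors.withTimers` API; the names `TurnTimer`, `YourTurn`, `Guess`, `TurnTimedOut` and `TurnTimerKey` are hypothetical stand-ins for the question's own message types, and `players` is assumed to be non-empty.

```scala
import scala.concurrent.duration._
import akka.actor.typed.{ActorRef, Behavior}
import akka.actor.typed.scaladsl.Behaviors

object TurnTimer {
  // Hypothetical protocol, standing in for YourTurnMsg / GuessMsg / TurnEnd.
  sealed trait Msg
  final case class YourTurn(replyTo: ActorRef[Msg]) extends Msg
  final case class Guess(player: ActorRef[Msg], value: Int) extends Msg
  final case class TurnTimedOut(player: ActorRef[Msg]) extends Msg

  private case object TurnTimerKey

  // Assumes `players` is non-empty.
  def apply(players: Vector[ActorRef[Msg]], turnLength: FiniteDuration): Behavior[Msg] =
    Behaviors.withTimers[Msg] { timers =>
      def turn(index: Int): Behavior[Msg] =
        Behaviors.setup[Msg] { context =>
          val current = players(index % players.size)
          // Start the turn and arm a single-shot timer for it.
          current ! YourTurn(context.self)
          timers.startSingleTimer(TurnTimerKey, TurnTimedOut(current), turnLength)

          Behaviors.receiveMessage {
            case Guess(player, value) if player == current =>
              // The guess arrived first: cancel the timer and move on.
              timers.cancel(TurnTimerKey)
              context.log.info(s"$player guessed $value")
              turn(index + 1)
            case TurnTimedOut(player) if player == current =>
              // The timer fired first: the player loses the turn.
              context.log.info(s"$player didn't guess in time")
              turn(index + 1)
            case _ =>
              // Late guesses and anything else are ignored.
              Behaviors.same
          }
        }
      turn(0)
    }
}
```

Both the guess and the timeout arrive as ordinary messages, so all state changes happen inside the actor and nothing runs on another thread.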

Alpakka S3 connector stream won't handle the load, throwing akka.stream.BufferOverflowException

I have an akka-http service and I am trying out the alpakka s3 connector for uploading files. Previously I was using a temporary file and then uploading it with the Amazon SDK. This approach required some adjustments to make the Amazon SDK more Scala-like, but it could handle even 1000 requests at once. Throughput wasn't amazing, but all of the requests went through eventually. Here is the code before the changes, with no alpakka:
```
path("uploadfile") {
  withRequestTimeout(20.seconds) {
    storeUploadedFile("csv", tempDestination) {
      case (metadata, file) =>
        val uploadFuture = upload(file, file.toPath.getFileName.toString)
        onComplete(uploadFuture) {
          case Success(_) => complete(StatusCodes.OK)
          case Failure(_) => complete(StatusCodes.FailedDependency)
        }
    }
  }
}

case class S3UploaderException(msg: String) extends Exception(msg)

def upload(file: File, key: String): Future[String] = {
  val s3Client = AmazonS3ClientBuilder.standard()
    .withCredentials(new DefaultAWSCredentialsProviderChain())
    .withRegion(Regions.EU_WEST_3)
    .build()
  val promise = Promise[String]()
  // Complete the promise from the SDK's progress callbacks.
  val listener = new ProgressListener() {
    override def progressChanged(progressEvent: ProgressEvent): Unit = {
      (progressEvent.getEventType: @unchecked) match {
        case ProgressEventType.TRANSFER_FAILED_EVENT => promise.failure(S3UploaderException(s"Uploading a file with a key: $key"))
        case ProgressEventType.TRANSFER_COMPLETED_EVENT |
             ProgressEventType.TRANSFER_CANCELED_EVENT => promise.success(key)
      }
    }
  }
  val request = new PutObjectRequest("S3_BUCKET", key, file)
  request.setGeneralProgressListener(listener)
  s3Client.putObject(request)
  promise.future
}
```
When I changed this to use the alpakka connector, the code looks much nicer, as we can just connect the ByteSource and the alpakka Sink together. However, this approach cannot handle such a big load. When I execute 1000 requests at once (10 kB files), fewer than 10% go through and the rest fail with this exception:
akka.stream.alpakka.s3.impl.FailedUpload: Exceeded configured
max-open-requests value of [32]. This means that the request queue of
this pool
(HostConnectionPoolSetup(bargain-test.s3-eu-west-3.amazonaws.com,443,ConnectionPoolSetup(ConnectionPoolSettings(4,0,5,32,1,30
seconds,ClientConnectionSettings(Some(User-Agent: akka-http/10.1.3),10
seconds,1
minute,512,None,WebSocketSettings(,ping,Duration.Inf,akka.http.impl.settings.WebSocketSettingsImpl$$$Lambda$4787/1279590204@4d809f4c),List(),ParserSettings(2048,16,64,64,8192,64,8388608,256,1048576,Strict,RFC6265,true,Set(),Full,Error,Map(If-Range
-> 0, If-Modified-Since -> 0, If-Unmodified-Since -> 0, default -> 12, Content-MD5 -> 0, Date -> 0, If-Match -> 0, If-None-Match -> 0,
User-Agent ->
32),false,true,akka.util.ConstantFun$$$Lambda$4534/1539966798@69c23cd4,akka.util.ConstantFun$$$Lambda$4534/1539966798@69c23cd4,akka.util.ConstantFun$$$Lambda$4535/297570074@6b426c59),None,TCPTransport),New,1
second),akka.http.scaladsl.HttpsConnectionContext@7e0f3726,akka.event.MarkerLoggingAdapter@74f3a78b)))
has completely filled up because the pool currently does not process
requests fast enough to handle the incoming request load. Please retry
the request later. See
http://doc.akka.io/docs/akka-http/current/scala/http/client-side/pool-overflow.html
for more information.
Here is what the summary of a Gatling test looks like:
---- Response Time Distribution ----------------------------------------
t < 800 ms 0 ( 0%)
800 ms < t < 1200 ms 0 ( 0%)
t > 1200 ms 90 ( 9%)
failed 910 ( 91%)
When I execute 100 simultaneous requests, half of them fail, so it is still not close to satisfactory.
This is the new code:
```
path("uploadfile") {
  withRequestTimeout(20.seconds) {
    extractRequestContext { ctx =>
      implicit val materializer = ctx.materializer
      extractActorSystem { actorSystem =>
        fileUpload("csv") {
          case (metadata, byteSource) =>
            val uploadFuture = byteSource.runWith(S3Uploader.sink("s3FileKey")(actorSystem, materializer))
            onComplete(uploadFuture) {
              case Success(_) => complete(StatusCodes.OK)
              case Failure(_) => complete(StatusCodes.FailedDependency)
            }
        }
      }
    }
  }
}

def sink(s3Key: String)(implicit as: ActorSystem, m: Materializer) = {
  val regionProvider = new AwsRegionProvider {
    def getRegion: String = Regions.EU_WEST_3.getName
  }
  val settings = new S3Settings(
    MemoryBufferType, None, new DefaultAWSCredentialsProviderChain(),
    regionProvider, false, None, ListBucketVersion2)
  val s3Client = new S3Client(settings)(as, m)
  s3Client.multipartUpload("S3_BUCKET", s3Key)
}
```
The complete code with both endpoints can be seen here
I have a couple of questions.
1) Is this a feature? Is this what we can call a backpressure?
2) If I would like this code to behave like the old approach with a temporary file (no failed requests and all of them finish at some point) what do I have to do? I was trying to implement a queue for the stream (link to the source below), but this made no difference. The code can be seen here.
(* DISCLAIMER * I am still a scala newbie trying to quickly understand akka streams and find some workaround for the issue. There are big chances that there is something simple wrong in this code. * DISCLAIMER *)
It’s a backpressure feature.
The error says "Exceeded configured max-open-requests value of [32]": in the config, max-open-requests is set to 32 by default.
Streaming is meant for working with large amounts of data, not for handling a very large number of requests per second.
The Akka developers had to pick some default for max-open-requests, and they chose 32. They had no idea what it would be used for: sending 1000 32 KB files at once, or 1000 1 GB files? They can't know. But they still want to make sure that by default (and probably 80% of people use the defaults) apps are handled gracefully and safely, so they had to limit the processing capacity.
You asked to do 1000 “now” but I am pretty sure AWS did not send 1000 files simultaneously but used some queue, which may be a good case for you too if you have many small files to upload.
But it is perfectly fine to tune it to your case!
If you know your machine and the target will take care of more simultaneous connections, you can change the number to a higher value.
Also, for a lot of HTTP calls, use a cached host connection pool.
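As a rough sketch of the tuning mentioned above, the pool limits can be overridden when the actor system is created. The values below are illustrative assumptions, not recommendations, and `TunedPool` and the system name are hypothetical. This also assumes the connector picks up the actor system's default `akka.http.host-connection-pool` settings. Note that `max-open-requests` must be a power of 2.

```scala
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

object TunedPool {
  // Illustrative values only; size them to what your machine and S3 can sustain.
  private val overrides = ConfigFactory.parseString(
    """
    akka.http.host-connection-pool {
      max-connections   = 16    # default is 4
      max-open-requests = 1024  # default is 32, must be a power of 2
    }
    """)

  // Overrides take precedence; everything else comes from application.conf / reference.conf.
  def createSystem(): ActorSystem =
    ActorSystem("uploads", overrides.withFallback(ConfigFactory.load()))
}
```

The same keys can be placed directly in `application.conf` instead of being set in code.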

Can I safely create a Thread in an Akka Actor?

I have an Akka Actor that I want to send "control" messages to.
This Actor's core mission is to listen on a Kafka queue, which is a polling process inside a loop.
I've found that the following simply locks up the Actor and it won't receive the "stop" (or any other) message:
class Worker() extends Actor {
  private var done = false
  def receive = {
    case "stop" =>
      done = true
      kafkaConsumer.close()
    // other messages here
  }
  // Start digesting messages!
  while (!done) {
    kafkaConsumer.poll(100).iterator.map { cr: ConsumerRecord[Array[Byte], String] =>
      // process the record
    }
  }
}
I could wrap the loop in a Thread started by the Actor, but is it ok/safe to start a Thread from inside an Actor? Is there a better way?
Basically you can, but keep in mind that this actor will be blocking, and a rule of thumb is to never block inside actors. If you still want to do this, make sure the actor runs on a dedicated thread pool (dispatcher) rather than the default one, so you don't hurt the performance of the rest of the actor system. Another way to do it is to have the actor send messages to itself to poll for new messages:
1) Receive an order to poll a message from Kafka
2) Hand the message over to the relevant actor
3) Send a message to itself to order a new poll
4) Hand it over...
Code-wise:
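case object PollMessage
class Worker() extends Actor {
  private var done = false
  // Kick off the first poll once the actor has started.
  override def preStart(): Unit = self ! PollMessage
  def receive = {
    // Stop re-scheduling polls once "stop" has closed the consumer.
    case PollMessage if !done =>
      poll()
      self ! PollMessage
    case "stop" =>
      done = true
      kafkaConsumer.close()
    // other messages here
  }
  // Poll one batch; control returns to the mailbox between batches.
  def poll() = {
    kafkaConsumer.poll(100).iterator.map { cr: ConsumerRecord[Array[Byte], String] =>
      // process the record
    }
  }
}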
I am not sure though that you will ever receive the stop message if you continuously block on the actor.
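For the "separate thread pool" option mentioned at the start of this answer, a minimal sketch with classic Akka could look like this. The dispatcher name `kafka-polling-dispatcher`, the pool size and the placeholder `Worker` body are assumptions made for illustration.

```scala
import akka.actor.{Actor, ActorSystem, Props}
import com.typesafe.config.ConfigFactory

object DedicatedDispatcherSketch {
  // A small fixed thread pool reserved for the blocking Kafka poll.
  private val config = ConfigFactory.parseString(
    """
    kafka-polling-dispatcher {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor { fixed-pool-size = 2 }
      throughput = 1
    }
    """).withFallback(ConfigFactory.load())

  // Placeholder for the polling actor from the answer above.
  class Worker extends Actor {
    def receive: Receive = { case _ => () }
  }

  def main(args: Array[String]): Unit = {
    val system = ActorSystem("kafka", config)
    // The worker blocks only its own pool, not the default dispatcher.
    system.actorOf(Props(new Worker).withDispatcher("kafka-polling-dispatcher"), "worker")
  }
}
```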
Adding to @Louis F.'s answer: depending on their configuration, actors either drop messages that arrive while they are busy or put them in a mailbox (a queue) to be processed later, usually in FIFO order. However, in this particular case you are flooding the actor with PollMessage, and you have no guarantee that your message will not be dropped, which appears to be what happens in your case.

Omitting all Scala Actor messages except the last

I want to omit all messages of the same type except the last one:
def receive = {
  case Message(type:MessageType, data:Int) =>
    // remove previous and get only last message of passed MessageType
}
For example, when I send:
actor ! Message(MessageType.RUN, 1)
actor ! Message(MessageType.RUN, 2)
actor ! Message(MessageType.FLY, 1)
then I want to receive only:
Message(MessageType.RUN, 2)
Message(MessageType.FLY, 1)
Of course, this only matters if they are sent very quickly or under high CPU load.
You could wait a very short amount of time, storing the most recent messages that arrive, and then process only those most recent ones. This can be accomplished by sending messages to yourself, and scheduleOnce. See the second example under the Akka HowTo: Common Patterns, Scheduling Periodic Messages. Instead of scheduling ticks whenever the last tick ends, you can wait until new messages arrive. Here's an example of something like that:
// Needs: import scala.concurrent.duration._ (for `1 millis`)
case class ProcessThis(msg: Message)
case object ProcessNow
var onHold = Map.empty[MessageType, Message]
var timer: Option[Cancellable] = None
def receive = {
  case msg @ Message(t, _) =>
    // Overwrites any earlier message of the same type.
    onHold += t -> msg
    if (timer.isEmpty) {
      import context.dispatcher
      timer = Some(context.system.scheduler.scheduleOnce(1 millis, self, ProcessNow))
    }
  case ProcessNow =>
    timer foreach { _.cancel() }
    timer = None
    for (m <- onHold.values) self ! ProcessThis(m)
    onHold = Map.empty
  case ProcessThis(Message(t, data)) =>
    // really process the message
}
Incoming Messages are not actually processed right away, but are stored in a Map that keeps only the last of each MessageType. On the ProcessNow tick message, they are really processed.
You can change the length of time you wait (in my example set to 1 millisecond) to strike a balance between responsiveness (the time from a message arriving to its response) and efficiency (CPU or other resources used or held up).
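The coalescing step itself, keeping only the last message of each type, can be tried in isolation from any actor. This is a minimal plain-Scala sketch; `Coalesce`, `Run`, `Fly` and `latestPerType` are hypothetical names, and the field is called `messageType` because `type` is a reserved word.

```scala
object Coalesce {
  sealed trait MessageType
  case object Run extends MessageType
  case object Fly extends MessageType
  final case class Message(messageType: MessageType, data: Int)

  // Folds a batch into a map keyed by type; a later message replaces
  // an earlier one of the same type, just like `onHold += t -> msg`.
  def latestPerType(batch: Seq[Message]): Map[MessageType, Message] =
    batch.foldLeft(Map.empty[MessageType, Message]) { (onHold, msg) =>
      onHold + (msg.messageType -> msg)
    }
}
```

For the three messages in the question, `latestPerType` keeps `Message(Run, 2)` and `Message(Fly, 1)`.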
type is not a good name for a field (it is a reserved word in Scala), so let's use messageType instead. This code should do what you want:
var lastMessage: Option[Message] = None
def receive = {
  case m: Message =>
    if (lastMessage.fold(false)(_.messageType != m.messageType)) {
      // do something with lastMessage.get
    }
    lastMessage = Some(m)
}

How should I handle blocking operations when using scala actors?

I started learning the scala actors framework about two days ago. To make the ideas concrete in my mind, I decided to implement a TCP based echo server that could handle multiple simultaneous connections.
Here is the code for the echo server (error handling not included):
class EchoServer extends Actor {
  private var connections = 0
  def act() {
    val serverSocket = new ServerSocket(6789)
    val echoServer = self
    actor { while (true) echoServer ! ("Connected", serverSocket.accept) }
    while (true) {
      receive {
        case ("Connected", connectionSocket: Socket) =>
          connections += 1
          (new ConnectionHandler(this, connectionSocket)).start
        case "Disconnected" =>
          connections -= 1
      }
    }
  }
}
Basically, the server is an Actor that handles the "Connected" and "Disconnected" messages. It delegates the connection listening to an anonymous actor that invokes the accept() method (a blocking operation) on the serverSocket. When a connection arrives it informs the server via the "Connected" message and passes it the socket to use for communication with the newly connected client. An instance of the ConnectionHandler class handles the actual communication with the client.
Here is the code for the connection handler (some error handling included):
class ConnectionHandler(server: EchoServer, connectionSocket: Socket)
    extends Actor {
  def act() {
    for (input <- getInputStream; output <- getOutputStream) {
      val handler = self
      actor {
        var continue = true
        while (continue) {
          try {
            val req = input.readLine
            if (req != null) handler ! ("Request", req)
            else continue = false
          } catch {
            case e: IOException => continue = false
          }
        }
        handler ! "Disconnected"
      }
      var connected = true
      while (connected) {
        receive {
          case ("Request", req: String) =>
            try {
              output.writeBytes(req + "\n")
            } catch {
              case e: IOException => connected = false
            }
          case "Disconnected" =>
            connected = false
        }
      }
    }
    close()
    server ! "Disconnected"
  }
  // code for getInputStream(), getOutputStream() and close() methods
}
The connection handler uses an anonymous actor that waits for requests to be sent to the socket by calling the readLine() method (a blocking operation) on the input stream of the socket. When a request is received, a "Request" message is sent to the handler, which then simply echoes the request back to the client. If the handler or the anonymous actor experiences problems with the underlying socket, the socket is closed and a "Disconnected" message is sent to the echo server, indicating that the client has been disconnected from the server.
So, I can fire up the echo server and let it wait for connections. Then I can open a new terminal and connect to the server via telnet. I can send it requests and it responds correctly. Now, if I open another terminal and connect to the server the server registers the connection but fails to start the connection handler for this new connection. When I send it messages via any of the existing connections I get no immediate response. Here's the interesting part. When I terminate all but one of the existing client connections and leave client X open, then all the responses to the request I sent via client X are returned. I've done some tests and concluded that the act() method is not being called on subsequent client connections even though I call the start() method on creating the connection handler.
I suppose I'm handling the blocking operations incorrectly in my connection handler. Since a previous connection is handled by a connection handler that has an anonymous actor blocked waiting for a request I'm thinking that this blocked actor is preventing the other actors (connection handlers) from starting up.
How should I handle blocking operations when using scala actors?
Any help would be greatly appreciated.
From the scaladoc for scala.actors.Actor:
Note: care must be taken when invoking thread-blocking methods other than those provided by the Actor trait or its companion object (such as receive). Blocking the underlying thread inside an actor may lead to starvation of other actors. This also applies to actors hogging their thread for a long time between invoking receive/react.
If actors use blocking operations (for example, methods for blocking I/O), there are several options:
The run-time system can be configured to use a larger thread pool size (for example, by setting the actors.corePoolSize JVM property).
The scheduler method of the Actor trait can be overridden to return a ResizableThreadPoolScheduler, which resizes its thread pool to avoid starvation caused by actors that invoke arbitrary blocking methods.
The actors.enableForkJoin JVM property can be set to false, in which case a ResizableThreadPoolScheduler is used by default to execute actors.