Exceeded configured max-open-requests - scala

Recently I started building a small web processing service using Akka Streams. It's quite simple: I pull URLs from Redis, download them (they are images), process the images, then push them to S3 and some JSON to Redis.
I'm downloading many different kinds of images from multiple sites, and I'm getting a whole bunch of errors like 404, Unexpected disconnect, Response Content-Length 17951202 exceeds the configured limit of 8388608, EntityStreamException: Entity stream truncation, and redirects. For redirects I invoke requestWithRedirects with the address found in the Location header of the response.
The part responsible for downloading looks roughly like this:
override lazy val http: HttpExt = Http()

def requestWithRedirects(request: HttpRequest, retries: Int = 10)(
    implicit akkaSystem: ActorSystem, materializer: FlowMaterializer): Future[HttpResponse] = {
  TimeoutFuture(timeout, msg = "Download timed out!") {
    http.singleRequest(request)
  }.flatMap { response =>
    handleResponse(request, response, retries)
  }.recoverWith {
    case e: Exception if retries > 0 =>
      requestWithRedirects(request, retries = retries - 1)
  }
}
TimeoutFuture is quite simple: it takes a future and a timeout, and if the future takes longer than the timeout it returns another future that fails with a timeout exception.
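For reference, a minimal sketch of such a helper, built on akka.pattern.after (the names TimeoutFuture, timeout and msg come from the question; the implementation is my assumption, not the original one):

import akka.actor.ActorSystem
import akka.pattern.after
import scala.concurrent.{ExecutionContext, Future, TimeoutException}
import scala.concurrent.duration.FiniteDuration

object TimeoutFuture {
  // Completes with the original future, or fails with a TimeoutException
  // after `timeout`, whichever happens first. Note that the losing future
  // is NOT cancelled: it keeps running (and holding resources) in the background.
  def apply[T](timeout: FiniteDuration, msg: String)(f: => Future[T])
              (implicit system: ActorSystem, ec: ExecutionContext): Future[T] =
    Future.firstCompletedOf(Seq(
      f,
      after(timeout, system.scheduler)(Future.failed(new TimeoutException(msg)))
    ))
}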
The problem I'm having is that after some time I get this error:
Message: RuntimeException: Exceeded configured max-open-requests value of [128]
  at akka.http.impl.engine.client.PoolInterfaceActor$$anonfun$receive$1.applyOrElse(PoolInterfaceActor.scala:109)
  at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
  at akka.http.impl.engine.client.PoolInterfaceActor.akka$stream$actor$ActorSubscriber$$super$aroundReceive(PoolInterfaceActor.scala:46)
  at akka.stream.actor.ActorSubscriber$class.aroundReceive(ActorSubscriber.scala:208)
  at akka.http.impl.engine.client.PoolInterfaceActor.akka$stream$actor$ActorPublisher$$super$aroundReceive(PoolInterfaceActor.scala:46)
  at akka.stream.actor.ActorPublisher$class.aroundReceive(ActorPublisher.scala:317)
  at akka.http.impl.engine.client.PoolInterfaceActor.aroundReceive(PoolInterfaceActor.scala:46)
  at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
  at akka.actor.ActorCell.invoke(ActorCell.scala:487)
  at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
  at akka.dispatch.Mailbox.run(Mailbox.scala:220)
  at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
I'm not sure what the problem could be, but I think some downloads were not finished properly and stay in some global pool of connections, causing the mentioned error after a while. Any ideas what could be causing the problem, or how to find its root? I already tested the 404 responses and the Response Content-Length exceeds... errors, and they don't seem to be my troublemakers.
EDIT:
Most likely the problem is with my TimeoutFuture. I'm completing it with an error as described here https://stackoverflow.com/a/29330010/2963977, but in my opinion the future that is actually downloading the image never completes and keeps taking up my connection pool resources.
I wonder why these settings don't have any impact in my case:
akka.http.client.connecting-timeout = 1 s
akka.http.client.idle-timeout = 1 s
akka.http.host-connection-pool.idle-timeout = 1 s
EDIT2:
Apparently timeouts are not supported yet. Here is my bug report
https://github.com/akka/akka/issues/17732#issuecomment-112315953
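Until pool-level timeouts are supported, one mitigation (my sketch, not from the original post; it assumes the TimeoutFuture helper above and the implicit ExecutionContext and materializer from the surrounding code) is to make sure every response entity is eventually drained, because a pooled connection is only freed once its entity is consumed:

import akka.stream.scaladsl.Sink
import scala.util.Failure

// Keep a handle on the real response future so its entity can be drained
// even after the caller has already given up because of the timeout.
val responseF = http.singleRequest(request)
val timedF = TimeoutFuture(timeout, msg = "Download timed out!")(responseF)

timedF.onComplete {
  case Failure(_) =>
    // The pool slot stays occupied until the entity is consumed,
    // so drain it whenever the response eventually shows up.
    responseF.foreach(resp => resp.entity.dataBytes.runWith(Sink.ignore))
  case _ => // the success path consumes the entity itself
}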

Related

Akka HTTP / Error Response entity was not subscribed after 1 second

I searched the other StackOverflow questions/answers about this error, but couldn't find a hint for solving this problem.
The Akka HTTP application runs for about 5 hours under high workload without problems, and then I start to get multiple:
Response entity was not subscribed after 1 second. Make sure to read the response `entity` body or call `entity.discardBytes()` on it -- in case you deal with `HttpResponse`, use the shortcut `response.discardEntityBytes()`. GET /api/name123 Empty -> 200 OK Default(142 bytes)
and later
The connection actor has terminated. Stopping now.
The actor only sends out API requests and then forwards the responses to another actor if successful; in case of failure, the request is added back to the todo stack and retried later. This is the main code:
private def makeApiRequest(id: String): Unit = {
  val url = UrlBuilder(id)
  val request = HttpRequest(method = HttpMethods.GET, uri = url)

  val f: Future[(StatusCode, String)] = Http(context.system)
    .singleRequest(request)
    .flatMap(_.toStrict(2.seconds))
    .flatMap { resp =>
      Unmarshal(resp.entity).to[String].map((resp.status, _))
    }

  context.pipeToSelf(f) {
    case Success(response) =>
      API_HandleResponseSuccess(id, response._1, response._2)
    case Failure(e) =>
      API_HandleResponseFailure(id, e.getMessage)
  }
}
I don't really understand why I get the "Response entity was not subscribed..." error, as I do Unmarshal(resp.entity).to[String] and would therefore think that no .discardEntityBytes() is needed. Or does it still need to be included somehow?
Side information: it's also confusing to me why the CPU performance doesn't stay constant.
Within the actor I track the response time of each request and regularly recalculate the number of parallel requests the given hardware can handle (capped at a maximum of 120) to account for API response-time fluctuations, so there should always be enough headroom for the actor to make its requests without starving. In addition, here is the relevant application.conf:
dispatcher-worker-io {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    fixed-pool-size = 120
    keep-alive-time = 60s
    allow-core-timeout = off
  }
  shutdown-timeout = 60s
  throughput = 1
}
...
akka.http.client.host-connection-pool.max-connections = 180
akka.http.client.host-connection-pool.max-open-requests = 256
akka.http.client.host-connection-pool.max-retries = 0
Any ideas why, after 5 hours without problems, I start to get those exceptions mentioned above?
or
Any idea which part of the code shared above might lead to this non-linear CPU performance?
I also made multiple of those hours-long runs, and it always ends like this; somehow it starves after 5 to 6 hours.
val AkkaVersion = "2.6.15"
val AkkaHttpVersion = "10.2.6"
Directly from the docs (https://doc.akka.io/docs/akka-http/current/client-side/request-level.html):
Always make sure you consume the response entity streams (of type Source[ByteString, Unit]). Connect the response entity Source to a Sink, or call response.discardEntityBytes() if you don't care about the response entity.
Read the Implications of the streaming nature of Request/Response Entities section for more details.
If the application doesn't subscribe to the response entity within akka.http.host-connection-pool.response-entity-subscription-timeout, the stream will fail with a TimeoutException: Response entity was not subscribed after ....
You need to call .discardEntityBytes() in case of failure. Right now you only consume the entity on success.
The high CPU load is perhaps caused by all these unfreed resources on the JVM, plus the retries of all the failures.
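For example, a minimal reshaping of the request pipeline from the question (a sketch, not a drop-in fix) that discards the entity whenever the strict conversion or the unmarshalling fails:

val f: Future[(StatusCode, String)] = Http(context.system)
  .singleRequest(request)
  .flatMap { resp =>
    resp.toStrict(2.seconds)
      .flatMap(strict => Unmarshal(strict.entity).to[String].map((strict.status, _)))
      .recoverWith { case e =>
        // Free the pooled connection; otherwise the entity
        // stays unsubscribed and the pool slot leaks.
        resp.discardEntityBytes()
        Future.failed(e)
      }
  }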

Akka Stream Exception Thrown When Downloading File From S3

I am trying to download a file from S3 using the following code:
wsClient
  .url(url)
  .withMethod("GET")
  .withHttpHeaders(my_headers: _*)
  .withRequestTimeout(timeout)
  .stream()
  .map {
    case AhcWSResponse(underlying) =>
      underlying.bodyAsBytes
  }
When I run this I get the following exception:
akka.stream.StreamLimitReachedException: limit of 13 reached
Is this because I am using bodyAsBytes? What does this error mean? I also see this warning message, which is probably related:
blockingToByteString is a blocking and unsafe operation!
This happens because when you use stream(), you need to consume the source using bodyAsSource. It is important to do so, as it would otherwise backpressure the connection. body and bodyAsBytes are implemented so that they do consume the source, but the implementors decided to let you know that you should have used execute() instead of stream() by limiting the body to 13 ByteStrings and a 50 ms timeout.
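A sketch of the streaming variant, assuming an implicit Materializer is in scope (the target path is hypothetical):

import java.nio.file.Paths
import akka.stream.scaladsl.FileIO

wsClient
  .url(url)
  .withMethod("GET")
  .withHttpHeaders(my_headers: _*)
  .withRequestTimeout(timeout)
  .stream()
  .flatMap { response =>
    // Consume the body as a stream and write it straight to disk,
    // instead of buffering the whole file in memory.
    response.bodyAsSource.runWith(FileIO.toPath(Paths.get("/tmp/download.bin")))
  }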
You are getting a StreamLimitReachedException because the number of incoming elements is larger than the configured maximum.
val MAX_ALLOWED_SIZE = 100

// OK. Future will fail with a `StreamLimitReachedException`
// if the number of incoming elements is larger than max
val limited: Future[Seq[String]] =
  mySource.limit(MAX_ALLOWED_SIZE).runWith(Sink.seq)

// OK. Collect up until max-th elements only, then cancel upstream
val ignoreOverflow: Future[Seq[String]] =
  mySource.take(MAX_ALLOWED_SIZE).runWith(Sink.seq)
You can find more information about the streaming process here.

Play ws scala server hangs up on request after 120seconds - which options to use?

I am pretty sure that this is a config problem, so I'll post my code and the relevant application.conf options of my Play app.
I have a Play server that needs to interact with another server "B" (basically a multi-file upload to B). The interaction happens inside an async Action which should result in an OK with B's response to the upload. This is the reduced code:
def authenticateAndUpload(url: String) = Action.async(parse.multipartFormData) { implicit request =>
  val form = authForm.bindFromRequest.get
  val (user, pass) = (form.user, form.pass)
  // The whole following interaction with the other server happens in a future,
  // i.e. login returns a Future[Option[WSCookie]] which is then used.
  login(user, pass, url).flatMap {
    case Some(cookie) =>
      // Use the cookie to upload the files and collect the result (the server responses).
      // This may take a few minutes and happens in yet another future,
      // which eventually produces the result.
      result.map(cc => Ok(s"The server under url $url responded with $cc"))
    case None =>
      Future.successful(Forbidden(s"Unable to log into $url, please go back and try again with other credentials."))
  }
}
I am pretty sure that the code itself works, since I can see my server log, which nicely prints B's responses every few seconds and proceeds until everything is correctly uploaded. The only problem is that the browser hangs up with a "server overloaded" message after 120s, which should be a Play default value, but for which config parameter?
I tried to get rid of it by setting every play.server.http.* timeout option I could get my hands on, and even added play.ws, Akka-specific, and other options which I am quite sure are not necessary; however, the problem remains. Here is my current application.conf part:
ws.timeout.idle="3600s"
ws.timeout.request ="3600s"
ws.timeout.response="3600s"
play.ws.timeout.idle="3600s"
play.ws.timeout.request="3600s"
play.ws.timeout.response="3600s"
play.server.http.connectionTimeout="3600s"
play.server.http.idleTimeout="3600s"
play.server.http.requestTimeout="3600s"
play.server.http.responseTimeout="3600s"
play.server.http.keepAlive="true"
akka.http.host-connection-pool.idle-timeout="3600s"
akka.http.host-connection-pool.client.idle-timeout= "3600s"
The browser hang-up happened in both Safari and Chrome, where Chrome additionally started a second communication with B after about 120 seconds. Both of these communications succeeded and produced the expected logs; only the browsers had hung up.
I am using Scala 2.12.2 with Play 2.6.2 in an SBT environment. The server is under development, pre-compiled but then started via run. I read that it may not pick up the application.conf options, but it did pick up some file-size customizations. Can someone tell me the correct config options, or my mistake in the run process?
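One thing worth checking (my guess, not a confirmed answer): when a Play app is started via run, the dev-mode server does not pick up play.server.* settings from application.conf; they have to be supplied as devSettings in build.sbt, along these lines (the value mirrors the application.conf above):

// build.sbt
PlayKeys.devSettings ++= Seq(
  "play.server.http.idleTimeout" -> "3600s"
)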

Spray.io log leaks sensitive information

I'm using Spray client to consume a third-party API. Unfortunately, the API I'm consuming is not very secure and uses an authentication method based on GET query parameters.
Sometimes we get timeouts or connection issues which we know how to deal with applicatively. The problem is that Spray logs these at the WARN log level, and the URL, including the sensitive query parameters, is written to our log files.
Here's an example from the log file:
2015-05-19 12:23:17,024 WARN HttpHostConnectionSlot - Connection attempt to 10.10.10.10:443 failed in response to GET request to /api/?type=keygen&user=test_user&password=S3kret! with 2 retries left, retrying...
2015-05-19 12:23:17,084 WARN HttpHostConnectionSlot - Connection attempt to 10.10.10.10:443 failed in response to GET request to /api/?type=keygen&user=test_user&password=S3kret! with 1 retries left, retrying...
Is there any way to filter this? (Maybe in Akka?)
Spray reuses Akka's logging for all of its logging groundwork.
In Akka you can declare a custom event logger in the application config:
akka {
  # event-handlers = ["akka.event.Logging$DefaultLogger"] // default one
  event-handlers = ["com.example.PrivacyLogger"] // custom one
  # Options: ERROR, WARNING, INFO, DEBUG
  loglevel = "DEBUG"
}
It may look like this:
class PrivacyLogger extends DefaultLogger {
  override def receive: Receive = {
    case InitializeLogger(_) => sender() ! LoggerInitialized
    case event: LogEvent     => print(stripSecret(event))
  }

  private def stripSecret(event: LogEvent) = ...
}
But you can always implement your own message-processing logic here instead of simply printing.
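For illustration, stripSecret could look roughly like this, assuming password is the only sensitive query parameter (a sketch, not the original implementation):

import akka.event.Logging
import akka.event.Logging.LogEvent

private def stripSecret(event: LogEvent): LogEvent = event match {
  case w: Logging.Warning =>
    // Mask the password value before the event reaches the log output.
    Logging.Warning(w.logSource, w.logClass,
      w.message.toString.replaceAll("password=[^&\\s]+", "password=***"))
  case other => other
}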
PS. If you use slf4j for logging, the solution will look mostly the same, with some minor differences such as overriding akka.event.slf4j.Slf4jEventHandler instead of DefaultLogger.

Exception caught in RequestBodyHandler

Below is the code that runs when a user uploads a video from the mobile application to S3:
def uploadVideo = Action(parse.multipartFormData) { implicit request =>
  try {
    var height = 0
    var width = 0
    request.body.files.map { mov =>
      var videoName = System.currentTimeMillis() + ".mpeg"
      amazonS3Client.putObject(bucketVideos, videoName, mov.ref.file)
    }
    val map = Map("result" -> "success")
    Ok(write(map))
  } catch {
    case e: Exception =>
      Ok(write(Map("result" -> "error")))
  }
}
The above code works fine, but if the user cancels while the video is uploading, the following error occurs:
[error] play - Exception caught in RequestBodyHandler
java.nio.channels.ClosedChannelException: null
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:434) ~[netty.jar:na]
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:129) ~[netty.jar:na]
at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:99) ~[netty.jar:na]
at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:36) ~[netty.jar:na]
at org.jboss.netty.channel.Channels.write(Channels.java:725) ~[netty.jar:na]
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.doEncode(OneToOneEncoder.java:71) ~[netty.jar:na]
And this doesn't go to the catch block!
1. Can this be harmful to the server or not? (No response is needed if an error occurs.)
2. If yes, how do I handle it?
This is all happening in Play's internals that handle parsing the body of the request. In fact, during the upload to your server, you haven't even reached the try block yet, because the file hasn't finished uploading. Only once the upload is complete do you have the TemporaryFile available.
So no, you can't catch this error, and why would you want to? The user closed the connection. They're not even waiting for a response, so why send one? Let Play handle it.
This is also not a good way of handling an upload, though. For small files it's passable, but if someone is proxying a huge video upload through your server to S3, it's going to:
1. Take almost twice as long to serve the response (which will cause the user to hang while you upload to S3).
2. Block one of Play's request-handling threads for the entire time that file is uploading to S3, and given enough of these uploads (not many at all), you will no longer be able to process requests until an upload has completed.
Consider at least creating a separate ExecutionContext to use for handling uploads, or even better, look into having the user upload directly to S3 via a signed form, which would remove the need to proxy the upload at all.
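For instance, a sketch of the dedicated-context option (the dispatcher name upload-context and its configuration are my own; amazonS3Client, bucketVideos and write come from the question):

// In application.conf (sketch):
//   upload-context {
//     fork-join-executor {
//       parallelism-min = 4
//       parallelism-max = 16
//     }
//   }

import play.api.libs.concurrent.Akka
import play.api.Play.current
import scala.concurrent.Future

// Run the blocking S3 uploads on their own dispatcher so they
// cannot starve Play's default thread pool.
implicit val uploadContext = Akka.system.dispatchers.lookup("upload-context")

def uploadVideo = Action.async(parse.multipartFormData) { implicit request =>
  Future {
    request.body.files.foreach { mov =>
      val videoName = System.currentTimeMillis() + ".mpeg"
      amazonS3Client.putObject(bucketVideos, videoName, mov.ref.file)
    }
    Ok(write(Map("result" -> "success")))
  }
}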