akka-http chunked response concatenation - scala

I'm using akka-http to make a request to a http service which sends back chunked response. This is how the relevant bit of code looks like:
val httpRequest: HttpRequest = //build the request
val request = Http().singleRequest(httpRequest)
request.flatMap { response =>
response.entity.dataBytes.runForeach { chunk =>
println("-----")
println(chunk.utf8String)
}
}
and the output produced in the command line looks something like this:
-----
{"data":
-----
"some text"}
-----
{"data":
-----
"this is a longer
-----
text"}
-----
{"data": "txt"}
-----
...
The logical piece of data - a json in this case ends with an end of line symbol \r\n, but the problem is, that the json doesn't always fit in a single http response chunk as clearly visible in the example above.
My question is - how do I concatenate the incoming chunked data into full jsons so that the resulting container type would still remain either Source[Out,M1] or Flow[In,Out,M2]? I'd like to follow the idealogy of akka-stream.
UPDATE: It's worth mentioning also, that the response is endless and the aggregation must be done in real time

Found a solution:
val request: HttpRequest = //build the request
request.flatMap { response =>
response.entity.dataBytes.scan("")((acc, curr) => if (acc.contains("\r\n")) curr.utf8String else acc + curr.utf8String)
.filter(_.contains("\r\n"))
.runForeach { json =>
println("-----")
println(json)
}
}

The akka stream documentation has an entry in the cookbook for this very problem: "Parsing lines from a stream of ByteString". Their solution is quite verbose but can also handle the situation where a single chunk can contain multiple lines. This seems more robust since the chunk size could change to be big enough to handle multiple json messages.

response.entity.dataBytes
.via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 8096))
.mapAsyncUnordered(Runtime.getRuntime.availableProcessors()) { data =>
if (response.status == OK) {
val event: Future[Event] = Unmarshal(data).to[Event]
event.foreach(x => log.debug("Received event: {}.", x))
event.map(Right(_))
} else {
Future.successful(data.utf8String)
.map(Left(_))
}
}
The only requirement is you know the maximum size of one record. If you start with something small, the default behavior is to fail if the record is larger than the limit. You can set it to truncate instead of failing, but piece of a JSON makes no sense.

Related

Akka-HTTP return response in chunks

I am using Akka-HTTP to return a response in form of String [30 - 40 MB].
When I deploy my Akka-HTTP server and make a request to fetch the data from some URI then every time it gets stuck after several MB's and stop fetching the complete response.
Is there any way that I return my whole large response without getting stuck in between.
HttpResponse(StatusCodes.OK, entity = myLargeResponseAsString)
Thanks
Maybe this could help:
path("yourpath") {
get {
complete {
val str2 = scala.io.Source.fromFile("/tmp/t.log", "UTF8").mkString
val str = Source.single(ByteString(str2))
HttpResponse(entity = HttpEntity.Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, str))
}
}

Chain Akka-http-client requests in a Stream

I would like to chain http request using akka-http-client as Stream. Each http request in a chain depends on a success/response of a previous requests and uses it to construct a new request. If a request is not successful, the Stream should return the response of the unsuccessful request.
How can I construct such a stream in akka-http?
which akka-http client level API should I use?
If you're making a web crawler, have a look at this post. This answer tackles a more simple case, such as downloading paginated resources, where the link to the next page is in a header of the current page response.
You can create a chained source - where one item leads to the next - using the Source.unfoldAsync method. This takes a function which takes an element S and returns Future[Option[(S, E)]] to determine if the stream should continue emitting elements of type E, passing the state to the next invocation.
In your case, this is kind of like:
taking an initial HttpRequest
producing a Future[HttpResponse]
if the response points to another URL, returning Some(request -> response), otherwise None
However, there's a wrinkle, which is that this will not emit a response from the stream if it doesn't contain a pointer to the next request.
To get around this, you can make the function passed to unfoldAsync return Future[Option[(Option[HttpRequest], HttpResponse)]]. This allows you to handle the following situations:
the current response is an error
the current response points to another request
the current response doesn't point to another request
What follows is some annotated code which outlines this approach, but first a preliminary:
When streaming HTTP requests to responses in Akka streams, you need to ensure that the response body is consumed otherwise bad things will happen (deadlocks and the like.) If you don't need the body you can ignore it, but here we use a function to convert the HttpEntity from a (potential) stream into a strict entity:
import scala.concurrent.duration._
def convertToStrict(r: HttpResponse): Future[HttpResponse] =
r.entity.toStrict(10.minutes).map(e => r.withEntity(e))
Next, a couple of functions to create an Option[HttpRequest] from an HttpResponse. This example uses a scheme like Github's pagination links, where the Links header contains, e.g: <https://api.github.com/...> rel="next":
def nextUri(r: HttpResponse): Seq[Uri] = for {
linkHeader <- r.header[Link].toSeq
value <- linkHeader.values
params <- value.params if params.key == "rel" && params.value() == "next"
} yield value.uri
def getNextRequest(r: HttpResponse): Option[HttpRequest] =
nextUri(r).headOption.map(next => HttpRequest(HttpMethods.GET, next))
Next, the real function we'll pass to unfoldAsync. It uses the Akka HTTP Http().singleRequest() API to take an HttpRequest and produce a Future[HttpResponse]:
def chainRequests(reqOption: Option[HttpRequest]): Future[Option[(Option[HttpRequest], HttpResponse)]] =
reqOption match {
case Some(req) => Http().singleRequest(req).flatMap { response =>
// handle the error case. Here we just return the errored response
// with no next item.
if (response.status.isFailure()) Future.successful(Some(None -> response))
// Otherwise, convert the response to a strict response by
// taking up the body and looking for a next request.
else convertToStrict(response).map { strictResponse =>
getNextRequest(strictResponse) match {
// If we have no next request, return Some containing an
// empty state, but the current value
case None => Some(None -> strictResponse)
// Otherwise, pass on the request...
case next => Some(next -> strictResponse)
}
}
}
// Finally, there's no next request, end the stream by
// returning none as the state.
case None => Future.successful(None)
}
Note that if we get an errored response, the stream will not continue since we return None in the next state.
You can invoke this to get a stream of HttpResponse objects like so:
val initialRequest = HttpRequest(HttpMethods.GET, "http://www.my-url.com")
Source.unfoldAsync[Option[HttpRequest], HttpResponse](
Some(initialRequest)(chainRequests)
As for returning the value of the last (or errored) response, you simply need to use Sink.last, since the stream will end either when it completes successfully or on the first errored response. For example:
def getStatus: Future[StatusCode] = Source.unfoldAsync[Option[HttpRequest], HttpResponse](
Some(initialRequest))(chainRequests)
.map(_.status)
.runWith(Sink.last)

Consuming a service using WS in Play

I was hoping someone can briefly go over the various ways of consuming a service (this one just returns a string, normally it would be JSON but I just want to understand the concepts here).
My service:
def ping = Action {
Ok("pong")
}
Now in my Play (2.3.x) application, I want to call my client and display the response.
When working with Futures, I want to display the value.
I am a bit confused, what are all the ways I could call this method i.e. there are some ways I see that use Success/Failure,
val futureResponse: Future[String] = WS.url(url + "/ping").get().map { response =>
response.body
}
var resp = ""
futureResponse.onComplete {
case Success(str) => {
Logger.trace(s"future success $str")
resp = str
}
case Failure(ex) => {
Logger.trace(s"future failed")
resp = ex.toString
}
}
Ok(resp)
I can see the trace in STDOUT for success/failure, but my controller action just returns "" to my browser.
I understand that this is because it returns a FUTURE and my action finishes before the future returns.
How can I force it to wait?
What options do I have with error handling?
If you really want to block until feature is completed look at the Future.ready() and Future.result() methods. But you shouldn't.
The point about Future is that you can tell it how to use the result once it arrived, and then go on, no blocks required.
Future can be the result of an Action, in this case framework takes care of it:
def index = Action.async {
WS.url(url + "/ping").get()
.map(response => Ok("Got result: " + response.body))
}
Look at the documentation, it describes the topic very well.
As for the error-handling, you can use Future.recover() method. You should tell it what to return in case of error and it gives you new Future that you should return from your action.
def index = Action.async {
WS.url(url + "/ping").get()
.map(response => Ok("Got result: " + response.body))
.recover{ case e: Exception => InternalServerError(e.getMessage) }
}
So the basic way you consume service is to get result Future, transform it in the way you want by using monadic methods(the methods that return new transformed Future, like map, recover, etc..) and return it as a result of an Action.
You may want to look at Play 2.2 -Scala - How to chain Futures in Controller Action and Dealing with failed futures questions.

How to get InputStream from request in Play

Ithink this used to be possible in Play 1.x, but I can't find how to do it in Play 2.x
I know that Play is asynchronous and uses Iteratees. However, there is generally much better support for InputStreams.
(In this case, I will be using a streaming JSON parser like Jackson to process the request body.)
How can I get an InputStream from a chunked request body?
I was able to get this working with the following code:
// I think all of the parens and braces line up -- copied/pasted from code
val pos = new PipedOutputStream()
val pis = new PipedInputStream(pos)
val result = Promise[Either[Errors, String]]()
def outputStreamBodyParser = {
BodyParser("outputStream") {
requestHeader =>
val length = requestHeader.headers(HeaderNames.CONTENT_LENGTH).toLong
Future {
result.completeWith(saveFile(pis, length)) // save returns Future[Either[Errors, String]]
}
Iteratee.fold[Array[Byte], OutputStream](pos) {
(os, data) =>
os.write(data)
os
}.map {
os =>
os.close()
Right(os)
}
}
}
Action.async(parse.when(
requestHeaders => {
val maybeContentLength = requestHeaders.headers.get(HeaderNames.CONTENT_LENGTH)
maybeContentLength.isDefined && allCatch.opt(maybeContentLength.get.toLong).isDefined
},
outputStreamBodyParser,
requestHeaders => Future.successful(BadRequest("Missing content-length header")))) {
request =>
result.future.map {
case Right(fileRef) => Ok(fileRef)
case Left(errors) => BadRequest(errors)
}
}
Play 2 is meant to be fully asynchronous, so this isn't easily possible or desirable. The problem with InputStream is there is no push back, there is no way for the reader of the InputStream to communicate to the input that it wants more data without blocking on read. Technically it is possible to write an Iteratee that could read data and put it into an InputStream, and would wait for a call to read on the InputStream before asking the Enumerator for more data, but it would be dangerous. You would have to make sure that the InputStream was closed properly, or the Enumerator would sit waiting forever (or until it times out) and the call to read must be made from a thread that is not running on the same ExecutionContext as the Enumerator and Iteratee or the application could deadlock.

Play Framework WebSocket Async

I'm using a WebSocket end point exposed by my Play Framework controller. My client will however send a large byte array and I'm a bit confused on how to handle this in my Iteratee. Here is what I have:
def myWSEndPoint(f: String => String) = WebSocket.async[Array[Byte]] {
request =>
Akka.future {
val (out, chan) = Concurrent.broadcast[Array[Byte]]
val in: Iteratee[Array[Byte], Unit] = Iteratee.foreach[Array[Byte]] {
// How do I get the entire file?
}
(null, null)
}
}
As it can be seen in the code above, I'm stuck on the line on how to handle the Byte array as one request and send the response back as a String? My confusion is on the Iteratee.foreach call. Is this foreach a foreach on the byte array or the entire content of the request that I send as a byte array from my client? It is confusing!
Any suggestions?
Well... It depends. Is your client sending all binaries at once, or is it (explicitly) chunk by chunk?
-> If it's all at once, then everything will be in the first chunk (therefore why a websocket? Why an Iteratee? Actions with BodyParser will probably be more efficient for that).
-> If it's chunk by chunk you have to keep every chunks you receive, and concatenate them on close (on close, unless you have another way for the client to say: "Hey I'm done!").