Forwarding (Downloading/Uploading) Large File via Akka HTTP / Akka Streams - scala

I have a service that takes an HttpRequest from a client to get a file from another server via REST and then forward the file to the client as an HttpResponse.
Don't ask me why the client doesn't ask for the file him/herself because that is a long story.
I put together a strategy that downloads the file to the file system and then sends it to the client. It uses extracts from other Stack Overflow answers by @RamonJRomeroyVigil.
def downloadFile(request: HttpRequest, fileName: String): Future[IOResult] = {
Http().singleRequest(request).flatMap { response =>
val source = response.entity.dataBytes
source.runWith(FileIO.toPath(java.nio.file.Paths.get(fileName)))
}
}
def buildResponse(fileName: String): HttpResponse = {
val bufferedSrc = scala.io.Source.fromFile(fileName)
val source = Source
.fromIterator(() => bufferedSrc.getLines())
.map(ChunkStreamPart.apply)
HttpResponse(entity = HttpEntity.Chunked(ContentTypes.`application/octet-stream`, source))
}
However, I would like to do this in one step, without saving the file to the file system, and take advantage of the streaming abilities.
I would also like to limit the number of requests the client can serve at the same time to 5.
Thanks

As you are already getting the file as a stream from the second server, you can forward it directly to the client. You only need to build your HttpResponse on the fly :
def downloadFile(request: HttpRequest) : Future[HttpResponse] = {
Http().singleRequest(request).map {
case okResponse @ HttpResponse(StatusCodes.OK, _, _, _) =>
HttpResponse(
entity = HttpEntity.Chunked(ContentTypes.`application/octet-stream`,
okResponse
.entity
.dataBytes
.map(ChunkStreamPart.apply)
))
case nokResponse @ HttpResponse(_, _, _, _) =>
nokResponse
}
}
To change the maximum number of concurrent requests allowed for the client, set akka.http.host-connection-pool.max-connections and akka.http.host-connection-pool.max-open-requests (the host-connection-pool section sits directly under akka.http, not under akka.http.client). More details can be found here.
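The same two limits can also be set in code rather than in configuration. The following is a minimal sketch, assuming Akka HTTP 10.2 or later (where an implicit ActorSystem is enough to call singleRequest); the object name LimitedClient and the method limitedRequest are hypothetical, and the value 8 is an illustrative choice because max-open-requests has to be a power of 2.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, HttpResponse}
import akka.http.scaladsl.settings.ConnectionPoolSettings
import scala.concurrent.Future

object LimitedClient {
  implicit val system: ActorSystem = ActorSystem("limited-client")

  // Start from the configured defaults and override the two limits.
  // max-open-requests must be a power of 2, so 8 is used here as the
  // smallest such value above 5 (an assumption, adjust as needed).
  val poolSettings: ConnectionPoolSettings =
    ConnectionPoolSettings(system)
      .withMaxConnections(5)
      .withMaxOpenRequests(8)

  // Every request sent through this method shares the limited pool
  // for its target host.
  def limitedRequest(request: HttpRequest): Future[HttpResponse] =
    Http().singleRequest(request, settings = poolSettings)
}
```

Note that these settings cap outgoing connections per target host; they do not limit how many incoming requests the forwarding service accepts.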

Related

Why does Source.tick stop after one hundred HttpRequests?

Using akka stream and akka HTTP, I have created a stream which polls an api every 3 seconds, Unmarshalls the result to a JsValue object and sends this result to an actor. As can be seen in the following code:
// Source which performs an HTTP request every 3 seconds.
val source = Source.tick(0.seconds,
3.seconds,
HttpRequest(uri = Uri(path = Path("/posts/1"))))
// Processes the result of the http request
val flow = Http().outgoingConnectionHttps("jsonplaceholder.typicode.com").mapAsync(1) {
// Able to reach the API.
case HttpResponse(StatusCodes.OK, _, entity, _) =>
// Unmarshal the json response.
Unmarshal(entity).to[JsValue]
// Failed to reach the API.
case HttpResponse(code, _, entity, _) =>
entity.discardBytes()
Future.successful(code.toString())
}
// Run stream
source.via(flow).runWith(Sink.actorRef[Any](processJsonActor,akka.actor.Status.Success(("Completed stream"))))
This works, however the stream closes after 100 HttpRequests (ticks).
What is the cause of this behaviour?
Definitely something to do with outgoingConnectionHttps. This is a low level DSL and there could be some misconfigured setting somewhere which is causing this (although I couldn't figure out which one).
Usage of this DSL is actually discouraged by the docs.
Try using a higher-level DSL such as the cached host connection pool:
val flow = Http().cachedHostConnectionPoolHttps[NotUsed]("akka.io").mapAsync(1) {
// Able to reach the API.
case (Success(HttpResponse(StatusCodes.OK, _, entity, _)), _) =>
// Unmarshal the json response.
Unmarshal(entity).to[String]
// Failed to reach the API.
case (Success(HttpResponse(code, _, entity, _)), _) =>
entity.discardBytes()
Future.successful(code.toString())
case (Failure(e), _) ⇒
throw e
}
// Run stream
source.map(_ → NotUsed).via(flow).runWith(...)
A potential issue is that there is no backpressure signal with Sink.actorRef, so the actor's mailbox could be getting full. If the actor, whenever it receives a JsValue object, is doing something that could take a long time, use Sink.actorRefWithAck instead. For example:
val initMessage = "start"
val completeMessage = "done"
val ackMessage = "ack"
source
.via(flow)
.runWith(Sink.actorRefWithAck[Any](
processJsonActor, initMessage, ackMessage, completeMessage))
You would need to change the actor to handle an initMessage and reply to the stream for every stream element with an ackMessage (with sender ! ackMessage). More information on Sink.actorRefWithAck is found here.
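A minimal sketch of such an actor, assuming classic (untyped) Akka actors and the "start", "ack" and "done" message values used above; the class name ProcessJsonActor and the logging calls are placeholders for the real processing:

```scala
import akka.actor.{Actor, ActorLogging, Status}

// Hypothetical receiving actor for Sink.actorRefWithAck.
class ProcessJsonActor extends Actor with ActorLogging {
  private val ack = "ack"

  def receive: Receive = {
    case "start" =>
      // Stream initialised: acknowledge so the first element is sent.
      sender() ! ack
    case "done" =>
      // Stream completed normally; no acknowledgement is expected.
      log.info("Stream completed")
    case Status.Failure(cause) =>
      // Sent by the sink when the stream fails.
      log.error(cause, "Stream failed")
    case element =>
      // Placeholder for the real (possibly slow) processing.
      log.info("Received {}", element)
      // Acknowledge only after processing, which signals demand
      // for the next element and so provides backpressure.
      sender() ! ack
  }
}
```

Because the acknowledgement is sent only after an element has been handled, the stream never has more than one unprocessed element waiting in the actor's mailbox.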

How to get error from akka-stream propagate to akka-http to both be logged and notify the client properly?

Right now I am using akka-stream and akka-HTTP to build a file streaming API. As such I am injecting a streaming source into an entity to have data streamed directly to the HTTP client like so:
complete(HttpEntity(ContentTypes.`application/octet-stream`, source))
However, if for some reason the stream fails, the connection gets closed by akka-http without further explanation or logging.
I would need 2 things:
How can I get the exception logs?
How can I notify my client with a message before closing the connection?
Thank you
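As mentioned in the comments, the HTTP protocol gives you no way to signal an error to the client once the response has started streaming: the status line and headers have already been sent, so all the server can do is close the connection.
As for logging:
For me it boils down to Akka HTTP lacking a proper access-log directive.
In my current project we have a decorator that registers an onComplete handler on the HTTP entity before handing it to Akka HTTP for rendering.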
private def onResponseStreamEnd(response: HttpResponse)(action: StatusCode => Unit): HttpResponse =
if (!response.status.allowsEntity() || response.entity.isKnownEmpty()) {
action(response.status)
response
} else {
val dataBytes =
onStreamEnd(response.entity) { result =>
val overallStatusCode =
result match {
case Success(_) =>
response.status
case Failure(e) =>
logger.error(e, s"error streaming response [${e.getMessage}]")
StatusCodes.InternalServerError
}
action(overallStatusCode)
}
response.withEntity(response.entity.contentLengthOption match {
case Some(length) => HttpEntity(response.entity.contentType, length, dataBytes)
case None => HttpEntity(response.entity.contentType, dataBytes)
})
}
private def onStreamEnd(entity: HttpEntity)(onComplete: Try[Done] ⇒ Unit): Source[ByteString, _] =
entity.dataBytes.alsoTo { Sink.onComplete(onComplete) }
Usage:
complete(onResponseStreamEnd(HttpResponse(StatusCodes.OK, HttpEntity(ContentTypes.`application/octet-stream`, source))){ statusCode => .... })
A similar approach that uses a custom graph stage can be found here.

Download media file from twilio, using the media URI

I have been having issues with downloading media from the media uri provided on the mms messages.
val url = "https://api.twilio.com/2010-04-01/Accounts/xx/Messages/xx/Media/xx"
the media url provided is in the above structure,
new URL(url) #> new File("file.png") !! //this fails, due to multiple redirects
When I open the URI in browser the redirect ends up in
http://media.twiliocdn.com.s3-external-1.amazonaws.com/xx/xx
1st URL -> 2nd URL -> the URL above; so, all in all, 2 redirects.
If I try the snippet posted above with the final URL, it works. I am sure the snippet failed in the first place because of the multiple redirects.
I have been using Play Framework with Scala. Can I get a source example for downloading the file? Any help or pointers are appreciated. I have tried various examples but still could not solve the issue.
Some findings =>
Accessing Twilio MMS images
anything similar for scala?
Update: @millhouse
def fileDownloader(urls: String, location: String) = {
import play.api.Play.current
import scala.concurrent.ExecutionContext.Implicits.global
// Make the request
val futureResponse: Future[(WSResponseHeaders, Enumerator[Array[Byte]])] =
WS.url(urls).withFollowRedirects(true).getStream()
futureResponse.flatMap {
case (headers, body) =>
val file = new File(location)
val outputStream = new FileOutputStream(file)
// The iteratee that writes to the output stream
val iteratee = Iteratee.foreach[Array[Byte]] { bytes =>
outputStream.write(bytes)
}
// Feed the body into the iteratee
(body |>>> iteratee).andThen {
case result =>
// Close the output stream whether there was an error or not
outputStream.close()
// Get the result or rethrow the error
result.get
}.map(_ => file)
}
}
This is the approach I had been using until now (it works), as explained in the Play docs. But I needed a sync approach, meaning I need to carry out another step once the file has downloaded successfully. Sorry for not clarifying that up front.
Update 2 : Solved in this manner,
def fileDownloader(urls: String, location: String) = {
import play.api.Play.current
import scala.concurrent.ExecutionContext.Implicits.global
// Make the request
val futureResponse: Future[(WSResponseHeaders, Enumerator[Array[Byte]])] =
WS.url(urls).withFollowRedirects(true).getStream()
val downloadedFile: Future[File] = futureResponse.flatMap {
case (headers, body) =>
val file = new File(location)
val outputStream = new FileOutputStream(file)
// The iteratee that writes to the output stream
val iteratee = Iteratee.foreach[Array[Byte]] { bytes =>
outputStream.write(bytes)
}
// Feed the body into the iteratee
(body |>>> iteratee).andThen {
case result =>
// Close the output stream whether there was an error or not
outputStream.close()
// Get the result or rethrow the error
result.get
}.map(_ => file)
}
downloadedFile.map{ fileIn =>
//things needed to do
}
}
Thanks,
I haven't used the Twilio MMS API but it should be very straightforward to get the Play Framework HTTP client to follow redirects, using the documented option to the client:
val url = "https://api.twilio.com/2010-04-01/Accounts/xx/Messages/xx/Media/xx"
ws.url(url).withFollowRedirects(true).get().map { response =>
val theBytes:Array[Byte] = response.bodyAsBytes // Play 2.4 and lower
// ... save it
}
Note that the above code works for Play 2.4.x and lower; the bodyAsBytes method of WSResponse returns an Array[Byte]. If you're on the current cutting-edge and using Play 2.5.x, bodyAsBytes gives you an Akka ByteString with lots of nice functional methods, but you probably just want to call toArray on it if all you want is to store the data:
ws.url(url).withFollowRedirects(true).get().map { response =>
val theBytes:Array[Byte] = response.bodyAsBytes.toArray // Play 2.5
// ... save it
}
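The "save it" step in both variants can be done with java.nio. A minimal sketch, in which the object name SaveMedia and the method saveBytes are hypothetical helpers and not part of Play:

```scala
import java.nio.file.{Files, Path, Paths}

object SaveMedia {
  // Writes the whole byte array to the target path, creating the file
  // or overwriting an existing one, and returns the path written to.
  def saveBytes(bytes: Array[Byte], target: Path): Path =
    Files.write(target, bytes)
}

// Usage inside the map block, with an assumed file name:
//   SaveMedia.saveBytes(theBytes, Paths.get("file.png"))
```

Since bodyAsBytes holds the whole response in memory, this suits media files of modest size; for very large downloads the streaming approach shown in the question is probably the safer choice.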

With play and akka-https: How to properly chain multiple requests to the incoming request to create a response?

So I tried to get a small play app communicating with another rest service.
The idea is, to receive a request on the play side and then do a request to the rest api and feed parts of the result to another local actor before displaying the response from the local actor and the rest service in the browser.
This image shows how
And I tried to do it with streams. I got it all working, but I am absolutely not happy with the part where I talk to my local actor and create a Future[(Future[String], Future[String])] tuple, so I would be happy if you could point me towards an elegant and clean way to do this.
So here is my code. The input is a csv file.
My local actor creates an additional graphic I want to put into the response.
def upload = Action.async(parse.multipartFormData) { request =>
request.body.file("input").map { inputCsv =>
//csv to list of strings
val inputList: List[String] = convertFileToList(inputCsv)
//http request to rest service
val responseFuture: Future[HttpResponse] = httpRequest(inputList, "/path",4321 ,"0.0.0.0")
//pattern match response and ask local actor
val formattedResult = responseFuture.flatMap { response =>
response.status match {
case akka.http.scaladsl.model.StatusCodes.OK =>
val resultTeams = Unmarshal(response.entity).to[CustomResultCaseClass]
//the part I'd like to improve
val tupleFuture = resultTeams.map(result =>
(Future(result.teams.reduce(_ + "," + _)),
plotter.ask(PlotData(result.eval)).mapTo[ChartPath].flatMap(plotAnswer => Future(plotAnswer.path))))
tupleFuture.map(tuple => tuple._1.map(teams =>
tuple._2.map(chartPath => Ok(views.html.upload(teams))(chartPath)))).flatMap(a => a).flatMap(b => b)
}
}
formattedResult
}.getOrElse(Future(play.api.mvc.Results.BadRequest))
}
For comprehensions are useful for this type of use case. A basic example which demonstrates the refactoring involved:
val teamFut = Future(result.teams.reduce(_ + "," + _))
// I think the final .flatMap(p => Future(p.path)) is unnecessary; it should be
// .map(_.path), but I wanted to replicate the functionality of the question's code
val pathFut = plotter.ask(PlotData(result.eval))
.mapTo[ChartPath]
.flatMap(plotAnswer => Future(plotAnswer.path))
val okFut =
for {
teams <- teamFut
chartPath <- pathFut
} yield Ok(views.html.upload(teams))(chartPath)
Note: the initial Futures should be instantiated outside of the for comprehension, otherwise they will not execute in parallel.
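The difference can be seen in a small standalone sketch that uses only the standard library. The names ParallelFutures, fetchTeams and fetchChartPath are hypothetical stand-ins for the two asynchronous calls in the question, and the sleeps simulate slow work:

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object ParallelFutures {
  // Hypothetical stand-ins for the two asynchronous calls.
  def fetchTeams(): Future[String] =
    Future { Thread.sleep(200); "a,b" }
  def fetchChartPath(): Future[String] =
    Future { Thread.sleep(200); "/tmp/chart.png" }

  // Both Futures are created (and therefore started) before the
  // for comprehension, so they run concurrently.
  def parallel(): Future[(String, String)] = {
    val teamsFut = fetchTeams()
    val pathFut = fetchChartPath()
    for {
      teams <- teamsFut
      path <- pathFut
    } yield (teams, path)
  }

  // The second Future is only created inside the flatMap of the
  // first, so it starts after the first one has completed.
  def sequential(): Future[(String, String)] =
    for {
      teams <- fetchTeams()
      path <- fetchChartPath()
    } yield (teams, path)
}
```

Both methods yield the same value; only the elapsed time differs, with the sequential version taking roughly the sum of the two delays.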

Akka Streams with Akka HTTP Server and Client

I'm trying to create an endpoint on my Akka HTTP server which tells users their IP address using an external service (I know this can be done far more easily, but I'm doing this as a challenge).
The code that doesn't make use of streams on the uppermost layer is this:
implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()
val requestHandler: HttpRequest => Future[HttpResponse] = {
case HttpRequest(GET, Uri.Path("/"), _, _, _) =>
Http().singleRequest(HttpRequest(GET, Uri("http://checkip.amazonaws.com/"))).flatMap { response =>
response.entity.dataBytes.runFold(ByteString(""))(_ ++ _) map { string =>
HttpResponse(entity = HttpEntity(MediaTypes.`text/html`,
"<html><body><h1>" + string.utf8String + "</h1></body></html>"))
}
}
case _: HttpRequest =>
Future(HttpResponse(404, entity = "Unknown resource!"))
}
Http().bindAndHandleAsync(requestHandler, "localhost", 8080)
and it is working fine. However, as a challenge, I wanted to limit myself to only using streams (no Futures).
This is the layout I thought I'd use for this kind of an approach:
Source[Request] -> Flow[Request, Request] -> Flow[Request, Response] ->Flow[Response, Response] and to accommodate the 404 route, also Source[Request] -> Flow[Request, Response]. Now, if my Akka Stream knowledge serves me well, I need to use a Flow.fromGraph for such a thing, however, this is where I'm stuck.
In a Future I can do an easy map and flatMap for the various endpoints but in streams that would mean dividing up the Flow into multiple Flow's and I'm not quite sure how I'd do that. I thought about using UnzipWith and Options or a generic Broadcast.
Any help on this subject would be much appreciated.
I don't know if this would be necessary? -- http://doc.akka.io/docs/akka-stream-and-http-experimental/2.0-M2/scala/stream-customize.html
You do not need to use Flow.fromGraph. Instead, a single Flow that uses flatMapConcat will work:
//an outgoing connection flow
val checkIPFlow = Http().outgoingConnection("checkip.amazonaws.com")
//converts the final html String to an HttpResponse
def byteStrToResponse(byteStr : ByteString) =
HttpResponse(entity = new Default(MediaTypes.`text/html`,
byteStr.length,
Source.single(byteStr)))
val reqResponseFlow = Flow[HttpRequest].flatMapConcat[HttpResponse]( _ match {
case HttpRequest(GET, Uri.Path("/"), _, _, _) =>
Source.single(HttpRequest(GET, Uri("http://checkip.amazonaws.com/")))
.via(checkIPFlow)
.mapAsync(1)(_.entity.dataBytes.runFold(ByteString(""))(_ ++ _))
.map("<html><body><h1>" + _.utf8String + "</h1></body></html>")
.map(ByteString.apply)
.map(byteStrToResponse)
case _ =>
Source.single(HttpResponse(404, entity = "Unknown resource!"))
})
This Flow can then be used to bind to incoming requests:
Http().bindAndHandle(reqResponseFlow, "localhost", 8080)
And all without Futures...