Play Framework Scala: How to Stream Request Body

I'm building a microservice with Play Framework 2.3.x in Scala (I'm a beginner in both), but I can't figure out how to stream my request body.
Here is the problem:
I need an endpoint /transform that receives a huge TSV file, parses it, and renders it in another format: a simple transformation. The problem is that every single command in my controller is run "too late". Play waits to receive the full file before my code starts.
Example:
def transform = Action.async {
  Future {
    Logger.info("Too late")
    Ok("A response")
  }
}
I want to read the request body line by line while it is being uploaded and start processing it without waiting for the whole file to arrive.
Any hint would be welcome.

This answer applies to Play 2.5.x and higher since it uses the Akka streams API that replaced Play's Iteratee-based streaming in that version.
Basically, you can create a body parser that returns a Source[T], which you can then pass to Ok.chunked(...). One way to do this is to use Accumulator.source[T] in the body parser. For example, an action that just returns the data sent to it verbatim might look like this:
def verbatimBodyParser: BodyParser[Source[ByteString, _]] = BodyParser { _ =>
  // Return the source directly. We need to return
  // an Accumulator[Either[Result, T]], so if we were
  // handling any errors we could map to something like
  // a Left(BadRequest("error")). Since we're not
  // we just wrap the source in a Right(...)
  Accumulator.source[ByteString]
    .map(Right.apply)
}
def stream = Action(verbatimBodyParser) { implicit request =>
  Ok.chunked(request.body)
}
If you want to do something like transform a TSV file, you can use a Flow to transform the source, e.g.:
val tsvToCsv: BodyParser[Source[ByteString, _]] = BodyParser { req =>
  val transformFlow: Flow[ByteString, ByteString, NotUsed] = Flow[ByteString]
    // Split incoming bytes into lines at newlines. A line longer than
    // 1000 bytes fails the stream; allowTruncation = true only lets a
    // final line without a trailing newline through...
    .via(Framing.delimiter(ByteString("\n"), 1000, allowTruncation = true))
    // Replace tabs by commas. This is just a silly example and
    // you could obviously do something more clever here...
    .map(s => ByteString(s.utf8String.split('\t').mkString(",") + "\n"))
  Accumulator.source[ByteString]
    .map(_.via(transformFlow))
    .map(Right.apply)
}
def convert = Action(tsvToCsv) { implicit request =>
  Ok.chunked(request.body).as("text/csv")
}
There may be more inspiration in the Directing the Body Elsewhere section of the Play docs.
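As for "something more clever" in the map step: here is a minimal sketch of a per-line conversion that quotes fields containing commas or quotes and keeps trailing empty fields. The names TsvToCsv, quote and tsvLineToCsv are made up for illustration, and the quoting rule is an RFC 4180 style assumption rather than anything Play prescribes.

```scala
object TsvToCsv {
  // Quote a field only when it contains a comma, a double quote, or a line break.
  // Embedded double quotes are doubled, as CSV readers usually expect.
  def quote(field: String): String =
    if (field.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
      "\"" + field.replace("\"", "\"\"") + "\""
    else field

  // split with limit -1 keeps trailing empty fields,
  // which a plain split('\t') would silently drop.
  def tsvLineToCsv(line: String): String =
    line.split("\t", -1).map(quote).mkString(",")
}
```

In the flow above this could replace the body of the map stage, e.g. ByteString(TsvToCsv.tsvLineToCsv(s.utf8String) + "\n").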

Related

Transforming Slick Streaming data and sending Chunked Response using Akka Http

The aim is to stream data from a database, perform some computation on each chunk of data (this computation returns a Future of some case class), and send the result as a chunked response to the user. Currently I am able to stream data and send the response without performing any computation. However, I am unable to perform this computation and then stream the result.
This is the route I have implemented.
def streamingDB1 =
  path("streaming-db1") {
    get {
      val src = Source.fromPublisher(db.stream(getRds))
      complete(src)
    }
  }
The function getRds returns the rows of a table mapped into a case class (using Slick). Now consider a function compute, which takes each row as input and returns a Future of another case class. Something like
def compute(x: Tweet): Future[TweetNew] = ???
How can I apply this function to src and send the result of the computation to the user as a chunked (streamed) response?
You could transform the source using mapAsync:
val src =
  Source.fromPublisher(db.stream(getRds))
    .mapAsync(parallelism = 3)(compute)
complete(src)
Adjust the level of parallelism as needed.
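To see what mapAsync does in isolation, here is a small sketch that runs it over an in-memory source instead of a Slick publisher. It assumes akka-stream 2.6 or later on the classpath (where an implicit ActorSystem supplies the materializer); MapAsyncDemo and its compute stand-in are hypothetical names, not part of the question's code.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object MapAsyncDemo {
  def run(): Seq[String] = {
    // Assumption: Akka 2.6+, so the implicit system also provides the materializer
    implicit val system: ActorSystem = ActorSystem("map-async-demo")
    import system.dispatcher

    // Stand-in for the real computation: one Future per element
    def compute(n: Int): Future[String] = Future(s"row-$n")

    try {
      // Up to 3 futures run at once, but results are emitted in input order
      val done = Source(1 to 5).mapAsync(parallelism = 3)(compute).runWith(Sink.seq)
      Await.result(done, 5.seconds)
    } finally system.terminate()
  }
}
```

Because mapAsync keeps the upstream order, the parallelism setting only affects throughput. If ordering doesn't matter, mapAsyncUnordered is the variant that emits results as they complete.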
Note that you might need to configure a few settings as mentioned in the Slick documentation:
Note: Some database systems may require session parameters to be set in a certain way to support streaming without caching all data at once in memory on the client side. For example, PostgreSQL requires both .withStatementParameters(rsType = ResultSetType.ForwardOnly, rsConcurrency = ResultSetConcurrency.ReadOnly, fetchSize = n) (with the desired page size n) and .transactionally for proper streaming.
So if you're using PostgreSQL, for example, then your Source might look something like the following:
val src =
  Source.fromPublisher(
    db.stream(
      getRds.withStatementParameters(
        rsType = ResultSetType.ForwardOnly,
        rsConcurrency = ResultSetConcurrency.ReadOnly,
        fetchSize = 10
      ).transactionally
    )
  ).mapAsync(parallelism = 3)(compute)
You also need a way to marshal TweetNew. In addition, if you send a chunk of length 0, the client may close the connection, since a zero-length chunk marks the end of a chunked body.
This code works with curl:
case class TweetNew(str: String)
def compute(string: String): Future[TweetNew] = Future {
  TweetNew(string)
}
val route = path("hello") {
  get {
    val byteString: Source[ByteString, NotUsed] = Source.apply(List("t1", "t2", "t3"))
      .mapAsync(2)(compute)
      .map(tweet => ByteString(tweet.str + "\n"))
    complete(HttpEntity(ContentTypes.`text/plain(UTF-8)`, byteString))
  }
}

Converting WebSockets in Play framework from version 2.4 to 2.6

I'm trying to convert this code, that uses the Play version 2.4 to the current version (2.6) and I'm having some issues because I'm still a noob in Scala.
def wsWeatherIntervals = WebSocket.using[String] {
  request =>
    val url = "http://api.openweathermap.org/data/2.5/weather?q=Amsterdam,nl"
    val outEnumerator = Enumerator.repeatM[String]({
      Thread.sleep(3000)
      ws.url(url).get().map(r => s"${new java.util.Date()}\n ${r.body}")
    })
    (Iteratee.ignore[String], outEnumerator)
}
I followed this guide, but now I'm stuck on what the method should return.
This is the code that I'm trying to run using the version 2.6:
import play.api.mvc._
import scala.concurrent.Future
import akka.stream.scaladsl._
def wsWeatherIntervals = WebSocket.accept[String, Future[String]] { res =>
  val url = "http://api.openweathermap.org/data/2.5/weather?q=Amsterdam,nl"
  val source = Source.repeat({
    Thread.sleep(3000)
    ws.url(url).get().map(r => s"${new java.util.Date()}\n ${r.body}")
  })
  Flow.fromSinkAndSource(Sink.ignore, source)
}
But I'm getting this error when running the server, that points to the first line of the method:
could not find implicit value for parameter transformer: play.api.mvc.WebSocket.MessageFlowTransformer[String,scala.concurrent.Future[String]]
Note: I also tried to call WebSocket.apply instead of WebSocket.accept and I did some search about the differences between the two but didn't find anything useful. Can someone explain the difference between the two? Thanks.
The superficial error is that Play doesn't know how to turn a Future[String] into a WebSocket message, for which you'd normally use an implicit transformer. However, in this case you don't want to return a Future[String] anyway, just a plain String, which can be marshalled automatically (using the provided stringMessageFlowTransformer, as it happens). Here's something that should work:
def wsWeatherIntervals = WebSocket.accept[String, String] { res =>
  val url = "http://api.openweathermap.org/data/2.5/weather?q=Amsterdam,nl"
  def f = ws.url(url).get().map(r => s"${new java.util.Date()}\n ${r.body}")
  val source = Source.unfoldAsync(f)(last => {
    Thread.sleep(3000)
    f.map(next => Some((last, next)))
  })
  Flow.fromSinkAndSource(Sink.ignore, source)
}
The unfoldAsync source lets us repeatedly run a function that returns a future of the next element in the stream. (Since we want the stream to go on forever, we always return the value wrapped in Some.)
The WebSocket.apply method is basically a more complicated version of accept which allows you to reject a WebSocket connection for some reason by returning a response. If you need to do this, it's better to use acceptOrResult, which handles transforming whatever your flow emits into WebSocket messages.
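If the unfoldAsync contract is unclear, here is a small standalone sketch of it, assuming akka-stream 2.6 or later on the classpath. UnfoldAsyncDemo and the "reading-n" values are invented for the example; unlike the weather source above, it returns None after three elements so that the stream completes and can be collected.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object UnfoldAsyncDemo {
  def run(): Seq[String] = {
    // Assumption: Akka 2.6+, so the implicit system also provides the materializer
    implicit val system: ActorSystem = ActorSystem("unfold-async-demo")
    try {
      val source = Source.unfoldAsync(1) { n =>
        // Some((nextState, element)) emits the element and continues;
        // None completes the stream.
        val next: Option[(Int, String)] =
          if (n <= 3) Some((n + 1, s"reading-$n")) else None
        Future.successful(next)
      }
      Await.result(source.runWith(Sink.seq), 5.seconds)
    } finally system.terminate()
  }
}
```

The first half of the tuple is the state handed to the next call and the second half is what goes downstream, which is why the answer's version can keep returning Some forever.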

Download media file from twilio, using the media URI

I have been having issues downloading media from the media URI provided on MMS messages.
val url = "https://api.twilio.com/2010-04-01/Accounts/xx/Messages/xx/Media/xx"
The media URL provided has the structure above.
new URL(url) #> new File("file.png") !! // this fails because of multiple redirects
When I open the URI in a browser, the redirects end at
http://media.twiliocdn.com.s3-external-1.amazonaws.com/xx/xx
1st URL -> 2nd URL -> the URL above; so, all in all, 2 redirects.
If I try the snippet above with the final URL, it works. I am sure the multiple redirects are the reason the snippet didn't work in the first place.
I've been using Play Framework with Scala. Can I get a source example for downloading the file? Any help or pointers are appreciated. I tried various examples but still could not solve the issue.
Some findings =>
Accessing Twilio MMS images
anything similar for scala?
Update: @millhouse
def fileDownloader(urls: String, location: String) = {
  import play.api.Play.current
  import scala.concurrent.ExecutionContext.Implicits.global
  // Make the request
  val futureResponse: Future[(WSResponseHeaders, Enumerator[Array[Byte]])] =
    WS.url(urls).withFollowRedirects(true).getStream()
  futureResponse.flatMap {
    case (headers, body) =>
      val file = new File(location)
      val outputStream = new FileOutputStream(file)
      // The iteratee that writes to the output stream
      val iteratee = Iteratee.foreach[Array[Byte]] { bytes =>
        outputStream.write(bytes)
      }
      // Feed the body into the iteratee
      (body |>>> iteratee).andThen {
        case result =>
          // Close the output stream whether there was an error or not
          outputStream.close()
          // Get the result or rethrow the error
          result.get
      }.map(_ => file)
  }
}
This is the approach I had been using until now (it works), as explained in the Play docs. But I needed a sequential approach, meaning I need to carry out another step once the file has downloaded successfully. Sorry for not clarifying that up front.
Update 2: Solved in this manner:
def fileDownloader(urls: String, location: String) = {
  import play.api.Play.current
  import scala.concurrent.ExecutionContext.Implicits.global
  // Make the request
  val futureResponse: Future[(WSResponseHeaders, Enumerator[Array[Byte]])] =
    WS.url(urls).withFollowRedirects(true).getStream()
  val downloadedFile: Future[File] = futureResponse.flatMap {
    case (headers, body) =>
      val file = new File(location)
      val outputStream = new FileOutputStream(file)
      // The iteratee that writes to the output stream
      val iteratee = Iteratee.foreach[Array[Byte]] { bytes =>
        outputStream.write(bytes)
      }
      // Feed the body into the iteratee
      (body |>>> iteratee).andThen {
        case result =>
          // Close the output stream whether there was an error or not
          outputStream.close()
          // Get the result or rethrow the error
          result.get
      }.map(_ => file)
  }
  downloadedFile.map{ fileIn =>
    //things needed to do
  }
}
Thanks,
I haven't used the Twilio MMS API but it should be very straightforward to get the Play Framework HTTP client to follow redirects, using the documented option to the client:
val url = "https://api.twilio.com/2010-04-01/Accounts/xx/Messages/xx/Media/xx"
ws.url(url).withFollowRedirects(true).get().map { response =>
val theBytes:Array[Byte] = response.bodyAsBytes // Play 2.4 and lower
// ... save it
}
Note that the above code works for Play 2.4.x and lower; the bodyAsBytes method of WSResponse returns an Array[Byte]. If you're on the current cutting-edge and using Play 2.5.x, bodyAsBytes gives you an Akka ByteString with lots of nice functional methods, but you probably just want to call toArray on it if all you want is to store the data:
ws.url(url).withFollowRedirects(true).get().map { response =>
val theBytes:Array[Byte] = response.bodyAsBytes.toArray // Play 2.5
// ... save it
}
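For the "... save it" part, a minimal sketch using only the JDK could look like the following. SaveBytes.save is a made-up helper name, and overwriting an existing file is simply the default behaviour of Files.write, not something the question asked for.

```scala
import java.nio.file.{Files, Path}

object SaveBytes {
  // Writes the whole array to `target`, creating or overwriting the file,
  // and returns the path so a follow-up step can be chained on it.
  def save(bytes: Array[Byte], target: Path): Path =
    Files.write(target, bytes)
}
```

Inside the map block above you could then call SaveBytes.save(theBytes, java.nio.file.Paths.get("file.png")). The surrounding Future completes once the file has been written, so any next step can be chained onto it with another map.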

Play / Logging / Print Response Body / Run over enumerator / buffer the body

I'm looking for a way to print the response body in Play framework, I have a code like this:
object AccessLoggingAction extends ActionBuilder[Request] {
  def invokeBlock[A](request: Request[A], block: (Request[A]) => Future[Result]) = {
    Logger.info(s"""Request:
      id=${request.id}
      method=${request.method}
      uri=${request.uri}
      remote-address=${request.remoteAddress}
      body=${request.body}
    """)
    val ret = block(request)
    /*
    ret.map {result =>
      Logger.info(s"""Response:
        id=${request.id}
        body=${result.body}
      """)
    }
    */ //TODO: find out how to print result.body (be careful not to consume the enumerator)
    ret
  }
}
Currently the commented-out code is not working as I wanted, I mean, it would print:
Response:
id=1
body=play.api.libs.iteratee.Enumerator$$anon$18@39e6c1a2
So, I need to find a way to get a String out of Enumerator[Array[Byte]]. I tried to grasp the concept of Enumerator by reading this: http://mandubian.com/2012/08/27/understanding-play2-iteratees-for-normal-humans/
So..., if I understand it correctly:
I shouldn't dry-up the enumerator in the process of converting it to String. Otherwise, the client would receive nothing.
Let's suppose I figure out how to implement the T / filter mechanism. But then... wouldn't it defeat the purpose of Play framework as non-blocking streaming framework (because I would be building up the complete array of bytes in the memory, before calling toString on it, and finally log it)?
So, what's the correct way to log the response?
Thanks in advance,
Raka
This code works:
object AccessLoggingAction extends ActionBuilder[Request] {
  def invokeBlock[A](request: Request[A], block: (Request[A]) => Future[Result]) = {
    val start = System.currentTimeMillis
    Logger.info(s"""Request:
      id=${request.id}
      method=${request.method}
      uri=${request.uri}
      remote-address=${request.remoteAddress}
      body=${request.body}
    """)
    val resultFut = block(request)
    resultFut.map {result =>
      val time = System.currentTimeMillis - start
      Result(result.header, result.body &> Enumeratee.map(arrOfBytes => {
        val body = new String(arrOfBytes.map(_.toChar))
        Logger.info(s"""Response:
          id=${request.id}
          method=${request.method}
          uri=${request.uri}
          delay=${time}ms
          status=${result.header.status}
          body=${body}""")
        arrOfBytes
      }), result.connection)
    }
  }
}
I partly learned it from here (on how to get the byte array out of enumerator): Scala Play 2.1: Accessing request and response bodies in a filter.
I'm using Play 2.3.7 while the link I gave uses 2.1 (and still uses PlainResult, which no longer exists in 2.3).
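One detail about the byte-to-string step: arrOfBytes.map(_.toChar) converts each byte on its own, which only gives the right text for ASCII. A small sketch of decoding with an explicit charset instead, assuming the response is UTF-8 (BodyDecoding and its method names are invented for the comparison):

```scala
import java.nio.charset.StandardCharsets

object BodyDecoding {
  // Decodes the chunk as UTF-8 (an assumption about the response encoding).
  def decodeUtf8(bytes: Array[Byte]): String =
    new String(bytes, StandardCharsets.UTF_8)

  // The per-byte conversion used in the answer above, kept for comparison.
  def decodePerByte(bytes: Array[Byte]): String =
    new String(bytes.map(_.toChar))
}
```

Both give the same result for plain ASCII bodies. Note that even with a proper charset, a multi-byte character split across two chunks would still be decoded incorrectly when each chunk is logged separately.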
As it appears to me, if you do logging inside result.body &> Enumeratee.map (as suggested in https://stackoverflow.com/a/27630208/1781549) and the result body is presented in more than one chunk, then each chunk will be logged independently. You probably don't want this.
I'd implement it like this:
val ret = block(request).flatMap { result =>
  val consume = Iteratee.consume[Array[Byte]]()
  val bodyF = Iteratee.flatten(result.body(consume)).run
  bodyF.map { bodyBytes: Array[Byte] =>
    //
    // Log the body
    //
    result.copy(body = Enumerator(bodyBytes))
  }
}
But be warned: the whole idea of this is to consume all the data from the result.body Enumerator before logging (and return the new Enumerator). So, if the response is big, or you rely on streaming, then it's probably also the thing you don't want.
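If you go this way and mainly worry about huge log lines, one option is to log only a prefix of the buffered body. A minimal sketch, where BodyPreview.preview and the maxBytes parameter are hypothetical names and UTF-8 is an assumed encoding:

```scala
import java.nio.charset.StandardCharsets

object BodyPreview {
  // Returns at most `maxBytes` of the body decoded as UTF-8, followed by a
  // note with the full size when something was cut off. Cutting in the middle
  // of a multi-byte character leaves a replacement character at the end.
  def preview(body: Array[Byte], maxBytes: Int): String = {
    val shown = new String(body.take(maxBytes), StandardCharsets.UTF_8)
    if (body.length > maxBytes) s"${shown}... (${body.length} bytes total)" else shown
  }
}
```

This only limits what ends up in the log; the whole body is still held in memory by the consume step above.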
I used the above answer as a starting point, but noticed that it will only log responses if a body is present. We've adapted it to this:
var responseBody = None:Option[String]
val captureBody = Enumeratee.map[Array[Byte]](arrOfBytes => {
  val body = new String(arrOfBytes.map(_.toChar))
  responseBody = Some(body)
  arrOfBytes
})
val withLogging = (result.body &> captureBody).onDoneEnumerating({
  logger.debug(.. create message here ..)
})
result.copy(body=withLogging)

Creating Enumerator from client-sent data

I have a REST service (Play Framework 2.0 w/Scala) that receives messages via a POST request.
I want to allow a user to see the queue of messages received in a webpage. I wanted to create a SSE channel between browser and server, so the server pushes new messages to the browser.
To create that SSE stream, as per documentation, I'm using a chain of Enumerator/Enumeratee/Iteratee.
My problem is: how do I inject the messages received from the POST request into the enumerator? Given code like the following:
def receive(msg: String) = Action {
  sendToEnumerator()
  Ok
}
val enumerator = Enumerator.fromCallback( ??? )
def sseStream() = Action {
  Ok.stream(enumerator &> anotherEnumeratee ><> EventSource()).as("text/event-stream")
}
What should I put in both sendToEnumerator and enumerator (where the ??? are)? Or should I just use WebSockets and Actors instead? (I favour SSE due to broader compatibility, so I would like to use SSE if possible.)
Ok, found a way:
// The enum for pushing data to spread to all connected users
val hubEnum = Enumerator.imperative[String]()
// The hub used to get multiple output of a common input (the hubEnum)
val hub = Concurrent.hub[String](hubEnum)
// Converts message to Json for the web version
private val asJson: Enumeratee[String, JsValue] = Enumeratee.map[String] {
  text => JsObject(
    List(
      "eventName" -> JsString("eventName"),
      "text" -> JsString(text)
    )
  )
}
// loads data into hubEnum
def receiveData(msg: String) = Action { implicit request =>
  hubEnum push msg
  // An Action has to return a Result, so acknowledge the push
  Ok
}
// reads the hub and pushes the messages back to clients
def stream = Action { implicit request =>
  Ok.stream(hub.getPatchCord &> asJson ><> EventSource()).as("text/event-stream")
}
The trick is to create an imperative Enumerator. This enumerator allows you to push data into it when it becomes available. With this then you can follow the standard procedure: create a Hub based on the enumerator, convert it with some Enumeratee and send it back to browsers via SSE.
Thanks to this website for giving me the solution :)
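For anyone who wants the idea without the Play 2.0 types: the imperative enumerator plus hub boils down to "push a message in, fan it out to everyone currently subscribed". A rough plain-Scala sketch of that pattern follows; SimpleHub is an invented class for illustration, not Play's API, and it has none of the back-pressure handling of the real thing.

```scala
final class SimpleHub[A] {
  // Current subscribers; guarded by `synchronized` for concurrent access
  private var subscribers = Vector.empty[A => Unit]

  // Register a callback that receives every message pushed from now on
  def subscribe(onMessage: A => Unit): Unit =
    synchronized { subscribers :+= onMessage }

  // Deliver the message to all subscribers registered at this moment
  def push(message: A): Unit = {
    val current = synchronized(subscribers)
    current.foreach(_(message))
  }
}
```

In the answer above, hubEnum push msg plays the role of push, and every hub.getPatchCord handed to a browser corresponds to one subscribe call. A subscriber that joins late only sees messages pushed after it joined.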