Decode a streaming GZIP response in Scala Dispatch?

Receiving a gzipped response from an API, but Dispatch 0.9.5 doesn't appear to have any methods to decode the response. Any ideas?
Here's my current implementation; the println only prints out string representations of bytes.
Http(
  host("stream.gnip.com")
    .secure
    .addHeader("Accept-Encoding", "gzip")
    / gnipUrl
    > as.stream.Lines(println))()
Tried to look at implementing my own handler, but not sure where to begin. Here's the relevant file for Lines: https://github.com/dispatch/reboot/blob/master/core/src/main/scala/as/stream/lines.scala
Thanks!

Simply abandoned Dispatch and used Java APIs directly. Disappointing, but it got the job done.
val GNIP_URL = isDev match {
  case true => "https://url/apath/track/dev.json"
  case false => "https://url/path/track/prod.json"
}
val GNIP_CHARSET = "UTF-8"

override def preStart() = {
  log.info("[tracker] Starting new Twitter PowerTrack connection to %s" format GNIP_URL)
  val connection = getConnection(GNIP_URL, GNIP_USER, GNIP_PASSWORD)
  val inputStream = connection.getInputStream()
  val reader = new BufferedReader(new InputStreamReader(new StreamingGZIPInputStream(inputStream), GNIP_CHARSET))
  var line = reader.readLine()
  while (line != null) {
    println(line)
    line = reader.readLine()
  }
}

private def getConnection(urlString: String, user: String, password: String): HttpURLConnection = {
  val url = new URL(urlString)
  val connection = url.openConnection().asInstanceOf[HttpURLConnection]
  connection.setReadTimeout(1000 * 60 * 60)
  connection.setConnectTimeout(1000 * 10)
  connection.setRequestProperty("Authorization", createAuthHeader(user, password))
  connection.setRequestProperty("Accept-Encoding", "gzip")
  connection
}

private def createAuthHeader(username: String, password: String) = {
  val encoder = new BASE64Encoder()
  val authToken = username + ":" + password
  "Basic " + encoder.encode(authToken.getBytes())
}
Used GNIP's example: https://github.com/gnip/support/blob/master/Premium%20Stream%20Connection/Java/StreamingConnection.java
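For reference, the StreamingGZIPInputStream above comes from that GNIP example. A minimal sketch of the idea (not the exact class): delegate available() to the underlying stream so a long-lived streaming read doesn't stall on the inflater's internal buffer:
import java.io.InputStream
import java.util.zip.GZIPInputStream

// Sketch only, modeled on GNIP's example class.
class StreamingGZIPInputStream(wrapped: InputStream) extends GZIPInputStream(wrapped) {
  override def available(): Int = wrapped.available()
}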

This isn't so much a solution as a workaround, but I ended up resorting to bypassing the Future-based stuff and doing:
val stream = Http(req OK as.Response(_.getResponseBodyAsStream)).apply()
val result =
  JsonParser.parse(
    new java.io.InputStreamReader(
      new java.util.zip.GZIPInputStream(stream)))
I'm using JsonParser here because in my case the data I'm receiving happens to be JSON; substitute something else for your use case if needed.
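If you'd rather not block with .apply(), the same idea can stay asynchronous by mapping over the promise/future that Http returns (Dispatch 0.9's Promise supports map; with later versions an implicit ExecutionContext is needed). A sketch:
// Hedged sketch: decode inside map instead of blocking the caller.
val resultFuture =
  Http(req OK as.Response(_.getResponseBodyAsStream)).map { stream =>
    JsonParser.parse(
      new java.io.InputStreamReader(
        new java.util.zip.GZIPInputStream(stream)))
  }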

My solution defines a response parser and uses the json4s parser:
object GzipJson extends (Response => JValue) {
  def apply(r: Response) = {
    if (r.getHeader("content-encoding") != null && r.getHeader("content-encoding").equals("gzip")) {
      parse(new GZIPInputStream(r.getResponseBodyAsStream), true)
    } else {
      (dispatch.as.String andThen (s => parse(StringInput(s), true)))(r)
    }
  }
}
So I can use it to extract a gzipped JSON response, as in the following code:
import GzipJson._
Http(req OK GzipJson).apply
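Because GzipJson is just a Response => JValue function, it composes with Dispatch's OK verb the same way the built-in as.* converters do.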

Try >> instead of >
See https://github.com/dispatch/dispatch/blob/master/core/src/main/scala/handlers.scala#L58
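A rough sketch of that approach, assuming classic Dispatch's >> handler, which hands your function the raw response InputStream (operator names differ between Dispatch versions, so treat this as illustrative rather than exact):
import java.io.{BufferedReader, InputStream, InputStreamReader}
import java.util.zip.GZIPInputStream

// Illustrative only: wrap the raw stream from `>>` in a GZIPInputStream
// and read the decoded lines one by one.
val http = new Http
val req = :/("stream.gnip.com").secure / gnipUrl <:< Map("Accept-Encoding" -> "gzip")
http(req >> { (is: InputStream) =>
  val reader = new BufferedReader(new InputStreamReader(new GZIPInputStream(is), "UTF-8"))
  Iterator.continually(reader.readLine()).takeWhile(_ != null).foreach(println)
})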

Related

How can I write AWS Lambda unit test cases in Scala?

How can we implement unit test cases for an AWS Lambda serverless function?
My code is:
object Test1 extends RequestHandler[APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent] with ResponseObjProcess {
  override def handleRequest(input: APIGatewayProxyRequestEvent, context: Context): APIGatewayProxyResponseEvent = {
    val response = new APIGatewayProxyResponseEvent()
    val gson = new Gson
    val requestHttpMethod = input.getHttpMethod
    val requestBody = input.getBody
    val requestHeaders = input.getHeaders
    val requestPath = input.getPath
    val requestPathParameters = input.getPathParameters
    val requestQueryStringParameters = input.getQueryStringParameters
    val parsedBody = JSON.parseFull(requestBody).getOrElse(0).asInstanceOf[Map[String, String]]
    println(" parsedBody is:: " + parsedBody)
    val active = parsedBody.get("active").getOrElse("false")
    val created = parsedBody.get("created").getOrElse("0").toLong
    val updated = parsedBody.get("updated").getOrElse("0").toLong
    requestHttpMethod match {
      case "PUT" =>
        println(" PUT Request method ")
        // insertRecords("alert_summary_report", requestBody)
        response.setStatusCode(200)
        response.setBody(gson.toJson("PUT"))
      case _ =>
        println("")
        response.setStatusCode(400)
        response.setBody(gson.toJson("None"))
    }
    response
  }
}
I tried to implement unit test cases for the above code. Here is the test:
test("testing record success case") {
  var request = new APIGatewayProxyRequestEvent()
  request.setHttpMethod(Constants.PUTREQUESTMETHOD)
  DELETEREQUESTBODY.put("id", "")
  request.setBody(EMPTYREQUESTBODY)
  request.setPathParameters(DELETEREQUESTBODY)
  println(s"body = ${request.getBody}")
  println(s"headers = ${request.getHeaders}")
  val response = ProxyRequestMain.handleRequest(subject, testContext)
  val assertEqual = response.getStatusCode.equals(200)
  assertEqual
}
Actually, I'm getting response.getStatusCode = 400 (Bad Request), but the test case still passes. How can I handle this?
I am looking at your test code and it's not clear to me what you are trying to achieve with your assertions. I think you might have mixed up quite a few things. In the code as it currently stands, you have a val, not an assertion. I'd encourage you to have a look at the relevant docs and research the options available to you:
http://www.scalatest.org/user_guide/using_assertions
http://www.scalatest.org/user_guide/using_matchers
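As a minimal sketch of the fix, assuming ScalaTest's standard assertions: assert on the status code instead of binding the comparison to a val, so a 400 actually fails the test:
test("testing record success case") {
  val request = new APIGatewayProxyRequestEvent()
  request.setHttpMethod(Constants.PUTREQUESTMETHOD)
  request.setBody(EMPTYREQUESTBODY)
  request.setPathParameters(DELETEREQUESTBODY)

  val response = ProxyRequestMain.handleRequest(subject, testContext)
  // Fails the test (rather than silently passing) when the status is not 200.
  assert(response.getStatusCode == 200)
  // Or, with ScalaTest matchers:
  // response.getStatusCode shouldBe 200
}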

File Upload and processing using akka-http websockets

I'm using some sample Scala code to make a server that receives a file over a websocket, stores the file temporarily, runs a bash script on it, and then returns stdout by TextMessage.
Sample code was taken from this github project.
I edited the code slightly within echoService so that it runs another function that processes the temporary file.
object WebServer {
  def main(args: Array[String]) {
    implicit val actorSystem = ActorSystem("akka-system")
    implicit val flowMaterializer = ActorMaterializer()
    val interface = "localhost"
    val port = 3000
    import Directives._
    val route = get {
      pathEndOrSingleSlash {
        complete("Welcome to websocket server")
      }
    } ~
      path("upload") {
        handleWebSocketMessages(echoService)
      }
    val binding = Http().bindAndHandle(route, interface, port)
    println(s"Server is now online at http://$interface:$port\nPress RETURN to stop...")
    StdIn.readLine()
    binding.flatMap(_.unbind()).onComplete(_ => actorSystem.shutdown())
    println("Server is down...")
  }

  implicit val actorSystem = ActorSystem("akka-system")
  implicit val flowMaterializer = ActorMaterializer()

  val echoService: Flow[Message, Message, _] = Flow[Message].mapConcat {
    case BinaryMessage.Strict(msg) => {
      val decoded: Array[Byte] = msg.toArray
      val imgOutFile = new File("/tmp/" + "filename")
      val fileOuputStream = new FileOutputStream(imgOutFile)
      fileOuputStream.write(decoded)
      fileOuputStream.close()
      TextMessage(analyze(imgOutFile))
    }
    case BinaryMessage.Streamed(stream) => {
      stream
        .limit(Int.MaxValue)             // Max frames we are willing to wait for
        .completionTimeout(50 seconds)   // Max time until last frame
        .runFold(ByteString(""))(_ ++ _) // Merges the frames
        .flatMap { (msg: ByteString) =>
          val decoded: Array[Byte] = msg.toArray
          val imgOutFile = new File("/tmp/" + "filename")
          val fileOuputStream = new FileOutputStream(imgOutFile)
          fileOuputStream.write(decoded)
          fileOuputStream.close()
          Future(Source.single(""))
        }
      TextMessage(analyze(imgOutFile))
    }
  }

  private def analyze(imgfile: File): String = {
    val p = Runtime.getRuntime.exec(Array("./run-vision.sh", imgfile.toString))
    val br = new BufferedReader(new InputStreamReader(p.getInputStream, StandardCharsets.UTF_8))
    try {
      val result = Stream
        .continually(br.readLine())
        .takeWhile(_ ne null)
        .mkString
      result
    } finally {
      br.close()
    }
  }
}
During testing using Dark WebSocket Terminal, case BinaryMessage.Strict works fine.
Problem: However, case BinaryMessage.Streamed doesn't finish writing the file before running the analyze function, resulting in a blank response from the server.
I'm trying to wrap my head around how Futures are being used here with the Flows in Akka-HTTP, but I'm not having much luck outside trying to get through all the official documentation.
Currently, .mapAsync seems promising, as does finding a way to chain Futures in general.
I'd really appreciate some insight.
Yes, mapAsync will help you here. It is a combinator for executing Futures (potentially in parallel) in your stream and presenting their results on the output side.
In your case, to make things homogeneous and keep the type checker happy, you'll need to wrap the result of the Strict case in Future.successful.
A quick fix for your code could be:
val echoService: Flow[Message, Message, _] = Flow[Message].mapAsync(parallelism = 5) {
  case BinaryMessage.Strict(msg) => {
    val decoded: Array[Byte] = msg.toArray
    val imgOutFile = new File("/tmp/" + "filename")
    val fileOuputStream = new FileOutputStream(imgOutFile)
    fileOuputStream.write(decoded)
    fileOuputStream.close()
    Future.successful(TextMessage(analyze(imgOutFile)))
  }
  case BinaryMessage.Streamed(stream) =>
    stream
      .limit(Int.MaxValue)             // Max frames we are willing to wait for
      .completionTimeout(50 seconds)   // Max time until last frame
      .runFold(ByteString(""))(_ ++ _) // Merges the frames
      .flatMap { (msg: ByteString) =>
        val decoded: Array[Byte] = msg.toArray
        val imgOutFile = new File("/tmp/" + "filename")
        val fileOuputStream = new FileOutputStream(imgOutFile)
        fileOuputStream.write(decoded)
        fileOuputStream.close()
        Future.successful(TextMessage(analyze(imgOutFile)))
      }
}
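Note that mapAsync(parallelism = 5) runs up to five of these Futures concurrently while still emitting results downstream in their original order; mapAsyncUnordered would trade that ordering for potentially better throughput.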

Scala Play Framework

How do I get the response body from an endpoint? I am sending a request to an endpoint and want to know how to get the response as a string.
val complexRequest = ws.url(serviceEndpoint).withHeaders("Content-Type" -> "application/xml")
val result = complexRequest.post(leadXml).map { response =>
  logger.info(s"response $response")
  if (response.status == 200) {
    val res = response
    logger.info(s"status passed.. $res")
  } else {
    val res = response
    logger.info(s"status failed.. $res")
  }
}
Use response.body. You can also use Play JSON to validate it and turn it into a workable object!
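A minimal sketch of that idea, assuming the endpoint returns JSON and a hypothetical case class Lead with an implicit Play JSON Reads in scope:
import play.api.libs.json.{JsError, JsSuccess, JsValue}

complexRequest.post(leadXml).map { response =>
  val raw: String = response.body    // the body as a plain string
  val json: JsValue = response.json  // parsed, if the body is JSON
  // `Lead` and its Reads are placeholders for your own model.
  json.validate[Lead] match {
    case JsSuccess(lead, _) => logger.info(s"parsed lead: $lead")
    case JsError(errors)    => logger.warn(s"invalid payload: $errors")
  }
}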
You can use the Helpers class:
import play.api.test.Helpers._
val result: Future[SimpleResult] = …
val bodyAsBytes: Array[Byte] = contentAsBytes(result)
Or JavaResultExtractor:
akka.util.ByteString body = play.core.j.JavaResultExtractor.getBody(result, 10000l, mat);
Or JavaBodyParsers:
https://www.playframework.com/documentation/2.5.x/JavaBodyParsers

Convert HttpEntity.Chunked to Array[String]

I have the following problem.
I am querying a server for some data and getting it back as HttpEntity.Chunked.
The response String, with up to 10,000,000 lines, looks like this:
[{"name":"param1","value":122343,"time":45435345},
{"name":"param2","value":243,"time":4325435},
......]
Now I want to get the incoming data into an Array[String], where each String is a line from the response, because later on it should be imported into an Apache Spark DataFrame.
Currently I am doing it like this:
// For the http request
trait StartHttpRequest {
  implicit val system: ActorSystem
  implicit val materializer: ActorMaterializer

  def httpRequest(data: String, path: String, targetPort: Int, host: String): Future[HttpResponse] = {
    val connectionFlow: Flow[HttpRequest, HttpResponse, Future[OutgoingConnection]] = {
      Http().outgoingConnection(host, port = targetPort)
    }
    val responseFuture: Future[HttpResponse] =
      Source.single(RequestBuilding.Post(uri = path, entity = HttpEntity(ContentTypes.`application/json`, data)))
        .via(connectionFlow)
        .runWith(Sink.head)
    responseFuture
  }
}

// Result of the request
val responseFuture: Future[HttpResponse] = httpRequest(.....)

// Convert to string
responseFuture.flatMap { response =>
  response.status match {
    case StatusCodes.OK =>
      Unmarshal(response.entity).to[String]
  }
}

// And then something like this, but with even more stupid stuff
responseFuture.onSuccess { str: String =>
  masterActor ! str.split("""\},\{""")
}
My question is: what would be a better way to get the result into an array?
How can I unmarshal the response entity directly? .to[Array[String]], for example, did not work. And because there are so many lines coming in, could I do it with a stream, to be more efficient?
Answering your questions out of order:
How can I unmarshal the response entity directly?
There is an existing question & answer related to unmarshalling an Array of case classes.
what would be a better way to get the result into an array?
I would take advantage of the Chunked nature and use streams. This allows you to do string processing and JSON parsing concurrently.
First you need a container class and parser:
case class Data(name: String, value: Int, time: Long)

object MyJsonProtocol extends DefaultJsonProtocol {
  implicit val dataFormat = jsonFormat3(Data)
}
Then you have to do some manipulations to get the JSON objects to look right:
// Drops the '[' and the ']' characters
val dropArrayMarkers =
  Flow[ByteString].map(_.filterNot(b => b == '['.toByte || b == ']'.toByte))

val prependBrace =
  Flow[String].map(s => if (!s.startsWith("{")) "{" + s else s)

val appendBrace =
  Flow[String].map(s => if (!s.endsWith("}")) s + "}" else s)

val parseJson =
  Flow[String].map(_.parseJson.convertTo[Data])
Finally, combine these Flows to convert a Source of ByteString into a Source of Data objects:
def strSourceToDataSource(source: Source[ByteString, _]): Source[Data, _] =
  source.via(dropArrayMarkers)
    .via(Framing.delimiter(ByteString("},{"), 256, true))
    .map(_.utf8String)
    .via(prependBrace)
    .via(appendBrace)
    .via(parseJson)
This Source can then be drained into a Seq of Data objects:
val dataSeq: Future[Seq[Data]] =
  responseFuture flatMap { response =>
    response.status match {
      case StatusCodes.OK =>
        strSourceToDataSource(response.entity.dataBytes).runWith(Sink.seq)
    }
  }
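If you still need a raw Array[String] for Spark, a sketch under the same assumptions (implicit Materializer and ExecutionContext in scope): stop after the framing and brace-fixing steps and collect the lines instead of parsed Data objects:
def strSourceToLines(source: Source[ByteString, _]): Future[Array[String]] =
  source.via(dropArrayMarkers)
    .via(Framing.delimiter(ByteString("},{"), 256, true))
    .map(_.utf8String)
    .via(prependBrace)
    .via(appendBrace)
    .runWith(Sink.seq)  // collects all lines; beware memory for 10,000,000 of them
    .map(_.toArray)     // Seq[String] => Array[String]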

How to download an HTTP resource to a file with Akka Streams and HTTP?

Over the past few days I have been trying to figure out the best way to download an HTTP resource to a file using Akka Streams and HTTP.
Initially I started with the Future-Based Variant and that looked something like this:
def downloadViaFutures(uri: Uri, file: File): Future[Long] = {
  val request = Get(uri)
  val responseFuture = Http().singleRequest(request)
  responseFuture.flatMap { response =>
    val source = response.entity.dataBytes
    source.runWith(FileIO.toFile(file))
  }
}
That was kind of okay but once I learnt more about pure Akka Streams I wanted to try and use the Flow-Based Variant to create a stream starting from a Source[HttpRequest]. At first this completely stumped me until I stumbled upon the flatMapConcat flow transformation. This ended up a little more verbose:
def responseOrFail[T](in: (Try[HttpResponse], T)): (HttpResponse, T) = in match {
  case (responseTry, context) => (responseTry.get, context)
}

def responseToByteSource[T](in: (HttpResponse, T)): Source[ByteString, Any] = in match {
  case (response, _) => response.entity.dataBytes
}
def downloadViaFlow(uri: Uri, file: File): Future[Long] = {
  val request = Get(uri)
  val source = Source.single((request, ()))
  val requestResponseFlow = Http().superPool[Unit]()
  source
    .via(requestResponseFlow)
    .map(responseOrFail)
    .flatMapConcat(responseToByteSource)
    .runWith(FileIO.toFile(file))
}
Then I wanted to get a little tricky and use the Content-Disposition header.
Going back to the Future-Based Variant:
def destinationFile(downloadDir: File, response: HttpResponse): File = {
  val fileName = response.header[ContentDisposition].get.value
  val file = new File(downloadDir, fileName)
  file.createNewFile()
  file
}
def downloadViaFutures2(uri: Uri, downloadDir: File): Future[Long] = {
  val request = Get(uri)
  val responseFuture = Http().singleRequest(request)
  responseFuture.flatMap { response =>
    val file = destinationFile(downloadDir, response)
    val source = response.entity.dataBytes
    source.runWith(FileIO.toFile(file))
  }
}
But now I have no idea how to do this with the Flow-Based Variant. This is as far as I got:
def responseToByteSourceWithDest[T](in: (HttpResponse, T), downloadDir: File): Source[(ByteString, File), Any] = in match {
  case (response, _) =>
    val source = responseToByteSource(in)
    val file = destinationFile(downloadDir, response)
    source.map((_, file))
}

def downloadViaFlow2(uri: Uri, downloadDir: File): Future[Long] = {
  val request = Get(uri)
  val source = Source.single((request, ()))
  val requestResponseFlow = Http().superPool[Unit]()
  val sourceWithDest: Source[(ByteString, File), Unit] = source
    .via(requestResponseFlow)
    .map(responseOrFail)
    .flatMapConcat(responseToByteSourceWithDest(_, downloadDir))
  sourceWithDest.runWith(???)
}
So now I have a Source that will emit one or more (ByteString, File) elements for each File (I say each File since there is no reason the original Source has to be a single HttpRequest).
Is there anyway to take these and route them to a dynamic Sink?
I'm thinking something like flatMapConcat, such as:
def runWithMap[T, Mat2](f: T => Graph[SinkShape[Out], Mat2])(implicit materializer: Materializer): Mat2 = ???
So that I could complete downloadViaFlow2 with:
def destToSink(destination: File): Sink[(ByteString, File), Future[Long]] = {
  val sink = FileIO.toFile(destination, true)
  Flow[(ByteString, File)].map(_._1).toMat(sink)(Keep.right)
}

sourceWithDest.runWithMap {
  case (_, file) => destToSink(file)
}
The solution does not require a flatMapConcat. If you don't need any return values from the file writing, then you can use Sink.foreach:
def writeFile(downloadDir: File)(httpResponse: HttpResponse): Future[Long] = {
  val file = destinationFile(downloadDir, httpResponse)
  httpResponse.entity.dataBytes.runWith(FileIO.toFile(file))
}

def downloadViaFlow2(uri: Uri, downloadDir: File): Future[Unit] = {
  val request = HttpRequest(uri = uri)
  val source = Source.single((request, ()))
  val requestResponseFlow = Http().superPool[Unit]()
  source.via(requestResponseFlow)
    .map(responseOrFail)
    .map(_._1)
    .runWith(Sink.foreach(writeFile(downloadDir)))
}
Note that Sink.foreach creates Futures from the writeFile function, so there's not much back-pressure involved: writeFile could be slowed down by the hard drive, but the stream would keep generating Futures. To control this you can use Flow.mapAsyncUnordered (or Flow.mapAsync):
val parallelism = 10

source.via(requestResponseFlow)
  .map(responseOrFail)
  .map(_._1)
  .mapAsyncUnordered(parallelism)(writeFile(downloadDir))
  .runWith(Sink.ignore)
If you want to accumulate the Long values for a total count, you need to combine with a Sink.fold:
source.via(requestResponseFlow)
  .map(responseOrFail)
  .map(_._1)
  .mapAsyncUnordered(parallelism)(writeFile(downloadDir))
  .runWith(Sink.fold(0L)(_ + _))
The fold will keep a running sum and emit the final value when the source of requests has dried up.
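One design note: mapAsyncUnordered emits each result as soon as its Future completes, while mapAsync preserves upstream order; for independent file downloads the unordered variant usually gives better throughput.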
Using the Play Web Services client injected in ws, remembering to import scala.concurrent.duration._:
def downloadFromUrl(url: String)(ws: WSClient): Future[Try[File]] = {
  // createTempFile takes (prefix, suffix, directory); a null suffix defaults to ".tmp".
  val file = File.createTempFile("my-prefix", null, new File("/tmp"))
  file.deleteOnExit()
  val futureResponse: Future[WSResponse] =
    ws.url(url).withMethod("GET").withRequestTimeout(5 minutes).stream()
  futureResponse.flatMap { res =>
    res.status match {
      case 200 =>
        val outputStream = java.nio.file.Files.newOutputStream(file.toPath)
        val sink = Sink.foreach[ByteString] { bytes => outputStream.write(bytes.toArray) }
        res.bodyAsSource.runWith(sink).andThen {
          case result =>
            outputStream.close()
            result.get
        } map (_ => Success(file))
      case other =>
        Future(Failure[File](new Exception("HTTP failure, response code " + other + " : " + res.statusText)))
    }
  }
}
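As a possible simplification (a sketch, not part of the original answer), the manual OutputStream plus Sink.foreach could be replaced by Akka's FileIO sink, which manages the file handle itself; this assumes an implicit Materializer and ExecutionContext in scope:
// Sketch: stream the response body straight into the file.
res.bodyAsSource
  .runWith(FileIO.toPath(file.toPath))
  .map(_ => Success(file))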