Spray: Bringing RequestContext into scope results in a timeout - scala

Hi scala and spray people!
I have a small annoying issue with extracting the HTTP 'Accept' header from the RequestContext and matching on it. On a normal route like so:
get {
  respondWithMediaType(`text/plain`) {
    complete("Hello World!")
  }
}
It works like a charm. But whenever I bring the context into scope like so (as suggested in the documentation for directives):
get { context => {
  respondWithMediaType(`text/plain`) {
    complete("Hello World!")
  }
} }
The result becomes the following error message:
The server was not able to produce a timely response to your request.
I am fairly new to Spray, but it looks really odd to me that bringing an (otherwise implicit) object into scope can have such a weird side effect. Does anyone have a clue about what is going on?

Direct access to the RequestContext is rarely needed. In fact, you only need it if you want to write custom directives. Common tasks and extracting the usual bits of data can normally be handled using one of the predefined directives.
It seems what you want to do is manual content type negotiation. In fact, you don't have to do it manually, as spray does content negotiation automatically for common data structures. Your example can be shortened to
get {
  complete("Hello World!")
}
When complete is called with a string, the response will always be of type text/plain. If the client sends a request with an Accept header that doesn't accept text/plain, the request is rejected by the server.
If you want to customize the content types that can be produced from a Scala data type, you need to provide a custom Marshaller. See the documentation on how to achieve that.
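For illustration, a minimal sketch of such a custom Marshaller in spray 1.x, assuming a hypothetical Greeting case class (Marshaller.of and MarshallingContext come from spray.httpx.marshalling):

import spray.http._
import spray.httpx.marshalling.Marshaller

case class Greeting(msg: String)

// Marshal Greeting values as text/plain; swap in another ContentType to change
// what the route can produce for this type.
implicit val greetingMarshaller: Marshaller[Greeting] =
  Marshaller.of[Greeting](ContentTypes.`text/plain`) { (greeting, contentType, ctx) =>
    ctx.marshalTo(HttpEntity(contentType, greeting.msg))
  }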
Answering your original question of why adding context => makes the request time out: the route built with the predefined directives is already of type RequestContext => Unit. So, writing
respondWithMediaType(`text/plain`) {
  complete("Hello World!")
}
is exactly equivalent to (i.e. automatically expanded to)
ctx => respondWithMediaType(`text/plain`) {
  complete("Hello World!")
}.apply(ctx)
So, if you only add ctx => manually but don't add the apply call, an incoming request is never fed into the inner route and therefore never completed. The compiler doesn't catch this kind of error because the type of a route is RequestContext => Unit, so the variants with and without the apply invocation are both valid. We are going to improve this in the future.
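For comparison, a sketch of what the question's route would look like with the context brought into scope and the apply call added explicitly (not needed in practice, since the predefined directives do this for you):

get { ctx =>
  respondWithMediaType(`text/plain`) {
    complete("Hello World!")
  }.apply(ctx)  // explicitly feed the request into the inner route
}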
See the documentation for more info about how routes are built.
Finally, if you need to extract a header or its value you can use one of the predefined HeaderDirectives that simplify working with request headers a lot.
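As an illustration, a sketch assuming spray.routing's headerValueByName directive (use optionalHeaderValueByName if the header may be absent):

get {
  headerValueByName("Accept") { accept =>
    // The raw value of the Accept header is now available as a plain String
    complete(s"The client accepts: $accept")
  }
}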

Related

Configure gatling base url from initial http request

I'm completely new to Scala and Gatling, please forgive the basic question!
I want to create an http protocol with baseUrl specified by the result of an initial http request. Or in other words:
1. Get remote config as JSON, let's say from https://example.com/config.json
2. Parse the JSON and retrieve a specified property, endpoint
3. Pass that value to http.baseUrl()
I can do this on every single scenario manually but this quickly becomes tedious (and is unnecessarily repetitive). I'd like to find a solution where I can perform this setup once at the beginning of the test run.
My instinct is to go for something like this:
object Environment {
  val config = "https://example.com/config.json"
}

val httpProtocol = http("config")
  .get(Environment.config)
  .check(
    jsonPath("$.endpoint").saveAs("endpoint")
  )
  .baseUrl("${endpoint}")

// ... and then later on
setUp(
  // scenario.inject()…
).protocols(httpProtocol)
... but that doesn't compile.
Thanks very much for any help.
What you're proposing won't work.
.protocols takes an HttpProtocolBuilder (documented on the gatling site), whereas you're trying to pass an HttpRequestBuilder.
Furthermore, the baseUrl parameter of HttpProtocolBuilder only takes a string, so you won't be able to pass a gatling session value into it.
The only way I can think of to do this would be to make the request that returns the base URL in the 'before' block, but you won't be able to use Gatling DSL constructs to make that request: you'll have to do it with raw Scala.
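As a rough illustration of the "raw Scala" route, here is a sketch only: it fetches the config eagerly in the simulation's constructor (rather than in the before hook) so the value exists by the time the protocol is built. The class name, the regex-based JSON extraction, and the scenario are assumptions for the example, not Gatling APIs:

import scala.io.Source
import io.gatling.core.Predef._
import io.gatling.http.Predef._

class ConfigDrivenSimulation extends Simulation {

  // Plain Scala, no Gatling DSL: fetch the remote config while the class is constructed
  val configJson: String = Source.fromURL("https://example.com/config.json").mkString

  // Naive extraction of the "endpoint" property; a real JSON parser would be safer
  val endpoint: String = """"endpoint"\s*:\s*"([^"]+)"""".r
    .findFirstMatchIn(configJson)
    .map(_.group(1))
    .getOrElse(sys.error("endpoint not found in config"))

  // baseUrl gets an ordinary String, which is all HttpProtocolBuilder accepts
  val httpProtocol = http.baseUrl(endpoint)

  val scn = scenario("example").exec(http("ping").get("/ping"))

  setUp(scn.inject(atOnceUsers(1))).protocols(httpProtocol)
}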

Akka Streams - Backpressure for Source.unfoldAsync

I'm currently trying to read a paginated HTTP resource. Each page is a Multipart document, and the response for a page includes a next link in the headers if there is a page with more content. An automated parser can then start at the oldest page and read page by page, using the headers to construct the request for the next page.
I'm using Akka Streams and Akka Http for the implementation, because my goal is to create a streaming solution. I came up with this (I will include only the relevant parts of the code here, feel free to have a look at this gist for the whole code):
def read(request: HttpRequest): Source[HttpResponse, _] =
  Source.unfoldAsync[Option[HttpRequest], HttpResponse](Some(request))(Crawl.crawl)

val parse: Flow[HttpResponse, General.BodyPart, _] = Flow[HttpResponse]
  .flatMapConcat(r => Source.fromFuture(Unmarshal(r).to[Multipart.General]))
  .flatMapConcat(_.parts)

....
def crawl(reqOption: Option[HttpRequest]): Future[Option[(Option[HttpRequest], HttpResponse)]] = reqOption match {
  case Some(req) =>
    Http().singleRequest(req).map { response =>
      if (response.status.isFailure()) Some((None, response))
      else nextRequest(response, HttpMethods.GET)
    }
  case None => Future.successful(None)
}
So the general idea is to use Source.unfoldAsync to crawl through the pages and to do the HTTP requests (the idea and implementation are very close to what's described in this answer). This will create a Source[HttpResponse, _] that can then be consumed (unmarshalled to Multipart, split up into the individual parts, ...).
My problem now is that the consumption of the HttpResponses might take a while (unmarshalling takes some time if the pages are large, and maybe there will be some database requests at the end to persist some data, ...). So I would like Source.unfoldAsync to backpressure if the downstream is slower. By default, the next HTTP request is started as soon as the previous one has finished.
So my question is: Is there some way to make Source.unfoldAsync backpressure on a slow downstream? If not, is there an alternative that makes backpressuring possible?
I can imagine a solution that makes use of the Host-Level Client-Side API that akka-http provides, as described here, together with a cyclic graph where the response of the first request is used as input to generate the second request, but I haven't tried that yet and I'm not sure whether this would work.
EDIT: After some days of playing around and reading the docs and some blogs, I'm not sure if I was on the right track with my assumption that the backpressure behavior of Source.unfoldAsync is the root cause. To add some more observations:
When the stream is started, I see several requests going out. This is no problem in the first place, as long as the resulting HttpResponse is consumed in a timely fashion (see here for a description)
If I don't change the default response-entity-subscription-timeout, I will run into the following error (I stripped out the URLs):
[WARN] [03/30/2019 13:44:58.984] [default-akka.actor.default-dispatcher-16] [default/Pool(shared->http://....)] [1 (WaitingForResponseEntitySubscription)] Response entity was not subscribed after 1 seconds. Make sure to read the response entity body or call discardBytes() on it. GET ... Empty -> 200 OK Chunked
This leads to an IllegalStateException that terminates the stream: java.lang.IllegalStateException: Substream Source cannot be materialized more than once
I observed that the unmarshalling of the response is the slowest part of the stream, which might make sense because the response body is a Multipart document and therefore relatively large. However, I would expect this part of the stream to signal less demand to the upstream (which is the Source.unfoldAsync part in my case), which should lead to fewer requests being made.
Some googling led me to a discussion about an issue that seems to describe a similar problem. They also discuss the problems that occur when a response is not processed fast enough. The associated merge request will bring documentation changes that propose to completely consume the HttpResponse before continuing with the stream. In the discussion of the issue there are also doubts about whether it's a good idea to combine Akka Http with Akka Streams. So maybe I would have to change the implementation to do the unmarshalling directly inside the function that's called by unfoldAsync.
According to the implementation of Source.unfoldAsync, the passed-in function is only called when the source is pulled:
def onPull(): Unit = f(state).onComplete(asyncHandler)(akka.dispatch.ExecutionContexts.sameThreadExecutionContext)
So if the downstream is not pulling (i.e. it is backpressuring), the function passed to the source is not called.
In your gist you use runForeach (which is the same as runWith(Sink.foreach)), which pulls the upstream as soon as the println is finished. So it is hard to notice backpressure here.
Try changing your example to runWith(Sink.queue), which will give you a SinkQueueWithCancel as the materialized value. Then, unless you call pull on the queue, the stream will be backpressured and will not issue requests.
Note that there could be one or more initial requests until the backpressure propagates through all of the stream.
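A sketch of that suggestion, assuming the read and parse stages from the question's gist are in scope (initialRequest is a placeholder name):

import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, SinkQueueWithCancel}
import akka.http.scaladsl.model.Multipart

implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()
import system.dispatcher

// Nothing flows (beyond an initial prefetch) until pull() is called on the queue,
// so the upstream unfoldAsync stays backpressured.
val queue: SinkQueueWithCancel[Multipart.General.BodyPart] =
  read(initialRequest).via(parse).runWith(Sink.queue())

queue.pull().foreach {
  case Some(part) => println(s"got a body part with content type ${part.entity.contentType}")
  case None       => println("stream completed")
}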
I think I figured it out. As I already mentioned in the edit of my question, I found this comment to an issue in Akka HTTP, where the author says:
...it is simply not best practice to mix Akka http into a larger processing stream. Instead, you need a boundary around the Akka http parts of the stream that ensures they always consume their response before allowing the outer processing stream to proceed.
So I went ahead and tried it: instead of doing the HTTP request and the unmarshalling in different stages of the stream, I directly unmarshal the response by flatMapping the Future[HttpResponse] into a Future[Multipart.General]. This makes sure that the HttpResponse is directly consumed and avoids the Response entity was not subscribed after 1 second errors. The crawl function looks slightly different now, because it has to return the unmarshalled Multipart.General object (for further processing) as well as the original HttpResponse (to be able to construct the next request out of the headers):
def crawl(reqOption: Option[HttpRequest])(implicit actorSystem: ActorSystem, materializer: Materializer, executionContext: ExecutionContext): Future[Option[(Option[HttpRequest], (HttpResponse, Multipart.General))]] = {
  reqOption match {
    case Some(request) =>
      Http().singleRequest(request)
        .flatMap(response => Unmarshal(response).to[Multipart.General].map(multipart => (response, multipart)))
        .map {
          case tuple @ (response, multipart) =>
            if (response.status.isFailure()) Some((None, tuple))
            else nextRequest(response, HttpMethods.GET).map { case (req, res) => (req, (res, multipart)) }
        }
    case None => Future.successful(None)
  }
}
The rest of the code has to change because of that. I created another gist that contains code equivalent to the gist from the original question.
I was expecting the two Akka projects to integrate better (the docs don't mention this limitation at the moment, and instead the HTTP API seems to encourage the user to use Akka HTTP and Akka Streams together), so this feels a bit like a workaround, but it solves my problem for now. I still have to figure out some other problems I encounter when integrating this part into my larger use case, but this is not part of this question here.

How to programmatically call Route in Akka Http

In Akka Http, it is possible to define the route system to manage a REST infrastructure in this way, as stated here: https://doc.akka.io/docs/akka-http/current/routing-dsl/overview.html
val route =
  get {
    pathSingleSlash {
      complete(HttpEntity(ContentTypes.`text/html(UTF-8)`, "<html><body>Hello world!</body></html>"))
    } ~
    path("ping") {
      complete("PONG!")
    } ~
    path("crash") {
      sys.error("BOOM!")
    }
  }
Is there a way to programmatically invoke one of the routes inside the same application, in a way that could be similar to the following statement?
val response = (new Invoker(route = route, method = "GET", url = "/ping", body = null)).Invoke()
where response would be the same result as that of a remote HTTP call to the service?
The aforementioned API is only meant to give an idea of what I have in mind; I would expect the ability to set the content type, headers, and so on.
In the end I managed to find the answer to my own question by digging a bit more into the Akka HTTP documentation.
As stated here: https://doc.akka.io/docs/akka-http/current/routing-dsl/routes.html, the Route is a type defined as follows:
type Route = RequestContext => Future[RouteResult]
where RequestContext is a wrapper for the HttpRequest. But it is also true that a Route can be converted, implicitly or explicitly, to other function types, like this:
def asyncHandler(route: Route)(...): HttpRequest ⇒ Future[HttpResponse]
Hence, it is indeed possible to "call" a route by converting it to another function type and then simply passing an HttpRequest built ad hoc, receiving a Future containing the desired response. The conversion takes a little more time than the rest of the operations, but it's something that can be done while bootstrapping the application.
Note: the conversion requires these implicit values in scope, as stated here: https://doc.akka.io/docs/akka-http/current/introduction.html
implicit val system = ActorSystem("my-system")
implicit val materializer = ActorMaterializer()
implicit val executionContext = system.dispatcher
But these implicits are already required to create the service itself.
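To make the conversion concrete, a minimal sketch of the approach (assuming the route from the question is in scope; whether Route.asyncHandler needs further implicits can vary with the Akka HTTP version):

import akka.actor.ActorSystem
import akka.http.scaladsl.model._
import akka.http.scaladsl.server.Route
import akka.stream.ActorMaterializer
import scala.concurrent.Future

implicit val system = ActorSystem("my-system")
implicit val materializer = ActorMaterializer()
implicit val executionContext = system.dispatcher

// Convert the Route once at startup into a plain function ...
val handler: HttpRequest => Future[HttpResponse] = Route.asyncHandler(route)

// ... and invoke it with an ad-hoc HttpRequest, like an in-process HTTP call
val response: Future[HttpResponse] =
  handler(HttpRequest(method = HttpMethods.GET, uri = Uri("/ping")))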
If this is for unit tests, you can use akka-http's test kit.
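For example, a sketch with akka-http's route TestKit and ScalaTest, assuming the "ping" route from the question:

import akka.http.scaladsl.testkit.ScalatestRouteTest
import org.scalatest.matchers.should.Matchers
import org.scalatest.wordspec.AnyWordSpec

class PingRouteSpec extends AnyWordSpec with Matchers with ScalatestRouteTest {
  "the route" should {
    "answer GET /ping with PONG!" in {
      // The request is run against the route in-process, no server needed
      Get("/ping") ~> route ~> check {
        responseAs[String] shouldEqual "PONG!"
      }
    }
  }
}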
If this is for the application itself, you should not go through the route, you should just invoke the relevant services that the controller would use directly. If that is inconvenient (too much copy-pasta), refactor until it becomes possible.
As for the reason: I want my application to be wrapped inside both a web server (which then uses the route the “normal” way) and a daemon that responds to inbound messages from a message broker.
I have an application that does something like that actually.
But I came at this from the other way: I consider the broker message to be the "primary" format. It is "routed" inside of the consumer based purely on properties of the message itself (body contents, message key, topic name). The HTTP gateway is built on top of that: It has only a very limited number of API endpoints and routes (mostly for caller convenience, might as well have just a single one) and constructs a message that it then passes off to the message consumer (in my case, via the broker actually, so that the HTTP gateway does not even have to be on the same host as the consumer).
As a result, I don't have to "re-use" the HTTP route because that does not really do anything. All the shared processing logic happens at the lower level (inside the service, inside the consumer).

How to write unit test when you use Future?

I've written a class with some functions that do HTTP calls and return a Future[String]. I use those functions inside a method that I need to write some tests for:
def score(rawEvent: Json) = {
  httpService
    .get("name", formatJsonAttribute(rawEvent.name))
    .onComplete { op =>
      op.map { json =>
        //What must be tested
      }
    }
}
The function onComplete doesn't have a useful return type - it returns Unit. How can I replace that onComplete to make my function return something that can be tested?
I completely agree with @Michal that you should always prefer map to onComplete with Futures. However, I'd like to point out that, as you said yourself, what you wish to test is not the HTTP call itself (which relies on an HTTP client you probably don't need to test, a response from a server you may have no control over, ...), but what you do with its answer.
So why not write a test, not on the function score, but on the function you wrote in your onComplete (or map, if you decided to change it)?
That way you will be able to test it with precise values for json, values that you may define to match the result you would get from the server but that you can control completely (for instance, you could test border cases without forcing your server to give unusual responses).
Testing that the two (HTTP call and callback function) sit well together is not a unit-test question, but an integration-test question, and should be done only once you know that your function does what is expected of it.
At that time, you will effectively need to check the value of a Future, in which case you can use Await.result as @Michal suggested, or use the relevant constructs that your test framework provides. For instance, ScalaTest has an AsyncTestSuite trait for this kind of issue.
Use map instead of onComplete. It will also provide you with the resolved value inside the mapping function. The return type of the score function will be Future[T], where T is the result type of your processing.
In the tests you can use scala.concurrent.Await.result() function.
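A sketch of what both suggestions could look like together (toScore, Score, someRawEvent and expectedScore are hypothetical names for the logic under test; httpService and formatJsonAttribute are the ones from the question):

// Production code: map instead of onComplete, so score returns a Future
def score(rawEvent: Json): Future[Score] =
  httpService
    .get("name", formatJsonAttribute(rawEvent.name))
    .map(json => toScore(json)) // toScore is the pure function you unit-test separately

// Test code: resolve the Future with Await.result (or your framework's async support)
import scala.concurrent.Await
import scala.concurrent.duration._

val result: Score = Await.result(score(someRawEvent), 5.seconds)
assert(result == expectedScore)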

How do I return a File from an Akka Worker to Spray Routing

I have an akka actor that generates a bunch of files, zips them up, and returns that zip file. I then want to pass this file back to the complete method of spray routing to allow downloading the zip file.
I would really like to be able to use getFromFile, but I can't seem to figure out a way to use it with the returned future. I have also tried passing the file back to the complete function, but I get a very cryptic error.
How do I chain futures to get the result I want to be downloaded in the browser?
Here is some of the code I am working with
get {
  path("test") {
    respondWithMediaType(`application/zip`) {
      complete {
        val pi = new PackageInfo("Test")
        doCreate(pi)
      }
    }
  }
}
def doCreate(pi: PackageInfo) = {
  val response = (worker ? Create(pi))
    .mapTo[Ok]
    .map(result => result.zipFile: File)
    .recover { case _ => "error" }
  response
}
One of the many errors received while trying different things:
could not find implicit value for parameter marshaller: spray.httpx.marshalling.ToResponseMarshaller[scala.concurrent.Future[Comparable[_ >: java.io.File with String <: Comparable[_ >: java.io.File with String <: java.io.Serializable] with java.io.Serializable] with java.io.Serializable]]
There are a few things to address here.
One is that after creating that file and sending it to the client you probably want to delete it. If you simply send it with getFromFile (doc) then you will not know when the client finished downloading the file and won't be able to delete that file. Here is how you can implement that functionality: Is it possible to install a callback after request processing is finished in Spray?.
The error about the missing implicit means that Spray needs to know how to marshal your data back to the client. By default it can only marshal basic types like Strings, but you can implement your own custom marshallers. If you use the approach described in the link I provided, then you don't need to worry about that.
Finally, that answer also provides a chunked response, which is useful if your files are large.
If you decide to go with getFromFile look at this answer: Spray send xls file to client. I think that's exactly what you need.
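A possible sketch tying the pieces together, assuming doCreate is changed to return Future[File] (i.e. the recover-to-String case is dropped); onSuccess and getFromFile are spray's predefined directives:

get {
  path("test") {
    // Wait for the worker's Future and hand the resulting file to getFromFile,
    // which takes care of marshalling the file into the response
    onSuccess(doCreate(new PackageInfo("Test"))) { zipFile =>
      respondWithMediaType(`application/zip`) {
        getFromFile(zipFile)
      }
    }
  }
}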