waiting for ws future response in play framework - scala

I am trying to build a service that grabs some pages from another web service, processes the content, and returns the results to users. I am using Play 2.2.3 with Scala.
val aas = WS.url("http://localhost/").withRequestTimeout(1000).withQueryString(("mid", mid), ("t", txt)).get

val result = aas.map { response =>
  (response.json \ "status").asOpt[Int].map { st =>
    status = st
  }
  (response.json \ "msg").asOpt[String].map { txt =>
    msg = txt
  }
}

val rs1 = Await.result(result, 5 seconds)
if (rs1.isDefined) {
  Ok("good")
}
The problem is that the service waits the full 5 seconds before returning "good", even if the WS request takes only 100 ms. I also cannot set the Await timeout to 100 ms, because the other web service I am requesting may take anywhere between 100 ms and 1 second to respond.
My question is: is there a way to process and serve the results as soon as they are ready, instead of waiting a fixed amount of time?

@wingedsubmariner already provided the answer. Since there was no code example, I will just post what it should be:
def wb = Action.async { request =>
  val aas = WS.url("http://localhost/").withRequestTimeout(1000).get
  aas.map { response =>
    Ok("responded")
  }
}
Now you don't need to block until the WS call responds and then decide what to do. You just tell Play what to do when the response arrives.
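For the original use case of extracting status and msg from the JSON, a minimal sketch of the same idea might look like the following (the controller name, parameters and the BadGateway fallback are illustrative assumptions, not part of the original post):

```
import play.api.libs.concurrent.Execution.Implicits.defaultContext
import play.api.libs.ws.WS
import play.api.mvc.{Action, Controller}

object ProxyController extends Controller {

  // Hypothetical async action: maps the upstream JSON fields as soon as
  // the response arrives, with no fixed Await.
  def wb(mid: String, txt: String) = Action.async { request =>
    WS.url("http://localhost/")
      .withRequestTimeout(1000)
      .withQueryString("mid" -> mid, "t" -> txt)
      .get()
      .map { response =>
        val status = (response.json \ "status").asOpt[Int]
        val msg    = (response.json \ "msg").asOpt[String]
        if (status.isDefined && msg.isDefined) Ok("good")
        else BadGateway("unexpected upstream response")
      }
  }
}
```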

Related

Akka Streams for server streaming (gRPC, Scala)

I am new to Akka Streams and gRPC. I am trying to build an endpoint where the client sends a single request and the server sends back a stream of responses.
This is my protobuf:
syntax = "proto3";
option java_multiple_files = true;
option java_package = "customer.service.proto";
service CustomerService {
rpc CreateCustomer(CustomerRequest) returns (stream CustomerResponse) {}
}
message CustomerRequest {
string customerId = 1;
string customerName = 2;
}
message CustomerResponse {
enum Status {
No_Customer = 0;
Creating_Customer = 1;
Customer_Created = 2;
}
string customerId = 1;
Status status = 2;
}
The flow I am trying to achieve: the client sends a CustomerRequest, the server first checks and responds with No_Customer, then sends Creating_Customer, and finally Customer_Created.
I have no idea where to start with the implementation; I have looked for hours but am still clueless. I would be very thankful if anyone could point me in the right direction.
The place to start is the Akka gRPC documentation and, in particular, the service walkthrough. It is pretty straightforward to get the samples working in a clean project.
The relevant server sample method is this:
override def itKeepsReplying(in: HelloRequest): Source[HelloReply, NotUsed] = {
  println(s"sayHello to ${in.name} with stream of chars...")
  Source(s"Hello, ${in.name}".toList).map(character => HelloReply(character.toString))
}
The problem is then to create a Source that returns the right results, but that depends on how you plan to implement the server, so it is difficult to answer precisely. Check the Akka Streams documentation for the various options.
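For the three-status flow described in the question, a minimal sketch of the server method could simply emit the statuses from a fixed list (the generated trait name, method signature and enum value spellings are assumptions based on the proto above and on how Akka gRPC / ScalaPB usually generate code):

```
import akka.NotUsed
import akka.stream.scaladsl.Source
import customer.service.proto._

class CustomerServiceImpl extends CustomerService {

  // Streams the three statuses back to the client in order; a real
  // implementation would emit them as the actual work progresses.
  override def createCustomer(in: CustomerRequest): Source[CustomerResponse, NotUsed] =
    Source(List(
      CustomerResponse(in.customerId, CustomerResponse.Status.No_Customer),
      CustomerResponse(in.customerId, CustomerResponse.Status.Creating_Customer),
      CustomerResponse(in.customerId, CustomerResponse.Status.Customer_Created)
    ))
}
```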
The client code is simpler: just call runForeach on the Source returned by CreateCustomer, as in the sample:
def runStreamingReplyExample(): Unit = {
  val responseStream = client.itKeepsReplying(HelloRequest("Alice"))
  val done: Future[Done] =
    responseStream.runForeach(reply => println(s"got streaming reply: ${reply.message}"))

  done.onComplete {
    case Success(_) =>
      println("streamingReply done")
    case Failure(e) =>
      println(s"Error streamingReply: $e")
  }
}

How to do some cleanup after the client closes the connection

I'm creating a proxy API using Akka that does some preparation before forwarding the request to the actual API. For one of the endpoints the response is streaming JSON data, and the client may close the connection at any time. Akka seems to handle this automatically, but the issue is that I need to do some cleanup after the client closes the connection.
path("query") {
post {
decodeRequest {
entity(as[Query]) { query =>
// proxy does some preparations
val json: String = query.prepared.toJson.toString()
// proxy sends request to actual server
val request = HttpRequest(
method = HttpMethods.POST,
uri = serverUrl + "/query",
entity = HttpEntity(ContentTypes.`application/json`, json)
)
val responseFuture = Http().singleRequest(request)
val response: HttpResponse = Await.result(responseFuture, PROXY_TIMEOUT)
// proxy forwards server's response to user
complete(response)
}
}
}
}
I've tried doing something like
responseFuture.onComplete(_ => doCleanup())
But that doesn't work, because responseFuture completes immediately even though the server continues to send data until the client closes the connection. complete(response) also returns immediately.
So I'm wondering how I can call doCleanup() only after the client has closed the connection.
Edit: The cleanup I need to do is because the proxy creates some data streams that are meant to be temporary and only persist until the last message is sent by the server. Once that happens these streams need to be deleted.
You can do it with minimal changes to your code, like this:
val responseFuture = Http().singleRequest(request)

val response: HttpResponse = try {
  Await.result(responseFuture, PROXY_TIMEOUT)
} finally {
  doCleanup()
}

complete(response)
or you can do it without blocking:
val responseFuture = Http().singleRequest(request)
val cleaned = responseFuture.andThen { case _ => doCleanUp() }
complete(cleaned) // it's possible to complete a route with a Future
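Both variants above run the cleanup when the upstream response (its headers) arrives. If, as in the question's edit, the cleanup must wait until the streamed entity itself finishes (the last message has been sent or the client has disconnected), one possible approach is to watch the termination of the entity's data stream. This is a sketch of that idea, not the answer's method; doCleanup() and the surrounding route are taken from the question, and an implicit ExecutionContext is assumed to be in scope:

```
import akka.stream.scaladsl.Flow
import akka.util.ByteString

val responseFuture = Http().singleRequest(request)

// Attach a watchTermination stage to the response entity's byte stream so
// doCleanup() runs once the stream completes or fails (e.g. client disconnect).
val cleanedUp = responseFuture.map { response =>
  response.mapEntity { entity =>
    entity.transformDataBytes(
      Flow[ByteString].watchTermination() { (mat, done) =>
        done.onComplete(_ => doCleanup())
        mat
      }
    )
  }
}

complete(cleanedUp)
```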

Alpakka S3 connector stream won't handle the load, throwing akka.stream.BufferOverflowException

I have an akka-http service and I am trying out the Alpakka S3 connector for uploading files. Previously I was writing to a temporary file and then uploading with the Amazon SDK. This approach required some adjustments to the Amazon SDK to make it more Scala-like, but it could handle even 1000 requests at once. Throughput wasn't amazing, but all of the requests went through eventually. Here is the code before the changes, with no Alpakka:
```
path("uploadfile") {
withRequestTimeout(20.seconds) {
storeUploadedFile("csv", tempDestination) {
case (metadata, file) =>
val uploadFuture = upload(file, file.toPath.getFileName.toString)
onComplete(uploadFuture) {
case Success(_) => complete(StatusCodes.OK)
case Failure(_) => complete(StatusCodes.FailedDependency)
}
}
}
}
}
case class S3UploaderException(msg: String) extends Exception(msg)
def upload(file: File, key: String): Future[String] = {
val s3Client = AmazonS3ClientBuilder.standard()
.withCredentials(new DefaultAWSCredentialsProviderChain())
.withRegion(Regions.EU_WEST_3)
.build()
val promise = Promise[String]()
val listener = new ProgressListener() {
override def progressChanged(progressEvent: ProgressEvent): Unit = {
(progressEvent.getEventType: #unchecked) match {
case ProgressEventType.TRANSFER_FAILED_EVENT => promise.failure(S3UploaderException(s"Uploading a file with a key: $key"))
case ProgressEventType.TRANSFER_COMPLETED_EVENT |
ProgressEventType.TRANSFER_CANCELED_EVENT => promise.success(key)
}
}
}
val request = new PutObjectRequest("S3_BUCKET", key, file)
request.setGeneralProgressListener(listener)
s3Client.putObject(request)
promise.future
}
```
When I changed this to use the Alpakka connector, the code looks much nicer as we can just connect the ByteString source and the Alpakka sink together. However, this approach cannot handle such a big load. When I execute 1000 requests at once (10 kB files), less than 10% go through and the rest fail with this exception:
akka.stream.alpakka.s3.impl.FailedUpload: Exceeded configured max-open-requests value of [32].
This means that the request queue of this pool
(HostConnectionPoolSetup(bargain-test.s3-eu-west-3.amazonaws.com,443,ConnectionPoolSetup(ConnectionPoolSettings(4,0,5,32,1,30 seconds,ClientConnectionSettings(Some(User-Agent: akka-http/10.1.3),10 seconds,1 minute,512,None,WebSocketSettings(,ping,Duration.Inf,akka.http.impl.settings.WebSocketSettingsImpl$$$Lambda$4787/1279590204@4d809f4c),List(),ParserSettings(2048,16,64,64,8192,64,8388608,256,1048576,Strict,RFC6265,true,Set(),Full,Error,Map(If-Range -> 0, If-Modified-Since -> 0, If-Unmodified-Since -> 0, default -> 12, Content-MD5 -> 0, Date -> 0, If-Match -> 0, If-None-Match -> 0, User-Agent -> 32),false,true,akka.util.ConstantFun$$$Lambda$4534/1539966798@69c23cd4,akka.util.ConstantFun$$$Lambda$4534/1539966798@69c23cd4,akka.util.ConstantFun$$$Lambda$4535/297570074@6b426c59),None,TCPTransport),New,1 second),akka.http.scaladsl.HttpsConnectionContext@7e0f3726,akka.event.MarkerLoggingAdapter@74f3a78b)))
has completely filled up because the pool currently does not process requests fast enough to handle the incoming request load.
Please retry the request later. See http://doc.akka.io/docs/akka-http/current/scala/http/client-side/pool-overflow.html for more information.
Here is what the summary of a Gatling test looks like:
---- Response Time Distribution ----------------------------------------
t < 800 ms                        0 (  0%)
800 ms < t < 1200 ms              0 (  0%)
t > 1200 ms                      90 (  9%)
failed                          910 ( 91%)
When I execute 100 simultaneous requests, half of them fail. So it is still far from satisfactory.
This is the new code:
```
path("uploadfile") {
withRequestTimeout(20.seconds) {
extractRequestContext { ctx =>
implicit val materializer = ctx.materializer
extractActorSystem { actorSystem =>
fileUpload("csv") {
case (metadata, byteSource) =>
val uploadFuture = byteSource.runWith(S3Uploader.sink("s3FileKey")(actorSystem, materializer))
onComplete(uploadFuture) {
case Success(_) => complete(StatusCodes.OK)
case Failure(_) => complete(StatusCodes.FailedDependency)
}
}
}
}
}
}
def sink(s3Key: String)(implicit as: ActorSystem, m: Materializer) = {
val regionProvider = new AwsRegionProvider {
def getRegion: String = Regions.EU_WEST_3.getName
}
val settings = new S3Settings(MemoryBufferType, None, new DefaultAWSCredentialsProviderChain(), regionProvider, false, None, ListBucketVersion2)
val s3Client = new S3Client(settings)(as, m)
s3Client.multipartUpload("S3_BUCKET", s3Key)
}
```
The complete code with both endpoints can be seen here.
I have a couple of questions:
1) Is this a feature? Is this what we call backpressure?
2) If I would like this code to behave like the old approach with a temporary file (no failed requests, and all of them finish at some point), what do I have to do? I tried implementing a queue for the stream (link to the source below), but it made no difference. The code can be seen here.
(*DISCLAIMER* I am still a Scala newbie trying to quickly understand Akka Streams and find a workaround for the issue. There is a big chance that something simple is wrong in this code. *DISCLAIMER*)
It’s a backpressure feature.
"Exceeded configured max-open-requests value of [32]": in the config, max-open-requests is set to 32 by default.
Streaming is meant for working with large amounts of data, not for handling many requests per second.
The Akka developers had to pick some value for max-open-requests, and they chose 32 for some reason. They could not know what it would be used for: sending 1000 32 KB files at once, or 1000 1 GB files? They don't know, but they still want to make sure that by default (and probably 80% of people use the defaults) applications behave gracefully and safely, so they had to limit the processing power.
You asked it to do 1000 uploads "now", but I am pretty sure the AWS SDK did not send 1000 files simultaneously either; it used some kind of queue, which may be a good approach for you too if you have many small files to upload.
But it is perfectly fine to tune it to your case! If you know your machine and the target can take care of more simultaneous connections, you can raise the number to a higher value.
Also, for a lot of HTTP calls, use a cached host connection pool.
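As a concrete illustration of the tuning mentioned above, the relevant akka-http client settings live under akka.http.host-connection-pool; the numbers below are only an example, not a recommendation:

```
# application.conf (sketch) -- raise the client pool limits if the machine
# and the S3 endpoint can handle more concurrent requests
akka.http.host-connection-pool {
  max-connections   = 32   # parallel connections per host (default 4)
  max-open-requests = 256  # must be a power of 2 (default 32)
}
```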

Play framework Scala run job in background

Is there any way I can trigger a job from the controller (without waiting for its completion) and display a message to the user that the job will be running in the background?
I have one controller method which takes quite a long time to run, so I want it to run in the background and display a message to the user saying that it will be running in the background.
I tried Action.async as shown below, but processing the Future object still takes too long and the request times out.
def submit(id: Int) = Action.async(parse.multipartFormData) { implicit request =>
  val result = Future {
    // process the data
  }
  result map { res =>
    Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be running in background."))
  }
}
You can also return a result without waiting for the result of the future, in a "fire and forget" way:
def submit(id: Int) = Action(parse.multipartFormData) { implicit request =>
  // runs on another thread; requires an implicit ExecutionContext in scope
  Future {
    // process the data
  }
  Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be running in background."))
}
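One caveat with the fire-and-forget variant (my note, not part of the original answer): if the background Future fails, nothing reports it, so it is worth at least logging completion. A small sketch, where processData() stands in for the real work:

```
import play.api.Logger
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

Future {
  processData() // hypothetical long-running job
}.onComplete {
  case Success(_) => Logger.info("Background job finished")
  case Failure(e) => Logger.error("Background job failed", e)
}
```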
The docs state:
By giving a Future[Result] instead of a normal Result, we are able to quickly generate the result without blocking. Play will then serve the result as soon as the promise is redeemed.
The web client will be blocked while waiting for the response, but nothing will be blocked on the server, and server resources can be used to serve other clients.
You can configure your client code to use an AJAX request and display a "Waiting for data" message for that part of the page without blocking the rest of the page from loading.
I also tried the Futures.timeout option. It seems to work fine, but I'm not sure whether it's the correct way to do it.
result.withTimeout(20.seconds)(futures).map { res =>
  Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be updated in background."))
}.recover {
  case e: scala.concurrent.TimeoutException =>
    Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be updated in background."))
}

continuously fetch database results with scalaz.stream

I'm new to Scala and extremely new to scalaz. Through a different Stack Overflow answer and some handholding, I was able to use scalaz.stream to implement a Process that continuously fetches Twitter API results. Now I'd like to do the same thing for the Cassandra DB where the Twitter handles are stored.
The code for fetching the Twitter results is here:
def urls: Seq[(Handle, URL)] = {
  Await.result(
    getAll(connection).map { List =>
      List.map(twitterToGet =>
        (twitterToGet.handle, urlBoilerPlate + twitterToGet.handle + parameters + twitterToGet.sinceID)
      )
    },
    5 seconds)
}
val fetchUrl = channel.lift[Task, (Handle, URL), Fetched] { url =>
  Task.delay {
    val finalResult = callTwitter(url)
    if (finalResult.tweets.nonEmpty) {
      connection.updateTwitter(finalResult)
    } else {
      println("\n" + finalResult.handle + " does not have new tweets")
    }
    s"\ntwitter Fetch & database update completed"
  }
}
val P = Process

val process =
  (time.awakeEvery(3.second) zipWith P.emitAll(urls))((b, url) => url)
    .through(fetchUrl)

val fetched = process.runLog.run
fetched.foreach(println)
What I'm planning to do is use
def urls: Seq[(Handle,URL)] = {
to continuously fetch Cassandra results (with an awakeEvery) and send them off to an actor to run the Twitter-fetching code above.
My question is: what is the best way to implement this with scalaz.stream? Note that I'd like it to get ALL the database results, then have a delay before getting ALL the database results again. Should I use the same architecture as the Twitter-fetching code above? If so, how would I create a channel.lift that doesn't require input? Is there a better way in scalaz.stream?
Thanks in advance
Got this working today. The cleanest way to do it would be to emit the database results as a stream and attach a sink to the end of the stream to do the Twitter processing. What I actually have is a bit more complex, as it retrieves the database results continuously and sends them off to an actor for the Twitter processing. The style of retrieving the results follows the original code from my question:
val connection = new simpleClient(conf.getString("cassandra.node"))
implicit val threadPool = new ScheduledThreadPoolExecutor(4)
val system = ActorSystem("mySystem")
val twitterFetch = system.actorOf(Props[TwitterFetch], "twitterFetch")

def myEffect = channel.lift[Task, simpleClient, String] { connection: simpleClient =>
  Task.delay {
    val results = Await.result(
      getAll(connection).map { List =>
        List.map(twitterToGet =>
          (twitterToGet.handle, urlBoilerPlate + twitterToGet.handle + parameters + twitterToGet.sinceID)
        )
      },
      5 seconds)

    println("Query Successful, results= " + results + " at " + format.print(System.currentTimeMillis()))
    twitterFetch ! fetched(connection, results)
    s"database fetch completed"
  }
}
val P = Process

val process =
  time.awakeEvery(3.second).flatMap(_ => P.emit(connection).through(myEffect))

val fetching = process.runLog.run
fetching.foreach(println)
Some notes:
I had asked about using channel.lift without input, but it became clear that the input should be the Cassandra connection.
The line
val process =
  time.awakeEvery(3.second).flatMap(_ => P.emit(connection).through(myEffect))
changed from zipWith to flatMap because I wanted to retrieve the results continuously instead of only once.