Play framework action response delayed when creating multiple futures - scala

The following action should return a response immediately after the URL is hit, but instead the response is only sent once all the Future blocks have started. "Returning from action" is logged immediately after the request arrives, yet the response is not sent until "Starting for group 10" has been logged.
def test = Action { implicit request =>
  Future.successful(0 to 150).foreach { items =>
    items.grouped(15).zipWithIndex.foreach { itemGroupWithIndex =>
      val (itemGroup, index) = itemGroupWithIndex
      Future {
        logger.info("************** Starting for group " + index)
        itemGroup.foreach { item =>
          Thread.sleep(1000)
          logger.info("Completed for item " + item)
        }
      }
    }
  }
  logger.info("************** Returning from action **************")
  Ok(views.html.test("test page"))
}
I cannot understand the reason behind this delay. How can I make this action send its response immediately?
Play framework version 2.5.9

Your Action is not asynchronous. You have a synchronous endpoint, which is why you see "Returning from action" printed immediately on the console. You should probably use Action.async instead. Asynchronous actions can improve the overall performance of your application considerably and are recommended when building high-throughput, responsive web applications.

Two points in your code need to change:
1. Asynchronous action: because you are using Future, the action should be asynchronous: Action.async { ... }.
2. No blocking code: the whole point of Future and asynchronous programming is to avoid code that blocks execution, so I suggest removing the Thread.sleep(1000) call.
Note that if you write your code in a non-blocking way, the action performs the remaining steps, such as logging or rendering the view, as soon as the result becomes available.
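If the blocking work cannot be removed, one common way to keep it from occupying the threads that serve responses is to run it on a separate thread pool. The following is only a minimal sketch using the Scala standard library, not Play-specific code: the object name `BlockingPoolSketch`, the pool size of 4, and the short `Thread.sleep(10)` stand-in are all assumptions made for illustration.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

object BlockingPoolSketch {
  // Hypothetical dedicated pool for blocking work; the size 4 is an arbitrary assumption.
  private val pool = Executors.newFixedThreadPool(4)
  val blockingEc: ExecutionContext = ExecutionContext.fromExecutorService(pool)

  // Starts one background Future per group on the dedicated pool and
  // returns the list of Futures without waiting for any of them.
  def startGroups(items: Range, groupSize: Int): List[Future[Int]] =
    items.grouped(groupSize).toList.map { group =>
      Future {
        group.foreach(_ => Thread.sleep(10)) // stand-in for the blocking work
        group.size
      }(blockingEc)
    }

  // The pool uses non-daemon threads, so it has to be shut down explicitly.
  def shutdown(): Unit = pool.shutdown()
}
```

The caller gets the list of Futures back right away and can return its own result, while the sleeping threads belong to `blockingEc` rather than to the execution context that produces the response.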

This is because there are race conditions in your Futures.
You need to ensure that you are returning a single Future[T] and not a Future[Future[T]] (or any further nesting).
If the Futures are independent of each other, use Future.sequence.
Example:
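import scala.concurrent.Future
import play.api.libs.concurrent.Execution.Implicits.defaultContext // Play 2.5 default execution context

def future: Future[String] = Future.successful("hi")

def action = Action.async { _ =>
  // `future` is a parameterless def, so it is called without parentheses
  val futures: Seq[Future[String]] = (1 to 50).map(_ => future)
  // Collapse Seq[Future[String]] into one Future[Seq[String]], then map it to a Result
  Future.sequence(futures).map(results => Ok(results.mkString(", ")))
}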
This will avoid race conditions.
FYI, this has nothing to do with the Play framework. This is concurrent programming in Scala.

Related

Wrapping Pub-Sub Java API in Akka Streams Custom Graph Stage

I am working with a Java API from a data vendor providing real time streams. I would like to process this stream using Akka streams.
The Java API has a pub sub design and roughly works like this:
Subscription sub = createSubscription();
sub.addListener(new Listener() {
    public void eventsReceived(List<Event> events) {
        for (Event e : events)
            buffer.enqueue(e);
    }
});
I have tried to embed the creation of this subscription and accompanying buffer in a custom graph stage without much success. Can anyone guide me on the best way to interface with this API using Akka? Is Akka Streams the best tool here?
To feed a Source, you don't necessarily need to use a custom graph stage. Source.queue will materialize as a buffered queue to which you can add elements which will then propagate through the stream.
There are a couple of tricky things to be aware of. The first is that there's some subtlety around materializing the Source.queue so you can set up the subscription. Something like this:
def bufferSize: Int = ???

Source.fromMaterializer { (mat, att) =>
  val (queue, source) = Source.queue[Event](bufferSize).preMaterialize()(mat)
  val subscription = createSubscription()
  subscription.addListener(
    new Listener() {
      def eventsReceived(events: java.util.List[Event]): Unit = {
        import scala.collection.JavaConverters.iterableAsScalaIterable
        import akka.stream.QueueOfferResult._
        iterableAsScalaIterable(events).foreach { event =>
          queue.offer(event) match {
            case Enqueued => () // do nothing
            case Dropped => ??? // handle a dropped pubsub element, might well do nothing
            case QueueClosed => ??? // presumably cancel the subscription...
          }
        }
      }
    }
  )
  source.withAttributes(att)
}
Source.fromMaterializer is used to get access at each materialization to the materializer (which is what compiles the stream definition into actors). When we materialize, we use the materializer to preMaterialize the queue source so we have access to the queue. Our subscription adds incoming elements to the queue.
The API for this pubsub doesn't seem to support backpressure if the consumer can't keep up. The queue will drop elements it's been handed if the buffer is full: you'll probably want to do nothing in that case, but I've called it out in the match that you should make an explicit decision here.
Dropping the newest element is the synchronous behavior for this queue (there are other queue implementations available, but those will communicate dropping asynchronously which can be really bad for memory consumption in a burst). If you'd prefer something else, it may make sense to have a very small buffer in the queue and attach the "overall" Source (the one returned by Source.fromMaterializer) to a stage which signals perpetual demand. For example, a buffer(downstreamBufferSize, OverflowStrategy.dropHead) will drop the oldest event not yet processed. Alternatively, it may be possible to combine your Events in some meaningful way, in which case a conflate stage will automatically combine incoming Events if the downstream can't process them quickly.
Great answer! I built something similar. There are also Kamon metrics to monitor the queue size, etc.
class AsyncSubscriber(projectId: String, subscriptionId: String, metricsRegistry: CustomMetricsRegistry, pullParallelism: Int)(implicit val ec: Executor) {
  private val logger = LoggerFactory.getLogger(getClass)
  def bufferSize: Int = 1000
  def source(): Source[(PubsubMessage, AckReplyConsumer), Future[NotUsed]] = {
    Source.fromMaterializer { (mat, attr) =>
      val (queue, source) = Source.queue[(PubsubMessage, AckReplyConsumer)](bufferSize).preMaterialize()(mat)
      val receiver: MessageReceiver = {
        (message: PubsubMessage, consumer: AckReplyConsumer) => {
          metricsRegistry.inputEventQueueSize.update(queue.size())
          queue.offer((message, consumer)) match {
            case QueueOfferResult.Enqueued =>
              metricsRegistry.inputQueueAddEventCounter.increment()
            case QueueOfferResult.Dropped =>
              metricsRegistry.inputQueueDropEventCounter.increment()
              consumer.nack()
              logger.warn(s"Buffer is full, message nacked. Pubsub should retry don't panic. If this happens too often, we should also tweak the buffer size or the autoscaler.")
            case QueueOfferResult.Failure(ex) =>
              metricsRegistry.inputQueueDropEventCounter.increment()
              consumer.nack()
              logger.error(s"Failed to offer message with id=${message.getMessageId()}", ex)
            case QueueOfferResult.QueueClosed =>
              logger.error("Destination Queue closed. Something went terribly wrong. Shutting down the jvm.")
              consumer.nack()
              mat.shutdown()
              sys.exit(1)
          }
        }
      }
      val subscriptionName = ProjectSubscriptionName.of(projectId, subscriptionId)
      val subscriber = Subscriber.newBuilder(subscriptionName, receiver).setParallelPullCount(pullParallelism).build
      subscriber.startAsync().awaitRunning()
      source.withAttributes(attr)
    }
  }
}

Can Spark ForEachPartitionAsync be async on worker nodes?

I am writing a custom Spark sink. In my addBatch method I use foreachPartitionAsync which, if I'm not mistaken, only makes the driver work asynchronously by returning a future.
val work: FutureAction[Unit] = rdd.foreachPartitionAsync { rows =>
  val sourceInfo: StreamSourceInfo = serializeRowsAsInputStream(schema, rows)
  val ackIngestion = Future {
    ingestRows(sourceInfo)
  } andThen {
    case Success(ingestion) => ackIngestionDone(partitionId, ingestion)
  }
  Await.result(ackIngestion, timeOut) // I would like to remove this line..
}
work onSuccess {
  case _ => // move data from temporary table, report success of all workers
}
work onFailure {
  // delete tmp data
  case t => throw t.getCause
}
I can't find a way to run the work on the worker nodes without blocking on the Await call. If I remove it, success is reported to the work future object even though the inner future has not really finished.
Is there a way to report to the driver that all the workers have finished their asynchronous jobs?
Note: I looked at the foreachPartitionAsync function and it has only one implementation, which expects a function that returns Unit (I would have expected another one returning a future, or maybe a CountDownLatch).

How to test `Var`s of `scala.rx` with scalatest?

I have a method which connects to a websocket and receives a stream of messages from a real external system.
The simplified version is:
def watchOrders(): Var[Option[Order]] = {
  val value = Var[Option[Order]](None)
  // onMessage(order => value.update(Some(order)))
  value
}
When I test it (with scalatest), I want to make it connect to the real outside system, and only check the first 4 orders:
test("watchOrders") {
  var result = List.empty[Order]
  val stream = client.watchOrders()
  stream.foreach {
    case Some(order) =>
      result = order :: result
      if (result.size == 4) { // 1.
        assert(orders should ...) // 2.
        stream.kill() // 3.
      }
    case _ =>
  }
  Thread.sleep(10000) // 4.
}
I have 4 questions:
1. Is this the right way to check the first 4 orders? I could not find a take(4) method in scala.rx.
2. If the assert fails, the test still passes. How can I fix that?
3. Is this the right way to stop the stream?
4. If the thread doesn't sleep here, the test passes because the code in case Some(order) never runs. Is there a better way to wait?
One approach you might consider to get a List out of a Var is to use the .fold combinator.
The other issue is the asynchronous nature of the data. Assuming you really want to talk to this real external system in your test code (i.e., this is closer to an integration test), you will want to look at scalatest's support for async tests. You will probably construct a future out of a promise that you complete once you have accumulated the 4 elements in your list.
See: http://www.scalatest.org/user_guide/async_testing
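The promise-based idea can be sketched with the standard library alone. This is only an illustration under assumptions: `FirstNSketch` and `collectFirst` are hypothetical names, and the callback stands in for whatever hook delivers each order (for example the body of `stream.foreach`); it is not part of scala.rx or scalatest.

```scala
import scala.concurrent.{Future, Promise}

object FirstNSketch {
  // Returns a callback plus a Future. The Future completes with the first
  // `n` values passed to the callback, in arrival order; later values are ignored.
  // Assumes n >= 1.
  def collectFirst[A](n: Int): (A => Unit, Future[List[A]]) = {
    val promise = Promise[List[A]]()
    var seen = List.empty[A]
    val callback: A => Unit = { a =>
      // Synchronize because the callback may be invoked from another thread.
      promise.synchronized {
        if (!promise.isCompleted) {
          seen = a :: seen
          if (seen.size == n) promise.success(seen.reverse)
        }
      }
    }
    (callback, promise.future)
  }
}
```

In an async-style test, the returned Future would be mapped to the assertion and returned from the test body, so a failing assertion fails the test and no Thread.sleep is needed.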

Play framework Scala run job in background

Is there any way I can trigger a job from the controller without waiting for its completion, and display a message to the user that the job will run in the background?
I have one controller method which takes quite a long time to run, so I want to run it offline and tell the user that it will be running in the background.
I tried Action.async as shown below, but processing the Future still takes too long and the request times out.
def submit(id: Int) = Action.async(parse.multipartFormData) { implicit request =>
  val result = Future {
    // process the data
  }
  result map { res =>
    Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be running in background."))
  }
}
You can also return a result without waiting for the future to complete, in a "fire and forget" way:
def submit(id: Int) = Action(parse.multipartFormData) { implicit request =>
  Future {
    // process the data
  }
  Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be running in background."))
}
The docs state:
By giving a Future[Result] instead of a normal Result, we are able to quickly generate the result without blocking. Play will then serve the result as soon as the promise is redeemed.
The web client will be blocked while waiting for the response, but nothing will be blocked on the server, and server resources can be used to serve other clients.
You can configure your client code to use ajax request and display a Waiting for data message for some part of the page without blocking the rest of the web page from loading.
I also tried the "Futures.timeout" option. It seems to work fine, but I'm not sure whether it is the correct way to do it.
result.withTimeout(20.seconds)(futures).map { res =>
  Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be updated in background."))
}.recover {
  case e: scala.concurrent.TimeoutException =>
    Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be updated in background."))
}

Consuming a service using WS in Play

I was hoping someone can briefly go over the various ways of consuming a service (this one just returns a string, normally it would be JSON but I just want to understand the concepts here).
My service:
def ping = Action {
  Ok("pong")
}
Now, in my Play (2.3.x) application, I want to call this service and display the response.
When working with Futures, I want to display the value.
I am a bit confused about all the ways I could call this method. For example, some approaches I have seen use Success/Failure:
val futureResponse: Future[String] = WS.url(url + "/ping").get().map { response =>
  response.body
}
var resp = ""
futureResponse.onComplete {
  case Success(str) => {
    Logger.trace(s"future success $str")
    resp = str
  }
  case Failure(ex) => {
    Logger.trace(s"future failed")
    resp = ex.toString
  }
}
Ok(resp)
I can see the trace in STDOUT for success/failure, but my controller action just returns "" to my browser.
I understand that this is because it returns a FUTURE and my action finishes before the future returns.
How can I force it to wait?
What options do I have with error handling?
If you really want to block until the future is completed, look at the Await.ready() and Await.result() methods. But you shouldn't.
The point of a Future is that you can tell it how to use the result once it arrives and then move on, with no blocking required.
A Future can be the result of an Action, in which case the framework takes care of it:
def index = Action.async {
  WS.url(url + "/ping").get()
    .map(response => Ok("Got result: " + response.body))
}
Look at the documentation, it describes the topic very well.
As for the error-handling, you can use Future.recover() method. You should tell it what to return in case of error and it gives you new Future that you should return from your action.
def index = Action.async {
  WS.url(url + "/ping").get()
    .map(response => Ok("Got result: " + response.body))
    .recover { case e: Exception => InternalServerError(e.getMessage) }
}
So the basic way to consume a service is to get the resulting Future, transform it as needed using monadic methods (methods that return a new, transformed Future, such as map and recover), and return it as the result of an Action.
You may also want to look at the questions "Play 2.2 - Scala - How to chain Futures in Controller Action" and "Dealing with failed futures".
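The same map/recover chain can be tried outside of Play with plain Futures. In this sketch `RecoverSketch`, `describe`, and the `call` parameter are hypothetical names, and `call` merely stands in for a remote request such as the WS call; a String takes the place of a Play Result.

```scala
import scala.concurrent.{ExecutionContext, Future}

object RecoverSketch {
  // Transforms the eventual response without blocking:
  // `map` handles the success case, `recover` turns a failure into a fallback value.
  def describe(call: () => Future[String])(implicit ec: ExecutionContext): Future[String] =
    call()
      .map(body => "Got result: " + body)
      .recover { case e: Exception => "Failed: " + e.getMessage }
}
```

With a call that succeeds with "pong", the resulting Future holds "Got result: pong"; with a call that fails with an exception whose message is "boom", it holds "Failed: boom". In both cases the caller receives a Future right away and never blocks.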