Scala future sequence and timeout handling


There are some good hints on how to combine futures with timeouts.
However, I'm curious how to do this with Future.sequence over a sequence of futures.
My first approach looks like this:
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits._
object FutureSequenceScala extends App {
  println("Creating futureList")
  val timeout = 2.seconds
  val futures = List(1000, 1500, 1200, 800, 2000) map { ms =>
    // Future { ... } replaces the deprecated lowercase future { ... }
    val f = Future {
      Thread sleep ms
      ms.toString
    }
    // whichever finishes first wins: the real work or the fallback
    Future firstCompletedOf Seq(f, fallback(timeout))
  }
  println("Creating waitinglist")
  val waitingList = Future sequence futures
  println("Created")
  val results = Await result (waitingList, timeout * futures.size)
  println(results)
  def fallback(timeout: Duration) = Future {
    Thread sleep timeout.toMillis
    "-1"
  }
}
Is there a better way to handle timeouts in a sequence of futures or is this a valid solution?

There are a few things in your code here that you might want to reconsider. For starters, I'm not a huge fan of submitting tasks into the ExecutionContext that have the sole purpose of simulating a timeout and also have Thread.sleep used in them. The sleep call is blocking, and you probably want to avoid having a task in the execution context that blocks purely for the sake of waiting a fixed amount of time. I'm going to steal from my answer here and suggest that for pure timeout handling, you should use something like I outlined in that answer. The HashedWheelTimer is a highly efficient timer implementation that is much better suited to timeout handling than a task that just sleeps.
Now, if you go that route, the next change I would suggest concerns handling the individual timeout related failures for each future. If you want an individual failure to completely fail the aggregate Future returned from the sequence call, then do nothing extra. If you don't want that to happen, and instead want a timeout to return some default value instead, then you can use recover on the Future like this:
withTimeout(someFuture).recover{
case ex:TimeoutException => someDefaultValue
}
Once you've done that, you can take advantage of the non-blocking callbacks and do something like this:
waitingList onComplete{
case Success(results) => //handle success
case Failure(ex) => //handle fail
}
Each future has a timeout and thus will not just run infinitely. There is no need IMO to block there and provide an additional layer of timeout handling via the atMost param to Await.result. But I guess this assumes you are okay with the non-blocking approach. If you really need to block there, then you should not be waiting timeout * futures.size amount of time. These futures are running in parallel; the timeout there should only need to be as long as the individual timeouts for the futures themselves (or just slightly longer to account for any delays in cpu/timing). It certainly should not be the timeout * the total number of futures.
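The withTimeout helper referred to above lives in the linked answer and is not reproduced here. As a rough sketch of the same idea, the following uses the JDK's ScheduledExecutorService in place of HashedWheelTimer (a deliberate substitution so the example needs no extra dependency). The object name, the withTimeout signature, the 500 ms limit and the "-1" default are all assumptions made for illustration.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object TimeoutSketch {
  // A single daemon thread is enough: it only fails promises, it never runs user work.
  private val timer = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "timeout-timer")
      t.setDaemon(true)
      t
    }
  })

  // Hypothetical helper: fails the returned future with a TimeoutException
  // if `f` has not completed within `after`. The original `f` is NOT cancelled.
  def withTimeout[T](f: Future[T], after: FiniteDuration)(implicit ec: ExecutionContext): Future[T] = {
    val p = Promise[T]()
    val task = timer.schedule(new Runnable {
      def run(): Unit = {
        p.tryFailure(new TimeoutException(s"timed out after $after"))
        ()
      }
    }, after.toMillis, TimeUnit.MILLISECONDS)
    f.onComplete { result =>
      task.cancel(false)     // the timer is no longer needed
      p.tryComplete(result)  // no effect if the timeout already won
    }
    p.future
  }

  def demo(): List[String] = {
    import ExecutionContext.Implicits.global
    val futures = List(100L, 2000L).map { ms =>
      val work = Future { Thread.sleep(ms); ms.toString } // sleep only simulates work
      withTimeout(work, 500.millis).recover { case _: TimeoutException => "-1" }
    }
    // Every element completes within its own timeout, so a short wait suffices.
    Await.result(Future.sequence(futures), 1.second)
  }
}
```

In this sketch the Await at the end is only there to make the demo observable; with the non-blocking approach you would attach onComplete to the sequenced future instead.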

Here's a version that shows how bad your blocking fallback is.
Notice that the executor is single threaded and you're creating many fallbacks.
@cmbaxter is right, your master timeout shouldn't be timeout * futures.size, it should be bigger!
@cmbaxter is also right that you want to think non-blocking. Once you do that, and you want to impose timeouts, then you will pick a timer component for that, see his linked answer (also linked from your linked answer).
That said, I still like my answer from your link, in so far as sitting in a loop waiting for the next thing that should timeout is really simple.
It just takes a list of futures and their timeouts and a fallback value.
Maybe there is a use case for that, such as a simple app that just blocks for some results (like your test) and must not exit before results are in.
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext
import java.util.concurrent.Executors
import java.lang.System.{ nanoTime => now }
object Test extends App {
  //implicit val xc = ExecutionContext.global
  implicit val xc = ExecutionContext fromExecutorService (Executors.newSingleThreadExecutor)
  // runs body, prints "<elapsed millis> <result>", and returns the result
  def timed[A](body: =>A): A = {
    val start = now
    val res = body
    val end = now
    println(s"${Duration.fromNanos(end - start).toMillis} $res")
    res
  }
  println("Creating futureList")
  val timeout = 1500.millis
  val futures = List(1000, 1500, 1200, 800, 2000) map { ms =>
    val f = Future {
      timed {
        blocking(Thread sleep ms)
        ms.toString
      }
    }
    Future firstCompletedOf Seq(f, fallback(timeout))
  }
  println("Creating waitinglist")
  val waitingList = Future sequence futures
  println("Created")
  timed {
    val results = Await result (waitingList, 2 * timeout * futures.size)
    println(results)
  }
  xc.shutdown()
  def fallback(timeout: Duration) = Future {
    timed {
      blocking(Thread sleep timeout.toMillis)
      "-1"
    }
  }
}
What happened:
Creating futureList
Creating waitinglist
Created
1001 1000
1500 -1
1500 1500
1500 -1
1200 1200
1500 -1
800 800
1500 -1
2000 2000
1500 -1
List(1000, 1500, 1200, 800, 2000)
14007 ()

Monix Task has timeout support:
import monix.execution.Scheduler.Implicits.global
import monix.eval._
import scala.concurrent.Await
import scala.concurrent.duration._
println("Creating futureList")
val timeout = 2.seconds
val tasks = List(1000, 1500, 1200, 800, 2000).map{ ms =>
  Task {
    Thread.sleep(ms)
    ms.toString
  }.timeoutTo(timeout, Task.now("-1"))
}
println("Creating waitinglist")
val waitingList = Task.gather(tasks) // Task.sequence would run the tasks one after another; gather runs them in parallel
println("Created")
// A Task is lazy, so it has to be started before it can be awaited (runToFuture in Monix 3, runAsync in Monix 2)
val results = Await.result(waitingList.runToFuture, timeout * tasks.size)
println(results)

Related

Await for a Sequence of Futures with timeout without failing on TimeoutException

I have a sequence of Scala Futures of the same type.
After some limited time, I want to get a result for the entire sequence: some futures may have succeeded, some may have failed, and some may not have completed yet. The futures that have not completed should be considered failed.
I don't want to Await each future sequentially.
I did look at this question: Scala waiting for sequence of futures
and tried to use the solution from there, namely:
private def lift[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
futures.map(_.map { Success(_) }.recover { case t => Failure(t) })
def waitAll[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
Future.sequence(lift(futures))
futures: Seq[Future[MyObject]] = ...
val segments = Await.result(waitAll(futures), waitTimeoutMillis millis)
but I'm still getting a TimeoutException, I guess because some of the futures haven't completed yet.
and that answer also states,
Now Future.sequence(lifted) will be completed when every future is completed, and will represent successes and failures using Try.
But I want my Future to be completed after the timeout has passed, not when every future in the sequence has completed. What else can I do?

If I used a raw Future (rather than some IO monad that has this functionality built in, and without some Akka utils made for exactly that), I would hack together a utility like:
// make each separate future time out
object FutureTimeout {
  // separate EC for waiting
  private val timeoutEC: ExecutionContext = ...
  private def timeout[T](delay: Long): Future[T] = Future {
    blocking {
      Thread.sleep(delay)
    }
    throw new Exception("Timeout")
  }(timeoutEC)
  def apply[T](fut: Future[T], delay: Long)(
    implicit ec: ExecutionContext
  ): Future[T] = Future.firstCompletedOf(Seq(
    fut,
    timeout(delay)
  ))
}
and then
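Future.sequence(
  futures
    .map(FutureTimeout(_, delay))
    // lift each future's outcome into a Try so that one failure does not fail the whole sequence
    .map(_.map(Success(_)).recover { case e => Failure(e) })
)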
Since each future would terminate at most after delay we would be able to collect them into one result right after that.
You have to remember, though, that no matter how you trigger a timeout, you have no guarantee that the timed-out Future stops executing. It could run on and on on some thread somewhere; it's just that you wouldn't wait for the result. firstCompletedOf just makes this race more explicit.
Some other utilities (like e.g. Cats Effect IO) allow you to cancel computations (which is used in e.g. races like this one) but you still have to remember that JVM cannot arbitrarily "kill" a running thread, so that cancellation would happen after one stage of computation is completed and before the next one is started (so e.g. between .maps or .flatMaps).
If you aren't afraid of adding external deps there are other (and more reliable, as Thread.sleep is just a temporary ugly hack) ways of timing out a Future, like Akka utils. See also other questions like this.
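To make the point about timed-out futures concrete, here is a small sketch showing that the losing side of a firstCompletedOf race keeps running after the race is decided. The sleeps only simulate slow work and a timer, and the object name, durations and strings are made up for the illustration.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object StillRunningSketch {
  // Returns (winner of the race, was the slow future done when the race ended?,
  //          was it done after we waited for it explicitly?)
  def demo(): (String, Boolean, Boolean) = {
    val finished = new AtomicBoolean(false)
    val slow = Future {
      blocking(Thread.sleep(600)) // stands in for slow work
      finished.set(true)
      "slow result"
    }
    val fallback = Future {
      blocking(Thread.sleep(100)) // stands in for a timer
      "timed out"
    }
    val winner = Await.result(Future.firstCompletedOf(Seq(slow, fallback)), 2.seconds)
    val finishedAtTimeout = finished.get() // the slow future is still sleeping here
    Await.ready(slow, 2.seconds)           // it was never cancelled, so it completes eventually
    (winner, finishedAtTimeout, finished.get())
  }
}
```

The race yields the fallback value, yet the slow future still sets its flag later, which is exactly the work you would be paying for without using its result.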

Here is a solution using Monix:
import monix.eval.Task
import monix.execution.Scheduler
import scala.concurrent.duration._
import scala.util.{Failure, Random, Success}
val timeoutScheduler = Scheduler.singleThread("timeout") // it's safe to use a single thread here because timeout tasks are very fast
def sequenceDiscardTimeouts[T](tasks: Task[T]*): Task[Seq[T]] = {
  Task
    .parSequence(
      tasks
        .map(t =>
          t.map(Success.apply) // Map to Success so we can collect the value
            .timeout(500.millis)
            .executeOn(timeoutScheduler) // This is needed to run timeouts on a dedicated scheduler that won't be blocked by "blocking"/io work if you have any
            .onErrorRecoverWith { case ex =>
              println("timed-out")
              Task.pure(Failure(ex)) // It's assumed that any error is a timeout. It's possible to "catch" just the timeout exception here
            }
        )
    )
    .map { res =>
      res.collect { case Success(r) => r }
    }
}
Testing code
implicit val mainScheduler = Scheduler.fixedPool(name = "main", poolSize = 10)
def slowTask(msg: String) = {
  Task.sleep(Random.nextLong(1000).millis) //Sleep here to emulate a slow task
    .map { _ =>
      msg
    }
}
val app = sequenceDiscardTimeouts(
  slowTask("1"),
  slowTask("2"),
  slowTask("3"),
  slowTask("4"),
  slowTask("5"),
  slowTask("6")
)
val started: Long = System.currentTimeMillis()
app.runSyncUnsafe().foreach(println)
println(s"Done in ${System.currentTimeMillis() - started} millis")
This will print different output on each run, but it should look like the following:
timed-out
timed-out
timed-out
3
4
5
Done in 564 millis
Please note the usage of two separate schedulers. This is to ensure that timeouts will fire even if the main scheduler is busy with business logic. You can test it by reducing poolSize for main scheduler.

How to read only Successful values from a Seq of Futures

I am learning Akka/Scala and am trying to read only those Futures that succeeded from a Seq[Future[Int]], but can't get anything to work.
I simulated an array of 10 Future[Int] some of which fail depending on the value FailThreshold takes (all fail for 10 and none fail for 0).
I then try to read them into an ArrayBuffer (could not find a way to return immutable structure with the values).
Also, there isn't a filter on Success/Failure so had to run an onComplete on each future and update buffer as a side-effect.
Even when the FailThreshold=0 and the Seq has all Future set to Success, the array buffer is sometimes empty and different runs return array of different sizes.
I tried a few other suggestions from the web like using Future.sequence on the list but this throws exception if any of future variables fail.
import akka.actor._
import akka.pattern.ask
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import akka.util.Timeout
import scala.util.{Failure, Success}
import concurrent.ExecutionContext.Implicits.global
case object AskNameMessage
implicit val timeout = Timeout(5, SECONDS)
val FailThreshold = 0
class HeyActor(num: Int) extends Actor {
  def receive = {
    case AskNameMessage => if (num<FailThreshold) {Thread.sleep(1000);sender ! num} else sender ! num
  }
}
class FLPActor extends Actor {
  def receive = {
    case t: IndexedSeq[Future[Int]] => {
      println(t)
      val b = scala.collection.mutable.ArrayBuffer.empty[Int]
      t.foldLeft( b ){ case (bf,ft) =>
        ft.onComplete { case Success(v) => bf += ft.value.get.get }
        bf
      }
      println(b)
    }
  }
}
val system = ActorSystem("AskTest")
val flm = (0 to 10).map( (n) => system.actorOf(Props(new HeyActor(n)), name="futureListMake"+(n)) )
val flp = system.actorOf(Props(new FLPActor), name="futureListProcessor")
// val delay = akka.pattern.after(500 millis, using=system.scheduler)(Future.failed( throw new IllegalArgumentException("DONE!") ))
val delay = akka.pattern.after(500 millis, using=system.scheduler)(Future.successful(0))
val seqOfFtrs = (0 to 10).map( (n) => Future.firstCompletedOf( Seq(delay, flm(n) ? AskNameMessage) ).mapTo[Int] )
flp ! seqOfFtrs
The receive in FLPActor mostly gets
Vector(Future(Success(0)), Future(Success(1)), Future(Success(2)), Future(Success(3)), Future(Success(4)), Future(Success(5)), Future(Success(6)), Future(Success(7)), Future(Success(8)), Future(Success(9)), Future(Success(10)))
but the array buffer b has varying number of values and empty at times.
Can someone please point me to gaps here,
why would the array buffer have varying sizes even when all Future have resolved to Success,
what is the correct pattern to use when we want to ask different actors with TimeOut and use only those asks that have successfully returned for further processing.

Instead of directly sending the IndexedSeq[Future[Int]], you should transform to Future[IndexedSeq[Int]] and then pipe it to the next actor. You don't send the Futures directly to an actor. You have to pipe it.
HeyActor can stay unchanged.
After
val seqOfFtrs = (0 to 10).map( (n) => Future.firstCompletedOf( Seq(delay, flm(n) ? AskNameMessage) ).mapTo[Int] )
do a recover, and use Future.sequence to turn it into one Future:
val oneFut = Future.sequence(seqOfFtrs.map(f=>f.map(Some(_)).recover{ case (ex: Throwable) => None})).map(_.flatten)
If you don't understand the business with Some, None, and flatten, then make sure you understand the Option type. One way to remove values from a sequence is to map values in the sequence to Option (either Some or None) and then to flatten the sequence. The None values are removed and the Some values are unwrapped.
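As a standalone illustration of the Some/None/flatten step, here is a minimal sketch that uses only the standard library and no actors. The helper name successfulOnly and the sample values are invented for the example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SuccessfulOnlySketch {
  // Hypothetical helper: keeps the values of the futures that succeeded and drops the failures.
  def successfulOnly[A](futures: Seq[Future[A]]): Future[Seq[A]] =
    Future
      .sequence(futures.map { f =>
        // success becomes Some(value), any failure becomes None
        f.map(a => Some(a): Option[A]).recover { case _ => None }
      })
      .map(_.flatten) // flatten drops the Nones and unwraps the Somes

  def demo(): Seq[Int] = {
    val futures = Seq(
      Future.successful(1),
      Future.failed[Int](new RuntimeException("boom")),
      Future.successful(3)
    )
    Await.result(successfulOnly(futures), 1.second)
  }
}
```

Because every element is recovered before Future.sequence sees it, the combined future never fails just because one of its inputs failed.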
After you have transformed your data into a single Future, pipe it over to FLPActor:
oneFut pipeTo flp
FLPActor should be rewritten with the following receive function:
def receive = {
case printme: IndexedSeq[Int] => println(printme)
}
In Akka, modifying actor state from inside a Future or from a Future's onComplete callback is a big no-no. In the worst case, it results in race conditions. Remember that a Future's body and callbacks run on a thread from the ExecutionContext, not as part of the actor's own message processing, so running a Future inside an actor means you have concurrent work being done on different threads. Having the Future directly modify some state in your actor while the actor is also processing some state is a recipe for disaster. In Akka, you process all changes to state inside the actor's own message handling. If you have some work done in a Future and need to access that work from inside an actor, you pipe it to that actor. The pipeTo pattern is functional, correct, and safe for accessing the finished computation of a Future.
To answer your question about why FLPActor is not printing out the IndexedSeq correctly: you are printing out the ArrayBuffer before your Futures have been completed. onComplete isn't the right idiom to use in this case, and you should avoid it in general as it isn't good functional style.
Don't forget the import akka.pattern.pipe for the pipeTo syntax.

Akka scheduler not completing

I need to send a message to an actor at specific intervals. I am using the following code:
object SendToActor extends App {
  import Sender._
  val system: ActorSystem = ActorSystem("sender")
  try {
    val senderActor: ActorRef = system.actorOf(Sender.props, "sendActor")
    val sendSchedule =
      system.scheduler.schedule(0 milliseconds, 5 minutes, senderActor, doSomething())
  } finally {
    system.terminate()
  }
}
Unfortunately, the scheduler doesn't seem to run unless I do one of the following:
Put a readLine() right after it:
val sendSchedule = system.scheduler.schedule(0 milliseconds, 5 minutes, senderActor, doSomething())
readLine()
Put a Thread.sleep() right after it:
val sendSchedule = system.scheduler.schedule(0 milliseconds, 5 minutes, senderActor, doSomething())
Thread.sleep(10000)
Is there a reason why the scheduler won't run as coded above? Why does it require the sleep in order to work?

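Probably because you're terminating the actor system immediately after defining the scheduler. The schedule call returns at once, so the finally block calls system.terminate() before the schedule gets a chance to run. The readLine() or Thread.sleep() only helps because it keeps the system alive a little longer.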

Handle twitter4j User Stream 420 Exception

The actual problem is this: I open up a User Stream to populate a cache of mine. Sometimes this stream gets a 420 exception (too many login attempts in a short period of time).
How long should I wait before trying to reestablish connection?
override def onException(ex: Exception): Unit = {
Logger.info("Exception:::" + ex.getMessage + ":::" + ex.getCause)
if (ex.getMessage.startsWith("420")) {
// Can't authenticate for now, thus has to fill up cache hole in next start
// Wait some time (How long?) Thread.sleep(5000L)
// Connect via restApi and fill up the holes in the cache
// Continue listening
}
}

I suppose you would have to use some backoff strategy here. Also, I wouldn't use sleep; I would keep the application asynchronous.
This is probably not strictly a solution to your problem, since it is close to pseudocode, but it could be a start. First, I borrow the timeout future definition from Play!:
import scala.language.higherKinds
import scala.concurrent.duration.FiniteDuration
import java.util.concurrent.TimeUnit
import scala.concurrent.{ExecutionContext, Future, Promise => SPromise}
import play.api.libs.concurrent.Akka
import util.Try
def timeout[A](message: => A, duration: Long, unit: TimeUnit = TimeUnit.MILLISECONDS)(implicit ec: ExecutionContext): Future[A] = {
  val p = SPromise[A]()
  Akka.system.scheduler.scheduleOnce(FiniteDuration(duration, unit)) {
    p.complete(Try(message))
  }
  p.future
}
This uses Akka to schedule a future execution and combined with a promise returns a future. At this point you could chain future execution using flatMap on the timeout future:
val timeoutFuture: Future[String] =
timeout("timeout", duration, TimeUnit.SECONDS)
timeoutFuture.flatMap(timeoutMessage => connectToStream())
At this point the connection is executed only after the timeout has expired, but we still need to implement some kind of reconnection mechanism. For that we can use recoverWith (rather than recover, because the retry itself returns a Future):
def twitterStream(duration: Long = 0, retry: Int = 0): Future[Any] = {
  val timeoutFuture: Future[String] =
    timeout("timeout", duration, TimeUnit.SECONDS)
  // check how many times we have tried, to implement some stop-trying strategy
  // check how long the duration is and, if too long, reset it
  timeoutFuture.flatMap(timeoutMessage => connectToStream())
    .recoverWith {
      case connectionLost: SomeConnectionExpiredException =>
        twitterStream(duration + 20, retry + 1) // try to reconnect
      case ex: Exception if ex.getMessage.startsWith("420") =>
        twitterStream(duration + 120, retry + 1) // try to reconnect with a longer timer
      case _ =>
        Future.successful(someDefault()) // someDefault() is assumed to return a plain value
    }
}
def connectToStream(): Future[String] = {
// connect to twitter
// do some computation
// return some future with some result
Future("Tweets")
}
What happens here is that when the future fails with a 420 or a connection-lost exception, the recovery handler runs and the function calls itself again, restarting the connection after a longer wait (duration + 20 seconds for a lost connection, duration + 120 seconds for a 420).
A couple of notes: the code is untested (I could only compile it). The backoff time here is linear (x + y); you may want to have a look at an exponential backoff strategy. Lastly, you will need Akka to implement the scheduleOnce used in the timeout future (Play already has Akka available). For other ways of using a timeout on futures, check this SO question.
Not sure if all this is overkill, probably there are shorter and easier solutions.
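For the exponential backoff mentioned in the notes, a minimal sketch of the delay calculation could look like this. The function name, the 5 second base and the 5 minute cap are arbitrary assumptions for illustration, not values recommended by Twitter or twitter4j.

```scala
import scala.concurrent.duration._

object BackoffSketch {
  // Hypothetical helper: the delay doubles with every retry (base, 2*base, 4*base, ...)
  // and is capped so that it never grows without bound.
  def backoff(
      retry: Int,
      base: FiniteDuration = 5.seconds,
      cap: FiniteDuration = 5.minutes
  ): FiniteDuration = {
    // clamp the exponent so the shift stays well within Long range
    val exponent = math.max(0, math.min(retry, 20))
    (base * (1L << exponent)).min(cap)
  }
}
```

In twitterStream above, such a function would replace the fixed duration + 20 and duration + 120 increments, for example by passing backoff(retry + 1).toSeconds as the next duration.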

scala 2.10 callback at the end of a `Deadline`

In Scala 2.10, along with the new Future/Promise API, they introduced the Duration and Deadline utilities (as described here). I looked around but couldn't find anything in the Scala standard library to do something like:
val deadline = 5 seconds fromNow
After(deadline){
//do stuff
}
//or
val deadlineFuture: Future[Nothing] = (5 seconds fromNow).asFuture
deadlineFuture onComplete {
//do stuff
}
Is there anything like that available that I've missed, or will I have to implement this kind of behavior myself?

Not quite built in, but they provide just enough rope.
The gist is to wait on an empty promise that must disappoint (i.e., time out).
import scala.concurrent._
import scala.concurrent.duration._
import scala.util._
import ExecutionContext.Implicits.global
object Test extends App {
  val v = new SyncVar[Boolean]()
  val deadline = 5 seconds fromNow
  // the empty promise never completes, so Await.ready always times out at the deadline
  Future(Await.ready(Promise().future, deadline.timeLeft)) onComplete { _ =>
    println("Bye, now.")
    v.put(true)
  }
  v.take()
  // or
  val w = new SyncVar[Boolean]()
  val dropdeadline = 5 seconds fromNow
  val p = Promise[Boolean]()
  p.future onComplete {_ =>
    println("Bye, now.")
    w.put(true)
  }
  Try(Await.ready(Promise().future, dropdeadline.timeLeft))
  p trySuccess true
  w.take()
  // rolling it
  implicit class Expiry(val d: Deadline) extends AnyVal {
    def expiring(f: => Unit): Unit = {
      Future(Await.ready(Promise().future, d.timeLeft)) onComplete { _ =>
        f
      }
    }
  }
  val x = new SyncVar[Boolean]()
  5 seconds fromNow expiring {
    println("That's all, folks.")
    x.put(true)
  }
  x.take() // wait for it
}

It's just a timestamp holder. For example, say you need to spread the execution of N sequential tasks over T hours. When you have finished the first one, you check the deadline and schedule the next task at an interval of (time left)/(tasks left). At some point isOverdue() becomes true, and you just execute the remaining tasks in parallel.
Or you could check isOverdue() and, if it is still false, use timeLeft to set the timeout for executing the next task, for example.
It's much better than manipulating Date and Calendar to determine the time left. Duration was also used in Akka for timing.
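The (time left)/(tasks left) pacing described above can be sketched in a few lines. The function name nextDelay is an assumption for the example; only Deadline.isOverdue() and Deadline.timeLeft come from the standard library.

```scala
import scala.concurrent.duration._

object DeadlinePacingSketch {
  // Hypothetical helper: how long to wait before starting the next task so that
  // the remaining tasks are spread evenly over the time left until the deadline.
  def nextDelay(deadline: Deadline, tasksLeft: Int): FiniteDuration =
    if (tasksLeft <= 0 || deadline.isOverdue()) Duration.Zero // out of time: run immediately
    else deadline.timeLeft / tasksLeft.toLong
}
```

With 10 seconds left and 5 tasks remaining this gives a delay of just under 2 seconds; once the deadline has passed it returns zero, which corresponds to the "just execute the remaining tasks" case.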
