scala 2.10 callback at the end of a `Deadline` - scala

In Scala 2.10, along with the new Future/Promise API, they introduced a Duration and Deadline utilities (as described here). I looked around but couldn't find anything that comes with the scala standard library, to do something like:
val deadline = 5 seconds fromNow
After(deadline){
//do stuff
}
//or
val deadlineFuture: Future[Nothing] = (5 seconds fromNow).asFuture
deadlineFuture onComplete {
//do stuff
}
Is there anything like that available that I've missed, or will I have to implement this kind of behavior myself?

Not quite built in, but they provide just enough rope.
The gist is to wait on an empty promise that must disappoint (i.e., time out).
import scala.concurrent._
import scala.concurrent.duration._
import scala.util._
import ExecutionContext.Implicits.global
object Test extends App {
val v = new SyncVar[Boolean]()
val deadline = 5 seconds fromNow
future(Await.ready(Promise().future, deadline.timeLeft)) onComplete { _ =>
println("Bye, now.")
v.put(true)
}
v.take()
// or
val w = new SyncVar[Boolean]()
val dropdeadline = 5 seconds fromNow
val p = Promise[Boolean]()
p.future onComplete {_ =>
println("Bye, now.")
w.put(true)
}
Try(Await.ready(Promise().future, dropdeadline.timeLeft))
p trySuccess true
w.take()
// rolling it
implicit class Expiry(val d: Deadline) extends AnyVal {
def expiring(f: =>Unit) {
future(Await.ready(Promise().future, d.timeLeft)) onComplete { _ =>
f
}
}
}
val x = new SyncVar[Boolean]()
5 seconds fromNow expiring {
println("That's all, folks.")
x.put(true)
}
x.take() // wait for it
}

Its just a timestamp holder. For example you need to distribute execution of N sequential tasks, in T hours. When you have finished with the first one, you check a deadline and schedule next task depending on (time left)/(tasks left) interval. At some point of time isOverdue() occurs, and you just execute tasks left, in parallel.
Or you could check isOverdue(), and if still false, use timeLeft() for setting timeout on executing the next task, for example.
It's much better than manipulating with Date and Calendar to determine time left. Also Duration was used in Akka for timing.

Related

Await for a Sequence of Futures with timeout without failing on TimeoutException

I have a sequence of scala Futures of same type.
I want, after some limited time, to get a result for the entire sequence while some futures may have succeeded, some may have failed and some haven't completed yet, the non completed futures should be considered failed.
I don't want to use Await each future sequentially.
I did look at this question: Scala waiting for sequence of futures
and try to use the solution from there, namely:
private def lift[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
futures.map(_.map { Success(_) }.recover { case t => Failure(t) })
def waitAll[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
Future.sequence(lift(futures))
futures: Seq[Future[MyObject]] = ...
val segments = Await.result(waitAll(futures), waitTimeoutMillis millis)
but I'm still getting a TimeoutException, I guess because some of the futures haven't completed yet.
and that answer also states,
Now Future.sequence(lifted) will be completed when every future is completed, and will represent successes and failures using Try.
But I want my Future to be completed after the timeout has passed, not when every future in the sequence has completed. What else can I do?
If I used raw Future (rather than some IO monad which has this functionality build-in, or without some Akka utils for exactly that) I would hack together utility like:
// make each separate future timeout
object FutureTimeout {
// separate EC for waiting
private val timeoutEC: ExecutorContext = ...
private def timeout[T](delay: Long): Future[T] = Future {
blocking {
Thread.sleep(delay)
}
throw new Exception("Timeout")
}(timeoutEC)
def apply[T](fut: Future[T], delat: Long)(
implicit ec: ExecutionContext
): Future[T] = Future.firstCompletedOf(Seq(
fut,
timeout(delay)
))
}
and then
Future.sequence(
futures
.map(FutureTimeout(_, delay))
.map(Success(_))
.recover { case e => Failure(e) }
)
Since each future would terminate at most after delay we would be able to collect them into one result right after that.
You have to remember though that no matter how would you trigger a timeout you would have no guarantee that the timeouted Future stops executing. It could run on and on on some thread somewhere, it's just that you wouldn't wait for the result. firstCompletedOf just makes this race more explicit.
Some other utilities (like e.g. Cats Effect IO) allow you to cancel computations (which is used in e.g. races like this one) but you still have to remember that JVM cannot arbitrarily "kill" a running thread, so that cancellation would happen after one stage of computation is completed and before the next one is started (so e.g. between .maps or .flatMaps).
If you aren't afraid of adding external deps there are other (and more reliable, as Thread.sleep is just a temporary ugly hack) ways of timing out a Future, like Akka utils. See also other questions like this.
Here is solution using monix
import monix.eval.Task
import monix.execution.Scheduler
val timeoutScheduler = Scheduler.singleThread("timeout") //it's safe to use single thread here because timeout tasks are very fast
def sequenceDiscardTimeouts[T](tasks: Task[T]*): Task[Seq[T]] = {
Task
.parSequence(
tasks
.map(t =>
t.map(Success.apply) // Map to success so we can collect the value
.timeout(500.millis)
.executeOn(timeoutScheduler) //This is needed to run timesouts in dedicated scheduler that won't be blocked by "blocking"/io work if you have any
.onErrorRecoverWith { ex =>
println("timed-out")
Task.pure(Failure(ex)) //It's assumed that any error is a timeout. It's possible to "catch" just timeout exception here
}
)
)
.map { res =>
res.collect { case Success(r) => r }
}
}
Testing code
implicit val mainScheduler = Scheduler.fixedPool(name = "main", poolSize = 10)
def slowTask(msg: String) = {
Task.sleep(Random.nextLong(1000).millis) //Sleep here to emulate a slow task
.map { _ =>
msg
}
}
val app = sequenceDiscardTimeouts(
slowTask("1"),
slowTask("2"),
slowTask("3"),
slowTask("4"),
slowTask("5"),
slowTask("6")
)
val started: Long = System.currentTimeMillis()
app.runSyncUnsafe().foreach(println)
println(s"Done in ${System.currentTimeMillis() - started} millis")
This will print an output different for each run but it should look like following
timed-out
timed-out
timed-out
3
4
5
Done in 564 millis
Please note the usage of two separate schedulers. This is to ensure that timeouts will fire even if the main scheduler is busy with business logic. You can test it by reducing poolSize for main scheduler.

What's the best way to wrap a monix Task with a start time and end time print the difference?

This is what I'm trying right now but it only prints "hey" and not metrics.
I don't want to add metric related stuff in the main function.
import java.util.Date
import monix.eval.Task
import monix.execution.Scheduler.Implicits.global
import scala.concurrent.Await
import scala.concurrent.duration.Duration
class A {
def fellow(): Task[Unit] = {
val result = Task {
println("hey")
Thread.sleep(1000)
}
result
}
}
trait AA extends A {
override def fellow(): Task[Unit] = {
println("AA")
val result = super.fellow()
val start = new Date()
result.foreach(e => {
println("AA", new Date().getTime - start.getTime)
})
result
}
}
val a = new A with AA
val res: Task[Unit] = a.fellow()
Await.result(res.runAsync, Duration.Inf)
You can describe a function such as this:
def measure[A](task: Task[A], logMillis: Long => Task[Unit]): Task[A] =
Task.deferAction { sc =>
val start = sc.clockMonotonic(TimeUnit.MILLISECONDS)
val stopTimer = Task.suspend {
val end = sc.clockMonotonic(TimeUnit.MILLISECONDS)
logMillis(end - start)
}
task.redeemWith(
a => stopTimer.map(_ => a)
e => stopTimer.flatMap(_ => Task.raiseError(e))
)
}
Some piece of advice:
Task values should be pure, along with the functions returning Tasks — functions that trigger side effects and return Task as results are broken
Task is not a 1:1 replacement for Future; when describing a Task, all side effects should be suspended (wrapped) in Task
foreach triggers the Task's evaluation and that's not good, because it triggers the side effects; I was thinking of deprecating and removing it, since its presence is tempting
stop using trait inheritance and just use plain functions — unless you deeply understand OOP and subtyping, it's best to avoid it if possible; and if you're into the Cake pattern, stop doing it and maybe join a support group 🙂
never measure time duration via new Date(), you need a monotonic clock for that and on top of the JVM that's System.nanoTime, which can be accessed via Monix's Scheduler by clockMonotonic, as exemplified above, the Scheduler being given to you via deferAction
stop blocking threads, because that's error prone — instead of doing Thread.sleep, do Task.sleep and all Await.result calls are problematic, unless they are in main or in some other place where dealing with asynchrony is not possible
Hope this helps.
Cheers,
Like #Pierre mentioned, latest version of Monix Task has Task.timed, you can do
timed <- task.timed
(duration, t) = timed

Get actual value in Future Scala [duplicate]

I am a newbie to scala futures and I have a doubt regarding the return value of scala futures.
So, generally syntax for a scala future is
def downloadPage(url: URL) = Future[List[Int]] {
}
I want to know how to access the List[Int] from some other method which calls this method.
In other words,
val result = downloadPage("localhost")
then what should be the approach to get List[Int] out of the future ?
I have tried using map method but not able to do this successfully.`
The case of Success(listInt) => I want to return the listInt and I am not able to figure out how to do that.
The best practice is that you don't return the value. Instead you just pass the future (or a version transformed with map, flatMap, etc.) to everyone who needs this value and they can add their own onComplete.
If you really need to return it (e.g. when implementing a legacy method), then the only thing you can do is to block (e.g. with Await.result) and you need to decide how long to await.
You need to wait for the future to complete to get the result given some timespan, here's something that would work:
import scala.concurrent.duration._
def downloadPage(url: URL) = Future[List[Int]] {
List(1,2,3)
}
val result = downloadPage("localhost")
val myListInt = result.result(10 seconds)
Ideally, if you're using a Future, you don't want to block the executing thread, so you would move your logic that deals with the result of your Future into the onComplete method, something like this:
result.onComplete({
case Success(listInt) => {
//Do something with my list
}
case Failure(exception) => {
//Do something with my error
}
})
I hope you already solved this since it was asked in 2013 but maybe my answer can help someone else:
If you are using Play Framework, it support async Actions (actually all Actions are async inside). An easy way to create an async Action is using Action.async(). You need to provide a Future[Result]to this function.
Now you can just make transformations from your Future[List[Int]] to Future[Result] using Scala's map, flatMap, for-comprehension or async/await. Here an example from Play Framework documentation.
import play.api.libs.concurrent.Execution.Implicits.defaultContext
def index = Action.async {
val futureInt = scala.concurrent.Future { intensiveComputation() }
futureInt.map(i => Ok("Got result: " + i))
}
You can do something like that. If The wait time that is given in Await.result method is less than it takes the awaitable to execute, you will have a TimeoutException, and you need to handle the error (or any other error).
import scala.concurrent._
import ExecutionContext.Implicits.global
import scala.util.{Try, Success, Failure}
import scala.concurrent.duration._
object MyObject {
def main(args: Array[String]) {
val myVal: Future[String] = Future { silly() }
// values less than 5 seconds will go to
// Failure case, because silly() will not be done yet
Try(Await.result(myVal, 10 seconds)) match {
case Success(extractedVal) => { println("Success Happened: " + extractedVal) }
case Failure(_) => { println("Failure Happened") }
case _ => { println("Very Strange") }
}
}
def silly(): String = {
Thread.sleep(5000)
"Hello from silly"
}
}
The best way I’ve found to think of a Future is a box that will, at some point, contain the thing that you want. The key thing with a Future is that you never open the box. Trying to force open the box will lead you to blocking and grief. Instead, you put the Future in another, larger box, typically using the map method.
Here’s an example of a Future that contains a String. When the Future completes, then Console.println is called:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
object Main {
def main(args:Array[String]) : Unit = {
val stringFuture: Future[String] = Future.successful("hello world!")
stringFuture.map {
someString =>
// if you use .foreach you avoid creating an extra Future, but we are proving
// the concept here...
Console.println(someString)
}
}
}
Note that in this case, we’re calling the main method and then… finishing. The string’s Future, provided by the global ExecutionContext, does the work of calling Console.println. This is great, because when we give up control over when someString is going to be there and when Console.println is going to be called, we let the system manage itself. In constrast, look what happens when we try to force the box open:
val stringFuture: Future[String] = Future.successful("hello world!")
val someString = Future.await(stringFuture)
In this case, we have to wait — keep a thread twiddling its thumbs — until we get someString back. We’ve opened the box, but we’ve had to commandeer the system’s resources to get at it.
It wasn't yet mentioned, so I want to emphasize the point of using Future with for-comprehension and the difference of sequential and parallel execution.
For example, for sequential execution:
object FuturesSequential extends App {
def job(n: Int) = Future {
Thread.sleep(1000)
println(s"Job $n")
}
val f = for {
f1 <- job(1)
f2 <- job(2)
f3 <- job(3)
f4 <- job(4)
f5 <- job(5)
} yield List(f1, f2, f3, f4, f5)
f.map(res => println(s"Done. ${res.size} jobs run"))
Thread.sleep(6000) // We need to prevent main thread from quitting too early
}
And for parallel execution (note that the Future are before the for-comprehension):
object FuturesParallel extends App {
def job(n: Int) = Future {
Thread.sleep(1000)
println(s"Job $n")
}
val j1 = job(1)
val j2 = job(2)
val j3 = job(3)
val j4 = job(4)
val j5 = job(5)
val f = for {
f1 <- j1
f2 <- j2
f3 <- j3
f4 <- j4
f5 <- j5
} yield List(f1, f2, f3, f4, f5)
f.map(res => println(s"Done. ${res.size} jobs run"))
Thread.sleep(6000) // We need to prevent main thread from quitting too early
}

Handle twitter4j User Stream 420 Exception

The actual problem is this: I open up a User Stream to populate some cache of mine, some times, this stream gets a 420 exception (Too many login attempts in a short period of time.)
How long should I wait before trying to reestablish connection?
override def onException(ex: Exception): Unit = {
Logger.info("Exception:::" + ex.getMessage + ":::" + ex.getCause)
if (ex.getMessage.startsWith("420")) {
// Can't authenticate for now, thus has to fill up cache hole in next start
// Wait some time (How long?) Thread.sleep(5000L)
// Connect via restApi and fill up the holes in the cache
// Continue listening
}
}
I suppose you would have to use some backoff strategy here, also I wouldn't use sleep, I would keep my application asynchronous.
This probably is not strictly a solution to your problem since it's almost considerable pseudo code, but it could be a start. First I borrow from Play! the timeout future definition:
import scala.language.higherKinds
import scala.concurrent.duration.FiniteDuration
import java.util.concurrent.TimeUnit
import scala.concurrent.{ExecutionContext, Future, Promise => SPromise}
import play.api.libs.concurrent.Akka
import util.Try
def timeout[A](message: => A, duration: Long, unit: TimeUnit = TimeUnit.MILLISECONDS)(implicit ec: ExecutionContext): Future[A] = {
val p = SPromise[A]()
Akka.system.scheduler.scheduleOnce(FiniteDuration(duration, unit)) {
p.complete(Try(message))
}
p.future
}
This uses Akka to schedule a future execution and combined with a promise returns a future. At this point you could chain future execution using flatMap on the timeout future:
val timeoutFuture: Future[String] =
timeout("timeout", duration, TimeUnit.SECONDS)
timeoutFuture.flatMap(timeoutMessage => connectToStream())
At this point the connection is executed only after the timeout has expired but we still need to implement some kind of reconnection mechanism, for that we can use recover:
def twitterStream(duration: Long = 0, retry: Int = 0): Future[Any] = {
val timeoutFuture: Future[String] =
timeout("timeout", duration, TimeUnit.SECONDS)
// check how many time we tried to implement some stop trying strategy
// check how long is the duration and if too long reset.
timeoutFuture.flatMap(timeoutMessage => connectToStream())
.recover {
case connectionLost: SomeConnectionExpiredException =>
twitterStream(duration + 20, retry + 1) // try to reconnect
case ex: Exception if ex.getMessage.startsWith("420") =>
twitterStream(duration + 120, retry + 1) // try to reconect with a longer timer
case _ =>
someDefault()
}
}
def connectToStream(): Future[String] = {
// connect to twitter
// do some computation
// return some future with some result
Future("Tweets")
}
What happens here is that when an exception is catched from the future and if that exception is a 420 or some connection lost exception the recover is executed and the function is re-called restarting the connection after duration + 20 seconds.
A couple of notes, the code is untested (I could only compile it), also the backoff time here is linear (x + y), you may want to have a look at some exponential backoff strategy and lastly you will need Akka to implement the schedule once used in the timeout future (Play has already Akka available), for other possibility of using timeout on futures check this SO question.
Not sure if all this is overkill, probably there are shorter and easier solutions.

Scheduling a task at a fixed time of the day with Akka

I am a beginner with Akka. I need to schedule a task each day at a fixed time of the day, say 8AM. What I know how to do is scheduling a task periodically, for instance
import akka.util.duration._
scheduler.schedule(0 seconds, 10 minutes) {
doSomething()
}
What is the simplest way to schedule tasks at fixed times of the day in Akka?
A small parenthesis
It is easy to do what I want just using this feature. A toy implementation would look like
scheduler.schedule(0 seconds, 24 hours) {
val now = computeTimeOfDay()
val delay = desiredTime - now
scheduler.scheduleOnce(delay) {
doSomething()
}
}
It is not difficult, but I introduced a little race condition. In fact, consider what happens if I launch this just before 8AM. The external closure will start, but by the time I compute delay we may be after 8AM. This means that the internal closure - which should execute right away - will be postponed to tomorrow, thereby skipping execution for one day.
There are ways to fix this race condition: for instance I could perform the check every 12 hours, and instead of scheduling the task right away, sending it to an actor that will not accept more than one task at a time.
But probably, this already exist in Akka or some extension.
Write once, run everyday
val GatherStatisticsPeriod = 24 hours
private[this] val scheduled = new AtomicBoolean(false)
def calcBeforeMidnight: Duration = {
// TODO implement
}
def preRestart(reason: Throwable, message: Option[Any]) {
self ! GatherStatisticsScheduled(scheduled.get)
super.preRestart(reason, message)
}
def schedule(period: Duration, who: ActorRef) =
ServerRoot.actorSystem.scheduler
.scheduleOnce(period)(who ! GatherStatisticsTick)
def receive = {
case StartServer(nodeName) =>
sender ! ServerStarted(nodeName)
if (scheduled.compareAndSet(false, true))
schedule(calcBeforeMidnight, self)
case GatherStatisticsTick =>
stats.update
scheduled.set(true)
schedule(GatherStatisticsPeriod, self)
case GatherStatisticsScheduled(isScheduled) =>
if (isScheduled && scheduled.compareAndSet(false, isScheduled))
schedule(calcBeforeMidnight, self)
}
I believe that Akka's scheduler handles restarts internally, one way or another. I used non-persistent way of sending a message to self - actually no strict guarantee of delivery. Also, ticks may vary, so GatherStatisticsPeriod might be a function.
To use this kind of scheduling in Akka, you would have to roll your own or maybe use Quartz, either through Akka Camel or this prototype quartz for akka.
If you don't need anything fancy and extremely accurate, then I would just calculate the delay to the desired first time and use that as the start delay to the schedule call, and trust the interval.
Let's say you want to run your task every day at 13 pm.
import scala.concurrent.duration._
import java.time.LocalTime
val interval = 24.hours
val delay = {
val time = LocalTime.of(13, 0).toSecondOfDay
val now = LocalTime.now().toSecondOfDay
val fullDay = 60 * 60 * 24
val difference = time - now
if (difference < 0) {
fullDay + difference
} else {
time - now
}
}.seconds
system.scheduler.schedule(delay, interval)(doSomething())
Also remember that server timezone may be different from yours.
Just to add another way to achieve it, this can be done using Akka Streams by ticking a message and filtering on time.
Source
.tick(0.seconds, 2.seconds, "hello") // emits "hello" every two seconds
.filter(_ => {
val now = LocalDateTime.now.getSecond
now > 20 && now < 30 // will let through only if the timing is right.
})
.runForeach(n => println("final sink received " + n))