Scheduling a task at a fixed time of the day with Akka - scala

I am a beginner with Akka. I need to schedule a task each day at a fixed time of the day, say 8AM. What I know how to do is schedule a task periodically, for instance:
import akka.util.duration._
scheduler.schedule(0 seconds, 10 minutes) {
doSomething()
}
What is the simplest way to schedule tasks at fixed times of the day in Akka?
A small aside
It is easy to do what I want just using this feature. A toy implementation would look like
scheduler.schedule(0 seconds, 24 hours) {
val now = computeTimeOfDay()
val delay = desiredTime - now
scheduler.scheduleOnce(delay) {
doSomething()
}
}
It is not difficult, but I introduced a little race condition. In fact, consider what happens if I launch this just before 8AM. The external closure will start, but by the time I compute delay we may be after 8AM. This means that the internal closure - which should execute right away - will be postponed to tomorrow, thereby skipping execution for one day.
There are ways to fix this race condition: for instance, I could perform the check every 12 hours and, instead of scheduling the task right away, send it to an actor that will not accept more than one task at a time.
But this probably already exists in Akka or some extension.

Write once, run everyday
val GatherStatisticsPeriod = 24 hours
private[this] val scheduled = new AtomicBoolean(false)
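def calcBeforeMidnight: Duration = {
// Time remaining until the next local midnight (uses the JVM default time zone)
val now = java.time.LocalDateTime.now()
val midnight = now.toLocalDate.plusDays(1).atStartOfDay()
java.time.Duration.between(now, midnight).toMillis.millis
}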
override def preRestart(reason: Throwable, message: Option[Any]) {
self ! GatherStatisticsScheduled(scheduled.get)
super.preRestart(reason, message)
}
def schedule(period: Duration, who: ActorRef) =
ServerRoot.actorSystem.scheduler
.scheduleOnce(period)(who ! GatherStatisticsTick)
def receive = {
case StartServer(nodeName) =>
sender ! ServerStarted(nodeName)
if (scheduled.compareAndSet(false, true))
schedule(calcBeforeMidnight, self)
case GatherStatisticsTick =>
stats.update
scheduled.set(true)
schedule(GatherStatisticsPeriod, self)
case GatherStatisticsScheduled(isScheduled) =>
if (isScheduled && scheduled.compareAndSet(false, isScheduled))
schedule(calcBeforeMidnight, self)
}
I believe that Akka's scheduler handles restarts internally, one way or another. I used a non-persistent way of sending a message to self, so there is no strict guarantee of delivery. Also, the tick interval may vary, so GatherStatisticsPeriod might be a function.

To use this kind of scheduling in Akka, you would have to roll your own or maybe use Quartz, either through Akka Camel or this prototype quartz for akka.
If you don't need anything fancy and extremely accurate, then I would just calculate the delay to the desired first time and use that as the start delay to the schedule call, and trust the interval.

Let's say you want to run your task every day at 1 PM (13:00).
import scala.concurrent.duration._
import java.time.LocalTime
val interval = 24.hours
val delay = {
val time = LocalTime.of(13, 0).toSecondOfDay
val now = LocalTime.now().toSecondOfDay
val fullDay = 60 * 60 * 24
val difference = time - now
if (difference < 0) {
fullDay + difference
} else {
time - now
}
}.seconds
system.scheduler.schedule(delay, interval)(doSomething())
Also remember that server timezone may be different from yours.
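To make the time zone explicit, the delay can be computed from a ZonedDateTime rather than from LocalTime.now(). The following is a minimal sketch using only java.time and scala.concurrent.duration; the names DailyDelay and delayUntilNext are hypothetical helpers, not part of Akka. It only computes the initial delay, which would then be handed to the scheduler as in the snippet above.

```scala
import java.time.{Duration => JDuration, LocalTime, ZoneId, ZonedDateTime}
import scala.concurrent.duration._

object DailyDelay {
  // Delay from `now` until the next occurrence of `target`, evaluated in the zone of `now`.
  // If `target` has already passed today (or is exactly now), the next run is tomorrow.
  def delayUntilNext(target: LocalTime, now: ZonedDateTime): FiniteDuration = {
    val todayRun = now.toLocalDate.atTime(target).atZone(now.getZone)
    val nextRun  = if (todayRun.isAfter(now)) todayRun else todayRun.plusDays(1)
    JDuration.between(now, nextRun).toMillis.millis
  }

  // Convenience overload: read the clock in an explicitly chosen zone.
  def delayUntilNext(target: LocalTime, zone: ZoneId): FiniteDuration =
    delayUntilNext(target, ZonedDateTime.now(zone))
}
```

For example, DailyDelay.delayUntilNext(LocalTime.of(13, 0), ZoneId.of("UTC")) gives the delay until the next 13:00 UTC regardless of the server's default zone. Because the computation works on zoned date-times, a day that is shorter or longer than 24 hours (daylight saving changes) is reflected in the first delay, although a fixed 24-hour interval afterwards would still drift on such days.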

Just to add another way to achieve it, this can be done using Akka Streams by ticking a message and filtering on time.
Source
.tick(0.seconds, 2.seconds, "hello") // emits "hello" every two seconds
.filter(_ => {
val now = LocalDateTime.now.getSecond
now > 20 && now < 30 // will let through only if the timing is right.
})
.runForeach(n => println("final sink received " + n))

Related

Await for a Sequence of Futures with timeout without failing on TimeoutException

I have a sequence of scala Futures of same type.
I want, after some limited time, to get a result for the entire sequence. Some futures may have succeeded, some may have failed, and some may not have completed yet; the non-completed futures should be considered failed.
I don't want to Await each future sequentially.
I did look at this question: Scala waiting for sequence of futures
and tried to use the solution from there, namely:
private def lift[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
futures.map(_.map { Success(_) }.recover { case t => Failure(t) })
def waitAll[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
Future.sequence(lift(futures))
futures: Seq[Future[MyObject]] = ...
val segments = Await.result(waitAll(futures), waitTimeoutMillis millis)
but I'm still getting a TimeoutException, I guess because some of the futures haven't completed yet.
That answer also states:
Now Future.sequence(lifted) will be completed when every future is completed, and will represent successes and failures using Try.
But I want my Future to be completed after the timeout has passed, not when every future in the sequence has completed. What else can I do?
If I used raw Future (rather than some IO monad which has this functionality built in, or some Akka utils for exactly that), I would hack together a utility like:
// make each separate future timeout
object FutureTimeout {
// separate EC for waiting
private val timeoutEC: ExecutionContext = ...
private def timeout[T](delay: Long): Future[T] = Future {
blocking {
Thread.sleep(delay)
}
throw new Exception("Timeout")
}(timeoutEC)
def apply[T](fut: Future[T], delay: Long)(
implicit ec: ExecutionContext
): Future[T] = Future.firstCompletedOf(Seq(
fut,
timeout(delay)
))
}
and then
Future.sequence(
  futures.map { fut =>
    FutureTimeout(fut, delay)
      .map(Success(_)) // lift each result into a Try
      .recover { case e => Failure(e) } // timeouts and other failures become Failure
  }
)
Since each future would terminate at most after delay we would be able to collect them into one result right after that.
You have to remember, though, that no matter how you trigger a timeout, you have no guarantee that the timed-out Future stops executing. It could keep running on some thread somewhere; you just wouldn't wait for its result. firstCompletedOf just makes this race more explicit.
Some other utilities (like e.g. Cats Effect IO) allow you to cancel computations (which is used in e.g. races like this one) but you still have to remember that JVM cannot arbitrarily "kill" a running thread, so that cancellation would happen after one stage of computation is completed and before the next one is started (so e.g. between .maps or .flatMaps).
If you aren't afraid of adding external deps there are other (and more reliable, as Thread.sleep is just a temporary ugly hack) ways of timing out a Future, like Akka utils. See also other questions like this.
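If adding a dependency is not an option, the Thread.sleep hack can be avoided with the JDK alone. The following is a minimal sketch, not a library API: TimedFutures, withTimeout and collectAll are hypothetical names, and the timer is a plain ScheduledExecutorService that fails a Promise when the time is up.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration.FiniteDuration
import scala.util.{Failure, Success, Try}

object TimedFutures {
  // One daemon thread is enough: the timer only completes promises, it does no real work.
  private val timer = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "future-timeout")
      t.setDaemon(true)
      t
    }
  })

  // Completes with the result of `fut`, or fails with TimeoutException after `timeout`.
  // The underlying computation is NOT cancelled; we merely stop waiting for it.
  def withTimeout[T](fut: Future[T], timeout: FiniteDuration)(implicit ec: ExecutionContext): Future[T] = {
    val p = Promise[T]()
    val task = timer.schedule(new Runnable {
      def run(): Unit = { p.tryFailure(new TimeoutException(s"no result after $timeout")); () }
    }, timeout.toMillis, TimeUnit.MILLISECONDS)
    fut.onComplete { result =>
      task.cancel(false) // the timer is no longer needed
      p.tryComplete(result)
    }
    p.future
  }

  // Every element completes within `timeout`, so the whole sequence does too.
  def collectAll[T](futures: Seq[Future[T]], timeout: FiniteDuration)(implicit ec: ExecutionContext): Future[Seq[Try[T]]] =
    Future.sequence(futures.map { f =>
      withTimeout(f, timeout).map(v => Success(v): Try[T]).recover { case e => Failure(e) }
    })
}
```

With this, Await.result(TimedFutures.collectAll(futures, timeout), timeout + someMargin) should return a Seq of Try values in which unfinished futures appear as Failure(TimeoutException). The margin is needed because the outer Await and the per-future timers are started at slightly different moments.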
Here is a solution using Monix:
import monix.eval.Task
import monix.execution.Scheduler
import scala.concurrent.duration._
import scala.util.{Failure, Random, Success}
val timeoutScheduler = Scheduler.singleThread("timeout") // a single thread is safe here because timeout tasks are very fast
def sequenceDiscardTimeouts[T](tasks: Task[T]*): Task[Seq[T]] = {
Task
.parSequence(
tasks
.map(t =>
t.map(Success.apply) // Map to success so we can collect the value
.timeout(500.millis)
.executeOn(timeoutScheduler) // needed to run timeouts on a dedicated scheduler that won't be blocked by blocking/IO work, if you have any
.onErrorRecoverWith { ex =>
println("timed-out")
Task.pure(Failure(ex)) //It's assumed that any error is a timeout. It's possible to "catch" just timeout exception here
}
)
)
.map { res =>
res.collect { case Success(r) => r }
}
}
Testing code
implicit val mainScheduler = Scheduler.fixedPool(name = "main", poolSize = 10)
def slowTask(msg: String) = {
Task.sleep(Random.nextLong(1000).millis) //Sleep here to emulate a slow task
.map { _ =>
msg
}
}
val app = sequenceDiscardTimeouts(
slowTask("1"),
slowTask("2"),
slowTask("3"),
slowTask("4"),
slowTask("5"),
slowTask("6")
)
val started: Long = System.currentTimeMillis()
app.runSyncUnsafe().foreach(println)
println(s"Done in ${System.currentTimeMillis() - started} millis")
This will print different output on each run, but it should look like the following:
timed-out
timed-out
timed-out
3
4
5
Done in 564 millis
Please note the usage of two separate schedulers. This is to ensure that timeouts will fire even if the main scheduler is busy with business logic. You can test it by reducing poolSize for main scheduler.

Akka scheduler not completing

I need to send a message to an actor at specific intervals. I am using the following code:
object SendToActor extends App {
import Sender._
val system: ActorSystem = ActorSystem("sender")
try {
val senderActor: ActorRef = system.actorOf(Sender.props, "sendActor")
val sendSchedule =
system.scheduler.schedule(0 milliseconds, 5 minutes, senderActor, doSomething())
} finally {
system.terminate()
}
}
Unfortunately, the scheduler doesn't seem to run unless I do one of the following:
Put a readLine() right after it:
val sendSchedule = system.scheduler.schedule(0 milliseconds, 5 minutes, senderActor, doSomething())
readLine()
Put a Thread.sleep() right after it:
val sendSchedule = system.scheduler.schedule(0 milliseconds, 5 minutes, senderActor, doSomething())
Thread.sleep(10000)
Is there a reason why the scheduler won't run as coded above? Why does it require the sleep in order to work?
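Probably because you're terminating the actor system immediately after creating the schedule: the finally block calls system.terminate() as soon as schedule(...) returns, so the system is shut down before the scheduler gets a chance to do its work. The readLine() or Thread.sleep() merely keeps the system alive long enough for the scheduled messages to be sent.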
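The same effect can be reproduced without Akka. The sketch below is only an analogy built on java.util.concurrent, not Akka's scheduler: EarlyShutdownDemo and countRuns are hypothetical names, and shutdownNow() stands in for system.terminate() in the finally block.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object EarlyShutdownDemo {
  // Schedules a periodic task, then shuts the scheduler down either immediately
  // or after waiting for the first run. Returns how many times the task ran.
  def countRuns(waitForFirstRun: Boolean): Int = {
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    val runs      = new AtomicInteger(0)
    val firstRun  = new CountDownLatch(1)
    try {
      scheduler.scheduleAtFixedRate(new Runnable {
        def run(): Unit = { runs.incrementAndGet(); firstRun.countDown() }
      }, 50, 50, TimeUnit.MILLISECONDS)
      // Without this wait, control falls straight through to the finally block.
      if (waitForFirstRun) { firstRun.await(2, TimeUnit.SECONDS); () }
    } finally {
      scheduler.shutdownNow() // plays the role of system.terminate()
    }
    scheduler.awaitTermination(1, TimeUnit.SECONDS)
    runs.get
  }
}
```

Calling countRuns(false) tears the scheduler down before the first tick is due, while countRuns(true) lets at least one tick through. In the Akka program the equivalent fix is to terminate the system only when the application is really done, rather than in a finally block right after scheduling.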

How to implement reliable Periodic Actor?

I am using PeriodicActor taken from akka-pattern with some minor changes
trait PeriodicActor[T] extends DecoratingActor with ActorLogging {
import context.dispatcher
var messages = new ListBuffer[T]
abstract override def preStart() = {
schedule()
super.preStart()
}
protected def schedule() {
context.system.scheduler.scheduleOnce(1 seconds, self, Tick)
}
receiver {
case Tick => {
flush()
schedule()
}
}
def flush() = {
handleMessages(this.messages)
.recover {
case NonFatal(e) => log.error(e, "error in actor")
}
this.messages.clear()
}
/**
* implement to handle buffered messages.
*/
def handleMessages(messages: ListBuffer[T]): Future[Any]
def buffer(msg: T) = {
messages.append(msg)
}
}
Every 1 second a Tick is sent to self, to flush all messages received during this second.
My problem is that scheduleOnce is not reliable enough. The Tick message might not get received, which means the Ticking mechanism will stop working.
So I thought of ways to make sure this will not stop:
Maybe add an if statement to the buffer method so that, for example, if the list reaches a certain size, it sends a Tick to flush it. The problem is that if a Tick message is lost, the list might not get flushed for a while, which is not good for me.
Maybe add another schedule that sends a Tick every 5 minutes, to make sure the mechanism keeps working:
context.system.scheduler.schedule(5 minutes, longInterval, self, Tick)
What do you think? Is there a better way?
Your problem is caused by the at-most-once delivery semantics that Akka guarantees. Since you're prepared to consider sending additional Ticks every 5 minutes, let's assume that you don't need exactly-once delivery of Ticks, so the occasional extra Tick is acceptable. This means that you need at-least-once delivery of Ticks.
At-least-once delivery is achieved using some sort of ACK-RETRY scheme; in this case you could implement it with the ask pattern.
protected def schedule() {
context.system.scheduler.scheduleOnce(1 seconds){
sendTick()
}
}
def sendTick() {
implicit val timeout: Timeout = Timeout(1.second) // needs akka.util.Timeout and akka.pattern.ask in scope
(self ? Tick).onComplete{
case Success(_) => schedule() // <- queue up next tick
case Failure(_) => sendTick() // <- send another tick immediately
}
}
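In your receiver, you need to provide the acknowledgement. The next Tick is queued by sendTick() once the acknowledgement arrives, so the receiver should not call schedule() itself; otherwise every Tick would start an additional timer chain:
receiver {
case Tick => {
sender ! TickAck
flush()
}
}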
More information on akka delivery guarantees: http://www.mjlivesey.co.uk/2016/02/19/akka-delivery-guarantees.html
Besides lost messages, another cause of failure is the actor dying. This requires a supervision strategy. The default strategy is to restart the actor, and postRestart() by default calls preStart(), so unless you use a different supervision strategy or override postRestart() you should get acceptable behaviour: if the actor dies, it will restart and schedule another Tick.

Handle twitter4j User Stream 420 Exception

The actual problem is this: I open a User Stream to populate a cache of mine. Sometimes this stream gets a 420 exception (Too many login attempts in a short period of time).
How long should I wait before trying to reestablish connection?
override def onException(ex: Exception): Unit = {
Logger.info("Exception:::" + ex.getMessage + ":::" + ex.getCause)
if (ex.getMessage.startsWith("420")) {
// Can't authenticate for now, thus has to fill up cache hole in next start
// Wait some time (How long?) Thread.sleep(5000L)
// Connect via restApi and fill up the holes in the cache
// Continue listening
}
}
I suppose you would have to use some backoff strategy here. I also wouldn't use sleep; I would keep the application asynchronous.
This is probably not strictly a solution to your problem, since it is close to pseudocode, but it could be a start. First I borrow the timeout future definition from Play:
import scala.language.higherKinds
import scala.concurrent.duration.FiniteDuration
import java.util.concurrent.TimeUnit
import scala.concurrent.{ExecutionContext, Future, Promise => SPromise}
import play.api.libs.concurrent.Akka
import util.Try
def timeout[A](message: => A, duration: Long, unit: TimeUnit = TimeUnit.MILLISECONDS)(implicit ec: ExecutionContext): Future[A] = {
val p = SPromise[A]()
Akka.system.scheduler.scheduleOnce(FiniteDuration(duration, unit)) {
p.complete(Try(message))
}
p.future
}
This uses Akka to schedule a future execution and combined with a promise returns a future. At this point you could chain future execution using flatMap on the timeout future:
val timeoutFuture: Future[String] =
timeout("timeout", duration, TimeUnit.SECONDS)
timeoutFuture.flatMap(timeoutMessage => connectToStream())
At this point the connection is attempted only after the timeout has expired, but we still need some kind of reconnection mechanism. For that we can use recoverWith, since each retry is itself a Future:
def twitterStream(duration: Long = 0, retry: Int = 0): Future[Any] = {
val timeoutFuture: Future[String] =
timeout("timeout", duration, TimeUnit.SECONDS)
// check how many times we have retried, to implement a stop-trying strategy
// check how long the duration is and reset it if it gets too long
timeoutFuture.flatMap(timeoutMessage => connectToStream())
.recoverWith { // recoverWith flattens the Future returned by the retry
case connectionLost: SomeConnectionExpiredException =>
twitterStream(duration + 20, retry + 1) // try to reconnect
case ex: Exception if ex.getMessage.startsWith("420") =>
twitterStream(duration + 120, retry + 1) // try to reconnect with a longer timer
case _ =>
Future.successful(someDefault()) // someDefault() is a placeholder fallback value
}
}
def connectToStream(): Future[String] = {
// connect to twitter
// do some computation
// return some future with some result
Future("Tweets")
}
What happens here is that when the future fails with a 420 or a connection-lost exception, the recovery block runs and the function calls itself again, restarting the connection after a longer delay (duration + 20 seconds for a lost connection, duration + 120 seconds for a 420).
A couple of notes: the code is untested (I could only compile it). The backoff time here is linear (x + y), so you may want to look at an exponential backoff strategy. Lastly, you will need Akka for the scheduleOnce call used in the timeout future (Play already has Akka available). For other ways of putting a timeout on futures, check this SO question.
Not sure if all this is overkill; there are probably shorter and easier solutions.
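For the exponential variant, the delay computation can be isolated in a small pure function. This is a minimal sketch under assumptions: Backoff and exponential are hypothetical names, and the base and cap values in the usage note are illustrative, not values recommended by Twitter.

```scala
import scala.concurrent.duration._

object Backoff {
  // Delay before the given retry (0-based): base * 2^retry, capped at `max`.
  def exponential(retry: Int, base: FiniteDuration, max: FiniteDuration): FiniteDuration = {
    require(retry >= 0, "retry must be non-negative")
    // Limit the exponent so the shift itself cannot overflow a Long.
    val factor = 1L << math.min(retry, 30)
    // Compare via division so the multiplication below cannot overflow either.
    if (base.toMillis > max.toMillis / factor) max
    else (base.toMillis * factor).millis
  }
}
```

With a base of 5 seconds and a cap of 5 minutes, retries 0, 1, 2, 3 wait 5, 10, 20 and 40 seconds, and from the sixth retry onwards the delay stays at the cap. In the recursive function above, the result could replace the fixed duration + 20 and duration + 120 increments.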

scala 2.10 callback at the end of a `Deadline`

In Scala 2.10, along with the new Future/Promise API, they introduced the Duration and Deadline utilities (as described here). I looked around but couldn't find anything in the Scala standard library to do something like:
val deadline = 5 seconds fromNow
After(deadline){
//do stuff
}
//or
val deadlineFuture: Future[Nothing] = (5 seconds fromNow).asFuture
deadlineFuture onComplete {
//do stuff
}
Is there anything like that available that I've missed, or will I have to implement this kind of behavior myself?
Not quite built in, but they provide just enough rope.
The gist is to wait on an empty promise that must disappoint (i.e., time out).
import scala.concurrent._
import scala.concurrent.duration._
import scala.util._
import ExecutionContext.Implicits.global
object Test extends App {
val v = new SyncVar[Boolean]()
val deadline = 5 seconds fromNow
future(Await.ready(Promise().future, deadline.timeLeft)) onComplete { _ =>
println("Bye, now.")
v.put(true)
}
v.take()
// or
val w = new SyncVar[Boolean]()
val dropdeadline = 5 seconds fromNow
val p = Promise[Boolean]()
p.future onComplete {_ =>
println("Bye, now.")
w.put(true)
}
Try(Await.ready(Promise().future, dropdeadline.timeLeft))
p trySuccess true
w.take()
// rolling it
implicit class Expiry(val d: Deadline) extends AnyVal {
def expiring(f: =>Unit) {
future(Await.ready(Promise().future, d.timeLeft)) onComplete { _ =>
f
}
}
}
val x = new SyncVar[Boolean]()
5 seconds fromNow expiring {
println("That's all, folks.")
x.put(true)
}
x.take() // wait for it
}
It's just a timestamp holder. For example, suppose you need to spread the execution of N sequential tasks over T hours. When you have finished the first one, you check the deadline and schedule the next task based on the (time left)/(tasks left) interval. At some point isOverdue() becomes true, and you just execute the remaining tasks in parallel.
Or you could check isOverdue() and, if it is still false, use timeLeft to set a timeout on the execution of the next task, for example.
It's much better than manipulating Date and Calendar to determine the time left. Duration was also used in Akka for timing.
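The (time left)/(tasks left) idea can be written down in a few lines. This is a minimal sketch using only scala.concurrent.duration; DeadlineBudget and nextSlot are hypothetical names, and how the slot is then used (as a delay or as a timeout) is left to the caller.

```scala
import scala.concurrent.duration._

object DeadlineBudget {
  // Time slot for the next task: (time left) / (tasks left).
  // Returns zero once the deadline has passed (or no tasks remain),
  // which the caller can treat as "run everything that is left right now".
  def nextSlot(deadline: Deadline, tasksLeft: Int): FiniteDuration =
    if (tasksLeft <= 0 || deadline.isOverdue()) Duration.Zero
    else deadline.timeLeft / tasksLeft.toLong
}
```

For instance, with val deadline = 10.seconds.fromNow and five tasks remaining, nextSlot(deadline, 5) is just under two seconds; calling it again after each task recomputes the slot from whatever time is actually left.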