Crontab style scheduling in Play 2.4.x? - scala

Technically, I could install cron on the machine and curl the URL, but I'm trying to avoid that. Is there any way to accomplish this?
The reason I want to avoid cron is so that I can easily change the schedule, or stop it completely, without having to SSH into the machine.

Take a look at: https://github.com/enragedginger/akka-quartz-scheduler.
Refer to http://quartz-scheduler.org/api/2.1.7/org/quartz/CronExpression.html for valid CronExpressions and examples.
An example taken from the docs:
An example schedule called Every-30-Seconds which, aptly, fires-off every 30 seconds:
akka {
  quartz {
    schedules {
      Every30Seconds {
        description = "A cron job that fires off every 30 seconds"
        expression = "*/30 * * ? * *"
        calendar = "OnlyBusinessHours"
      }
    }
  }
}
You can integrate this into your Play! application (probably in your Global object).

You can use the Akka scheduler.
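import scala.concurrent.duration._
import play.api.libs.concurrent.Akka
import play.api.libs.concurrent.Execution.Implicits.defaultContext

val scheduler = Akka.system(app).scheduler
scheduler.schedule(0.seconds, 1.hour) {
  // run this block every hour
}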
The first parameter is a delay, so if you wanted to delay to a specific time you could easily calculate the target time with some simple date arithmetic.
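The date arithmetic is the same in any language. As a rough sketch (written in Python for brevity, with `initial_delay` as a hypothetical helper name, and assuming naive local times with no daylight-saving handling), the wait until the next occurrence of a given wall-clock time could be computed like this:

```python
from datetime import datetime, timedelta

def initial_delay(now: datetime, hour: int, minute: int = 0) -> timedelta:
    """Return how long to wait from `now` until the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow's.
        target += timedelta(days=1)
    return target - now

# Example: at 22:15, the next 03:00 run is 4 hours 45 minutes away.
print(initial_delay(datetime(2024, 1, 10, 22, 15), hour=3))  # 4:45:00
```

The resulting duration would then be passed as the first argument of the scheduler call, with the repeat interval as the second.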

Check out https://github.com/philcali/cronish
Some example code from README.md:
val payroll = task {
  println("You have just been paid... Finally!")
}
// Yes... that's how you run it
payroll executes "every last Friday in every month"
val greetings = job (println("Hello there")) describedAs "General Greetings"
// give a delayed start
val delayed = greetings runs "every day at 7:30" in 5.seconds
// give an exact time to start
val exact = greetings runs "every day at noon" starting now + 1.week
// resets a job to its definition
val reseted = exact.reset()
reseted starting now + 1.day

Related

Killing the spark.sql

I am new to both Scala and Spark.
I have Scala code that executes queries one after another in a while loop.
If a particular query takes longer than a certain time, for example 10 minutes, we need to be able to stop that query and move on to the next one.
For example:
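import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success}

do {
  val f = Future(spark.sql("some query"))
  f.onComplete {
    case Success(_) => println("Query ran in 10mins")
    case Failure(_) => println("query took more than 10mins")
  }
  // Blocks for at most 10 minutes; throws a TimeoutException after that
  val result = Await.ready(f, 10.minutes)
} while (someCondition) // placeholder for the real loop condition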
I understand that when we call spark.sql, control passes to Spark, and that is what I need to kill or stop once the time limit is reached so that I can get the resources back.
I have tried multiple things, but I am not sure how to solve this.
Any help would be welcome, as I am stuck.

Schedule Celery task to run after other task(s) complete

I want to accomplish something like this:
results = []
for i in range(N):
    data = generate_data_slowly()
    res = tasks.process_data.apply_async(data)
    results.append(res)
celery.collect(results).then(tasks.combine_processed_data())
i.e., launch asynchronous tasks over a long period of time, then schedule a dependent task that will only be executed once all earlier tasks are complete.
I've looked at things like chain and chord, but it seems like they only work if you can construct your task graph completely upfront.
For anyone interested, I ended up using this snippet:
@app.task(bind=True, max_retries=None)
def wait_for(self, task_id_or_ids):
    try:
        ready = app.AsyncResult(task_id_or_ids).ready()
    except TypeError:
        ready = all(app.AsyncResult(task_id).ready()
                    for task_id in task_id_or_ids)
    if not ready:
        self.retry(countdown=2**self.request.retries)
And writing the workflow something like this:
task_ids = []
for i in range(N):
    task = (generate_data_slowly.si(i) |
            process_data.si(i)
            )
    task_id = task.delay().task_id
    task_ids.append(task_id)
# wait_for must be a signature (.si) to be chained; calling it directly would run it inline
final_task = (wait_for.si(task_ids) |
              combine_processed_data.si()
              )
final_task.delay()
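For a sense of how often wait_for polls, the countdown=2**self.request.retries expression can be tabulated. This is a small standalone sketch in plain Python, not Celery code; `retry_countdowns` is a hypothetical helper name, and it assumes the retry counter starts at 0 on the first attempt.

```python
from itertools import accumulate

def retry_countdowns(attempts: int) -> list:
    """Delays in seconds before each retry when countdown = 2 ** retries."""
    return [2 ** n for n in range(attempts)]

delays = retry_countdowns(8)
print(delays)                    # [1, 2, 4, 8, 16, 32, 64, 128]
print(list(accumulate(delays)))  # [1, 3, 7, 15, 31, 63, 127, 255]
```

Because each delay doubles, the gap before the next check is roughly as long as the total time already waited, so the final task may start noticeably later than the moment the last upstream task finished. Capping the countdown would bound that lag.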
That way you would be running your tasks synchronously.
The solution depends entirely on how and where data are collected. Roughly, given that generate_data_slowly and tasks.process_data are synchronized, a better approach would be to join both in one task (or a chain) and to group them.
chord will allow you to add a callback to that group.
The simplest example would be:
from celery import chord

@app.task
def getnprocess_data():
    data = generate_data_slowly()
    return whatever_process_data_does(data)

header = [getnprocess_data.s() for i in range(N)]
callback = combine_processed_data.s()
chord(header)(callback).get()
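To make the data flow of the chord concrete, here is a rough standard-library analogy. It does not use Celery at all: a thread pool stands in for the workers, and `get_and_process` and `combine` are hypothetical stand-ins for getnprocess_data and combine_processed_data. The point it illustrates is only that the callback runs once, after every header task has finished, and receives all of their results.

```python
from concurrent.futures import ThreadPoolExecutor

def get_and_process(i: int) -> int:
    """Stand-in for getnprocess_data: generate a value and process it."""
    data = i + 1          # pretend this was generated slowly
    return data * data    # pretend this is the processing step

def combine(results) -> int:
    """Stand-in for combine_processed_data: receives every header result."""
    return sum(results)

# "Header": run all tasks concurrently and wait for every one of them.
with ThreadPoolExecutor(max_workers=4) as pool:
    header_results = list(pool.map(get_and_process, range(5)))

# "Callback": runs only after the whole header has completed.
total = combine(header_results)
print(total)  # 55
```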

Schedule a job for the first day of each month

I need to schedule a task for the first day of each month. Up until now, I have been using this:
system.scheduler.schedule(0.microseconds, 30.days, schedulerActor, "update")
But as you may have guessed, this sometimes ends up running the task twice in a month (March) or not at all (February). Is there a better way to schedule the task for the first day of each month using the Akka scheduler?
The built-in Akka scheduler is more of a delayer than a scheduler. I would recommend using akka-quartz-scheduler. This module lets you schedule tasks to run at the times you actually want.
The usage is simple. Some config:
akka {
  quartz {
    schedules {
      YourScheduleName {
        description = "A cron job that fires off every first of the month at 5AM"
        expression = "0 0 5 1 1/1 ? *"
      }
    }
  }
}
And then in the code:
import akka.actor.Props
import com.typesafe.akka.extension.quartz.QuartzSchedulerExtension

case object Tick
val yourActor = system.actorOf(Props[YourActor])
QuartzSchedulerExtension(system).schedule("YourScheduleName", yourActor, Tick)
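The drift described in the question is easy to reproduce with plain date arithmetic. The sketch below is in Python and is independent of Akka; `every_30_days` and `next_first_of_month` are hypothetical helper names, the start date is chosen only to show the problem, and times are naive (no time zone handling).

```python
from datetime import datetime, timedelta

def every_30_days(start: datetime, runs: int) -> list:
    """Run times produced by a fixed 30-day interval."""
    return [start + timedelta(days=30 * n) for n in range(runs)]

def next_first_of_month(now: datetime, hour: int = 5) -> datetime:
    """Next first-of-month at `hour`:00 strictly after `now`."""
    candidate = now.replace(day=1, hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        # Roll over to the first day of the following month.
        year, month = (now.year + 1, 1) if now.month == 12 else (now.year, now.month + 1)
        candidate = candidate.replace(year=year, month=month)
    return candidate

# Fixed interval: February is skipped and March is hit twice.
for run in every_30_days(datetime(2023, 1, 30), 4):
    print(run.date())   # 2023-01-30, 2023-03-01, 2023-03-31, 2023-04-30

# Calendar-aware: always lands on the 1st.
print(next_first_of_month(datetime(2023, 1, 30)))  # 2023-02-01 05:00:00
```

A cron expression such as the one in the config above expresses the calendar-aware rule declaratively, which is why it avoids the skipped and doubled months.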

dispatch_after in swift explanation

I am currently working on a project and for a part of it I need to unhighlight a button after a set period of time. I decided to use dispatch_after.
I have managed to get it working, but can someone please explain to me exactly how this line of code works? I have been unable to understand how dispatch_after works.
dispatch_after(dispatch_time(DISPATCH_TIME_NOW, Int64(1000 * Double(NSEC_PER_MSEC))), dispatch_get_main_queue()) {
    self.redButton.highlighted = false
}
Let's break it down into multiple statements:
let when = dispatch_time(DISPATCH_TIME_NOW, Int64(1000 * Double(NSEC_PER_MSEC)))
let queue = dispatch_get_main_queue()
dispatch_after(when, queue) {
    self.redButton.highlighted = false
}
dispatch_after() enqueues the block for execution at a certain time
on a certain queue. In your case, the queue is the "main queue"
which is "the serial dispatch queue associated with the application’s main thread". All UI elements must be modified on the main thread only.
The when: parameter of dispatch_after() is a dispatch_time_t
which is documented as "a somewhat abstract representation of time".
dispatch_time() is a utility function to compute that time value.
It takes an initial time, in this case DISPATCH_TIME_NOW which
"indicates a time that occurs immediately", and adds an offset which
is specified in nanoseconds:
let when = dispatch_time(DISPATCH_TIME_NOW, Int64(1000 * Double(NSEC_PER_MSEC)))
NSEC_PER_MSEC = 1000000 is the number of nanoseconds per millisecond,
so
Int64(1000 * Double(NSEC_PER_MSEC))
is an offset of 1000*1000000 nanoseconds = 1000 milliseconds = one second.
The explicit type conversions are necessary because Swift does not
implicitly convert between types. Using Double ensures that it
works also in cases like
let when = dispatch_time(DISPATCH_TIME_NOW, Int64(0.3 * Double(NSEC_PER_SEC)))
to specify an offset of 0.3 seconds.
Summary: Your code enqueues a block to be executed on the main
thread in 1000 ms from now.
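The offset arithmetic can be double-checked outside Swift. A quick sketch in Python, with the two constants redefined by hand (1,000,000 nanoseconds per millisecond and 1,000,000,000 per second) rather than imported from any library:

```python
NSEC_PER_MSEC = 1_000_000        # nanoseconds per millisecond
NSEC_PER_SEC = 1_000_000_000     # nanoseconds per second

# Mirrors Int64(1000 * Double(NSEC_PER_MSEC))
one_second = int(1000 * float(NSEC_PER_MSEC))
# Mirrors Int64(0.3 * Double(NSEC_PER_SEC))
fraction = int(0.3 * float(NSEC_PER_SEC))

print(one_second)  # 1000000000
print(fraction)    # 300000000
```

The second line is the case where the floating-point conversion matters: with integer-only arithmetic there would be no way to write 0.3 of a second directly.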
Update: See How do I write dispatch_after GCD in Swift 3 and 4? for how the syntax changed
in Swift 3.

Replacing Celerybeat with Chronos

How mature is Chronos? Is it a viable alternative to scheduler like celery-beat?
Right now our scheduling implements a periodic "heartbeat" task that checks for "outstanding" events and fires them if they are overdue. We are using python-dateutil's rrule for defining this.
We are looking at alternatives to this approach, and Chronos seems a very attractive alternative: 1) it would remove the need for a heartbeat scheduling task, 2) it supports RESTful submission of events in ISO8601 format, 3) it has a useful interface for management, and 4) it scales.
The crucial requirement is that scheduling needs to be configurable on the fly from the web interface. This is why we can't use celerybeat's built-in scheduling out of the box.
Are we going to shoot ourselves in the foot by switching over to Chronos?
This SO question has solutions to your dynamic periodic task problem. The one below is not the accepted answer at the moment:
from datetime import datetime

from django.db import models
from djcelery.models import PeriodicTask, IntervalSchedule

class TaskScheduler(models.Model):
    periodic_task = models.ForeignKey(PeriodicTask)

    @staticmethod
    def schedule_every(task_name, period, every, args=None, kwargs=None):
        """ schedules a task by name every "every" "period". So an example call would be:
        TaskScheduler.schedule_every('mycustomtask', 'seconds', 30, [1,2,3])
        that would schedule your custom task to run every 30 seconds with the arguments 1, 2 and 3 passed to the actual task.
        """
        permissible_periods = ['days', 'hours', 'minutes', 'seconds']
        if period not in permissible_periods:
            raise Exception('Invalid period specified')
        # create the periodic task and the interval
        ptask_name = "%s_%s" % (task_name, datetime.now())  # create some name for the period task
        interval_schedules = IntervalSchedule.objects.filter(period=period, every=every)
        if interval_schedules:  # just check if interval schedules exist like that already and reuse em
            interval_schedule = interval_schedules[0]
        else:  # create a brand new interval schedule
            interval_schedule = IntervalSchedule()
            interval_schedule.every = every  # should check to make sure this is a positive int
            interval_schedule.period = period
            interval_schedule.save()
        ptask = PeriodicTask(name=ptask_name, task=task_name, interval=interval_schedule)
        if args:
            ptask.args = args
        if kwargs:
            ptask.kwargs = kwargs
        ptask.save()
        return TaskScheduler.objects.create(periodic_task=ptask)

    def stop(self):
        """pauses the task"""
        ptask = self.periodic_task
        ptask.enabled = False
        ptask.save()

    def start(self):
        """starts the task"""
        ptask = self.periodic_task
        ptask.enabled = True
        ptask.save()

    def terminate(self):
        self.stop()
        ptask = self.periodic_task
        self.delete()
        ptask.delete()
I haven't used djcelery yet, but it supposedly has an admin interface for dynamic periodic tasks.