Is a crontab enough to schedule a celery task to run periodically? - celery

I have a periodic task that uses a crontab to run every day at 1:01 AM using
run_every = crontab(hour=1, minute=1)
Once I get my server up and running, is that enough to trigger the task to run once a day? Or do I also need to use a database scheduler?

Yes, that is enough, provided a celery beat process is running alongside your workers. Celery beat keeps its schedule state in its own local file (celerybeat-schedule by default), so no database scheduler is required.
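For example, a minimal sketch of that setup using the beat_schedule setting (the app name, task name, and broker URL below are illustrative, not from the question):

from celery import Celery
from celery.schedules import crontab

app = Celery('myapp', broker='redis://localhost:6379/0')  # broker URL is an assumption

@app.task
def nightly_task():
    ...  # task body goes here

app.conf.beat_schedule = {
    'nightly-task': {
        'task': 'myapp.nightly_task',
        'schedule': crontab(hour=1, minute=1),  # every day at 1:01 AM
    },
}

Start a worker and a beat process (celery -A myapp worker and celery -A myapp beat); beat persists its last-run times in the celerybeat-schedule state file across restarts.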

Related

Running very lightweight tasks periodically with kubernetes

Consider a requirement where we need to run a very simple and lightweight task, say a curl command, every 10 minutes.
If this were to run in a Kubernetes cluster, is it efficient to create a container every 10 minutes, just to execute a task that may take a few seconds or even milliseconds? Is that overkill from a time and cost angle?
Please note that, unfortunately, lambda functions or cloud functions are not an option.
You can use a CronJob to run Jobs on a time-based schedule. These automated jobs run like Cron tasks on a Linux or UNIX system. Cron jobs are useful for creating periodic and recurring tasks.
https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/
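For illustration, a minimal CronJob manifest along those lines (the name, image, and URL below are placeholders, not from the question):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: curl-every-10-min
spec:
  schedule: "*/10 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: curl
            image: curlimages/curl:latest   # placeholder image
            args: ["-fsS", "https://example.com/health"]   # placeholder URL
          restartPolicy: OnFailure

The pod only exists for the duration of each run, so the overhead is a short-lived container every 10 minutes rather than an always-on one.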

Django celery beat not running multiple clocked task at same time. Is there any other way out?

I want to dynamically add tasks to run at a particular time (clocked tasks). I am using django-celery-beat. The problem I am facing is that celery only executes one task and ignores the rest.
I have tried the following code, and from reading the library I found that django-celery-beat disables a schedule once it has executed its clocked task. This might be the reason the other/next tasks do not run.
What am I doing wrong, and what is an alternative way to schedule multiple tasks to run at the same time?
clocked, _ = ClockedSchedule.objects.get_or_create(
    clocked_time=next_run_time
)
PeriodicTask.objects.create(
    clocked=clocked,
    name=guid1,
    one_off=True,
    task="schedulerapp.jobscheduler.runEvent",
    args=json.dumps([guid1]),
)
PeriodicTask.objects.create(
    clocked=clocked,
    name=guid2,
    one_off=True,
    task="schedulerapp.jobscheduler.runEvent",
    args=json.dumps([guid2]),
)
This should work as an alternative, using an interval schedule instead of a clocked one:

from django_celery_beat.models import PeriodicTask, IntervalSchedule

schedule = IntervalSchedule.objects.create(
    every=10, period=IntervalSchedule.SECONDS
)
task = PeriodicTask.objects.create(
    interval=schedule,
    name=guid1,
    task='schedulerapp.jobscheduler.runEvent',
    args=json.dumps([guid1]),
)
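If you need the clocked behavior itself, one possible workaround (an untested sketch, assuming the shared ClockedSchedule row is what gets disabled after the first run) is to give each task its own schedule row by using create instead of get_or_create:

clocked1 = ClockedSchedule.objects.create(clocked_time=next_run_time)
clocked2 = ClockedSchedule.objects.create(clocked_time=next_run_time)

PeriodicTask.objects.create(
    clocked=clocked1,
    name=guid1,
    one_off=True,
    task="schedulerapp.jobscheduler.runEvent",
    args=json.dumps([guid1]),
)
PeriodicTask.objects.create(
    clocked=clocked2,
    name=guid2,
    one_off=True,
    task="schedulerapp.jobscheduler.runEvent",
    args=json.dumps([guid2]),
)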

How to have celery expire results when using a database backend

I'm not sure I understand how result_expires works.
I read,
result_expires
Default: Expire after 1 day.
Time (in seconds, or a timedelta object) for when after stored task tombstones will be deleted.
A built-in periodic task will delete the results after this time (celery.backend_cleanup), assuming that celery beat is enabled. The task runs daily at 4am.
...
When using the database backend, celery beat must be running for the results to be expired.
(from here: http://docs.celeryproject.org/en/latest/userguide/configuration.html#std:setting-result_expires)
So, in order for this to work, I have to actually do something like this:
python -m celery -A myapp beat -l info --detach
?
Is that what the documentation is referring to by "celery beat is enabled"? Or, rather than executing this manually, there is some configuration that needs to be set which would cause celery beat to be called automatically?
Re: celery beat, you are correct. If you use a database backend, you have to run celery beat as you posted in your original post. By default, celery beat sets up a daily task (celery.backend_cleanup) that deletes older results from the results database. If you are using a Redis results backend, you do not have to run celery beat, since Redis expires the keys on its own. How you choose to run celery beat is up to you; personally, we do it via systemd.
If you want to configure the expiration time to be something other than the default 1 day, use the result_expires setting to set the number of seconds (or a timedelta) after a result is recorded that it should be deleted, e.g. 1800 for 30 minutes.
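For example, a minimal sketch, assuming your Celery application instance is named app:

app.conf.result_expires = 1800  # stored results become eligible for deletion 30 minutes after being recorded

With the database backend, the actual deletion still happens in the daily celery.backend_cleanup task, so beat must be running for this to take effect.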

celery beat running task on cleanup, how to stop it

I have a bunch of celery beat tasks running at different times of day, but one task in particular, which sends birthday messages at 8:00 AM, also gets executed when beat's cleanup happens at 4:00 AM, so my task runs twice a day. I noticed this happens when I restart celery beat the previous day. How do I get around this and tell celery not to execute it at 4:00 AM?

Celery beat scheduling option to immediately launch task upon celery launch

I can schedule an hourly task in my Django app using celery beat in settings.py like so:
CELERYBEAT_SCHEDULE = {
    'tasks.my_task': {
        'task': 'tasks.my_task',
        'schedule': timedelta(seconds=60*60),
        'args': (),
    },
}
But is there a way to schedule a task such that it immediately queues up and runs, thereafter following the configured schedule? E.g., something like executing a selected task instantly at celery launch. What's the configuration for that?
Add the following at the bottom of tasks.py, after the task definitions (substitute your own task function for my_task):

my_task.run()  # runs the task body once at import time, i.e. when celery starts

This ensures the specified task is run when celery starts. Thereafter, it executes according to schedule.
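Note that my_task.run() executes the task body inline in every process that imports tasks.py (worker and beat alike). If you would rather have the startup run go through the broker and be picked up by a worker like any scheduled run, queue it instead:

my_task.apply_async()  # or my_task.delay(); enqueues one immediate run via the broker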