I want all asynchronous tasks in my app to retry on any exception, and I also want the retries to follow exponential backoff.
@celery_app.task(autoretry_for=(Exception,))
def some_task():
    ...
In my configuration I have
CELERY_TASK_ANNOTATIONS = {'*': {'max_retries': 5, 'retry_backoff': 5}}
The max_retries setting works: all tasks are now retried 5 times before failing. But every retry happens after a fixed 180 seconds.
I want some way for all tasks to follow retry_backoff without having to specify it for each of them, so that I can change it in one place at any time.
According to the Celery documentation, it looks like the property you want to set is retry_backoff_max:
Task.retry_backoff_max
A number. If retry_backoff is enabled, this option will set a maximum
delay in seconds between task autoretries. By default, this option is
set to 600, which is 10 minutes.
retry_backoff can be a number or a boolean, and the backoff behaves differently depending on which it is. For exponential backoff it appears you want to set it to True.
Task.retry_backoff
A boolean, or a number. If this option is set to
True, autoretries will be delayed following the rules of exponential
backoff. The first retry will have a delay of 1 second, the second
retry will have a delay of 2 seconds, the third will delay 4 seconds,
the fourth will delay 8 seconds, and so on. (However, this delay value
is modified by retry_jitter, if it is enabled.) If this option is set
to a number, it is used as a delay factor. For example, if this option
is set to 3, the first retry will delay 3 seconds, the second will
delay 6 seconds, the third will delay 12 seconds, the fourth will
delay 24 seconds, and so on. By default, this option is set to False,
and autoretries will not be delayed.
What you can do to avoid changing this in multiple places is to have a global variable, say global_retry_backoff = 5, that you use in your task decorators: @celery_app.task(autoretry_for=(Exception,), retry_backoff=global_retry_backoff), as sketched below.
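A minimal sketch of that approach (the RETRY_KWARGS name and broker URL are my own, not from the question): keep the shared retry settings in one dict and unpack it into every task decorator, so the whole policy lives in one place.

from celery import Celery

celery_app = Celery('app', broker='redis://localhost:6379/0')  # assumed broker

# One place to change the retry policy for every task.
RETRY_KWARGS = dict(
    autoretry_for=(Exception,),  # retry on any exception
    max_retries=5,               # give up after 5 attempts
    retry_backoff=5,             # backoff with factor 5: 5s, 10s, 20s, ...
    retry_backoff_max=600,       # cap each delay at 10 minutes
    retry_jitter=True,           # randomize delays (on by default)
)

@celery_app.task(**RETRY_KWARGS)
def some_task():
    ...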
Related
I'm trying to create a retry on a Celery task, but every retry needs to happen after a specific delay.
Is it possible to do that, and how?
For example, the first retry would be after 5 seconds, the next one after 25, the next one after 125.
@app.task(bind=True, max_retries=10, delay=5)
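A hedged sketch of one way to get those delays (5, 25, 125 seconds); none of this is from the original thread, and the task name and broker URL are illustrative. With bind=True the task receives itself as self and can compute the countdown per attempt:

from celery import Celery

app = Celery('app', broker='redis://localhost:6379/0')  # assumed broker

@app.task(bind=True, max_retries=10)
def flaky_task(self):
    try:
        ...  # the actual work goes here
    except Exception as exc:
        # self.request.retries is 0 on the first run, so the delays
        # are 5**1, 5**2, 5**3, ... = 5s, 25s, 125s, ...
        raise self.retry(exc=exc, countdown=5 ** (self.request.retries + 1))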
What is the difference between eventTimeTimeout and processingTimeTimeout in mapGroupsWithState?
Also, is it possible to make a state expire every 10 minutes, so that if the data for a particular key arrives after 10 minutes, the state is rebuilt from the beginning?
In short:
processing-time-based timeouts rely on the clock of the machine your job is running on. They are independent of any timestamps given in your data/events.
event-time-based timeouts rely on a timestamp column within your data that serves as the event time. In that case you need to declare this timestamp as a watermark.
More details are available in the Scala docs for the relevant class GroupState:
With ProcessingTimeTimeout, the timeout duration can be set by calling GroupState.setTimeoutDuration. The timeout will occur when the clock has advanced by the set duration. Guarantees provided by this timeout with a duration of D ms are as follows:
Timeout will never occur before the clock time has advanced by D ms.
Timeout will occur eventually when there is a trigger in the query (i.e. after D ms). So there is no strict upper bound on when the timeout would occur. For example, the trigger interval of the query will affect when the timeout actually occurs. If there is no data in the stream (for any group) for a while, then there will not be any trigger, and the timeout function call will not occur until there is data.
Since the processing time timeout is based on the clock time, it is affected by the variations in the system clock (i.e. time zone changes, clock skew, etc.).
With EventTimeTimeout, the user also has to specify the event time watermark in the query using Dataset.withWatermark(). With this setting, data that is older than the watermark is filtered out. The timeout can be set for a group by setting a timeout timestamp using GroupState.setTimeoutTimestamp(), and the timeout will occur when the watermark advances beyond the set timestamp. You can control the timeout delay by two parameters: (i) the watermark delay and (ii) an additional duration beyond the timestamp in the event (which is guaranteed to be newer than the watermark due to the filtering). Guarantees provided by this timeout are as follows:
Timeout will never occur before the watermark has exceeded the set timeout timestamp.
Similar to processing-time timeouts, there is no strict upper bound on the delay before the timeout actually occurs. The watermark can advance only when there is data in the stream, and the event time of the data has actually advanced.
"Also, is possible to make a state expire after every 10 min and if the data for that particular key arrives after 10 min the state should be maintained from the beginning?"
This happens automatically when using mapGroupsWithState. You just need to make sure you actually remove the state after the 10 minutes.
I'm using Entity Framework Core 2.2, and I decided to follow a blog suggestion and enable retry on failure:
services.AddDbContext<MyDbContext>(options =>
    options.UseSqlServer(
        Configurations["ConnectionString"],
        sqlServerOptionsAction: sqlOptions =>
        {
            sqlOptions.EnableRetryOnFailure(
                maxRetryCount: 10,
                maxRetryDelay: TimeSpan.FromSeconds(5),
                errorNumbersToAdd: null);
        }));
My question is: what is the maxRetryDelay argument for?
I would expect it to be the delay between retries, but the name implies it's the maximum delay. Does that mean my 10 retries could happen 1 second apart rather than 5 seconds apart as I desire?
The delay between retries is randomized up to the value specified by maxRetryDelay.
This is done to avoid multiple retries occurring at the same time and overwhelming the server. Imagine, for example, 10K requests to a web service failing due to a network issue and all retrying at the same time after 15 seconds. The database server would get a sudden wave of 10K queries.
By randomizing the delay, retries are spread across time and clients.
The delay for each retry is calculated by ExecutionStrategy.GetNextDelay. The source shows it's a randomized exponential backoff.
The default SqlServerRetryingExecutionStrategy uses that implementation. A custom retry strategy could use a different one.
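For illustration, here is a minimal Python sketch of the randomized exponential backoff idea; the constants are illustrative, not EF Core's exact values:

import random

def next_delay(attempt, base=1.0, factor=2.0, max_delay=5.0):
    """Delay in seconds before retry number `attempt` (0-based)."""
    exponential = base * (factor ** attempt)       # 1, 2, 4, 8, ...
    jittered = exponential * random.uniform(0, 1)  # spread clients over time
    return min(jittered, max_delay)                # never exceed max_delay

for attempt in range(5):
    print(f"retry {attempt + 1}: wait {next_delay(attempt):.2f}s")

This shows why maxRetryDelay is a cap rather than a fixed interval: each delay is drawn at random from an exponentially growing range, then clamped.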
I know Orchard has its default scheduler running every minute. Does that mean the minimum interval is 1 minute?
If I want a task to run every 20 seconds, how can I do that?
Thanks
You can open the file src\Orchard.Web\Config\Sites.config and uncomment the "Delay between background services executions" paragraph. There you can set the interval.
I read this in the Celery documentation for Task.rate_limit:
Note that this is a per worker instance rate limit, and not a global rate limit. To enforce a global rate limit (e.g., for an API with a maximum number of requests per second), you must restrict to a given queue.
How do I put a rate limit on a celery queue?
It turns out it can't be done at the queue level for multiple workers.
It can be done at the queue level for a single worker, or at the queue level for each worker individually.
So if you set 10 jobs/minute on 5 workers, your workers will process up to 50 jobs per minute collectively.
So to get only 10 jobs per minute overall, you either choose a single worker, or choose 5 workers with a limit of 2/minute each.
Update: here is how exactly to put the limit in settings/configuration:
task_annotations = {'tasks.<task_name>': {'rate_limit': '10/m'}}
or change the same for all tasks:
task_annotations = {'*': {'rate_limit': '10/m'}}
10/m means 10 tasks per minute, and /s would mean per second. More details here: Task annotations setting.
Hey, I am trying to find a way to rate-limit a queue, and I found out that Celery can't do that. However, Celery can control the rate per task; see this:
http://docs.celeryproject.org/en/latest/userguide/workers.html#rate-limits
So as a workaround, maybe you can set up one task per queue (which makes sense in a lot of situations) and put the limit on the task, as sketched below.
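A hedged sketch of that workaround (the queue and task names and the broker URL are illustrative): route a single task to its own queue and rate-limit the task, which effectively rate-limits the queue. Note the limit is still per worker instance, so run a single worker on that queue if you need a global cap.

from celery import Celery

app = Celery('app', broker='redis://localhost:6379/0')  # assumed broker

# Route one task to a dedicated queue...
app.conf.task_routes = {'tasks.call_external_api': {'queue': 'api_queue'}}
# ...and cap that task's rate; with only this task on the queue,
# the queue itself is effectively capped (per worker instance).
app.conf.task_annotations = {'tasks.call_external_api': {'rate_limit': '10/m'}}

@app.task(name='tasks.call_external_api')
def call_external_api():
    ...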
You can also set this limit in the Flower > Worker pane.
There is a dedicated input field there for entering your limit.
The suggested format is like the below:
The rate limits can be specified in seconds, minutes or hours by appending "/s", "/m" or "/h" to the value. Tasks will be evenly distributed over the specified time frame.
Example: "100/m" (hundred tasks a minute). This will enforce a minimum delay of 600ms between starting two tasks on the same worker instance.
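Flower applies this limit through Celery's remote-control API; a hedged sketch of doing the same from code (the task name and broker URL are illustrative):

from celery import Celery

app = Celery('app', broker='redis://localhost:6379/0')  # assumed broker

# Ask all running workers to cap this task at 100 executions per minute.
app.control.rate_limit('tasks.call_external_api', '100/m')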