EDF scheduling: what if a task has missed its deadline? - real-time

I know the tasks won't be schedulable, but what if, at a certain time, I have to choose between a task that has already missed its deadline and another task whose deadline is still ahead? What does the algorithm say should happen in this case?
Thank you

From my search, I haven't seen anyone say anything specific about what happens when a task's deadline is older than the current time. The specific action would be up to the developer, because I don't think such a task would be part of a valid schedule. When a task misses its deadline and has yet to be executed, it should be dropped under any valid schedule.
Given that scheduling usually happens periodically, by the time the new schedule is set up the task would either have a new deadline or, if it is aperiodic and has not been submitted again, I believe it would have been dropped.
This document about the Linux kernel's specific algorithm shows an example, and a developer's explanation, of one obvious solution for the case where the task's deadline < current time.
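Very roughly, the rule that document describes boils down to something like the following Python-flavoured sketch (the field names are my own, and I assume the task's relative deadline equals its period): if a task wakes up with its deadline already in the past, or with more remaining budget than its reserved bandwidth allows before that deadline, it gets a fresh deadline and a full budget instead of keeping the stale one.

def cbs_wakeup(task, now):
    # Hypothetical fields: budget (runtime per period), period, deadline, remaining.
    bandwidth = task.budget / task.period
    stale = task.deadline < now
    overflow = not stale and task.remaining / (task.deadline - now) > bandwidth
    if stale or overflow:
        task.deadline = now + task.period    # fresh deadline, one period ahead
        task.remaining = task.budget         # replenished budget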

The task that has missed its deadline continues until it finishes its execution. This can lead to the domino effect in EDF.
Check out "Transient Over Load Condition & Domino Effect in Earliest deadline first" here

Related

How do missed deadlines and dropped tasks work in EDF and RMS scheduling algorithms?

I am writing a school project about real-time systems and their scheduling algorithms. Particularly I am trying to compare several of these algorithms, namely RMS, EDF, and LLF under overloaded systems. But I am confused about how these systems deal with deadlines that are missed.
For example, consider the following set of tasks. I assume all tasks are periodic and their deadlines are equal to their periods.
Task 1: execution time 2, period 5
Task 2: execution time 2, period 6
Task 3: execution time 2, period 7
Task 4: execution time 2, period 8
It is not possible to make a feasible schedule with these tasks because the CPU utilization is over 100%, which means some deadlines will be missed and, more importantly, some tasks will not be completed at all. For the sake of comparison, I want to calculate a penalty (or cost) for each of the tasks which increases as more and more deadlines are missed. Here's where the questions and confusion start.
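A quick sanity check of that utilization claim, using the task parameters from the list above:

tasks = [(2, 5), (2, 6), (2, 7), (2, 8)]   # (execution time, period)
print(sum(c / p for c, p in tasks))        # ~1.27, i.e. over 100%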
Now I understand, for example, that in RMS, the first task will never miss since it has the highest priority, and the second task also never misses. On the other hand, the third task does miss a bunch of deadlines. Here is the first question:
Do we consider a task to be dropped in RMS if it misses its deadline and a new task is dispatched?
1.a) If we do consider it dropped, how would I reflect this in my penalty calculations? Since the task is never completed, it would seem redundant to calculate the time it took to complete the task after its deadline passed.
1.b) If we do not consider it to be dropped and the execution of the task continues even after its deadline passes by a whole period, what happens to the new task that is dispatched? Do we drop that task instead of the one we already started, or does it just domino onto the next one, and the next one, and so on? If that is the case, it means that when a schedule with a length of the LCM of the tasks' periods is made, some dispatches of task 3 are not completed at all.
Another confusion is of the same nature but with EDF. EDF fails on several tasks after a certain time. I understand that in the case of EDF I must continue with the execution of the tasks even if they pass their deadlines, which means all of the tasks will eventually complete even though they will not fit within their deadlines, hence the domino effect. Then the question becomes:
Do we drop any tasks at all? And what happens to the tasks that are dispatched when the period resets but cannot be executed, because the previous instance of the same task, having missed its deadline in the period before, is still being executed?
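To make the EDF part concrete, here is a tiny simulation sketch of the task set above under the "never drop" policy I described: a job that misses its deadline keeps its (old) earliest deadline and keeps running, so later jobs finish later and later. The time-unit granularity and the code itself are my own illustration, not a standard listing.

tasks = [(2, 5), (2, 6), (2, 7), (2, 8)]   # (execution time, period), deadline = period
jobs = []                                  # pending jobs as [absolute deadline, remaining time, task index]
for t in range(40):                        # simulate 40 time units
    for i, (c, p) in enumerate(tasks):
        if t % p == 0:
            jobs.append([t + p, c, i])     # release a new job of task i
    if jobs:
        jobs.sort()                        # earliest deadline first, missed jobs included
        jobs[0][1] -= 1                    # run the chosen job for one time unit
        if jobs[0][1] == 0:
            deadline, _, i = jobs.pop(0)
            if t + 1 > deadline:
                print(f"task {i + 1} finished {t + 1 - deadline} unit(s) late at t={t + 1}")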
I know it is a long post but any help is appreciated. Thank you. If you cannot understand any of the questions I may clarify them at your request.

Delay Celery task based on condition

Is there any way to delay a Celery task from running based on a condition? Before it moves from scheduled to active, I would like to perform a quick check to see if my machine can run the task, based on the arguments provided and my machine's state at the time. If it can't, the scheduled queue should halt and wait until the condition is satisfied.
I've looked at the following options, but they didn't seem to cut it:
Celery's Signals: the closest thing I could get to is task_prerun(), but regardless of what I put in there, the task still gets run, and it doesn't stop the other scheduled tasks from running. There's also worker_ready(), but that doesn't look at the upcoming task's arguments to do the check.
Database Lock (also here as well): I can have each of the tasks start running normally and then do the check at the beginning of the task's run, but if I set a periodic interval to check whether the condition is met, I lose the order of the active queue, since the condition can be met at any point and any one of the many active tasks could be the one that continues. This is where the database lock comes in, and it is so far the most feasible solution: I take a lock every time I do the check, and if the condition is not met, it stays locked. When the condition is finally met, I release the lock for the next item in the queue, preserving the queue's original order.
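For what it's worth, here is a rough sketch of that lock idea, assuming Django's cache framework as the shared lock store and two hypothetical helpers, condition_met() and do_the_work(); it is only an illustration of the approach, not a tested recipe.

from celery import shared_task
from django.core.cache import cache

GATE = 'guarded-task-gate'                      # hypothetical shared lock key

@shared_task(bind=True, max_retries=None)
def guarded_task(self, *args):
    # Take the gate if it is free; cache.add is atomic, so only one task wins.
    if cache.get(GATE) is None:
        cache.add(GATE, self.request.id, timeout=None)
    if cache.get(GATE) != self.request.id:
        raise self.retry(countdown=5)           # someone else holds the gate
    # We hold the gate; keep holding it until the condition is met, so no
    # other guarded task can overtake this one.
    if not condition_met(*args):                # hypothetical condition check
        raise self.retry(countdown=5)
    try:
        do_the_work(*args)                      # hypothetical actual workload
    finally:
        cache.delete(GATE)                      # release the gate for the next task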
I find it surprising that celery doesn't have this functionality to specify if/when the next item in the scheduled queue is ready to be run.

Is there a way to make the Start Time closer to the Schedule Time in an SCOM Task?

I notice that when I execute an SCOM Task on demand from a PowerShell script, there are two columns in the Task Status view called Schedule Time and Start Time. There seems to be an interval of around 15 seconds between these two fields. I'm wondering if there is a way to minimize this interval so I could get a shorter response time when I execute an SCOM task on demand.
This is not generally something that users can control. The "ScheduledTime" correlates to the time when the SDK received the request to execute the task. The "StartTime" represents the time that the agent healthservice actually began executing the task workflow locally.
In between those times, things are moving as fast as they can. The request needs to propagate to the database, and a server healthservice needs to be notified that a task is being triggered. The servers then need to determine the correct route for the task message to take, then the healthservices need to actually send and receive the message. Finally, it gets to the actual agent where the task will execute. All of these messages go through the same queues as other monitoring data.
That sequence can be very quick (when running a task against the local server), or fairly slow (in a big Management Group, or when there is lots of load, or if machines/network are slow). Besides upgrading your hardware, you can't really do anything to make the process run quicker.

Work around celerybeat being a single point of failure

I'm looking for a recommended solution to work around celerybeat being a single point of failure in a celery/rabbitmq deployment. Searching the web, I haven't found anything that makes sense so far.
In my case, a once-a-day timed scheduler kicks off a series of jobs that could run for half a day or longer. Since there can only be one celerybeat instance, if something happens to it or to the server it's running on, critical jobs will not be run.
I'm hoping there is already a working solution for this, as I can't be the only one who needs a reliable (clustered or the like) scheduler. I don't want to resort to some sort of database-backed scheduler if I don't have to.
There is an open issue in the celery GitHub repo about this. I don't know if they are working on it, though.
As a workaround, you could add a lock for tasks so that only one instance of a specific PeriodicTask will run at a time.
Something like:
# Inside the task body: only the first queued copy acquires the lock and runs.
if not cache.add('My-unique-lock-name', True, timeout=lock_timeout):
    return
Figuring out the lock timeout is, well, tricky. We're using 0.9 * the task's run_every seconds, since different celerybeats may try to run the tasks at slightly different times. The 0.9 is just to leave some margin (e.g. when celery falls a little behind schedule once and then catches back up, which would otherwise leave the lock still active).
Then you can run a celerybeat instance on all machines. Each task will be queued once per celerybeat instance, but only one of them will actually finish its run.
Tasks will still respect run_every this way; worst-case scenario, tasks will run at 0.9 * run_every intervals.
One issue with this approach: if tasks were queued but not processed at the scheduled time (for example because the queue processors were unavailable), the lock may be placed at the wrong time, possibly causing the next task to simply not run. To work around this, you would need some kind of detection mechanism to check whether a task is more or less on time.
Still, this shouldn't be a common situation in production.
Another solution is to subclass the celerybeat Scheduler and override its tick method, adding a lock on every tick before processing tasks. This makes sure that celerybeats with the same periodic tasks won't queue the same tasks multiple times: only one celerybeat per tick (the one that wins the race) will queue tasks. If one celerybeat goes down, another one will win the race on the next tick.
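A rough sketch of that idea, assuming a Django cache shared by all celerybeat machines; the lock key and lease length are made up, and the exact tick signature may differ between celery versions:

import os
import socket
from celery.beat import Scheduler
from django.core.cache import cache

LOCK_KEY = 'celerybeat-leader'   # hypothetical lock key
LEASE = 60                       # hypothetical lease, in seconds

class LockedScheduler(Scheduler):
    # Only the "leader" beat queues tasks; the rest idle until its lease lapses.
    owner = f'{socket.gethostname()}:{os.getpid()}'

    def tick(self, *args, **kwargs):
        cache.add(LOCK_KEY, self.owner, timeout=LEASE)      # try to become leader
        if cache.get(LOCK_KEY) == self.owner:
            cache.set(LOCK_KEY, self.owner, timeout=LEASE)  # renew the lease
            return super().tick(*args, **kwargs)
        return 5                 # not the leader: check again in 5 seconds

You would then point each celerybeat at this class with the --scheduler option.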
This of course can be used in combination with the first solution.
Of course, for this to work the cache backend needs to be replicated and/or shared across all of the servers.
It's an old question, but I hope it helps someone.

Using beanstalkd for periodic tasks, how do I always have a job replaced by its latest one?

I am trying to use beanstalk for queuing a large number of periodic tasks (for example, tasks that need to be processed every N minutes). For each task, if the last queued job has not been completed (not reserved, I mean) when the current job is about to be added, the last queued job should be replaced with the current job; in other words, only the latest queued job of a task should be processed.
How can I achieve that using beanstalk?
The idea I have right now is: for each task, use memcached to store its latest timestamp (set this when adding jobs to the queue). Every time the worker successfully reserves a job, it first checks the timestamp for this task in memcached; if the job's timestamp is the same as the one in memcached, the job is processed, otherwise it is skipped and deleted from the queue.
So is there a better way to do this? Please give your suggestions, thanks.
I also found a memcache/beanstalk combination to be the best solution for an implementation where I didn't want a newer but identical job entering the queue.
Until 'named jobs' are done and the software released, that may be one of the better solutions.
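As an illustration only, here is a rough sketch of that memcache/beanstalk check using the beanstalkc and python-memcached clients; the key naming, payload format, and process() handler are my own assumptions:

import json
import time
import beanstalkc
import memcache

beanstalk = beanstalkc.Connection(host='localhost', port=11300)
mc = memcache.Client(['127.0.0.1:11211'])

def enqueue(task_name, payload):
    ts = time.time()
    mc.set('latest:' + task_name, ts)   # remember the newest job's timestamp
    beanstalk.put(json.dumps({'task': task_name, 'ts': ts, 'payload': payload}))

def work_forever():
    while True:
        job = beanstalk.reserve()
        data = json.loads(job.body)
        latest = mc.get('latest:' + data['task'])
        if latest is not None and data['ts'] < latest:
            job.delete()                # an older, superseded job: drop it
            continue
        process(data['payload'])        # hypothetical actual handler
        job.delete()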