Dispatching priority of waiting jobs - scheduler

I am investigating scheduling issues on gridengine (6.2u5) and I am puzzled by the scheduling priority of waiting tasks. When I run qstat I can see in the second column (prior) the priority of running jobs (they all have positive values there) but all the waiting jobs have zeroes in that field so I am unable to check which waiting job actually has the higher priority for the next scheduling. Does it depend on configuration or is it expected to have zeroes for jobs in "qw" state? Is there any way to show computed priority values for jobs which did not get dispatched yet?

Jobs will typically only have a priority of 0.0000 if they are not being considered for scheduling.
Situations where this may arise are: jobs in hqw state, or array tasks where the currently running jobs matches the tc value.

Related

Round-robin scheduling algorithm

I'm studying operating system on this book and my prof's slides. I'm arrived at the "Process scheduling algorithms" chapter. Talking about the RoundRobin (RR) algorithm i found some inconsistencies. I understand that is a preemptive version of the FCFS algorithm with a time-slice (quatum).
From now on, I will use the following notation:
#1 = prof's version
#2 = book's version
#3 = other version
Here's the inconsistency (suppose a quantum of 100ms):
#1 The RR uses two queue (Q1, Q2):
Q1: queue for processes that did not end their quantum;
Q2: queue for processes that did end their quantum;
The scheduler takes the process from the head of Q1;
If the process ends before the quantum expires, the process release the CPU on purpose and the scheduler takes the next process from Q1
If the process doesn't end before the quantum expires, is preempted and placed at the end of Q2;
When a process is ready it's placed at the end of Q1;
When Q1 is empty, Q1 and Q2 are swapped;
So when a process is blocked for an I/O request (e.g. after 30ms) and its quantum is not expired yet, is placed at the end of Q1 (I guess) and when it will be scheduled again it will use the CPU for his remaining time (70ms in this case).
#2 (The book did not talk about multiple queues, so I assume it will use just one queue)
The scheduler takes the process from the head of the ready queue;
If the process ends before the quantum expires, the process release the CPU on purpose and the scheduler takes the next process from the ready queue
If the process doesn't end before the quantum expires, is preempted and placed at the end of the ready queue;
#3 Source
The scheduler takes the first process in the ready queue;
If the process ends before the quantum expires, the process release the CPU on purpose and the scheduler takes the next process from the ready queue
If the process doesn't end before the quantum expires, is preempted and placed at the end of the ready queue;
If the process is blocked by a I/O request, it's placed in a waiting queue and when it will became ready will be placed again in the ready queue;
To me, these are 3 different implementations of the RR scheduling algorithm. I think that the most valuable is the #3 because the #1 can cause a starvation (if the process is placed in Q2 and new processes keep coming in Q1, then the process will never be scheduled again) and #2 will waste CPU time when a process is blocked for an I/O request. So, my question is: which one is the right one?
Round robin in theory
Round Robin scheduling can be quite good visualized when thinking about an analog clock: The hand is turning around at constant speed, so it's in the slice of a single digit for 1/12 of the time it takes for one complete run.
A single digit thus has some slice of the total available amount of resource. And, most importantly, there's a fixed order in which the digits get served: After the hand just passed some digit, it'll only get visited again after the hand passed all the other digits.
Looking at the variants you presented, number #2 the version from the book matches this exactly: After a task has been served it is put at the end of a (often so called) ready queue, and thus only gets served again after all of the other tasks have been served once.
Round robin is, as a theoretical scheduling algorithm, only considering the scheduling of multiple consumers (tasks) to a single resource (CPU).
Variants
Some common variations of the basic round robin scheduling are to either use different slice sizes for different tasks, or to dynamically adjust the slice of a task based on some metric, or even to provide more than a single slice to some tasks.
In practice
When you are scheduling tasks, you have to schedule them to more than the CPU as single resource, there will be other resources that need to be managed, like IO devices.
Very simple schedulers just ignore that fact, and leave tasks that are currently waiting for some other resource in the task queue for the CPU.
So when such a task gets its time slice, all it'll do is find that it still needs to wait for that other resource and hands back the CPU, just to get put back into the task queue by the scheduler. Starting the task, checking that the task still needs to wait for the other resource, and stopping the task takes some time that could better be spend for a task that actually can use the CPU.
To solve this, one usually has a task queue for each resource that is managed, i.e. one for the CPU, one for each IO device, etc. When a task is doing a blocking IO call, it is then removed from the queue for the CPU and put into the queue of the device it is accessing. This way, tasks that are waiting for a resource other than the CPU don't sit in the CPU task queue (and thus waste no time getting started and immediately stopped again).
This is what #3 is talking about (when you look again you see that they're talking about "waiting queues")
Another kind of "waiting queue"
Often there's also a "waiting queue" when you're talking about multi level schedulers: In that case, the ready queue is the queue of tasks that are ready and get scheduled by the primary scheduler (using round robin, for example). If a task blocks because of an IO operation, it is - as described above - put into the queue of the corresponding resource. When that resource gets available again, the task is first put into the waiting queue, from which a secondary scheduler (which is just a task for the primary scheduler) eventually takes it and puts it into the ready queue of the primary scheduler.
What your prof is about
The version #1 is probably better to understand if you rename the queues into something like "queue with tasks that did not yet run in this turn" and "queue with tasks that did already run in this turn". This is basically just a "workaround" if you don't want to have circular lists. So it's round robin, too, but a bit obscured.
[..] can cause a starvation (if the process is placed in Q2 and new processes keep coming in Q1, then the process will never be scheduled again) [..]
This is a very good observation. This could be solved if new, ready tasks get inserted at the end of Q2 instead of Q1. If suitable, this is a very good start for a discussion when your prof is asking for questions.
The first implementation can not only cause starvation, it can also cause deadlocks. If only we modify the step 5 of first implementation, the method can be made just fine.
The second approach is round-robin in its purest form.
The third approach is not round-robin but is in fact a smarter version which understands that I/O bound process should not be given another chance too soon as it will probably not be ready yet.
If you will continue reading that book you will read the next scheduling algorithm called Multi-level Queue. That is actually the best of all implementations of different variations of Round-robin. In Multi-level queue we put all incoming processes in one queue and then depending upon whether they finished in their first time slice or not, put them in another queue. This new queue has higher time slice and ends up holding CPU bound processes.
Using Multi-level queues (like 4-5 of them) the CPU combs out all the incoming processes into various classes and then picks optimum number of processes from each queue so that it is neither over-subscribed nor under-subscribed.
Any process which arrives in the system is queue at the end of the ready queue. A process which is at the head of the ready queue is selected and allowed to execute on the CPU for a time quantum q.At the expiry of q, the process is queued at the tail of the ready queue.The next process scheduled is the one at the head of ready queue.This is know Round Robin Scheduling.
OR
In round robin scheduling, processes are dispatched FIFO but are given a limited amount of CPU time called a time-slice or a quantum.If a process does not complete before its CPU times expires,the CPU is preempted and given to the next waiting process,the preempted process is than placed at the back of the ready queue.

Switching from high priority task to low priority task in uCOS II

I'm new to RTOS (uCOS II) and learning it by reading the book written by uCOS author. I have a doubt and I'm unable to find the answer to it.
In uCOS the task with highest priority is given CPU as per the scheduling algorithm. So, if I create write a uCOS application by creating two tasks One with High priority ( Prio = 1 for ex) and the other with low priority ( for ex Prio = 9).
If for example the highest priority task is waiting for an event, then the scheduler should start executing the next higher priority task ? If thats correct then what part of the code switches High priority with low priority ?
The three arch dependent codes are :
1. Interrupt level context switch
2. Start highest priority task ready to run
3. Task level context switch
In case 1 after serving the interrupt the scheduler returns to the highest priority task. In case 2, its called when we start the OS by OSStart()
In case 3, When ever a higher priority task is made ready and its called by timer interrupt
Now, where exactly or how exactly will the scheduler assigns CPU to a lower priority task given the high priority task is in wait ??
Thanks
Another way to consider your question is to ask yourself how did the high priority task get into the waiting state. The answer to both questions is that the high priority task calls an RTOS routine such as GetEvent(). (I don't know whether that is a real uCOS-II routine -- I'm just generalizing.). The RTOS routine puts the high priority task into the waiting state (i.e. blocked) and then the RTOS scheduler finds the next highest priority task that is ready to run and switches to that task's context. The RTOS will have several blocking functions that allow for a task context switch. For example when you read from a queue or mailbox or when you wait for a semaphore or mutex.
The scheduler runs whenever a scheduling event occurs. In your example, that occurs when the high priority task calls the event wait. In general OS calls that may block or yield cause the scheduler to run. The scheduler also runs on exit from ISRs including the IS timer ISR.
In general, when the scheduler performs a context switch, it copies the current processor core registers to the task's control block, and copies the stored register values for the task being switched to into the processor registers, with the stack pointer and program-counter copies last. The change to the program-counter causes execution to continue in the new task with the task's own stack, in the state it was when it last blocked or was preemted. Preemption can occur when a scheduling event occurs in an ISR that causes a higher priority task to become ready.
The thing about uC/OS-II is that it is described in intricate detail in Jean Labrosse's book. The general principles of RTOS with examples using uC/OS-II are described in
this online course by Jack Ganssle.
Interrupt level context switch is used for preemptive, for example, you have an low priority task running, and high priority need to run (OSTimeDly timeout, for example), in this situation, Interrupt level context switch will pause low priority task, then switch to high priority one.
For high to low priority switch, it need high one give up CPU resource by calling OS_Sched

Types of Scheduling algorithms

I understand that CPU scheduling algorithms are classified into
Interactive - Round Robin, Priority scheduling
Batch Scheduling - FCFS,SJF
But I cant understand the reason behind the naming Interactive and Batch Scheduling..??
Why are algorithms like RR called interactive and those like FCFS called batch scheduling??
Thanks in advance...
The idea of Batch Scheduling is that there will be no change in the schedule during runtime: a process is scheduled to do an operation on data, and it runs until the process is finished. In 'interactive' scheduling, a new process could be launched while another process is running, and so time would be allocated for that process as well as the other. In batch scheduling the schedule is determined at the beginning of the operation.
Example of priority (interactive) scheduling:
Process A has a high priority, and process B has a low priority. Process A runs until it requires some input from the user. While A is waiting, the CPU gives some time to process B. Once the input for A has been gathered, process B is swapped out and process A is given the CPU, due to its higher priority.
Example of batch (FCFS) scheduling:
Process A and process B are processes to be scheduled. Process A is given to the CPU first, so B will not receive any time until A finishes running. Even if A pauses for user input, B will not run (and the CPU time while waiting for input is effectively wasted).
Of course, as with everything this low-level, it's not entirely that simple: to gain the illusion of multi-tasking, time is generally divided up between processes even when nothing is waiting for I/O. In priority scheduling, this may mean that more time slices are given to A than B while both are running so that A executes quicker. Both interactive and batch scheduling have their pros and cons: while interactive scheduling gives a quicker response time to the user and divides time up more 'fairly', an overhead is incurred due to how long a 'context switch' takes, which is the time taken for the processor to switch from working on process A to process B.
Interactive scheduling policies assign a time-slice to each process. Once the time-slice is over, the process is swapped even if not yet terminated. It can also be said that scheduling of this kind are preemptive.
Batch scheduling policies, instead, are non-preemptive. Once a Process is in the Running-status, it will not change status until it terminates.

Scheduling policies in Linux Kernel

Can there be more than two scheduling policies working at the same time in Linux Kernel ?
Can FIFO and Round Robin be working on the same machine ?
Yes, Linux supports no less then 4 different scheduling methods for tasks: SCHED_BATCH, SCHED_FAIR, SCHED_FIFO and SCHED_RR.
Regardless of scheduling method, all tasks also have a fixed hard priority (which is 0 for batch and fair and from 1- 99 for the RT schedulign methods of FIFO and RR). Tasks are first and foremost picked by priority - the highest priority wins.
However, with several tasks available for running with the same priority, that is where the scheduling method kicks in: A fair task will only run for its allotted weighted (with the weight coming from a soft priority called the task nice level) share of the CPU time with regard to other fair tasks, a FIFO task will run for a fixed time slice before yielding to another task (of the same priority - higher priority tasks always wins) and RR tasks will run till it blocks disregarding other tasks with the same priority.
Please note what I wrote above is accurate but not complete, because it does not take into account advance CPU reservation features, but it give the details about different scheduling method interact with each other.
yes !! now a days we have different scheduling policies at different stages in OS .. Round robin is done generally before getting the core execution ... fifo is done, at start stage of new coming process ... !!!

Task Schedulers

Had an interesting discussion with some colleagues about the best scheduling strategies for realtime tasks, but not everyone had a good understanding of the common or useful scheduling strategies.
For your answer, please choose one strategy and go over it in some detail, rather than giving a little info on several strategies. If you have something to add to someone else's description and it's short, add a comment rather than a new answer (if it's long or useful, or simply a much better description, then please use an answer)
What is the strategy - describe the general case (assume people know what a task queue is, semaphores, locks, and other OS fundamentals outside the scheduler itself)
What is this strategy optimized for (task latency, efficiency, realtime, jitter, resource sharing, etc)
Is it realtime, or can it be made realtime
Current strategies:
Priority Based Preemptive
Lowest power slowest clock
-Adam
As described in a paper titled Real-Time Task Scheduling for Energy-Aware Embedded Systems, Swaminathan and Chakrabarty describe the challenges of real-time task scheduling in low-power (embedded) devices with multiple processor speeds and power consumption profiles available. The scheduling algorithm they outline (and is shown to be only about 1% worse than an optimal solution in tests) has an interesting way of scheduling tasks they call the LEDF Heuristic.
From the paper:
The low-energy earliest deadline first
heuristic, or simply LEDF, is an
extension of the well-known earliest
deadline first (EDF) algorithm. The
operation of LEDF is as follows: LEDF
maintains a list of all released
tasks, called the “ready list”. When
tasks are released, the task with the
nearest deadline is chosen to be
executed. A check is performed to see
if the task deadline can be met by
executing it at the lower voltage
(speed). If the deadline can be met,
LEDF assigns the lower voltage to the
task and the task begins execution.
During the task’s execution, other
tasks may enter the system. These
tasks are assumed to be placed
automatically on the “ready list”.
LEDF again selects the task with the
nearest deadline to be executed. As
long as there are tasks waiting to be
executed, LEDF does not keep the pro-
cessor idle. This process is repeated
until all the tasks have been
scheduled.
And in pseudo-code:
Repeat forever {
if tasks are waiting to be scheduled {
Sort deadlines in ascending order
Schedule task with earliest deadline
Check if deadline can be met at lower speed (voltage)
If deadline can be met,
schedule task to execute at lower voltage (speed)
If deadline cannot be met,
check if deadline can be met at higher speed (voltage)
If deadline can be met,
schedule task to execute at higher voltage (speed)
If deadline cannot be met,
task cannot be scheduled: run the exception handler!
}
}
It seems that real-time scheduling is an interesting and evolving problem as small, low-power devices become more ubiquitous. I think this is an area in which we'll see plenty of further research and I look forward to keeping abreast!
One common real-time scheduling scheme is to use priority-based preemptive multitasking.
Each tasks is assigned a different priority level.
The highest priority task on the ready queue will be the task that runs. It will run until it either gives up the CPU (i.e. delays, waits on a semaphore, etc...) or a higher priority task becomes ready to run.
The advantage of this scheme is that the system designer has full control over what tasks will run at what priority. The scheduling algorithm is also simple and should be deterministic.
On the other hand, low priority tasks might be starved for CPU. This would indicate a design problem.