I've heard the phrase 'priority inversion' in reference to development of operating systems.
What exactly is priority inversion?
What is the problem it's meant to solve, and how does it solve it?
Imagine three (3) tasks of different priority: tLow, tMed and tHigh. tLow and tHigh access the same critical resource at different times; tMed does its own thing.
tLow is running, tMed and tHigh are presently blocked (but not in critical section).
tLow then enters the critical section.
tHigh unblocks and since it is the highest priority task in the system, it runs.
tHigh then attempts to access the critical resource but blocks, as tLow is inside the critical section.
tMed unblocks and since it is now the highest priority task in the system, it runs.
tHigh cannot run until tLow gives up the resource. tLow cannot run until tMed blocks or ends. The priority of the tasks has been inverted: tHigh, though it has the highest priority, is now at the bottom of the execution chain.
To "solve" priority inversion, the priority of tLow must be bumped up to be at least as high as tHigh. Some may bump its priority to the highest possible priority level. Just as important as bumping up the priority level of tLow, is dropping the priority level of tLow at the appropriate time(s). Different systems will take different approaches.
When to drop the priority of tLow ...
1. No other tasks are blocked on any of the resources that tLow holds. This may be due to timeouts or the releasing of resources.
2. No other tasks contributing to raising the priority level of tLow are blocked on the resources that tLow holds. This may be due to timeouts or the releasing of resources.
3. When there is a change in which tasks are waiting for the resource(s), drop the priority of tLow to match the priority of the highest-priority task blocked on its resource(s).
Method #2 is an improvement over method #1 in that it shortens the length of time that tLow has had its priority level bumped. Note that its priority level stays bumped at tHigh's priority level during this period.
Method #3 allows the priority level of tLow to step down in increments if necessary instead of in one all-or-nothing step.
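To make methods #2 and #3 concrete, here is a rough, illustrative C sketch of the bookkeeping involved. All types and helper names (task_t, res_t, on_block, on_release) are invented for this example and are not any real kernel's API; maintenance of the waiter and held lists is left out.

```c
#include <stddef.h>

#define MAX_WAITERS 8
#define MAX_HELD    4

typedef struct task task_t;

typedef struct res {                  /* a shared resource / critical section */
    task_t *owner;
    task_t *waiters[MAX_WAITERS];     /* tasks blocked on this resource       */
    int     n_waiters;
} res_t;

struct task {
    int    base_prio;                 /* priority the task was created with   */
    int    eff_prio;                  /* current, possibly bumped, priority   */
    res_t *held[MAX_HELD];            /* resources this task currently owns   */
    int    n_held;
};

static int highest_waiter_prio(const res_t *r)
{
    int best = -1;                    /* -1 means nobody is waiting           */
    for (int i = 0; i < r->n_waiters; i++)
        if (r->waiters[i]->eff_prio > best)
            best = r->waiters[i]->eff_prio;
    return best;
}

/* A task fails to take 'r' and blocks: bump the owner if needed (method #2). */
void on_block(res_t *r, task_t *blocked)
{
    r->waiters[r->n_waiters++] = blocked;
    if (blocked->eff_prio > r->owner->eff_prio)
        r->owner->eff_prio = blocked->eff_prio;
}

/* The owner released one of its resources: step its priority back down to the
 * highest-priority task still blocked on anything it holds, or to its base
 * priority if nobody is left waiting (method #3). */
void on_release(task_t *t)
{
    int p = t->base_prio;
    for (int i = 0; i < t->n_held; i++) {
        int w = highest_waiter_prio(t->held[i]);
        if (w > p)
            p = w;
    }
    t->eff_prio = p;
}
```

The essential idea is that the bump happens at the moment a higher-priority task blocks, and the drop recomputes the priority from whoever is still waiting rather than jumping straight back to the base priority.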
Different systems will implement different methods depending upon what factors they consider important:
memory footprint
complexity
real time responsiveness
developer knowledge
Hope this helps.
Priority inversion is a problem, not a solution. The typical example is a low priority process acquiring a resource that a high priority process needs, and then being preempted by a medium priority process, so the high priority process is blocked on the resource while the medium priority one finishes (effectively being executed with a lower priority).
A rather famous example was the problem experienced by the Mars Pathfinder mission: http://www.cs.duke.edu/~carla/mars.html. It's a pretty interesting read.
Suppose an application has three threads:
Thread 1 has high priority.
Thread 2 has medium priority.
Thread 3 has low priority.
Let's assume that Thread 1 and Thread 3 share the same critical-section code.
Thread 1 and thread 2 are sleeping or blocked at the beginning of the example. Thread 3 runs and enters a critical section.
At that moment, thread 2 starts running, preempting thread 3 because thread 2 has a higher priority. So thread 3 still owns the critical section.
Later, thread 1 starts running, preempting thread 2. Thread 1 tries to enter the critical section that thread 3 owns, but because it is owned by another thread, thread 1 blocks, waiting for the critical section.
At that point, thread 2 resumes running because it has a higher priority than thread 3 and thread 1 is not running. Thread 3 cannot release the critical section that thread 1 is waiting for as long as thread 2 continues to run.
Therefore, the highest-priority thread in the system, thread 1, becomes blocked waiting for lower-priority threads to run.
It is the problem rather than the solution.
It describes the situation in which a low-priority thread holds a lock that a high-priority thread needs, so the high-priority thread has to wait for it to finish (which might take especially long, since it is low priority). The inversion is that the high-priority thread cannot continue until the low-priority thread does, so in effect it now has low priority too.
A common solution is to have the low-priority threads temporarily inherit the high priority of everyone who is waiting on locks they hold.
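On POSIX systems this is typically requested per mutex. A minimal sketch, assuming the platform supports the priority-inheritance protocol (_POSIX_THREAD_PRIO_INHERIT); error handling is omitted:

```c
#include <pthread.h>

pthread_mutex_t shared_lock;                 /* protects the shared resource */

void init_inheriting_mutex(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    /* A low-priority holder temporarily inherits the priority of the
     * highest-priority thread blocked on this mutex. */
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    pthread_mutex_init(&shared_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}
```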
[ Assume, Low process = LP, Medium Process = MP, High process = HP ]
LP is executing a critical section. While entering the critical section, LP must have acquired a lock on some object, say OBJ.
LP is now inside the critical section.
Meanwhile, HP is created. Because of its higher priority, the CPU does a context switch, and HP is now executing (not the same critical section, but some other code). At some point during HP's execution, it needs a lock on the same OBJ (which may or may not be in the same critical section), but the lock on OBJ is still held by LP, since LP was preempted while executing its critical section. LP cannot release it now because the process is in the READY state, not RUNNING. So HP is moved to the BLOCKED / WAITING state.
Now MP comes in and executes its own code. MP does not need a lock on OBJ, so it keeps executing normally. HP waits for LP to release the lock, and LP waits for MP to finish executing so that LP can come back to the RUNNING state (and execute and release the lock). Only after LP has released the lock can HP come back to READY (and then go to RUNNING by preempting the lower-priority tasks).
So, effectively, until MP finishes, LP cannot execute and hence HP cannot execute. It seems as if HP is waiting for MP, even though they are not directly related through any OBJ locks. That is priority inversion.
A solution to Priority Inversion is Priority Inheritance: increase the priority of a process (A) to the maximum priority of any other process waiting for any resource on which A has a resource lock.
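That rule is easy to express as a tiny helper; the function and parameter names here are purely illustrative:

```c
/* Effective priority of process A: the maximum of its own priority and the
 * priority of every process waiting on any resource A currently holds
 * (assuming larger numbers mean higher priority). */
int effective_priority(int own_prio, const int waiter_prios[], int n_waiters)
{
    int p = own_prio;
    for (int i = 0; i < n_waiters; i++)
        if (waiter_prios[i] > p)
            p = waiter_prios[i];
    return p;
}
```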
Let me make it very simple and clear. (This answer is based on the answers above but presented in a crisper way.)
Say there is a resource R and 3 processes L, M, and H, where p(L) < p(M) < p(H) (p(X) being the priority of X).
Say
L starts executing first and gets hold of R (exclusive access to R).
H comes later and also wants exclusive access to R, and since L is holding it, H has to wait.
M comes after H and doesn't need R. Since M has everything it needs to execute, it preempts L, because it has higher priority than L. H cannot do the same, because it needs the resource that is still locked by L.
Now, to make the problem clearer: ideally M should have to wait for H to complete, since p(H) > p(M), but that is not what happens, and that itself is the problem. If many processes such as M keep coming along and never allow L to execute and release the lock, H will never execute, which can be hazardous in time-critical applications.
And for solutions, refer to the answers above :)
Priority inversion is where a lower-priority process gets hold of a resource that a higher-priority process needs, preventing the higher-priority process from proceeding until the resource is freed.
e.g.:
FileA needs to be accessed by Proc1 and Proc2.
Proc1 has a higher priority than Proc2, but Proc2 manages to open FileA first.
Normally Proc1 would run maybe 10 times as often as Proc2, but won't be able to do anything because Proc2 is holding the file.
So what ends up happening is that Proc1 blocks until Proc2 finishes with FileA, essentially their priorities are 'inverted' while Proc2 holds FileA's handle.
As far as 'Solving a problem' goes, priority inversion is a problem in itself if it keeps happening.
The worst case (though most operating systems won't let this happen) is if Proc2 weren't allowed to run until Proc1 had finished. This would cause the system to lock up, as Proc1 would keep getting assigned CPU time and Proc2 would never get any, so the file would never be released.
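A hedged sketch of the FileA scenario on a POSIX system; the path, the use of flock(), and the surrounding logic are all illustrative, not taken from any particular program. Whichever process gets the exclusive lock first makes the other one block inside flock(), regardless of their scheduling priorities:

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Called by both Proc1 and Proc2. */
void use_file_exclusively(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return;

    flock(fd, LOCK_EX);       /* blocks here if the other process holds the lock */
    /* ... read/modify FileA ... */
    flock(fd, LOCK_UN);
    close(fd);
}
```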
Priority inversion occurs as such:
Given processes H, M and L where the names stand for high, medium and low priorities,
only H and L share a common resource.
Say, L acquires the resource first and starts running. Since H also needs that resource, it enters the waiting queue.
M doesn't share the resource and can start to run, hence it does. When L is interrupted for any reason, M takes the running state, since it has a higher priority and it is runnable at the instant the interrupt happens.
Although H has a higher priority than M, since it is in the waiting queue it cannot run, which in effect gives it a lower priority than even M.
After M finishes, L will again take over the CPU, having caused H to wait the whole time.
Priority Inversion can be avoided if the blocked high priority thread transfers its high priority to the low priority thread that is holding onto the resource.
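POSIX also offers a related but different protocol, the priority ceiling (PTHREAD_PRIO_PROTECT): instead of inheriting on demand, any thread holding the mutex immediately runs at the mutex's ceiling priority, so a medium-priority thread cannot preempt it inside the critical section. A minimal sketch, with an illustrative ceiling value and no error handling:

```c
#include <pthread.h>

pthread_mutex_t resource_lock;

void init_ceiling_mutex(int ceiling)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_PROTECT);
    pthread_mutexattr_setprioceiling(&attr, ceiling);   /* e.g. the high task's priority */
    pthread_mutex_init(&resource_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}
```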
A scheduling challenge arises when a higher-priority process needs to read or modify kernel data that are currently being accessed by a lower-priority process—or a chain of lower-priority processes. Since kernel data are typically protected with a lock, the higher-priority process will have to wait for a lower-priority one to finish with the resource. The situation becomes more complicated if the lower-priority process is preempted in favor of another process with a higher priority. As an example, assume we have three processes—L, M, and H—whose priorities follow the order L < M < H. Assume that process H requires resource R, which is currently being accessed by process L. Ordinarily, process H would wait for L to finish using resource R. However, now suppose that process M becomes runnable, thereby preempting process L. Indirectly, a process with a lower priority—process M—has affected how long process H must wait for L to relinquish resource R.
This problem is known as priority inversion. It occurs only in systems with more than two priorities, so one solution is to have only two priorities. That is insufficient for most general-purpose operating systems, however. Typically these systems solve the problem by implementing a priority-inheritance protocol. According to this protocol, all processes that are accessing resources needed by a higher-priority process inherit the higher priority until they are finished with the resources in question. When they are finished, their priorities revert to their original values. In the example above, a priority-inheritance protocol would allow process L to temporarily inherit the priority of process H, thereby preventing process M from preempting its execution. When process L had finished using resource R, it would relinquish its inherited priority from H and assume its original priority. Because resource R would now be available, process H—not M—would run next.
Reference: Abraham Silberschatz, Operating System Concepts.
Consider a system with two processes,H with high priority and L with low priority. The scheduling rules are such that H is run whenever it is in ready state because of its high priority. At a certain moment, with L in its critical region, H becomes ready to run (e.g., an I/O operation completes). H now begins busy waiting, but since L is never scheduled while H is running, L never gets the chance to leave the critical section. So H loops forever.
This situation is called priority inversion, because the higher-priority process is effectively waiting on the lower-priority process.
Related
Consider a system that has three processes and three identical resources. Each process needs a maximum of two resources. Is deadlock possible in this system?
It is my understanding that deadlock is possible if the four conditions hold simultaneously:
Mutual exclusion, hold and wait, no pre-emption, and circular wait.
If each process is allocated one resource, then all three resources will be held. There is no fourth resource available.
How do I go about proving that deadlock is not possible, and how would I calculate how many resources should be available to make the system deadlock free?
In such cases the trick is to evaluate the circular-wait condition and see whether it holds. 3 processes and 3 identical resources: let us give 1 to each of them. 0 resources are left, yet no process's requirement is satisfied (as each needs 2), which means that every process is waiting for some other process to release resources. The circular-wait condition is satisfied; therefore the given scenario can lead to deadlock.
Suppose we have n processes and m identical resources, with maximum demands d1, d2, ..., dn.
If m > (d1 - 1) + (d2 - 1) + ... + (dn - 1), then the system is deadlock-free;
otherwise it can lead to deadlock.
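A quick sketch of that inequality in C (function and variable names are just for illustration); it checks both the 3-resource case from the question and the 4-resource case discussed below:

```c
#include <stdio.h>

/* With identical resources: deadlock-free if m > sum of (d_i - 1), which is
 * the same as (sum of max needs) < m + n. */
int is_deadlock_free(int m, const int demand[], int n)
{
    int worst = 0;                   /* every process holds (d_i - 1) resources */
    for (int i = 0; i < n; i++)
        worst += demand[i] - 1;
    return m > worst;                /* one spare resource lets someone finish  */
}

int main(void)
{
    int d[] = { 2, 2, 2 };           /* three processes, each needs at most 2   */
    printf("m=3: %s\n", is_deadlock_free(3, d, 3) ? "deadlock-free" : "deadlock possible");
    printf("m=4: %s\n", is_deadlock_free(4, d, 3) ? "deadlock-free" : "deadlock possible");
    return 0;
}
```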
Consider a system with m resources of the same type being shared by n processes. Resources can be requested and released by processes only one at a time. The system is deadlock-free if and only if the sum of all maximum needs is < m + n.
For example:
A system has 3 processes sharing 4 resources. If each process needs a maximum of 2 units, then:
To make a system deadlock-free, assign each process one less than its max need. After doing so, if we are left with one or more resources, then there is no deadlock.
Assign 1 resource (max need -1) to each process.
allocated resources = 1 + 1 + 1 = 3
we are still left with 1 resource to avoid deadlock.
so deadlock can never occur.
In priority-based scheduling, I came across the problem of priority inversion, in which a higher-priority process is forced to wait for a lower-priority task.
One possible scenario: consider three processes L, M, H with priority order L < M < H.
L is running in CS ; H also needs to run in CS ; H waits for L to come out of CS ; M interrupts L and starts running ; M runs till completion and relinquishes control ; L resumes and starts running till the end of CS ; H enters CS and starts running.
Here, my question is regarding the statement "M interrupts L and starts running", i.e., can a process executing in a critical section be interrupted or preempted?
It depends on how the critical section is implemented.
In operating system code you will frequently find critical sections implemented where interrupts are blocked. In this kind of implementation, a process will always execute the entire critical section without interruption.
In user code that uses critical sections implemented through system services, the process invariably can be interrupted. If that were not the case, a process could take over the system by putting all its code in a critical section.
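The two cases look roughly like this in C. The interrupt-masking pair is a hypothetical placeholder for whatever the kernel or RTOS port actually provides; the user-space version uses a plain POSIX mutex and can be preempted at any point while holding it:

```c
#include <pthread.h>

/* Kernel-style: nothing on this CPU preempts the section while interrupts are
 * masked. irq_save()/irq_restore() are hypothetical stand-ins. */
extern unsigned long irq_save(void);
extern void irq_restore(unsigned long flags);

void kernel_style_critical_section(void)
{
    unsigned long flags = irq_save();
    /* ... touch shared kernel data; keep this short ... */
    irq_restore(flags);
}

/* User-style: the mutex only guarantees mutual exclusion, not non-preemption;
 * a higher-priority thread may run at any time while the lock is held. */
pthread_mutex_t cs_lock = PTHREAD_MUTEX_INITIALIZER;

void user_style_critical_section(void)
{
    pthread_mutex_lock(&cs_lock);
    /* ... touch shared user data ... */
    pthread_mutex_unlock(&cs_lock);
}
```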
You are describing one of the reasons process priorities should be consistent. Unless you are doing real time processing or background batch processing, all processes should generally have the same base priority.
The old DECUS tapes used to be filled with "fair share" applications that would lower the priority of processes with high CPU usage, and that would wreak havoc with system scheduling.
The answer is simply: yes.
If some other process with a higher priority in a preemptive system doesn't need to run in the critical section, i.e. doesn't need to acquire a lock which is held by a lower-priority process, then it can preempt the lower-priority process regardless of what it is executing.
Even if M needs the CS, it will preempt L, run, get blocked, and be switched out so that L can continue execution.
Priority inversion seems to be a case in which TSL cannot be deadlock-free, so how can one say that TSL is deadlock-free?
Let's first look at the definition of deadlock.
Definition: a set of processes is said to be deadlocked if each of them is waiting for an event that can only be caused by another process in the same set. All the processes in the set are in the waiting/blocked state.
Now let's look at priority inversion. Let process A be a lower-priority process running in its critical section, and let process B be a higher-priority process.
When process B becomes ready, it gets scheduled (goes to the running state) and process A is preempted by the scheduler because it has lower priority. Process A goes back to the ready state, but it has already locked the critical section, so process B, now running, busy-waits on the lock held by A. This can lead to unbounded waiting, but it is a spinlock situation, not a deadlock: process B is in the running state and process A is in the ready state, whereas the definition of deadlock requires all the processes involved to be in the waiting/blocked state.
This is why TSL is said to be deadlock-free, but it is not free from spinning (busy waiting), and it can still suffer from priority inversion.
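For reference, a minimal TSL-style spin lock sketched with a GCC/Clang builtin; note the waiter stays in the running state, burning CPU in the loop, which is exactly why this is a spinning (and possibly priority-inversion) problem rather than a textbook deadlock:

```c
static volatile int lock_word = 0;   /* 0 = free, 1 = taken */

void spin_lock(volatile int *l)
{
    /* Atomically set *l to 1 and test its previous value (the TSL idea). */
    while (__sync_lock_test_and_set(l, 1))
        ;                            /* busy wait: the caller never blocks */
}

void spin_unlock(volatile int *l)
{
    __sync_lock_release(l);          /* set *l back to 0 */
}

/* usage: spin_lock(&lock_word); ...critical section... spin_unlock(&lock_word); */
```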
I hope this will be helpful :)
I'm studying operating systems from this book and my prof's slides. I've arrived at the "Process scheduling algorithms" chapter. Talking about the Round Robin (RR) algorithm, I found some inconsistencies. I understand that it is a preemptive version of the FCFS algorithm with a time slice (quantum).
From now on, I will use the following notation:
#1 = prof's version
#2 = book's version
#3 = other version
Here's the inconsistency (suppose a quantum of 100ms):
#1 The RR uses two queues (Q1, Q2):
Q1: queue for processes that did not end their quantum;
Q2: queue for processes that did end their quantum;
The scheduler takes the process from the head of Q1;
If the process ends before the quantum expires, it releases the CPU voluntarily and the scheduler takes the next process from Q1;
If the process doesn't end before the quantum expires, it is preempted and placed at the end of Q2;
When a process is ready it's placed at the end of Q1;
When Q1 is empty, Q1 and Q2 are swapped;
So when a process blocks on an I/O request (e.g. after 30 ms) and its quantum has not expired yet, it is placed at the end of Q1 (I guess), and when it is scheduled again it will use the CPU for its remaining time (70 ms in this case).
#2 (The book did not talk about multiple queues, so I assume it will use just one queue)
The scheduler takes the process from the head of the ready queue;
If the process ends before the quantum expires, it releases the CPU voluntarily and the scheduler takes the next process from the ready queue;
If the process doesn't end before the quantum expires, it is preempted and placed at the end of the ready queue;
#3 Source
The scheduler takes the first process in the ready queue;
If the process ends before the quantum expires, it releases the CPU voluntarily and the scheduler takes the next process from the ready queue;
If the process doesn't end before the quantum expires, it is preempted and placed at the end of the ready queue;
If the process is blocked by an I/O request, it is placed in a waiting queue, and when it becomes ready again it is placed back in the ready queue;
To me, these are 3 different implementations of the RR scheduling algorithm. I think the most sensible is #3, because #1 can cause starvation (if the process is placed in Q2 and new processes keep coming into Q1, then the process will never be scheduled again) and #2 will waste CPU time when a process is blocked on an I/O request. So, my question is: which one is the right one?
Round robin in theory
Round Robin scheduling can be visualized quite well by thinking of an analog clock: the hand turns at constant speed, so it is in the slice of a single digit for 1/12 of the time it takes for one complete revolution.
A single digit thus gets some slice of the total available amount of resource. And, most importantly, there is a fixed order in which the digits get served: after the hand has passed some digit, that digit only gets visited again after the hand has passed all the other digits.
Looking at the variants you presented, #2 (the version from the book) matches this exactly: after a task has been served it is put at the end of an (often so-called) ready queue, and thus only gets served again after all of the other tasks have been served once.
Round robin, as a theoretical scheduling algorithm, only considers the scheduling of multiple consumers (tasks) onto a single resource (the CPU).
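As a toy illustration of that pure single-queue form (variant #2), here is a tiny simulation with made-up workloads and quantum; it is only meant to show the "run one quantum, go to the back of the queue" rule:

```c
#include <stdio.h>

#define NTASKS   3
#define QUANTUM  100                            /* ms */

int main(void)
{
    int remaining[NTASKS] = { 250, 80, 130 };   /* ms of CPU work per task */
    int queue[16], head = 0, tail = 0;

    for (int i = 0; i < NTASKS; i++)            /* everyone starts in the ready queue */
        queue[tail++ % 16] = i;

    while (head != tail) {
        int t = queue[head++ % 16];
        int slice = remaining[t] < QUANTUM ? remaining[t] : QUANTUM;
        remaining[t] -= slice;
        printf("task %d runs for %d ms\n", t, slice);
        if (remaining[t] > 0)
            queue[tail++ % 16] = t;             /* back of the queue */
    }
    return 0;
}
```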
Variants
Some common variations of the basic round robin scheduling are to either use different slice sizes for different tasks, or to dynamically adjust the slice of a task based on some metric, or even to provide more than a single slice to some tasks.
In practice
When you are scheduling tasks, you have to schedule them onto more than just the CPU as a single resource; there will be other resources that need to be managed, like IO devices.
Very simple schedulers just ignore that fact, and leave tasks that are currently waiting for some other resource in the task queue for the CPU.
So when such a task gets its time slice, all it will do is find that it still needs to wait for that other resource and hand back the CPU, just to get put back into the task queue by the scheduler. Starting the task, checking that the task still needs to wait for the other resource, and stopping the task takes time that could better be spent on a task that actually can use the CPU.
To solve this, one usually has a task queue for each resource that is managed, i.e. one for the CPU, one for each IO device, etc. When a task is doing a blocking IO call, it is then removed from the queue for the CPU and put into the queue of the device it is accessing. This way, tasks that are waiting for a resource other than the CPU don't sit in the CPU task queue (and thus waste no time getting started and immediately stopped again).
This is what #3 is talking about (when you look again, you see that it talks about "waiting queues").
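A rough sketch of that bookkeeping, with one ready queue for the CPU and one wait queue per device; all names and structures are illustrative, not any real scheduler's code:

```c
#include <stddef.h>

enum task_state { READY, RUNNING, WAITING };

struct task {
    int              id;
    enum task_state  state;
    struct task     *next;           /* link inside whichever queue it is in */
};

struct queue { struct task *head, *tail; };

struct queue ready_queue;            /* consumers of the CPU                 */
struct queue disk_queue;             /* tasks blocked on the disk            */
struct queue net_queue;              /* tasks blocked on the network device  */

static void enqueue(struct queue *q, struct task *t)
{
    t->next = NULL;
    if (q->tail)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
}

/* On a blocking I/O call: leave the CPU's queue, join the device's queue. */
void block_on(struct task *t, struct queue *device_queue)
{
    t->state = WAITING;
    enqueue(device_queue, t);
}

/* When the device completes: the task becomes runnable again. */
void wake_up(struct task *t)
{
    t->state = READY;
    enqueue(&ready_queue, t);
}
```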
Another kind of "waiting queue"
Often there's also a "waiting queue" when you're talking about multi level schedulers: In that case, the ready queue is the queue of tasks that are ready and get scheduled by the primary scheduler (using round robin, for example). If a task blocks because of an IO operation, it is - as described above - put into the queue of the corresponding resource. When that resource gets available again, the task is first put into the waiting queue, from which a secondary scheduler (which is just a task for the primary scheduler) eventually takes it and puts it into the ready queue of the primary scheduler.
What your prof is about
Version #1 is probably easier to understand if you rename the queues to something like "queue with tasks that did not yet run in this turn" and "queue with tasks that already ran in this turn". This is basically just a workaround if you don't want to have circular lists. So it's round robin too, just a bit obscured.
[..] can cause starvation (if the process is placed in Q2 and new processes keep coming in Q1, then the process will never be scheduled again) [..]
This is a very good observation. It could be solved if new, ready tasks were inserted at the end of Q2 instead of Q1. If suitable, this is a very good starting point for a discussion when your prof asks for questions.
The first implementation can not only cause starvation, it can also cause deadlock. If we just modify step 5 of the first implementation, the method can be made just fine.
The second approach is round-robin in its purest form.
The third approach is not pure round-robin but is in fact a smarter version, which understands that an I/O-bound process should not be given another chance too soon, as it will probably not be ready yet.
If you continue reading that book you will come to the next scheduling algorithm, called the multilevel (feedback) queue. That is actually the best of all these variations of round-robin. In a multilevel queue we put all incoming processes in one queue and then, depending on whether or not they finished within their first time slice, put them in another queue. This new queue has a larger time slice and ends up holding the CPU-bound processes.
Using multilevel queues (say 4-5 of them), the CPU combs all the incoming processes out into various classes and then picks an optimal number of processes from each queue so that it is neither over-subscribed nor under-subscribed.
Any process which arrives in the system is queued at the end of the ready queue. The process at the head of the ready queue is selected and allowed to execute on the CPU for a time quantum q. At the expiry of q, the process is queued at the tail of the ready queue. The next process scheduled is the one at the head of the ready queue. This is known as Round Robin scheduling.
OR
In round robin scheduling, processes are dispatched FIFO but are given a limited amount of CPU time called a time slice or quantum. If a process does not complete before its CPU time expires, the CPU is preempted and given to the next waiting process, and the preempted process is then placed at the back of the ready queue.
I'm new to RTOS (uC/OS-II) and learning it by reading the book written by the uC/OS author. I have a doubt and I'm unable to find the answer to it.
In uC/OS the task with the highest priority is given the CPU, as per the scheduling algorithm. So, say I write a uC/OS application with two tasks: one with high priority (Prio = 1, for example) and the other with low priority (for example Prio = 9).
If, for example, the highest-priority task is waiting for an event, then the scheduler should start executing the next-highest-priority task? If that's correct, then what part of the code switches from the high-priority task to the low-priority one?
The three architecture-dependent pieces of code are:
1. Interrupt level context switch
2. Start highest priority task ready to run
3. Task level context switch
In case 1, after serving the interrupt the scheduler returns to the highest-priority ready task. Case 2 is called when we start the OS with OSStart().
In case 3, whenever a higher-priority task is made ready; it is called, for example, by the timer interrupt.
Now, where exactly, or how exactly, does the scheduler assign the CPU to a lower-priority task, given that the high-priority task is waiting?
Thanks
Another way to consider your question is to ask yourself how did the high priority task get into the waiting state. The answer to both questions is that the high priority task calls an RTOS routine such as GetEvent(). (I don't know whether that is a real uCOS-II routine -- I'm just generalizing.). The RTOS routine puts the high priority task into the waiting state (i.e. blocked) and then the RTOS scheduler finds the next highest priority task that is ready to run and switches to that task's context. The RTOS will have several blocking functions that allow for a task context switch. For example when you read from a queue or mailbox or when you wait for a semaphore or mutex.
The scheduler runs whenever a scheduling event occurs. In your example, that occurs when the high-priority task calls the event wait. In general, OS calls that may block or yield cause the scheduler to run. The scheduler also runs on exit from ISRs, including the OS timer ISR.
In general, when the scheduler performs a context switch, it copies the current processor core registers to the task's control block, and copies the stored register values for the task being switched to into the processor registers, with the stack pointer and program counter copied last. The change to the program counter causes execution to continue in the new task, with that task's own stack, in the state it was in when it last blocked or was preempted. Preemption can occur when a scheduling event occurs in an ISR that causes a higher-priority task to become ready.
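Here is a deliberately simplified pseudo-RTOS sketch of that mechanism (not the real uC/OS-II API; the real path goes through its pend routines and OS_Sched(), as described in Labrosse's book). It shows how a blocking call is exactly the point where the CPU is handed to the next-highest-priority ready task:

```c
#include <stdio.h>

#define NUM_PRIOS 16

struct tcb {
    int ready;                              /* 1 = ready to run             */
    /* saved registers, stack pointer, ... */
};

struct tcb tasks[NUM_PRIOS];                /* index == priority, 0 highest */
int current;                                /* priority of the running task */

static void context_switch(int from, int to)    /* stand-in for the port code */
{
    printf("switch from prio %d to prio %d\n", from, to);
}

/* Core of the scheduler: pick the highest-priority ready task and switch. */
void schedule(void)
{
    for (int p = 0; p < NUM_PRIOS; p++) {
        if (tasks[p].ready) {
            if (p != current) {
                int prev = current;
                current = p;
                context_switch(prev, p);
            }
            return;
        }
    }
}

/* A blocking call: the caller stops being ready, then the scheduler runs.
 * This is the moment a high-priority task hands the CPU to a lower one. */
void wait_for_event(void)
{
    tasks[current].ready = 0;               /* e.g. pend on a semaphore */
    schedule();
}

/* An ISR or another task signalling the event makes the waiter ready again. */
void signal_event(int prio)
{
    tasks[prio].ready = 1;
    schedule();                             /* may preempt if 'prio' is higher */
}
```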
The thing about uC/OS-II is that it is described in intricate detail in Jean Labrosse's book. The general principles of RTOS with examples using uC/OS-II are described in
this online course by Jack Ganssle.
The interrupt-level context switch is used for preemption. For example, you have a low-priority task running and a high-priority task needs to run (an OSTimeDly timeout expires, for example); in this situation the interrupt-level context switch will pause the low-priority task, then switch to the high-priority one.
For a high-to-low-priority switch, the high-priority task needs to give up the CPU, which ends up calling OS_Sched.