I was reading Critical Section Problem from Operating System Concepts by Peter B. Galvin.
According to it
1) Progress is : If no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder section can participate in deciding which will enter its critical section next, and this selection cannot be postponed indefinitely.
And
2) Bounded waiting is : There exists a bound, or limit, on the number of times other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted.
I don't understand what the author is trying to say in either case.
Could you please explain each definition with a concrete example?
Thank You.
First, let me introduce some terminology. A critical section (CS) is a sequence of instructions that can be executed by at most one process at a time. When using critical sections, the code can be broken down into the following sections:
// Some arbitrary code (such as initialization).
EnterCriticalSection(cs);
// The code that constitutes the CS.
// Only one process can be executing this code at the same time.
LeaveCriticalSection(cs);
// Some arbitrary code. This is called the remainder section.
The first section contains arbitrary code, such as initialization. We don't have a name for that section. The second section is the code that tries to enter the CS (the entry section). The third section is the CS itself. The fourth section is the code that leaves the critical section (the leave section). The fifth and last section is called the remainder section and can contain any code. Note that the CS itself can be different between processes (consider, for example, one process that receives requests from a client and inserts them into a queue, and another process that takes requests from that queue and handles them).
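To make the section names concrete, here is a minimal sketch in Python. It assumes threads instead of processes and uses threading.Lock in place of the EnterCriticalSection/LeaveCriticalSection calls from the outline; the names worker, run and counter are made up for illustration.

```python
import threading

lock = threading.Lock()   # plays the role of `cs` in the outline
counter = 0               # shared state, touched only inside the critical section

def worker(iterations):
    global counter
    for _ in range(iterations):
        # Entry section: ask for permission to enter the critical section.
        lock.acquire()
        try:
            # Critical section: at most one thread runs this at a time.
            counter += 1
        finally:
            # Leave section: let one of the waiting threads in.
            lock.release()
        # Remainder section: any code that does not touch the shared state.

def run(num_threads=4, iterations=1000):
    """Start the workers, wait for them, and return the final counter."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(iterations,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Because every increment happens inside the lock, run(4, 1000) always returns 4000, no matter how the threads are scheduled.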
To make sure that an implementation of critical sections works properly, there are three conditions that must be satisfied. You mentioned two of them (which I will explain next). The third is mutual exclusion which is obviously vital. It's worth noting that mutual exclusion applies only to the CS and the leave section. However, the other three sections are not exclusive.
The first condition is progress. The purpose of this condition is to make sure that either some process is currently in the CS and doing some work or, if at least one process wants to enter the CS, one of them will enter and then do some work. In both cases, some work is getting done and therefore the processes are making progress overall.
Progress: If no process is executing in its critical section and
some processes wish to enter their critical sections, then only those
processes that are not executing in their remainder section can
participate in deciding which will enter its critical section next,
and this selection cannot be postponed indefinitely.
Let's understand this definition clause by clause.
If no process is executing in its critical section
If there is a process executing in its critical section (even though not stated explicitly, this includes the leave section as well), then this means that some work is getting done. So we are making progress. Otherwise, if this was not the case...
and some processes wish to enter their critical sections
If no process wants to enter its critical section, then there is no more work to do. Otherwise, if there is at least one process that wishes to enter its critical section...
then only those processes that are not executing in their remainder section
This means we are talking about those processes that are executing in either of the first two sections (remember, no process is executing in its critical section or the leave section)...
can participate in deciding which will enter its critical section next,
Since there is at least one process that wishes to enter its CS, somehow we must choose one of them to enter its CS. But who's going to make this decision? Those processes that have already requested permission to enter their critical sections have the right to participate in making this decision. In addition, those processes that may wish to enter their CSs but have not yet requested permission to do so (this means that they are executing in the first section) also have the right to participate in making this decision.
and this selection cannot be postponed indefinitely.
This states that it will take a limited amount of time to select a process to enter its CS. In particular, no deadlock or livelock will occur. So after this limited amount of time, a process will enter its CS and do some work, thereby making progress.
Now I will explain the last condition, namely bounded waiting. The purpose of this condition is to make sure that every process gets the chance to actually enter its critical section so that no process starves forever. However, please note that neither this condition nor progress guarantees fairness. An implementation of a CS doesn't have to be fair.
Bounded waiting: There exists a bound, or limit, on the number of
times other processes are allowed to enter their critical sections
after a process has made a request to enter its critical section and
before that request is granted.
Let's understand this definition piece by piece, starting from the last part.
after a process has made a request to enter its critical section and
before that request is granted.
In other words, consider a process that has requested to enter its CS but has not yet entered it. Let's call this process P.
There exists a bound, or limit, on the number of
times other processes are allowed to enter their critical sections
While P is waiting to enter its CS, other processes may be waiting as well and some process is executing in its CS. When it leaves its CS, some other process has to be selected to enter the CS which may or may not be P. Suppose a process other than P was selected. This situation might happen again and again. That is, other processes are getting the chance to enter their CSs but never P. Note that progress is being made, but by other processes, not by P. The problem is that P is not getting the chance to do any work. To prevent starvation, there must be a guarantee that P will eventually enter its CS. For this to happen, the number of times other processes enter their CSs must be limited. In this case, P will definitely get the chance to enter its CS.
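The difference between a bounded and an unbounded wait can be seen in a small deterministic model. This is only a sketch under simplifying assumptions: every process requests the CS again immediately after leaving it, and the two selection policies (first-come-first-served and "lowest process number first") are made-up examples, not something taken from the book.

```python
from collections import deque

def bypasses_fifo(num_processes, watched):
    """Serve requests in arrival order; count entries by others before `watched`."""
    queue = deque(range(num_processes))   # every process has already requested entry
    bypasses = 0
    while True:
        pid = queue.popleft()             # the oldest request enters its CS next
        if pid == watched:
            return bypasses
        bypasses += 1
        queue.append(pid)                 # after leaving, it requests again at once

def bypasses_lowest_pid_first(num_processes, watched, rounds):
    """Always admit the lowest-numbered waiting process; give up after `rounds` entries."""
    waiting = set(range(num_processes))
    bypasses = 0
    for _ in range(rounds):
        pid = min(waiting)                # unfair choice: ignores how long anyone waited
        if pid == watched:
            return bypasses
        bypasses += 1                     # pid re-requests right away, so it stays waiting
    return bypasses
```

With four processes, bypasses_fifo(4, 3) returns 3: at most N - 1 others get in ahead of P, so the wait is bounded. With the unfair policy, bypasses_lowest_pid_first(4, 3, 1000) returns 1000, and it keeps growing with the number of rounds: progress is made, but never by P.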
I would like to mention that the definition of a CS can be generalized so that at most N processes are executing in their critical sections where N is any positive integer. There are also variants of reader-writer critical sections.
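For the "at most N" generalization, a counting semaphore is the usual tool. The following is a minimal sketch using Python threads; peak_occupancy is a hypothetical helper that records the largest number of threads that were ever inside the section at the same time.

```python
import threading

def peak_occupancy(limit, num_threads=8, iterations=200):
    """Run workers against a section that admits `limit` threads; return the peak."""
    slots = threading.Semaphore(limit)   # at most `limit` holders at once
    guard = threading.Lock()             # protects the bookkeeping counters
    inside = 0
    peak = 0

    def worker():
        nonlocal inside, peak
        for _ in range(iterations):
            with slots:                  # entry: wait for a free slot; exit: give it back
                with guard:
                    inside += 1
                    peak = max(peak, inside)
                with guard:
                    inside -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak
```

The returned peak can never exceed limit, and with limit set to 1 the semaphore behaves like an ordinary critical section.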
Mutual exclusion
No two processes can be inside the critical section simultaneously; at most one process can be in the critical section at any point in time.
Progress
No process running outside the critical section should block another interested process from entering the critical section when the critical section is in fact free.
For example, if P1 (which is running outside the critical section) blocks P2 from entering while the critical section is free, progress is violated.
Bounded waiting
No process should have to wait forever to enter the critical section.
There should be a bound on how many times other processes can enter the critical section ahead of a waiting process.
If bounded waiting is not satisfied, then starvation is possible.
Note
No assumptions are made about the hardware or the processing speed.
Overall, a solution to the critical section problem must satisfy three conditions:
Mutual Exclusion: Each process has exclusive access to the shared memory. Only one process can be in its critical section at any given time.
Progress: If no process is in its critical section, and one or more processes want to execute their critical sections, then one of these processes must be allowed to enter its critical section.
Bounded Waiting: After a process makes a request to enter its critical section, there is a limit on how many times other processes can enter their critical sections before this process's request is granted. So after the limit is reached, the system must grant the process permission to enter its critical section. The purpose of this condition is to make sure that every process gets the chance to actually enter its critical section so that no process starves forever.
Requirements for deciding whether a synchronisation solution is correct:
1) Mutual exclusion: at any point in time, at most one process may be inside the critical section.
2) Progress: a process that is outside the critical section and does not want to enter it must not stop another interested process from entering when the critical section is free. If such a process can stop an interested process from entering, progress is not guaranteed; otherwise it is.
3) Bounded waiting: the waiting time of a process outside the critical section should be limited.
4) Architectural neutrality: no assumptions are made about the hardware.
(Definition in simple words)
Bounded waiting is violated when a single process gets the turn to enter the critical section every time, even though other processes are also interested in entering it.
Related
How does strict alternation guarantee bounded waiting?
Suppose there are two processes, P0 and P1.
Suppose turn=0 but P0 doesn't want to enter its CS, and P1 wants to. Won't that lead to starvation? If so, how is bounded waiting guaranteed?
Strict alternation implies there is unbounded waiting exactly due to the scenario you describe. If it is process 2's turn and it doesn't want to enter the critical section, then process 1 must wait for process 2 to enter and exit the critical section even though it was safe for process 1 to enter. Even worse, if process 2 halts while process 1 is waiting, then process 1 will wait forever.
An algorithm that strictly alternates violates both progress and bounded waiting properties.
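The scenario from the question can be replayed with a toy round-based model of strict alternation. This is a sketch, not a real implementation: strict_alternation_trace is a made-up helper, and each loop iteration stands for one point in time at which the turn holder either enters and leaves its CS or stays in its remainder section.

```python
def strict_alternation_trace(wants, steps):
    """Simulate strict alternation for two processes.

    wants[pid] is how many times process `pid` wants to enter its critical section.
    Returns (order of entries, requests still unserved) after `steps` rounds.
    """
    turn = 0
    remaining = list(wants)
    entered = []
    for _ in range(steps):
        if remaining[turn] > 0:
            entered.append(turn)      # the process whose turn it is enters and leaves
            remaining[turn] -= 1
            turn = 1 - turn           # the turn is handed over only on exit
        # Otherwise the turn holder sits in its remainder section, the turn never
        # changes, and the other process keeps waiting although the CS is free.
    return entered, remaining
```

strict_alternation_trace([0, 1], 100) returns ([], [0, 1]): with turn=0 and P0 not interested, P1 never gets in. Likewise strict_alternation_trace([1, 3], 100) returns ([0, 1], [0, 2]): once P0 is done for good, P1 is stuck with two unserved requests.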
As an aside, most good algorithms that alternate only do so under contention. For example, Peterson's Algorithm alternates between processes when the critical section is contended. However, if there is no contention (like the situation you describe), then a process can enter the critical section even though it is not its turn. Hence, Peterson's Algorithm has bounded waiting.
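Peterson's Algorithm is small enough that its behaviour for two processes can be checked exhaustively. The sketch below models each process as a four-step state machine and walks through every possible interleaving. It assumes that each step is atomic and that memory is sequentially consistent, which real hardware does not give you without fences; peterson_step, explore and check are illustrative names.

```python
def peterson_step(state, i):
    """Return the state after process i executes one atomic step."""
    pc, flag, turn = list(state[0]), list(state[1]), state[2]
    j = 1 - i
    if pc[i] == 0:        # remainder section -> announce interest
        flag[i] = True
        pc[i] = 1
    elif pc[i] == 1:      # give priority to the other process
        turn = j
        pc[i] = 2
    elif pc[i] == 2:      # wait only while the other is interested AND has priority
        if not (flag[j] and turn == j):
            pc[i] = 3
    else:                 # pc == 3: inside the critical section, now leave it
        flag[i] = False
        pc[i] = 0
    return (tuple(pc), tuple(flag), turn)

def explore():
    """Collect every state reachable under any interleaving of the two processes."""
    start = ((0, 0), (False, False), 0)
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for i in (0, 1):
            nxt = peterson_step(state, i)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

def check():
    """Return (mutual exclusion holds, no state where both processes are stuck)."""
    states = explore()
    mutual_exclusion = all(pc != (3, 3) for pc, _, _ in states)
    no_deadlock = all(
        peterson_step(s, 0) != s or peterson_step(s, 1) != s for s in states
    )
    return mutual_exclusion, no_deadlock
```

check() returns (True, True). The uncontended case is also easy to follow by hand: starting from turn = 1 with process 1 idle, three steps of process 0 take it into its critical section, because the wait condition needs the other flag to be raised as well.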
Alright, so I know that if a particular conditional branch has a condition that takes time to compute (memory access, for instance), the CPU assumes a condition result and speculatively executes along that path. However, what would happen if, along that path, yet another slow conditional branch pops up (assuming, of course, that the first condition hasn't been resolved yet and the CPU can't just commit the changes)? Does the CPU just speculate inside the speculation? What happens if the last condition is mispredicted but the first wasn't? Does it just roll back all the way?
I'm talking about something like this:
if (value_in_memory == y) {
    // computations
    if (another_val_memory == x) {
        // computations
    }
}
Speculative execution is the regular state of execution, not a special mode that an out of order CPU enters when it sees a branch and then leaves when the branch is no longer in flight.
This is easier to see if you consider that it's not just branches that can force a rollback: many instructions can fault, including those that access memory or have restrictions on their input values. So any substantial out of order execution implies constant speculation, and CPUs are built around that idea.
So "nested branches" doesn't end up being special in that sense.
Now, modern CPUs have a variety of methods for quick branch misprediction recovery, faster than recovery from other types of faults [1]. For example, they may snapshot the state of the register mapping at some branches, to allow recovery to start before the branch is at the head of the reorder buffer. Since it is not always feasible to snapshot at all branches, there might be complicated heuristics involved to decide where to take snapshots.
I mention this last part because it is one way in which nested branches might matter: when there are lots of branches in flight, you might hit some microarchitectural limits related to the tracking of these branches for recovery purposes. For more details, you can look through patents for "branch order buffer" (for Intel techniques, but there are no doubt others).
[1] The basic recovery method is to keep executing until the faulting instruction is the next to retire, and then throw away all younger instructions. In the context of branch mispredictions, this means you could actually suffer two or more mispredictions, only the oldest of which actually takes effect: e.g., a younger branch mispredicts, and while execution continues until that branch is next to retire (at which point recovery can occur), an older branch is also found to be mispredicted, so the younger one ends up getting discarded.
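As a toy illustration of the basic recovery rule applied to the nested branches from the question, the sketch below treats the in-flight instructions as a list in program order and discards everything younger than the oldest mispredicted branch. It is a deliberately simplified model, not how any particular CPU is built; squash_younger and the instruction names are made up.

```python
def squash_younger(in_flight, mispredicted):
    """Return (kept, discarded) after recovering from the oldest mispredicted branch.

    in_flight: instruction names in program order (oldest first).
    mispredicted: set of branch names whose prediction turned out to be wrong.
    """
    for index, name in enumerate(in_flight):
        if name in mispredicted:
            # Everything younger than this branch was fetched down the wrong path.
            return in_flight[: index + 1], in_flight[index + 1 :]
    return in_flight, []   # all predictions were right: nothing to roll back

window = ["load_a", "outer_branch", "work_1", "inner_branch", "work_2"]
```

If only the inner branch was wrong, squash_younger(window, {"inner_branch"}) discards just ["work_2"], so the work done under the correctly predicted outer branch survives. If both were wrong, squash_younger(window, {"outer_branch", "inner_branch"}) discards ["work_1", "inner_branch", "work_2"]: the inner misprediction never matters because it was on the wrong path anyway.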
(Maybe not a complete answer, but I had some of this written when @BeeOnRope posted an answer. Posting this anyway for some more links and technical details in case anyone's curious.)
Everything is always speculative until it reaches retirement and becomes non-speculative, definitely happened, part of the architectural state.
e.g. any load might fault with a bad address, any div might trap on divide by zero. See also "Out-of-order execution vs. speculative execution". That and "What exactly happens when a skylake CPU mispredicts a branch?" mention that branch mispredicts are handled specially, because they're expected to be frequent. Fast recovery can start before a mispredicted branch reaches retirement, unlike the behaviour for a faulting load, for example. (That's part of why Meltdown is exploitable.)
So even "regular" instructions are executed speculatively before being committed, and the only distinction between them is a human-made distinction, not computer-made? I presume, then, that the CPU stores multiple, possible rollback points? For instance if I have load instructions that may lead to page faults or simply use stale values, inside a conditional branch, the CPU identifies such instructions and scenarios and saves a state for each of them? I feel like I misunderstood because this may lead to a lot of storing register states and complicated dependencies.
The retirement state is always consistent so you can always roll back to there and discard all in-flight work, e.g. if an external interrupt arrives you want to handle it without waiting for a chain of a dozen cache miss loads to all execute. (See "When an interrupt occurs, what happens to instructions in the pipeline?")
This tracking basically happens for free or is something you need to do anyway to be able to detect which instruction faulted, not just that there was a problem somewhere. (This is called "precise exceptions")
The real distinction humans can usefully make is speculation that has a real chance of being wrong during execution of non-error cases. If your code gets a bad pointer, it doesn't really matter how it performs; it's going to page-fault and that's going to be very slow compared to local OoO exec details.
You're talking about a modern out-of-order (OoO) execution (not just fetch) CPU, like modern Intel or AMD x86, high-end ARM, MIPS r10000, etc.
The front-end is in-order (with speculation down predicted paths), and so is commit (aka retirement) from the out-of-order back-end into non-speculative retirement state (aka known-good architectural state).
The CPU uses two major structures to track instructions (or on x86, uops = parts of instructions) in the back-end. The last stage of the front-end (after fetch / decode) allocates/renames instructions and adds them into both of these structures at once.
RS = Reservation Station = scheduler: not-yet-executed instructions, waiting for an execution unit. The RS tracks dependencies and sends the oldest-ready uops to execution units that are ready.
ROB = ReOrder Buffer: not-yet-retired instructions. Instructions enter and leave in-order so it can just be a circular buffer.
Includes a flag to mark each entry as executed or not, set once the RS has sent it to an execution unit which reports success. The oldest instructions in the ROB that all have their done-executing bit set can "retire".
Also includes a flag which indicates "fault if this reaches retirement". This avoids spending time handling page faults from load instruction on the wrong path of execution (that might well have pointers into an unmapped page), for example. Either in the shadow of a branch mispredict, or just after another instruction (in program order) that should have faulted first but OoO exec got to it later.
(I'm also leaving out register-renaming onto a large physical register file. That's the "rename" part. Allocate includes choosing which execution port an instruction will use, and reserving a load or store buffer entry for memory instructions.)
(There's also a store-buffer; stores don't write directly to L1d cache, they write to the store buffer. This makes it possible to speculatively execute stores and still roll back without them becoming visible to other cores. It also decouples cache-miss stores from execution. Once a store instruction retires, the store-buffer entry "graduates" and is eligible to commit to L1d cache, once MESI gets exclusive access to the cache line, and once memory-ordering rules are satisfied.)
Execution units detect whether an instruction should fault, or was mis-speculated and should roll back, but don't necessarily act on that until the instruction reaches retirement.
In-order retirement is the step that recovers program-order after OoO exec, including the case of exceptions or mis-speculation.
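The ROB bookkeeping described above can be sketched as a queue with two flags per entry. This is a toy model for building intuition only: the class and method names are made up, and a real ROB is a fixed-size hardware structure, not a Python deque.

```python
from collections import deque

class ReorderBuffer:
    """Toy ROB: entries are allocated and retired in program order."""

    def __init__(self):
        self.entries = deque()   # oldest instruction on the left
        self.retired = []        # names that have become architectural state

    def allocate(self, name):
        """Add an instruction at the tail; execution sets its flags later."""
        entry = {"name": name, "done": False, "fault": False}
        self.entries.append(entry)
        return entry

    def retire(self):
        """Retire finished instructions from the head.

        Returns the name of a faulting instruction if one reaches the head,
        otherwise None.
        """
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            if entry["fault"]:
                self.entries.clear()   # discard all younger in-flight work
                return entry["name"]
            self.retired.append(entry["name"])
        return None
```

Suppose a, b and c are allocated in that order, and b and c finish first with b marked as faulting. retire() does nothing until a is done, because retirement is in order. Once a finishes, retire() retires a, reports the fault on b, and throws c away even though c had already executed.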
Terminology: Intel calls it "issue" when instructions are sent from the front-end into the ROB + RS. Other computer architecture people often call that "dispatch".
Sending uops from the RS to execution units is called "dispatch" by Intel, "issue" by other people.
Refer to Galvin et al., Operating System Concepts, 8th edition, chapter 6, section 6.9, page 257. It says, "If two critical sections are instead executed concurrently, the result is equivalent to their sequential execution in some unknown order. Although this property is useful in many application domains, in many cases we would like to make sure that a critical section forms a single logical unit of work that either is performed in its entirety or is not performed at all." When is that property useful? Please explain, thanks in advance!
The property is useful (because it increases potential parallelism) when the order that the critical sections are executed is irrelevant.
For a more complex example, let's say you have a thread fetching the next block from a file, a thread compressing the current block, and a thread sending the previously compressed block to a network connection.
In this case there are obvious constraints (you can't compress the current block while it's still being fetched, and you can't send the compressed block to a network connection until it's finished being compressed), but there are also obvious opportunities for parallelism where the order is irrelevant (you don't care if the next block is fetched before or after or while the current block is compressed, you don't care if the current block is compressed before or after or while the previously compressed block is being sent to network, and you don't care if the next block is fetched before or after or while the previously compressed block is being sent to network).
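A rough sketch of that three-thread pipeline in Python follows. The file and the network connection are replaced by in-memory stand-ins (the blocks argument and the sent list), and run_pipeline is a made-up name. queue.Queue does its own locking, so the only critical sections are the short put and get operations, and the order in which different threads execute them does not affect the result.

```python
import queue
import threading
import zlib

def run_pipeline(blocks):
    """Fetch, compress, and send blocks in three threads connected by queues."""
    fetched = queue.Queue(maxsize=2)      # fetch -> compress hand-off
    compressed = queue.Queue(maxsize=2)   # compress -> send hand-off
    sent = []                             # stands in for the network connection
    done = object()                       # sentinel marking the end of the stream

    def fetch():
        for block in blocks:              # stands in for reading from a file
            fetched.put(block)
        fetched.put(done)

    def compress():
        while (block := fetched.get()) is not done:
            compressed.put(zlib.compress(block))
        compressed.put(done)

    def send():
        while (data := compressed.get()) is not done:
            sent.append(data)

    threads = [threading.Thread(target=stage) for stage in (fetch, compress, send)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent
```

The queues enforce the real constraints (a block is compressed only after it was fetched, and sent only after it was compressed), while fetching block n+1, compressing block n and sending block n-1 are free to overlap in any order.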
I'm looking for an example of a job for which response time is important.
One definition of response time is:
The time taken in an interactive program from the issuance of a command to the commencement of a response to that command.
I've read that response time is important for interactivity, but I can't understand why. If the job isn't fully completed, what output could be produced that would be of interest to a user?
Wouldn't the user only care about how soon a job finishes, as that's the first time any output is produced?
For example, consider these two possible schedulings of two jobs:
Case 1: |---B---|---A---|
Case 2: |-A-|---B---|-A-|
Suppose that job A and B are issued at the same time, A being a command typed in by the user and B being some background process.
The response time for job A as I understand it would be shorter in case 2. As job A finishes (and produces output) at the same time in the two cases, I don't understand how the user benefits (or even notices) the better response time in case 2.
When writing an operating system, one has to take into consideration what the intended audience will be. In some cases it matters most to finish jobs as quickly as possible (supercomputer systems), in some cases it matters most to be as responsive as possible (regular desktop systems), and in some cases it matters most to be as predictable as possible (real-time systems).
For finishing jobs as fast as possible, tasks should be interrupted as rarely as possible (so big intervals between task switches are the best option). Here response time doesn't really matter much. It should be noted that task switches usually take some time (thousands of CPU cycles usually) due to having to save the state (including registers and paging structures) of the old task to memory and restore the state (including registers and paging structures) of the new task from memory. This also causes cache and TLB misses, since the cached information doesn't usually belong to the current process.
For being as responsive as possible, tasks should be interrupted as often as possible so the user doesn't experience so-called lag. This is where response time is important. Note however that on interrupt-driven architectures (like x86) an interrupt from the keyboard or the mouse would automatically pause execution of the current task and call the interrupt handler, which processes the input and sends it to the appropriate program.
For being as predictable as possible, input should be processed neither too fast nor too slow. This means that response time is constrained from both sides, making it much more important than in "as responsive as possible" designs. A missed timing constraint can even be a fatal failure in mission-critical systems.
In a nutshell, importance of response time varies from design to design and can range from nearly unimportant to critical.
I think I have an answer to my own question. The problem was, I was just thinking about simple processes like ls that, once issued, run for some amount of time and then, when they're finished, deliver their first and only output.
However, suppose job A in the example from the question is a program with multiple print statements. Output will in that case be produced before the process is complete (and some of the printouts may well occur during the first scheduled burst). It would thus make sense for interactivity to want to begin running such a process as soon as possible.
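To put numbers on the two cases from the question, here is a small sketch that computes when job A first gets the CPU and when it finishes. The burst lengths (8 units per job, split 4 + 4 in case 2) are assumptions chosen to match the diagrams, and "response time" is taken to mean the time until the job first runs, on the assumption that its first output comes during its first burst.

```python
def job_metrics(schedule, job):
    """Return (response_time, completion_time) for `job`, both measured from time 0.

    schedule: list of (job_name, duration) bursts in the order they run.
    """
    clock = 0
    first_start = None
    completion = None
    for name, duration in schedule:
        if name == job and first_start is None:
            first_start = clock       # the job gets the CPU for the first time
        clock += duration
        if name == job:
            completion = clock        # end of the job's latest burst so far
    return first_start, completion

case_1 = [("B", 8), ("A", 8)]              # |---B---|---A---|
case_2 = [("A", 4), ("B", 8), ("A", 4)]    # |-A-|---B---|-A-|
```

job_metrics(case_1, "A") gives (8, 16) and job_metrics(case_2, "A") gives (0, 16). The completion time is the same, but in case 2 the user can see the first printouts 8 time units earlier.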
In an operating systems subject I'm taking this semester, we were asked this question:
What are the techniques that can be used to protect critical sections?
I tried searching online but couldn't find anything.
Could anyone please briefly explain critical sections and the techniques used to protect them?
First of all, a critical section is only relevant to concurrent execution: it is a piece of code that cannot be executed by more than one thread or process at a given time.
The need arises when two or more threads or processes want to write to the same location at once,
which can potentially leave the data in an incorrect state or cause a deadlock.
Even a piece of code as innocent-looking as i += 1 has to be protected in a concurrent world: you have to remember that the execution of a thread or process can be suspended at any time by the OS.
The basic synchronization mechanisms are mutexes and monitors.
With semaphores, one can limit access to resources.
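Why i += 1 is dangerous becomes visible once it is split into its load, add and store steps. The sketch below replays a given interleaving of two threads deterministically instead of using real threads; run_interleaving is a made-up helper, and the three-step split is a simplification of what a compiler or interpreter actually emits.

```python
def run_interleaving(order):
    """Execute the load/add/store steps of `i += 1` for two threads in `order`.

    order: sequence of thread ids (0 or 1); each thread should appear three times.
    Returns the final value of the shared variable i, which starts at 0.
    """
    shared = {"i": 0}
    register = [None, None]       # each thread's private copy of i
    progress = [0, 0]             # index of the next step for each thread
    for tid in order:
        if progress[tid] == 0:
            register[tid] = shared["i"]     # load
        elif progress[tid] == 1:
            register[tid] += 1              # add
        else:
            shared["i"] = register[tid]     # store
        progress[tid] += 1
    return shared["i"]
```

run_interleaving([0, 0, 0, 1, 1, 1]) returns 2, as expected. run_interleaving([0, 1, 0, 1, 0, 1]) returns 1: both threads loaded 0 before either stored, so one increment is lost. Wrapping the three steps in a mutex rules out the second ordering.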
a) A process must first declare its intention to enter
the critical section by raising a flag.
b) Next, the critical section is entered and upon
leaving, the flag is lowered.
c) If the process is suspended after raising the flag
but before it is able to enter the critical section,
the other process will see the raised flag and not
enter until the flag is lowered.
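Steps a) to c) can be modelled for two processes as a tiny state machine and checked over every interleaving. This is a sketch assuming atomic steps; flag_step, reachable and analyse are illustrative names. It confirms that the flags give mutual exclusion, and it also exposes a known weakness of the flag-only scheme: if both processes raise their flags before either one checks, each waits for the other forever. Peterson's Algorithm adds a turn variable to break that tie.

```python
def flag_step(state, i):
    """One atomic step of process i under the flag protocol."""
    pc, flag = list(state[0]), list(state[1])
    j = 1 - i
    if pc[i] == 0:            # (a) raise the flag to declare intent
        flag[i] = True
        pc[i] = 1
    elif pc[i] == 1:          # (c) enter only once the other flag is down
        if not flag[j]:
            pc[i] = 2
    else:                     # (b) leave the critical section and lower the flag
        flag[i] = False
        pc[i] = 0
    return (tuple(pc), tuple(flag))

def reachable():
    """Collect every state reachable under any interleaving of the two processes."""
    start = ((0, 0), (False, False))
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for i in (0, 1):
            nxt = flag_step(state, i)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

def analyse():
    """Return (mutual exclusion holds, list of states where nobody can move)."""
    states = reachable()
    mutual_exclusion = all(pc != (2, 2) for pc, _ in states)
    stuck = [s for s in states if flag_step(s, 0) == s and flag_step(s, 1) == s]
    return mutual_exclusion, stuck
```

analyse() returns (True, [((1, 1), (True, True))]): no state has both processes in the critical section, but there is one reachable state where both flags are up and neither process can proceed.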