When do CPU protection rings play their role? - operating-system

I was reading about CPU protection rings and how system calls work, but that led me to a different question. What if a user program doesn't use kernel API calls (system calls) at all, and instead I write everything in assembly and execute it? If the user program has some inconsistent code, the CPU must not execute it, or the system may crash. But at what point does the CPU realize that a particular instruction xyzw must not be executed? How does the protection level play the key role here? Does the underlying ISA have a predefined privilege level for each instruction?
Thank You.

If the user program has some inconsistent code, the CPU must not
execute it, or the system may crash. But at what point does the
CPU realize that a particular instruction xyzw must not be executed?
What does this mean?
If there is wrong stuff, say division by 0, the CPU will raise an exception while trying to execute it. This switches you to the kernel, and the OS decides what to do - typically kill the process. Modulo CPU bugs, this is what happens for all "inconsistent" instructions, including privileged instructions attempted from user mode.
The CPU raises exceptions and switches to the kernel all the time - page faults, first-time FPU use and whatnot are the standard reasons.
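The exception path described above can be observed from user space. The following is a minimal sketch, assuming a POSIX system such as Linux: a child process performs an invalid memory access, the CPU raises a fault, the kernel takes over and terminates the child with a signal, and the parent reads that signal number from the exit status. The helper names touch_null, do_nothing and fatal_signal_of_child are made up for this illustration.

```c
#define _XOPEN_SOURCE 700  /* expose POSIX declarations under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical faulting routine: stores through a null pointer. The pointer
   variable itself is volatile, so the compiler must load it and emit the
   store; the CPU then faults on the unmapped address. */
static void touch_null(void) {
    int *volatile p = NULL;
    *p = 42;
}

/* Hypothetical well-behaved routine, used for comparison. */
static void do_nothing(void) {}

/* Runs fn in a child process. Returns the number of the signal that killed
   the child, 0 if the child exited normally, or -1 if fork/waitpid failed. */
static int fatal_signal_of_child(void (*fn)(void)) {
    assert(fn != NULL);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        fn();      /* the CPU exception, if any, happens here */
        _exit(0);  /* reached only if fn did not fault */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux, fatal_signal_of_child(touch_null) would typically report SIGSEGV, while fatal_signal_of_child(do_nothing) reports 0. The parent keeps running either way, because the kernel confined the damage to the faulting process.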

Related

How does the scheduler of an operating system regain control from a process?

-I would like to know: if we have a single-core CPU, and for a long time there are only CPU-intensive processes (no I/O requests), how does the scheduler regain control?
-I have read some stuff about timer interrupts. I would like to know how the operating system is able to set this timer.
I would like to know: if we have a single-core CPU, and for a long time there are only CPU-intensive processes (no I/O requests), how does the scheduler regain control?
There's multiple choices:
a) It's a cooperative scheduler and gets control when the currently running task voluntarily or accidentally gives the scheduler control via a kernel API function (which might be like yield() but could be anything that causes the currently running task to block - e.g. read()) or an exception (e.g. trying to access data that the kernel sent to swap space, causing a page fault where the page fault handler blocks the task until the data it needs is fetched from swap space). This can include the task crashing.
b) It's a preemptive scheduler that uses hardware (e.g. a timer) to ensure that the kernel will gain control (and pass control to the scheduler). Note that it might or might not be a timer (e.g. it could be a counter that counts the number of instructions executed, which has advantages for modern systems where CPU speed varies due to power management).
c) It's a "less cooperative/semi-preemptive" scheduler that opportunistically checks if a task switch should be done any time anything causes the kernel to gain control, but doesn't explicitly use any hardware to ensure that the kernel will gain control (e.g. so that things that seem unrelated to scheduling, like freeing memory, can cause a task switch).
d) It's a combination of the last 2 options - a preemptive scheduler that uses hardware to ensure that the kernel will gain control, and that (whenever the kernel has control for any reason) opportunistically checks if a task switch can be done a little early to avoid a relatively expensive IRQ that would've occurred soon.
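Option (b) has a rough user-space analogue that may help build intuition. The sketch below, assuming a POSIX system, arms an interval timer with setitimer and then spins in a loop that makes no system calls. The loop can only end because the timer signal repeatedly interrupts it, much as a hardware timer IRQ interrupts a CPU-bound task. The names tick_count, on_tick and spin_until_ticks are invented for this example. A real kernel programs the timer hardware directly rather than going through signals.

```c
#define _XOPEN_SOURCE 700  /* expose POSIX declarations under strict -std modes */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Incremented from the signal handler, read by the busy loop. */
static volatile sig_atomic_t tick_count = 0;

/* Plays the role of the timer interrupt handler. */
static void on_tick(int signo) {
    (void)signo;
    tick_count++;
}

/* Arms a periodic timer, then spins until `wanted` ticks were delivered.
   Returns the number of ticks observed, or -1 if setup failed. */
static int spin_until_ticks(int wanted, long interval_usec) {
    assert(wanted > 0 && interval_usec > 0 && interval_usec < 1000000);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) < 0) return -1;

    struct itimerval timer;
    memset(&timer, 0, sizeof timer);
    timer.it_interval.tv_usec = interval_usec;  /* period */
    timer.it_value.tv_usec = interval_usec;     /* first expiry */
    tick_count = 0;
    if (setitimer(ITIMER_REAL, &timer, NULL) < 0) return -1;

    while (tick_count < wanted) {
        /* CPU-bound: no system call, no voluntary yield */
    }

    memset(&timer, 0, sizeof timer);
    setitimer(ITIMER_REAL, &timer, NULL);  /* an all-zero value disarms the timer */
    return (int)tick_count;
}
```

A call such as spin_until_ticks(3, 10000) returns once at least three ticks have arrived, even though the loop body never gives up the CPU on its own.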
I have read some stuff about timer interrupts. I would like to know how the operating system is able to set this timer.
"The operating system" is a huge amount of stuff (e.g. includes things like data files for a help system and graphics for icons and ...). Typically there is a kernel which is able to do anything it likes with no restrictions, including accessing timer hardware directly.
The exact details of how a kernel would set a timer depend on which kind of timer it is. Note that there may be different types of timer to choose from (e.g. an 80x86 PC might have a PIT chip, an RTC chip, HPET, and a local APIC timer built into each CPU; where some are configured via IO ports, some are configured via memory mapped registers, and one may be configured via special registers/MSRs built into the CPU; where each type of timer has different frequencies, precision, accuracy, capabilities, etc).

EPT violations through speculative execution

The question stems from reading the Spectre attack paper. If I understand it correctly, the attack stems from the possibility of CPU heuristics speculatively executing the wrong branch of code.
Consider the example (in C):
int arr[42];
if (i < 42) {
    int j = arr[i];
}
If I understand the paper correctly, the int j = arr[i] can (in certain circumstances) be speculatively executed even when i >= 42. My question is: when I access an array outside of its bounds, my program would often crash (segmentation fault on Linux, "The program performed an illegal operation" error on Windows).
Why does speculative execution not cause programs to crash in case of array out of bound access?
The key point is that in modern CPUs the verb executing doesn't mean what you think it means.
To execute an instruction is the act of computing its output and side effects if any.
However, this doesn't change the program state.
This seems hard to grasp at first but it's really nothing exotic.
The CPU has quite a big internal memory made up of all its registers. Most of this memory is not visible to the programmer; the part that is visible is known as the architectural state.
The architectural state (AS) is what is documented in the CPU manuals and what is altered by a sequence of instructions (a program, for example).
Since altering the AS can only happen with the semantics given in the ISA (the manuals), and the ISA specifies serial semantics (instructions are completed one after the other in program order), this doesn't allow parallelism.
However, a modern CPU has a lot of resources (known as execution units) that can do their work independently.
To exploit all these resources, the front-end of the CPU (the part that is responsible for reading instructions from the memory hierarchy and feeding them to the execution units) is able to fetch, decode and output more than one instruction per cycle.
The boundary between the front-end and the back-end (where the execution units lie) is not really dealing with instructions anymore (but with uops) but that's an x86 CISC nuisance.
So now the CPU is given 4/6 uops to "execute" at a time, but if the ISA is serial, what could it possibly do other than queue these uops?
Well, the front-end is made so that these uops don't operate on the AS but on a shadow state (SS, my terminology here): their operands are renamed to refer to part of the big invisible memory of the CPU.
Altering the SS in parallel or out-of-order is fine, as it is not the AS.
This is what execution is: altering the SS.
Is it really worth it? After all, it is the AS that matters.
Well, transferring the SS to the AS is really fast compared to execution, so it's worth it.
It is a matter of "renaming back" (inverting the previous renaming) and it is called retiring of the instructions.
Actually, retiring is a bit more than that.
Since execution doesn't affect the AS, the side effect should also not affect it.
These side effects include exceptions but speculatively handling an exception is too cumbersome (it needs to coordinate a lot of resources) so exception handling is delayed until retirement.
This also has the advantage of having the correct AS at the moment the exception is handled and the advantage of raising an exception only when it must actually be.
The point of speculative execution is to bet: the CPU bets that the instruction sequence doesn't generate any exception (including page faults) and thus executes it with most checks off (off the top of my head, I cannot rule out that some checks are made regardless), thereby gaining a lot of performance.
When it's time to retire those instructions, the bets are checked, and if any fails, the SS is discarded.
That's why speculative execution doesn't crash your program.
What Spectre relies on is the fact that speculative execution does indeed alter the AS in some sense: the caches are not invalidated (again for performance reasons; the SS is simply not copied into the AS when a bet is off), and timing attacks are possible.
This could be corrected in a number of ways, including performing a basic privilege check when reading from the TLB (after all only privileges 0 and 3 are used, so the logic is simple) or adding a bit to the cache lines to mark them speculative (treated as invalid by non speculative code).
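To make the execute/retire split described in this answer concrete, here is a toy model in C. It is only a sketch under heavy simplifying assumptions, not how any real CPU is built: the "shadow state" is a small buffer of results, faults are merely recorded at execution time, and they are acted on only when an instruction on the correct path reaches retirement. All names (ToyCpu, RobEntry, toy_execute, toy_retire) are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { NREGS = 4, ROB_MAX = 8 };

typedef struct {
    int dest;         /* architectural register written at retirement */
    int value;        /* result computed during execution */
    bool faulted;     /* fault noted at execution, handled only at retirement */
    bool wrong_path;  /* uop lies beyond a mispredicted branch */
} RobEntry;

typedef struct {
    int arch_regs[NREGS];   /* the architectural state (AS) */
    RobEntry rob[ROB_MAX];  /* the shadow state (SS), in program order */
    size_t count;
    bool exception_raised;
} ToyCpu;

/* "Execute": record the result in the shadow state only. */
static bool toy_execute(ToyCpu *cpu, RobEntry e) {
    assert(cpu != NULL);
    if (cpu->count == ROB_MAX || e.dest < 0 || e.dest >= NREGS) return false;
    cpu->rob[cpu->count++] = e;
    return true;
}

/* "Retire" in program order: copy results into the AS until a lost bet
   (wrong path) or a real fault is found; everything younger is discarded.
   Returns how many entries were committed to the AS. */
static size_t toy_retire(ToyCpu *cpu) {
    assert(cpu != NULL);
    size_t retired = 0;
    for (size_t i = 0; i < cpu->count; i++) {
        const RobEntry *e = &cpu->rob[i];
        if (e->wrong_path) break;  /* bet lost: a fault here is never raised */
        if (e->faulted) {          /* correct path: exception becomes visible */
            cpu->exception_raised = true;
            break;
        }
        cpu->arch_regs[e->dest] = e->value;
        retired++;
    }
    cpu->count = 0;  /* discard whatever was not committed */
    return retired;
}
```

In this model, an out-of-bounds load that faulted on a mispredicted path leaves arch_regs untouched and exception_raised false, which mirrors why the program does not crash. What the model deliberately leaves out is the cache, which is exactly the side channel Spectre uses.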

Does each system call create a process?

Does each system call create a process?
Are all functions (e.g. interrupts) of programs and operating systems executed in the form of processes?
I feel that such a large number of process control blocks, a large number of process scheduling waste a lot of resources.
Or is the kernel code of the system call regarded as part of the current process?
The short answer is - not exactly. But we have to agree on what we are going to call a "process". A process is more of an abstract idea, which encapsulates multiple instructions, each sequentially executed.
So let's start from the first question.
Does each system call create a process?
No. Each system call is the product of the currently running process, that tells the OS - "Hey OS, I need you to open this file for me, or read these here bits". In this case, the process is a bag of sequentially executed instructions, some are system calls, some are not.
Then we have.
Are all functions (e.g. interrupts) of programs and operating systems executed in the form of processes?
Well, this kind of goes back to the first question. We are not considering a system call (an operation that tells the OS to do something and works under very strict conditions) to be a separate process. We will NOT see that system call execution get its OWN process id (pid).
Then we have.
I feel that such a large number of process control blocks, a large number of process scheduling waste a lot of resources.
Well, I would say, do not underestimate your OS and the capabilities of your hardware. A modern processor with a modern OS on it is VERY, VERY fast and more than capable of executing billions of instructions per second. We can't really imagine how fast that is. I wouldn't worry about optimizations on such a micro-level.
Okay, but let's dig deeper into this. What is a process exactly?
Informally, a process is a program in execution. The status of the current activity of a process is represented by a value, called the program counter, and the contents of the processor’s registers. The memory layout of a process is typically divided into multiple sections.
These sections include:
Text section.
Data section.
Heap section.
Stack section.
As a process executes, it changes state. The state of a process is defined in part by the current activity of that process. Each process is represented in the OS by a process control block (PCB), as you already mentioned.
So we can see that we treat a process as a very complicated structure that is MORE than just occupying CPU time. It has a state, storage, timing, and so on.
But because you are interested in system calls, then what are they?
For us, system calls provide an interface to the services made available by an OS. They are the way we tell the OS to do things FOR US. We know that systems execute thousands of system calls per second.
No, they don't.
The operating system uses a software interrupt (or a dedicated trap instruction) to execute the system call operation within the same process.
You can imagine them as a function call but they are executed with kernel privileges.
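A quick way to check the "same process" claim is to ask the kernel for the process id from inside a system call and compare it with what the calling process already knows. This is a small sketch assuming Linux with glibc, where the generic syscall() wrapper and SYS_getpid are available. The function name pid_via_raw_syscall is made up here.

```c
#define _GNU_SOURCE  /* glibc declares syscall() under this feature macro */
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Enters the kernel through the generic system call wrapper and returns the
   pid that the kernel code sees while servicing the call. */
static pid_t pid_via_raw_syscall(void) {
    long result = syscall(SYS_getpid);
    assert(result > 0);  /* getpid has no failure mode */
    return (pid_t)result;
}
```

The value returned equals getpid() in the caller: the kernel code ran on behalf of the calling process, with kernel privileges, and no new process or pid came into existence.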

How is it possible for OS processes to manage User processes while they themselves are processes?

Recently, I have been reading about Operating Systems, and this bugs me a lot.
How is it really possible for one process to manage other processes?
Basically, a CPU simply executes instructions: it executes the instruction at the address pointed to by the IP, increments the IP, and then executes the next one.
Let me elaborate my doubt with an example. Let's say I have a user process (or simply a process) which is being executed by the CPU. Let's say it has 'n' instructions and is currently executing the 'i'th instruction. The IP points to the (i+1)th instruction.
So, at this point, how can all the other OS processes like the scheduler, dispatcher, etc. come into play, since the CPU is already executing another process?
One solution (just a guess) I could think of is the use of interrupts and interrupt service routines.
But it's only a guess.
PS: I searched and couldn't find any satisfying answer.
With the help of the hardware, timer ticks cause the CPU to execute operating system code. This code checks the system state and the time that has elapsed since the beginning of this process's execution. At this point, the operating system can decide to schedule a different process. All it has to do is save the current state of the running process and load the state of the process that is about to start running (basically saving the register contents of the old process and replacing them with those of the new process).
Eventually, the CPU is taken away even if the process doesn't want to yield it.
To address your concern, there are no operating system processes in the way you think... it isn't like there are OS processes in the queue waiting among other processes....
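The save-and-restore step in this answer can be sketched as a toy simulation. This is not kernel code. It assumes that a process's whole register state is just a program counter, and that a "timer tick" arrives after a fixed quantum of instructions. The names Pcb and run_round_robin are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    long pc;         /* saved program counter (stand-in for all registers) */
    long remaining;  /* instructions left before the process exits */
} Pcb;

/* Round-robin over n processes, preempting after `quantum` instructions.
   Writes the indices of the processes to finish_order as they exit and
   returns how many finished (n once everything has run to completion). */
static size_t run_round_robin(Pcb *procs, size_t n, long quantum,
                              size_t *finish_order) {
    assert(quantum > 0);
    size_t finished = 0;
    for (size_t i = 0; i < n; i++) {
        if (procs[i].remaining <= 0) finish_order[finished++] = i;
    }
    size_t current = 0;
    while (finished < n) {
        Pcb *p = &procs[current];
        if (p->remaining > 0) {
            long cpu_pc = p->pc;  /* dispatch: load saved state into the "CPU" */
            for (long t = 0; t < quantum && p->remaining > 0; t++) {
                cpu_pc++;         /* the process executes one instruction */
                p->remaining--;
            }
            p->pc = cpu_pc;       /* tick: OS code saves the state in the PCB */
            if (p->remaining == 0) finish_order[finished++] = current;
        }
        current = (current + 1) % n;  /* OS code picks the next process */
    }
    return finished;
}
```

With three processes needing 5, 2 and 3 instructions and a quantum of 2, they finish in the order 1, 2, 0. Note that the scheduler here is not a fourth process in the queue. It is simply the code that runs between quanta, which matches the point made in the answer.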

Relationship between a kernel and a user thread

Is there a relationship between a kernel and a user thread?
Some operating system textbooks say that an OS "maps one (many) user thread to one (many) kernel thread". What does map mean here?
When they say map, they mean that each kernel thread is assigned to a certain number of user mode threads.
Kernel threads are used to provide privileged services to applications (such as system calls ). They are also used by the kernel to keep track of what all is running on the system, how much of which resources are allocated to what process, and to schedule them.
If your applications make heavy use of system calls and you have more user threads per kernel thread, your applications will run slower. This is because the kernel thread will become a bottleneck, since all system calls will pass through it.
On the flip side though, if your programs rarely use system calls (or other kernel services), you can assign a large number of user threads to a kernel thread without much performance penalty, other than overhead.
You can increase the number of kernel threads, but this adds overhead to the kernel in general, so while individual threads will be more responsive with respect to system calls, the system as a whole will become slower.
That is why it is important to find a good balance between the number of kernel threads and the number of user threads per kernel thread.
http://www.informit.com/articles/printerfriendly.aspx?p=25075
Implementing Threads in User Space
There are two main ways to implement a threads package: in user space and in the kernel. The choice is moderately controversial, and a hybrid implementation is also possible. We will now describe these methods, along with their advantages and disadvantages.
The first method is to put the threads package entirely in user space. The kernel knows nothing about them. As far as the kernel is concerned, it is managing ordinary, single-threaded processes. The first, and most obvious, advantage is that a user-level threads package can be implemented on an operating system that does not support threads. All operating systems used to fall into this category, and even now some still do.
All of these implementations have the same general structure, which is illustrated in Fig. 2-8(a). The threads run on top of a run-time system, which is a collection of procedures that manage threads. We have seen four of these already: thread_create, thread_exit, thread_wait, and thread_yield, but usually there are more.
When threads are managed in user space, each process needs its own private thread table to keep track of the threads in that process. This table is analogous to the kernel's process table, except that it keeps track only of the per-thread properties such as each thread's program counter, stack pointer, registers, state, etc. The thread table is managed by the run-time system. When a thread is moved to ready state or blocked state, the information needed to restart it is stored in the thread table, exactly the same way as the kernel stores information about processes in the process table.
When a thread does something that may cause it to become blocked locally, for example, waiting for another thread in its process to complete some work, it calls a run-time system procedure. This procedure checks to see if the thread must be put into blocked state. If so, it stores the thread's registers (i.e., its own) in the thread table, looks in the table for a ready thread to run, and reloads the machine registers with the new thread's saved values. As soon as the stack pointer and program counter have been switched, the new thread comes to life again automatically. If the machine has an instruction to store all the registers and another one to load them all, the entire thread switch can be done in a handful of instructions. Doing thread switching like this is at least an order of magnitude faster than trapping to the kernel and is a strong argument in favor of user-level threads packages.
However, there is one key difference with processes. When a thread is finished running for the moment, for example, when it calls thread_yield, the code of thread_yield can save the thread's information in the thread table itself. Furthermore, it can then call the thread scheduler to pick another thread to run. The procedure that saves the thread's state and the scheduler are just local procedures, so invoking them is much more efficient than making a kernel call. Among other issues, no trap is needed, no context switch is needed, the memory cache need not be flushed, and so on. This makes thread scheduling very fast.
User-level threads also have other advantages. They allow each process to have its own customized scheduling algorithm. For some applications, for example, those with a garbage collector thread, not having to worry about a thread being stopped at an inconvenient moment is a plus. They also scale better, since kernel threads invariably require some table space and stack space in the kernel, which can be a problem if there are a very large number of threads.
Despite their better performance, user-level threads packages have some major problems. First among these is the problem of how blocking system calls are implemented. Suppose that a thread reads from the keyboard before any keys have been hit. Letting the thread actually make the system call is unacceptable, since this will stop all the threads. One of the main goals of having threads in the first place was to allow each one to use blocking calls, but to prevent one blocked thread from affecting the others. With blocking system calls, it is hard to see how this goal can be achieved readily.
The system calls could all be changed to be nonblocking (e.g., a read on the keyboard would just return 0 bytes if no characters were already buffered), but requiring changes to the operating system is unattractive. Besides, one of the arguments for user-level threads was precisely that they could run with existing operating systems. In addition, changing the semantics of read will require changes to many user programs.
Another alternative is possible in the event that it is possible to tell in advance if a call will block. In some versions of UNIX, a system call, select, exists, which allows the caller to tell whether a prospective read will block. When this call is present, the library procedure read can be replaced with a new one that first does a select call and then only does the read call if it is safe (i.e., will not block). If the read call will block, the call is not made. Instead, another thread is run. The next time the run-time system gets control, it can check again to see if the read is now safe. This approach requires rewriting parts of the system call library, is inefficient and inelegant, but there is little choice. The code placed around the system call to do the checking is called a jacket or wrapper.
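The jacket idea from the paragraph above can be sketched in a few lines of C, assuming a POSIX system. The wrapper polls with select using a zero timeout and only calls read when data is already available. The name jacket_read and the JACKET_WOULD_BLOCK return value are conventions invented for this example, not part of any real threads package.

```c
#define _XOPEN_SOURCE 700  /* expose POSIX declarations under strict -std modes */
#include <assert.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

#define JACKET_WOULD_BLOCK (-2)  /* hypothetical code: run another thread */

/* Jacket around read: performs the read only if it will not block.
   Returns the byte count from read, -1 on error, or JACKET_WOULD_BLOCK. */
static ssize_t jacket_read(int fd, void *buf, size_t len) {
    assert(fd >= 0 && fd < FD_SETSIZE);
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    struct timeval no_wait = {0, 0};  /* zero timeout: poll, never sleep */

    int ready = select(fd + 1, &readable, NULL, NULL, &no_wait);
    if (ready < 0) return -1;
    if (ready == 0) return JACKET_WOULD_BLOCK;  /* read would have blocked */
    return read(fd, buf, len);                  /* safe: data is waiting */
}
```

A run-time system built this way would switch to another ready thread whenever JACKET_WOULD_BLOCK comes back, and retry the read the next time it gets control. The cost is visible in the code: one extra system call per read, which is part of why the text calls the approach inefficient.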
Somewhat analogous to the problem of blocking system calls is the problem of page faults. We will study these in Chap. 4. For the moment, it is sufficient to say that computers can be set up in such a way that not all of the program is in main memory at once. If the program calls or jumps to an instruction that is not in memory, a page fault occurs and the operating system will go and get the missing instruction (and its neighbors) from disk. This is called a page fault. The process is blocked while the necessary instruction is being located and read in. If a thread causes a page fault, the kernel, not even knowing about the existence of threads, naturally blocks the entire process until the disk I/O is complete, even though other threads might be runnable.
Another problem with user-level thread packages is that if a thread starts running, no other thread in that process will ever run unless the first thread voluntarily gives up the CPU. Within a single process, there are no clock interrupts, making it impossible to schedule processes round-robin fashion (taking turns). Unless a thread enters the run-time system of its own free will, the scheduler will never get a chance.
One possible solution to the problem of threads running forever is to have the run-time system request a clock signal (interrupt) once a second to give it control, but this, too, is crude and messy to program. Periodic clock interrupts at a higher frequency are not always possible, and even if they are, the total overhead may be substantial. Furthermore, a thread might also need a clock interrupt, interfering with the run-time system's use of the clock.
Another, and probably the most devastating argument against user-level threads, is that programmers generally want threads precisely in applications where the threads block often, as, for example, in a multithreaded Web server. These threads are constantly making system calls. Once a trap has occurred to the kernel to carry out the system call, it is hardly any more work for the kernel to switch threads if the old one has blocked, and having the kernel do this eliminates the need for constantly making select system calls that check to see if read system calls are safe. For applications that are essentially entirely CPU bound and rarely block, what is the point of having threads at all? No one would seriously propose computing the first n prime numbers or playing chess using threads because there is nothing to be gained by doing it that way.
User threads are managed in userspace - that means scheduling, switching, etc. are not from the kernel.
Since, ultimately, the OS kernel is responsible for context switching between "execution units" - your user threads must be associated with (i.e., "mapped" to) a kernel schedulable object - a kernel thread†1.
So, given N user threads - you could use N kernel threads (a 1:1 map). That allows you to take advantage of the kernel's hardware multi-processing (running on multiple CPUs) and be a pretty simplistic library - basically just deferring most of the work to the kernel. It does, however, make your app portable between OS's as you're not directly calling the kernel thread functions. I believe that POSIX Threads (PThreads) is the preferred *nix implementation, and that it follows the 1:1 map (making it virtually equivalent to a kernel thread). That, however, is not guaranteed as it'd be implementation dependent (a main reason for using PThreads would be portability between kernels).
Or, you could use only 1 kernel thread. That'd allow you to run on non multitasking OS's, or be completely in charge of scheduling. Windows' User Mode Scheduling is an example of this N:1 map.
Or, you could map to an arbitrary number of kernel threads - a N:M map. Windows has Fibers, which would allow you to map N fibers to M kernel threads and cooperatively schedule them. A threadpool could also be an example of this - N workitems for M threads.
†1: A process has at least 1 kernel thread, which is the actual execution unit. Also, a kernel thread must be contained in a process. OS's must schedule the thread to run - not the process.
This is a question about thread library implementation.
In Linux, a thread (or task) can run in user space or in kernel space. The process enters kernel space when it asks the kernel to do something via a syscall (read, write or ioctl).
There is also a so-called kernel-thread that runs always in kernel space and does not represent any user process.
According to Wikipedia and Oracle, user-level threads are actually in a layer mounted on the kernel threads; not that kernel threads execute alongside user-level threads but that, generally speaking, the only entities that are actually executed by the processor/OS are kernel threads.
For example, assume that we have a program with 2 user-level threads, both mapped to (i.e. assigned) the same kernel thread. Sometimes, the kernel thread runs the first user-level thread (and it is said that currently this kernel thread is mapped to the first user-level thread) and some other times the kernel thread runs the second user-level thread. So we say that we have two user-level threads mapped to the same kernel thread.
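The two-threads-on-one-kernel-thread example can be mimicked with a very small run-time system. In this sketch each user-level thread is reduced to a name and a count of remaining steps kept in a thread table, and a "step" stands for the work done between two thread_yield calls. A real package would save and restore stacks and registers instead. All names (Runtime, UserThread, rt_init, rt_spawn, rt_run) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { MAX_THREADS = 8, TRACE_MAX = 64 };

typedef struct {
    char name;       /* label written to the trace when the thread runs */
    int steps_left;  /* saved per-thread state; 0 means the thread has exited */
} UserThread;

typedef struct {
    UserThread table[MAX_THREADS];  /* the per-process thread table */
    size_t count;
    char trace[TRACE_MAX];          /* order in which the threads ran */
    size_t trace_len;
} Runtime;

static void rt_init(Runtime *rt) {
    memset(rt, 0, sizeof *rt);
}

static bool rt_spawn(Runtime *rt, char name, int steps) {
    assert(steps > 0);
    if (rt->count == MAX_THREADS) return false;
    rt->table[rt->count].name = name;
    rt->table[rt->count].steps_left = steps;
    rt->count++;
    return true;
}

/* Runs every user-level thread to completion, round-robin, on whichever
   single kernel thread calls this function. */
static void rt_run(Runtime *rt) {
    bool any_ready = true;
    while (any_ready) {
        any_ready = false;
        for (size_t i = 0; i < rt->count; i++) {
            UserThread *t = &rt->table[i];
            if (t->steps_left == 0) continue;
            /* Keep the final byte zero so trace stays a valid C string. */
            if (rt->trace_len + 1 < TRACE_MAX) rt->trace[rt->trace_len++] = t->name;
            t->steps_left--;  /* one step of work, then the thread yields */
            if (t->steps_left > 0) any_ready = true;
        }
    }
}
```

Spawning thread 'A' with 3 steps and thread 'B' with 2 steps and then calling rt_run leaves the trace "ABABA". From the kernel's point of view only one thread ever ran, which is what "two user-level threads mapped to the same kernel thread" means.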
As a clarification:
The core of an OS is called its kernel, so the threads at the kernel level (i.e. the threads that the kernel knows of and manages) are called kernel threads, the calls to the OS core for services can be called kernel calls, and so on. The only definite relation between kernel things is that they are strongly related to the OS core, nothing more.