minimal number of privileged instructions?

minimal number of privileged instructions? - operating-system

minimal number of privileged instructions?
Say we want to write a OS with minimal number of privileged instruction.
I think it should be 1, only MMU register. But what about other things? i.e mode bit, trap

Well you can implement an operating system with everything in system mode, and you could argue that there are no "privileged" instructions.
As to whether you could implement an OS with privileged and non-privileged modes using N different privileged instructions:
it would depend on the functionality you aimed to implement,
it would depend on the hardware instruction set, MMU design, etcetera, and
unless you were prepared to put months / years into a theoretical analysis, it would be a matter of debate / opinion as to whether your proposed answer was indeed correct.

Operating System need to provide security (including memory isolation from different programs) and abstraction (each program doesn't need to care how much memory is available on physical memory).
To maintain these: you need at least 1 privileged instruction.
The privilege instruction is to setup Memory Management Unit registers so that you can ensure the memory is protected. There should be no IO instructions and all IO and interrupt access should be memory mapped.
Use MMU to ensure kernel memory, kernel code, "memory of interrupt access" and "device's memory mapped IO interface" are not mapped to user space, so user processes cannot access these. These memories lies in kernel memory.

Related

Hoe does a bare metal hypervisor and the operating system it hosts cooridinate on system calls?

I have read a great deal about bare metal hypervisors, but never quite get the way they interact with an OS they are hosting.
Suppose you have Unix itself on bare metal. When in user mode, you can't touch or affect the OS internals. You get things done by a system call that gets trapped, sets the machine to kernel mode, then does the job for you. For example, in C you might malloc() a bunch, then eventually run out of initially allocated memory. If memory serves me right, malloc - when it knows it is out of memory - must make the system call to what I believe is break(). Once in kernel mode, your process's page table can be extended, then it returns and malloc() has the required extra memory (or something like that).
But if you have Unix on top of a bare metal hypervisor, how does this actually happen? The hypervisor, it would seem, must have the actual page tables for the whole system (across OSs, even). So Unix can't be in kernel mode when a system call to Unix gets made, otherwise it could mess with other OSs running at the same time. On the other hand, if it is running in User mode, how would the code that implements break ever let the hypervisor know it wants more memory without the Unix code being rewritten?

In most architectures another level is added beyond supervisor, and supervisor is somewhat degraded. The kernel believes itself to be in control of the machine, but that is an illusion crafted by the hypervisor.
In ARM, user mode is 0, system is 1, hypervisor is 2. Intel were a bit short sighted (gasp) and had user as 3, supervisor as 0, thus hypervisor is a sort of -1. Obviously its not -1, but that is a handy shorthand to the intensely ugly interface they constructed for this handling.
In most architectures, the hypervisor gets to install an extra set of page tables which take effect after then guest's page tables do. So, your unix kernel thinks it was loaded at 1M physical, could be at any arbitrary address, and every address your unix kernel thinks is contiguous at a page boundary could be scatter over a vast set of actual (bus) addresses.
Even if your architecture doesn't permit an extra level of page tables, it is straightforward enough for a hypervisor to "trap & emulate" the page tables constructed by the guest, and maintain an actual set in a completely transparent fashion. The continual motion towards longer pipelines, however, increases the cost of each trap, thus an extra level page table is much appreciated.
So, your UNIX thinks it has all 8M of memory to itself; however unbeknownst to it, a sneaky hypervisor may be paging that 8M to a really big floppy drive, and only giving it a paltry 640K of real RAM. All the normal unix-y stuff works fine, except that it may have a pretty trippy sense of time, where time slows and speeds up in alternating phases, as they hypervisor attempts to pretend that a 250msec floppy disk access completed in the time of a 60nsec dram access.
This is where hypervisors get hard.

Is micro kernel possible without MMU?

In the following link;
https://www.openhub.net/p/f9-kernel
F9 Microkernel runs on Cortex M, but Cortex M series doesn't have MMU. My knowledge on MMU and Virtual Memory are limited hence the following quesitons.
How the visibility of entire physical memory is prevented for each process without MMU?
Is it possible to achieve isolation with some static memory settings without MMU. (with enough on chip RAM to run my application and kernel then, just different hard coded memory regions for my limited processes). But still I don't will this prevent the access?

ARM Cortex-M processors lack of MMU, and there is optional memory protection unit (MPU) in some implementations such as STMicroelectronics' STM32F series.
Unlike other L4 kernels, F9 microkernel is designed for MPU-only environments, optimized for Cortex M3/M4, where ARMv7 Protected Memory System Architecture (PMSAv7) model is supported. The system address space of a PMSAv7 compliant system is protected by a MPU. Also, the available RAM is typically small (about 256 Kbytes), but a larger Physical address space (up to 32-bit) can be used with the aid of bit-banding.
MPU-protected memory is divided up into a set of regions, with the number of regions supported IMPLEMENTATION DEFINED. For example, STM32F429, provides 8 separate memory regions. In PMSAv7, the minimum protect region size is 32 bytes, and maximum is up to 4 GB. MPU provides full access over:
Protection region
Overlapping protection region
Access permissions
Exporting memory attributes to the system
MPU mismatches and permission violations invoke the programmable priority MemManage fault handler.
Memory management in F9 microkernel, can split into three conceptions:
memory pool, which represents the area of PAS with specific attributes (hardcoded in mem map table).
address space - sorted list of fpages bound to particular thread(s).
flexible page - unlike traditional pages in L4, fpage represent in MPU region instead.

Yes, but ....
There is no requirement for an MMU at all, things just get less convenient and flexible. Practically, anything that provides some form of isolation (e.g. MPU) might be good enough to make a system work - assuming you do need isolation at all. If you don't need it for some reason and just want the kernel to do scheduling, then a kernel can do this without an MMU or MPU also.

Context Switch questions: What part of the OS is involved in managing the Context Switch?

I was asked to anwer these questions about the OS context switch, the question is pretty tricky and I cannot find any answer in my textbook:
How many PCBs exist in a system at a particular time?
What are two situations that could cause a Context Switch to occur? (I think they are interrupt and termination of a process,but I am not sure )
Hardware support can make a difference in the amount of time it takes to do the switch. What are two different approaches?
What part of the OS is involved in managing the Context Switch?

There can be any number of PCBs in the system at a given moment in time. Each PCB is linked to a process.
Timer interrupts in preemptive kernels or process renouncing control of processor in cooperative kernels. And, of course, process termination and blocking at I/O operations.
I don't know the answer here, but see Marko's answer
One of the schedulers from the kernel.

3: A whole number of possible hardware optimisations
Small register sets (therefore less to save and restore on context switch)
'Dirty' flags for floating point/vector processor register set - allows the kernel to avoid saving the context if nothing has happened to it since it was switched in. FP/VP contexts are usually very large and a great many threads never use them. Some RTOSs provide an API to tell the kernel that a thread never uses FP/VP at all eliminating even more context restores and some saves - particularly when a thread handling an ISR pre-empts another, and then quickly completes, with the kernel immediately rescheduling the original thread.
Shadow register banks: Seen on small embedded CPUs with on-board singe-cycle SRAM. CPU registers are memory backed. As a result, switching bank is merely a case of switching base-address of the registers. This is usually achieved in a few instructions and is very cheap. Usually the number of context is severely limited in these systems.
Shadow interrupt registers: Shadow register banks for use in ISRs. An example is all ARM CPUs that have a shadow bank of about 6 or 7 registers for its fast interrupt handler and a slightly fewer shadowed for the regular one.
Whilst not strictly a performance increase for context switching, this can help ith the cost of context switching on the back of an ISR.
Physically rather than virtually mapped caches. A virtually mapped cache has to be flushed on context switch if the MMU is changed - which it will be in any multi-process environment with memory protection. However, a physically mapped cache means that virtual-physical address translation is a critical-path activity on load and store operations, and a lot of gates are expended on caching to improve performance. Virtually mapped caches were therefore a design choice on some CPUs designed for embedded systems.

The scheduler is the part of the operating systems that manage context switching, it perform context switching in one of the following conditions:
1.Multitasking
2.Interrupt handling
3.User and kernel mode switching
and each process have its own PCB

How are the stack pointer and program status word maintained in multiprocessor architecture?

In a multi-processor architecture, how are registers organized?
For example, in a 4 cores processor, a minimum of 4 processes can run at a time.
How are stack pointer, program status registers and program counter organized?
What about other general purpose registers?
My guess is, each core will have a separate set of registers.

Imagine 4 completely separate computers, each with a single-core CPU. A 4-core computer is like that; except:
All CPUs share the same physical address space (and can all use the same RAM, PCI devices, etc)
Interrupt/IRQ controllers may be designed so the OS can tell it which CPU/s should be interrupted by the IRQ
CPUs are typically able to signal each other (e.g. "inter-processor interrupts")
Some CPUs may share some caches
Some CPUs may share some control registers (e.g. for things like power management, cache configuration, etc)
For modern CPUs, some CPUs may share some or all execution units (SMT, hyper-threading, etc)
For modern systems (where memory controller is built into the physical chip) some CPUs may share the same memory controller
Most of this is "invisible" to most software. Unless you're writing part of an OS that controls power management, you don't need to care if power management is shared between CPUs or not; unless you're writing an OSs/kernel's low level IRQ handling you don't need to care how IRQs reach device drivers, etc.
The same applies to how many CPUs actually exist. The OS/kernel normally ensures that applications only need to care about higher level abstractions (e.g. "threads"). How this higher level abstraction works depends on the OS - normally (for most OSs) the OS/kernel attempts to provide the illusion that all threads are running at the same time by switching between them "quickly enough" (where if there's only 4 CPUs a maximum of 4 threads actually do run at the exact same time), but it's usually far more complex than this (involving things like thread priorities, pre-emption rules, etc) and (even though it's relatively rare) it may be very different (e.g. for some systems the same thread may be run on multiple CPUs at the same time for fault tolerance/redundancy purposes; for some systems there might just be a queue of functions and their data, where multiple functions run at the same time; etc).

Multiprocessor means that there are at least two discrete processors on the same platform -- usually on the same motherboard
A subset is distributed multiprocessing, where two PC's for example are programmed to appear as a single system with two processors
Multicore means that the most or all of the CPU is replicated many times on single chip.
- this also means that stack, status, program counter and all generic purpose registers are replicated.
Hyperthreading is a technique, where each stage of the pipeline executes commands from different processes.
Multiprocessing means in OS level that everything a process consists of, is switched every now and then.
Multithreading is a lightweight variant of multiprocessing, where the threads e.g. share the same code segment and same data segment, same file descriptors etc. but have unique stacks (and of course unique status registers and program counters)
Also means multiprocessing in general (hardware architecture)

How exactly does OS protect kernel

my question is how exactly does operating system protect it's kernel part.
From what I've found there are basically 2 modes kernel and user. And there should be some bits in memory segments which tels if a memory segment is kernel or user space segment. But where is the origin of those bits? Is there some "switch" in compiler that marks programs as kernel programs? And for example if driver is in kernel mode how does OS manages its integration to system so there is not malicious software added as a driver?
If someone could enlighten me on this issue, I would be very grateful, thank you

The normal technique is by using a feature of the virtual memmory manager present in most modern cpus.
The way that piece of hardware works is that it keeps a list of fragments of memory in a cache, and a list of the addresses to which they correspond. When a program tries to read some memory that is not present in that cache, the MMU doesn't just go and fetch the memory from main ram, because the addresses in the cacher are only 'logical' addresses. Instead, it invokes another program that will interpret the address and fetch that memory from wherever it should be.
That program, called a pager, is supplied by the kernel, and special flags in the MMU prevent that program from being overridden.
If that program determines that the address corresponds to memory the process should get to use, it supplies the MMU with the physical address in main memory that corresponds to the logical address the user program asked for, the MMU fetches it into its cache, and resumes running the user program.
If that address is a 'special' address, like for a memory mapped file, then the kernel fetches the corresponding part of the file into the cache and lets the program run along with that.
If the address is in the range that belongs to the kernel, or that the program hasn't allocated that address to itself yet, the pager raises a SEGFAULT, killing the program.
Because the addresses are logical addresses, not physical addresses, different user programs may use the same logical addresses to mean different physical addresses, the kernel pager program and the MMU make this all transparent and automatic.
This level of protection is not available on older CPU's (like 80286 cpus) and some very low power devices (like ARM CortexM3 or Attiny CPUs) because there is no MMU, all addresses on these systems are physical addresses, with a 1 to 1 correspondence between ram and address space

The “switch” is actually in the processor itself. Some instructions are only available in kernel mode (a.k.a. ring 0 on i386). Switching from kernel mode to user mode is easy. However, there are not so many ways to switch back to kernel mode. You can either:
send an interrupt to the processor
make a system call.
In either case, the operation has the side effect of transferring the control to some trusted, kernel code.

When a computer boots up, it starts running code from some well known location. That code ultimately ends up loading some OS kernel to memory and passing control to it. The OS kernel then sets up the CPU memory map via some CPU specific method.

And for example if driver is in kernel mode how does OS manages its integration to system so there is not malicious software added as a driver?
It actually depends on the OS architecture. I will give you two examples:
Linux kernel: A driver code can be very powerful. The level of protections are following:
a) A driver is allowed to access limited number of symbols in the kernel, specified using EXPORT_SYMBOL. The exported symbols are generally functions. But nothing prevents a driver from trashing a kernel using wild pointers. And the security using EXPORT_SYMBOL is nominal.
b) A driver can only be loaded by the privileged user who has root permission on the box. So as long as root privileges are not breached system is safe.
Micro kernel like QNX: The operating system exports enough interface to the user so that a driver can be implemented as a user space program. Hence the driver at least cannot easily trash the system.