Hoe does a bare metal hypervisor and the operating system it hosts cooridinate on system calls?

Hoe does a bare metal hypervisor and the operating system it hosts cooridinate on system calls? - hypervisor

I have read a great deal about bare metal hypervisors, but never quite get the way they interact with an OS they are hosting.
Suppose you have Unix itself on bare metal. When in user mode, you can't touch or affect the OS internals. You get things done by a system call that gets trapped, sets the machine to kernel mode, then does the job for you. For example, in C you might malloc() a bunch, then eventually run out of initially allocated memory. If memory serves me right, malloc - when it knows it is out of memory - must make the system call to what I believe is break(). Once in kernel mode, your process's page table can be extended, then it returns and malloc() has the required extra memory (or something like that).
But if you have Unix on top of a bare metal hypervisor, how does this actually happen? The hypervisor, it would seem, must have the actual page tables for the whole system (across OSs, even). So Unix can't be in kernel mode when a system call to Unix gets made, otherwise it could mess with other OSs running at the same time. On the other hand, if it is running in User mode, how would the code that implements break ever let the hypervisor know it wants more memory without the Unix code being rewritten?

In most architectures another level is added beyond supervisor, and supervisor is somewhat degraded. The kernel believes itself to be in control of the machine, but that is an illusion crafted by the hypervisor.
In ARM, user mode is 0, system is 1, hypervisor is 2. Intel were a bit short sighted (gasp) and had user as 3, supervisor as 0, thus hypervisor is a sort of -1. Obviously its not -1, but that is a handy shorthand to the intensely ugly interface they constructed for this handling.
In most architectures, the hypervisor gets to install an extra set of page tables which take effect after then guest's page tables do. So, your unix kernel thinks it was loaded at 1M physical, could be at any arbitrary address, and every address your unix kernel thinks is contiguous at a page boundary could be scatter over a vast set of actual (bus) addresses.
Even if your architecture doesn't permit an extra level of page tables, it is straightforward enough for a hypervisor to "trap & emulate" the page tables constructed by the guest, and maintain an actual set in a completely transparent fashion. The continual motion towards longer pipelines, however, increases the cost of each trap, thus an extra level page table is much appreciated.
So, your UNIX thinks it has all 8M of memory to itself; however unbeknownst to it, a sneaky hypervisor may be paging that 8M to a really big floppy drive, and only giving it a paltry 640K of real RAM. All the normal unix-y stuff works fine, except that it may have a pretty trippy sense of time, where time slows and speeds up in alternating phases, as they hypervisor attempts to pretend that a 250msec floppy disk access completed in the time of a 60nsec dram access.
This is where hypervisors get hard.

Related

What is the purpose of Logical addresses in operating system? Why they are generated

I want to know that why the CPU generates logical addresses and then maps them into Physical addresses with the help of memory manager? Why do we need them.

Virtual addresses are required to run several program on a computer.
Assume there is no virtual address mechanism. Compilers and link editors generate a memory layout with a given pattern. Instruction (text segment) are positioned in memory from address 0. Then are the segments for initialized or uninitialized data (data and bss) and the dynamic memory (heap and stack). (see for instance https://www.geeksforgeeks.org/memory-layout-of-c-program/ if you have no idea on memory layout)
When you run this program, it will occupy part of the memory that will no longer be available for other processes in a completely unpredictable way. For instance, addresses 0 to 1M will be occupied, or 0 to 16k, or 0 to 128M, it completely depends on the program characteristics.
If you now want to run concurrently a second program, where will its instructions and data go to memory? Memory addresses are generated by the compiler that obviously do not know at compile time what will be the free memory. And remember memory addresses (for instructions or data) are somehow hard-coded in the program code.
A second problem happens when you want to run many processes and that you run out of memory. In this situations, some processes are swapped out to disk and restored later. But when restored, a process will go where memory is free and again, it is something that is unpredictable and would require modifying internal addresses of the program.
Virtual memory simplifies all these tasks. When running a process (or restoring it after a swap), the system looks at free memory and fills page tables to create a mapping between virtual addresses (manipulated by the processor and always unchanged) and physical addresses (that depends on the free memory on the computer at a given time).

Logical address translation serves several functions.
One of these is to support the common mapping of a system address space to all processes. This makes it possible for any process to handle interrupt because the system addresses needed to handle interrupts are always in the same place, regardless of the process.
The logical translation system also handles page protection. This makes is possible to protect the common system address space from individual users messing with it. It also allows protecting the user address space, such as making code and data read only, to check for errors.
Logical translation is also a prerequisite for implementing virtual memory. In an virtual memory system, each process's address space is constructed in secondary storage (ie disk). Pages within the address space are brought into memory as needed. This kind of system would be impossible to implement if processes with large address spaces had to be mapped contiguously within memory.

How are applications and data accessed by the CPU from RAM

I am having a bit of trouble understanding how applications and data are accessed by the CPU from RAM after the application has been loaded into RAM and a file opened (thus data for the file also stored in RAM).
By my understanding, a CPU just gets instructions from RAM as the program counter ticks or carries out tasks after an interrupt. How then does it access the application and data. Is it that it doesn't and still just gets instructions (for example to load a file on the hard drive to be opened in the application) and processes any requests made by the application which are stored in RAM as instructions thereafter (like saving a file). Or does the application and data relating to an opened file (for example) just stay in RAM and not get accessed by the CPU at all.
Similarly, after reading an article, it said that a copy of the operating system is stored in RAM. The CPU can then access the operating system. (I thought the CPU just worked with instructions from RAM). How does it then communicate with the operating system and how are interrupts sent to the CPU, from the copy of the OS in RAM or from the OS in the hard drive.
Sorry if this is really confusing, alot i didn't understand.

Root of your question: Lack of clear differentiation between Computer's Hardware and Computer's Software.
Components of a Computer System
Just so that we are clear about both of them and that we understand their nature, let me state as follows:
Hardware: It includes CPU, RAM, Disk, Register, Graphics Card, Network Card, Memory BUS and everything that you can touch and call to be the 'Computer'. It is the body.
Software: It includes Operating System, Program, CPU instruction, Compiler, Programming Language and almost everything intangible about the computer. It is the soul.
Firmware: It is that basic code which is absolutely essential for hardware's working. This is stored on a Read Only Memory installed in the hardware itself. This piece of software is vital for hardware therefore is considered in the mid of hardware and software and hence called Firmware.
We will start with understanding from the time when we say that the computer is up and running and is properly executing our instructions. But at that time you will say - How did I reach here? So I will mention a few points about the startup of the computer.
When the power button is pressed...
...the most primitive and basic input output system (therefore called BIOS), which is hard written on the computer hardware begins execution. This is written on Read Only Memory and this starts the process to get the machine to stand on its own. And it loads the software (Operating System) from one piece of hardware (disks) into another piece of hardware (RAM and CPU registers) enabling the software to work properly with hardware.
Now the body and soul are together and the individual (machine) can work.
Until now, OS is already in RAM and CPU. (Read When the power button is pressed if you doubt it.) Let's handle your question paragraph by paragraph now -
First Paragraph
I am having a bit of trouble understanding how applications and data
are accessed by the CPU from RAM after the application has been loaded
into RAM and a file opened (thus data for the file also stored in
RAM).
The explanation is as follows:
The exact issue here is your thinking that it is CPU and RAM that access the data. CPU and RAM are only executing units.
It is OS (software) that accesses the data by means of CPU and RAM (hardware). It is in the realm of OS where applications are executed.
This is why you can install Linux and Windows on same hardware but cannot execute .exe files in Linux because OS does the execution and not RAM/CPU.
Further, how do CPU and RAM and disk physically interact to bring in the data, execute it, save it back etc. is in the domain of hardware. That would require explanation which involves logic gates (AND, OR, NOT...), diodes, circuitry and a hell lot of other things which an Electronics guy can explain.
Second Paragraph
By my understanding, a CPU just gets instructions from RAM as the
program counter ticks or carries out tasks after an interrupt. How
then does it access the application and data. Is it that it doesn't
and still just gets instructions (for example to load a file on the
hard drive to be opened in the application) and processes any
requests made by the application which are stored in RAM as
instructions thereafter (like saving a file).
As you have guessed it - CPU doesn't get instructions, Operating System does it through CPU. Also, just the way brain doesn't directly instruct the hands and legs to move and instead uses nerves for interaction, the CPU doesn't tell the disks to give/take the data. CPU works with RAM and registers only. Multiple units of hardware work in conjunction to provide a path for data and instruction to travel. The important pieces of involved hardware are:
Processor (CPU and registers built in the CPU)
Cache
Memory (RAM)
Disk
Tape
I like the image provided in this answer. This image not only lists the hardware pieces but also illustrates the mammoth difference in the execution speed of these pieces.
Let's move on to the...
Third Paragraph
Similarly, after reading an article, it said that a copy of the
operating system is stored in RAM. The CPU can then access the
operating system. (I thought the CPU just worked with instructions
from RAM). How does it then communicate with the operating system and
how are interrupts sent to the CPU, from the copy of the OS in RAM or
from the OS in the hard drive.
By now you already know that indeed OS is present in RAM and CPU registers. That is where it lives. That is from where it tells the CPU how to work. If OS would be small enough (or if Registers and Caches would be big enough), the OS would live even closer to CPU.
The CPU does not communicate with the OS. It can't. It is the worker that is controlled by a boss. OS is that boss.
CPU cannot access Operating System. CPU is the body, OS is the soul. Soul tells the body what to do, not vice-versa.
CPU doesn't work with instructions from RAM. It merely executes the instructions given by the Operating System (which may be living in RAM). So even when there is an instruction to load some module of OS into the RAM, it is not RAM/CPU but OS itself that issues that instruction.
Interrupts are of two types - Hardware and Software - and your query is about the software interrupts. Since the executive part of OS is in the RAM, in simple words we can say that interrupts are sent to CPU from OS living in RAM.
Conclusions
The lack of distinction between hardware and software is the basic cause of your confusions. Take some course about Operating Systems on Coursera or Academic Earth for deeper understanding.

It is confusing indeed. Let me try to explain.
CPU and RAM
The CPU is hardwired to the RAM via the 'motherboard', and they work together. The CPU can perform many instructions, but it has to be told what to do by instructions in RAM. The CPU is basically in a loop: all it does it fetch the next instruction from RAM and execute it, over and over.
So how does this RAM get filled with instructions?
BIOS (basic input/output system)
When the computer first boots up, a portion of RAM is filled with data from a chip on the motherboard (the BIOS chip), and the CPU is turned on and starts processing. These are the factory settings.
The data from the BIOS chip that is copied to RAM consists of a library of instructions to access hardware devices (hard disks, CD/ROM, USB storage, network cards etc.),
and a program using that library to load what is called the bootsector, the first sector on the boot device, into RAM, and transfer control to it (with a jump instruction).
BOOTLOADER
The bootsector data that the BIOS program loaded from the boot device is very small - only 440 bytes - but with the help of the BIOS library, this is enough to be able to load more sectors and execute these. The bootsector and the data it loads is called the bootloader, which is in charge of loading the Operating System.
In effect, the bootloader is a more dynamic version of the BIOS: the BIOS program resides in flash memory, whereas the bootloader resides on hard disks, USB sticks, SSD drives etc., and thus can be larger and more complex.
OPERATING SYSTEM
In it's turn, The operating system (OS) is simply a more advanced version of the bootloader, as it can load and run multiple programs from multiple locations at the same time.
--
The BIOS knows about drives.
The Bootloader knows about drives and partitions.
The OS knows about drives, partitions, and file systems.

CPU,as you've noticed, reads the program from RAM, instruction by instruction. When an instruction is executed, it might refer to data stored in memory, which it either fetches explicitly to the registers (internal storage of the CPU, quite small - on x86_64 that's like several 64-bit registers + other stuff like segment registers, IP, SP etc) with a separate instruction, or the data read from the memory (we are talking about small amount of data). That's all it really does.
Loading a file from a disk would be done by asking the appropriate controller to fetch the data into a specific place in memory. CPU is connected to buses which will carry instructions to appropriate controllers.
As to interrupts these are special things - CPU has several interrupt lines which can be activated by various devices, for example your network card. When it receives such an interrupt, it is usually handled by an interrupt handler, which is just a program located in a well-known place in memory. They can be registered by, for example, operating system. Each interrupt line has its own interrupt handler. When interrupt happens, the CPU saves the current state of the program it happens to be executing, handles interrupt, restores the state and resumes the program.

You seem to be asking about addressing modes. At the risk of gross oversimplification (ignoring caching, segments, and logical memory), memory stored as a sequential array accessed by an integer address.
The CPU has a number of internal storage areas called registers. We will call them R0 to Rn. The processor assigns some registers dedicated purposes. One of those registers is the PC.
One common addressing mode is deferred. I indicate this mode as (Rn). An instruction like this:
MOV (R0), R1
uses the value contained in R0 as a memory address, fetches the value stored that memory location, and stores a copy of that value in R1.
An instruction sequence like this:
MOV (R0), R1
MOV (R2), R3
is stored in memory as data (ignoring protection), code, data, and variables all use the same type of memory. In other words, any memory location can be interpreted as code, data, or variable.
The CPU executes the next instruction located at (PC). After executing the instruction, the CPU automatically increments the PC to point to the next instruction.

can we run kernel as an application, if we can load the kernel program into the address space

It sounds stupid, but is it possible to load any kernel program as an application on already running OS (not as virtual machine).
Like if we load the program into the process address space and run it.

[..] is it possible to load any kernel program as an application on already running OS [..]
No, since a kernel usually contains code to manage system resources, but the host kernel is already managing these. So this either leads to catastrophic failure, or - because we have memory protection and privilege levels and such - to access violations:
As a small example: Probably all kernels need to configure the interrupt service configuration of the underlying hardware (to get a timer tick, for example).
On x86, this is done by creating an interrupt descriptor table and loading the address of that table using the lidt instruction. When issued in an application process (which the host kernel will have running in ring 3, the least privileged level) the processor will refuse to execute that instruction, because it may only be issued in ring 0, and instead generate an general protection fault. The host kernel will be called to handle that situation (because when that kernel started it registered an interrupt descriptor table for exactly that purpose). The only way for the host kernel to react to that situation is to abort the process that caused the access violation, because otherwise it would risk system stability and integrity.
Similar issues arise when dealing with segmentation, paging, accessing memory mapped devices and access to peripherals in general.
That said, it is possible to create a kernel that can be run as a user space process, an example that I personally have worked with is RODOS, which can be run as a Linux process. To make this possible it is necessary to split hardware dependent stuff (which is a large portion) from the independent code (like scheduling, interprocess communication, ...) and provide stubs that reuse functionality of the host operating system to simulate some sort of hardware. (Of course, a such prepared kernel can only run on the host system if it is compiled for that use. You cannot use the same binary as you would use on raw hardware.)

How are the stack pointer and program status word maintained in multiprocessor architecture?

In a multi-processor architecture, how are registers organized?
For example, in a 4 cores processor, a minimum of 4 processes can run at a time.
How are stack pointer, program status registers and program counter organized?
What about other general purpose registers?
My guess is, each core will have a separate set of registers.

Imagine 4 completely separate computers, each with a single-core CPU. A 4-core computer is like that; except:
All CPUs share the same physical address space (and can all use the same RAM, PCI devices, etc)
Interrupt/IRQ controllers may be designed so the OS can tell it which CPU/s should be interrupted by the IRQ
CPUs are typically able to signal each other (e.g. "inter-processor interrupts")
Some CPUs may share some caches
Some CPUs may share some control registers (e.g. for things like power management, cache configuration, etc)
For modern CPUs, some CPUs may share some or all execution units (SMT, hyper-threading, etc)
For modern systems (where memory controller is built into the physical chip) some CPUs may share the same memory controller
Most of this is "invisible" to most software. Unless you're writing part of an OS that controls power management, you don't need to care if power management is shared between CPUs or not; unless you're writing an OSs/kernel's low level IRQ handling you don't need to care how IRQs reach device drivers, etc.
The same applies to how many CPUs actually exist. The OS/kernel normally ensures that applications only need to care about higher level abstractions (e.g. "threads"). How this higher level abstraction works depends on the OS - normally (for most OSs) the OS/kernel attempts to provide the illusion that all threads are running at the same time by switching between them "quickly enough" (where if there's only 4 CPUs a maximum of 4 threads actually do run at the exact same time), but it's usually far more complex than this (involving things like thread priorities, pre-emption rules, etc) and (even though it's relatively rare) it may be very different (e.g. for some systems the same thread may be run on multiple CPUs at the same time for fault tolerance/redundancy purposes; for some systems there might just be a queue of functions and their data, where multiple functions run at the same time; etc).

Multiprocessor means that there are at least two discrete processors on the same platform -- usually on the same motherboard
A subset is distributed multiprocessing, where two PC's for example are programmed to appear as a single system with two processors
Multicore means that the most or all of the CPU is replicated many times on single chip.
- this also means that stack, status, program counter and all generic purpose registers are replicated.
Hyperthreading is a technique, where each stage of the pipeline executes commands from different processes.
Multiprocessing means in OS level that everything a process consists of, is switched every now and then.
Multithreading is a lightweight variant of multiprocessing, where the threads e.g. share the same code segment and same data segment, same file descriptors etc. but have unique stacks (and of course unique status registers and program counters)
Also means multiprocessing in general (hardware architecture)

How exactly does OS protect kernel

my question is how exactly does operating system protect it's kernel part.
From what I've found there are basically 2 modes kernel and user. And there should be some bits in memory segments which tels if a memory segment is kernel or user space segment. But where is the origin of those bits? Is there some "switch" in compiler that marks programs as kernel programs? And for example if driver is in kernel mode how does OS manages its integration to system so there is not malicious software added as a driver?
If someone could enlighten me on this issue, I would be very grateful, thank you

The normal technique is by using a feature of the virtual memmory manager present in most modern cpus.
The way that piece of hardware works is that it keeps a list of fragments of memory in a cache, and a list of the addresses to which they correspond. When a program tries to read some memory that is not present in that cache, the MMU doesn't just go and fetch the memory from main ram, because the addresses in the cacher are only 'logical' addresses. Instead, it invokes another program that will interpret the address and fetch that memory from wherever it should be.
That program, called a pager, is supplied by the kernel, and special flags in the MMU prevent that program from being overridden.
If that program determines that the address corresponds to memory the process should get to use, it supplies the MMU with the physical address in main memory that corresponds to the logical address the user program asked for, the MMU fetches it into its cache, and resumes running the user program.
If that address is a 'special' address, like for a memory mapped file, then the kernel fetches the corresponding part of the file into the cache and lets the program run along with that.
If the address is in the range that belongs to the kernel, or that the program hasn't allocated that address to itself yet, the pager raises a SEGFAULT, killing the program.
Because the addresses are logical addresses, not physical addresses, different user programs may use the same logical addresses to mean different physical addresses, the kernel pager program and the MMU make this all transparent and automatic.
This level of protection is not available on older CPU's (like 80286 cpus) and some very low power devices (like ARM CortexM3 or Attiny CPUs) because there is no MMU, all addresses on these systems are physical addresses, with a 1 to 1 correspondence between ram and address space

The “switch” is actually in the processor itself. Some instructions are only available in kernel mode (a.k.a. ring 0 on i386). Switching from kernel mode to user mode is easy. However, there are not so many ways to switch back to kernel mode. You can either:
send an interrupt to the processor
make a system call.
In either case, the operation has the side effect of transferring the control to some trusted, kernel code.

When a computer boots up, it starts running code from some well known location. That code ultimately ends up loading some OS kernel to memory and passing control to it. The OS kernel then sets up the CPU memory map via some CPU specific method.

And for example if driver is in kernel mode how does OS manages its integration to system so there is not malicious software added as a driver?
It actually depends on the OS architecture. I will give you two examples:
Linux kernel: A driver code can be very powerful. The level of protections are following:
a) A driver is allowed to access limited number of symbols in the kernel, specified using EXPORT_SYMBOL. The exported symbols are generally functions. But nothing prevents a driver from trashing a kernel using wild pointers. And the security using EXPORT_SYMBOL is nominal.
b) A driver can only be loaded by the privileged user who has root permission on the box. So as long as root privileges are not breached system is safe.
Micro kernel like QNX: The operating system exports enough interface to the user so that a driver can be implemented as a user space program. Hence the driver at least cannot easily trash the system.