How does hypervisor work in virtualization? - operating-system

In virtualization, does the hypervisor not need to use system calls?
Is it always running in kernel mode, or can it also run in user mode?

On x86, a hypervisor runs in VMX root mode. Within this mode, transitions between kernel and user mode work just as they do in VMX non-root mode (i.e., in a guest) or when VMX is off.
However, in my experience, most hypervisors always run in kernel mode.
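As a side note, VMX root mode only exists on CPUs with hardware virtualization support. On Linux, the CPU flags in /proc/cpuinfo include vmx for Intel VT-x and svm for AMD-V. The following is a minimal sketch, assuming a Linux host; the function name virtualization_flags and the sample text are made up for illustration.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in /proc/cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # vmx = Intel VT-x, svm = AMD-V
            found |= {"vmx", "svm"} & set(value.split())
    return found


# Hypothetical sample in the /proc/cpuinfo format (not taken from a real machine).
sample = "processor\t: 0\nflags\t\t: fpu vme de pse msr vmx sse2\n"
sample_flags = virtualization_flags(sample)
print(sample_flags)
```

On a real Linux machine you would pass the contents of /proc/cpuinfo instead of the sample string.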

Related

What is the benefit of the "IA-32e mode guest" control on VM entry?

Regarding Intel VMX, I found that there is an "IA-32e mode guest" bit in the VM-entry controls.
According to the Intel manual, when this bit is set, the logical processor is in IA-32e mode after VM entry.
What is the benefit of starting a VCPU in IA-32e mode?
Is it just a VM optimization to fast-forward some of the VCPU's initial boot steps?
Also, if this bit is not set, what is the default mode of the VCPU when it first enters the VM?

Is the OS (operating system) running when the hardware is running?

I learned two things in class.
When a user mode process is running, the OS is not being executed. So when a user process invokes a system call or causes a page fault, I guess the system switches to privileged mode and the OS runs, with the user process no longer running?
Another thing is that when the hardware interrupts, the mode is switched to privileged mode and the OS takes over. Does this also mean that when the hardware is running, the OS is not being executed?
Correct me if I am wrong, and thanks in advance for your help.
Here is your mistake:
When a user mode process is running, the OS is not being executed.
There is no such thing as a USER MODE PROCESS. Processes change modes all the time. It is impossible for a process to run only in user mode.
The operating system kernel gets invoked either through exceptions triggered by the running process or by interrupts triggered by external events. Otherwise, the operating system kernel is not executing.
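One way to see that a single process runs in both modes is to compare the CPU time it accumulates in user mode and in kernel mode. This is a rough sketch using Python's os.times(); the loop counts are arbitrary and the numbers you get depend on your machine.

```python
import os

before = os.times()

# Pure computation: this part runs in user mode.
total = sum(i * i for i in range(200_000))

# Each os.stat() call is a system call, so the kernel runs on behalf of this process.
for _ in range(20_000):
    os.stat(".")

after = os.times()
user_seconds = after.user - before.user
system_seconds = after.system - before.system
print(f"user mode:   {user_seconds:.3f} s")
print(f"kernel mode: {system_seconds:.3f} s")
```

Both figures belong to the same process, which is why it is misleading to call it a "user mode process".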

How does VirtualBox-like virtualization work? (some technical details required)

First consider the situation when there is only one operating system installed. Now I run some executable. The processor reads instructions from the executable file and performs these instructions. Even though I can put whatever instructions I want into the file, my program can't read arbitrary areas of the HDD (or do many other potentially "bad" things).
It looks like magic, but I understand how this magic works. The operating system starts my program and puts the processor into some "unprivileged" state. "Unsafe" processor instructions are not allowed in this state, and the only way to put the processor back into the "privileged" state is to give control back to the kernel. Kernel code can use all of the processor's instructions, so it can do the potentially unsafe things my program "asked" for if it decides they are allowed.
Now suppose we have VMware or VirtualBox on a Windows host. The guest operating system is Linux. I run a program in the guest, and it transfers control to the guest Linux kernel. The guest Linux kernel's code is supposed to run in the processor's "privileged" mode (it must contain the "unsafe" processor instructions!). But I strongly doubt that it has unlimited access to all the computer's resources.
I do not need too many technical details; I only want to understand how this part of the magic works.
This is a great question and it really hits on some cool details regarding security and virtualization. I'll give a high-level overview of how things work on an Intel processor.
How are normal processes managed by the operating system?
An Intel processor has 4 different "protection rings" that it can be in at any time. The ring that code is currently running in determines the specific assembly instructions that may run. Ring 0 can run all of the privileged instructions whereas ring 3 cannot run any privileged instructions.
The operating system kernel always runs in ring 0. This ring allows the kernel to execute the privileged instructions it needs in order to control memory, start programs, write to the HDD, etc.
User applications run in ring 3. This ring does not permit privileged instructions (e.g. those for writing to the HDD) to run. If an application attempts to run a privileged instruction, the processor will take control from the process and raise an exception that the kernel will handle in ring 0; the kernel will likely just terminate the process.
Rings 1 and 2 are rarely used by mainstream operating systems, although the architecture supports them.
How does virtualization work?
Before there was hardware support for virtualization, a virtual machine monitor (VMM) such as VMware had to use a technique called binary translation (see this paper). At a high level, the VMM inspects the binary code of the guest operating system and emulates the behavior of its privileged instructions in a safe manner.
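To give a feel for the idea, here is a toy sketch of binary translation. It is only an illustration: real binary translation operates on machine code, and the textual instruction format and the vmm_emulate_ names are hypothetical.

```python
# Privileged x86 instructions used for this illustration.
PRIVILEGED = {"cli", "sti", "hlt", "out"}


def translate(guest_instructions):
    """Replace privileged instructions with calls into a (hypothetical) VMM routine."""
    translated = []
    for instr in guest_instructions:
        opcode = instr.split()[0]
        if opcode in PRIVILEGED:
            # The VMM emulates the effect instead of letting the guest run it.
            translated.append(f"call vmm_emulate_{opcode}")
        else:
            # Harmless instructions are passed through unchanged.
            translated.append(instr)
    return translated


guest_code = ["mov eax, 1", "cli", "add eax, 2", "hlt"]
translated_code = translate(guest_code)
for line in translated_code:
    print(line)
```

The guest's ordinary instructions run as they are, while the privileged ones never reach the real processor.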
Now there is hardware support for virtualization in Intel processors (look up Intel VT-x). In addition to the four rings mentioned above, the processor has two states, each of which contains four rings: VMX root mode and VMX non-root mode.
The host operating system and its applications, along with the VMM (such as VMware), run in VMX root mode. The guest operating system and its applications run in VMX non-root mode. Each of these modes has its own four rings, so the host OS runs in ring 0 of root mode, the host OS applications run in ring 3 of root mode, the guest OS runs in ring 0 of non-root mode, and the guest OS applications run in ring 3 of non-root mode.
When code that is running in ring 0 of non-root mode attempts to execute a privileged instruction, the processor will hand control back to the host operating system running in root mode so that the host OS can emulate the effects and prevent the guest from having direct access to privileged resources (or in some cases, the processor hardware can just emulate the effect itself without getting the host involved). Thus, the guest OS can "execute" privileged instructions without having unsafe access to hardware resources - the instructions are just intercepted and emulated. The guest cannot just do whatever it wants - only what the host and the hardware allow.
Just to clarify, code running in ring 3 of non-root mode will cause an exception to be sent to the guest OS if it attempts to execute a privileged instruction, just as an exception will be sent to the host OS if code running in ring 3 of root mode attempts to execute a privileged instruction.
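The four cases described above can be summarized in a small toy model. This is a sketch of the reasoning only, not of how the hardware is implemented; the function and enum names are invented for the example, and rings 1 and 2 are left out.

```python
from enum import Enum


class Mode(Enum):
    ROOT = "VMX root"          # host OS, host applications, VMM
    NON_ROOT = "VMX non-root"  # guest OS, guest applications


def run_privileged_instruction(mode, ring):
    """Toy model of what happens when code attempts a privileged instruction."""
    if ring == 3:
        # Unprivileged code faults to the kernel of its own world.
        owner = "host" if mode is Mode.ROOT else "guest"
        return f"exception delivered to {owner} kernel"
    if ring == 0 and mode is Mode.ROOT:
        return "executed directly on the hardware"
    if ring == 0 and mode is Mode.NON_ROOT:
        # In some cases the hardware handles this itself without involving the host.
        return "intercepted: control passes to root mode for emulation"
    raise ValueError("rings 1 and 2 are not modeled in this sketch")


outcomes = {
    (mode.value, ring): run_privileged_instruction(mode, ring)
    for mode in Mode
    for ring in (0, 3)
}
for (mode_name, ring), outcome in outcomes.items():
    print(f"{mode_name}, ring {ring}: {outcome}")
```

The interesting case is ring 0 of non-root mode: the guest kernel believes it is privileged, but its privileged instructions are still intercepted.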

How are operating system containers different from virtual machines?

Everywhere I look, I find explanations of how Docker differs from a virtual machine, but nowhere is there an answer on how basic OS containers differ from a virtual machine.
If we consider the basics, both look the same, i.e. an operating system running within an operating system.
Would anybody explain the underlying difference?
Virtual machines
Virtual machines use hardware virtualization. There is an additional layer between the real hardware and the virtual hardware, which the virtual machine treats as real.
This model doesn't reuse anything from the host's OS. This way, you can run a Windows VM on a Linux host and vice versa.
System containers
System containers use operating-system-level virtualization. They reuse the kernel of the host OS and divide the real hardware resources directly among the containers. There isn't an additional layer to access the real hardware and, for this reason, the overhead (or loss of performance) is practically zero.
On the other hand, you can't run a Windows container inside a Linux host OS, since the kernel isn't the same.
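A simple way to see the shared kernel is to print the kernel release from inside each environment. The sketch below uses Python's standard platform module; what it prints depends entirely on where you run it.

```python
import platform

# In a Linux system container this reports the host's kernel, because the
# container has no kernel of its own. Inside a VM it reports the guest's kernel.
kernel_name = platform.system()      # e.g. "Linux"
kernel_release = platform.release()  # kernel version string
print(f"{kernel_name} kernel {kernel_release}")
```

Run on the host and in a system container, the output should match; run in a virtual machine, it reflects whatever kernel the guest OS booted.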

Regarding hardware-assisted virtualization

I am trying to understand hardware-assisted virtualization for a project with an ARM Cortex-A8 that uses the ARM TrustZone feature. I am new to this topic, so I started with Wikipedia entries to understand more.
Wikipedia explains hardware-assisted virtualization and adds this line to the definition:
Full virtualization is used to simulate a complete hardware
environment, or virtual machine, in which an unmodified guest
operating system (using the same instruction set as the host machine)
executes in complete isolation.
The part in parentheses is a bit confusing. How is the same instruction set of the processor used to provide two isolated environments? Can someone explain it? The ARM TrustZone manual also talks of a "virtual processor core" to provide security. Please shed some light.
thanks
The phrase "using the same instruction set as the host machine" means that the guest OS is not aware of the virtualization layer and behaves as if it is executed on a real machine (with the same instruction set). This is in contrast to the para-virtualization paradigm in which the guest OS is aware of virtualization and calls some specific VMM functions, i.e. hypercalls.
No, the guest does not get an additional instruction set. The virtual machine's instructions are handled by a hypervisor component called the VMM (virtual machine monitor) so that they can be executed on the physical CPU.
A physical CPU with hardware-assisted virtualization only introduces a new execution mode (called VMX on Intel) that allows the virtual machine to execute some instructions in ring 0.