Virtualization CPU Emulation

I have a question about CPU virtualization in a virtual machine. I am not able to understand the difference between on-the-fly translation to native code and trap-and-emulate.
As far as I understand, in the first case, if I emulate binary code from a different platform, the code is converted to the equivalent x86 instructions if I have an x86 CPU. In the trap-and-emulate method, the virtual machine receives the ISA call from the guest OS and translates it to the equivalent ISA call for the host OS.
Why do we need to translate from ISA to ISA? Suppose I am running an Ubuntu guest on a Windows host. Is an Ubuntu ISA call different from a Windows ISA call? I understand that the guest is not able to access the system ISA on the host; only the monitor can do that. But why is there a need for conversion to the host ISA? Does the ISA also depend on the operating system?

"On-the-fly to native" translation (often called JIT compilation/translation) is used when running code from one ISA on another ISA, such as running M68K code on an x86 CPU.
It's in no way virtualization, but emulation.
Trap-and-emulate is a way to run "privileged" code in an unprivileged environment (example: running a kernel as an application).
The way it works is that you start executing the privileged code, and once it tries to execute a privileged instruction (lidt in x86 for example), the host OS will issue a trap. In the handler for that trap, you could emulate that specific privileged instruction, and then let the guest kernel continue executing.
The advantage of this is that you will reach close to native speeds for CPU emulation.
However, emulating the ISA is only a small part of emulating a complete system. Emulating or virtualizing the MMU is much more complex to get right, and to get running fast.

Related

Is it necessary for the switch from kernel mode to user mode to be a privileged instruction?

When a CPU is executing kernel code, it is in privileged mode. At the moment the CPU begins the switch from kernel mode (privileged mode) to user mode, it is still in kernel mode. This made me feel that it is necessary for the CPU to be in kernel mode during the switch, i.e. that the instruction performing the switch is a privileged instruction.
Usually the CPU has a special mode bit, and a privileged instruction is needed to change this bit during the switch from kernel mode to user mode.
I have seen the xv6 implementation on the x86 architecture, where we have instructions like IRET which is a privileged instruction. But what I have read is specific to x86 architecture.
My question is, is there any example where the mode switching from Kernel mode to user mode is done by an instruction which is not privileged?
Here in this answer, I see it is written that the switching instruction need not necessarily be privileged. But I do not get the intuition behind the same.
Is it the case that the instruction for the switch, though usually executed in kernel mode, is actually an unprivileged instruction? As we know, an unprivileged instruction can run even in kernel mode.
Also, here in this answer, I find that older editions of my textbook said that the switch from kernel to user mode is done by a privileged instruction, but this statement was subsequently changed in newer editions. I do not know whether it was an error that got corrected...

iMX6: MSI-X not working in Linux PCIe device driver

I'm trying to get MSI-X working on an iMX6 (Freescale/NXP) CPU under Linux v4.1 for a PCIe character device driver. Whenever I call pci_enable_msix(), pci_enable_msix_range(), or pci_enable_msix_exact(), I get -EINVAL back. I do have the CONFIG_PCI_MSI option selected in the kernel configuration, and I am able to get a single MSI working with pci_enable_msi(), but I cannot get multiple MSI working either.
I have tested my driver code on an Intel i7 running a v3 kernel with the same PCIe hardware attached, and I was able to get MSI-X working without any problems, so I know my code is correctly written and the hardware is functioning correctly.
When running on the iMX6 I can use lspci -v to see that the hardware has MSI-X capabilities and the number of IRQs it allows. I can even get the same, correct number in my driver when calling pci_msix_vec_count().

Questions

Are there any other kernel configuration flags I need to set?
Is there anything specific to the iMX6 CPU I need to consider?
Does anyone have any experience with the iMX6 and either MSI-X or multiple MSI?

Operating system loader

My question is how the operating system loads a user-space application into RAM. I know how the bootloader works: when we first turn the computer on, the BIOS simply reads the 512-byte boot sector, checks for the 0xAA55 bootloader signature, and loads the bootloader into RAM. Are regular user-space programs handled this way? If so, how? The bootloader is activated by the BIOS, but how is a user-space program handled by the operating system? More specifically, how does execv() load a program into RAM and set the execution start point for user space?
Thanks in advance
User-space programs are not handled like the BIOS boot process; the kernel is involved in running a user-space program.
In general:
When a program is executed from a shell, the shell invokes system calls to create a new task in a new address space, read in the executable binary, and begin executing it.
To understand the details, you need to understand:
The ELF format. There are also other executable formats that can be used in Linux; ELF is just the most common one and a good starting point. Understanding ELF will help you understand precisely how the kernel loads the executable binary into memory.
Linux process management; this will help you understand how a program starts to run.
The related code in the kernel; fs/exec.c will be of great help.
The procedure varies among operating systems. Some systems have a background command interpreter that exists for the life of a process, within the process itself. When a program is run, the command interpreter stays in the background (protected from user-mode access). When the program completes, the command interpreter comes to the foreground and can run another program in the same process.
In the Unix world, the command interpreter is just a user-mode program. Whenever it runs a program, it kicks off another process.
Both of these types of systems use a loader to configure the process address space for running a program. The executable file is a set of instructions that define how to lay out the address space.
This is significantly different from a bootloader. A bootloader blindly loads a block of stored data into memory. A program loader follows complex instructions for laying out a process address space, including handling shared libraries and doing address fixups.

Type-1 VMM and Ring 1

Recently, I have been doing homework on virtualization. My question is: how does the VMM transfer control to the guest kernel and run its code in Ring 1?
Type-1 VMM: this is the classical trap-and-emulate VMM. The VMM runs directly on hardware and acts as a "host operating system" in Ring 0. The guest kernel and guest applications run on top of the VMM, in Ring 1 and Ring 3 respectively.
1. When a guest application makes a syscall, it traps to the Ring 0 VMM (the CPU is designed to do this).
2. The VMM detects that this is a syscall and transfers control to the guest kernel's syscall handler, executing it in Ring 1.
3. When it is done, the guest kernel performs a syscall return; this is a privileged operation, which traps again into the VMM.
4. The VMM then performs a real return to guest user space in Ring 3 (the CPU is also designed to do this).
My question is about step 2. How does the VMM transfer control to the guest kernel and force the CPU into Ring 1? It couldn't be a simple call, since then the guest kernel code would run in Ring 0. It must be some kind of syscall return or a special context-switch instruction.
Do you have any ideas? Thank you!
Simply by running the guest OS with a CS selector whose RPL=1 (on x86, at least). Returning from a more privileged ring to a less privileged one is generally done using iret.
Xen is one of the VMMs that run guest OSes in Ring 1. In Xen, privileged instructions such as HLT (which cannot be executed in Ring 1, where the guest OS runs) are replaced by hypercalls. In this case, instead of executing the HLT instruction, as is eventually done in the Linux kernel, the xen_idle() method is called. It performs a hypercall instead, namely HYPERVISOR_sched_op(SCHEDOP_block, 0), which manages the privilege-ring switching. For more info see:
http://www.linuxjournal.com/article/8909

Start service in kernel mode (Vista)

I'd like to start a service before user mode is loaded, i.e. in kernel mode.
The reason is that I want to run several system-level operations (asm code that writes data to the BIOS) that are not allowed in user mode (a privileges problem).
That's why I had an idea:
1. Write a Windows service.
2. Start and run it in kernel mode.
Is this possible?
Are there any other ways to solve the problem?
I don't usually use Vista (I use Linux instead), which is why I'm asking.
Windows services are user-mode applications. To run in kernel mode you should write a driver. (A so-called "legacy" driver will be enough; see Driver Development Part 1: Introduction to Drivers.)