I am trying to implement interrupts in an x86 operating system project. However, after loading the interrupt descriptor table with lidt, I execute the sti instruction, and this "sti" reboots the computer. I am in protected mode. Any idea what might be happening?
Some things cause exceptions. When the CPU can't start the corresponding exception handler it falls back to a generic "double fault" exception, and when the CPU can't start that exception handler the CPU falls back to a "triple fault" condition which mostly means that the computer is reset.
It's likely that there are pending IRQs (that occurred while interrupts were disabled with "cli" and have been waiting for the CPU to be ready to receive them); so when you do "sti" the interrupt controller sees that the CPU is now ready to receive an IRQ and immediately sends one; and it's likely that the interrupt handler for whichever IRQ the CPU receives is causing an exception (that leads to a double fault, that leads to a triple fault/reset).
The easiest way to figure out what is happening is to run it under an emulator that tells you what happened in its logs. The alternative is to write usable exception handler/s for any exceptions that are involved (most likely, a general protection fault exception handler); so that the exception handler can give you information about what went wrong (e.g. the "error code" provided by the CPU to the general protection fault handler may indicate which IDT entry the CPU tried to use for the IRQ).
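As a rough sketch of what such a general protection fault handler could do with the error code: for selector-related errors the code is laid out as bit 0 = external event, bit 1 = "index refers to the IDT", bit 2 = GDT/LDT selector, and bits 3 to 15 = index. The struct and function names below are made up for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Fields of a selector-style error code (hypothetical struct name). */
struct selector_error {
    bool     external; /* bit 0: caused by an event external to the program, e.g. an IRQ */
    bool     in_idt;   /* bit 1: index refers to a gate in the IDT */
    bool     in_ldt;   /* bit 2: only meaningful when in_idt is false; 1 = LDT, 0 = GDT */
    uint16_t index;    /* bits 3..15: gate or descriptor index */
};

/* Split the raw error code pushed by the CPU into its fields. */
struct selector_error decode_selector_error(uint32_t code)
{
    struct selector_error e;
    e.external = (code & 0x1u) != 0;
    e.in_idt   = (code & 0x2u) != 0;
    e.in_ldt   = (code & 0x4u) != 0;
    e.index    = (uint16_t)((code >> 3) & 0x1FFFu);
    return e;
}
```

If a handler printed these fields and showed in_idt set with index 0x20, that would suggest the CPU failed while trying to use IDT entry 0x20, which is where many kernels remap IRQ 0.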
Note that during boot the best sequence is to mask all IRQs in the interrupt controller/s, then let firmware handle any pending IRQs (e.g. with interrupts enabled, do some "NOP" instructions). That way there can't be any pending IRQs when you "sti" later (and you can unmask individual IRQ sources when you actually want them unmasked - e.g. when you install a device driver that uses a specific IRQ). Sadly most people (tutorials, GRUB, etc) do everything wrong and just "cli" without masking IRQs in the interrupt controller/s (and then do things like remap the PIC chips, etc; which makes things even more confusing), and then end up having to cope with the consequences of doing everything wrong. ;-)
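A minimal sketch of the "mask everything first, unmask per driver later" idea, assuming the legacy 8259 PIC pair with its mask registers at the usual ports 0x21 and 0xA1. The fake_ports array and port_write8() are hypothetical stand-ins for a real "out" instruction wrapper, so that the logic can run outside a kernel.

```c
#include <stdint.h>

#define PIC_MASTER_DATA 0x21 /* master 8259 interrupt mask register */
#define PIC_SLAVE_DATA  0xA1 /* slave 8259 interrupt mask register */

/* Hypothetical stand-in for port I/O: remembers the last byte written to each port. */
uint8_t fake_ports[0x10000];

void port_write8(uint16_t port, uint8_t value)
{
    fake_ports[port] = value;
}

/* Mask every IRQ line on both PICs; a set bit means "masked". */
void pic_mask_all(void)
{
    port_write8(PIC_MASTER_DATA, 0xFF);
    port_write8(PIC_SLAVE_DATA, 0xFF);
}

/* Unmask a single IRQ (0..15), e.g. when a driver for that device is installed.
   A real kernel would read the current mask with an "in" instruction instead of
   fake_ports, and IRQs 8..15 also need IRQ 2 (the cascade) unmasked on the master. */
void pic_unmask(uint8_t irq)
{
    uint16_t port = (irq < 8) ? PIC_MASTER_DATA : PIC_SLAVE_DATA;
    uint8_t bit = (uint8_t)(1u << (irq & 7));
    port_write8(port, (uint8_t)(fake_ports[port] & ~bit));
}
```

The intent is that pic_mask_all() runs early during boot, and pic_unmask() is only called once a handler for that IRQ actually exists in the IDT.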
I'm learning computer organization and structure (I'm using Linux on the x86-64 architecture). We've studied that when an interrupt occurs in user mode, the OS is notified and switches from the user stack to the kernel stack by loading the kernel's rsp from the TSS; afterwards it saves the necessary registers (such as rip) and, in case of a software interrupt, it also saves the error code. In the end, just before jumping to the appropriate handler routine, it zeroes the TF, and in case of a hardware interrupt it also zeroes the IF. I wanted to ask about a few things:
The error code is saved in the rip, so why load both?
If I consider a case where a few interrupts happen together, which causes the IF and TF to turn on, and I zero the TF and IF but treat only one interrupt at a time, am I not leaving all the other interrupts untreated? In general, how does the OS treat a few interrupts that occur at the same time when using an IDT with a specific vector for each interrupt?
Does this happen because each program has its own virtual memory and thus the interrupt handling of all the programs is unrelated? Where can I read more about it?
How does an operating system keep other necessary processes running while handling the interrupt?
thank you very much for your time and attention!
The error code is saved in the rip, so why load both?
You're misunderstanding some things about the error code. Specifically:
it's not generated by software interrupts (e.g. instructions like int 0x80)
it is generated by some exceptions (page fault, general protection fault, double fault, etc).
the error code (if used) is not saved in the RIP, it's pushed on the stack so that the exception handler can use it to get more information about the cause of the exception
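To make the "pushed on the stack" part concrete, here is a sketch of what a 64-bit exception handler finds at its stack pointer on entry when the exception has an error code. The struct name is made up; the page fault bits shown are the three lowest ones of the page fault error code.

```c
#include <stdint.h>
#include <stddef.h>

/* Stack contents on handler entry, lowest address first (hypothetical struct name). */
struct exception_frame {
    uint64_t error_code; /* only pushed for some exceptions (#PF, #GP, #DF, ...) */
    uint64_t rip;        /* where to resume; separate from the error code */
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;        /* stack pointer of the interrupted code */
    uint64_t ss;
};

/* Low bits of the page fault error code. */
#define PF_PRESENT 0x1u /* 0 = page not present, 1 = protection violation */
#define PF_WRITE   0x2u /* 1 = the access was a write */
#define PF_USER    0x4u /* 1 = the access came from user mode */

/* Example use of the error code: was this a write from user mode? */
int fault_was_user_write(const struct exception_frame *f)
{
    return (f->error_code & (PF_WRITE | PF_USER)) == (PF_WRITE | PF_USER);
}
```

For interrupts and exceptions without an error code the frame simply starts at rip, which is why handlers for those vectors often push a dummy value to keep one common layout.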
2a. If I consider a case where a few interrupts happen together, which causes the IF and TF to turn on, and I zero the TF and IF but treat only one interrupt at a time, am I not leaving all the other interrupts untreated?
When the IF flag is clear, maskable IRQs (which don't include other types of interrupts, i.e. software interrupts and exceptions) are postponed (not disabled) until the IF flag is set again. They're "temporarily untreated" until they're treated later.
The TF flag only matters for debugging (e.g. single-step debugging, where you want the CPU to generate a trap after every instruction executed). It's only cleared in case the process (in user-space) was being debugged, so that you don't accidentally continue debugging the kernel itself; but most processes aren't being debugged like this so most of the time the TF flag is already clear (and clearing it when it's already clear doesn't really do anything).
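A small model of what happens to the two flags when the CPU enters a handler. On x86 whether IF is cleared depends on the type of gate in the IDT entry (interrupt gate vs trap gate) rather than on the source of the interrupt; TF is cleared for both. The function name is made up, and only IF and TF are modelled (the CPU also touches a few other flags).

```c
#include <stdint.h>

#define FLAG_TF (1ull << 8) /* trap flag: single-step debugging */
#define FLAG_IF (1ull << 9) /* interrupt flag: maskable IRQs allowed */

enum gate_type { INTERRUPT_GATE, TRAP_GATE };

/* Given the flags that were saved on the stack, return the flags the handler
   starts running with. The saved copy is untouched, so iret restores it. */
uint64_t flags_on_handler_entry(uint64_t saved_flags, enum gate_type gate)
{
    uint64_t flags = saved_flags & ~FLAG_TF; /* TF is always cleared */
    if (gate == INTERRUPT_GATE)
        flags &= ~FLAG_IF;                   /* IRQs postponed while in the handler */
    return flags;
}
```

Because the original flags are on the stack, returning from the handler puts IF (and TF, if the process was being single-stepped) back the way they were.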
2b. In general, how does the OS treat a few interrupts that occur at the same time when using an IDT with a specific vector for each interrupt? Does this happen because each program has its own virtual memory and thus the interrupt handling of all the programs is unrelated? Where can I read more about it?
There are complex rules that determine when an interrupt can interrupt (including when it can interrupt another interrupt). These rules mostly only apply to IRQs (not software interrupts that the kernel won't ever use itself, and not exceptions which are taken as soon as they occur). Understanding the rules means understanding the IF flag and the interrupt controller (e.g. how interrupt vectors and the "task priority register" in the local APIC influence the "processor priority register" in the local APIC, which determines which groups of IRQs will be postponed when the IF flag is set). Information about this can be obtained from Intel's manuals, but how Linux uses it can only be obtained from Linux source code and/or Linux specific documentation.
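As a sketch of the local APIC part of those rules, as I understand them from Intel's manual: the upper 4 bits of a vector are its priority class, the processor priority is derived from the task priority register (TPR) and the highest vector currently in service (ISRV), and a pending IRQ is only delivered if its class is higher than the processor priority class. The function names are made up, and the case where both classes are equal is model specific for the low 4 bits (here the TPR value is used).

```c
#include <stdint.h>
#include <stdbool.h>

/* Derive the processor priority register from TPR and the highest in-service vector. */
uint8_t apic_ppr(uint8_t tpr, uint8_t isrv)
{
    uint8_t tpr_class = (uint8_t)(tpr >> 4);
    uint8_t isr_class = (uint8_t)(isrv >> 4);
    if (tpr_class >= isr_class)
        return tpr;                   /* TPR dominates (equal case is an assumption) */
    return (uint8_t)(isr_class << 4); /* in-service class dominates, low bits zero */
}

/* Would a pending IRQ with this vector be delivered to the CPU right now? */
bool apic_can_deliver(uint8_t vector, uint8_t tpr, uint8_t isrv, bool if_flag)
{
    return if_flag && (vector >> 4) > (apic_ppr(tpr, isrv) >> 4);
}
```

So with a handler for vector 0x31 in service, an IRQ on vector 0x35 waits even if IF is set, while one on vector 0x41 can interrupt the handler.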
On top of that there's "whatever mechanisms and practices the OS felt like adding on top" (e.g. deferred procedure calls, tasklets, softIRQs, additional stack management) that add more complications (which can also only be obtained from Linux source code and/or Linux specific documentation).
Note: I'm not a Linux kernel developer so can't/won't provide links to places to look for Linux specific documentation.
How does an operating system keep other necessary processes running while handling the interrupt?
A single CPU can't run 2 different pieces of code (e.g. an interrupt handler and user-space code) at the same time. Instead it runs them one at a time (e.g. runs user-space code, then switches to an IRQ handler for very short amount of time, then returns to the user-space code). Because the IRQ handler only runs for a very short amount of time it creates the illusion that everything is happening at the same time (even though it's not).
Of course when you have multiple CPUs, different CPUs can/do run different pieces of code at the same time.
I'm not an expert, but just a hobbyist. I was playing with 68000 architecture in the past and I've been always thinking of its TRAP instruction. This instruction is always described as a "bridge" to an OS (in some systems however it's not used in this regard, but that's a different story). How this is achieved? TRAP itself is a privileged instruction, so how this OS invoking mechanism works in user mode? My guess is that the privilege violation exception is triggered and the exception handler checks what particular instruction has caused the exception. If it's a TRAP instruction then the instruction is simply executed (maybe TRAP's operand i.e. TRAP vector number is checked as well), of course now in the supervisor mode. Am I right?
The TRAP instruction is not privileged, you can call it from either user mode or supervisor mode.
It's the TRAP instruction itself that forces the CPU into supervisor mode; then, depending on the #xx number you used, the CPU jumps through one of the 16 possible vectors stored in the memory area $80 to $BC.
TRAP also pushes the PC and SR values onto the stack, so when the handler returns (with RTE) the CPU goes back to whatever mode was in effect before you executed TRAP.
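The $80 to $BC range follows from simple arithmetic: TRAP #n uses exception vector 32 + n, and each vector is a 4 byte address. A tiny sketch (function name made up, assuming the vector table sits at address 0 as on the original 68000):

```c
#include <stdint.h>

#define M68K_TRAP_BASE_VECTOR 32u /* TRAP #0 uses exception vector 32 */

/* Address of the vector table entry used by TRAP #n, or 0 if n is not 0..15. */
uint32_t m68k_trap_vector_address(unsigned n)
{
    if (n > 15u)
        return 0;
    return (M68K_TRAP_BASE_VECTOR + n) * 4u;
}
```

TRAP #0 reads its handler address from $80 and TRAP #15 from $BC; the OS fills in those entries with the addresses of its own routines.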
I am trying to understand how a virtual machine monitor (VMM) virtualizes the CPU.
My understanding right now is that the CPU issues a protection fault interrupt when a privileged instruction is about to be executed while the CPU is in user mode. In high level languages like C, privileged instructions are wrapped inside system calls. For example, when an application needs the current date and time (instructions that interact with I/O devices are privileged), it calls a certain library function. The assembled version of this library function contains an instruction called 'int' that causes a trap in the CPU. The CPU switches from user mode to privileged mode and jumps to the trap handler the OS has provided. Each system call has its own trap handler. In this example, the trap handler reads the date and time from the hardware clock and returns, then the CPU switches itself from privileged to user mode. (source: http://elvis.rowan.edu/~hartley/Courses/OperatingSystems/Handouts/030Syscalls.html)
However, I am not quite sure this understanding is correct. This article mentions the notion that the (privileged) x86 popf instruction does not cause a trap, and thus complicates things for the VMM: http://www.csd.uwo.ca/courses/CS843a/papers/intro-vm.pdf. In my understanding the popf instruction should not cause a trap but a protection fault interrupt, when explicitly called by a user program and not through a system call.
So my two concrete questions are:
What happens when a user program executes a privileged instruction while the CPU is in user mode?
What happens when a user program performs a system call?
In no particular order:
Your confusion is mainly caused by the fact that the operating systems community does not have standardized vocabulary. Here are some terms that get slung around that sometimes mean the same thing, sometimes not: exception, fault, interrupt, system call, and trap. Any individual author will generally use the terms consistently, but different authors define them differently.
There are 3 different kinds of events that cause entry into privileged mode.
An asynchronous interrupt (caused, for example, by an i/o device needing service.)
A system call instruction (int on the x86). (More generally in the x86 manuals these are called traps and include a couple of other instructions (for debuggers mostly.))
An instruction that does something exceptional (illegal instruction, protection fault, divide-by-0, page fault, ...). (Different authors call these exceptions, faults or traps. x86 manuals call these faults.)
Each interrupt, trap or fault has a different number associated with it.
In all cases:
The processor enters privileged mode.
The user-mode registers are saved somewhere.
The processor finds the base address of the interrupt vector table, and uses the interrupt/trap/fault number as an offset into the table. This gives a pointer to the service routine for that interrupt/trap/fault.
The processor jumps to the service routine. Now we are in privileged mode, the user level state is all saved somewhere we can get at it, and we're in the correct code inside the operating system.
When the service routine is finished it executes an interrupt-return instruction (iret on x86.) (This is the subtle distinction between a fault and a trap on x86: faults return to the instruction that caused the fault, traps return to the instruction after the trap.)
Note the confusing name "interrupt vector table." Even though it is called an interrupt table, it is used for faults and traps as well. (Which leads some authors to call everything an interrupt.)
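In software terms the table lookup amounts to something like the following sketch, where one array indexed by number serves interrupts, traps and faults alike. All names here are hypothetical; on real hardware the CPU does the lookup itself and the table holds descriptors rather than plain function pointers.

```c
#include <stddef.h>

#define VECTOR_COUNT 256

typedef void (*handler_fn)(unsigned vector);

/* One table for interrupts, traps and faults alike. */
handler_fn vector_table[VECTOR_COUNT];

/* Records which vector was handled last, only so the sketch has something to observe. */
unsigned last_handled = 0xFFFFFFFFu;

void record_vector(unsigned vector)
{
    last_handled = vector;
}

/* Use the number as an offset into the table and jump to the service routine.
   Returns 0 on success, -1 if there is no routine for that number. */
int dispatch(unsigned vector)
{
    if (vector >= VECTOR_COUNT || vector_table[vector] == NULL)
        return -1;
    vector_table[vector](vector);
    return 0;
}
```

Installing a routine is just `vector_table[14] = record_vector;`, after which `dispatch(14)` runs it, whatever kind of event number 14 happens to be.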
The popf issue is rather subtle. This is essentially a bug in the x86 architecture. When popf executes from user mode and tries to change the interrupt flag, it does not cause a trap or fault (or exception or interrupt or whatever you want to call it.) It simply leaves the interrupt flag unchanged, silently ignoring that part of the operation.
Does this matter? Well, for a normal OS it doesn't really matter. If, on the other hand, you are implementing a virtual machine monitor (like VMWare or Xen or Hyper-V), the VMM is running in privileged mode, and you'd like to run the guest operating systems in user mode and efficiently emulate any privileged mode code. When the guest operating system uses a popf instruction you want it to generate a general protection fault, but it doesn't. (The cli and sti instructions do generate a general protection fault if called from user mode, which is what you want.)
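A simplified model of the popf behaviour in protected mode may make the problem clearer. It only tracks IF and the IOPL field and ignores reserved bits and the other special cases; the function name is made up.

```c
#include <stdint.h>

#define FLAG_IF   (1u << 9)  /* interrupt flag */
#define FLAG_IOPL (3u << 12) /* I/O privilege level field, bits 12..13 */

/* Flags after popf, given the old flags, the value popped from the stack,
   and the current privilege level (0 = kernel, 3 = user). No fault is ever
   raised here: protected bits just silently keep their old value. */
uint32_t popf_result(uint32_t old_flags, uint32_t popped, unsigned cpl)
{
    unsigned iopl = (old_flags & FLAG_IOPL) >> 12;
    uint32_t keep = 0;     /* bits that popf is not allowed to change */
    if (cpl > 0)
        keep |= FLAG_IOPL; /* only ring 0 may change IOPL */
    if (cpl > iopl)
        keep |= FLAG_IF;   /* IF only changes when CPL <= IOPL */
    return (popped & ~keep) | (old_flags & keep);
}
```

A guest kernel running deprivileged at CPL 3 that tries to disable interrupts with popf ends up with IF unchanged, and the VMM never gets a fault that would let it notice and emulate the change.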
I'm not an expert on computer architecture. But I have several opinions for your consideration:
The CPU has two kinds of instructions
normal instructions, e.g., add, sub, etc.
privileged instructions, e.g., initiate I/O, load/store from protected memory etc.
The machine (CPU) has two kinds of modes (set by status bit in a protected register):
user mode: processor executes normal instructions in the user’s program
kernel mode: processor executes both normal and privileged instructions (OS == kernel)
Operating systems hide privileged instructions behind system calls. If a user program calls one, it causes an exception (throws a software interrupt), which vectors to a kernel handler, traps to kernel mode and switches contexts.
Upon encountering a privileged instruction in user mode, the processor traps to kernel mode. Depending on what happened it would be one of several traps, such as a memory access violation, an illegal instruction violation, or a register access violation. The trap switches the processor's execution to kernel mode and switches control to the operating system, which then decides on a course of action. The address is defined by the trap vector, which is set up when the operating system starts up.
I was reading up on interrupts. It is possible to suspend non-critical interrupts via a special interrupt mask. This is called interrupt masking. What I don't know is when/why you might want to or need to temporarily suspend interrupts. Possibly semaphores, or programming in a multi-processor environment?
The OS does that when it prepares to run its own "let's orchestrate the world" code.
For example, at some point the OS thread scheduler has control. It prepares the processor registers and everything else that needs to be done before it lets a thread run so that the environment for that process and thread is set up. Then, before letting that thread run, it sets a timer interrupt to be raised after the time it intends to let the thread have on the CPU elapses.
After that time period (quantum) has elapsed, the interrupt is raised and the OS scheduler takes control again. It has to figure out what needs to be done next. To do that, it needs to save the state of the CPU registers so that it knows how to undo the side effects of the code it executes. If another interrupt is raised for any reason (e.g. some async I/O completes) while state is being saved, this would leave the OS in a situation where its world is not in a valid state (in effect, saving the state needs to be an atomic operation).
To avoid being caught in that situation, the OS kernel therefore disables interrupts while any such operations that need to be atomic are performed. After it has done whatever needs doing and the system is in a known state again, it reenables interrupts.
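The usual shape of this in kernel code is "save the current interrupt state, disable, do the work, restore". Here is a sketch with a plain variable standing in for the CPU's interrupt flag (on x86 the real thing would use pushf/cli and popf or similar); all names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's interrupt enable flag. */
bool interrupts_enabled = true;

/* Disable interrupts and report whether they were enabled before. */
bool irq_save(void)
{
    bool was_enabled = interrupts_enabled;
    interrupts_enabled = false;
    return was_enabled;
}

/* Put the interrupt flag back to what irq_save() reported. */
void irq_restore(bool was_enabled)
{
    interrupts_enabled = was_enabled;
}

int shared_counter;

/* An operation that must not be interrupted half way through. */
void increment_atomically(void)
{
    bool was_enabled = irq_save();
    shared_counter++;         /* the "world is not in a valid state" window */
    irq_restore(was_enabled); /* restore, rather than blindly enable */
}
```

Restoring the saved state instead of unconditionally re-enabling matters when the function is called from code that already had interrupts disabled.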
I used to program on an ARM board that had about 10 interrupts that could occur. Each particular program that I wrote was never interested in more than 4 of them. For instance there were 2 timers on the board, but my programs only used 1. I would mask the 2nd timer's interrupt. If I didn't mask that timer, it might have been enabled and continued making interrupts which would slow down my code.
Another example was that I would use the UART receive REGISTER full interrupt and so would never need the UART receive BUFFER full interrupt to occur.
I hope this gives you some insight as to why you might want to disable interrupts.
In addition to answers already given, there's an element of priority to it. There are some interrupts you need or want to be able to respond to as quickly as possible and others you'd like to know about but only when you're not so busy. The most obvious example might be refilling the write buffer on a DVD writer (where, if you don't do so in time, some hardware will simply write the DVD incorrectly) versus processing a new packet from the network. You'd disable the interrupt for the latter upon receiving the interrupt for the former, and keep it disabled for the duration of filling the buffer.
In practise, quite a lot of CPUs have interrupt priority built directly into the hardware. When an interrupt occurs, the disabled flags are set for lesser interrupts and, often, for that interrupt too, at the same time as the interrupt vector is read and the CPU jumps to the relevant address. Dictating that receipt of an interrupt also implicitly masks that interrupt until the end of the interrupt handler has the nice side effect of loosening restrictions on interrupting hardware. E.g. you can simply say that signal high triggers the interrupt and leave the external hardware to decide how long it wants to hold the line high for without worrying about inadvertently triggering multiple interrupts.
In many antiquated systems (including the z80 and 6502) there tends to be only two levels of interrupt — maskable and non-maskable, which I think is where the language of enabling or disabling interrupts comes from. But even as far back as the original 68000 you've got eight levels of interrupt and a current priority level in the CPU that dictates which levels of incoming interrupt will actually be allowed to take effect.
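The 68000 rule can be written down in a few lines: the current priority level lives in bits 8 to 10 of the status register, an incoming interrupt is taken only if its level is higher than that, and level 7 is the non-maskable one. A sketch, with a made-up function name (it ignores the detail that level 7 is triggered on a transition):

```c
#include <stdint.h>
#include <stdbool.h>

/* Would a 68000 with this status register accept an interrupt at this level?
   Level 0 means "no interrupt requested". */
bool m68k_irq_accepted(uint16_t sr, unsigned level)
{
    unsigned mask = (sr >> 8) & 7u; /* current interrupt priority level */
    if (level == 0 || level > 7)
        return false;
    return level == 7 || level > mask;
}
```

With SR = $2300 (supervisor mode, mask 3), levels 1 to 3 are held off and levels 4 to 7 get through.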
Imagine your CPU is in the "int3" handler now, and at that time "int2" happens, where the newly arrived "int2" has a lower priority than "int3". How would we handle this situation?
One way is that while handling "int3" we mask out the other, lower-priority interrupt sources. That is, "int2" is signaling the CPU, but the CPU will not be interrupted by it. After we finish handling "int3", we return from "int3" and unmask the lower-priority interrupt sources.
The place we returned to can be:
1. Another process (in a preemptive system)
2. The process that was interrupted by "int3" (in a non-preemptive or preemptive system)
3. An interrupt handler that was interrupted by "int3", say int1's handler.
In cases 1 and 2, because we unmasked the lower-priority interrupt sources and "int2" is still signaling the CPU ("hi, there is something for you to handle immediately"), the CPU will be interrupted again, while it is executing instructions from a process, to handle "int2".
In case 3, if the priority of "int2" is higher than that of "int1", the CPU will be interrupted again, while it is executing instructions from "int1"'s handler, to handle "int2".
Otherwise, "int1"'s handler runs without being interrupted (because we are also masking out the interrupt sources with priority lower than "int1"), and the CPU returns to a process after handling "int1" and unmasking. At that time "int2" will be handled.
In my systems programming class we are working on a small, simple hobby OS. Personally I have been working on an ATA hard disk driver. I have discovered that a single line of code seems to cause a fault which then immediately reboots the system. The code in question is at the end of my interrupt service routine for the IDE interrupts. Since I was using the IDE channels, they are sent through the slave PIC (which is cascaded through the master). Originally my code was only sending the end-of-interrupt byte to the slave, but then my professor told me that I should be sending it to the master PIC as well.
So here is my problem: when I un-comment the line which sends the EOI byte to the master PIC, the system triple faults and then reboots. Likewise, if I leave it commented out, the system stays running.
_outb( PIC_MASTER_CMD_PORT, PIC_EOI ); // this causes (or at least sets off) a triple fault reboot
_outb( PIC_SLAVE_CMD_PORT, PIC_EOI );
Without seeing the rest of the system, is it possible for someone to explain what could possibly be happening here?
NOTE: Just as a shot in the dark, I replaced the _outb() call with another _outb() call which just made sure that the interrupts were enabled for the IDE controller; the generated assembly would have been almost identical. This did not cause a fault.
*_outb() is a wrapper for the x86 OUTB instruction.
What is so special about my function to send EOI to the master PIC that is an issue?
I realize without seeing the code this may be impossible to answer, but thanks for looking!
Triple faults usually point to a stack overflow or odd stack pointer. When a fault or interrupt occurs, the system immediately tries to push some more junk onto the stack (before invoking the fault handler). If the stack is hosed, this will cause another fault, which then tries to push more stuff on the stack, which causes another fault. At this point, the system gives up on you and reboots.
I know this because I actually have a silly patent (while working at Dell about 20 years ago) on a way to cause a CPU reset without external hardware (used to be done through the keyboard controller):
MOV ESP,1
PUSH EAX ; triple fault and reset!
An OUTB instruction can't cause a fault on its own. My guess is you are re-enabling an interrupt, and the interrupt gets triggered while something is wrong with your stack.
When you re-enable the PIC, are you doing it with the CPU's interrupt flag set, or cleared (ie. are you doing it sometime after a CLI opcode, or, sometime after an STI opcode)?
Assuming that the CPU's interrupt flag is enabled, your act of re-enabling the PIC allows any pending interrupts to reach the CPU: which would interrupt your code, dispatch to a vector specified by the IDT, etc.
So I expect that it's not your opcode that's directly causing the fault: rather, what's faulting is code that's run as the result of an interrupt which happens as a result of your re-enabling the PIC.
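For reference, the EOI sequence the question is about usually looks like the sketch below: for an IRQ that arrives through the slave (IRQ 8 to 15), acknowledge the slave and then the master. The port numbers and EOI value are the conventional 8259 ones and are an assumption here, since the question doesn't show its constants; port_out8() and the write log are hypothetical stand-ins for the real _outb() so the sketch can run outside a kernel.

```c
#include <stdint.h>

#define PIC_MASTER_CMD_PORT 0x20 /* assumed: conventional master command port */
#define PIC_SLAVE_CMD_PORT  0xA0 /* assumed: conventional slave command port */
#define PIC_EOI             0x20 /* non-specific end-of-interrupt command */

/* Hypothetical stand-in for port output: logs each write in order. */
struct port_write { uint16_t port; uint8_t value; };
struct port_write write_log[8];
int write_count;

void port_out8(uint16_t port, uint8_t value)
{
    if (write_count < 8) {
        write_log[write_count].port = port;
        write_log[write_count].value = value;
        write_count++;
    }
}

/* Acknowledge an IRQ (0..15) at the end of its handler. */
void pic_send_eoi(unsigned irq)
{
    if (irq >= 8)
        port_out8(PIC_SLAVE_CMD_PORT, PIC_EOI); /* slave first for IRQ 8..15 */
    port_out8(PIC_MASTER_CMD_PORT, PIC_EOI);    /* master always, due to the cascade */
}
```

Sending the master EOI is what lets the master PIC deliver further IRQs, which fits the explanation above: the write itself is harmless, but the next interrupt it allows through is what lands on a broken handler or stack.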