Where to return from an interrupt - operating-system

I've read (and studied) about Interrupt Handling.
What I always fail to understand, is how do we know where to return to (PC / IP) from the Interrupt Handler.
As I understand it:
An Interrupt is caused by a device (say the keyboard)
The relevant handler is called - under the running process. That is, no context switch to the OS is performed.
The Interrupt Handler finishes, and passes control back to the running application.
The process depicted above, which is my understanding of Interrupt Handling, takes place within the current running process' context. So it's akin to a method call, rather than to a context switch.
However, being that we didn't actually make the CALL to the Interrupt Handler, we didn't have a chance to push the current IP to the stack.
So how do we know where to jump back from an Interrupt. I'm confused.
Would appreciate any explanation, including one-liners that simply point to a good pdf/ppt addressing this question specifically.
[I'm generally referring to above process under Linux and C code - but all good answers are welcomed]

It's pretty architecture dependent.
On Intel processors, the interrupt return address is pushed on the stack when an interrupt occurs. You would use an iret instruction to return from the interrupt context.
On ARM, an interrupt causes a processor mode change (to the INT, FIQ, or SVC mode, for example), saving the current CPSR (current program status register) into the SPSR (saved program status register), putting the current execution address into the new mode's LR (link register), and then jumping to the appropriate interrupt vector. Therefore, returning from an interrupt is done by moving the SPSR into the CPSR and then jumping to an address saved in LR - usually done in one step with a subs or movs instruction:
movs pc, lr

When an interrupt is triggered, the CPU pushes several registers onto the stack, including the instruction pointer (EIP) of the code that was executing before the interrupt. You can put iret and the end of your ISR to pop these values, and restore EIP (as well as CS, EFLAGS, SS and ESP).
By the way, interrupts aren't necessarily triggered by devices. In Linux and DOS, user space programs use interrupts (via int) to make system calls. Some kernel code uses interrupts, for example intentionally triple faulting in order to force a shutdown.

The interrupt triggering mechanism in the CPU pushes the return address on the stack (among other things).

Related

how does the operating system treat few interrupts and keep processes going?

I'm learning computer organization and structure (I'm using Linux OS with x86-64 architecture). we've studied that when an interrupt occurs in user mode, the OS is notified and it switches between the user stack and the kernel stack by loading the kernels rsp from the TSS, afterwards it saves the necessary registers (such as rip) and in case of software interrupt it also saves the error-code. in the end, just before jumping to the adequate handler routine it zeroes the TF and in case of hardware interrupt it zeroes the IF also. I wanted to ask about few things:
the error code is save in the rip, so why loading both?
if I consider a case where few interrupts happen together which causes the IF and TF to turn on, if I zero the TF and IF, but I treat only one interrupt at a time, aren't I leave all the other interrupts untreated? in general, how does the OS treat few interrupts that occur at the same time when using the method of IDT with specific vector for each interrupt?
does this happen because each program has it's own virtual memory and thus the interruption handling processes of all the programs are unrelated? where can i read more about it?
how does an operating system keep other necessary progresses running while handling the interrupt?
thank you very much for your time and attention!
the error code is save in the rip, so why loading both?
You're misunderstanding some things about the error code. Specifically:
it's not generated by software interrupts (e.g. instructions like int 0x80)
it is generated by some exceptions (page fault, general protection fault, double fault, etc).
the error code (if used) is not saved in the RIP, it's pushed on the stack so that the exception handler can use it to get more information about the cause of the exception
2a. if I consider a case where few interrupts happen together which causes the IF and TF to turn on, if I zero the TF and IF, but I treat only one interrupt at a time, aren't I leave all the other interrupts untreated?
When the IF flag is clear, mask-able IRQs (which doesn't include other types of interrupts - software interrupts, exceptions) are postponed (not disabled) until the IF flag is set again. They're "temporarily untreated" until they're treated later.
The TF flag only matters for debugging (e.g. single-step debugging, where you want the CPU to generate a trap after every instruction executed). It's only cleared in case the process (in user-space) was being debugged, so that you don't accidentally continue debugging the kernel itself; but most processes aren't being debugged like this so most of the time the TF flag is already clear (and clearing it when it's already clear doesn't really do anything).
2b. in general, how does the OS treat few interrupts that occur at the same time when using the method of IDT with specific vector for each interrupt? does this happen because each program has it's own virtual memory and thus the interruption handling processes of all the programs are unrelated? where can i read more about it?
There's complex rules that determine when an interrupt can interrupt (including when it can interrupt another interrupt). These rules mostly only apply to IRQs (not software interrupts that the kernel won't ever use itself, and not exceptions which are taken as soon as they occur). Understanding the rules means understanding the IF flag and the interrupt controller (e.g. how interrupt vectors and the "task priority register" in the local APIC influence the "processor priority register" in the local APIC, which determines which groups of IRQs will be postponed when the IF flag is set). Information about this can be obtained from Intel's manuals, but how Linux uses it can only be obtained from Linux source code and/or Linux specific documentation.
On top of that there's "whatever mechanisms and practices the OS felt like adding on top" (e.g. deferred procedure calls, tasklets, softIRQs, additional stack management) that add more complications (which can also only be obtained from Linux source code and/or Linux specific documentation).
Note: I'm not a Linux kernel developer so can't/won't provide links to places to look for Linux specific documentation.
how does an operating system keep other necessary progresses running while handling the interrupt?
A single CPU can't run 2 different pieces of code (e.g. an interrupt handler and user-space code) at the same time. Instead it runs them one at a time (e.g. runs user-space code, then switches to an IRQ handler for very short amount of time, then returns to the user-space code). Because the IRQ handler only runs for a very short amount of time it creates the illusion that everything is happening at the same time (even though it's not).
Of course when you have multiple CPUs, different CPUs can/do run different pieces of code at the same time.

Context switch on time interrupt

Quoting the following paragraph from Operating Systems: Three Easy Pieces,
Note that there are two types of register saves/restores that happen
during this protocol. The first is when the timer interrupt occurs; in
this case, the user registers of the running process are implicitly
saved by the hardware, using the kernel stack of that process. The
second is when the OS decides to switch from A to B; in this case, the
kernel registers are explicitly saved by the software (i.e., the OS),
but this time into memory in the process structure of the process.
Reading other literature on context switch I understand that timer interrupt throws the cpu into kernel mode which then saves the process context into kernel stack.
Why is the author talking about a multiple context save emphasising on hardware/software?
The author emphasizes on hardware/software part of it because basically its context saving that is being done , sometimes by hardware and sometimes by software.
When a timer interrupt occurs, the user registers are saved by hardware(meaning saved by the CPU itself) on the kernel stack of that process. When the interrupt handler code is finished, the user registers will be restored using the kernel stack of that process,thereby restoring user stack and process successfully returns to user mode from kernel mode.
In case of a context switch from process A to process B, the very kernel stacks of the two processes A and B are switched,inside the kernel, which indirectly means saving and restoring of kernel registers. The term Software is used because the scheduler process after selecting which process to run next calls a function(thats why software), that does the switching of kernel stacks. The context switch code need not worry about user register values - those are already safely saved away in the kernel stack by that point.
The first is when the timer interrupt occurs; in this case, the user registers of the running process are implicitly saved by the hardware, using the kernel stack of that process.
Often, only SOME of the registers are saved and this is usually to an interrupt stack.
The second is when the OS decides to switch from A to B; in this case, the kernel registers are explicitly saved by the software (i.e., the OS), but this time into memory in the process structure of the process.
Usually this switch occurs in HARDWARE through a special instruction. Maybe they are referring to the fact that the switch is triggered through software as opposed to an interrupt that is triggered by hardware.
Also thanks for that reference. I have just started to go through it. It MUCH better than most of the OS books that only serve to confuse.

how does the processor know an instruction is making a system call

system call -- It is an instruction that generates an interrupt that causes OS to gain
control of processor.
so if a running process issue a system call (e.g. create/terminate/read/write etc), a interrupt is generated which cause the KERNEL TO TAKE CONTROL of the processor which then executes the required interrupt handler routine. correct?
then can anyone tell me how the processor known that this instruction is supposed to block the process, go to privileged mode, and bring kernel code.
I mean as a programmer i would just type stream1=system.io.readfile(ABC) or something, which translates to open and read file ABC.
Now what is monitoring the execution of this process, is there a magical power in the cpu to detect this?
As from what i have read a PROCESSOR can only execute only process at a time, so WHERE IS THE MONITOR PROGRAM RUNNING?
How can the KERNEL monitor if a system call is made or not when IT IS NOT IN RUNNING STATE!!
or does the computer have a SYSTEM CALL INSTRUCTION TABLE which it compares with before executing any instruction?
please help
thanku
The kernel doesn't monitor the process to detect a system call. Instead, the process generates an interrupt which transfers control to the kernel, because that's what software-generated interrupts do according to the instruction set reference manual.
For example, on Unix the process stuffs the syscall number in eax and runs an an int 0x80 instruction, which generates interrupt 0x80. The CPU reacts to this by looking in the Interrupt Descriptor Table to find the kernel's handler for that interrupt. This handler is the entry point for system calls.
So, to call _exit(0) (the raw system call, not the glibc exit() function which flushes buffers) in 32-bit x86 Linux:
movl $1, %eax # The system-call number. __NR_exit is 1 for 32-bit
xor %ebx,%ebx # put the arg (exit status) in ebx
int $0x80
Let's analyse each questions you have posed.
Yes, your understanding is correct.
See, if any process/thread wants to get inside kernel there are only two mechanisms, one is by executing TRAP machine instruction and other is through interrupts. Usually interrupts are generated by the hardware, so any other process/threads wants to get into kernel it goes through TRAP. So as usual when TRAP is executed by the process it issues interrupt (mostly software interrupt) to your kernel. Along with trap you will also mentions the system call number, this acts as input to your interrupt handler inside kernel. Based on system call number your kernel finds the system call function inside system call table and it starts to execute that function. Kernel will set the mode bit inside cs register as soon as it starts to handle interrupts to intimate the processor as current instruction is a privileged instruction. By this your processor will comes to know whether the current instruction is privileged or not. Once your system call function finished it's execution your kernel will execute IRET instruction. Which will clear mode bit inside CS register to inform whatever instruction from now inwards are from user mode.
There is no magical power inside processor, switching between user and kernel context makes us to think that processor is a magical thing. It is just a piece of hardware which has the capability to execute tons of instructions at a very high rate.
4..5..6. Answers for all these questions are answered in above cases.
I hope I've answered your questions up to some extent.
The interrupt controller signals the CPU that an interrupt has occurred, passes the interrupt number (since interrupts are assigned priorities to handle simultaneous interrupts) thus the interrupt number to determine wich handler to start. The CPu jumps to the interrupt handler and when the interrupt is done, the program state reloaded and resumes.
[Reference: Silberchatz, Operating System Concepts 8th Edition]
What you're looking for is mode bit. Basically there is a register called cs register. Normally its value is set to 3 (user mode). For privileged instructions, kernel sets its value to 0. Looking at this value, processor knows which kind of instruction it is. If you're interested digging more please refer this excellent article.
Other Ref.
Where is mode bit
Modern hardware supports multiple user sessions. If your hw supports multi user mode, i provides a mechanism called interrupt. An interrupt basically stops the execution of the current code to execute other code (e.g kernel code).
Which code is executed is decided by parameters, that get passed to the interrupt, by the code that issues the interrupt. The hw will increase the run level, load the kernel code into the memory and forces the cpu to execute this code. When the kernel code returns, it again directly informs the hw and the run level gets decreased.
The HW will then restore the cpu state before the interrupt and set the cpu the the next line in the code that started the interrupt. Done.
Since the code is actively calling the hw, which again actively calls the kernel, no monitoring needs to be done by the kernel itself.
Side note:
Try to keep your question short. Make clear what you want. The first answer was correct for the question you posted, you just didnt phrase it well. Make clear that you are new to the topic and need a detailed explanation of basic concepts instead of explaining what you understood so far and don't use caps lock.
Please accept the answer cnicutar provided. thank you.

How to save value of the user stack pointer into variable from the IRQ mode

I am trying to build simple and small preemptive OS for the ARM processor (for experimenting with the ARM architecture).
I have my TCB that have pointer to the proper thread's stack which I am updating/reading from in my dispatch() method - something like this (mixed C and assembly)
asm("
ldr r5, =oldSP
ldr sp, [r5]
");
myThread->sp = oldSP;
myThread = getNewThread();
newSP = myThread->sp
asm("
ldr r5, =newSP
str sp, [r5]
");
When I call this dispatch() from user mode (explicit call), everything works all right - threads are losing and getting processor as they should.
However, I am trying to build preemptive OS, so I need timer IRQ to call dispatch - and that is my problem - in irq mode, r13_usr register is hidden, so I can't access it. I can't change to the SVC mode either - it is hidden there, too.
One solution that I see is switching to the user mode (after I entered dispatch method), updating/changing sp fields and switching back to the irq mode to continue where I left. Is that possible?
Another solution is to try not to enter IRQ mode again - I just need to handle hardware things (set proper status bit in timer periphery), call dispatch() (still in irq mode) in which I will mask timer interrupt, change to the user mode, do context switch, unmask timer interrupt and continue. Resumed thread should continue where it was suspended (before entering IRQ). Is this correct? (This should be correct if on interrupt, processor pushes r4-r11 and lr into user stack, but I think I am wrong here...)
Thank you.
I think I might have answered a similar question here: "ARM - access R13 and R14 from Supervisor Mode"
In your case, just use "IRQ mode" instead of "Supervisor mode", but I think the same principle applies.
Long & short of it, if you switch from IRQ to user mode, that's a one-way trap door, you can't simply switch back to IRQ mode under software control.
But by manipulating the CPSR and switching to system mode, you can get r13_usr and then switch back to the previous mode (in your case, IRQ mode).
Normally you setup all of this on boot, the various stack registers, handlers, etc. If there is a reason to go to user mode and get out of it, you can use swi from user mode and have the swi handler do whatever it was you wanted to do in user mode or task switch to a svc mode handler or something like that.
Which ARM variant do you use? On modern variants like Cortex_M3 you have the MRS/MSR instruction. This way you can access the MSP and PSP registers by moving them to/from a general purpose register.
CMSIS even defines __get_MSP() and __get_PSP() as C functions, as well as their __set[...] counterparts.
EDIT: This seems to work in Thumb-2 only. Sorry for the noise.

Interrupt masking: why?

I was reading up on interrupts. It is possible to suspend non-critical interrupts via a special interrupt mask. This is called interrupt masking. What i dont know is when/why you might want to or need to temporarily suspend interrupts? Possibly Semaphores, or programming in a multi-processor environment?
The OS does that when it prepares to run its own "let's orchestrate the world" code.
For example, at some point the OS thread scheduler has control. It prepares the processor registers and everything else that needs to be done before it lets a thread run so that the environment for that process and thread is set up. Then, before letting that thread run, it sets a timer interrupt to be raised after the time it intends to let the thread have on the CPU elapses.
After that time period (quantum) has elapsed, the interrupt is raised and the OS scheduler takes control again. It has to figure out what needs to be done next. To do that, it needs to save the state of the CPU registers so that it knows how to undo the side effects of the code it executes. If another interrupt is raised for any reason (e.g. some async I/O completes) while state is being saved, this would leave the OS in a situation where its world is not in a valid state (in effect, saving the state needs to be an atomic operation).
To avoid being caught in that situation, the OS kernel therefore disables interrupts while any such operations that need to be atomic are performed. After it has done whatever needs doing and the system is in a known state again, it reenables interrupts.
I used to program on an ARM board that had about 10 interrupts that could occur. Each particular program that I wrote was never interested in more than 4 of them. For instance there were 2 timers on the board, but my programs only used 1. I would mask the 2nd timer's interrupt. If I didn't mask that timer, it might have been enabled and continued making interrupts which would slow down my code.
Another example was that I would use the UART receive REGISTER full interrupt and so would never need the UART receive BUFFER full interrupt to occur.
I hope this gives you some insight as to why you might want to disable interrupts.
In addition to answers already given, there's an element of priority to it. There are some interrupts you need or want to be able to respond to as quickly as possible and others you'd like to know about but only when you're not so busy. The most obvious example might be refilling the write buffer on a DVD writer (where, if you don't do so in time, some hardware will simply write the DVD incorrectly) versus processing a new packet from the network. You'd disable the interrupt for the latter upon receiving the interrupt for the former, and keep it disabled for the duration of filling the buffer.
In practise, quite a lot of CPUs have interrupt priority built directly into the hardware. When an interrupt occurs, the disabled flags are set for lesser interrupts and, often, that interrupt at the same time as reading the interrupt vector and jumping to the relevant address. Dictating that receipt of an interrupt also implicitly masks that interrupt until the end of the interrupt handler has the nice side effect of loosening restrictions on interrupting hardware. E.g. you can simply say that signal high triggers the interrupt and leave the external hardware to decide how long it wants to hold the line high for without worrying about inadvertently triggering multiple interrupts.
In many antiquated systems (including the z80 and 6502) there tends to be only two levels of interrupt — maskable and non-maskable, which I think is where the language of enabling or disabling interrupts comes from. But even as far back as the original 68000 you've got eight levels of interrupt and a current priority level in the CPU that dictates which levels of incoming interrupt will actually be allowed to take effect.
Imagine your CPU is in "int3" handler now and at that time "int2" happens and the newly happened "int2" has a lower priority compared with "int3". How would we handle with this situation?
A way is when handling "int3", we are masking out other lower priority interrupters. That is we see the "int2" is signaling to CPU but the CPU would not be interrupted by it. After we finishing handling the "int3", we make a return from "int3" and unmasking the lower priority interrupters.
The place we returned to can be:
Another process(in a preemptive system)
The process that was interrupted by "int3"(in a non-preemptive system or preemptive system)
An int handler that is interrupted by "int3", say int1's handler.
In cases 1 and 2, because we unmasked the lower priority interrupters and "int2" is still signaling the CPU: "hi, there is a something for you to handle immediately", then the CPU would be interrupted again, when it is executing instructions from a process, to handle "int2"
In case 3, if the priority of “int2” is higher than "int1", then the CPU would be interrupted again, when it is executing instructions from "int1"'s handler, to handle "int2".
Otherwise, "int1"'s handler is executed without interrupting (because we are also masking out the interrupters with priority lower then "int1" ) and the CPU would return to a process after handling the “int1” and unmask. At that time "int2" would be handled.