tasklet, taskqueue, work-queue -- which to use?

I have been going through LDD3 for the last few months and have read the first few chapters many times.
These two links implement a bottom half in different ways: one uses a work queue, the other a task queue.
http://www.tldp.org/LDP/lkmpg/2.4/html/x1210.html
http://www.linuxtopia.org/online_books/linux_kernel/linux_kernel_module_programming_2.6/x1256.html
I have some doubts about tasklets, task queues and work queues. All of them seem to run some task later, at a free time:
a) What exactly is the difference between these three?
b) Which should be used for an interrupt handler's bottom half?
I am confused.

Tasklets and work queues are normally used for bottom halves, but they can be used anywhere; there is no limitation on them.
Regarding the difference:
1) Tasklets run in interrupt context. All tasklet code must be atomic, so all the rules that apply to atomic context apply to it.
For example, a tasklet cannot sleep (it cannot be rescheduled) or hold a lock for a long time.
2) Unlike a tasklet, a work queue executes in process context, which means its work can sleep and can hold a lock for a long time.
In short, tasklets are used for fast execution because they cannot sleep, whereas work queues are used for the normal execution of a bottom half. Both are executed at a later time by the kernel.
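To make the "executed at a later time" idea concrete, here is a minimal userspace sketch of the top-half/bottom-half split. It is only a model and not kernel code: the names top_half(), bottom_half() and process_sample() are made up for illustration, and in a real driver the queueing step would be done by something like tasklet_schedule() or schedule_work().

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_LEN 8   /* power of two, so the index arithmetic survives wraparound */

typedef void (*work_fn)(int data);

struct work {
    work_fn fn;
    int data;
};

static struct work queue[QUEUE_LEN];
static size_t head, tail;        /* head: next item to run, tail: next free slot */
static int processed_sum;

/* The slow part of the job; hypothetical stand-in for real processing. */
static void process_sample(int data)
{
    processed_sum += data;
}

/* "Top half": does the minimum at interrupt time and defers the rest. */
int top_half(int sample)
{
    if (tail - head == QUEUE_LEN)
        return -1;               /* queue full, sample dropped */
    queue[tail % QUEUE_LEN] = (struct work){ process_sample, sample };
    tail++;
    return 0;
}

/* "Bottom half": run later, drains the deferred work; returns items run. */
int bottom_half(void)
{
    int ran = 0;
    while (head != tail) {
        struct work w = queue[head % QUEUE_LEN];
        head++;
        assert(w.fn != NULL);
        w.fn(w.data);
        ran++;
    }
    return ran;
}

int processed_total(void)
{
    return processed_sum;
}
```

Calling top_half() twice only queues the two samples; nothing is processed until bottom_half() runs. Whether that later run happens in interrupt context (tasklet) or process context (work queue) is exactly the difference described above.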

Softirqs and tasklets both execute in interrupt context, whereas work queues execute in process context. Process-context code is allowed to sleep, but interrupt-context code is not (only another interrupt can preempt a bottom half that is running in interrupt context).
Which bottom-half mechanism you use depends entirely on the driver you are writing and its requirements.
For example, if you are writing a network driver that moves packets to and from the hardware on an interrupt basis, you want to complete that activity without any delay, so the only options available are softirqs or tasklets.
Note: It is worth going through chapter 8 of Linux Kernel Development by Robert Love. I have also read LDD, but Linux Kernel Development is better for understanding interrupts.

Related

Context switch by random system call

I know that an interrupt causes the OS to switch a CPU away from its current task and run a kernel routine. In this case, the system has to save the current context of the process running on the CPU.
However, I would like to know whether or not a context switch occurs when any random process makes a system call.

I would like to know whether or not a context switch occurs when any random process makes a system call.
Not precisely. Recall that a process can only make a system call if it's currently running -- there's no need to make a context switch to a process that's already running.
If a process makes a blocking system call (e.g., sleep()), there will be a context switch to the next runnable process, since the current process is now sleeping. But that's another matter.
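A rough way to observe this from userspace on Linux is to count voluntary context switches with getrusage() around a non-blocking call and around a blocking one. This is only a sketch; the helper names are made up, and the exact numbers depend on the system.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/resource.h>
#include <time.h>
#include <unistd.h>

/* Voluntary context switches of this process so far (it gave up the CPU itself). */
static long voluntary_switches(void)
{
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc;
    return ru.ru_nvcsw;
}

/* Non-blocking system calls: the calling process stays on the CPU. */
long switches_during_getpid(void)
{
    long before = voluntary_switches();
    for (int i = 0; i < 1000; i++)
        (void)getpid();
    return voluntary_switches() - before;
}

/* Blocking system call: the process sleeps, so the scheduler can run another one. */
long switches_during_sleep(void)
{
    long before = voluntary_switches();
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 }; /* 10 ms */
    nanosleep(&ts, NULL);
    return voluntary_switches() - before;
}
```

On a typical Linux machine one would expect the first function to report no voluntary switches and the second to report at least one, but that is an expectation rather than a guarantee.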

There are generally two ways to cause a context switch: (1) a timer interrupt invokes the scheduler, which forcibly makes a context switch, or (2) the process yields. Most operating systems have a number of system services that will cause the process to yield the CPU.

Well, I got your point. So first, let me clear up a very basic idea about system calls.
When a process makes a syscall, it traps into the kernel to invoke the syscall handler: the CPU loads the kernel stack from the TSS and jumps to the handler, which looks the call up in the syscall function table.
It is essentially the same as running a different part of the program itself; the only major change is that the kernel plays a role here and that piece of code is executed in ring 0.
Now your question: "what will happen if a context switch happens when a random process is making a syscall?"
Well, nothing special happens. Things work the same way as before. The only difference is that the state saved for that process refers to its kernel stack and to the syscall code it was executing, instead of to ordinary user-space addresses.

Can the Linux Linked List API be used safely inside of an interrupt handler?

I am writing a device driver for a custom piece of hardware using the Linux kernel 2.6.33. I am using DMA to transfer data to and from the device. For the output DMA, I was thinking that I would keep track of several output buffers using the Linked List API (struct list_head, list_add(), etc.).
When the device finishes a DMA transfer, it raises an interrupt. The interrupt handler would then retrieve the next item in the linked list to transfer and remove it from the list.
My question is, is this actually a safe thing to do inside of an interrupt handler? Or are there inherent race conditions in this API that would make it not safe?
The small section in Linux Device Drivers, 3rd Ed. doesn't make mention of this. The section in Essential Linux Device Drivers is more complete but also does not touch on this subject.
Edit:
I am beginning to think that it may very well not be race condition free as msh suggests, due to a note listed in the list_empty_careful() function:
* NOTE: using list_empty_careful() without synchronization
* can only be safe if the only activity that can happen
* to the list entry is list_del_init(). Eg. it cannot be used
* if another CPU could re-list_add() it.
http://lxr.free-electrons.com/source/include/linux/list.h?v=2.6.33;a=powerpc#L202
Note that I plan to add to the queue in process context and remove from the queue in interrupt context. Do you really not need synchronization around the functions for a list?

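It is perfectly safe to use kernel linked lists in interrupt context, but doing any real processing in an interrupt handler is a bad idea. The list functions do no locking of their own, though, so a list that is shared between process context and an interrupt handler still has to be protected, typically with a spinlock. In the interrupt handler you should acknowledge the interrupt, schedule the "bottom half" and quit. All processing should be done by the "bottom half" (a bottom half is just a piece of deferred work; there are several suitable mechanisms: tasklets, work queues, etc.).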
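As an illustration of the locking point, here is a small userspace model of a buffer queue with a producer side and a consumer side. It is a sketch under stated assumptions, not driver code: struct node imitates struct list_head, struct dma_buf and the function names are hypothetical, and a C11 atomic_flag stands in for the spinlock. In a real driver the process-context side would normally take the lock with spin_lock_irqsave() and the handler with spin_lock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular, doubly linked struct list_head. */
struct node {
    struct node *next, *prev;
};

struct dma_buf {                 /* hypothetical output buffer descriptor */
    struct node link;            /* kept first so a node pointer can be cast back */
    int id;
};

static struct node queue = { &queue, &queue };       /* empty list points at itself */
static atomic_flag queue_lock = ATOMIC_FLAG_INIT;    /* stand-in for a spinlock */

static void lock(void)
{
    while (atomic_flag_test_and_set_explicit(&queue_lock, memory_order_acquire))
        ;                        /* spin until the flag was clear */
}

static void unlock(void)
{
    atomic_flag_clear_explicit(&queue_lock, memory_order_release);
}

/* Producer ("process context"): append a buffer at the tail of the list. */
void push_buffer(struct dma_buf *b)
{
    assert(b != NULL);
    lock();
    b->link.prev = queue.prev;
    b->link.next = &queue;
    queue.prev->next = &b->link;
    queue.prev = &b->link;
    unlock();
}

/* Consumer ("interrupt context"): detach the oldest buffer, or NULL if empty. */
struct dma_buf *pop_buffer(void)
{
    struct dma_buf *b = NULL;
    lock();
    if (queue.next != &queue) {
        struct node *n = queue.next;
        n->prev->next = n->next;
        n->next->prev = n->prev;
        n->next = n->prev = n;
        b = (struct dma_buf *)n;
    }
    unlock();
    return b;
}
```

Each list update touches several pointers, so without the lock a consumer running in the middle of push_buffer() could see a half-linked list. That is the race the question is worried about.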

how does the processor know an instruction is making a system call

System call -- it is an instruction that generates an interrupt that causes the OS to gain control of the processor.
So if a running process issues a system call (e.g. create/terminate/read/write), an interrupt is generated which causes the KERNEL TO TAKE CONTROL of the processor, which then executes the required interrupt handler routine. Correct?
Then can anyone tell me how the processor knows that this instruction is supposed to block the process, go to privileged mode, and bring in kernel code?
I mean, as a programmer I would just type stream1=system.io.readfile(ABC) or something, which translates to open and read file ABC.
Now what is monitoring the execution of this process? Is there a magical power in the CPU to detect this?
From what I have read, a PROCESSOR can execute only one process at a time, so WHERE IS THE MONITOR PROGRAM RUNNING?
How can the KERNEL monitor whether a system call is made or not when IT IS NOT IN RUNNING STATE!!
Or does the computer have a SYSTEM CALL INSTRUCTION TABLE which it compares with before executing any instruction?
Please help.
Thank you.

The kernel doesn't monitor the process to detect a system call. Instead, the process generates an interrupt which transfers control to the kernel, because that's what software-generated interrupts do according to the instruction set reference manual.
For example, on 32-bit x86 Linux the process stuffs the syscall number in eax and runs an int 0x80 instruction, which generates interrupt 0x80. The CPU reacts to this by looking in the Interrupt Descriptor Table to find the kernel's handler for that interrupt. This handler is the entry point for system calls.
So, to call _exit(0) (the raw system call, not the glibc exit() function which flushes buffers) in 32-bit x86 Linux:
movl $1, %eax # The system-call number. __NR_exit is 1 for 32-bit
xor %ebx,%ebx # put the arg (exit status) in ebx
int $0x80
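From C, the same idea can be tried without writing assembly by going through the libc syscall() function, which loads the system call number and arguments into registers and executes the trap instruction for the current architecture (on x86-64 that is the syscall instruction rather than int 0x80). A minimal Linux-only sketch, with raw_getpid() being a name made up for this example:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Issue the getpid system call by number instead of via the getpid() wrapper. */
long raw_getpid(void)
{
    long pid = syscall(SYS_getpid);   /* SYS_getpid is the kernel's call number */
    assert(pid > 0);
    return pid;
}
```

The value returned should match what the ordinary getpid() wrapper reports, since both end up at the same kernel entry point.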

Let's analyse each question you have posed.
1. Yes, your understanding is correct.
2. If any process or thread wants to get inside the kernel, there are only two mechanisms: one is executing a TRAP machine instruction and the other is an interrupt. Interrupts are usually generated by the hardware, so a process or thread that wants to enter the kernel goes through TRAP. When the process executes TRAP, it raises an interrupt (a software interrupt) to the kernel. Along with the trap you also supply the system call number, which acts as the input to the interrupt handler inside the kernel. Based on the system call number, the kernel finds the system call function in the system call table and starts executing it. As the trap is taken, the mode bits in the CS register are switched to kernel mode, and that is how the processor knows whether the current instruction is allowed to be privileged. Once the system call function has finished, the kernel executes an IRET instruction, which restores the user-mode value in CS so that the instructions from then on run in user mode.
3. There is no magical power inside the processor; the switching between user and kernel context just makes it look magical. It is a piece of hardware that can execute tons of instructions at a very high rate.
4, 5, 6. These questions are answered by the points above.
I hope I've answered your questions up to some extent.
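The table lookup described in point 2 can be sketched in plain C. This is a toy model, not kernel source: the two "services" and the dispatch() function are invented for the example, and a real kernel passes several arguments and returns -ENOSYS for unknown numbers.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*syscall_fn)(long arg);

/* Two hypothetical kernel services. */
static long sys_double(long arg) { return 2 * arg; }
static long sys_negate(long arg) { return -arg; }

/* The system call table: the call number is simply an index into it. */
static const syscall_fn syscall_table[] = {
    [0] = sys_double,
    [1] = sys_negate,
};

#define NR_SYSCALLS (sizeof syscall_table / sizeof syscall_table[0])

/* What the trap handler does with the number supplied by the process. */
long dispatch(unsigned long nr, long arg)
{
    if (nr >= NR_SYSCALLS || syscall_table[nr] == NULL)
        return -1;               /* unknown call number */
    return syscall_table[nr](arg);
}
```

Nothing in this scheme monitors the process. The process itself supplies the number and triggers the trap, and the handler only indexes the table.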

The interrupt controller signals the CPU that an interrupt has occurred and passes the interrupt number (interrupts are assigned priorities so that simultaneous interrupts can be handled). The CPU uses the interrupt number to determine which handler to start. The CPU jumps to the interrupt handler, and when the handler is done, the program state is reloaded and the program resumes.
[Reference: Silberschatz, Operating System Concepts 8th Edition]

What you're looking for is the mode bit. On x86 the current privilege level is held in the low two bits of the cs register. In user mode its value is 3; when the CPU enters the kernel it becomes 0. Looking at this value, the processor knows whether privileged instructions are allowed. If you're interested in digging deeper, please refer to this excellent article.
Other Ref.
Where is mode bit
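As a small illustration of "looking at this value", the privilege level can be extracted from a code segment selector with a two-bit mask. The function name is made up for the example, and the selector values used in the usage note below (0x33 and 0x10) are the ones Linux uses on x86-64, taken here purely as sample inputs.

```c
#include <assert.h>
#include <stdint.h>

enum { RING_KERNEL = 0, RING_USER = 3 };

/* The low two bits of a code segment selector hold the privilege level. */
int privilege_level(uint16_t cs_selector)
{
    int level = cs_selector & 0x3;
    assert(level >= RING_KERNEL && level <= RING_USER);
    return level;
}
```

For example, privilege_level(0x33) yields 3 (user mode) and privilege_level(0x10) yields 0 (kernel mode).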

Modern hardware supports multiple user sessions. If your hardware supports multi-user mode, it provides a mechanism called an interrupt. An interrupt basically stops the execution of the current code to execute other code (e.g. kernel code).
Which code is executed is decided by parameters that are passed to the interrupt by the code that issues it. The hardware raises the run level and forces the CPU to execute the kernel code that handles the interrupt. When the kernel code returns, it again directly informs the hardware and the run level is lowered.
The hardware then restores the CPU state from before the interrupt and sets the CPU to the next instruction in the code that started the interrupt. Done.
Since the code is actively calling the hw, which again actively calls the kernel, no monitoring needs to be done by the kernel itself.
Side note:
Try to keep your question short. Make clear what you want. The first answer was correct for the question you posted; you just didn't phrase it well. Make clear that you are new to the topic and need a detailed explanation of basic concepts instead of explaining what you have understood so far, and don't use caps lock.
Please accept the answer cnicutar provided. thank you.

How do OSes Handle context switching?

As far as I understand, every OS needs some mechanism to periodically check whether it should run some tasks and suspend others.
One way would be some kind of timer, on whose expiry the OS checks whether it should run or suspend some task.
Generally, say on an ARM system, that would probably be some kind of ISR.
My real question is that I have only been able to visualize this and have not seen it anywhere. Could someone point me to some free/open RTOS code where I can actually see the code that handles the preemption/scheduling?

freertos.org. The entire OS is open source, and right there for you to see. And there are dozens of different ports to compare and contrast. For the context switch code, you will want to look in the ports directory, in any one of many files called port.c, port.asm, etc. And yes, in the case of freertos all context switches are performed in interrupts (a tick timer ISR, or any other SysCall interrupt).
A context switch is very much processor-specific, as the list of registers to save and the assembly code to save them vary between processor families, and sometimes within a given family. As a result, each port has a separate file for this code.
The scheduling (selection of next task to run), on the other hand, is done in a file called tasks.c, which is common to all ports and references the port-specific code.
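If you want to watch a context switch without reading port-specific assembly first, the POSIX ucontext functions (available in glibc) do the register save and restore for you. The sketch below is not FreeRTOS code; it only shows one "task" being switched out and later resumed exactly where it left off. All names other than the ucontext calls are made up for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, task_ctx;
static char task_stack[STACK_SIZE];   /* every task needs its own stack */
static int trace[4];
static int trace_len;

static void task_body(void)
{
    trace[trace_len++] = 2;              /* first time slice of the task */
    swapcontext(&task_ctx, &main_ctx);   /* yield: save our registers, restore main's */
    trace[trace_len++] = 4;              /* resumed right after the yield */
}

/* Returns the recorded execution order as a number, one digit per step. */
int run_context_switch_demo(void)
{
    trace_len = 0;

    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;        /* context to resume when task_body returns */
    makecontext(&task_ctx, task_body, 0);

    trace[trace_len++] = 1;
    swapcontext(&main_ctx, &task_ctx);   /* switch to the task */
    trace[trace_len++] = 3;
    swapcontext(&main_ctx, &task_ctx);   /* switch back into the task */

    assert(trace_len == 4);
    return trace[0] * 1000 + trace[1] * 100 + trace[2] * 10 + trace[3];
}
```

The steps interleave as main, task, main, task, so the function returns 1234. An RTOS port does the same save and restore by hand, usually from inside an ISR.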

It is not the case that an RTOS simply context switches periodically; that is how most GPOS work. In an RTOS the scheduler runs on any scheduling event. These include the system tick, but also a message post, an event trigger, a semaphore give, or a mutex unlock, for example.
On ARM Cortex-M, CMSIS 3.x includes an RTOS API (intended primarily for RTOS developers rather than being a complete RTOS itself); the source for this will include a context switching mechanism.
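A toy model of that event-driven behaviour, assuming a fixed-priority scheduler: every scheduling event ends by re-running the same selection function. The task table and function names are invented for this sketch and are not taken from any particular RTOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task {
    const char *name;
    int priority;      /* larger number = more urgent */
    bool ready;
};

static struct task tasks[] = {
    { "idle",   0, true  },
    { "logger", 1, true  },
    { "motor",  3, false },   /* blocked, waiting on a semaphore */
};

#define NUM_TASKS (sizeof tasks / sizeof tasks[0])

/* Scheduler core: the highest-priority ready task wins. */
const struct task *pick_next(void)
{
    const struct task *best = NULL;
    for (size_t i = 0; i < NUM_TASKS; i++)
        if (tasks[i].ready && (best == NULL || tasks[i].priority > best->priority))
            best = &tasks[i];
    assert(best != NULL);     /* the idle task is always ready */
    return best;
}

/* Scheduling event 1: the periodic system tick. */
const struct task *on_tick(void)
{
    return pick_next();
}

/* Scheduling event 2: a semaphore give wakes the waiting task immediately. */
const struct task *on_semaphore_give(size_t waiter)
{
    assert(waiter < NUM_TASKS);
    tasks[waiter].ready = true;
    return pick_next();
}
```

With this table, a tick selects "logger", but giving the semaphore to "motor" switches to it straight away instead of waiting for the next tick.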
If you want a detailed description for a simple RTOS you might consider reading µC/OS-II: The Real-Time Kernel or the slightly more sophisticated µC/OS-III: The Real-Time Kernel .
FreeRTOS is increasingly popular, though perhaps a little unconventional architecturally. A more complete (in that it is not just a scheduling kernel but a more complete OS) and very powerful option is eCos.

You can take a look at xv6.
It's not an RTOS; it is just a skeleton OS (based on V6 Unix) meant for academic purposes.
In the xv6 book, take a look at chapter 4: there is an explanation, along with the code, of how scheduling is done on a small OS like xv6. xv6 puts a process to sleep when it is waiting for the disk or some other I/O operation, and there is also a timer interrupt every 100 msec to switch processes.
There is also an explanation, with code, of how the context switch takes place, what information is saved (the context frame of a process), and how the switch from user to kernel mode happens when the scheduler has to run.
The best part is that the amount of reading you have to do to understand these concepts is very small, unlike with some OS reference books :) The code is relatively small; you can in fact run xv6 on qemu, set breakpoints in sched, swtch and other functions, and actually see the information saved during a context switch.
You don't have to read the previous chapters to understand chapter 4; there isn't much dependency. xv6 uses struct proc to identify a process and ptable for all the current processes in the system; proc->context refers to the state the process is in (register values etc.), which is saved by the scheduler.
Cheers :)

Interrupt masking: why?

I was reading up on interrupts. It is possible to suspend non-critical interrupts via a special interrupt mask. This is called interrupt masking. What I don't know is when or why you might want or need to temporarily suspend interrupts. Possibly semaphores, or programming in a multi-processor environment?

The OS does that when it prepares to run its own "let's orchestrate the world" code.
For example, at some point the OS thread scheduler has control. It prepares the processor registers and everything else that needs to be done before it lets a thread run so that the environment for that process and thread is set up. Then, before letting that thread run, it sets a timer interrupt to be raised after the time it intends to let the thread have on the CPU elapses.
After that time period (quantum) has elapsed, the interrupt is raised and the OS scheduler takes control again. It has to figure out what needs to be done next. To do that, it needs to save the state of the CPU registers so that it knows how to undo the side effects of the code it executes. If another interrupt is raised for any reason (e.g. some async I/O completes) while state is being saved, this would leave the OS in a situation where its world is not in a valid state (in effect, saving the state needs to be an atomic operation).
To avoid being caught in that situation, the OS kernel therefore disables interrupts while any such operations that need to be atomic are performed. After it has done whatever needs doing and the system is in a known state again, it reenables interrupts.
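The same disable, do the atomic work, re-enable pattern can be tried in userspace, with POSIX signals playing the role of interrupts and the signal mask playing the role of the interrupt mask. This is an analogy rather than what the kernel itself does, and the function name is made up for the sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t handled;

static void on_usr1(int signo)
{
    (void)signo;
    handled = 1;
}

/* Returns 1 if the signal stayed pending while masked and ran after unmasking. */
int masked_section_demo(void)
{
    struct sigaction sa = { 0 };
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);

    handled = 0;
    sigprocmask(SIG_BLOCK, &block, &old);    /* "disable interrupts" */
    raise(SIGUSR1);                          /* the event arrives mid-section */
    int seen_inside = handled;               /* handler has not run yet */
    sigprocmask(SIG_SETMASK, &old, NULL);    /* "re-enable": pending signal is delivered */

    return seen_inside == 0 && handled == 1;
}
```

The signal is not lost while it is masked. It stays pending and is delivered as soon as the mask is lifted, which is also how a masked hardware interrupt usually behaves.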

I used to program on an ARM board that had about 10 interrupts that could occur. Each particular program that I wrote was never interested in more than 4 of them. For instance there were 2 timers on the board, but my programs only used 1. I would mask the 2nd timer's interrupt. If I didn't mask that timer, it might have been enabled and continued making interrupts which would slow down my code.
Another example was that I would use the UART receive REGISTER full interrupt and so would never need the UART receive BUFFER full interrupt to occur.
I hope this gives you some insight as to why you might want to disable interrupts.

In addition to answers already given, there's an element of priority to it. There are some interrupts you need or want to be able to respond to as quickly as possible and others you'd like to know about but only when you're not so busy. The most obvious example might be refilling the write buffer on a DVD writer (where, if you don't do so in time, some hardware will simply write the DVD incorrectly) versus processing a new packet from the network. You'd disable the interrupt for the latter upon receiving the interrupt for the former, and keep it disabled for the duration of filling the buffer.
In practise, quite a lot of CPUs have interrupt priority built directly into the hardware. When an interrupt occurs, the disable flags are set for lesser interrupts and, often, for that interrupt itself, at the same time as the interrupt vector is read and the CPU jumps to the relevant address. Dictating that receipt of an interrupt also implicitly masks that interrupt until the end of the interrupt handler has the nice side effect of loosening restrictions on interrupting hardware. E.g. you can simply say that signal high triggers the interrupt and leave the external hardware to decide how long it wants to hold the line high for without worrying about inadvertently triggering multiple interrupts.
In many antiquated systems (including the z80 and 6502) there tend to be only two levels of interrupt, maskable and non-maskable, which I think is where the language of enabling or disabling interrupts comes from. But even as far back as the original 68000 you've got eight levels of interrupt and a current priority level in the CPU that dictates which levels of incoming interrupt will actually be allowed to take effect.

Imagine your CPU is in the "int3" handler and at that moment "int2" occurs, where "int2" has a lower priority than "int3". How would we handle this situation?
One way is to mask out lower-priority interrupts while handling "int3". That is, "int2" is signaling the CPU, but the CPU is not interrupted by it. After we finish handling "int3", we return from "int3" and unmask the lower-priority interrupts.
The place we return to can be:
1. Another process (in a preemptive system)
2. The process that was interrupted by "int3" (in a non-preemptive or a preemptive system)
3. An interrupt handler that was interrupted by "int3", say int1's handler.
In cases 1 and 2, because we unmasked the lower-priority interrupts and "int2" is still signaling the CPU ("hi, there is something for you to handle immediately"), the CPU is interrupted again, while it is executing instructions from a process, to handle "int2".
In case 3, if the priority of "int2" is higher than that of "int1", the CPU is interrupted again, while it is executing instructions from int1's handler, to handle "int2".
Otherwise, int1's handler runs without being interrupted (because interrupts with a priority lower than "int1" are masked as well), and the CPU returns to a process after handling "int1" and unmasking. At that time "int2" is handled.
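The int1/int2/int3 scenario can be played through with a small model of a CPU that has a current priority level and a set of pending interrupt lines. Everything here (struct cpu, deliverable(), enter(), leave()) is invented for illustration and assumes that a larger number means a higher priority.

```c
#include <assert.h>

#define LEVELS 8

struct cpu {
    unsigned pending;     /* bit n set: the source with priority n is signaling */
    int level;            /* priority of the running handler, 0 = ordinary process code */
    int saved[LEVELS];    /* levels to restore when nested handlers return */
    int depth;
};

/* Highest pending priority above the current level, or 0 if all of them are masked. */
int deliverable(const struct cpu *c)
{
    for (int p = LEVELS - 1; p > c->level; p--)
        if (c->pending & (1u << p))
            return p;
    return 0;
}

/* Enter the handler for priority p: everything at or below p is masked from now on. */
void enter(struct cpu *c, int p)
{
    assert(p > c->level && p < LEVELS && c->depth < LEVELS);
    c->saved[c->depth++] = c->level;
    c->level = p;
    c->pending &= ~(1u << p);
}

/* Return from the current handler, which unmasks the lower priorities again. */
void leave(struct cpu *c)
{
    assert(c->depth > 0);
    c->level = c->saved[--c->depth];
}
```

Starting inside int1's handler (enter(&c, 1)), raising bit 3 makes deliverable() report 3. After enter(&c, 3), raising bit 2 reports 0, because int2 is masked. After leave(), it reports 2, which is case 3 above with int2 outranking int1.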
