How to communicate between LKM and pthread? - linux-device-driver

We need to develop a Linux Kernel Module that will handle a hardware interrupt and wake a user pthread (or ideally a C++11 thread). Is that possible?
Where should I start looking for how to do it?

Yes, it is possible.
The LKM needs to notify user space when the interrupt occurs.
In the ISR, a FIFO-like mechanism can be used to notify user space: a thread (say, a pthread) blocked in read() on that FIFO starts processing once the LKM writes into it.
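The user-space half of that idea can be sketched as follows. This is only an illustration under stated assumptions: an ordinary pipe stands in for whatever device node or FIFO the driver would expose, and the names `struct waiter`, `wait_for_irq` and `demo_blocking_read` are made up for the example. With a real LKM, the write would come from the driver rather than from the same process.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for the driver's notification channel: a pipe.
 * With a real LKM the thread would block in read() on the driver's
 * file descriptor in the same way. */
struct waiter {
    int fd;            /* read end the thread blocks on */
    uint32_t last_irq; /* value delivered by the simulated interrupt */
    int handled;       /* number of notifications processed */
};

/* Thread body: sleeps inside read() until a notification arrives. */
static void *wait_for_irq(void *arg)
{
    struct waiter *w = arg;
    uint32_t irq;

    /* read() returns 0 on EOF, which ends the loop. */
    while (read(w->fd, &irq, sizeof irq) == (ssize_t)sizeof irq) {
        w->last_irq = irq;
        w->handled++;
    }
    return NULL;
}

/* Simulates one interrupt: the writer plays the kernel side and the
 * blocked thread wakes up. Returns how many notifications were handled,
 * or -1 on failure. */
static int demo_blocking_read(uint32_t irq_number)
{
    int fds[2];
    pthread_t tid;
    struct waiter w = { .fd = -1, .last_irq = 0, .handled = 0 };
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;
    w.fd = fds[0];
    if (pthread_create(&tid, NULL, wait_for_irq, &w) != 0)
        return -1;

    /* Stand-in for the ISR pushing an event towards user space. */
    n = write(fds[1], &irq_number, sizeof irq_number);
    assert(n == (ssize_t)sizeof irq_number);

    close(fds[1]);           /* EOF lets the worker leave its loop */
    pthread_join(tid, NULL); /* after the join, w is safe to read */
    close(fds[0]);

    return (w.last_irq == irq_number) ? w.handled : -1;
}
```

The thread consumes no CPU while it is blocked in read(); the same structure works with a C++11 std::thread, since the blocking happens in the read() call and not in the threading library.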

Related

How does kernel gain control?

As far as I know, external events such as hardware interrupts or system calls can trigger a control transfer from user space to kernel space. But what if there are no external events or system calls? Will the running process keep running forever?
It could, but typically the kernel programs a timer device to cause an interrupt at some point in the future. When that interrupt arrives, the kernel has the opportunity to decide what to do next.

Notify userland from a kernel module

I'm implementing a kernel module that drives GPIOs. I offer the possibility for the userland to perform actions on it via ioctls, but I'd like to get deeper and set up a "notification" system, where the kernel module will contact directly userland on a detected event. For example, a value change on a GPIO (already notified by interrupt in the kernel module).
The main purpose is to avoid active polling loops in userland, and I really don't know how to interface the kernel module with userland while keeping things fast, efficient, and more or less passive.
I can't find anything on a good practice in this case. Some people talk about having a character interface (via a file in /dev) and performing a blocking read() from userland, and so get notified when the read returns.
This method should be good enough, but in case of very fast GPIO value changes, the userland would maybe be too slow to handle a notification and finally would be crushed by tons of notifications it can't handle.
So I'm looking for a method like userland callback functions, that could be called from the kernel module on an event.
What do you guys think is the best solution? Is there an existing way of solving this specific problem?
Thank you :)
Calling from the kernel to userspace is certainly possible, for instance spawning a userspace process (consider the kernel launches init, udev and some more) or using IPC (netlink and others).
However, this is not what you want.
As people mentioned to you, the way to go is to have a char device and then use standard and well-known select/poll semantics. I don't think you should be worried about this being slow, assuming your userspace program is well-designed.
In fact, this design is so common that there is an existing framework called UIO, or Userspace I/O.
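From the userland side, the select/poll approach might look roughly like the sketch below. It assumes the driver's char device supports poll; since no such device exists here, the demo uses a pipe as a stand-in, and `wait_readable` and `demo_poll` are hypothetical helper names.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Waits up to timeout_ms for fd to become readable.
 * Returns 1 if data is ready, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    assert(fd >= 0);
    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Demo with a pipe standing in for the device node.
 * Returns 0 if poll() timed out before the event and fired after it. */
static int demo_poll(void)
{
    int fds[2];
    char event = 'E';
    char got = 0;
    int before, after;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    before = wait_readable(fds[0], 10);   /* nothing written yet: times out */
    n = write(fds[1], &event, 1);         /* the "driver" reports an event */
    after = wait_readable(fds[0], 1000);  /* now returns immediately */
    if (n == 1 && after == 1)
        n = read(fds[0], &got, 1);

    close(fds[0]);
    close(fds[1]);
    return (before == 0 && after == 1 && n == 1 && got == event) ? 0 : -1;
}
```

The process sleeps inside poll() until the descriptor becomes readable or the timeout expires, so there is no busy loop, and the same call can watch several descriptors at once.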
I'm sorry, I don't know if you can call userland callbacks from kernel space, but you can make your user-space application handle signals such as SIGTERM or SIGINT (SIGKILL cannot be caught), which can be sent to a user-space process from kernel space.
There are also SIGUSR1 and SIGUSR2, which are reserved for custom use/implementation. Your application could listen on SIGUSR1 and/or SIGUSR2. Then you only have to check, why you were notified.
I know, it's not exactly what you wanted, but maybe it's a little help. ;)
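A minimal sketch of the receiving side of the SIGUSR1 idea is shown here. It only covers user space: the demo sends the signal to itself with raise(), which stands in for the kernel module delivering it, and `on_sigusr1`, `install_sigusr1_handler` and `demo_signal` are illustrative names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Counter touched from the handler; sig_atomic_t keeps the access safe. */
static volatile sig_atomic_t usr1_count = 0;

/* Handler: do only async-signal-safe work here, then return. */
static void on_sigusr1(int signo)
{
    (void)signo;
    usr1_count++;
}

/* Registers the handler for SIGUSR1. Returns 0 on success. */
static int install_sigusr1_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Installs the handler and delivers one signal to this process.
 * raise() is a stand-in for the signal coming from kernel space.
 * Returns the number of signals seen so far, or -1 on failure. */
static int demo_signal(void)
{
    int rc = install_sigusr1_handler();

    assert(rc == 0);
    if (raise(SIGUSR1) != 0)
        return -1;
    return (int)usr1_count;
}
```

A signal carries very little information, so the application still has to ask the driver why it was notified, as the answer above points out.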
I finally switched to something else, as spawning userland processes was far too slow and error-prone.
I changed my software design to make the userland call an ioctl to get last events. The ioctl is blocking via wait queues, sleeping while the event queue is empty.
Thanks for your answer guys !
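The "sleep while the event queue is empty" behaviour can be modelled in user space as below. This is an analogy, not the poster's driver code: a mutex and condition variable play the role of the kernel wait queue, `event_post` stands in for the interrupt handler, `event_wait` for the blocking ioctl, and all names are invented for the sketch. Pending events are kept as a bitmask so that a burst on one GPIO line collapses into a single notification.

```c
#include <assert.h>
#include <pthread.h>

struct event_box {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    unsigned pending;   /* bitmask of GPIO lines that changed */
};

static void event_box_init(struct event_box *b)
{
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->nonempty, NULL);
    b->pending = 0;
}

/* Producer side (the interrupt handler in a real driver). */
static void event_post(struct event_box *b, unsigned gpio_mask)
{
    pthread_mutex_lock(&b->lock);
    b->pending |= gpio_mask;            /* repeated events coalesce */
    pthread_cond_signal(&b->nonempty);  /* wake the sleeping consumer */
    pthread_mutex_unlock(&b->lock);
}

/* Consumer side (the blocking call): sleeps while nothing is pending,
 * then returns and clears everything that accumulated. */
static unsigned event_wait(struct event_box *b)
{
    unsigned got;

    pthread_mutex_lock(&b->lock);
    while (b->pending == 0)
        pthread_cond_wait(&b->nonempty, &b->lock);
    got = b->pending;
    b->pending = 0;
    pthread_mutex_unlock(&b->lock);
    return got;
}

static void *producer(void *arg)
{
    struct event_box *b = arg;

    event_post(b, 1u << 3);
    event_post(b, 1u << 3);   /* burst on the same line */
    event_post(b, 1u << 5);
    return NULL;
}

/* Collects events until both lines have been reported. */
static unsigned demo_event_box(void)
{
    struct event_box b;
    pthread_t tid;
    unsigned seen = 0;
    int rc;

    event_box_init(&b);
    rc = pthread_create(&tid, NULL, producer, &b);
    assert(rc == 0);
    while (seen != ((1u << 3) | (1u << 5)))
        seen |= event_wait(&b);
    pthread_join(tid, NULL);
    return seen;
}
```

Because events are merged into a mask, a slow consumer sees at most one wake-up per line however fast the line toggles, which addresses the worry about being flooded with notifications; the price is that the exact number of changes is lost.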

Do we have to enable or disable PCI interrupts on every layer, or only at the closest to hardware?

I'm implementing a PCIe driver, and I'd like to understand at what level the interrupts can be or should be enabled/disabled. I intentionally do not specify OS, as I'm assuming it should be relevant for any platform. By levels I mean the following:
OS specific interrupts handling framework
Interrupts can be disabled or enabled in the PCI/PCIe configuration space registers, e.g. COMMAND register
Interrupts can also be masked at the device level; for instance, we can configure the device not to trigger certain interrupts to the host
I understand that whatever interrupt type is being used on PCIe (INTx emulation, MSI or MSI-X), it has to be delivered to the host OS.
So my question is really: do we actually have to enable or disable interrupts on every layer, or is it sufficient to do so only at the level closest to the hardware, e.g. in the relevant PCI registers?
Disabling interrupts at the various levels usually has completely different purposes.
Disabling interrupts:
In the OS (really, this means in the CPU) - This is generally about avoiding race conditions. In particular, if state/memory corruption could occur during a particular section of code if the CPU happened to be interrupted, then that section of code will need to disable interrupt handling. Interrupt handlers must not acquire normal locks (by definition they can't be suspended), and they must not attempt to acquire a spin-lock that is held by the thread currently scheduled on the same CPU (because that thread is blocked from progressing by the very same interrupt handler!) so ensuring data safety with interrupt handlers can be tricky. Handling interrupts promptly is generally a good thing, so you want to absolutely minimise such sections in any code you write. Do as much of your interrupt handling in secondary interrupt handlers as possible to avoid such situations. Secondary interrupt handlers are really just callbacks on a regular OS thread which doesn't have any of the restrictions of a primary interrupt handler.
PCI/PCIe configuration - It's my understanding this is mainly about routing interrupts, and is something you normally do once when your driver loads (or is activated by a client) and again when your driver unloads (or is deactivated). This may also be affected by power management events. In some OSes, the PCI(e) level is actually handled for you when you activate PCI device interrupts via higher-level APIs.
On-device - This is usually an optimisation to avoid interrupting the CPU when it doesn't need to be interrupted. The most common scenario is that an event happens on the device, so an interrupt is generated. The driver's primary interrupt handler checks the device registers if the driver needs to do any processing. If so, it disables interrupts on the device, and schedules the driver's secondary interrupt handler to run. The OS eventually runs the secondary handler, which processes whatever information the device has provided, until it runs out of things to do. Then it enables interrupts again, checks once more if there's any work pending from the device and if there are none, it terminates. (If there are items to process in this last check, it re-disables interrupts and starts over from the beginning.) The idea is that until the secondary interrupt handler has finished processing, there really is no point triggering the primary interrupt handler, and a waste of resources, if additional events arrive, because the driver is already busy processing the event queue. The final check after re-enabling interrupts is to avoid a race condition between an event arriving and re-enabling interrupts.
I hope that answers your question.
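The on-device sequence described above (mask, schedule, drain, unmask, re-check) can be illustrated with a small single-threaded simulation. Everything here is hypothetical: `struct fake_dev` models the device registers, and the handlers are plain functions rather than real OS callbacks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; no real hardware or OS API is involved. */
struct fake_dev {
    bool irq_enabled;   /* on-device interrupt enable bit */
    bool bh_scheduled;  /* secondary handler has been queued */
    int pending;        /* events waiting on the device */
    int processed;      /* events handled by the driver */
    int primary_calls;  /* how often the CPU was interrupted */
};

/* Primary handler: check for work, mask the device, defer the rest. */
static void primary_handler(struct fake_dev *d)
{
    d->primary_calls++;
    if (d->pending > 0) {
        d->irq_enabled = false;
        d->bh_scheduled = true;
    }
}

/* The device signals an event; the CPU is interrupted only if unmasked. */
static void dev_raise_event(struct fake_dev *d)
{
    d->pending++;
    if (d->irq_enabled)
        primary_handler(d);
}

/* Secondary handler: drain the queue, unmask, then check once more. */
static void secondary_handler(struct fake_dev *d)
{
    for (;;) {
        while (d->pending > 0) {
            d->pending--;
            d->processed++;
        }
        d->irq_enabled = true;
        if (d->pending == 0)
            break;               /* nothing slipped in: done */
        d->irq_enabled = false;  /* work arrived in the gap: go again */
    }
    d->bh_scheduled = false;
}

/* Feeds a burst of events to the device, then runs the deferred work. */
static struct fake_dev run_burst(int events)
{
    struct fake_dev d = { .irq_enabled = true };

    for (int i = 0; i < events; i++)
        dev_raise_event(&d);
    assert(events == 0 || d.bh_scheduled);
    if (d.bh_scheduled)
        secondary_handler(&d);
    return d;
}
```

In this simulation a burst of five events interrupts the CPU once, and the remaining four are picked up by the secondary handler. The final re-check never fires here because nothing runs concurrently; on real hardware it is what closes the window between draining the queue and unmasking.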

Why do we need a software interrupt to start the execution of a system call?

This may be very foolish question to ask.
However, I want to clarify my doubts, as I am new to this.
As per my understanding the CPU executes the instruction of a process step by step by incrementing the program counter.
Now suppose we have a system call as one of the instructions; why do we need to raise a software interrupt when this instruction is encountered? Can't the system call (a sequence of instructions) be executed just like the other instructions? As far as I understand, interrupts are for signaling certain asynchronous events, but here the system call is part of the process's instructions, which is not asynchronous.
It doesn't require an interrupt. You could make an OS which uses a simple call. But most don't for several reasons. Among them might be:
On many architectures, interrupts elevate or change the CPU's access level, allowing the OS to implement protection of its memory from the unprivileged user code.
Preemptive operating systems already make use of interrupts for scheduling processes. It might be convenient to use the same mechanism.
Interrupts are something present on most architectures. Alternatives might require significant redesign across architectures.
Here is one example of a "system call" which doesn't use an interrupt (if you define a system call as requesting functionality from the OS):
Older versions of ARM do not provide atomic instructions to increment a counter. This means that an atomic increment requires help from the OS. The naive approach would be to make it a system call which makes the process uninterruptible during the load-add-store instructions, but this has a lot of overhead from the interrupt handler. Instead, the Linux kernel has chosen to map a small bit of code into every process at a fixed address. This code contains the atomic increment instructions and can be called directly from user code. The kernel scheduler takes care of ensuring that any operations interrupted in this block are properly restarted.
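To give a feel for how user code builds an atomic increment on top of such a helper, here is a sketch of the usual retry loop. It is an assumption-laden stand-in: C11 `atomic_compare_exchange_weak` is used in place of the kernel-provided routine so that the example runs on any platform, and `cas_increment` and `demo_cas_counter` are names made up for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define INCREMENTS 10000

/* Atomic increment built from compare-and-swap: read the value, try to
 * store value + 1, and retry if another thread changed it in between.
 * Returns the new value. */
static int cas_increment(atomic_int *counter)
{
    int old = atomic_load(counter);

    /* On failure, old is refreshed with the current value, so the next
     * attempt recomputes old + 1 from up-to-date data. */
    while (!atomic_compare_exchange_weak(counter, &old, old + 1))
        ;
    return old + 1;
}

static void *bump(void *arg)
{
    atomic_int *counter = arg;

    for (int i = 0; i < INCREMENTS; i++)
        cas_increment(counter);
    return NULL;
}

/* Two threads increment the same counter concurrently. */
static int demo_cas_counter(void)
{
    atomic_int counter = 0;
    pthread_t a, b;
    int rc_a, rc_b;

    rc_a = pthread_create(&a, NULL, bump, &counter);
    rc_b = pthread_create(&b, NULL, bump, &counter);
    assert(rc_a == 0 && rc_b == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&counter);
}
```

No update is lost even though neither thread ever takes a lock or enters the kernel; the retry loop simply repeats whenever the two threads collide.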
First of all, system calls are synchronous software interrupts, not asynchronous. When the processor executes the trap machine instruction to go to kernel space, some of the kernel registers get changed by the interrupt handler functions. Modification of these registers requires privileged mode execution, i.e. these can not be changed using user space code.
A user-space program cannot read data directly from disk, because it has no control over the device driver, and it should not have to bother with driver code. Communication with the device driver should take place through kernel code. We tend to believe that kernel code is pristine and entirely trustworthy; user code is always suspect.
Hence, changing the contents of those registers or accessing driver functionality requires privileged instructions, so the user cannot execute system call functions as normal function calls. The processor needs to know whether it is in kernel mode before allowing access to these devices.
I hope this is clear to some extent.
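On Linux, the difference between an ordinary function call and a system call can be seen from user space with the generic `syscall()` entry point provided by the C library. The sketch below is Linux-specific and `raw_getpid` is a hypothetical helper name; the actual trap instruction is hidden inside `syscall()`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Asks the kernel for the process ID through the generic syscall()
 * entry point instead of the dedicated getpid() wrapper. The C library
 * executes the architecture's trap instruction on our behalf; user code
 * cannot simply jump into kernel code as with a normal function. */
static long raw_getpid(void)
{
    long pid = syscall(SYS_getpid);

    assert(pid > 0);   /* process IDs are always positive */
    return pid;
}
```

Calling `raw_getpid()` should return the same value as `getpid()`, since both end up in the same kernel service; only the way the request is packaged in user space differs.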

Are events built on polling?

An event is when you click on something, and code is run right away.
Polling is when the application constantly checks if your mouse button is held down, and if it's held down in a certain spot, code is run.
Do events really exist in computing, or is it all a layer built on polling?
This is a complicated question, and the answer depends on how far down you go (in abstraction layers) to answer it. Ultimately, your USB keyboard is being polled at a fixed interval (on the order of milliseconds) by the computer to ask what keys are being held down. This information gets passed to the keyboard driver through a CPU interrupt when the USB host controller (in the computer) gets a packet of data from the keyboard. From then on, interrupts are used to pass the data from process to process (through the GUI framework) until it eventually reaches your application.
As Marc Cohen said in his answer, CPU interrupts are also raised to signal I/O completion. This is an example of something which has no polling until you get to the hardware level, where checks are performed (perhaps once per clock cycle? Someone with more experience with computer architecture should answer) to see if the event has taken place.
It's a common technique to simulate events by polling but that's often very inefficient and leads to a dilemma where you have a tradeoff between event resolution and polling overhead but that doesn't mean true events don't exist.
A CPU interrupt, which could be raised to signal an external event, like I/O completion, is an example of an event all the way down at the hardware layer.
Well, both operating system and application level depend on events not polling. Polling is usually possible where states cannot be maintained. On desktop applications and OS levels however, applications have states; so, they use events for their processes, not polling.