how drivers work (e.g. PCIe and USB) - operating-system

I am curious about how drivers in general work. I do understand basic concepts and also how a single driver operates. Where I am confused is how they work when multiple drivers are involved.
Let me explain my question through an example:
Suppose I have a PCIe and a USB interface in HW. The primary interface to the host (where the driver, OS, and applications reside) is PCIe. The USB interface is accessible to the host through PCIe.
So, in this case, I would have a driver for PCIe as well as for USB.
When data has to be transferred over USB by an application, the application would invoke system/OS calls. These would eventually land in the USB driver.
Is this correct?
Once the USB driver has completed its processing, the PCIe driver has to be invoked. Who does that? Is it the OS or the USB driver itself?
I would assume it is the OS, as otherwise the basic modular philosophy would be broken. But a driver calling back into the OS seems counterintuitive, as I always assumed the flow to be from application to OS to driver to HW.
Can anyone please throw some light on this topic?

Much like in user-space code, there exist standardized APIs for accessing various types of hardware in kernel land (exact usage varies by OS). As a result it isn't really that much of a stretch for one device driver to access another device's driver via these standardized APIs. (Warning: USB is a very complex protocol, and many details have been glossed over to keep a long post shorter.)
The original question focused on PCIe-to-USB cards. In this example I think it's helpful to think of there being three "layers" of drivers. The first layer is the PCIe bus controller driver, which controls PCIe-bus-specific functions such as mapping out MMIO for PCIe devices and supporting interrupts from those PCIe devices. The second layer is the USB host controller layer, which provides the functions for issuing standardized USB transactions. Finally, the USB device driver (like a USB keyboard driver) sits on top of the stack, using the standardized USB transactions to implement the functionality of the specific USB peripheral device. Calls from the keyboard driver go down into functions in the USB host controller driver, which in turn may even call down to the PCIe driver. All of this is done in kernel space, even though many separate drivers are employed.
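To make the layering concrete, here is a minimal Linux-flavored sketch of the top layer handing a transfer down the stack; the endpoint number and polling interval are placeholders, not values from any real device:

```c
#include <linux/usb.h>
#include <linux/slab.h>

/* Sketch: a USB device driver (e.g. a keyboard) hands a transfer to the
 * host controller layer. usb_submit_urb() descends into the HCD (e.g. the
 * xHCI driver), which in turn touches the PCIe device via MMIO. */
static int kbd_start_polling(struct usb_device *udev, void *buf, int len,
                             usb_complete_t done)
{
    struct urb *urb = usb_alloc_urb(0, GFP_KERNEL);

    if (!urb)
        return -ENOMEM;

    /* Endpoint 1 IN with an 8 ms polling interval: placeholder values. */
    usb_fill_int_urb(urb, udev, usb_rcvintpipe(udev, 1),
                     buf, len, done, NULL, 8);

    return usb_submit_urb(urb, GFP_KERNEL); /* descends into the HCD layer */
}
```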
Most PCIe devices do the bulk of their communication with the CPU via MMIO access, which appear as memory reads/writes to the processor. Generally no specific driver function is needed to perform the MMIO transfer of data from PCIe to CPU (although there may be some simple access functions to do endian correction or deal with cache issues).
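As a small illustration (a sketch with made-up register offsets), once a BAR has been mapped with ioremap() on Linux, driver code just reads and writes that mapping:

```c
#include <linux/io.h>

/* Hypothetical register offsets, for illustration only. */
static u32 read_device_status(void __iomem *regs)
{
    writel(0x1, regs + 0x10);   /* "kick" register (made up)  */
    return readl(regs + 0x14);  /* status register (made up)  */
}
```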
USB host controller drivers are interesting in that they conform to a standard (such as XHCI, the USB 3.0 host controller standard, which I'll use in this example) which dictates a standard device memory map and behavior. Thus there usually is some chip-specific driver that performs non-standard initialization of the USB host controller device. Additionally, these chip-specific drivers will both retrieve the location of the XHCI-standardized MMIO region and provide a way to receive interrupts from the XHCI controller (in this example, from PCIe interrupts).
Next, this standardized memory region and interrupt mechanism is passed to a generic XHCI host controller driver. The generic XHCI code does not care if the device is PCIe; it just cares that it gets passed a memory region that follows the XHCI standard and that it receives the correct interrupts. The XHCI driver provides the generic USB transfer functions, which in turn the USB keyboard driver can use to initiate USB transactions.
For the most part, the XHCI driver is just going to do reads/writes to the MMIO region that was passed in. This allows the same common XHCI code to service a wide array of USB host controllers, many of which are not PCIe devices, effectively allowing the XHCI driver to abstract away the underlying hardware implementing the USB controller. Thus, for the example posed by the original question, the USB host controller standards are designed to hide the underlying hardware mechanisms to make for a more modular USB driver system.
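A simplified sketch of what that handoff can look like on Linux: the PCIe "glue" finds the controller's BAR, maps it, and passes the standardized register region plus the interrupt line to the generic layer. Here xhci_generic_attach() is a made-up stand-in for the chip-independent XHCI code, not a real kernel function:

```c
// SPDX-License-Identifier: GPL-2.0
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/io.h>

int xhci_generic_attach(void __iomem *regs, int irq); /* hypothetical */

static int xhci_glue_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
    void __iomem *regs;
    int ret;

    ret = pcim_enable_device(pdev);
    if (ret)
        return ret;

    /* BAR 0 holds the XHCI-standardized register set. */
    regs = pcim_iomap(pdev, 0, 0);
    if (!regs)
        return -ENOMEM;

    /* The generic layer never sees PCI; it just gets registers + an IRQ. */
    return xhci_generic_attach(regs, pdev->irq);
}

static const struct pci_device_id xhci_glue_ids[] = {
    { PCI_DEVICE_CLASS(PCI_CLASS_SERIAL_USB_XHCI, ~0) },
    { }
};
MODULE_DEVICE_TABLE(pci, xhci_glue_ids);

static struct pci_driver xhci_glue_driver = {
    .name     = "xhci-glue-sketch",
    .id_table = xhci_glue_ids,
    .probe    = xhci_glue_probe,
};
module_pci_driver(xhci_glue_driver);
MODULE_LICENSE("GPL");
```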

Related

How does a modern operating system like Windows or Linux know the chipset-specific memory map?

The memory map of the peripherals is defined by the chipset. However, modern operating systems like Linux and Windows can boot on pretty much any chipset (if compiled for the right architecture). As far as I know, memory-mapped devices like the USB host controller are not included in the architecture standard. How can the OS still boot, load the drivers, and function? I suppose there must be some specification where the chipset is described.
Formulated a little differently: How does the identification of the chipset work, what standards define the communication between the chipset and the processor so that it works on different hardware, and how does the kernel know the right physical addresses for the different peripherals?
Open systems typically use a device tree, which is a specification of the attached hardware and how it is attached. There is another system, ACPI, which supports legacy PCs. Either system permits an OS to locate and configure the buses and associated peripherals it needs.
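As a concrete (if simplified) illustration of the device tree path on Linux, a platform driver matches on a compatible string and pulls its register addresses out of the matching node's "reg" property; the compatible string and driver name below are made up:

```c
// SPDX-License-Identifier: GPL-2.0
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/of.h>
#include <linux/io.h>

static int example_probe(struct platform_device *pdev)
{
    struct resource *res;
    void __iomem *regs;

    /* The physical address comes from the device tree node's "reg" property. */
    res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
    regs = devm_ioremap_resource(&pdev->dev, res);
    if (IS_ERR(regs))
        return PTR_ERR(regs);

    dev_info(&pdev->dev, "mapped registers at %pR\n", res);
    return 0;
}

static const struct of_device_id example_of_match[] = {
    { .compatible = "acme,example-ctrl" },   /* made-up compatible string */
    { }
};
MODULE_DEVICE_TABLE(of, example_of_match);

static struct platform_driver example_driver = {
    .probe  = example_probe,
    .driver = {
        .name           = "example-ctrl",
        .of_match_table = example_of_match,
    },
};
module_platform_driver(example_driver);
MODULE_LICENSE("GPL");
```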
It is never 100% as easy as that. For example, it is fine for the OS to know there is a SCSI controller on bus 1 at address 1000; but if the code for the SCSI driver isn't in the loaded OS image, then this knowledge is of little use, as it has no way to load the driver.
The Intel specification for ACPI attempts to fix this by having tiny driver implementations baked into the firmware of either the platform, the device itself, or both. Since the device doesn't necessarily know what sort of CPU it will run on, these mini drivers are written in a virtual instruction set for which the host OS needs an interpreter.
UEFI provides an alternate way of addressing the boot dependency, via a more generic mechanism that uses mini boot drivers for the same purpose.

Why does the driver request the USB device to send the USB descriptors?

As far as I understand, USB devices introduce themselves by sending a device descriptor to the host, which uses the information embedded in the descriptor to find and load the right driver/drivers. What I don't understand is why the drivers need the configuration, interface, endpoint and string descriptors from the device. I know the descriptors describe the device as a whole, e.g. number of configurations, interfaces, endpoints, types, the size of the packets, the purpose of each byte in the packet, etc. Why can't the drivers include this information from the start? Why does the USB device hold this information?
I guess the main reason is that it was designed that way. The designers could just as easily have gone the other way, as you say.
Possibly more helpfully, I can think of a few reasons why they did decide on it this way:
Consider the context in which USB came about. PC peripheral connectivity was a mess. You had serial (UART) ports, PS/2 keyboard & mouse, parallel ports in various modes (e.g. ECP and EPP), game port & MIDI, the various SCSI variants, etc. Most of these did not allow for self-describing devices in a standardised way, demanding custom drivers for each type of device, and mostly manual driver loading. Device descriptors alone would mostly solve the problem of selecting the correct driver, but not necessarily the issue of needing a custom driver for each device.
Various standard device class specifications (e.g. HID, audio) define their own class-specific descriptors for communicating device variations to the standard driver. For this reason alone the generalised descriptor mechanism is useful.
Composite devices in their present form would effectively not be possible without standardised interface descriptors. You'd presumably have to make each composite device act as a hub with multiple devices attached.
Many (most?) standard device class specifications mandate a certain minimum number of endpoints where the standardised protocol is implemented (e.g. bulk-only USB mass storage defines 1 bulk input and 1 bulk output endpoint), while devices are free to add more for vendor-specific extensions, or…
…future expansion of the standard device class or USB standard itself while maintaining backwards compatibility both ways (old driver/new device, new driver/old device). Think of UASP devices that fall back to regular bulk-only mass storage transport.
While not strictly necessary for making all those things happen, making devices self-describing in this way does seem to have been a success overall.
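For concreteness, this is roughly the layout of the device descriptor the host fetches first (per the USB 2.0 specification, chapter 9); it is the idVendor/idProduct and class fields in here that drive driver selection:

```c
#include <stdint.h>

/* Standard USB device descriptor (USB 2.0 spec, chapter 9). The host fetches
 * it with a GET_DESCRIPTOR control request on endpoint 0. */
struct usb_device_descriptor {
    uint8_t  bLength;            /* size of this descriptor (18 bytes)        */
    uint8_t  bDescriptorType;    /* 0x01 = DEVICE                             */
    uint16_t bcdUSB;             /* USB spec release in BCD (e.g. 0x0200)     */
    uint8_t  bDeviceClass;       /* class code (0x00 = defined per interface) */
    uint8_t  bDeviceSubClass;
    uint8_t  bDeviceProtocol;
    uint8_t  bMaxPacketSize0;    /* max packet size for endpoint 0            */
    uint16_t idVendor;           /* vendor ID assigned by USB-IF              */
    uint16_t idProduct;          /* product ID assigned by the vendor         */
    uint16_t bcdDevice;          /* device release number in BCD              */
    uint8_t  iManufacturer;      /* index of manufacturer string descriptor   */
    uint8_t  iProduct;
    uint8_t  iSerialNumber;
    uint8_t  bNumConfigurations; /* how many configurations the device offers */
} __attribute__((packed));
```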

How does the OS detect hardware?

Does the OS get this information from the BIOS, or does it scan the buses on its own to detect what hardware is installed on the system? Having looked around online, different sources say different things. Some say the BIOS detects the hardware and then stores it in memory, which the OS then reads; others say the OS scans buses (e.g. PCI) to learn of the hardware.
I would have thought that a modern OS would ignore the BIOS and do it itself.
Any help would be appreciated.
Thanks.
Generally speaking, most modern OSes (Windows and Linux) will re-scan the detected hardware as part of the boot sequence. Trusting the BIOS to detect everything and have it set up properly has proven to be unreliable.
In a typical x86 PC, there are a combination of techniques used to detect attached hardware.
PCI and PCI Express buses have a standard mechanism called Configuration Space that you can scan to get a list of attached devices. This includes devices installed in a PCI/PCIe slot, and also the controller(s) in the chipset (Video Controller, SATA, etc).
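As a rough sketch of what that scan looks like, here is a user-space approximation using the legacy I/O-port configuration mechanism (ports 0xCF8/0xCFC) to read the vendor/device ID of every device on bus 0. It assumes x86 Linux, root privileges, and that the legacy mechanism is available (an OS would typically also, or instead, use memory-mapped ECAM):

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/io.h>   /* outl/inl + iopl(); x86 Linux, needs root */

/* Read a 32-bit PCI configuration register via Configuration Mechanism #1. */
static uint32_t pci_cfg_read32(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off)
{
    uint32_t addr = (1u << 31) | (bus << 16) | (dev << 11) | (fn << 8) | (off & 0xFC);
    outl(addr, 0xCF8);
    return inl(0xCFC);
}

int main(void)
{
    if (iopl(3) < 0) { perror("iopl"); return 1; }

    /* Brute-force scan of bus 0: a device exists if the vendor ID != 0xFFFF. */
    for (uint8_t dev = 0; dev < 32; dev++) {
        uint32_t id = pci_cfg_read32(0, dev, 0, 0x00);
        if ((id & 0xFFFF) != 0xFFFF)
            printf("bus 0 dev %2u: vendor %04x device %04x\n",
                   dev, id & 0xFFFF, id >> 16);
    }
    return 0;
}
```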
If an IDE or SATA controller is detected, the OS/BIOS must talk to the controller to get a list of attached drives.
If a USB controller is detected, the OS/BIOS loads a USB protocol stack, and then enumerates the attached hubs and devices.
For "legacy" ISA devices, things are a little more complicated. Even if your motherboard does not have an ISA slot on it, you typically still have a number of "ISA" devices in the system (Serial Ports, Parallel Ports, etc). These devices typically lack a truly standardized auto-detection method. To detect these devices, there are 2 options:
Probe known addresses - Serial Ports are usually at 0x3F8, 0x2F8, 0x3E8, 0x2E8, so read from those addresses and see if there is something there that looks like a serial port UART (see the sketch after this list). This is far from perfect. You may have a serial port at a non-standard address that is not scanned. You may also have a non-serial-port device at one of those addresses that does not respond well to being probed. Remember how Windows 95 and 98 used to lock up a lot when detecting hardware during installation?
ISA Plug-n-Play - This standard was popular for a hot minute as ISA was phased out in favor of PCI. You probably will not encounter many devices that support this. I believe ISA PnP is disabled by default in Windows Vista and later, but I am struggling to find a source for that right now.
ACPI Enumeration - The OS can rely on the BIOS to describe these devices in ASL code. (See below.)
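Here is a sketch of the kind of probe mentioned in the first option above: the classic 16550 scratch-register test, run from user space on x86 Linux with root privileges. It is deliberately simplistic (older 8250 parts without a scratch register would be missed), which is part of why probing is unreliable:

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/io.h>   /* inb/outb + iopl(); x86 Linux, needs root */

/* Probe for a 16550-style UART: the scratch register (base + 7) should
 * read back whatever was written to it if a UART is present. */
static int probe_uart(uint16_t base)
{
    outb(0x55, base + 7);
    if (inb(base + 7) != 0x55)
        return 0;
    outb(0xAA, base + 7);
    return inb(base + 7) == 0xAA;
}

int main(void)
{
    static const uint16_t ports[] = { 0x3F8, 0x2F8, 0x3E8, 0x2E8 };

    if (iopl(3) < 0) { perror("iopl"); return 1; }

    for (unsigned i = 0; i < sizeof(ports) / sizeof(ports[0]); i++)
        printf("0x%03X: %s\n", (unsigned)ports[i],
               probe_uart(ports[i]) ? "looks like a UART" : "no response");
    return 0;
}
```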
Additionally, there may be a number of non-PnP devices in the system at semi-fixed addresses, such as a TPM chip, HPET, or those "special" buttons on laptop keyboards. For these devices to be explained to the OS, the standard method is to use ACPI.
The BIOS ACPI tables should provide a list of on-motherboard devices to the OS. These tables are written in a language called ASL (or AML for the compiled form). At boot time, the OS reads in the ACPI tables and enumerates any described devices. Note that for this to work, the motherboard manufacturer must have written their ASL code correctly. This is not always the case.
And of course, if all of the auto-detection methods fail you, you may be forced to manually install a driver. You do this through the Add New Hardware Wizard in Windows. (The exact procedure varies depending on the Windows version you have installed.)
I see a lot of info about system hardware here, except for memory, one of the most important parts besides the CPU, which funnily enough isn't really mentioned either.
This is fair, because there are perhaps so many things to enumerate that you kind of lose sight of the forest for the trees.
For memory on x86/64 platforms you will want to query either the BIOS or EFI for a memory map. For BIOS this is int 0x15 with EAX=0xE820. EFI has its own mechanism which provides similar information.
This will show you which memory ranges are reserved by hardware etc., so that your OS knows to leave them alone. (OK, you have to build that part too, of course ;D)
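For reference, each entry returned by the E820 interface has roughly this layout (a sketch of the common 20-byte form; real code has to issue the BIOS call from real mode, or reuse the map the bootloader already collected):

```c
#include <stdint.h>

/* One entry as returned by INT 0x15, EAX=0xE820 (common 20-byte form). */
struct e820_entry {
    uint64_t base;   /* start physical address of the range */
    uint64_t length; /* length of the range in bytes        */
    uint32_t type;   /* 1 = usable RAM, 2 = reserved, 3 = ACPI reclaimable, ... */
} __attribute__((packed));

/* A kernel walks this table and hands only the usable ranges to its
 * physical-memory allocator, leaving everything else alone. */
```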
For other platforms, the OS will often be configured for a fixed memory size, as is common on embedded platforms. There is no BIOS for you, and performing a sort of brute-force probe of memory is unreliable at best. (As far as I know! Not much experience outside of x86/64!)
For the CPU you will definitely want to look into MSRs, control registers and CPUID functions to enumerate the CPU and see what it's capable of. You can query, for example, whether 64-bit mode is supported, along with other features which might not be present on all CPUs.
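A small user-space sketch of that kind of CPUID check (GCC/Clang on x86; an OS would do the equivalent with the raw CPUID instruction before enabling long mode):

```c
#include <cpuid.h>
#include <stdio.h>

/* Check for 64-bit (long mode) support: CPUID leaf 0x80000001, EDX bit 29. */
int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx)) {
        printf("extended CPUID leaf not available\n");
        return 1;
    }
    printf("long mode (64-bit): %s\n", (edx & (1u << 29)) ? "yes" : "no");
    return 0;
}
```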
For other hardware like PCI etc., I would recommend, as myron-semack said, looking into the PCI specification, PCI Express, and importantly ACPI, as implementing that will let you handle hardware and power management a bit more generically / according to newer standards.

Interfacing socket code with a Linux PCI driver

I have two devices that are interfaced with PCI. I also have code for both devices that uses generic socket code. (The devices were originally connected by MII/Ethernet.)
Now, I need to write a PCI device driver to transport packets back and forth between the two devices.
How do I access the file descriptors opened by the socket code? Is this the same as accessing a character device file?
The PCI driver has to somehow capture packets from read() and write() in the code.
Thanks!
The answers to your questions are: (1) You don't, and (2) no.
File descriptors are a user-space concept, and kernel drivers don't interact with user-space concepts. (Yes, they're implemented by the kernel, but other device drivers don't get to play with them directly, and shouldn't play with them even indirectly.)
What you do is implement methods that will receive data that is buffered in a kernel-accessible memory space, and send that to your hardware, and then receive data from your hardware and write it (when asked) to a buffer in kernel-accessible memory.
You'll do this by implementing the character device driver APIs as well as the PCI device driver APIs and then registering your driver as a PCI device, and then a character device. While some of these methods may refer to file structures, they will not be the user-land structures that you know and love.
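A minimal skeleton of that shape might look like the following; the names and the vendor/device ID are placeholders. It registers a PCI driver and exposes a character device via the misc framework, with write() copying user data into a kernel buffer that would then be handed to the hardware:

```c
// SPDX-License-Identifier: GPL-2.0
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/miscdevice.h>
#include <linux/fs.h>
#include <linux/uaccess.h>

static ssize_t mydev_write(struct file *f, const char __user *buf,
                           size_t len, loff_t *off)
{
    char kbuf[256];
    size_t n = len < sizeof(kbuf) ? len : sizeof(kbuf);

    if (copy_from_user(kbuf, buf, n))
        return -EFAULT;
    /* ...hand kbuf off to the hardware via MMIO/DMA here... */
    return n;
}

static const struct file_operations mydev_fops = {
    .owner = THIS_MODULE,
    .write = mydev_write,
};

static struct miscdevice mydev_misc = {
    .minor = MISC_DYNAMIC_MINOR,
    .name  = "mydev",
    .fops  = &mydev_fops,
};

static int mydev_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
    int ret = pcim_enable_device(pdev);

    if (ret)
        return ret;
    /* map BARs, set up DMA, request IRQs here */
    return misc_register(&mydev_misc);
}

static void mydev_remove(struct pci_dev *pdev)
{
    misc_deregister(&mydev_misc);
}

static const struct pci_device_id mydev_ids[] = {
    { PCI_DEVICE(0x1234, 0x5678) },   /* placeholder vendor/device ID */
    { }
};
MODULE_DEVICE_TABLE(pci, mydev_ids);

static struct pci_driver mydev_driver = {
    .name     = "mydev",
    .id_table = mydev_ids,
    .probe    = mydev_probe,
    .remove   = mydev_remove,
};
module_pci_driver(mydev_driver);
MODULE_LICENSE("GPL");
```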
For devices that implement the Ethernet protocols, life is easier because you implement the Net Device Interface instead. This way all you have to write is the parts necessary to get data to and from your hardware.
What you'll need are the specifications for the device hardware: how you control the hardware using PCI registers and regions.
The good news is, you don't have to do this alone -- there's a large community of kernel developers, and several good (and current) books on developing for the Linux kernel (see below).
References
Understanding the Linux Kernel
Linux Device Drivers
Essential Linux Device Drivers

Direct communication between two PCI devices

I have a NIC card and a HDD both connected on PCIe slots in a Linux machine. Ideally, I'd like to funnel incoming packets to the HDD without involving the CPU, or involving it minimally. Is it possible to set up direct communication along the PCI bus like that? Does anyone have pointers as to what to read up on to get started on a project like this?
Thanks all.
Not sure if you are asking about PCI or PCIe. You used both terms, and the answer is different for each.
If you are talking about a legacy PCI bus: The answer is "yes". Board to board DMA is doable. Video capture boards may DMA video frames directly into your graphics card memory for example.
In your example, the NIC could DMA directly to a storage device. However, the data would be quite "raw". Your NIC would have no concept of a filesystem, for example. You also need to make sure you can program the NIC's DMA engine to sit within the confines of your SATA controller's registers. You don't want to walk off the end of the BAR!
If you are talking about a modern PCIe bus: The answer is "typically no, but it depends". Peer-to-peer bus transactions are a funny thing in the PCI Express Spec. Root complex devices are not required to support it.
In my testing, peer-to-peer DMA will work, if your devices are behind a PCIe switch (not directly plugged into the motherboard). However, if your devices are connected directly to the chipset (Root Complex), peer-to-peer DMA will not work, except in some special cases. The most notable special case would be the video capture example I mentioned earlier. The special cases are mentioned in the chipset datasheets.
We have tested the peer-to-peer PCIe DMA with a few different Intel and AMD chipsets and found consistent behavior. Have not tested the most recent generations of chipsets though. (We have discussed the lack of peer-to-peer PCIe DMA support with Intel, not sure if our feedback has had any impact on their Engineering dept.)
Assuming that both the NIC card and the HDD are End Points (or Legacy Endpoints) you cannot funnel traffic without involving the Root Complex (CPU).
PCIe, unlike PCI or PCI-X, is not a bus but a link, thus any transaction from an Endpoint device (say the NIC) would have to travel through the Root Complex (CPU) in order to get to another branch (HDD).