Mapping of Host System Memory to PCI domain Address

My understanding of PCI:
The host CPU is responsible for assigning PCI domain addresses to all other devices on the PCI bus by setting each device's BAR registers in PCI configuration space.
The host CPU can map the PCI address domain into its own domain (i.e. the system domain), so that host-initiated "PCI memory transactions" with devices on the PCI bus can be performed using simple load/store instructions of the host CPU.
Question:
Is it possible for the system memory, i.e. the main memory (actual RAM) of the host, to also be mapped to a PCI domain address, so that when the host system is the target of a "PCI memory transaction" initiated by a device on the PCI bus, main memory is read/written without the intervention of the host CPU?
Additional information: I am working on an embedded system with 3 SH4 processors communicating over a PCI bus.

There are two kinds of memory mapping in the PCIe world: inbound mapping and outbound mapping.
Inbound mapping: the memory space is located on the device, and the host CPU can access the mapped memory space.
Outbound mapping: the memory space is located on the host side, and the device can access the mapped memory space.
The two may look the same, but the distinction is important. With this feature, you don't need any additional memory copy to communicate between the host CPU and the device.
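As a minimal illustration of the first case (the host CPU accessing memory that the device exposes through a BAR), a Linux driver typically just ioremaps the BAR and uses the MMIO accessors. The sketch below is hedged: it assumes a hypothetical device already bound to a driver, and the BAR number and register offset are made up for illustration.

```c
#include <linux/pci.h>
#include <linux/io.h>

/* Hypothetical example: map BAR0 of an already-enabled PCI device and
 * access a (made-up) 32-bit register at offset 0x10. */
static int demo_map_bar(struct pci_dev *pdev)
{
	void __iomem *regs;
	u32 val;

	regs = pci_ioremap_bar(pdev, 0);	/* map BAR0 into kernel virtual space */
	if (!regs)
		return -ENOMEM;

	val = readl(regs + 0x10);		/* MMIO read from device memory */
	writel(val | 0x1, regs + 0x10);		/* MMIO write back to device memory */

	iounmap(regs);
	return 0;
}
```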

I realize this is an old question but I'd like to answer it anyway. When you say "transaction initiated by a device on PCI bus", I assume you mean a read/write initiated by the device to access system memory (RAM). This is called bus mastering on the device (also referred to as DMA), and it can be done by having the host CPU allocate a DMA buffer (e.g. with dma_alloc_coherent()) and having the driver provide the resulting DMA address to the device. Then yes, the device can read/write system memory without host CPU intervention.
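A hedged sketch of what that looks like in a Linux driver follows: the buffer is allocated once, and the bus address (dma_addr_t) is what gets programmed into the device. The register offset DEMO_REG_DMA_ADDR is hypothetical; the real location is device-specific.

```c
#include <linux/pci.h>
#include <linux/dma-mapping.h>
#include <linux/io.h>
#include <linux/kernel.h>

#define DEMO_BUF_SIZE		4096
#define DEMO_REG_DMA_ADDR	0x20	/* hypothetical device register */

static int demo_setup_dma(struct pci_dev *pdev, void __iomem *regs)
{
	void *cpu_addr;		/* kernel virtual address of the buffer */
	dma_addr_t bus_addr;	/* address the device uses on the bus */

	pci_set_master(pdev);	/* enable bus mastering so the device may DMA */

	cpu_addr = dma_alloc_coherent(&pdev->dev, DEMO_BUF_SIZE,
				      &bus_addr, GFP_KERNEL);
	if (!cpu_addr)
		return -ENOMEM;

	/* Tell the device where the buffer lives (device-specific step). */
	writel(lower_32_bits(bus_addr), regs + DEMO_REG_DMA_ADDR);
	writel(upper_32_bits(bus_addr), regs + DEMO_REG_DMA_ADDR + 4);

	/* From now on the device can read/write this buffer directly,
	 * without the CPU touching the data path. */
	return 0;
}
```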

Yes, it is very much possible. You can also use the DMA functionality of PCIe to 'bypass' the CPU.

Related

how drivers work (e.g. PCIe and USB)

I am curious about how drivers in general work. I understand the basic concepts and also how a single driver operates. Where I am confused is how they work when multiple drivers are involved.
Let me explain my question through an example:
Suppose I have a PCIe and a USB interface in hardware. The primary interface to the host (where the driver, OS, and applications reside) is PCIe. The USB interface is accessible to the host through PCIe.
So, in this case, I would have a driver for PCIe as well as for USB.
When data has to be transferred over USB by an application, the application would invoke system/OS calls. These would eventually land in the USB driver.
Is this correct?
Once the USB driver has completed its processing, PCIe calls have to be made. Who does that? Is it the OS or the USB driver itself?
I would assume it is the OS, as otherwise it would break the basic modular philosophy. But a driver calling the OS seems counterintuitive, as I always assumed the flow to be from application to OS to driver and hardware.
Can anyone please throw some light on this topic?
Much like in user space code, there exist standardized APIs for accessing various types of hardware in kernel land (exact usage varies by OS). As a result, it isn't much of a stretch for one device driver to access another device's driver via these standardized APIs. (Warning: USB is a very complex protocol, and many details have been glossed over to keep a long post shorter.)
The original question focused on PCIe-to-USB cards. In this example I think it's helpful to think of there being three "layers" of drivers. The first layer is the PCIe bus controller driver, which handles PCIe-bus-specific functions such as mapping out MMIO for PCIe devices and supporting interrupts from those PCIe devices. The second layer is the USB host controller layer, which provides the functions for issuing standardized USB transactions. Finally, the USB device driver (for example, a USB keyboard driver) sits on top of the stack, using the standardized USB transactions to implement the functionality of the specific USB peripheral device. Calls from the keyboard driver will call down into the USB host controller driver, which in turn may even call down to the PCIe driver. All of this is done in kernel space, even though many separate drivers are involved.
Most PCIe devices do the bulk of their communication with the CPU via MMIO accesses, which appear as memory reads/writes to the processor. Generally no specific driver function is needed to perform the MMIO transfer of data from PCIe to CPU (although there may be some simple access functions to do endian correction or deal with cache issues).
USB host controller drivers are interesting in that they conform to a standard (such as XHCI, the USB 3.0 host controller standard, which I'll use in this example) which dictates a standard device memory map and behavior. Thus there is usually a chip-specific driver that performs any non-standard initialization of the USB host controller device. Additionally, these chip-specific drivers will both retrieve the location of the XHCI standardized MMIO region and provide a way to receive interrupts from the XHCI controller (in this example, from PCIe interrupts).
Next, this standardized memory region and interrupt mechanism are passed to a generic XHCI host controller driver. The generic XHCI code does not care whether the device is PCIe; it just cares that it gets a memory region that follows the XHCI standard and that it receives the correct interrupts. The XHCI driver provides the generic USB transfer functions, which in turn the USB keyboard driver can use to initiate USB transactions.
For the most part, the XHCI driver just does reads/writes to the MMIO region that was passed in. This allows the same common XHCI code to service a wide array of USB host controllers, many of which are not PCIe devices, effectively allowing the XHCI driver to abstract away the underlying hardware implementing the USB controller. Thus, for the example posed by the original question, the USB host controller standards are designed to hide the underlying hardware mechanisms and make for a more modular USB driver system.
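To make the layering concrete, here is a hedged, simplified Linux-style sketch of what the PCIe "glue" layer might look like: it does the PCIe-specific work (enable the device, map the MMIO BAR, discover the IRQ) and then hands everything to a bus-agnostic host controller core. generic_hcd_register() is a hypothetical stand-in for that generic layer, not a real kernel API.

```c
#include <linux/pci.h>
#include <linux/io.h>

/* Hypothetical bus-agnostic core: only needs registers plus an interrupt. */
int generic_hcd_register(struct device *dev, void __iomem *regs, int irq);

static int demo_hcd_pci_probe(struct pci_dev *pdev,
			      const struct pci_device_id *id)
{
	void __iomem *regs;
	int ret;

	ret = pci_enable_device(pdev);		/* PCIe-specific: wake the device */
	if (ret)
		return ret;

	regs = pci_ioremap_bar(pdev, 0);	/* PCIe-specific: map the MMIO BAR */
	if (!regs) {
		pci_disable_device(pdev);
		return -ENOMEM;
	}

	/* Hand off to the generic layer; it neither knows nor cares that the
	 * registers came from a PCIe BAR or that the IRQ is a PCIe interrupt. */
	return generic_hcd_register(&pdev->dev, regs, pdev->irq);
}
```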

Difference Between PCI and PCIe

I have started reading about PCI and PCIe. I came across the statement: "From a software standpoint, PCI and PCI Express devices are essentially the same. PCIe devices have the same configuration space, BARs, and (usually) support the same PCI INTx interrupts."
PCIe uses a serial interface while PCI uses a parallel interface. So how can a Linux driver written for PCI be used for a PCIe device? I am confused. Please help.
regards,
Ajmal
PCI and PCIe are completely different at the physical layer. PCI is parallel, whereas PCIe is serial. The PCI bus is shared by all PCI devices, whereas PCIe gives each device a dedicated link for data transfer.
These differences are taken care of at the software layer, so the programmer doesn't need to worry about them.
PCI supports a 256-byte configuration space. PCIe has a 4 KB configuration space and is backward compatible for the first 256 bytes.
Yeah, PCI is parallel and PCIe is serial, and that change is confined to the PHY layer.
PCI supports the four line-based INTx interrupts (INTA-INTD), while PCIe also supports MSI (up to 32 vectors per function, which came with PCI-X/PCI 2.2) and MSI-X (up to 2048 vectors, which came with PCIe).
AER (Advanced Error Reporting) is supported only by PCIe.
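One easy way to see the config-space difference from Linux userspace is to look at the sysfs config file for a device: it is 256 bytes for a conventional PCI device and 4096 bytes for a PCIe device, while the first bytes (vendor ID, device ID, etc.) are laid out identically for both. A small hedged sketch, with the device address made up for illustration:

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical device address; pick a real one from `ls /sys/bus/pci/devices/`. */
#define CFG_PATH "/sys/bus/pci/devices/0000:00:1f.6/config"

int main(void)
{
    uint8_t cfg[64];
    FILE *f = fopen(CFG_PATH, "rb");
    if (!f) { perror("fopen"); return 1; }

    if (fread(cfg, 1, sizeof(cfg), f) != sizeof(cfg)) {
        fclose(f);
        return 1;
    }

    /* Offsets 0x00-0x03: vendor and device ID -- identical layout on PCI and PCIe. */
    printf("vendor=0x%02x%02x device=0x%02x%02x\n",
           cfg[1], cfg[0], cfg[3], cfg[2]);

    /* The file size reveals the config-space size: 256 B (PCI) vs 4096 B (PCIe). */
    fseek(f, 0, SEEK_END);
    printf("config space size: %ld bytes\n", ftell(f));

    fclose(f);
    return 0;
}
```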

How to get hardware ID of network interface card in UEFI program?

The hardware ID of a NIC has a form like PCI\VEN_8086&DEV_153A&SUBSYS_309717AA&REV_04
I want to get it in a UEFI program, but I haven't found any hints in the UEFI specification.
What you need is EFI_PCI_IO_PROTOCOL.
Refer to UEFI spec 2.6, "13.4 EFI PCI I/O Protocol".
1. Get all PCI device handles by calling gBS->LocateHandleBuffer().
2. Get the EFI_PCI_IO_PROTOCOL instance attached to each PCI device handle (gBS->HandleProtocol()).
3. Call EFI_PCI_IO_PROTOCOL.Pci.Read() to read the PCI configuration space. Everything you need (vendor ID, device ID, subsystem IDs, revision) can be found in the PCI configuration space, as sketched below.
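A minimal EDK2-style sketch of those steps (error handling trimmed; the PCI_TYPE00 layout and field names follow the EDK2 MdePkg headers, so treat this as a starting point rather than a verified implementation):

```c
#include <Uefi.h>
#include <Library/UefiLib.h>
#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/PciIo.h>
#include <IndustryStandard/Pci.h>

// Walk every handle that exposes EFI_PCI_IO_PROTOCOL and print the IDs
// that make up the Windows-style hardware ID (VEN/DEV/SUBSYS/REV).
EFI_STATUS ListPciIds (VOID)
{
  EFI_STATUS  Status;
  UINTN       Count;
  EFI_HANDLE  *Handles;
  UINTN       Index;

  Status = gBS->LocateHandleBuffer (ByProtocol, &gEfiPciIoProtocolGuid,
                                    NULL, &Count, &Handles);
  if (EFI_ERROR (Status)) {
    return Status;
  }

  for (Index = 0; Index < Count; Index++) {
    EFI_PCI_IO_PROTOCOL  *PciIo;
    PCI_TYPE00           Pci;

    Status = gBS->HandleProtocol (Handles[Index], &gEfiPciIoProtocolGuid,
                                  (VOID **)&PciIo);
    if (EFI_ERROR (Status)) {
      continue;
    }

    // Read the first 64 bytes of config space in 32-bit chunks.
    Status = PciIo->Pci.Read (PciIo, EfiPciIoWidthUint32, 0,
                              sizeof (Pci) / sizeof (UINT32), &Pci);
    if (EFI_ERROR (Status)) {
      continue;
    }

    Print (L"PCI\\VEN_%04X&DEV_%04X&SUBSYS_%04X%04X&REV_%02X\n",
           Pci.Hdr.VendorId, Pci.Hdr.DeviceId,
           Pci.Device.SubsystemID, Pci.Device.SubsystemVendorID,
           Pci.Hdr.RevisionID);
  }

  gBS->FreePool (Handles);
  return EFI_SUCCESS;
}
```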

Memory mapped IO - who maps the addresses to the physical address space?

When we say that a device is memory mapped,
1. Who maps the addresses to the devices?
2. How are these address spaces decided in terms of location and size?
3. Where are these maps stored?
4. Do these address spaces vary across system boots?
Roughly:
1. The MMU hardware.
2. The kernel manages the MMU tables used by the MMU hardware.
3. In a per-process structure. Under Linux, look at /proc/<pid>/maps to see all memory-mapped files and devices.
4. They can, so you should not count on them being fixed.
For further reading, I suggest the Memory Mapping and DMA chapter from Linux Device Drivers, this FAQ, and this stackoverflow question.
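For device MMIO specifically, Linux also exposes each PCI BAR as a resourceN file in sysfs, so a userspace program can ask the kernel to map a device's registers into its own address space with mmap(). A hedged sketch, with the device path and BAR size made up for illustration (needs root):

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical device: BAR0 exposed by sysfs. */
#define BAR0_PATH "/sys/bus/pci/devices/0000:01:00.0/resource0"
#define BAR0_SIZE 4096   /* assumed BAR size; check the device's `resource` file */

int main(void)
{
    int fd = open(BAR0_PATH, O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    /* The kernel sets up the page tables: these virtual pages now point
     * at the physical MMIO range assigned to BAR0 by firmware/kernel. */
    volatile uint32_t *regs = (volatile uint32_t *)mmap(NULL, BAR0_SIZE,
                                                        PROT_READ | PROT_WRITE,
                                                        MAP_SHARED, fd, 0);
    if (regs == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    printf("first register: 0x%08x\n", regs[0]);   /* MMIO read */

    munmap((void *)regs, BAR0_SIZE);
    close(fd);
    return 0;
}
```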

Direct communication between two PCI devices

I have a NIC and an HDD, both connected to PCIe slots in a Linux machine. Ideally, I'd like to funnel incoming packets to the HDD without involving the CPU, or involving it only minimally. Is it possible to set up direct communication along the PCI bus like that? Does anyone have pointers on what to read up on to get started on a project like this?
Thanks all.
Not sure if you are asking about PCI or PCIe. You used both terms, and the answer is different for each.
If you are talking about a legacy PCI bus: the answer is "yes". Board-to-board DMA is doable. Video capture boards may DMA video frames directly into your graphics card's memory, for example.
In your example, the NIC could similarly DMA directly to the storage device. However, the data would be quite "raw"; your NIC has no concept of a filesystem, for example. You also need to make sure you can program the NIC's DMA engine to stay within the confines of your SATA controller's registers. You don't want to walk off the end of the BAR!
If you are talking about a modern PCIe bus: the answer is "typically no, but it depends". Peer-to-peer bus transactions are a funny thing in the PCI Express spec; root complex devices are not required to support them.
In my testing, peer-to-peer DMA will work if your devices are behind a PCIe switch (not directly plugged into the motherboard). However, if your devices are connected directly to the chipset (root complex), peer-to-peer DMA will not work, except in some special cases. The most notable special case is the video capture example I mentioned earlier. The special cases are mentioned in the chipset datasheets.
We have tested peer-to-peer PCIe DMA with a few different Intel and AMD chipsets and found consistent behavior, though we have not tested the most recent generations of chipsets. (We have discussed the lack of peer-to-peer PCIe DMA support with Intel; I'm not sure whether our feedback has had any impact on their engineering department.)
Assuming that both the NIC and the HDD are Endpoints (or Legacy Endpoints), you cannot funnel traffic without involving the Root Complex (CPU).
PCIe, unlike PCI or PCI-X, is not a shared bus but a point-to-point link, so any transaction from an Endpoint device (say, the NIC) has to travel through the Root Complex (CPU) in order to get to another branch (the HDD).