Is PCI "CF8h/CFCh" IO port addresses only applicable to processors with an IO address space? - pci

Some CPUs, like x86 processors, have two address spaces: one for memory and one for I/O, with different instructions to access each.
The PCI 3.0 spec also mentions some important I/O addresses:
Two DWORD I/O locations are used to generate configuration transactions for PC-AT compatible systems. The first DWORD location (CF8h) references a read/write register that is named CONFIG_ADDRESS. The second DWORD address (CFCh) references a read/write register named CONFIG_DATA.
So it seems the PCI 3.0 spec is tightly coupled to processors that implement an I/O address space, and that is the a priori knowledge that SW/FW writers are expected to have.
So what about other processor architectures that don't have an I/O address space, like ARM? How can they interact with the PCI configuration space?
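For concreteness, here is a minimal sketch of how software typically drives the CF8h/CFCh mechanism on an x86/Linux system. The helper name is mine; it assumes the port-I/O wrappers from <sys/io.h> and root privileges for iopl():

```c
/* Minimal sketch of the legacy CF8h/CFCh mechanism on x86/Linux.
 * Assumes port I/O via <sys/io.h> (outl/inl) and that the process
 * has been granted I/O privileges with iopl(3) (requires root).
 * pci_cfg_read32 is an illustrative name, not a standard API. */
#include <stdint.h>
#include <sys/io.h>

#define CONFIG_ADDRESS 0xCF8
#define CONFIG_DATA    0xCFC

static uint32_t pci_cfg_read32(uint8_t bus, uint8_t dev,
                               uint8_t func, uint8_t offset)
{
    /* Bit 31 = enable, bits 23:16 = bus, 15:11 = device,
     * 10:8 = function, 7:2 = dword-aligned register offset. */
    uint32_t addr = (1u << 31) | ((uint32_t)bus << 16) |
                    ((uint32_t)dev << 11) | ((uint32_t)func << 8) |
                    (offset & 0xFC);
    outl(addr, CONFIG_ADDRESS);  /* select the config register */
    return inl(CONFIG_DATA);     /* then read its value        */
}

int main(void)
{
    iopl(3);                                /* request I/O privilege  */
    /* Read vendor/device ID of bus 0, device 0, function 0. */
    uint32_t id = pci_cfg_read32(0, 0, 0, 0x00);
    return (int)(id & 0xFFFF);              /* low 16 bits: vendor ID */
}
```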

The paragraph immediately preceding the one quoted in the question directly addresses it. It says:
Systems must provide a mechanism that allows software to generate PCI configuration transactions. ...
For PC-AT compatible systems, the mechanism for generating configuration transactions is defined and specified in this section. ...
For other system architectures, the method of generating configuration transactions is not defined in this specification.
In other words, systems that are not PC-AT compatible must provide a mechanism, but it is specified elsewhere. The PCI spec isn't tightly coupled to PC-AT systems, but it doesn't define the mechanism for other types of systems.
The paragraph in the question only applies to PC-AT compatible systems.

The quote below, from here, clears things up:
The method for generating configuration cycles is host dependent. In IA machines, special I/O ports are used. On other platforms, the PCI configuration space can be memory-mapped to certain address locations corresponding to the PCI host bridge in the host address domain.
And
I/O space can be accessed differently on different platforms. Processors with special I/O instructions, like the Intel processor family, access the I/O space with in and out instructions. Machines without special I/O instructions will map to the address locations corresponding to the PCI host bridge in the host address domain. When the processor accesses the memory-mapped addresses, an I/O request will be sent to the PCI host bridge, which then translates the addresses into I/O cycles and puts them on the PCI bus.
So for non-IA platforms, MMIO can simply be used instead. And the platform specs should document the memory-mapped address of the PCI host bridge as the a priori knowledge for SW/FW writers.
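As a sketch of what that looks like in practice, here is a hedged ECAM-style memory-mapped configuration read in C. The base address below is hypothetical; on real systems it comes from the platform, e.g. ACPI's MCFG table or the device tree:

```c
/* Hedged sketch: on platforms without an I/O address space, PCI(e)
 * configuration space is exposed as a memory-mapped (ECAM-style)
 * region. ECAM_BASE is a HYPOTHETICAL placeholder; the real value
 * is platform-specific. */
#include <stdint.h>

#define ECAM_BASE 0x40000000UL  /* hypothetical, platform-specific */

static inline uint32_t pcie_cfg_read32(uint8_t bus, uint8_t dev,
                                       uint8_t func, uint16_t offset)
{
    /* ECAM layout: bus[27:20] | device[19:15] | function[14:12]
     * | register offset[11:0]. */
    uintptr_t addr = ECAM_BASE | ((uintptr_t)bus << 20) |
                     ((uintptr_t)dev << 15) | ((uintptr_t)func << 12) |
                     (offset & 0xFFC);
    return *(volatile uint32_t *)addr;  /* a plain memory load */
}
```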
ADD 1 - 14:36 2023/2/5
From the digital design's perspective, the host CPU and the PCIe subsystem are just two separate IP blocks. And the communication between them is achieved by a bunch of digital signals in the form of address/data/control lines. As long as the signals can be conveyed, the communication can be made.
For x86 CPUs, the memory address space and the I/O address space are, at bottom, just different uses of the address lines. I don't think there's any strong reason that memory addresses cannot be used to communicate with the PCIe subsystem. I think it was simply the more logical choice back then to use I/O addresses for PCI, because PCI was deemed to be I/O.
So the really critical thing, I think, is to convey the digital signals in the proper format between IPs. PCIe is independent of CPU architecture and doesn't care which lines are used. For ARM, there's nothing unnatural about using memory addresses, i.e., MMIO. After all, these are digital signals, capable of passing the necessary information properly.

Related

Where are the peripheral registers in STM32? Are they in the Cortex-M core or in the peripheral unit itself?

I have two questions.
[image: memory region of the Cortex-M core CPU]
1- Is the memory of STM32 microcontrollers inside the Cortex-M core or outside of it? And if it is inside the core, why is it not shown in the block diagram in the Cortex-M core generic user guide? [image: block diagram of the Cortex-M core]
2- I'm trying to understand the STM32 architecture, but I'm facing an ambiguity.
[image: USART block diagram]
As you can see in the picture, the reference manual says that the USART unit has some registers (e.g. the Data Register).
But these registers also appear in the memory region of the Cortex-M core (if the answer to the first question is "inside"). Where are they really? Are there two copies of each register? Do they reside in the Cortex-M core or in the peripheral itself?
Is this related to the definition of memory-mapped I/O?
The only storage that's inside the CPU core is the registers (including general-purpose and special-purpose registers). Everything else is external, including RAM and ROM.
The peripheral control registers exist, essentially, inside the peripheral. However they are accessed by the CPU in the same way that it accesses RAM or ROM; that's the meaning of the memory map, it shows you which addresses refer to RAM, ROM, peripheral registers, and other things. (Note that most of the memory map is unused - a 32-bit address space is capable of addressing 4GB of memory, and no microcontroller that I know of has anything like that much storage.) The appropriate component 'responds' to read and write requests on the memory bus depending on the address.
For a basic overview the Wikipedia page on memory-mapped IO is reasonably good.
Note that none of this is specific to the Cortex-M. Almost all modern microprocessor designs use memory mapping. Note also that the actual bus architecture of the Cortex-M is reasonably complex, so any understanding you gain from the Wikipedia article will be of an abstraction of the true implementation.
Look at the image below, showing the block diagram of an STM32 Cortex-M4 processor.
I have highlighted the CPU Core (top left); and other components you can find inside the microcontroller.
The CPU "core", as its name implies, is just the "core"; but the microcontroller also integrates a Flash memory, a RAM, and a number of peripherals; almost everything outside the core (except debugging lines) is accessed by means of the bus matrix, this is equally true for ROM, RAM, and integrated peripherals.
Note that the main difference between a "microprocessor" and a "microcontroller" is that a the latter has dedicated peripherals on board.
Peripherals on STM32 devices are accessed by the CPU via memory-mapped I/O, look at the picture below:
As you can see, although the address space is linear from 0x00000000 to 0xFFFFFFFF, it is partitioned into "segments", e.g. program memory starting at 0x00000000, SRAM at 0x20000000, peripherals at 0x40000000. Specific peripheral registers can be read/written through pointers at specific offsets from the base address.
For this device, USARTs are mapped into the APB1 area, i.e. the address range 0x40000000-0x4000A000. Note that the actual peripheral addresses can differ from device to device.
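To make the pointer-based access concrete, here is a toy sketch in C. The base address and register offsets follow the common STM32F1/F4 USART layout (SR at +0x00, DR at +0x04); always verify against your own device's reference manual:

```c
/* Toy sketch of memory-mapped peripheral access on an STM32F4-class
 * device. USART2 sits on APB1 at 0x40004400 on STM32F4 parts; other
 * devices differ, so check the reference manual. */
#include <stdint.h>

#define USART2_BASE  0x40004400UL
#define USART_SR     (*(volatile uint32_t *)(USART2_BASE + 0x00))
#define USART_DR     (*(volatile uint32_t *)(USART2_BASE + 0x04))
#define USART_SR_TXE (1u << 7)   /* "transmit data register empty" */

static void usart_putc(char c)
{
    while (!(USART_SR & USART_SR_TXE))   /* wait until DR is free    */
        ;
    USART_DR = (uint8_t)c;               /* an ordinary store; the bus
                                            matrix routes it to the
                                            peripheral's register    */
}
```

From the CPU's perspective these are ordinary loads and stores; it is the address decoding in the bus matrix that routes them to the USART instead of to SRAM.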
Peripherals are connected to the core via buses. The address decoder knows which address is handled by which bus.
Not only peripherals are connected via buses; memories are connected the same way. Buses are interconnected via bridges, and those bridges know how to direct the traffic.
From the core's point of view, a peripheral register works the same way as a memory cell.
What about the gaps? Usually, if the address decoder does not recognize an address, it generates an exception - a hardware error (called a HardFault in ARM terminology).
The details are very complicated, but unless you are going to design your own chip, they are not needed by the register-level programmer.

How does a CPU generate logical addresses?

The CPU generates logical addresses. These logical addresses are then converted into physical addresses by a special unit, the MMU. This is written in many books, including Galvin (slides 6-7).
But I want to know how the CPU generates a logical address, and what that means.
It is just a simplification.
The CPU doesn't generate logical addresses; they are stored in your executable file. The CPU reads your program and extracts these addresses.
Here (slide 7) Galvin says:
In MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory.
The user program deals with logical addresses; it never sees the real physical addresses.
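A toy model of that relocation-register scheme, just to fix ideas (the struct and function names here are mine, not a real MMU API):

```c
/* Toy model of the relocation-register scheme Galvin describes:
 * every logical address issued by the user process is offset by the
 * relocation (base) register and checked against a limit register. */
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uintptr_t relocation; /* base: added to every logical address */
    uintptr_t limit;      /* size of the process's logical space  */
} mmu_t;

static int translate(const mmu_t *mmu, uintptr_t logical,
                     uintptr_t *physical)
{
    if (logical >= mmu->limit)
        return -1;                         /* trap: addressing error */
    *physical = mmu->relocation + logical; /* the MMU's only job here */
    return 0;
}

int main(void)
{
    mmu_t mmu = { .relocation = 0x140000, .limit = 0x10000 };
    uintptr_t phys;
    if (translate(&mmu, 0x0346, &phys) == 0)
        printf("logical 0x0346 -> physical 0x%lx\n",
               (unsigned long)phys);       /* prints 0x140346 */
    return 0;
}
```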
The CPU does not generate logical addresses. Logical to physical address mapping is defined by the operating system. The operating system sets up page tables that define the mapping.
The processor defines the structure of the page tables. The operating system defines the content of the page tables.
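A toy, single-level model of that division of labor (all names are mine; real x86 tables are multi-level):

```c
/* The CPU fixes HOW the tables are walked (the walk() function's
 * logic, i.e. the page-table format); the OS decides WHAT physical
 * frames go into the table. A flat 4 KiB-page table for brevity. */
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_MASK  0xFFFu
#define NUM_PAGES  1024

/* The OS fills this in: each entry maps one logical page to the
 * base address of a physical frame. */
static uintptr_t page_table[NUM_PAGES];

/* The "hardware" walk, fixed by the processor's page-table format. */
static uintptr_t walk(uintptr_t logical)
{
    uintptr_t vpn    = logical >> PAGE_SHIFT;  /* virtual page number */
    uintptr_t offset = logical & PAGE_MASK;    /* byte within page    */
    return page_table[vpn] + offset;           /* physical address    */
}
```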

Memory mapped IO - how is it done?

I've read about the difference between port-mapped I/O and memory-mapped I/O, but I can't figure out how memory-mapped I/O is implemented in modern operating systems (Windows or Linux).
What I know is that a part of the physical memory is reserved to communicate with the hardware, and that there's an MMIO unit involved that takes care of the bus communication and other memory-related work.
How would a driver communicate with underlying hardware? What are the functions that the driver would use? Are the addresses to communicate with a video card fixed or is there some kind of "agreement" before using them?
I'm still rather confused
The following statement in your question is wrong:
What I know is that a part of the physical memory is reserved to communicate with the hardware
A part of the physical memory is not reserved for communication with the hardware. A part of the physical address space, to which the physical memory and memory-mapped IO are mapped, is. This memory layout is permanent, but user programs do not see it directly - instead, they run in their own virtual address space, to which the kernel can map physical memory and IO ranges wherever it wants.
You may want to read the following articles which I believe contain answers to most of your questions:
http://duartes.org/gustavo/blog/post/motherboard-chipsets-memory-map
http://duartes.org/gustavo/blog/post/memory-translation-and-segmentation
http://duartes.org/gustavo/blog/post/how-the-kernel-manages-your-memory
http://en.wikipedia.org/wiki/Memory-mapped_I/O
http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/IO/mapped.html
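Regarding the "what functions would the driver use" part of the question: in a Linux kernel driver, the usual pattern is ioremap() plus readl()/writel(). A hedged sketch follows; the physical base and register offset are hypothetical placeholders, and a real driver would discover them from the bus (e.g. a PCI BAR) rather than hardcode them:

```c
/* Hedged sketch of how a Linux kernel driver talks to memory-mapped
 * hardware: ioremap() maps the device's physical MMIO range into
 * kernel virtual address space; readl()/writel() do the accesses.
 * DEV_PHYS_BASE and REG_STATUS are HYPOTHETICAL. */
#include <linux/io.h>
#include <linux/module.h>

#define DEV_PHYS_BASE 0xFEB00000UL  /* hypothetical MMIO base     */
#define DEV_REGION_SZ 0x1000
#define REG_STATUS    0x04          /* hypothetical register      */

static void __iomem *regs;

static int __init demo_init(void)
{
    regs = ioremap(DEV_PHYS_BASE, DEV_REGION_SZ);
    if (!regs)
        return -ENOMEM;
    pr_info("status = 0x%08x\n", readl(regs + REG_STATUS));
    return 0;
}

static void __exit demo_exit(void)
{
    iounmap(regs);                  /* release the mapping */
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
```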
Essentially it is just a form of accessing the data, as if you were saving to / reading from memory. But the hardware snoops on the address bus, and when it sees an address targeting it, it receives the data on the data bus.
Are you asking about Memory mapped files, or memory mapped port-IO?
Memory-mapped files are done by paging out the pages and intercepting page faults on those addresses. This is all done by the OS, through negotiation between the file-system manager and the page-fault handler.
Memory-mapped port-IO is done at the CPU level by overloading address lines as port-IO lines, which allows writes to memory to be translated onto the QPI bus lines as port-IO. This is all done by the processor interacting with the motherboard. The only other thing the OS needs to do is to tell the MMU not to coalesce reads and writes, via the PAE must-writethrough and no-cache bits.

What is the difference between memory-mapped I/O and I/O-mapped I/O?

Please explain the difference between memory-mapped I/O and I/O-mapped I/O.
Uhm... unless I misunderstood, you're talking about two completely different things. I'll give you two very short explanations so you can google what you need to know.
Memory-mapped I/O means mapping I/O hardware devices' memory into the main memory map. That is, there will be addresses in the computer's memory that won't actually correspond to your RAM, but to internal registers and memory of peripheral devices. This is the machine architecture Pointy was talking about.
There's also memory-mapped file I/O, which means taking (say) a file and having the OS load portions of it into memory for faster access later on. In Unix, this can be accomplished through mmap().
I hope this helped.
On x86 there are two different address spaces, one for memory, and another one for I/O ports.
The port address space is limited to 65536 ports, and is accessed using the IN/OUT instructions.
As an example, a video card's VGA functionality can be accessed using some I/O ports, but the framebuffer is memory-mapped.
Other CPU architectures only have one address space. In those architectures, all devices are memory-mapped.
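A short sketch contrasting the two styles on x86/Linux (the port numbers and the mapped register address are purely illustrative):

```c
/* Side-by-side sketch of the two access styles.
 * The port numbers and MMIO register are HYPOTHETICAL examples. */
#include <stdint.h>
#include <sys/io.h>   /* inb/outb port-I/O wrappers (x86, needs iopl) */

void port_mapped_example(void)
{
    /* Port-mapped I/O: a separate address space, reached only
     * through special instructions (IN/OUT under the hood). */
    outb(0x42, 0x70);              /* write 0x42 to port 0x70 */
    uint8_t v = inb(0x71);         /* read from port 0x71     */
    (void)v;
}

void memory_mapped_example(volatile uint32_t *device_reg)
{
    /* Memory-mapped I/O: the device register is just an address;
     * ordinary load/store instructions reach it through the bus. */
    *device_reg = 0x42;            /* ordinary store */
    uint32_t v = *device_reg;      /* ordinary load  */
    (void)v;
}
```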
Memory mapped I/O is mapped into the same address space as program memory and/or user memory, and is accessed in the same way.
Port mapped I/O uses a separate, dedicated address space and is accessed via a dedicated set of microprocessor instructions.
As 16-bit processors have become obsolete and been replaced with 32-bit and 64-bit processors in general use, reserving ranges of memory address space for I/O is less of a problem, as the memory address space of the processor is usually much larger than the space required for all memory and I/O devices in a system.
Therefore, it has become more frequently practical to take advantage of the benefits of memory-mapped I/O.
The disadvantage to this method is that the entire address bus must be fully decoded for every device. For example, a machine with a 32-bit address bus would require logic gates to resolve the state of all 32 address lines to properly decode the specific address of any device. This increases the cost of adding hardware to the machine.
The advantage of an I/O-mapped I/O system is that less logic is needed to decode a discrete address, and therefore it costs less to add hardware devices to a machine. However, more instructions may be needed.
I have one more clear difference between the two: a memory-mapped I/O device is one that responds when IO/M is low, while an I/O-mapped (peripheral-mapped) I/O device is one that responds when IO/M is high (IO/M being the status signal on 8085-style processors).
The difference between the two schemes lies within the microprocessor/microcontroller itself. Intel has, for the most part, used the I/O-mapped scheme for its microprocessors, while Motorola has used the memory-mapped scheme.
https://techdhaba.com/2018/06/16/memory-mapped-i-o-vs-i-o-mapped-i-o/

How exactly does an OS protect the kernel?

My question is: how exactly does an operating system protect its kernel?
From what I've found, there are basically two modes, kernel and user. And there should be some bits in memory segments that tell whether a memory segment belongs to kernel or user space. But where do those bits originate? Is there some "switch" in the compiler that marks programs as kernel programs? And, for example, if a driver runs in kernel mode, how does the OS manage its integration into the system so that no malicious software is added as a driver?
If someone could enlighten me on this issue, I would be very grateful, thank you
The normal technique is to use a feature of the virtual memory manager (the MMU) present in most modern CPUs.
The way that piece of hardware works is that it keeps a cache of fragments of memory, along with a list of the addresses to which they correspond. When a program tries to read memory that is not present in that cache, the MMU doesn't just go and fetch it from main RAM, because the addresses in the cache are only 'logical' addresses. Instead, it invokes another program that will interpret the address and fetch that memory from wherever it should be.
That program, called a pager, is supplied by the kernel, and special flags in the MMU prevent that program from being overridden.
If that program determines that the address corresponds to memory the process should get to use, it supplies the MMU with the physical address in main memory that corresponds to the logical address the user program asked for, the MMU fetches it into its cache, and resumes running the user program.
If that address is a 'special' address, like for a memory mapped file, then the kernel fetches the corresponding part of the file into the cache and lets the program run along with that.
If the address is in the range that belongs to the kernel, or the program hasn't allocated that address to itself yet, the pager raises a SEGFAULT, killing the program.
Because the addresses are logical addresses, not physical addresses, different user programs may use the same logical addresses to mean different physical addresses; the kernel's pager program and the MMU make this all transparent and automatic.
This level of protection is not available on older CPUs (like the 80286) and on some very low-power devices (like ARM Cortex-M3 or ATtiny CPUs), because there is no MMU; all addresses on these systems are physical addresses, with a 1-to-1 correspondence between RAM and address space.
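To pin down where those "kernel or user" bits actually live on x86: each page-table entry carries permission flags that the CPU checks in hardware on every access. A sketch of that check follows (bit positions per the x86 PTE format; the function itself is purely illustrative):

```c
/* Sketch of x86 page-table entry permission bits. When a user-mode
 * access touches a supervisor-only page, the CPU raises a page fault
 * instead of completing the access; this function models that check. */
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT (1u << 0)  /* page is mapped                    */
#define PTE_WRITE   (1u << 1)  /* writable                          */
#define PTE_USER    (1u << 2)  /* accessible from user mode (CPL 3) */

static bool access_allowed(uint64_t pte, bool user_mode, bool is_write)
{
    if (!(pte & PTE_PRESENT))
        return false;                      /* not mapped: page fault */
    if (user_mode && !(pte & PTE_USER))
        return false;                      /* kernel-only page       */
    if (is_write && !(pte & PTE_WRITE))
        return false;                      /* read-only page         */
    return true;
}
```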
The “switch” is actually in the processor itself. Some instructions are only available in kernel mode (a.k.a. ring 0 on i386). Switching from kernel mode to user mode is easy. However, there are not so many ways to switch back to kernel mode. You can either:
send an interrupt to the processor
make a system call.
In either case, the operation has the side effect of transferring the control to some trusted, kernel code.
When a computer boots up, it starts running code from some well known location. That code ultimately ends up loading some OS kernel to memory and passing control to it. The OS kernel then sets up the CPU memory map via some CPU specific method.
And for example if driver is in kernel mode how does OS manages its integration to system so there is not malicious software added as a driver?
It actually depends on the OS architecture. I will give you two examples:
Linux kernel: Driver code can be very powerful. The levels of protection are the following:
a) A driver is allowed to access only a limited number of symbols in the kernel, specified using EXPORT_SYMBOL (a minimal sketch follows after this list). The exported symbols are generally functions. But nothing prevents a driver from trashing the kernel using wild pointers, and the security provided by EXPORT_SYMBOL is nominal.
b) A driver can only be loaded by a privileged user who has root permission on the box. So as long as root privileges are not breached, the system is safe.
Micro kernel like QNX: The operating system exports enough interfaces to user space that a driver can be implemented as a user-space program. Hence the driver at least cannot easily trash the system.
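A minimal sketch of the EXPORT_SYMBOL mechanism from point a); the exported function here is made up:

```c
/* Only symbols explicitly exported by the kernel (or by other
 * modules) can be resolved when a module is loaded; everything
 * else is invisible to module code. demo_get_answer is a made-up
 * example function. */
#include <linux/module.h>

int demo_get_answer(void)
{
    return 42;
}
EXPORT_SYMBOL(demo_get_answer);  /* now resolvable by other modules */

MODULE_LICENSE("GPL");
```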