What are PHY pages (physical-layer/PHY registers), as in the "phy_write_paged" function? - linux-device-driver

I am reading the Realtek driver code for the r8169 NIC, and it does some PHY register / PHY config register writing with functions like these:
pci_write_config_byte(tp->pci_dev, PCI_LATENCY_TIMER, 0x40); // this must be a PCI config register write. But what is phy_write_paged?
Are there memory pages in the physical-layer handling done by the operating system? If so, is it the same concept as kernel pages used for mapping virtual memory into kernel memory? I assume the driver needs to deal with:
MMIO registers
PHY registers
PHY config registers
PHY paged memory representation
for handling devices.
Please explain what all of the above are and how they are handled.

PHY registers are accessed via packets on a serial management bus known as MDIO, SMI or MIIM, depending on who you ask. The original packet format on this bus, as defined by Clause 22 of IEEE 802.3, supports access to up to 32 registers on 32 different PHY addresses. The first 16 registers are defined by IEEE 802.3 and the remaining 16 are defined by the PHY vendor.
If the PHY supported more than 32 registers, the vendor could define one of the vendor-specified registers as a "page select" register to select different banks of 32 vendor-specified registers. That is what the phy_read_paged and phy_write_paged functions do. They select the page, read or write the register, and restore the original page, all while holding a lock on the bus to prevent interference from other code that is trying to access the registers.
A later revision of IEEE 802.3 defined an optional extension of the MDIO packet format in Clause 45 that allowed each of the 32 PHY addresses to support up to 32 device addresses (the different addresses are for different, defined uses), with up to 65536 "MMD" registers for each device address. An even later revision defined registers 13 and 14 of the original Clause 22 register set to be used to access the MMD registers indirectly. The phy_read_mmd and phy_write_mmd functions are for accessing these MMD registers (if supported). Some PHY chips may support Clause 45 directly, others may use Clause 22 registers 13 and 14 to access the MMD registers, others may have some custom way to access the MMD registers, and others might not support the MMD registers at all. The phy_read_mmd and phy_write_mmd functions take care of the differences in methods for accessing the MMD registers.

Related

Is PCI "CF8h/CFCh" IO port addresses only applicable to processors with an IO address space?

Some CPUs, like x86 processors, have two address spaces: one for memory and one for I/O, with different instructions to access them.
The PCI 3.0 spec also mentions some important I/O addresses:
Two DWORD I/O locations are used to generate configuration
transactions for PC-AT compatible systems. The first DWORD location
(CF8h) references a read/write register that is named CONFIG_ADDRESS.
The second DWORD address (CFCh) references a read/write register named
CONFIG_DATA.
So it seems the PCI 3.0 spec is tightly coupled to processors that do implement an I/O address space, and that is a priori knowledge that SW/FW writers should have.
So what about other processor architectures that don't have an I/O address space, like ARM? How can they interact with the PCI configuration space?
The paragraph immediately preceding the one quoted in the question directly addresses the question. It says:
Systems must provide a mechanism that allows software to generate PCI configuration transactions. ...
For PC-AT compatible systems, the mechanism for generating configuration transactions is defined and specified in this section. ...
For other system architectures, the method of generating configuration
transactions is not defined in this specification.
In other words, systems that are not PC-AT compatible must provide a mechanism, but it is specified elsewhere. The PCI spec isn't tightly coupled to PC-AT systems, but it doesn't define the mechanism for other types of systems.
The paragraph in the question only applies to PC-AT compatible systems.
The quote below (from here) clears things up:
The method for generating configuration cycles is host dependent. In
IA machines, special I/O ports are used. On other platforms, the PCI
configuration space can be memory-mapped to certain address locations
corresponding to the PCI host bridge in the host address domain.
And
I/O space can be accessed differently on different platforms.
Processors with special I/O instructions, like the Intel processor
family, access the I/O space with in and out instructions. Machines
without special I/O instructions will map to the address locations
corresponding to the PCI host bridge in the host address domain. When
the processor accesses the memory-mapped addresses, an I/O request
will be sent to the PCI host bridge, which then translates the
addresses into I/O cycles and puts them on the PCI bus.
So for non-IA platforms, MMIO can simply be used instead. The platform spec should then document the memory-mapped address of the PCI host bridge as the a priori knowledge for SW/FW writers.
ADD 1 - 14:36 2023/2/5
From the digital design's perspective, the host CPU and the PCIe subsystem are just two separate IP blocks. And the communication between them is achieved by a bunch of digital signals in the form of address/data/control lines. As long as the signals can be conveyed, the communication can be made.
For x86 CPUs, the memory address space and the I/O address space are, at bottom, just different uses of the address lines. I don't think there's any strong reason memory addresses couldn't have been used to communicate with the PCIe subsystem; using I/O addresses was simply the more natural choice back then, because PCIe was seen as I/O.
So the really critical thing, I think, is to convey the digital signals between the IPs in the proper format. PCIe is independent of CPU architecture and cares nothing about which lines are used. For ARM, there's nothing unnatural about using memory addresses, i.e., MMIO. After all, they're digital signals and are perfectly capable of carrying the necessary information.

How does SEND bandwidth improve when the registered memory is aligned to system page size? (In Mellanox IBD)

Operating System: RHEL Centos 7.9 Latest
Operation:
Sending 500 MB chunks 21 times from one system to another, connected via Mellanox cables.
(Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6])
(The registered memory region (500MB) is reused for all the 21 iterations.)
The gain in Message Send Bandwidth when using aligned_alloc() (with system page size 4096B) instead of malloc() for the registered memory is around 35Gbps.
with malloc() : ~86Gbps
with aligned_alloc() : ~121Gbps
Since the CPU is not involved in these operations, how is this operation faster with aligned memory?
Please provide useful reference links if available that explains this.
What change does aligned memory bring to the read/write operations?
Is it the address translation within the device that gets improved?
[Very limited information is available on the internet about this, hence asking here.]
RDMA operations use either MMIO or DMA to transfer data from the main memory to the NIC via the PCI bus - DMA is used for larger transfers.
The behavior you're observing can be entirely explained by the DMA component of the transfer. DMA operates at the physical level, and a contiguous region in the Virtual Address Space is unlikely to be mapped to a contiguous region in the physical space. This fragmentation incurs costs - there's more translation needed per unit of transfer, and DMA transfers get interrupted at physical page boundaries.
[1] https://www.kernel.org/doc/html/latest/core-api/dma-api-howto.html
[2] Memory alignment

Where are the peripheral registers in STM32? Are they in the Cortex-M core or in the peripheral unit itself?

I have two questions.
memory region of the cortex-m core cpu
1- Is the memory of the STM32 microcontrollers inside the Cortex-M core or outside of it? And if it is inside the core, why is it not shown in the block diagram of the Cortex-M generic user guide? block diagram of the cortex-m core
2- I'm trying to understand the STM32 architecture, but I'm facing an ambiguity.
usart block diagram
As you can see in the picture, the reference manual says that the USART unit has some registers (e.g. the Data Register).
But these registers would also exist in the memory region of the Cortex-M core (if the answer to the first question is "inside"). Where are they really? Are there two copies of each register? Do they reside in the Cortex-M core or in the peripheral itself?
Is this related to the definition of memory-mapped I/O?
The only storage that's inside the CPU core is the registers (including general-purpose and special-purpose registers). Everything else is external, including RAM and ROM.
The peripheral control registers exist, essentially, inside the peripheral. However they are accessed by the CPU in the same way that it accesses RAM or ROM; that's the meaning of the memory map, it shows you which addresses refer to RAM, ROM, peripheral registers, and other things. (Note that most of the memory map is unused - a 32-bit address space is capable of addressing 4GB of memory, and no microcontroller that I know of has anything like that much storage.) The appropriate component 'responds' to read and write requests on the memory bus depending on the address.
For a basic overview the Wikipedia page on memory-mapped IO is reasonably good.
Note that none of this is specific to the Cortex-M. Almost all modern microprocessor designs use memory mapping. Note also that the actual bus architecture of the Cortex-M is reasonably complex, so any understanding you gain from the Wikipedia article will be of an abstraction of the true implementation.
Look at the image below, showing the block diagram of an STM32 Cortex-M4 processor.
I have highlighted the CPU Core (top left); and other components you can find inside the microcontroller.
The CPU "core", as its name implies, is just the "core"; but the microcontroller also integrates a Flash memory, a RAM, and a number of peripherals; almost everything outside the core (except debugging lines) is accessed by means of the bus matrix, this is equally true for ROM, RAM, and integrated peripherals.
Note that the main difference between a "microprocessor" and a "microcontroller" is that the latter has dedicated peripherals on board.
Peripherals on STM32 devices are accessed by the CPU via memory-mapped I/O, look at the picture below:
As you can see, although the address space is linear from 0x00000000 to 0xFFFFFFFF, it is partitioned into "segments", e.g., program memory starting at 0x00000000, SRAM at 0x20000000, peripherals at 0x40000000. Specific peripheral registers can be read/written through pointers at specific offsets from the base address.
For this device, the USARTs are mapped to the APB1 area, thus in the address range 0x40000000-0x4000A000. Note that actual peripheral addresses can differ from device to device.
Peripherals are connected to the core via buses. The address decoder knows which addresses are handled by which bus.
It is not only peripherals that are connected via buses; memories are connected the same way. Buses are interconnected via bridges, and those bridges know how to direct the traffic.
From the core's point of view, a peripheral register works the same way as a memory cell.
What about the gaps? Usually, if the address decoder does not recognize an address, it will generate an exception - a hardware error (in ARM terminology, a HardFault).
The details are very complicated and, unless you are going to design your own chip, not needed by the register-level programmer.

Is a microkernel possible without an MMU?

In the following link;
https://www.openhub.net/p/f9-kernel
The F9 Microkernel runs on Cortex-M, but the Cortex-M series doesn't have an MMU. My knowledge of MMUs and virtual memory is limited, hence the following questions.
How is visibility of the entire physical memory prevented for each process without an MMU?
Is it possible to achieve isolation with some static memory settings without an MMU (with enough on-chip RAM to run my application and kernel, and just different hard-coded memory regions for my limited set of processes)? But even then, will this actually prevent the accesses?
ARM Cortex-M processors lack an MMU, but an optional memory protection unit (MPU) exists in some implementations, such as STMicroelectronics' STM32F series.
Unlike other L4 kernels, the F9 microkernel is designed for MPU-only environments, optimized for Cortex-M3/M4, where the ARMv7 Protected Memory System Architecture (PMSAv7) model is supported. The system address space of a PMSAv7-compliant system is protected by an MPU. Also, the available RAM is typically small (about 256 KB), but a larger physical address space (up to 32-bit) can be used with the aid of bit-banding.
MPU-protected memory is divided into a set of regions, with the number of regions supported being IMPLEMENTATION DEFINED. For example, the STM32F429 provides 8 separate memory regions. In PMSAv7, the minimum protection region size is 32 bytes, and the maximum is up to 4 GB. The MPU provides full control over:
Protection region
Overlapping protection region
Access permissions
Exporting memory attributes to the system
MPU mismatches and permission violations invoke the programmable priority MemManage fault handler.
Memory management in the F9 microkernel can be split into three concepts:
memory pool, which represents an area of the physical address space with specific attributes (hardcoded in the memory map table).
address space - a sorted list of fpages bound to particular thread(s).
flexible page - unlike a traditional page in L4, an fpage is represented by an MPU region instead.
Yes, but ....
There is no requirement for an MMU at all, things just get less convenient and flexible. Practically, anything that provides some form of isolation (e.g. MPU) might be good enough to make a system work - assuming you do need isolation at all. If you don't need it for some reason and just want the kernel to do scheduling, then a kernel can do this without an MMU or MPU also.

Multicore CPUs, Different types of CPUs and operating systems

An operating system should support a CPU architecture, not a specific CPU. For example, if some company has three types of CPUs, all based on the x86 architecture -
one a single-core processor, another dual-core, and the last with five cores - the operating system isn't CPU-model based, it's architecture based. So how does the kernel know whether the CPU it is running on supports multi-core processing, or how many cores it even has?
Also, consider timer interrupts: some versions of Intel's i386 processor family use the PIT and others use the APIC timer to generate periodic timed interrupts. How does the operating system recognize which one is present if it wants, for example, to configure it? (Specifically regarding timers, I know they are usually set up by the BIOS, but the ISRs for timer interrupts should also recognize the timer mechanism they are running on in order to disable/enable/modify it when handling an interrupt.)
Is there such a thing as a CPU driver that is relevant to the OS and not the BIOS? Also, if someone could refer me to somewhere I could learn more about how multi-core processing is triggered/implemented by the kernel in terms of code, that would be great.
The operating system kernel almost always has an abstraction layer called the HAL, which provides an interface above the hardware that the rest of the kernel can easily use. This HAL is architecture-dependent, not model-dependent. The CPU architecture has to define some invocation method that allows the HAL to discover which features are and are not present in the executing processor.
On the IA-32/64 architecture, there is an instruction known as CPUID. You may ask another question here:
Was CPUID present from the beginning?
No, CPUID wasn't present in the earliest CPUs; it arrived considerably later, first appearing in late-model i486 processors. Bit 21 in the EFLAGS register indicates support for the CPUID instruction, according to Intel Manual Volume 2A.
PUSHFD
Using the PUSHFD instruction, you can copy the contents of the EFLAGS register onto the stack and check whether bit 21 is set (more precisely, CPUID support is indicated by that bit being writable: toggle it and see whether the change sticks).
How does CPUID return information, if it is just an instruction?
The CPUID instruction returns processor identification and feature information in the EAX, EBX, ECX, and EDX registers. Its output depends on the values put into the EAX and ECX registers before execution.
Each value (that is valid for CPUID) that can be put in the EAX register is known as a CPUID leaf. Some leaves have subleaves, i.e., they additionally depend on a sub-leaf value in the ECX register.
How is multi-core support detected at the OS kernel level?
There is a standard known as ACPI (Advanced Configuration and Power Interface) which defines a set of ACPI tables. These include the MADT, or Multiple APIC Description Table. This table contains entries with information about local APICs, I/O APICs, interrupt redirections, and much more. Each local APIC is associated with exactly one logical processor, as you should know.
Using this table, the kernel can get the APIC ID of each local APIC present in the system (only those whose CPUs are working properly). The APIC ID itself is divided bit-by-bit into topological IDs, whose field offsets are obtained using CPUID. This lets the OS know where each CPU is located - its domain, chip, core, and hyperthreading ID.