PCIe Understanding

As this domain is new to me, I have some confusion understanding PCIe.
I was previously working with protocols like I2C, SPI, UART, and CAN, and most of them have well-defined docs (300 pages at most).
In almost all of these protocols, from a software perspective, the application just has to write to a data register and the rest is taken care of by the hardware.
For example, in UART, we just load data into the data register and it is sent out with start, parity, and stop bits.
I have read a few things about PCIe online, and here is the understanding I have so far.
During system boot, the BIOS firmware figures out the memory space required by the PCIe device through a magic write-and-read procedure on the BARs in the PCIe device (endpoint).
Once it figures that out, it allocates an address space for the device in the system memory map (no actual RAM is used in the host; the memory resides only in the endpoint, which is memory-mapped into the host).
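For reference, that write-and-read sizing trick looks roughly like this (a minimal sketch; pci_cfg_read32()/pci_cfg_write32() are made-up placeholder config-space accessors, not a real API):

#include <stdint.h>

#define BAR0_OFFSET 0x10   /* standard offset of BAR0 in the Type 0 header */

/* Placeholder config-space accessors (platform specific). */
extern uint32_t pci_cfg_read32(uint32_t bdf, uint32_t offset);
extern void pci_cfg_write32(uint32_t bdf, uint32_t offset, uint32_t value);

/* Probe the size of a 32-bit memory BAR: write all 1s, read back,
 * mask off the low flag bits, invert, and add 1. */
static uint32_t bar0_size(uint32_t bdf)
{
    uint32_t original = pci_cfg_read32(bdf, BAR0_OFFSET);
    pci_cfg_write32(bdf, BAR0_OFFSET, 0xFFFFFFFFu);
    uint32_t readback = pci_cfg_read32(bdf, BAR0_OFFSET);
    uint32_t size = ~(readback & ~0xFu) + 1;      /* low 4 bits are flag bits */
    pci_cfg_write32(bdf, BAR0_OFFSET, original);  /* restore the original value */
    return size;
}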
I see that PCIe has a few header fields that the BIOS firmware fills in during the bus enumeration phase.
Now, if the host wants to set a bit in a configuration register located at address 0x10000004 (an address mapped to the endpoint), it would do something like this (assume just one endpoint exists, with no branches):
*(volatile uint32_t *)0x10000004 |= (1 << Bit_pos);
1. How does the Root Complex know where to direct these requests, given that the BAR is in the endpoint?
Does the RC broadcast to all endpoints, and each endpoint compares the address against the address programmed in its BAR to decide whether to accept it (like an acceptance filter in CAN)?
Does the RC add all the PCIe header-related info (the host just writes to the address)?
If the host writes to 0x10000004, will it write to the register at offset 0x4 in the endpoint?
How does the host know the endpoint has been given an address space starting at 0x10000000?
Is the RC like a router?
The above queries relate only to reading or writing a config register in the endpoint.
The queries below relate to data transfer from the host to the endpoint.
1. Suppose the host asks the endpoint to save some data present in DRAM to an SSD. Since the SSD is connected to the PCIe slot, will PCIe also perform DMA transfers?
For example, is there a special BAR in the endpoint that the host writes with the DRAM start address of the data to be moved to the SSD, which in turn triggers PCIe to perform a DMA transfer from host to endpoint?
I am trying to understand PCIe relative to the other protocols I have worked with so far; this is all a bit new to me.

The RC is generally part of the CPU itself. It serves as a bridge that routes requests from the CPU downstream, and from the endpoints upstream to the CPU.
PCIe endpoints have Type 0 headers, and bridges/switches have Type 1 headers. Type 1 headers have Base (minimum address) and Limit (maximum address) registers. Type 0 headers have BAR registers that are programmed during the enumeration phase.
After the enumeration phase is complete and all the endpoints have their BARs programmed, the Base and Limit registers in the Type 1 headers of the RC and the bridges/switches are programmed.
Example: assume a system with only one endpoint connected directly to the RC, with no intermediate bridges/switches, whose BAR holds the value 0xA00000.
If it requests 4 KB of address space in the CPU's MMIO map, the RC would have its Base register set to 0xA00000 and its Limit register set to 0xAFFFFF (the window is always 1 MB aligned, even though the space requested by the endpoint is much less than 1 MB).
If the CPU writes to address 0xA00004, the RC looks at its Base and Limit registers to see whether the address falls within its range, and routes the packet downstream to the endpoint.
Endpoints use their BARs to decide whether to accept a packet.
The RC, bridges, and switches use Base and Limit registers to route packets to the correct downstream port. A switch can have multiple downstream ports, and each port has its own Type 1 header whose Base and Limit registers are programmed according to the endpoints below that port. This is what is used for routing packets.
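A minimal sketch of that routing check (the field names are assumed, not the actual register layout):

#include <stdbool.h>
#include <stdint.h>

/* Assumed representation of a Type 1 port's memory window. */
struct type1_port {
    uint64_t mem_base;   /* e.g. 0x00A00000 */
    uint64_t mem_limit;  /* e.g. 0x00AFFFFF */
};

/* Forward a memory TLP downstream only if its address falls inside
 * this port's Base/Limit window. */
static bool routes_downstream(const struct type1_port *p, uint64_t addr)
{
    return addr >= p->mem_base && addr <= p->mem_limit;
}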
Data transfer between CPU memory and endpoints is done with PCIe Memory Write TLPs. Each TLP carries at most 4 KB of payload (and the negotiated Max_Payload_Size is usually much smaller, e.g. 128 or 256 bytes), so anything larger is sent as multiple Memory Writes.
Memory Writes are posted transactions (no completion from the endpoint is required).
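As a rough illustration, the number of Memory Write TLPs needed for a transfer is just the transfer size divided by the negotiated Max_Payload_Size, rounded up:

#include <stddef.h>

/* Number of Memory Write TLPs needed to move transfer_bytes, given the
 * negotiated Max_Payload_Size (the last partial TLP is rounded up). */
static size_t num_memory_write_tlps(size_t transfer_bytes, size_t max_payload)
{
    return (transfer_bytes + max_payload - 1) / max_payload;
}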

Related

How does the PCIe Root Complex move a DMA transaction from a PCIe endpoint to host memory?

I have a very basic doubt: how does the PCIe Root Complex move a DMA transaction from a PCIe endpoint to host memory?
Suppose a PCIe EP (endpoint) wants to initiate a DMA write transaction to host memory from its local memory.
The DMA read channel on the EP reads data from its local memory, then the PCIe module in the EP converts it into PCIe TLPs and directs them to the Root Complex.
My queries are:
How does the PCIe Root Complex know that it has to redirect this packet to host memory?
What is the hardware connection from the Root Complex to host memory? Is there a DMA write engine in the Root Complex that writes this data to host memory?
The PCIe RC receives the TLP; it has an address translation function which optionally translates the address and sends the packet out on its user-side interface. Usually after the PCIe RC there is IOMMU logic which converts the PCIe (bus) address into a host physical address and checks permissions. For PCIe, the IOMMU uses address translation tables held in memory, indexed per {bus, device, function} ID or even per PASID (process address space ID). The packet then carries the new physical address and goes to an interconnect (usually one that supports cache coherency).
The interconnect receives the packet from the IOMMU (the IOMMU acts as a master on the interconnect), and that interface node has a system address map describing where the addressed target is located within the interconnect. The system address map is set up by the firmware before the OS runs. (There is usually an interrupt controller front end, the Interrupt Translation Service on Arm systems, sitting between the IOMMU and the interconnect to intercept MSIs, i.e. message signaled interrupts, and forward them to the main interrupt controller.)
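To make the host side concrete, here is a hedged sketch (Linux kernel, with made-up endpoint register offsets) of how a driver typically hands the endpoint a DMA-able host address; the IOMMU, if enabled, is what maps that bus address back to a host physical address when the endpoint's Memory Write TLPs arrive:

#include <linux/pci.h>
#include <linux/dma-mapping.h>
#include <linux/sizes.h>
#include <linux/kernel.h>
#include <linux/io.h>

/* Hypothetical register offsets inside the endpoint's BAR0. */
#define EP_DMA_DST_LO  0x00
#define EP_DMA_DST_HI  0x04
#define EP_DMA_LEN     0x08
#define EP_DMA_START   0x0C

static int start_ep_dma_to_host(struct pci_dev *pdev, void __iomem *ep_bar0)
{
    dma_addr_t bus_addr;
    void *buf = dma_alloc_coherent(&pdev->dev, SZ_1M, &bus_addr, GFP_KERNEL);

    if (!buf)
        return -ENOMEM;

    /* Give the endpoint the bus (DMA) address to write to; this is the
     * address that appears in its outgoing Memory Write TLPs. */
    iowrite32(lower_32_bits(bus_addr), ep_bar0 + EP_DMA_DST_LO);
    iowrite32(upper_32_bits(bus_addr), ep_bar0 + EP_DMA_DST_HI);
    iowrite32(SZ_1M, ep_bar0 + EP_DMA_LEN);
    iowrite32(1, ep_bar0 + EP_DMA_START);
    return 0;
}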

SCTP: transmitting with both interfaces at the same time

On my machine, I have two interfaces connected to another machine that also has two interfaces. I want to use both interfaces at the same time to transfer data. From SCTP's point of view, each machine is an endpoint, so I used a one-to-one socket. On the server side, I tried binding INADDR_ANY, as well as bind() on the first address and bindx() on the second. On the client side, I tried connect() and connectx(). Whatever I tried, SCTP uses only one of the two interfaces at a given time.
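Roughly, the server-side sequence I mean is something like this (illustrative sketch only, with made-up addresses, using lksctp-tools' sctp_bindx()):

#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/sctp.h>
#include <sys/socket.h>

int make_server_socket(void)
{
    int sd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);   /* one-to-one style socket */

    struct sockaddr_in a1 = { .sin_family = AF_INET, .sin_port = htons(5000) };
    struct sockaddr_in a2 = { .sin_family = AF_INET, .sin_port = htons(5000) };
    inet_pton(AF_INET, "192.168.1.10", &a1.sin_addr);       /* first interface  (example) */
    inet_pton(AF_INET, "192.168.2.10", &a2.sin_addr);       /* second interface (example) */

    bind(sd, (struct sockaddr *)&a1, sizeof(a1));                    /* primary address */
    sctp_bindx(sd, (struct sockaddr *)&a2, 1, SCTP_BINDX_ADD_ADDR);  /* add second address */
    listen(sd, 1);
    return sd;
}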
I also tested the SCTP mode of iperf and the test app in the source code. Nothing works.
What am I missing here? Do you have to send each packet by hand from one or the other address, and to one or the other address?
There surely must be a way to build several streams, where each stream carries the communication between a specific pair of addresses, and then when you send a packet SCTP automatically chooses which stream to send it on.
Thanks in advance!
What you are asking for is called concurrent multipath transfer, a feature that isn't supported by SCTP (at least not per RFC 4960).
As described in RFC 4960, by default SCTP transmits data over the primary path. Other paths are monitored with heartbeats and used only when transmission over the primary path fails.

How does a PCIe endpoint remember its Bus/Device/Function number?

How does a PCIe endpoint claim configuration transactions, given that there is no register (in the Type 0 config space) defined by the PCIe specification that holds the Bus, Device, and Function number?
The device must capture the destination address from the first config transaction it receives and store it for use in outgoing transactions. Since PCIe is actually point-to-point, not a bus, a device only receives config transactions that are intended for it.
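As a rough illustration of the idea (assuming the Bus/Device/Function fields sit in the third DWORD of the Type 0 Configuration Request header, as the spec lays them out):

#include <stdint.h>

/* Latch the Bus/Device/Function numbers from a received Type 0
 * Configuration Write; the endpoint reuses them as its Requester ID
 * in the TLPs it later originates. */
struct captured_id {
    uint8_t bus;
    uint8_t dev;
    uint8_t fn;
};

static void capture_bdf(struct captured_id *id, uint32_t cfg_req_dw2)
{
    id->bus = (cfg_req_dw2 >> 24) & 0xFF;   /* Bus Number      [31:24] */
    id->dev = (cfg_req_dw2 >> 19) & 0x1F;   /* Device Number   [23:19] */
    id->fn  = (cfg_req_dw2 >> 16) & 0x07;   /* Function Number [18:16] */
}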

What is the General Call Address and what is its purpose in I2C?

I wonder what the General Call Address (0x00) in I2C is. If we have a master and some slaves, can we communicate with these slaves through our master using this address?
Section 3.2.10 of the I2C specification v6 (https://www.i2c-bus.org/specification/) clearly describes the purpose of the general call.
3.2.10 General call address
The general call address is for addressing every device connected to the I2C-bus at the
same time. However, if a device does not need any of the data supplied within the general
call structure, it can ignore this address. If a device does require data from a general call
address, it behaves as a slave-receiver. The master does not actually know how many
devices are responsive to the general call. The second and following bytes are received
by every slave-receiver capable of handling this data. A slave that cannot process one of
these bytes must ignore it. The meaning of the general call address is always specified in
the second byte (see Figure 30).
You can use it to communicate with your slaves, but three restrictions apply:
A general call can only write data to slaves, not read.
Every slave receives the general call; you cannot address a specific device with it unless you encode the device address in the general call message body and decode it in the slave.
There are standard general call message formats. You should not use the standard codes for your own functions.
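For example, on Linux you could issue a general call from user space roughly like this (a sketch using i2c-dev; the bus number and the 0x06 "software reset" command byte are just examples, and some adapters may need I2C_SLAVE_FORCE to claim address 0x00):

#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>

int main(void)
{
    int fd = open("/dev/i2c-1", O_RDWR);      /* example bus */
    unsigned char cmd = 0x06;                 /* general call command byte (example) */

    if (fd < 0)
        return 1;
    if (ioctl(fd, I2C_SLAVE, 0x00) < 0)       /* 0x00 = general call address */
        return 1;
    if (write(fd, &cmd, 1) != 1)              /* general call is write-only */
        return 1;
    close(fd);
    return 0;
}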

PCIe raw throughput test

I am doing a PCIe throughput test via a kernel module, and the test result numbers are quite strange (write is 210 MB/s but read is just 60 MB/s for PCIe Gen1 x1). I would like to ask for your suggestions and corrections if there is anything wrong in my test configuration.
My test configuration is as follows:
One board is configured as the Root Port and one board as the Endpoint. The PCIe link is Gen1, width x1, MPS 128B. Both boards run Linux.
At the Root Port side, we allocate a 4 MB memory buffer and map inbound PCIe memory transactions to this buffer.
At the Endpoint side, we do DMA reads/writes to the remote buffer and measure throughput. In this test the Endpoint is always the initiator of transactions.
The test result is 214 MB/s for the EP write test, but only 60 MB/s for the EP read test. The write throughput is reasonable for PCIe Gen1 x1, but the EP read throughput is too low.
For the RP board, I also tested with a PCIe e1000e Ethernet card and got a maximum throughput of ~900 Mbps. In the Ethernet TX path, the Ethernet card (playing the Endpoint role) also issues EP read requests and gets high throughput (~110 MB/s) with even smaller DMA transfers, so there must be something wrong with my DMA EP read configuration.
The DMA read test can be summarized with the pseudo code below:
/* Pseudo code made slightly more concrete; dma_issue_read()/dma_wait_complete()
 * and EP_OUTBOUND_BASE are placeholders for the platform-specific DMA driver
 * calls and the Endpoint's outbound window address. */
void *dest_buf = kmalloc(SZ_1M, GFP_KERNEL);
dma_addr_t dest_phys;
phys_addr_t src_phys = EP_OUTBOUND_BASE;         /* outbound region of the Endpoint */
ktime_t t1, t2;
int i;

memset(dest_buf, 0, SZ_1M);
dest_phys = dma_map_single(dev, dest_buf, SZ_1M, DMA_FROM_DEVICE);

t1 = ktime_get();
for (i = 0; i < 100; i++) {
    dma_issue_read(src_phys, dest_phys, SZ_1M);  /* issue DMA read */
    dma_wait_complete();                         /* wait for DMA read completion */
}
t2 = ktime_get();
/* throughput = (100 * 1 MB) / (t2 - t1) */
Any recommendations and suggestions are appreciated. Thanks in advance!