Where are the peripheral registers in STM32? Are they in the Cortex-M core or in the peripheral unit itself? - stm32

I have two questions.
[image: memory region of the Cortex-M core CPU]
1. Is the memory of STM32 microcontrollers inside the Cortex-M core or outside of it? And if it is inside the core, why is it not shown in the block diagram in the Cortex-M Generic User Guide? [image: block diagram of the Cortex-M core]
2. I'm trying to understand the STM32 architecture, but I'm facing an ambiguity.
[image: USART block diagram]
As you can see in the picture, the reference manual says that the USART unit has some registers (e.g. the Data Register).
But these registers also appear in the memory map of the Cortex-M core (if the answer to the first question is "inside"). Where do they really live? Are there two copies of each register, or do they reside in the Cortex-M core, or in the peripheral itself?
Is this related to the definition of memory-mapped I/O?

The only storage that's inside the CPU core is the registers (including general-purpose and special-purpose registers). Everything else is external, including RAM and ROM.
The peripheral control registers exist, essentially, inside the peripheral. However, they are accessed by the CPU in the same way that it accesses RAM or ROM; that is the meaning of the memory map: it shows you which addresses refer to RAM, ROM, peripheral registers, and other things. (Note that most of the memory map is unused - a 32-bit address space can address 4 GB of memory, and no microcontroller that I know of has anything like that much storage.) The appropriate component 'responds' to read and write requests on the memory bus depending on the address.
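As a minimal sketch of what this looks like from the programmer's side (in C; the address below is purely illustrative, not taken from any real device - the device's reference manual gives the actual mapping):

    #include <stdint.h>

    /* A hypothetical peripheral register, reached through a pointer to a fixed address. */
    #define EXAMPLE_REG  (*(volatile uint32_t *)0x40001000UL)

    void example_write(uint32_t value)
    {
        EXAMPLE_REG = value;      /* an ordinary store; the bus routes it to the peripheral */
    }

    uint32_t example_read(void)
    {
        return EXAMPLE_REG;       /* an ordinary load; the peripheral supplies the data */
    }

The compiler emits the same load/store instructions it would use for RAM; only the address decides which component answers.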
For a basic overview, the Wikipedia page on memory-mapped I/O is reasonably good.
Note that none of this is specific to the Cortex-M. Almost all modern microprocessor designs use memory mapping. Note also that the actual bus architecture of the Cortex-M is reasonably complex, so any understanding you gain from the Wikipedia article will be of an abstraction of the true implementation.

Look at the image below, showing the block diagram of an STM32 microcontroller with a Cortex-M4 core.
I have highlighted the CPU core (top left) and the other components you can find inside the microcontroller.
The CPU "core", as its name implies, is just the core; but the microcontroller also integrates Flash memory, RAM, and a number of peripherals. Almost everything outside the core (except the debug lines) is reached through the bus matrix, and this is equally true for ROM, RAM, and the integrated peripherals.
Note that the main difference between a "microprocessor" and a "microcontroller" is that the latter has dedicated peripherals on board.
Peripherals on STM32 devices are accessed by the CPU via memory-mapped I/O; look at the picture below:
As you can see, although the address space is linear from 0x00000000 to 0xFFFFFFFF, it is partitioned into "segments", e.g. program memory starting at 0x00000000, SRAM at 0x20000000, and peripherals at 0x40000000. Specific peripheral registers are read and written through pointers at specific offsets from the peripheral's base address.
For this device, the USARTs are mapped into the APB1 area, i.e. the address range 0x40000000-0x4000A000. Note that the actual peripheral addresses can differ from device to device.
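As a hedged sketch of that scheme, assuming the common STM32F1/F4-style layout (APB1 at 0x40000000, USART2 at offset 0x4400, data register at offset 0x04 - check your device's reference manual and CMSIS headers for the real values):

    #include <stdint.h>

    #define PERIPH_BASE   0x40000000UL                /* start of the peripheral region    */
    #define USART2_BASE   (PERIPH_BASE + 0x4400UL)    /* USART2 on APB1 (device-specific)  */
    #define USART2_DR     (*(volatile uint32_t *)(USART2_BASE + 0x04UL))  /* data register */

    void usart2_send(uint8_t byte)
    {
        USART2_DR = byte;   /* the write travels over the bus matrix to the USART, not to RAM */
    }

In practice you would use the vendor's CMSIS headers, which already define these base addresses and register structures for you.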

Peripherals are connected to the core via buses. The address decoder knows which addresses are handled by which bus.
It is not only peripherals that are connected via buses; memories are connected the same way. The buses are interconnected via bridges, and those bridges know how to route the traffic.
From the core's point of view, a peripheral register works the same way as a memory cell.
What about the gaps? Usually, if the address decoder does not recognize an address, it generates an exception - a hardware error (called a HardFault in ARM terminology).
The details are quite complicated and, unless you are going to design your own chip, are not needed by the register-level programmer.
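To illustrate that last point, a small hedged sketch of touching an address that nothing claims (the address is made up; which accesses fault, and whether you get a BusFault or an escalated HardFault, depends on the device and fault configuration):

    #include <stdint.h>

    void touch_unmapped(void)
    {
        volatile uint32_t *p = (volatile uint32_t *)0x6FF00000UL;  /* assumed hole in the memory map */
        (void)*p;   /* this load typically ends in a fault handler rather than returning data */
    }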

Related

How does SEND bandwidth improve when the registered memory is aligned to system page size? (In Mellanox IBD)

Operating System: RHEL Centos 7.9 Latest
Operation:
Sending 500MB chunks 21 times from one System to another connected via Mellanox Cables.
(Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6])
(The registered memory region (500MB) is reused for all the 21 iterations.)
The gain in message send bandwidth when using aligned_alloc() (with the system page size of 4096 B) instead of malloc() for the registered memory is around 35 Gbps:
with malloc(): ~86 Gbps
with aligned_alloc(): ~121 Gbps
Since the CPU is not involved in these operations, how is the operation faster with aligned memory?
Please provide useful reference links, if available, that explain this.
What change does aligned memory bring to the read/write operations?
Is it the address translation within the device that gets improved?
[Very limited information is present over the internet about this, hence asking here.]
RDMA operations use either MMIO or DMA to transfer data from the main memory to the NIC via the PCI bus - DMA is used for larger transfers.
The behavior you're observing can be entirely explained by the DMA component of the transfer. DMA operates at the physical level, and a contiguous region in the virtual address space is unlikely to be mapped to a contiguous region in physical memory. This fragmentation has a cost: more translation work is needed per unit of data transferred, and DMA transfers get interrupted at physical page boundaries.
[1] https://www.kernel.org/doc/html/latest/core-api/dma-api-howto.html
[2] Memory alignment
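For illustration, a sketch of the allocation difference being discussed, using the libibverbs registration call as commonly documented (pd is assumed to be an already-created protection domain; error handling trimmed):

    #include <stdlib.h>
    #include <infiniband/verbs.h>

    #define CHUNK (500UL * 1024 * 1024)   /* 500 MB, a multiple of the 4096-byte page size */

    struct ibv_mr *register_chunk(struct ibv_pd *pd)
    {
        /* Page-aligned: the buffer starts on a page boundary, so no page is
         * only partially covered by the registered region. */
        void *buf = aligned_alloc(4096, CHUNK);
        /* void *buf = malloc(CHUNK);      <- may start mid-page */
        if (buf == NULL)
            return NULL;
        return ibv_reg_mr(pd, buf, CHUNK, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ);
    }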

How does CPU access BIOS instructions stored in external memory?

During the boot process, the CPU reads the address of the system BIOS from the reset vector and jumps to the location where the BIOS is stored. My questions here are:
* As the BIOS is stored in some external memory like an EEPROM (and not in main memory), how does the CPU access this external memory?
* Is this external memory already mapped to some region of main memory, so that the CPU just jumps to this mapped region to access the BIOS instructions? Or does it actually fetch the instructions from the external memory where the BIOS is stored?
First I can refer you to a detailed article:
https://resources.infosecinstitute.com/system-address-map-initialization-x86x64-architecture-part-2-pci-express-based-systems/#gref
But I will summarize here:
When the CPU is reset, execution starts at the reset vector (a specific memory address, 0xFFFFFFF0), and the ROM content has to appear at that specific address.
Intel Reset Vector
How is the BIOS ROM mapped into address space on PC?
Who loads the BIOS and the memory map during boot-up
0xffff0 and the BIOS (hardwired address mapping is also explained/emphasized here)
When the BIOS executes, it also initializes hardware such as the VGA controller and the DRAM. Sometimes the RAM and BIOS address ranges may overlap, and usually the OS will take over and reimplement all of the BIOS functionality (which is specific to each motherboard).
What information does BIOS load into RAM?
https://resources.infosecinstitute.com/system-address-map-initialization-in-x86x64-architecture-part-1-pci-based-systems/
The diagram below illustrates how the motherboard designer places the address ranges used by the different hardware peripherals into certain regions, and the OS then has the responsibility of allocating RAM ranges in the regions not used by hardware. Don't forget that each core (in 32-bit mode) can only address 4 GB of memory, but the physical memory available can be much more than that. This is where page tables come in.
Once the page tables are set up, the TLB and page tables are what is used from then on, providing indirect and efficient access to RAM.
Normally the CPU accesses the data by talking to the SPI controller, which in turn communicates with the EEPROM to fulfill the request or deliver the information requested by the CPU.
And no, the external memory is not mapped anywhere, and no, the CPU does not just jump to it. It fetches whatever it or the BIOS needs through SPI or I²C, depending on the age of the machine.

Memory mapped IO - how is it done?

I've read about the difference between port-mapped I/O and memory-mapped I/O, but I can't figure out how memory-mapped I/O is implemented in modern operating systems (Windows or Linux).
What I know is that a part of the physical memory is reserved for communicating with the hardware, and that there's an MMIO unit involved in taking care of the bus communication and other memory-related work.
How would a driver communicate with underlying hardware? What are the functions that the driver would use? Are the addresses to communicate with a video card fixed or is there some kind of "agreement" before using them?
I'm still rather confused
The following statement in your question is wrong:
What I know is that a part of the physical memory is reserved to communicate with the hardware
A part of the physical memory is not reserved for communication with the hardware. A part of the physical address space, to which both the physical memory and the memory-mapped I/O are mapped, is. This layout is permanent, but user programs do not see it directly - instead, they run in their own virtual address space, into which the kernel can map physical memory and I/O ranges wherever it wants.
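As a hedged sketch of how a Linux driver typically reaches such a range (the physical base address and register offsets are placeholders; in real code they come from a PCI BAR, a devicetree node, or similar):

    #include <linux/errno.h>
    #include <linux/io.h>

    #define EXAMPLE_PHYS_BASE  0xFED00000UL   /* hypothetical MMIO region reported by the device */
    #define EXAMPLE_REGION_LEN 0x1000

    static void __iomem *regs;

    static int example_map_and_poke(void)
    {
        regs = ioremap(EXAMPLE_PHYS_BASE, EXAMPLE_REGION_LEN);  /* map the physical range into kernel VA */
        if (!regs)
            return -ENOMEM;
        writel(0x1, regs + 0x10);        /* write a hypothetical control register */
        return readl(regs + 0x14);       /* read back a hypothetical status register */
    }

User-space drivers can get a similar effect by mmap()ing the device's resource file (e.g. a PCI BAR exposed under sysfs) instead of calling ioremap().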
You may want to read the following articles which I believe contain answers to most of your questions:
http://duartes.org/gustavo/blog/post/motherboard-chipsets-memory-map
http://duartes.org/gustavo/blog/post/memory-translation-and-segmentation
http://duartes.org/gustavo/blog/post/how-the-kernel-manages-your-memory
http://en.wikipedia.org/wiki/Memory-mapped_I/O
http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/IO/mapped.html
Essentially it is just another way of accessing data, as if you were reading from or writing to memory. But the hardware snoops on the address bus, and when it sees an address targeting it, it picks up the data on the data bus.
Are you asking about Memory mapped files, or memory mapped port-IO?
Memory-mapped files are done by paging pages in and out and intercepting page faults on those addresses. This is all done by the OS through cooperation between the file-system manager and the page-fault handler.
Memory-mapped port I/O is done at the CPU level by overloading address lines as port-I/O lines, which allows writes to memory to be translated onto the QPI bus lines as port I/O. This is all done by the processor interacting with the motherboard. The only other thing the OS needs to do is tell the MMU not to coalesce reads and writes, via the page-table write-through and cache-disable bits.

Difference between memory bus and address bus

Can someone briefly point out the differences between the memory bus and the address bus in computer architectures? Also, when you say "memory bus", does that imply you are referring to the data bus?
Beautifully explained here.
In isolation, the microprocessor, the memory and the input/output ports are interesting components, but they cannot do anything useful. In combination, they can form a complete system if they can communicate with each other. This communication is accomplished over bundles of signal wires (known as buses) that connect the parts of the system together.
There are normally three types of bus in any processor system:
An address bus: this determines the location in memory that the processor will read data from or write data to.
A data bus: this contains the contents that have been read from the memory location or are to be written into the memory location.
A control bus: this manages the information flow between components, indicating whether the operation is a read or a write and ensuring that the operation happens at the right time.
Data bus:
The data bus is an electrical path that connects the CPU, the memory, and the other hardware devices on the motherboard. The number of wires in the data bus affects the speed at which data can travel between components: since each wire carries one bit at a time, an 8-wire bus can transfer one byte at a time.
Address bus:
The address bus matters because the number of lines in it determines the maximum number of memory addresses. For example, 8 address bits can represent 2^8 = 256 distinct addresses.
The memory bus consists of an address bus (used to specify the memory address) and a data bus (used to carry the value being read or written).
When you read data from memory or write data to memory you operate with 2 different items, the address and the data. Somehow they have to be transferred between the CPU and memory. You can have two buses to transfer them independently. Or you can have just one and use it for both, one thing at a time.
Address and data buses may have different widths, that is, they may carry different number of bits.
Yes, memory bus usually means data bus (that carries the memory data).
The data bus is a bi-directional bus used for fetching and storing data, whereas the address bus is a unidirectional bus used to specify the address.
Excellent narration here http://www.differencebetween.com/difference-between-address-bus-and-vs-data-bus/

What is the difference between memory-mapped I/O and I/O-mapped I/O?

Please explain the difference between memory-mapped I/O and I/O-mapped I/O.
Uhm... unless I misunderstood, you're talking about two completely different things. I'll give you two very short explanations so you can google up what you need to know.
Memory-mapped I/O means mapping I/O hardware devices' registers and memory into the main memory map. That is, there will be addresses in the computer's address space that don't actually correspond to your RAM, but to internal registers and memory of peripheral devices. This is the machine architecture Pointy was talking about.
There's also memory-mapped file I/O, which means taking (say) a file and having the OS load portions of it into memory for faster access later on. In Unix, this can be accomplished through mmap().
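A minimal POSIX sketch of that second meaning (memory-mapped file I/O), assuming an existing file named example.dat:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("example.dat", O_RDONLY);
        struct stat st;
        if (fd < 0 || fstat(fd, &st) < 0)
            return 1;
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED)
            return 1;
        putchar(p[0]);              /* touching the mapping faults the page in from the file */
        munmap(p, st.st_size);
        close(fd);
        return 0;
    }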
I hope this helped.
On x86 there are two different address spaces, one for memory, and another one for I/O ports.
The port address space is limited to 65536 ports, and is accessed using the IN/OUT instructions.
As an example, a video card's VGA functionality can be accessed using some I/O ports, but the framebuffer is memory-mapped.
Other CPU architectures only have one address space. In those architectures, all devices are memory-mapped.
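To make the contrast concrete, a hedged x86 sketch using the classic VGA example from the answer above (register ports 0x3D4/0x3D5 and the text-mode buffer at 0xB8000 are the traditional values, shown only for illustration; this only works in a context allowed to touch the hardware, such as kernel or bare-metal code):

    #include <stdint.h>

    static inline void outb(uint16_t port, uint8_t val)      /* port-mapped I/O: OUT instruction */
    {
        __asm__ volatile ("outb %0, %1" : : "a"(val), "Nd"(port));
    }

    static inline uint8_t inb(uint16_t port)                 /* port-mapped I/O: IN instruction */
    {
        uint8_t val;
        __asm__ volatile ("inb %1, %0" : "=a"(val) : "Nd"(port));
        return val;
    }

    void vga_demo(void)
    {
        outb(0x3D4, 0x0F);                        /* select CRTC register 0x0F via the port space */
        uint8_t cursor_low = inb(0x3D5);          /* read it back through the port space          */

        volatile uint16_t *text = (volatile uint16_t *)0xB8000;  /* memory-mapped VGA text buffer */
        text[0] = (uint16_t)((0x07 << 8) | 'A');  /* plain store: 'A', light grey on black        */
        (void)cursor_low;
    }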
Memory mapped I/O is mapped into the same address space as program memory and/or user memory, and is accessed in the same way.
Port mapped I/O uses a separate, dedicated address space and is accessed via a dedicated set of microprocessor instructions.
As 16-bit processors slowly become obsolete and are replaced by 32-bit and 64-bit processors in general use, reserving ranges of the memory address space for I/O is less of a problem, since the processor's address space is usually much larger than the space required for all the memory and I/O devices in a system.
Therefore, it has become more frequently practical to take advantage of the benefits of memory-mapped I/O.
The disadvantage to this method is that the entire address bus must be fully decoded for every device. For example, a machine with a 32-bit address bus would require logic gates to resolve the state of all 32 address lines to properly decode the specific address of any device. This increases the cost of adding hardware to the machine.
The advantage of an I/O-mapped I/O system is that less logic is needed to decode a discrete address, and therefore it costs less to add hardware devices to a machine. However, more instructions may be needed.
One more clear difference between the two: a memory-mapped I/O device is one that responds when the IO/M signal (e.g. on the 8085) is low, while an I/O-mapped (peripheral-mapped) I/O device is one that responds when IO/M is high.
The difference between the two schemes lies in the microprocessor / microcontroller design. Intel has, for the most part, used the I/O-mapped scheme for its microprocessors, and Motorola has used the memory-mapped scheme.
https://techdhaba.com/2018/06/16/memory-mapped-i-o-vs-i-o-mapped-i-o/