How to boot STM32 from user flash? - stm32

STM32 microcontrollers are capable to boot from different sources such as:
User-Flash
System-memory
Embedded-SRAM.
On the firmware side, does "boot from user Flash" means executing a custom bootloader?

No.
The "Boot from User Flash" mode means that the application code that will be run after reset is located in user flash memory. The user flash memory in that mode is aliased to start at address 0x00000000 in boot memory space. Upon reset, the top-of-stack value is fetched from address 0x00000000, and code then begins execution at address 0x00000004.
In contrast, the "Boot from System Memory" mode simply means that the system memory (not the user flash) is now aliased to start at address 0x00000000. The application code in this case must have already been loaded into system memory.
The "Boot from Embedded SRAM" mode does not alias the SRAM address. When this mode is selected, the device expects the vector table to have been relocated using the NVIC exception table and offset register, and execution begins at the start of embedded SRAM. The application code in this case must have already been loaded into embedded SRAM.
For more details, refer to the Reference Manual for the specific family of STM32 devices you are using, in the section titled "Boot Configuration".

It depends.
The option "Boot from user Flash" will run whatever is in the user flash. This code could be anything you want. It could be:
a custom bootloader or
it could be application code.
The distinction is a logical one based on what you write your code to do. i.e. it is even possible to have an app that rewrites part or all of itself (what you call it is up to you).
If you do use a custom bootloader then you will normally have two projects.
The custom bootloader will have its own projects with its own generated binary image loaded into the start of flash.
The application code also having its own project and generated .bin file.
The application is loaded into a specific address that the bootloader knows about when it was built (in some free statically allocated memory block - normally on page boundaries to allow for reflashing while keeping the bootloader unchanged). These addresses can be configured in linker files.
Custom bootloaders are a great way of allowing features like:
A custom coms stack, to reflash your application via some other coms protocol.
Dual redundancy of the application code.
Note the STM32 micros "System Memory" already contains ST's own bootloader which supports a range of serial links to reflash the micro without needing a fully custom bootloader.

ST's documentation is stupid. They mention this, but they don't also tell that you need to only configure BOOT0 & BOOT1 pins in order to select the boot source.
So these two pins binary speaking give you three options that are:
(A) BOOT0 = 0, BOOT1 = whatever,
(B) BOOT0 = 1, BOOT1 = 0 and
(C) BOOT0 = 1, BOOT1 = 1.
Option A boots in FLASH, option B boots in bootloader and option C boots in SRAM.
So how to use this. First choose configuration B to start communicating with the bootloader to whom you instruct (in bootloader's own commands) to erase FLASH and upload your program inside FLASH. Then you change the configuration to A and reboot using MCU's RESET pin.
Note:
I perfer the NXP's documentation which has everything black on white.
As a developper I value my time and good documentation is a way to go.
Ditch ST and go for NXP if documentation is what you value. Here is NXP's sample documentation:
https://www.nxp.com/docs/en/user-guide/UM10562.pdf
I am really sorry ST, but you have to do better than what you are currently doing...

STM MCUs are a bit stupid in this regard. You need to configure the Option bytes in Flash so that your MCU will always run code from the user Flash after reset.
Run this code at the beginning of your main:
if ((FLASH->OPTR & FLASH_OPTR_nSWBOOT0_Msk) != 0x0 || ((FLASH->OPTR & FLASH_OPTR_nBOOT0_Msk) == 0x0))
{
while ((FLASH->SR & FLASH_SR_BSY_Msk) != 0x0) { ; }
FLASH->KEYR = 0x45670123;
while ((FLASH->SR & FLASH_SR_BSY_Msk) != 0x0) { ; }
FLASH->KEYR = 0xCDEF89AB;
while ((FLASH->SR & FLASH_SR_BSY_Msk) != 0x0) { ; }
FLASH->OPTKEYR = 0x08192A3B;
while ((FLASH->SR & FLASH_SR_BSY_Msk) != 0x0) { ; }
FLASH->OPTKEYR = 0x4C5D6E7F;
while ((FLASH->SR & FLASH_SR_BSY_Msk) != 0x0) { ; }
FLASH->OPTR = (FLASH->OPTR & ~(FLASH_OPTR_nSWBOOT0_Msk)) | FLASH_OPTR_nBOOT0_Msk;
while ((FLASH->SR & FLASH_SR_BSY_Msk) != 0x0) { ; }
FLASH->CR = FLASH->CR | FLASH_CR_OPTSTRT;
while ((FLASH->SR & FLASH_SR_BSY_Msk) != 0x0) { ; }
}

Related

Can't receive from USB bulk endpoint despite Windows enumerates and libusb reads descriptor of STM32 custom device class

For a fast ADC sampling USB device, I am using the USB 2.0 High Speed capable STM32F733 with the embedded USB-HS PHY. In USBView, I can see that the device is enumerated, the libusb code opens the device and claims interface, but when I try to receive data with libusb_bulk_transfer, the operation times out (return code -12). Things I have tried: I have confirmed than when I request data with libusb_bulk_transfer, the device is interrupted. Note: I have DMA enabled in my class configuration C file and it is not clear to me how that is triggered. I have verified that the transfersize and packet count registers are being set correctly by the LL library function, and that when I request data from
Any tips on debugging such problems will be much appreciated - this board is my undergrad thesis due in under two months!
Desktop sequence:
libusb_get_device_list, libusb_get_device_descriptor, libusb_open, libusb_get_string_descriptor_ascii, libusb_free_device_list, libusb_bulk_transfer(devh, fat_EPIN_ADDR, inframe, fat_EPIN_SIZE, &gotBytes, 100). Where gotBytes is integer, and inframe is a large array.
Device firmware:
MX_USB_DEVICE_Init();
uint8_t txBuffer[10*fat_EPIN_SIZE];
while (1)
{
USBD_LL_Transmit(&hUsbDeviceHS, Custom_fat_EPIN_ADDR, txBuffer, Custom_fat_EPIN_SIZE);
HAL_Delay(1);
}
Custom_fat_EPIN_SIZE is 0x200 and the endpoint address is 0x81 (EP IN 1)
Installed driver for device is WinUSB (verified in Device Manger to be winusb.sys), and I am linking libusb-1.0 into my desktop program. You can find my source code at https://gitlab.com/tywonemi-school-stuff/silicon-radar-fun, the firmware is My SW/v1 and the desktop software is a Qt Creator project in My SW/Viewer, of note is usb.cpp. You can also compare with testing project/HIDTest, which is code that I tested with STM32F303 nucleo dev board where I was able to read an array through IN bulk endpoint with the Viewer application. However, F3 has the USB peripheral, while F7 has OTG_USB, and I am now attempting USB 2.0 compliant HS so there may be more protocol-based pitfalls. You can also find the output of the device descriptor etc from USBView in my SW/USBView_broken.txt
EDIT 1: I have found finally some concrete error in the STM32 behavior. The DMAADDR is set for EPIN 0x81, and never increments, despite the DMA being enabled. I have went through literally every occurrence of the word "DMA" in the USB_OTG periphery.
I thought it might be that my linker script makes my array be stored in DTCM or similar, and the OTG DMA can't access it, but the address of txBuffer is 0x2003EBEC which is in SRAM2. The AHB matrix in the reference manual clearly shows, that the USB OTG HS DMA is master for a bus that SRAM2 is a slave of. And DTCM is connected too. I will look for application notes for USB OTG HS DMA - it just seems to be refusing to copy data!
I have fixed my issue by disabling the DMA setting. I have re-read the relevant portions of the reference manual and still don't know how exactly the values propagate into the Tx FIFOs. It is possible that DMA-less operation will be a major bottleneck in my project, I might return to this later.

Question about Message Signaled Interrupts (MSI) on x86 LAPIC system

Hi I'm writing a kernel and plan to use MSI interrupt for PCI devices.
However, I'm also quite confused by the documentations.
My understanding about MSI are as follow:
From PCI device point of view:
Documentations indicate that I
need to find Capabillty ID = 0x05 to locate 3 registers: Message control (MCR), Message Address (MAR) and Message Data (MDR) registers
MCR provide control functionality for MSI interrupt,
MAR provide the physical address the PCI device
will write once interrupt occurs
MDR forms out the actual data it will write into the physical address
From CPU point of view:
Documentation shows that Message Address register contains fixed top of 0xFEE, and following by destination ID (LAPIC ID) and other controlling bits as follow:
The Message Data register will contain the following information, including the interrupt vector:
After reading all of these, I am thinking if the APIC_ID is 0x0h would the Message Address conflict with the Local APIC memory mapping? Although the address of FEE00000~FEE00010 are reserved.
In addition, is it true that the vector number in MDR is corresponding to the IDT vector number. In other words, if I put MAR = 0xFEE0000C (Destination ID = 0, Using logical APIC ID) and MDR = 0x0032 (edge trigger, Vector = 50) and enable the MSI interrupt, then once the device issues an interrupt CPU would correspondingly run the function pointed by IDT[50]? After that I write 0h to EOI register to end it?
Finally, according to the documentation, the upper 32 bit of MAR is not used? Can anyone help on this?
Thanks a lot!
Your understanding of how to detect and program MSI in a PCI (or PCIe) device is correct.*
The message address controls the destination (which CPU the interrupt is sent to), while the message data contains the vector number. For normal interrupts, all bits of the message data should be 0 except for the low 8 bits, which contain the vector.
The vector is an index into the IDT, so if the message data is 0x0032, the interrupt is delivered through entry 50 of the IDT.**
If the Destination ID in an interrupt message is 0, the Message Address of the MSI does match the default address of the local APIC, but they do not conflict, because the APIC can only be written by the CPU and MSIs can only be written by devices.
On x86 platforms, the upper 32 bits of the message address must be 0. This can be done by setting the upper part of the message address to 0 or by programming the device to use a 32-bit message address (in which case the upper message address register is not used). The PCI spec was designed to work with systems where 64-bit MSI addresses are used, but x86 systems never use the upper 32 bits of the message address.
Reprogramming the APIC base address by writing to the APIC_BASE MSR does not affect the address range used for MSI; it is always 0xFEExxxxx.
* You should also look at the MSI-X capability, because some devices support MSI-X but not MSI. MSI-X is a bit more flexible, which inevitably makes it a bit more complicated.
** When using the MSI capability, the message data isn't exactly the value in the Message Data Register (MDR). The MSI capability allows the device to use several contiguous vectors. When the device sends an interrupt message, it replaces the low bits of the MDR with a different value depending on the interrupt cause within the device.

In Application Programming issue

I'm working on project on STM32L152RCT6, where i have to build a mechanism to self update the code from the newly gated file(HEX file).
For that i have implemented such mechanism like boot loader where it checks for the new firmware if there it it has to cross verify and if found valid it has to store on "Application location".
I'm taking following steps.
Boot loader address = 0x08000000
Application address = 0x08008000
Somewhere on specified location it has to check for new file through Boot loader program.
If found valid it has to be copy all the HEX on location(as per the guide).
Than running the application code through jump on that location.
Now problem comes from step 5, all the above steps I've done even storing of data has been done properly(verify in STM32 utility), but when i'm jump to the application code it won't work.
Is there i have to cross check or something i'm missing?
Unlike other ARM controllers that directly jump to address 0 at reset, the Cortex-M series takes the start address from a vector table. If the program is loaded directly (without a bootloader), the vector table is at the start of the binary (loaded or mapped to address 0). First entry at offset 0 is the initial value of the stack pointer, second entry at address 4 is called the reset vector, it contains the address of the first instruction to be executed.
Programs loaded with a bootloader usually preserve this arrangement, and put the vector table at the start of the binary, 0x08008000 in your case. Then the reset vector would be at 0x08008004. But it's your application, you should check where did you put your vector table. Hint: look at the .map file generated by the linker to be sure. If it's indeed at 0x08008000, then you can transfer control to the application reset vector so:
void (*app)(void); // declare a pointer to a function
app = *(void (**)(void))0x08008004; // see below
app(); // invoke the function through the pointer
The complicated cast in the second line converts the physical address to a pointer to a pointer to a function, takes the value pointed to it, which is now a pointer to a function, and assigns it to app.
Then you should manage the switchover to the application vector table. You can do it either in the bootloader or in the application, or divide the steps between them.
Disable all interrupts and stop SysTick. Note that SysTick is not an interrupt, don't call NVIC_DisableIRQ() on it. I'd do this step in the bootloader, so it gets responsible to disable whatever it has enabled.
Assign the new vector table address to SCB->VTOR. Beware that the boilerplate SystemInit() function in system_stm32l1xx.c unconditionally changes SCB->VTOR back to the start of the flash, i.e. to 0x08000000, you should edit it to use the proper offset.
You can load the stack pointer value from the vector table too, but it's tricky to do it properly, and not really necessary, the application can just continue to use the stack that was set up in the bootloader. Just check it to make sure it's reasonable.
Have you changed the application according to the new falsh position?
For example the Vector Table has to be set correctl via
SCB->VTOR = ...
When your bootloader starts the app it has to configure everything back to the reset state as the application may relay on the default reset values. Espessially you need to:
Return values of all hardware registers to its reset values
Switch off all peripheral clocks (do not forget about the SysTick)
Disable all enabled interrupts
Return all clock domains to its reset values.
Set the vector table address
Load the stack pointer from the beginning of the APP vector table.
Call the APP entry point.(vertor table start + 4)
Your app has to be compiled and linked using the custom linker script where the FLASH start point is 0x8008000
for example:
FLASH (rx) : ORIGIN = 0x8000000 + 32K, LENGTH = 512K - 32K
SCB->VTOR = FLASH_BASE | VECT_TAB_OFFSET;
where FLASH_BASE's value must be equal to the address of your IROM's value in KEIL
example:
#define FLASH_BASE 0x08004000
Keil configuration

STM32L073RZ (rev Z) IAP jump to bootloader (system memory)

I use the STM32L073RZ (Nucleo 64 board).
I would like to jump into the system memory in application programming (IAP).
My code works on the revision B of the STM32L073 microcontroller but fails on the latest revision, rev Z.
I read the errata sheet, no details are given, just a limitation fixed on the dual boot mechanism into system memory according to the BFB2 bit.
Is the system memory no longer supports an IAP jumping to execute its code (to flash firmwares through USB or UART without using the BOOT0 pin) ?
The function is the first line of my main program, it tests if the code has to jump to the booloader:
void jumpBootLoader(void)
{
/* to do jump? */
if ( *((unsigned long *)0x20003FF0) == 0xDEADBEEF )
{
/* erase the label */
*((unsigned long *)0x20003FF0) = 0xCAFEFEED;
/* set stack pointer to the bootloader start address */
__set_MSP(*((uint32_t*)(0x1FF00000)));
/* system memory mapped at 0x00000000 */
__HAL_SYSCFG_REMAPMEMORY_SYSTEMFLASH();
/* jump to #bootloader + 4 */
((void (*)(void))(*((uint32_t*)(0x1FF00004))))();
}
}
I call these two lines as soon as the BP1 button is pressed to trig the jump operation after resetting the µC:
*((unsigned long *)0x20003FF0) = 0xDEADBEEF;
NVIC_SystemReset();
I use the HSI 16Mhz clock source.
The solution is to jump twice to the system memory.
First Jump to bootloader startup to initialize Data in RAM until the Program counter will returned to Flash by the Dualbank management.
Second Jump: Jump to the Dualbank bypassed address
How to use: User has first to initialize a variable “ Data_Address” (must be an offset Flash sector aligned address) in Flash to distinguish between first/second Jump.
EraseInitStruct.TypeErase = FLASH_TYPEERASE_PAGES;
EraseInitStruct.PageAddress = Data_Address;
EraseInitStruct.NbPages = 1;
First_jump = *(__IO uint32_t *)(Data_Address);
if (First_jump == 0) {
HAL_FLASH_Unlock();
HAL_FLASH_Program(FLASH_TYPEPROGRAM_WORD, Data_Address, 0xAAAAAAAA);
HAL_FLASH_Lock();
/* Reinitialize the Stack pointer and jump to application address */
JumpAddress = *(__IO uint32_t *)(0x1FF00004);
}
if (First_jump != 0) {
HAL_FLASH_Unlock();
HAL_FLASHEx_Erase(&EraseInitStruct, &PAGEError);
HAL_FLASH_Lock();
/* Reinitialize the Stack pointer and jump to application address */
JumpAddress = (0x1FF00369);
}
Jump_To_Application = (pFunction) JumpAddress;
__set_MSP(*(__IO uint32_t *)(0x1FF00000));
Jump_To_Application();
First important thing: you use 0x1FF0 0000 as the addres where SP is stored, this is correct. Then you use 0x1 FF00 0004 as the address from which you load the function pointer. This is not correct - one zero too many.
Note that using __set_MSP() is generally not such a good idea if you also use MSP as your stack pointer (which you most likely are). The recent definition of this function, which marks "sp" as clobbered register, causes your change to be reverted almost immediately. Incidentally today I was doing exactly the same thing you are doing and I've found that problem. In your assembly listing you'll see that SP is saved into some other register before the msr msp, ... instruction and restored right after that.
Finally I wrote that manually (STM32F4, so different addresses):
constexpr uint32_t systemMemoryBase {0x1fff0000};
asm volatile
(
" msr msp, %[sp] \n"
" bx %[pc] \n"
:: [sp] "r" (*reinterpret_cast<const uint32_t*>(systemMemoryBase)),
[pc] "r" (*reinterpret_cast<const uint32_t*>(systemMemoryBase + 4))
);
BTW - you don't need to set memory remap for the bootloader to work.
Thanks for your help. I have my answer !
The v4.0 bootloader (initial version) does not implement the dual bank mechanism but this feature is supported by v4.1.
Software can jump to bootloader but it will execute the dual boot mechanism.
So the bootloader goes back to bank1 (or bank2 if a code is "valid").
Today it is not possible to bypass the dual bank mechanism to execute the bootloader with my configuration:
The boot0 pin is reset and the protection level is 0 (see "Table 11. Boot pin and BFB2 bit configuration" in the reference manual).
Where is your program counter when you call __HAL_SYSCFG_REMAPMEMORY_SYSTEMFLASH()?
Remapping a memory region while you're executing out of that same region will end poorly! You may need to relocate this code into SRAM, or execute this code with PC set to the fixed FLASH memory mapping (0x0800xxxx).

Linux and reading and writing a general purpose 32 bit register

I am using embedded Linux for the NIOS II processor and device tree. The GPIO functionality provides the ability to read and or write a single bit at a time. I have some firmware and PIOS that I want to read or write atomically by setting or reading all 32 bits at one time. It seems like there would be a generic device driver that if the device tree was given the proper compatibility a driver would exist that would allow opening the device and then reading and writing the device. I have searched for this functionality and do not find a driver. One existing in a branch but was removed by Linus.
My question is what is the Linux device tree way to read and write a device that is a general purpose 32 bit register/pio?
Your answer is SCULL
Character Device Drivers
You will have to write a character device driver with file operations to open and close a device. Read, write, ioctl, and copy the contents of device.
static struct file_operations query_fops =
{
.owner = THIS_MODULE,
.open = my_open,
.release = my_close,
.ioctl = my_ioctl
};
Map the address using iomem and directly read and write to that address using rawread and rawwrite. Create and register a device as follows and then it can be accessed from userspace:
register_chrdev (0, DEVICE_NAME, & query_fops);
device_create (dev_class, NULL, MKDEV (dev_major, 0), NULL, DEVICE_NAME);
and then access it from userspace as follows:
fd = open("/dev/mydevice", O_RDWR);
and then you can play with GPIO from userspace using ioctl's:
ioctl(fd, SET_STATE);