stm32 NVIC_EnableIRQ() bare metal equivalent? - stm32

I'm using the blue pill, and trying to figure out interrupts. I have an interrupt handler:
void __attribute__ ((interrupt ("TIM4_IRQHandler"))) myhandler()
{
puts("hi");
TIM4->EGR |= TIM_EGR_UG; // send an update even to reset timer and apply settings
TIM4->SR &= ~0x01; // clear UIF
TIM4->DIER |= 0x01; // UIE
}
I set up the timer:
RCC_APB1ENR |= RCC_APB1ENR_TIM4EN;
TIM4->PSC=7999;
TIM4->ARR=1000;
TIM4->EGR |= TIM_EGR_UG; // send an update even to reset timer and apply settings
TIM4->EGR |= (TIM_EGR_TG | TIM_EGR_UG);
TIM4->DIER |= 0x01; // UIE enable interrupt
TIM4->CR1 |= TIM_CR1_CEN;
My timer doesn't seem to activate. I don't think I've actually enabled it though. Have I??
I see in lots of example code commands like:
NVIC_EnableIRQ(USART1_IRQn);
What is actually going in NVIC_EnableIRQ()?
I've googled around, but I can't find actual bare-metal code that's doing something similar to mine.
I seem to be missing a crucial step.
Update 2020-09-23 Thanks to the respondents to this question. The trick is to set the bit for the interrupt number in an NVIC_ISER register. As I pointed out below, this doesn't seem to be mentioned in the STM32F101xx reference manual, so I probably would never have been able to figure this out on my own; not that I have any real skill in reading datasheets.
Anyway, oh joy, I managed to get interrupts working! You can see the code here: https://github.com/blippy/rpi/tree/master/stm32/bare/04-timer-interrupt

Even if you go bare metal, you might still want to use the CMSIS header files that provide declarations and inline version of very basic ARM Cortex elements such NVIC_EnableIRQ.
You can find NVIC_EnableIRQ at https://github.com/ARM-software/CMSIS_5/blob/develop/CMSIS/Core/Include/core_cm3.h#L1508
It's defined as:
#define NVIC_EnableIRQ __NVIC_EnableIRQ
__STATIC_INLINE void __NVIC_EnableIRQ(IRQn_Type IRQn)
{
if ((int32_t)(IRQn) >= 0)
{
__COMPILER_BARRIER();
NVIC->ISER[(((uint32_t)IRQn) >> 5UL)] = (uint32_t)(1UL << (((uint32_t)IRQn) & 0x1FUL));
__COMPILER_BARRIER();
}
}
If you want to, you can ignore __COMPILER_BARRIER(). Previous versions didn't use it.
The definition is applicable to Cortex M-3 chips. It's different for other Cortex versions.

With the libraries is still considered bare metal. Without operating system, but anyway, good that you have a desire to learn at this level. Someone has to write the libraries for others.
I was going to do a full example here, (it really takes very little code to do this), but will take from my code for this board that uses timer1.
You obviously need the ARM documentation (technical reference manual for the cortex-m3 and the architectural reference manual for armv7-m) and the data sheet and reference manual for this st part (no need for programmers manual from either company).
You have provided next to no information related to making the part work. You should never dive right into a interrupt, they are advanced topics and you should poll your way as far as possible before finally enabling the interrupt into the core.
I prefer to get a uart working then use that to watch the timer registers when the roll over, count, etc. Then see/confirm the status register fired, learn/confirm how to clear it (sometimes it is just a clear on read).
Then enable it into the NVIC and by polling see the NVIC sees it, and that you can clear it.
You didn't show your vector table this is key to getting your interrupt handler working. Much less the core booting.
08000000 <_start>:
8000000: 20005000
8000004: 080000b9
8000008: 080000bf
800000c: 080000bf
...
80000a0: 080000bf
80000a4: 080000d1
80000a8: 080000bf
...
080000b8 <reset>:
80000b8: f000 f818 bl 80000ec <notmain>
80000bc: e7ff b.n 80000be <hang>
...
080000be <hang>:
80000be: e7fe b.n 80000be <hang>
...
080000d0 <tim1_handler>:
The first word loads the stack pointer, the rest are vectors, the address to the handler orred with one (I'll let you look that up).
In this case the st reference manual shows that interrupt 25 is TIM1_UP at address 0x000000A4. Which mirrors to 0x080000A4, and that is where the handler is in my binary, if yours is not then two things, one you can use VTOR to find an aligned space, sometimes sram or some other flash space that you build for this and point there, but your vector table handler must have the proper pointer or your interrupt handler won't run.
volatile unsigned int counter;
void tim1_handler ( void )
{
counter++;
PUT32(TIM1_SR,0);
}
volatile isn't necessarily the right way to share a variable between interrupt handler and foreground task, it happens to work for me with this compiler/code, you can do the research and even better, examine the compiler output (disassemble the binary) to confirm this isn't a problem.
ra=GET32(RCC_APB2ENR);
ra|=1<<11; //TIM1
PUT32(RCC_APB2ENR,ra);
...
counter=0;
PUT32(TIM1_CR1,0x00001);
PUT32(TIM1_DIER,0x00001);
PUT32(NVIC_ISER0,0x02000000);
for(rc=0;rc<10;)
{
if(counter>=1221)
{
counter=0;
toggle_led();
rc++;
}
}
PUT32(TIM1_CR1,0x00000);
PUT32(TIM1_DIER,0x00000);
A minimal init and runtime for tim1.
Notice that the NVIC_ISER0 is bit 25 that is set to enable interrupt 25 through.
Well before trying this code, I polled the timer status register to see how it works, compare with docs, clear the interrupt per the docs. Then with that knowledge confirmed with the NVIC_ICPR0,1,2 registers that it was interrupt 25. As well as there being no other gates between the peripheral and the NVIC as some chips from some vendors may have.
Then released it through to the core with NVIC_ISER0.
If you don't take these baby steps and perhaps you have already, it only makes the task much worse and take longer (yes, sometimes you get lucky).
TIM4 looks to be interrupt 30, offset/address 0x000000B8, in the vector table. NVIC_ISER0 (0xE000E100) covers the first 32 interrupts so 30 would be in that register. If you disassemble the code you are generating with the library then we can see what is going on, and or look it up in the library source code (as someone already did for you).
And then of course your timer 4 code needs to properly init the timer and cause the interrupt to fire, which I didn't check.
There are examples, you need to just keep looking.
The minimum is
vector in the table
set the bit in the interrupt set enable register
enable the interrupt to leave the peripheral
fire the interrupt
Not necessarily in that order.

Related

STM32 - Can I subvert the cool context switching for interrupts?

The STM32 family has fantastic interrupt service, they stack a whole slew of extra registers for you, and load the LR with an artificial return to properly unstack while looking for opportunities for tail chaining, aborted entry, etc etc.
HOWEVER....it is too damn slow. I am finding (STM32F730Z8, 200 MHz clock, all code including handlers in ITCM, everything in GNU assembly) that it takes about 120-150 ns overhead to get into an interrupt.
I am still learning about these, used to the old ARM7 where you had to do it all yourself, however, in those chips, if you had a minimal handler you didn't need to stack much.
So -- can I "subvert" the context switching in hardware, and just have it leap to the handler at elevated priority, pausing only to fill the pipeline, and leaving me to take care of stacking what is needed? I don't think so, and haven't seen a way to do it, but I'm working on an extremely tight time-sensitive realtime code, and interrupt switching is eating all my time budget. I'm reverting to doing it all in low-code, polled, but I hate the jitter that gives me on response to pin edges. Help?
No, this is done in pure hardware and is the main defining feature of all "Cortex-M" processors, not just STM32.
150ns at 200MHz is 30 cycles. You can probably get it quite a bit faster.
One way is to mark the floating point unit as unused each time you finish with it, and to set a core flag to tell it not to save the floating point registers. See ARM application note 298 for details.
Another method that you might try is to move your vector table and interrupt handler code to internal SRAM. STM32 has a flash memory accelerator which avoids most wait states on internal flash by performing prefetch of sequential instructions, but an asynchronous interrupt will probably not benefit from this.
that it takes about 120-150 ns overhead to get into an interrupt.
It is not the truth at all
It takes 60ns as it takes 12 clock cycles.

How can I determine interrupt source on an stm32?

I recently ended up in the Default_Handler in my stm32 project and couldn't figure out what was casing it:
.section .text.Default_Handler,"ax",%progbits
Default_Handler:
Infinite_Loop:
b Infinite_Loop <--- here!
By default, a lot of interrupts are mapped to the default handler and the only way I could figure out what the actual interrupt reason was, would be to write handlers for all the interrupts (60+) and pause the code in the debugger. Bah!
I didn't find a good answer googling, so I thought I document the solution for others (or most likely myself in 6 months...)
So, it turns out there are some registers in the NVIC (interrupt controller) that we could use:
The above is from the STM32CubeIDE debugger. NVIC_IABRX contains a bitmask of the currently active interrupts and I can see that NVIC_IABR1 has a non-zero bit (it's 0x1000).
Each IABR reg is 32 bits wide, so with some simple bit counting I conclude that the interrupt source is 32+12 = 44. Now I need to look at the datasheet for my mcu (an stm32wb55) so see what that corresponds to:
Aha, so it's the IPCC that's causing the interrupt! To double check, I added a handler for this specific interrupt
void IPCC_C1_RX_IRQHandler(void)
{
}
And it got called!
Note: I initially just had a look at interrupt vector in the startup_stm32xxx.s file and counted from the start of that but that didn't work out since the first few interrupts are not included in the NVIC_IABRX registers.

Can we edit callback function HAL_UART_TxCpltCallback for our convenience?

I am a newbie to both FreeRTOS and STM32. I want to know how exactly callback function HAL_UART_TxCpltCallback for HAL_UART_Transmit_IT works ?
Can we edit that that callback function for our convenience ?
Thanks in Advance
You call HAL_UART_Transmit_IT to transmit your data in the "interrupt" (non-blocking) mode. This call returns immediately, likely well before your data gets fully trasmitted.
The sequence of events is as follows:
HAL_UART_Transmit_IT stores a pointer and length of the data buffer you provide. It doesn't perform a copy, so your buffer you passed needs to remain valid until callback gets called. For example it cannot be a buffer you'll perform delete [] / free on before callbacks happen or a buffer that's local in a function you're going to return from before a callback call.
It then enables TXE interrupt for this UART, which happens every time the DR (or TDR, depending on STM in use) is empty and can have new data written
At this point interrupt happens immediately. In the IRQ handler (HAL_UART_IRQHandler) a new byte is put in the DR (TDR) register which then gets transmitted - this happens in UART_Transmit_IT.
Once this byte gets transmitted, TXE interrupt gets triggered again and this process repeats until reaching the end of the buffer you've provided.
If any error happens, HAL_UART_ErrorCallback will get called, from IRQ handler
If no errors happened and end of buffer has been reached, HAL_UART_TxCpltCallback is called (from HAL_UART_IRQHandler -> UART_EndTransmit_IT).
On to your second question whether you can edit this callback "for convenience" - I'd say you can do whatever you want, but you'll have to live with the consequences of modifying code what's essentially a library:
Upgrading HAL to newer versions is going to be a nightmare. You'll have to manually re-apply all your changes you've done to that code and test them again. To some extent this can be automated with some form of version control (git / svn) or even patch files, but if the code you've modified gets changed by ST, those patches will likely not apply anymore and you'll have to do it all by hand again. This may require re-discovering how the implementation changed and doing all your work from scratch.
Nobody is going to be able to help you as your library code no longer matches code that everyone else has. If you introduced new bugs by modifying library code, no one will be able to reproduce them. Even if you provided your modifications, I honestly doubt many here will bother to apply your changes and test them in practice.
If I was to express my personal opinion it'd be this: if you think there's bugs in the HAL code - fix them locally and report them to ST. Once they're fixed in future update, fully overwrite your HAL modifications with updated official release. If you think HAL code lacks functionality or flexibility for your needs, you have two options here:
Suggest your changes to ST. You have to keep in mind that HAL aims to serve "general purpose" needs.
Just don't use HAL for this specific peripheral. This "mixed" approach is exactly what I do personally. In some cases functionality provided by HAL for given peripheral is "good enough" to serve my needs (in my case one example is SPI where I fully rely on HAL) while in some other cases - such as UART - I use HAL only for initialization, while handling transmission myself. Even when you decide not to use HAL functions, it can still provide some value - you can for example copy their IRQ handler to your code and call your functions instead. That way you at least skip some parts in development.

Is low latency mode safe to use with Linux serial ports?

Is it safe to use the low_latency tty mode with Linux serial ports? The tty_flip_buffer_push function is documented that it "must not be called from IRQ context if port->low_latency is set." Nevertheless, many low-level serial port drivers call it from an ISR whether or not the flag is set. For example, the mpc52xx driver calls flip buffer unconditionally after each read from its FIFO.
A consequence of the low latency flip buffer in the ISR is that the line discipline driver is entered within the IRQ context. My goal is to get latency of one millisecond or less, reading from a high speed mpc52xx serial port. Setting low_latency acheives the latency goal, but it also violates the documented precondition for tty_flip_buffer_push.
This question was asked on linux-serial on Fri, 19 Aug 2011.
No, low latency is not safe in general.
However, in the particular case of 3.10.5 low_latency is safe.
The comments above tty_flip_buffer_push read:
"This function must not be called from IRQ context if port->low_latency is set."
However, the code (3.10.5, drivers/tty/tty_buffer.c) contradicts this:
void tty_flip_buffer_push(struct tty_port *port)
{
struct tty_bufhead *buf = &port->buf;
unsigned long flags;
spin_lock_irqsave(&buf->lock, flags);
if (buf->tail != NULL)
buf->tail->commit = buf->tail->used;
spin_unlock_irqrestore(&buf->lock, flags);
if (port->low_latency)
flush_to_ldisc(&buf->work);
else
schedule_work(&buf->work);
}
EXPORT_SYMBOL(tty_flip_buffer_push);
The use of spin_lock_irqsave/spin_unlock_irqrestore makes this code safe to call from interrupt context.
There is a test for low_latency and if it is set, flush_to_ldisc is called directly. This flushes the flip buffer to the line discipline immediately, at the cost of making the interrupt processing longer. The flush_to_ldisc routine is also coded to be safe for use in interrupt context. I guess that an earlier version was unsafe.
If low_latency is not set, then schedule_work is called. Calling schedule_work is the classic way to invoke the "bottom half" handler from the "top half" in interrupt context. This causes flush_to_ldisc to be called from the "bottom half" handler at the next clock tick.
Looking a little deeper, both the comment and the test seem to be in Alan Cox's original e0495736 commit of tty_buffer.c. This commit was a re-write of earlier code, so it seems that at one time there wasn't a test. Whoever added the test and fixed flush_to_ldisc to be interrupt-safe did not bother to fix the comment.
So, always believe the code, not the comments.
However, in the same code in 3.12-rc* (as of October 23, 2013) it looks like the problem was opened again when the spin_lock_irqsave's in flush_to_ldisc were removed and mutex_locks were added. That is, setting UPF_LOW_LATENCY in the serial_struct flags and calling the TIOCSSERIAL ioctl will again cause "scheduling while atomic".
The latest update from the maintainer is:
On 10/19/2013 07:16 PM, Jonathan Ben Avraham wrote:
> Hi Peter,
> "tty_flip_buffer_push" is called from IRQ handlers in most drivers/tty/serial UART drivers.
>
> "tty_flip_buffer_push" calls "flush_to_ldisc" if low_latency is set.
> "flush_to_ldisc" calls "mutex_lock" in 3.12-rc5, which cannot be used in interrupt context.
>
> Does this mean that setting "low_latency" cannot be used safely in 3.12-rc5?
Yes, I broke low_latency.
Part of the problem is that the 3.11- use of low_latency was unsafe; too many shared
data areas were simply accessed without appropriate safeguards.
I'm working on fixing it but probably won't make it for 3.12 final.
Regards,
Peter Hurley
So, it looks like you should not depend on low_latency unless you are sure that you are never going to change your kernel from a version that supports it.
Update: February 18, 2014, kernel 3.13.2
Stanislaw Gruszka wrote:
Hi,
setserial has low_latency option which should minimize receive latency
(scheduler delay). AFAICT it is used if someone talk to external device
via RS-485/RS-232 and need to have quick requests and responses . On
kernel this feature was implemented by direct tty processing from
interrupt context:
void tty_flip_buffer_push(struct tty_port *port)
{
struct tty_bufhead *buf = &port->buf;
buf->tail->commit = buf->tail->used;
if (port->low_latency)
flush_to_ldisc(&buf->work);
else
schedule_work(&buf->work);
}
But after 3.12 tty locking changes, calling flush_to_ldisc() from
interrupt context is a bug (we got scheduling while atomic bug report
here: https://bugzilla.redhat.com/show_bug.cgi?id=1065087 )
I'm not sure how this should be solved. After Peter get rid all of those
race condition in tty layer, we probably don't want go back to use
spin_lock's there. Maybe we can create WQ_HIGHPRI workqueue and schedule
flush_to_ldisc() work there. Or perhaps users that need to low latency,
should switch to thread irq and prioritize serial irq to meat
retirements. Anyway setserial low_latency is now broken and all who use
this feature in the past can not do this any longer on 3.12+ kernels.
Thoughts ?
Stanislaw
A patch has been posted to LKML to address the problem. It removes the generic code for handling low_latency but keeps the parameter for the low-level drivers to use.
http://www.kernelhub.org/?p=2&msg=419071
I tried forcing low_latency on Linux 3.12 with serial console. The kernel was very unstable. If preemption was enabled, it would hang after a few minutes of use.
So the answer for now is to stay away.

FreeRTOS Sleep Mode hazards while using MSP430f5438

I wrote an an idle hook shown here
void vApplicationIdleHook( void )
{
asm("nop");
P1OUT &= ~0x01;//go to sleep lights off!
LPM3;// LPM Mode - remove to make debug a little easier...
asm("nop");
}
That should cause the LED to turn off, and MSP430 to go to sleep when there is nothing to do. I turn the LED on during some tasks.
I also made sure to modify the sleep mode bit in the SR upon exit of any interrupt that could possibly wake the MCU (with the exception of the scheduler tick isr in portext.s43. The macro in iar is
__bic_SR_register_on_exit(LPM3_bits); // Exit Interrupt as active CPU
However, it seems as though putting the MCU to sleep causes some irregular behavior. The led stays on always, although when i scope it, it will turn off for a couple instructions cycles when ever i wake the mcu via one of the interrupts (UART), and then turn back on.
If I comment out the LPM3 instruction, things go as planned. The led stays off for most of the time and only comes on when a task is running.
I am using a MSP4f305438
Any ideas?
Perhaps the problem is the call __bic_SR_register_on_exit(LPM3_bits). This macro changes the LPM bits in the stacked SR, so it must know where to find the saved SR on the stack. I believe that __bic_SR_register_on_exit() is designed for the standard interrupt stack frame generated by the compiler when you use the __interrupt directive. However, a preemptive RTOS, like FreeRTOS, uses its own stack frame typically bigger than the stack frame generated by the compiler, because an RTOS must store the complete context. In this case __bic_SR_register_on_exit() called from an ISR might not find the SR on the stack. Worse, it probably corrupts some other saved register value on the stack.
For a preemptive kernel I would not call __bic_SR_register_on_exit() from the ISRs. The consequence is that the idle callback is called only once and never again, because every time the RTOS performs a context switch back to the idle task the side effect is restoring the SR with the LPM bits turned on. This causes a sleep mode (which is what you want), but your LED won't get toggled.
Miro Samek
state-machine.com