stm32f4 HardFault_Handler - need debugging advice - stm32

I'm working on a project based on the stm32f4discovery board using IAR Embedded Workbench (though I'm very close to the 32kb limit on the free version so I'll have to find something else soon). This is a learning project for me and so far I've been able to solve most of my issues with a few google searches and a lot of trial and error. But this is the first time I've encountered a run-time error that doesn't appear to be caused by a problem with my logic and I'm pretty stuck. Any general debugging strategy advice is welcome.
So here's what happens. I have an interrupt on a button; each time the button is pressed, the callback function runs my void cal_acc(uint16_t* data) function defined in stm32f4xx_it.c. This function gathers some data, and on the 6th press, it calls my void gn(float32_t* data, float32_t* beta) function. Eventually, two functions are called, gn_resids and gn_jacobian. The functions are very similar in structure. Both take in 3 pointers to 3 arrays of floats and then modify the values of the first array based on the second two. Unfortunately, when the second function gn_jacobian exits, I get the HardFault.
Please look at the link (code structure) for a picture showing how the program runs up to the fault.
Thank you very much! I appreciate any advice or guidance you can give me,
-Ben
Extra info that might be helpful below:
Running in debug mode, I can step into the function and run through all the lines click by click and it's OK. But as soon as I run the last line and it should exit and move on to the next line in the function where it was called, it crashes. I have also tried rearranging the order of the calls around this function and it is always this one that crashes.
I had been getting a similar crash on the first function gn_resids when one of the input pointers pointed to an array that was not defined as "static". But now all the arrays are static and I'm quite confused - especially since I can't tell what is different between the gn_resids function that works and the gn_jacobian function that does not work.
acc1beta is declared as a float array at the beginning of main.c and then also as extern float32_t acc1beta[6] at the top of stm32f4xx_it.c. I want it as a global variable; there is probably a better way to do this, but it's been working so far with many other variables defined in the same way.
Here's a screenshot of what I see when it crashes during debug (after I pause the session) IAR view at crash
EDIT: I changed the code of gn_step to look like this for a test so that it just runs gn_resids twice and it crashes as soon as it gets to the second call - I can't even step into it. gn_jacobian is not the problem.
void gn_step(float32_t* data, float32_t* beta) {
static float32_t resids[120];
gn_resids(resids, data, beta);
arm_matrix_instance_f32 R;
arm_mat_init_f32(&R, 120, 1, resids);
// static float32_t J_f32[720];
// gn_jacobian(J_f32, data, beta);
static float32_t J_f32[120];
gn_resids(J_f32, data, beta);
arm_matrix_instance_f32 J;
arm_mat_init_f32(&J, 120, 1, J_f32);

Hardfaults on Cortex M devices can be generated by various error conditions, for example:
Access of data outside valid memory
Invalid instructions
Division by zero
It is possible to gather information about the source of the hardfault by looking into some processor registers. IAR provides a debugger macro that helps to automate that process. It can be found in the IAR installation directory arm\config\debugger\ARM\vector_catch.mac. Please refer to this IAR Technical Note on Debugging Hardfaults for details on using this macro.
Depending on the type of the hardfault that occurs in your program you should try to narrow down the root cause within the debugger.

Related

Sudden too many errors: methods are undefined

My DES model was working well till I've added a new function (just like many others added before) and the model gave me all these error messages:
I wonder what do these messages mean and what made all of them appear suddenly. I was running the model just before it without any errors. Is there a way I could get my model back? :)
Thank you.
Edit 1:
Here is the newly created function: Newly created function
Given that I've ignored it, and made all the callings of it as comments, then the errors were still appearing. I've closed this model version and opened a backup then created the same function and the whole scenario happened again with the same 94 errors.
Is there something I need to change in the function itself?
The function checks the number of agents in certain parts of the model to stop delay blocks accordingly. That's done by adding variables that increment on exiting of agents and get the difference between them.
Typically this is caused by a syntax error in the new function that you created. See if you ignore this and then compile if it gets fixed.
Most of the time the compiler is smart enough to show you the error, but often if you replace a { with [ or you leave out a { the compiler get very confused and gives you 100s of errors.
If yous still cant find the issue maybe paste the code of your new function in the question.

ARM Eclipse debugging code in the RAM. Is it possible to see the source code`

I have a problem when try to debug the code which is copied to the SRAM and executed from there.
The code is overwriting the data - but it is done only during the system update. The sections where code is placed are correctly defined in the linker script file and the debuger correctly see the addresses. But when I step into the function (and the code in RAM is the correct one) it does not connect the source files with the code executed in the memory.
Do you know how can it be done. Debugging C code on the assembler level is not something which makes me happy :)
Any help appreciated.
The problem is a bit silly. When you call RAM function from the FLASH (the first call has to be done this way) it has to be done by the veneer. It was messing up the debugger. But having own calling macro (because of the distance it has to be done via the pointer) everything works fine
example calling macro.
#define RAMFCALL(func, ...) {unsigned (* volatile fptr)() = (unsigned (* volatile)())func; fptr(__VA_ARGS__);}

LibOpenCM3 vector table is all blocking-handler

The answer to this question here
Libopencm3 interrupt table on STM32F4
explains the whole mechanism nicely but what I get is whole vector table filled with blocking handlers.
I know that because I see it in debugger (apart from the whole thing not working): disassembly screenshot showing vector table.
It is as though linker simply ignores my nicely defined interrupt handler function(s), e.g.:
void sys_tick_handler(void)
{
...
}
void tim1_up_isr(void)
{
...
}
I am using EmBitz IDE and have followed this tutorial here to get libopencm3 to work (and it does work except for this issue).
I have checked the function names n-fold and have tried several online examples including those from the libopencm3-examples project.
Everything compiles without a glitch and loads into the target board (STM32F103C8) and runs fine - except no ISRs get invoked (I do get interrupt(s) but they get stuck in blocking handlers).
Does anyone have an idea why is this happening?
It looks like linking with standard vector table (from ST's SPL or HAL).
To check this, try to rename your sys_tick_handler() to SysTick_Handler() and
tim1_up_isr() to TIM1_UP_IRQHandler().
If it works, find file with this SysTick_Handler and TIM1_UP_IRQHandler (I think, that will be startup*.s) and delete it from your project.

release build variable corruption when using ne10 math library assembly function

has anyone experience the following issue?
A stack variable getting changed/corrupted after calling ne10 assembly function such as ne10_len_vec2f_neon?
e.g
float gain = 8.0;
ne10_len_vec2f_neon(src, dst, len);
after the call to ne10_len_vec2f_neon, the value of gain changes as its memory is getting corrupted.
1. Note this only happens when the project is compiled in release build but not debug build.
2. Does Ne10 assembly functions preserve registers?
3. Replacing the assembly function call to c equivalent such as ne10_len_vec2f_c and both release and debug build seem to work OK.
thanks for any help on this. Not sure if there's an inherent issue within the program or it is really the call to ne10_len_vec2f_neon causing the corruption with release build.enter code here
I had a quick rummage through the master NEON code here:
https://github.com/projectNe10/Ne10/blob/master/modules/math/NE10_len.neon.s
... and it doesn't really touch address-based stack at all, so not sure it's a stack problem in memory.
However based on what I remember of the NEON procedure call standard q4-q7 (alias d8-d15 or s16-s31) should be preserved by the callee, and as far as I can tell that code is clobbering q4-6 without the necessary save/restore, so it does indeed look like it's clobbering the stack in registers.
In the failed case do you know if gain is still stored in FPU registers, and if yes which ones? If it's stored in any of s16/17/18/19 then this looks like the problem. It also seems plausible that a compiler would choose to use s16 upwards for things it needs to keep across a function call, as it avoids the need to touch in-RAM stack memory.
In terms of a fix, if you perform the following replacements:
s/q4/q8/
s/q5/q9/
s/q6/q10/
in that file, then I think it should work; no means to test here, but those higher register blocks are not callee saved.

Initialization commands cannot be evaluated

I took the 'Performance of Three PSS for Interarea Oscillations' simulink model. I have disabled the link for 'Delta w PSS' block and modified (removed some connections and added new blocks) it according to my usage. Now when I simulate I am getting an error "Initialization commands cannot be evaluated" on this 'Delta w PSS'block.
What should I do to solve this problem?
Thanks
My guess is that it cannot initialize some of the mask parameters. If you right-click on the block and choose "View Mask...", you'll see what the initialization commands are under the "Initialization" block (you may have to go back to the library block). Unfortunately, once you start breaking/disabling library links and modifying library blocks, you're on your own...
In Power electronics, I saw this error and I have added a block named powergui and it solved. to find it I searched powergui
it may help you to find such this block in your kind of use