Understanding ITM and SWV so that the useful printf function can be used for debugging is well worth the effort. But it doesn't work on one of ST's latest boards, the NUCLEO-H723ZG. This is a real hair-tearing experience, as printf works so easily on another board, the NUCLEO-H743ZI2.
I am using STM32CubeIDE v 1.7.0. Has anyone found a solution to this issue with the H723ZG chip, or more likely, the ST development software for that chip?
ITM and printf do work on the Nucleo-H723ZG.
With STM32CubeIDE v 1.7.0 (the latest version), the default board clock is 550 MHz. The trace clock, however, is 275 MHz, as shown on the Clock Configuration diagram. When enabling SWV in the Debug Configuration properties, the Core Clock frequency must be set to 275 MHz (not 550).
This is different from configuring the Nucleo-H743ZI2 board for its maximum clock frequency of 480 MHz. In that case, set the SWV core clock to 480 MHz, not the trace clock frequency.
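One way to picture why the entered value matters, as a rough sketch only: assume the debugger picks an integer divider from the clock value typed into the SWV settings and the requested SWO rate, and the hardware then divides whatever clock really feeds the SWO output. The function names and the 2.5 MHz SWO rate below are hypothetical and not taken from ST documentation.

```python
def swo_divider(entered_clock_hz, swo_rate_hz):
    """Integer divider a debugger might derive from the clock typed into the SWV dialog (assumed model)."""
    return max(1, entered_clock_hz // swo_rate_hz)


def actual_swo_rate(real_clock_hz, entered_clock_hz, swo_rate_hz):
    """SWO bit rate the target really produces when the real clock differs from the entered one."""
    return real_clock_hz / swo_divider(entered_clock_hz, swo_rate_hz)


REAL_TRACE_CLOCK_HZ = 275_000_000  # trace clock reported for the H723ZG above
WANTED_SWO_HZ = 2_500_000          # hypothetical SWO rate expected by the probe

# Entering the 550 MHz core clock halves the real bit rate, so the probe cannot decode it.
wrong = actual_swo_rate(REAL_TRACE_CLOCK_HZ, 550_000_000, WANTED_SWO_HZ)
# Entering the 275 MHz trace clock gives the rate the probe expects.
right = actual_swo_rate(REAL_TRACE_CLOCK_HZ, 275_000_000, WANTED_SWO_HZ)
print(wrong, right)
```

Under this model the probe and the target only agree on the bit rate when the entered clock matches the clock that actually drives the SWO output.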
Thanks to STM support for helping me solve this conundrum.
Testing out an STM32 with an FDCAN module (updated from the older BxCAN peripheral). CAN Classic at 500kbps.
I am running into an issue where, when using the default pair of pins (D0/D1 in my case), I get the expected behavior, but when switching the pins to the secondary option (B8/B9) using GPIO remapping, I get strange output on the bus.
I tried baud settings and options like protocol exception. Nothing seems to explain where this scope output is coming from.
I'm using the HAL to get this working, so I'm certain I'm not missing any registers on remapping. I've de-initialized and re-initialized the FDCAN module, started and stopped it, etc. There seems to be no documented "process" for remapping pins. The entire FDCAN section of the reference manual doesn't contain the letters GPIO.
Picture: Yellow is the CANTX 0-3 V signal (low is dominant). Purple is the CAN+ signal, which idles at 2.5 V and pulls past ~3.5 V on a dominant bit. There is nothing else on this line, so I'm not concerned about the sawtoothing. The large initial CAN "SOF" pulse has the wrong timing. The long recessives are nonsense. Then the short value-1 bits are the correct 2 µs pulses for 500 kbps. Changing the data put into the FDCAN FIFO makes no difference; the output is always the same.
Solved.
After sending this message, the INIT bit was set in the FDCAN->CCCR register and there were values in the error counters, which indicates an internal error. I was using the HAL as a time saver, but it was overwriting my desired GPIO settings.
I would set the pins B8/B9 to AF mode for FDCAN, then call FDCAN_DeInit/Init, which via an MSP_INIT callback also calls GPIO Init, but for the original D0/D1 pins. That meant the B8/B9 pins I had set and the D0/D1 pins were enabled at the same time.
This is an obvious problem. The HAL is fine for prototyping, but be careful, because it will try to "help". Undefined behavior at best, and I normally wouldn't even post such a dumb mistake.
However... maybe someone else finds it interesting that whatever the FDCAN state machine is doing here produces the unique output seen in the scope picture. I initially didn't double-check my pin setup because it looked right: I was getting output on the scope, just the wrong output. I spent much more time going over the peripheral settings and the data I was feeding to it.
I'm currently working on a project where a neural network plays a game similar to Atari games (more details in the link). I'm having trouble with the indexing. Perhaps someone knows what the problem could be? I can't seem to find it. Thank you for your time. Here's my code (click on the link) and here's the full traceback. The problem starts from the way I call
history = network.fit(state, epochs=10, batch_size=10)  # line 82
See this post: Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
As said in the correct answer,
Modern CPUs provide a lot of low-level instructions, besides the usual arithmetic and logic, known as extensions, e.g. SSE2, SSE4, AVX, etc. From the Wikipedia:
The warning states that your CPU does support AVX (hooray!).
Pretty much, AVX speeds up your training, etc. Sadly, tensorflow is saying that they aren't going to use it... Why?
Because the default tensorflow distribution is built without CPU extensions such as SSE4.1, SSE4.2, AVX, AVX2, FMA, etc. The default builds (the ones from pip install tensorflow) are intended to be compatible with as many CPUs as possible. Another argument is that even with these extensions a CPU is a lot slower than a GPU, and medium- and large-scale machine-learning training is expected to be performed on a GPU.
What should you do?
If you have a GPU, you shouldn't care about AVX support, because the most expensive ops will be dispatched to a GPU device (unless explicitly set not to). In this case, you can simply silence this warning with:
# Just disables the warning, doesn't enable AVX/FMA.
# Set the variable before importing tensorflow so that it takes effect.
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
If you don't have a GPU and want to utilize CPU as much as possible, you should build tensorflow from the source optimized for your CPU with AVX, AVX2, and FMA enabled if your CPU supports them. It's been discussed in this question and also this GitHub issue. Tensorflow uses an ad-hoc build system called bazel and building it is not that trivial, but is certainly doable. After this, not only will the warning disappear, tensorflow performance should also improve.
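Before spending time on a source build, it can help to confirm which of these extensions the CPU actually reports. The following is a minimal sketch, assuming an x86 Linux machine where /proc/cpuinfo contains a "flags" line; the helper names are made up for illustration and are not part of tensorflow.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. a non-x86 machine)


def supported_extensions(cpuinfo_text,
                         wanted=("sse4_1", "sse4_2", "avx", "avx2", "fma")):
    """Map each wanted extension name to True/False depending on the reported flags."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


# Hypothetical sample text so the sketch runs anywhere.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 sse4_1 sse4_2 avx avx2 fma\n"
print(supported_extensions(SAMPLE))
```

On a real Linux system the same function could be fed the contents of /proc/cpuinfo, for example supported_extensions(open("/proc/cpuinfo").read()).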
You can find all the details and comments in this StackOverflow question.
NOTE: This answer is a product of my professional copy-and-pasting.
Happy coding,
Bobbay
Has the code been debugged line by line? That would trace the problem to the line causing the error.
I assume the index error crops up in the line below, where "i" and then targets[i] and outs[i] can be checked for the values they hold:
per_sample_losses = loss_fn.call(targets[i], outs[i])
I am using a Raspberry Pi 2 board with Raspbian loaded. I need to do SPI by bit-banging and interface an MCP3208.
I have taken code from GitHub. It is written for the MCP3008 (10-bit ADC).
Only change I made in code is that instead of calling:
adcValue = recvBits(12, clkPin, misoPin)
I called adcValue = recvBits(14, clkPin, misoPin), since I have to receive 14 bits of data.
Problem: It keeps returning random data ranging from 0 to 10700, even though the maximum should be 4095. That means I am not reading the data correctly.
I think the problem is that the MCP3208 has a maximum clock frequency of 2 MHz, but in the code there is no delay between two consecutive data reads or writes. I think I need to add a delay of 0.5 µs at every clock transition, since I am operating at 1 MHz.
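Independently of timing, values above 4095 cannot come from 12 data bits alone, so some of the 14 received bits must be non-data bits that end up in the result. Here is a minimal sketch of masking them off, assuming the two leading bits are not data (for example a floating bit and the null bit) and the last 12 bits are the sample, MSB first. recv_bits and read_bit are hypothetical stand-ins for the recvBits function and GPIO access in the original code.

```python
def recv_bits(num_bits, read_bit):
    """Shift in num_bits, MSB first. read_bit is a hypothetical callable that
    clocks the ADC once and returns the MISO level (0 or 1)."""
    value = 0
    for _ in range(num_bits):
        value = (value << 1) | (read_bit() & 1)
    return value


def extract_sample(raw, data_bits=12):
    """Keep only the lowest data_bits bits, discarding any leading non-data bits."""
    return raw & ((1 << data_bits) - 1)


# Simulated transfer: one floating bit read as 1, a null bit, then the 12-bit value 0xABC.
sample_bits = [1, 0] + [(0xABC >> shift) & 1 for shift in range(11, -1, -1)]
bit_source = iter(sample_bits)
raw = recv_bits(14, lambda: next(bit_source))
print(raw, extract_sample(raw))
```

In this simulation the unmasked value exceeds 4095 purely because of the leading floating bit, while the masked value is the 12-bit sample.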
For a small delay I am currently reading Accurate Delays on the Raspberry Pi
Excerpt:
...when we need accurate short delays in the order of microseconds, it’s
not always the best way, so to combat this, after studying the BCM2835
ARM Peripherals manual and chatting to others, I’ve come up with a
hybrid solution for wiringPi. What I do now is for delays of under
100μS I use the hardware timer (which appears to be otherwise unused),
and poll it in a busy-loop, but for delays of 100μS or more, then I
resort to the standard nanosleep(2) call.
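The same hybrid idea can be sketched in Python, with the caveat that this swaps in time.perf_counter_ns() polling for the BCM2835 hardware timer that wiringPi uses, and that the interpreter adds overhead of its own. The function name and the threshold constant are illustrative only.

```python
import time

BUSY_WAIT_LIMIT_US = 100  # same 100 microsecond threshold as in the excerpt


def delay_microseconds(us):
    """Busy-poll a monotonic clock for short delays, sleep for longer ones."""
    if us <= 0:
        return
    if us < BUSY_WAIT_LIMIT_US:
        # Short delay: spin on the clock instead of yielding to the scheduler.
        deadline = time.perf_counter_ns() + int(us * 1000)
        while time.perf_counter_ns() < deadline:
            pass
    else:
        # Longer delay: let the operating system put the process to sleep.
        time.sleep(us / 1_000_000)


start = time.perf_counter_ns()
delay_microseconds(50)
elapsed_ns = time.perf_counter_ns() - start
print(elapsed_ns)
```

The busy loop guarantees the delay is at least as long as requested, but it can overshoot whenever the process is preempted.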
I finally found some py code to simplify reading from the 3208 thanks to RaresPlescan.
https://github.com/RaresPlescan/daisypi/blob/master/sense/mcp3208/adc_3.py
I had a data logger built on the Pi that was using a 3008. The COTS data logger I was trying to replicate had better resolution, so I started looking for a 12-bit ADC and found the 3208. I literally swapped the 3008 out for the 3208, and with this guy's code I have achieved better resolution than the COTS data logger.
I am working with an STM32F4 microcontroller, and I am unable to use inline assembly that I am trying to port from another ARM processor. I have no idea where to begin trying to figure out the problem.
There is an easy way: you can use the asm keyword.
asm("NOP"); for example, will wait for one clock cycle and carry on. You can build on that from there.
Well, I would normally say that you should post your code, but in this particular case, I would advise you to always do a little homework on processor architecture when working with microcontrollers.
The STM32F4 (Cortex-M4 processor architecture) does not support the classic 32-bit ARM instruction set, unlike the ARM7 and many other ARM processors. Cortex-M4 processors execute only Thumb code using Thumb-2 technology, a mix of 16-bit and 32-bit instruction encodings that covers most of what the ARM instruction set offers, so there are no arm->thumb or thumb->arm switches (or instructions for them).
If a Windows executable makes use of SYSENTER and is executed on a processor implementing the AMD64 ISA, what happens? I am new to this topic (OSes, hardware/software interaction), but from what I've read I understand that SYSCALL is the AMD64 equivalent of Intel's SYSENTER. Hopefully this question makes sense.
If you try to use SYSENTER where it is not supported, you'll probably get an "invalid opcode" exception.
Note that this situation is unusual - generally, Windows executables do not directly contain instructions to enter kernel mode.
As far as I know, AMD64 processors use different operating modes to handle such issues.
SYSENTER works fine but is not that fast.
A very useful site to get started about the different modes:
Wikipedia
They got rid of a bunch of unused functionality when they developed the AMD64 extensions. One of the main changes is that the cs, ds, es, and ss segment registers are largely ignored in 64-bit mode. Normally, loading segment registers is an extremely expensive operation (the CPU has to do permission checks, which could involve multiple memory accesses). Entering kernel mode requires loading new segment register values.
The SYSENTER instruction accelerates this by having a set of "shadow registers" which it can copy directly to the (internal, hidden) segment descriptors without doing any permission checks. The vast majority of the benefit is lost with only a couple of segment registers, so most likely the reasoning for removing support for the instruction is that using regular instructions for the mode switch is faster.