HAL_SPI_Transmit works faster than direct registers manipulation, how come?

HAL_SPI_Transmit works faster than direct registers manipulation, how come? - stm32

So knowing the fact that HAL is considered "slow" I decided to rewrtite a small routine in my programm using direct register access. And I decided to see, what I have won. Surprisingly, I actually lost.
So the code is
this->chip_select();
HAL_SPI_Transmit(hspi, spi_array, 3, HAL_MAX_DELAY);
this->chip_deselect();
this->chip_select();
SPI1->DR = spi_array[0];
while (!(SPI1->SR & SPI_SR_TXE));
SPI1->DR = spi_array[1];
while (!(SPI1->SR & SPI_SR_TXE));
SPI1->DR = spi_array[2];
while(SPI1->SR & SPI_SR_BSY);
this->chip_deselect();
So first I send 3 bytes using HAL, and then the same 3 bytes using the registers and same SPI.
Using HAL, the "interbyte" pause is 0,848ms.
And using registers - 1.192ms
How come? Isn't doing things with registers supposed to be quicker?
P.S. The stm32 is l071, 32 Mhz, SPI is 16Mhz.

Ok, so my mistake is - this was done on Debug build with 0 optimisation. With optimisation the register method is faster. Question is how to see the assembly code in Eclipse.

Related

QSPI connection on STM32 microcontrollers with other peripherals instead of Flash memories

I will start a project which needs a QSPI protocol. The component I will use is a 16-bit ADC which supports QSPI with all combinations of clock phase and polarity. Unfortunately, I couldn't find a source on the internet that points to QSPI on STM32, which works with other components rather than Flash memories. Now, my question: Can I use STM32's QSPI protocol to communicate with other devices that support QSPI? Or is it just configured to be used for memories?
The ADC component I want to use is: ADS9224R (16-bit, 3MSPS)
Here is the image of the datasheet that illustrates this device supports the full QSPI protocol.
Many thanks
page 33 of the datasheet

The STM32 QSPI can work in several modes. The Memory Mapped mode is specifically designed for memories. The Indirect mode however can be used for any peripheral. In this mode you can specify the format of the commands that are exchanged: presence of an instruction, of an adress, of data, etc...
See register QUADSPI_CCR.

QUADSPI supports indirect mode, where for each data transaction you manually specify command, number of bytes in address part, number of data bytes, number of lines used for each part of the communication and so on. Don't know whether HAL supports all of that, it would probably be more efficient to work directly with QUADSPI registers - there are simply too many levers and controls you need to set up, and if the library is missing something, things may not work as you want, and QUADSPI is pretty unpleasant to debug. Luckily, after initial setup, you probably won't need to change very much in its settings.
In fact, some time ago, when I was learning QUADSPI, I wrote my own indirect read/write for QUADSPI flash. Purely a demo program for myself. With a bit of tweaking it shouldn't be hard to adapt it. From my personal experience, QUADSPI is a little hard at first, I spent a pair of weeks debugging it with logic analyzer until I got it to work. Or maybe it was due to my general inexperience.
Below you can find one of my functions, which can be used after initial setup of QUADSPI. Other communication functions are around the same length. You only need to set some settings in a few registers. Be careful with the order of your register manipulations - there is no "start communication" flag/bit/command. Communication starts automatically when you set some parameters in specific registers. This is explicitly stated in the reference manual, QUADSPI section, which was the only documentation I used to write my code. There is surprisingly limited information on QUADSPI available on the Internet, even less with registers.
Here is a piece from my basic example code on registers:
void QSPI_readMemoryBytesQuad(uint32_t address, uint32_t length, uint8_t destination[]) {
while (QUADSPI->SR & QUADSPI_SR_BUSY); //Make sure no operation is going on
QUADSPI->FCR = QUADSPI_FCR_CTOF | QUADSPI_FCR_CSMF | QUADSPI_FCR_CTCF | QUADSPI_FCR_CTEF; // clear all flags
QUADSPI->DLR = length - 1U; //Set number of bytes to read
QUADSPI->CR = (QUADSPI->CR & ~(QUADSPI_CR_FTHRES)) | (0x00 << QUADSPI_CR_FTHRES_Pos); //Set FIFO threshold to 1
/*
* Set communication configuration register
* Functional mode: Indirect read
* Data mode: 4 Lines
* Instruction mode: 4 Lines
* Address mode: 4 Lines
* Address size: 24 Bits
* Dummy cycles: 6 Cycles
* Instruction: Quad Output Fast Read
*
* Set 24-bit Address
*
*/
QUADSPI->CCR =
(QSPI_FMODE_INDIRECT_READ << QUADSPI_CCR_FMODE_Pos) |
(QIO_QUAD << QUADSPI_CCR_DMODE_Pos) |
(QIO_QUAD << QUADSPI_CCR_IMODE_Pos) |
(QIO_QUAD << QUADSPI_CCR_ADMODE_Pos) |
(QSPI_ADSIZE_24 << QUADSPI_CCR_ADSIZE_Pos) |
(0x06 << QUADSPI_CCR_DCYC_Pos) |
(MT25QL128ABA1EW9_COMMAND_QUAD_OUTPUT_FAST_READ << QUADSPI_CCR_INSTRUCTION_Pos);
QUADSPI->AR = (0xFFFFFF) & address;
/* ---------- Communication Starts Automatically ----------*/
while (QUADSPI->SR & QUADSPI_SR_BUSY) {
if (QUADSPI->SR & QUADSPI_SR_FTF) {
*destination = *((uint8_t*) &(QUADSPI->DR)); //Read a byte from data register, byte access
destination++;
}
}
QUADSPI->FCR = QUADSPI_FCR_CTOF | QUADSPI_FCR_CSMF | QUADSPI_FCR_CTCF | QUADSPI_FCR_CTEF; //Clear flags
}
It is a little crude, but it may be a good starting point for you, and it's well-tested and definitely works. You can find all my functions here (GitHub). Combine it with reading the QUADSPI section of the reference manual, and you should start to get a grasp of how to make it work.
Your job will be to determine what kind of commands and in what format you need to send to your QSPI slave device. That information is available in the device's datasheet. Make sure you send command and address and every other part on the correct number of QUADSPI lines. For example, sometimes you need to have command on 1 line and data on all 4, all in the same transaction. Make sure you set dummy cycles, if they are required for some operation. Pay special attention at how you read data that you receive via QUADSPI. You can read it in 32-bit words at once (if incoming data is a whole number of 32-bit words). In my case - in the function provided here - I read it by individual bytes, hence such a scary looking *destination = *((uint8_t*) &(QUADSPI->DR));, where I take an address of the data register, cast it to pointer to uint8_t and dereference it. Otherwise, if you read DR just as QUADSPI->DR, your MCU reads 32-bit word for every byte that arrives, and QUADSPI goes crazy and hangs and shows various errors and triggers FIFO threshold flags and stuff. Just be mindful of how you read that register.

STM32 bootloader failure to erase

I am writing an external bootloader for the STM32F730Z8 - (why? I need one windows code that can run the bootloader for the STM32, or use the STM32 to reprog a connected ATF1508 for my client). I've done this before, using info in AN3155 and AN2606. On lesser CPUs, this has had no difficulty (i.e. STM32L4P5). In this case, I try the same:
1-cycle \RESET & BOOT0 to boot to supervisor mode
2-autobaud successfully
3-send 0x00 to get the list of commands, successfully
4-send 01 to get the version and protection, successfully (vers 49, rp and nt both 0)
5-send 02 to get chip id (0x0452), successfully
6-send 0x73 to write-unprotect flash, successfully (i.e. receive back two ACK)
7-send 0x44 to begin an extended erase (intending only to erase sector 0).
This is where it fails. I get neither ACK nor NACK - it just times out. I don't even get to the second half of the extended-erase command where I send it the sector info. (On the STM32L4P5 it succeeds here easily and goes on to finish erasing, then to write code successfully.)
I've tried very long waits & repeat loops to wait for the ACK (many minutes). From past experience this should be fast, it is only the second stage where I tell it how much flash to erase that takes any significant time.
I've inspected the protection option areas of memory, at 0x1FFF0010, 0018, and they are unprotected, as per factory defaults.
I'm communicating over an FT231XS-R, using the D2XX driver calls. I can mess with the baud rates and such, but that only prevents it from autobauding...and we're doing that fine (9600/8/1/E). I've played with the D2XX SetTimeouts - if set too hasty that only screws up earlier commands. I'm wired to a 20 MHz crystal, and the application runs at 200 MHz, but my understanding is that the bootloader just runs at the internal RC clock rate.
I'm certainly missing something stupid, but I didn't see it in the documentation. Help?
Jeff Casey / Rockfield Research Inc. / Las Vegas, NV

Fixed, disregard.
The fineprint of AN3155 clued me in. On the description of the Write Unprotect command, it says that a system reset will be performed after completion. How did I miss this on the STM32L4P5? I just didn't read it. But why did it work then? In the really fine print next page, in a footnote to the flowchart, it says that they were just foolin'....system reset is only called for some (..list omitted..) and for other STM32 products no system reset is called for.
My earlier success had the following sequence:
reboot-supervisor
autobaud
get
gvrp
gid
wpun
xerase
wpun
write
verify
reboot-user
obviously that doesn't work for the F730. what works is:
reboot-super
autobaud
get
gvrp
gid
wpun
reboot-super
autobaud
get gvrp
gid
xerase
reboot-super
autobaud
get
gvrp
gid
write
verify
reboot-user
(obviously I can skip a few of the repeated steps, like get-id, but basically it needed a reboot and re-autobaud.)
note that i had to reboot-super a 3rd time...this was because the write attempt timed out after the xerase unless i went through the whole sequence again. funny, though, the spec doesn't say anything about resetting after an erase. i cross posted this question on the STM32 community site, and I'll do the same with this answer and ping them on this.
Thanks for reading, cheers. Jeff

Syncing of buffer-transmission with ESP32, I2S MEMS-mic and SD-card (FreeRTOS, PlatformIO, ESP-PROG)

i know this forum dislikes "open" questions like this, nevertheless i'd like somebody to help untie the knot in my head, much appreciated.
The goal is simple:
read a stereo 32bit 44100 S/s I2S signal from 2 adafruit sph0645 mics
create a wav-header and store the data onto an SD-card
I've been at this for a few days now and i know that this will be much more complicated than i originally thought. Main reason: signal quality. Like most tutorials on this subject the simplest "hello world" for these mics is a looped polling for I2S-samples. Poll, fill buffer, output via serial or write to SD-card. This returns a choppy, noisy, sped up version of RL-audio. The filling of the internal DMA-buffers can be seen as constant, but the rest is mostly chaos, so
how to i sync these DMA-buffers with the rest of my code?
From experience with the STM32 HAL i'd imagine some register which can be set to throw an interrupt whenever a buffer is full, or an event which can be sent between tasks via queues. Examples on this subject either poll in a main loop with mono an abysmal sample-rate and bit depth or use pages of overkill code and never adress what it does, "just copy and it works", not good. Does the ESP32-Arduino framework provide some way to to this properly? The espressif-documentation isn't something to look forward to, since some of their I2S interface functions don't even work (if you are researching this topic as well, you too might have noticed that i2s_read only returns zeros). Just a hint into the right direction would help, i'm writing my own code anyway. Interrupts? Events? Timers? Polling for full buffers? Only you might know.
have a good one, thx

Thanks to https://github.com/atomic14/ i now have an answer for a syncing-method which works very well. This method has been tried by https://esp32.com/viewtopic.php?t=12546 who also didn't fully understand what was going on: the espressif i2s-interface offers a flag stored in an event which is triggererd every time one of the specified dma-buffers has received a full set of data, ergo, is full. It looks like this:
while(<your condition>){
i2s_event_t evt;
if (xQueueReceive(<your queue>, &evt, portMAX_DELAY) == pdPASS){
if (evt.type == I2S_EVENT_RX_DONE){
size_t bytesRead = 0;
do{
//read data via i2s_read or i2s_read_bytes
} while (bytesRead > 0);
No data is stored in this queue, but rather a flag which can then be used to synchronize dma-filling and further buffering/calculating/sending the read data.
HOWEVER this only works if you install the i2s driver in a specific setup. Instead of using
i2s_driver_install(I2S_NUM_0, &i2s_config, 0, NULL);
in your setup, you can activate the "affinity" for events by passing a queue-handle and a lenght:
i2s_driver_install(I2S_NUM_0, &i2s_config, 4, &<your queue>);
hope this helps getting started, it sure did help me.

I2C: Raspberry Pi (Master) read Arduino (Slave)

I would like to read a block of data from my Arduino Mega (and also from an Arduino Micro in another project) with my Raspberry Pi via I2C. The code has to be in Perl because it's sort of a plug-in for my Home-Automation-Server.
I'm using the Device::SMBus interface and the connection works, I'm able to write and read single Bytes. I can even use writeBlockData with register address 0x00. I randomly discovererd that this address works.
But when I want to readBlockData, no register-address seems to work.
Does anyone know the correct register-address, or is that not even the problem that causes errors?
Thanks in advance

First off, which register(s) are you wanting to read? Here's an example using my RPi::I2C software (it should be exceptionally similar with the distribution you're using), along with a sketch that has a bunch of pseudo-registers configured for reading/writing.
First, the Perl code. It reads two bytes (the output of an analogRead() of pin A0 which is set up as register 80), then bit-shifts the two bytes into a 16-bit integer to get the full 0-1023 value of the pin:
use warnings;
use strict;
use RPi::I2C;
my $arduino_addr = 0x04;
my $arduino = RPi::I2C->new($arduino_addr);
my #bytes = $arduino->read_block(2, 80);
my $a0_value = ($bytes[0] << 8) | $bytes[1];
print "$a0_value\n";
Here's a full-blown Arduino sketch you can review that sets up a half dozen or so pseudo-registers, and when each register is specified, the Arduino writes or reads the appropriate data. If no register is specified, it operates on 0x00 register.
The I2C on the Arduino always does an onReceive() call before it does the onRequest() (when using Wire), so I set up a global variable reg to hold the register value, which I populate in the onReceive() interrupt, which is then used in the onRequest() call to send you the data at the pseudo-register you've specified.
The sketch itself doesn't really do anything useful, I just presented it as an example. It's actually part of my automated unit test platform for my RPi::WiringPi distribution.

Problems using a MCP3008 and a MCP23S17 on SPI with WebIOPi

I'm very new to WebIOPi and I'm trying my first tests. First of all I apologize for my english.
I'm trying to get to work a RPi with a MCP3008 on CE0 and a MCP23S17 on CE1 with SPI bus.
My problem is that devices only work when connected on CE1 (so, when 23017 is on CE0 I am not able to set pins to be inputs or outputs and to set it on 1 or 0, but 3008 is on CE1 and I see its levels changing. When - vice versa - 23017 is on CE1 it is fully functional, but 3008 outputs stay still).
Due to this, I think it is not an hardware issue (I don't have much expertise in electronics, but luckily I don't build my circuits by myself :) ), I think it is a problem in WebIOPi config. Here is my WebIOPi config:
[DEVICES]
mcp1 = MCP23S17 chip:1 slave:0x27
adc0 = MCP3008 chip:0
I only added these two lines to my config file.
I did not touch anything else of my original WebIOPi installation.
In this case (adc0 fully functional, mcp1 not working), when loading the WebIOPi devices monitor I see adc0 levels working good and mcp1 pins randomly changing between being a input and an output and from 0 and 1.
May it be a config error?

Use python and spidev module instead! Look my answer on another thread for function for the mcp3008 chip.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse