STM32 - Delay between UART TX frames at high speed

I am sending a fixed-size buffer (512 bytes) at 8 Mbit/s via UART out of an STM32F3, and I am experiencing what seems to be a fixed delay (~2-3 bit periods) between consecutive frames.
In the screenshot below, after sending the dummy value 01010101, the line can be seen idling high for quite a long time between the stop bit ("1") of one frame and the start bit ("0") of the next. The bit period of ~125 ns is as expected and the data is received successfully by another STM32, but such a cumulative delay between frames (at least 2 bit periods × 512 frames × 125 ns = 128 us over the entire buffer) is a problem for my application.
Scope screenshot
I have tried sending the buffer using HAL_UART_Transmit, HAL_UART_Transmit_DMA (single call), and LL_USART_TransmitData8 (one byte at a time), with similar results.
Any idea what could be causing it? Thanks!
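For comparison, a bare polling loop over the LL API might look like the sketch below (assuming an STM32F3 with the LL drivers; uart_send_buffer is a made-up helper name). The point is to wait on TXE (transmit data register empty) between bytes and on TC (transmission complete) only once at the end, since waiting on TC after every byte inserts roughly a frame of idle time per byte:

    #include "stm32f3xx_ll_usart.h"
    #include <stddef.h>
    #include <stdint.h>

    // Push each byte as soon as the transmit data register empties (TXE);
    // wait for transmission complete (TC) only after the last byte.
    void uart_send_buffer(USART_TypeDef *uart, const uint8_t *buf, size_t len)
    {
        for (size_t i = 0; i < len; i++) {
            while (!LL_USART_IsActiveFlag_TXE(uart)) { }  // wait for TDR empty
            LL_USART_TransmitData8(uart, buf[i]);
        }
        while (!LL_USART_IsActiveFlag_TC(uart)) { }       // last frame still shifting out
    }

Note that at 8 Mbit/s a 10-bit frame lasts only 1.25 us (about 90 CPU cycles at 72 MHz), so even modest per-byte software overhead in a HAL call or interrupt handler shows up as a visible gap between frames.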

Related

Recording an IR signal and playing it again on Raspberry Pi bullseye

I have a Teco aircon with an infrared remote. I connected an IR receiver and an IR transmitter to my Pi and they are working fine.
AC systems are different from TVs, in that they send a rather long sequence of information each time you press a button. irrecord fails miserably trying to work out which button you pressed. All it gets is the gap (50000) and the frequency (38000).
I took strong inspiration from this post on how to do it. So...
First, I record what's being sent:
ir-ctl -rTEMPERATURE.txt --mode2 --device=/dev/lirc1 -1
This is a specific mode, with a specific temperature.
The file is saved and looks like this:
pulse 3117
space 1536
pulse 487
space 1093
pulse 487
space 1095
(...227 lines...)
space 1061
pulse 516
space 300
pulse 517
I then try to resend the exact same thing to my aircon:
ir-ctl -d /dev/lirc0 -sTEMPERATURE.txt -g 50000 --carrier 38000
But nothing happens. The aircon doesn't seem to respond.
(Note that I know the transmitter is working, because I tried to run the sending and the receiving at the same time, and the receiver acknowledged receiving data.)
I tried different "carrier" options, and none of them worked.
So, questions:
1. Why isn't the aircon getting the message?
2. How do I translate TEMPERATURE.txt into actual data (that is, bytes)?
3. If I manage to do (2), how do I then send that rather than a bunch of pulses?
4. Am I doing this fundamentally wrong?
I spent my whole weekend on this... and am frankly at my wits' end. I will deeply appreciate any hints.
UPDATE: Things got much weirder. I looked at the actual pulses and spaces and translated them into on/off bits and then bytes. The result was alarming: no matter what button I press, the end result is always C4D36440020000000A6 (in hexadecimal). The lengths of the pulses and spaces vary a little, but that is ALWAYS the end result. Surely it can't be...? Surely different keys should produce different data?
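For what it's worth, the translation step might look like the sketch below, assuming the common pulse-distance scheme that the timings above suggest (a ~490 us pulse before every bit, then a short space for 0 or a long space for 1). The 800 us threshold and the bit order are guesses:

    #include <stdio.h>

    int main(void)
    {
        // A few of the recorded durations in microseconds, alternating
        // pulse/space, with the long leader pair (3117/1536) skipped.
        unsigned t[] = { 487, 1093, 487, 1095, 516, 300, 517, 1061 };
        size_t n = sizeof t / sizeof t[0];

        for (size_t i = 1; i < n; i += 2)        // the spaces carry the bits
            putchar(t[i] > 800 ? '1' : '0');     // guessed short/long threshold
        putchar('\n');
        return 0;
    }

In particular, a badly placed threshold classifies every space the same way, which would make every key decode to identical bits; checking the bit order (many remotes send LSB-first) is also worthwhile before comparing against any documented protocol.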

STM32 ADC: leave it running at 'high' speed or switch it off as much as possible?

I am using a G0 with one ADC and 8 channels. It works fine. I use 4 channels. One is temperature, which is measured constantly, and I am interested in the value every 60 s. Another is almost the opposite: it measures sound waves for a couple of minutes per day, and I need those samples at 10 kHz.
I solved this by letting all 4 channels sample at 10 kHz and having the four readings moved to memory by DMA (an array of length 4 with one measurement each). Every 60 s I take the temperature, and when I need the audio, I retrieve the audio values.
If I had two ADCs, I would run the temperature ADC for one conversion every 60 s, non-stop, and I would only start the audio ADC for the couple of minutes a day that it is needed. But with the one-ADC solution it seems simpler to let all conversions run at this high speed continuously, which raises my question: is there any true downside to having 40,000 conversions per second, 24 hours per day? If not, the code is simple; I just have the most recent values in memory all the time. Or might I ruin the chip? I know I use too much energy, but there is plenty of it in this case.
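For reference, the continuous 4-channel DMA arrangement described above might be set up like this with the stock HAL (a sketch assuming the ADC is configured for scan + continuous conversion and the DMA channel for circular mode; hadc1 and adc_values are the usual CubeMX-style names):

    #include "stm32g0xx_hal.h"

    extern ADC_HandleTypeDef hadc1;          // configured elsewhere (CubeMX)

    static volatile uint16_t adc_values[4];  // DMA target: one slot per ranked channel

    void adc_start(void)
    {
        // With circular DMA, the array always holds the latest conversion of
        // each channel; readers simply index into it whenever they need a value.
        HAL_ADC_Start_DMA(&hadc1, (uint32_t *)adc_values, 4);
    }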
You aren't going to "wear it out" by running it when you don't need to.
The main problems are wasting power and RAM.
If you waste enough of those, then the lesser problems are:
The wasted power becomes heat, which may upset your temperature measurements (though the amount is very small).
Having the DMA running will increase your interrupt latency and may also slow the processor slightly if it encounters bus contention (this only matters if you are close to capacity in those regards).
Having it running all the time may also have the advantage of more stable readings, since nothing is perturbed by switching things on and off.

Optimal size for a ring buffer with single producer and single consumer

I have a single producer, single consumer problem which (I believe) can be solved using a circular/ring buffer.
I have a microcontroller running an RTOS, with an ISR (Interrupt Service Routine) handling UART (serial port) interrupts. When the UART raises an interrupt, the ISR posts the received characters into the circular buffer. In another RTOS task (called packet_handler), I read from this circular buffer and run a state machine to decode the packet. A valid packet is 64 bytes including all the framing bytes.
The UART operates at 115200 baud, and a packet arrives every 10 ms. The packet_handler task also runs every 10 ms. However, the handler may sometimes be delayed by the scheduler when a higher-priority task executes.
If I use an arbitrarily large circular buffer, there are no packet drops. But how do I determine the optimal circular buffer size, at least theoretically? How is this decision made in practice?
Currently, I detect buffer overruns through some instrumentation functions and then increase the buffer size to reduce packet loss.
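For context, a minimal single-producer/single-consumer ring buffer of the kind described might look like this (a sketch assuming a power-of-two size, one ISR writer, and one task reader; the names are made up):

    #include <stdbool.h>
    #include <stdint.h>

    #define RB_SIZE 256u                        // must be a power of two
    static uint8_t rb_data[RB_SIZE];
    static volatile uint32_t rb_head, rb_tail;  // free-running indices

    bool rb_put(uint8_t c)                      // called from the UART ISR
    {
        if (rb_head - rb_tail == RB_SIZE)
            return false;                       // full: byte would be dropped
        rb_data[rb_head & (RB_SIZE - 1u)] = c;
        rb_head++;
        return true;
    }

    bool rb_get(uint8_t *c)                     // called from packet_handler
    {
        if (rb_head == rb_tail)
            return false;                       // empty
        *c = rb_data[rb_tail & (RB_SIZE - 1u)];
        rb_tail++;
        return true;
    }

Each index has exactly one writer, which is what makes the lock-free scheme sound for the single-producer/single-consumer case; the sizing question is then about choosing RB_SIZE.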
You won't ever be completely safe, as you are dealing with a stochastic process (according to your explanation).
Answering your question: you would need an infinite buffer to cover the case where the consumer task sits in the ready state for an unbounded time. So you will have to change something in your initial approach:
Increase the priority of the consumer to guarantee its 10 ms execution (the smallest-buffer approach, though it may not be possible).
Characterize your system better, so you can bound the maximum stretch of time during which the consumer task won't execute (make your system as predictable as possible).
Accept lost packets with a bounded buffer size (which may not be safe).
I would calculate it this way:
  64 bytes just received
+ 64 bytes still in the ring buffer
+ 100% margin to be safe
===================
  256-byte buffer
But this is just a guess. You would have to run some worst-case tests with your buffer and then add 100% on top to be safe.
While all of the above answers are correct and throw light on the issue, this page summarizes the factors to consider when choosing the size of a ring buffer.
Some queuing models can be used to analyze the problem theoretically and derive a suitable ring buffer size.
A more pragmatic approach is to start with a large buffer, measure the maximum buffer usage in a realistic test case (this process is called watermarking), and use that figure in the final code.
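A hedged sketch of that watermarking instrumentation (names made up):

    #include <stddef.h>

    static size_t rb_high_water;             // maximum fill level ever seen

    static void rb_note_fill(size_t current_fill)
    {
        if (current_fill > rb_high_water)
            rb_high_water = current_fill;    // read out after the test run
    }

Call rb_note_fill() from the producer after each insertion, run the system under worst-case load, and size the final buffer from rb_high_water plus margin.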
It is simply a matter of determining the maximum possible delay - the sum of the execution times of all higher-priority tasks that can run - and dividing that by the packet arrival interval; that quotient is the number of packets that can pile up, which you then multiply by the 64-byte packet size.
This may not be straightforward, but it can be simplified by making only the most deterministic tasks higher priority and moving less deterministic, longer-running tasks to lower priorities according to rate-monotonic principles. Tasks that usually run briefly but sporadically run longer are candidates for being split into two tasks (with a further queue) so the longer work is offloaded to a lower priority.
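Under those assumptions, the sizing rule reduces to a few lines (a sketch using the 64-byte/10 ms figures from the question):

    #include <stdint.h>

    #define PACKET_SIZE        64u   // bytes per framed packet
    #define PACKET_INTERVAL_MS 10u   // one packet arrives every 10 ms

    static uint32_t ring_buffer_bytes(uint32_t worst_case_delay_ms)
    {
        // Packets that pile up while the consumer is blocked, rounded up,
        // plus one packet that may be arriving right now.
        uint32_t backlog = (worst_case_delay_ms + PACKET_INTERVAL_MS - 1u)
                           / PACKET_INTERVAL_MS;
        return (backlog + 1u) * PACKET_SIZE;
    }

For example, a 30 ms worst-case delay gives (3 + 1) * 64 = 256 bytes, which matches the rule-of-thumb answer above.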

Understanding double buffers

I am using the C8051F320 and basing my firmware on the HID example firmware (for example, BlinkyExample).
IN and OUT reports are each 64B long (a single 64B packet).
I enabled the ADC and set it for 10kSps. Every ADC interrupt, a sample is stored in an array. When enough samples are taken to fill a packet, an IN packet is sent.
Software sends a report telling the firmware how many reports to return.
1) The example firmware uses EP1, which has 128B. It splits the EP into IN and OUT, 64B each.
The firmware drops the first sample of each IN report at 10kSps. At 5kSps it runs fine.
2) I modified EP1 to be double buffered, but each buffer is only 32B long now. Regardless, streaming thousands of IN reports of 10kSps data works great (confirmed by an FFT of the sampled sine wave in software).
3) I modified the firmware to use EP2, since that has 256B total, giving 64B if splitting and double buffering.
a) Again, at 10kSps, the first sample of each packet is dropped. Why? It runs fine at 5kSps.
Actually, I cannot quite visualize how double buffering works (see the sketch after this question). If the sample rate is faster than the HID transfer rate, the FIFOs will overflow regardless, right? How does double buffering help? And it seems that for double buffering to be effective, the packet size must be cut in half.
b) While switching references from EP1 to EP2, I came across this code in F3xx_USB0_Standard_Requests.c: DATAPTR = (unsigned char*)&ONES_PACKET;. Setting a char* to the address of a char* does not seem correct to me, so I modified it to DATAPTR = (unsigned char*)ONES_PACKET;. Either way, there is no difference in behavior. What do the zeros and ones arrays do?
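To illustrate the ping-pong idea behind double buffering (a generic sketch, not specific to the C8051 USB FIFOs): the ADC interrupt fills one half while the other half is being transmitted, so sampling never has to stop during a transfer. Double buffering hides transfer latency; it does not add bandwidth, so if the average sample rate exceeds the average transfer rate, it will still overflow.

    #include <stdbool.h>
    #include <stdint.h>

    static uint8_t pkt[2][32];               // two 32-byte halves
    static volatile uint8_t fill_idx;        // half currently being filled
    static volatile bool    ready[2];        // halves waiting to be sent

    void adc_isr_store(uint8_t sample)
    {
        static uint8_t pos;
        pkt[fill_idx][pos++] = sample;
        if (pos == sizeof pkt[0]) {
            ready[fill_idx] = true;          // hand the full half to the USB code
            fill_idx ^= 1u;                  // keep sampling into the other half
            pos = 0;
        }
    }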
HID uses interrupt-type endpoints, which transfer data at most once per frame, i.e. once per 1 ms - depending on your HID descriptor, it can be much slower. With 64-byte packets this yields a net data rate of about 64000 bytes/s.
Once you need to transfer more data, use bulk or isochronous endpoints.
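For completeness, the polling rate is fixed by the interrupt endpoint descriptor. A sketch of the standard USB 2.0 field layout with illustrative values:

    // Interrupt IN endpoint descriptor (standard USB 2.0 layout).
    // With bInterval = 1 on a full-speed device the host polls every 1 ms
    // frame, so a 64-byte endpoint moves at most 64 * 1000 = 64000 bytes/s.
    static const unsigned char ep1_in_descriptor[7] = {
        0x07,        // bLength
        0x05,        // bDescriptorType: ENDPOINT
        0x81,        // bEndpointAddress: EP1, IN
        0x03,        // bmAttributes: interrupt
        0x40, 0x00,  // wMaxPacketSize: 64 bytes (little-endian)
        0x01         // bInterval: 1 frame
    };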

AudioToolbox - Callback delay while recording

I've been working on a very specific iOS project lately, and my research has led me to almost-final code. I've solved all the major difficulties so far, but on this one I don't have a clue (about the cause or a possible fix).
I set up my audio queue (sample rate 44100, LinearPCM format, 16 bits per channel, 2 bytes per frame, 1 channel per frame...) and start recording sound with 12 audio buffers. However, there seems to be a delay after every 4 callbacks.
The situation is the following: the first 4 callbacks are called about 2 ms apart. However, between the 4th and the 5th there is a delay of about 60 ms. The same thing happens between the 8th and the 9th, the 12th and the 13th, and so on.
There seems to be a relation between the bytes per frame and the moment of the delay. I know this because if I change to 4 bytes per frame, the delay starts occurring between the 8th and 9th callbacks, then between the 16th and 17th, the 24th and 25th... Nonetheless, there doesn't seem to be any relation between the moment of the delay and the number of buffers.
The callback function does only two things: store the audio data (inBuffer->mAudioData) in an array my class can use, and call AudioQueueEnqueueBuffer to put the current buffer back on the queue.
Has anyone run into this problem before? Does anyone know, at least, what could be causing it?
Thank you in advance.
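For reference, the callback described above might look like the sketch below (AudioToolbox's C API; recorder_t and store_samples are made-up stand-ins for the asker's own bookkeeping):

    #include <AudioToolbox/AudioToolbox.h>
    #include <string.h>

    typedef struct {
        SInt16 *samples;      // destination array owned by the class
        size_t  count;        // samples stored so far
    } recorder_t;

    static void store_samples(recorder_t *r, const void *data, UInt32 len)
    {
        memcpy(r->samples + r->count, data, len);
        r->count += len / sizeof(SInt16);
    }

    // Do the minimum in the callback: copy the new samples out, then hand
    // the buffer straight back to the queue so recording never stalls.
    static void input_callback(void *user, AudioQueueRef queue,
                               AudioQueueBufferRef buf,
                               const AudioTimeStamp *ts, UInt32 nPackets,
                               const AudioStreamPacketDescription *descs)
    {
        recorder_t *rec = (recorder_t *)user;
        store_samples(rec, buf->mAudioData, buf->mAudioDataByteSize);
        AudioQueueEnqueueBuffer(queue, buf, 0, NULL);  // re-enqueue for reuse
    }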
The Audio Queue API appears to run on top of the RemoteIO Audio Unit, whose real audio buffer size is probably unrelated to, and in your example larger than, whatever size your Audio Queue buffers are. So whenever a RemoteIO buffer is ready, a bunch of your smaller AQ buffers get filled from it in quick succession, and then you wait a longer time for the next larger buffer to fill with samples.
If you want better-controlled (more evenly spaced) buffer latency, try using the RemoteIO Audio Unit directly.