Cortex-M3 heap-stack organization using keil - cortex-m3

Trying to run blinky sample for Atmel sam3s and inspecting the stack pointer...
SP has the value 0x20000238 at the start of main function which is equal too Ram base + RW + ZI for this sample.
The base RAM address for this chip is : 0x20000000
Total ram size is: 0x10000
I expected the sp to be initialized on 0x20010000 and coming down.
Can anyone explain if I am wrong or not?

As Pait said (he/she should have answered the question so I could accept it, I think),
I was wrong to think the stack will be placed at the end of RAM by default. But I think it is wise to make it be placed there.
This is how I do it for my SAM3S micro, by changing the scatter file
LR_IROM1 0x00400000 0x00080000 { ; load region size_region
ER_IROM1 0x00400000 0x00080000 { ; load address = execution address
*.o (RESET, +First)
*(InRoot$$Sections)
.ANY (+RO)
}
RW_IRAM1 0x20000000 0x00010000 { ; RW data
.ANY (+RW +ZI)
}
RW_STACK 0x2000C000 UNINIT 0x4000 { ; STACK data
*.o (STACK)
}
}
RAM base = 0x20000000
Total RAM = 64 kb
Intended stack size = 16 kb

Related

How to force my section in the very end of flash?

I have my linker script with such memory layout
MEMORY{
ram : ORIGIN = 0x0, LENGTH = 0x1000 /* 4KB of RAM starting at address 0x0 */
flash : ORIGIN = 0x20000000, LENGTH = 0xC000 /* 48KB of flash starting at address 0x20000000 */
}
and I want to add two section in the very end of my flash(let's say .section1 and .section2)
but I don't know the length of my section unless I compile it
In my expectation, my flash layout will look like this:
start address
section
0x20000000
.text
0x20000000+.text size
.data
0x20000000+.text size + .data size
.mdata
0x20000000+.text size + .data size + .mdata
(empty flash)
0x2000C000-.section2 size - section1 size
.section1
0x2000C000-.section2 size
.section2
How should I do in my linker script?
I don't know how to wrote it for now I just force my address like this. but its not what I want.
.section1 :{
*(.section1)
} >flash :AT (0x2000A000)
.section2 :{
*(.section2)
} >flash :AT (0x2000B000)

matlab : out of memory with convenable confi of pc

I block on my problem that I will wrote in details below. During 3 days I tried a lot of differents things, none worked..
If anyone have an idea of what to do !
Here is my message error :
the call to "ft_selectdata" took 0 seconds
preprocessing
Out of memory. Type "help memory" for your options.
Error in ft_preproc_dftfilter (line 187)
tmp = exp(2*1i*pi*freqs(:)*time); % complex sin and cos
Error in ft_preproc_dftfilter (line 144)
filt = ft_preproc_dftfilter(filt, Fs, Fl(i), 'dftreplace', dftreplace, 'dftbandwidth',
dftbandwidth(i), 'dftneighbourwidth', dftneighbourwidth(i)); % enumerate all options
Error in preproc (line 464)
dat = ft_preproc_dftfilter(dat, fsample, cfg.dftfreq, optarg{:});
Error in ft_preprocessing (line 375)
[dataout.trial{i}, dataout.label, dataout.time{i}, cfg] = preproc(data.trial{i}, data.label,
data.time{i}, cfg, begpadding, endpadding);
Error in EEG_Prosocial_script (line 101)
data_intpl = ft_preprocessing(cfg, allData_preprosses);
187 tmp = exp(2*1i*pi*freqs(:)*time); % complex sin and cos
There is some informations from matlab about my computer and about the caracteristics of the calcul
Maximum possible array: 7406 MB (7.766e+09 bytes)*
Memory available for all arrays: 7406 MB (7.766e+09 bytes) *
Memory used by MATLAB: 4195 MB (4.398e+09 bytes)
Physical Memory (RAM): 12206 MB (1.280e+10 bytes)
Limited by System Memory (physical + swap file) available.
K>> whos Name
Size Bytes Class Attributes
freqs 7753x1 62024 double
li 1x1 16 double complex
time 1x1984512 15876096 double
So there the config of the computer which failed to run the script (Alienware aurora R4) :
Ram : 4gb free / 12 # 1,6Ghz --> 2x (4Gb 1600Mhz) - 2x (2Gb 1600 MHz)
Intel core i7-3820 4 core 8 threads 3,7 GHz 1 CPU
NVIDIA GeForce GTX 690 2gb
RAM : Kingston KVT8FP HYC
Hard disk : SSD kingston 250Go SATA 3"
This code work on this computer (Dell inspiron 14-500) : config
Ram 4 Go of memory DDR4 2 666 MHz (4 Go x 1)
Intel® Core™ i5-8265U 8e generatio, (6 Mo memory, 3,9 GHz)
Intel® UHD Graphics 620
Hard disk SATA 2,5" 500 Go 5 400 tr/min
Thank you
Kind regards,
by doing freqs(:)*time you are trying to create a 7753*1984512 size array, and you dont have memory for that... (you will need ~123 gigabytes for that, or 1.2309e+11 bytes, where your computer has ~7e+09)
see for example the case for :
f=rand(7,1);
t=rand(1,19);
size(f(:)*t)
ans =
7 19
what you want do to is probably a for loop per time element etc.

getting illegal instructions when vectorized code writes to PCI

I am writing a program that writes to a device's range of HW registers. I am using mmap to map the HW addresses to virtual address (user space). I tested the result from the mmap and it is OK. I implemented a copy of a buffer into the device:
void bufferCopy(void *dest, void *src, const size_t size) {
uint8_t *pdest = static_cast<uint8_t *>(dest);
uint8_t *psrc = static_cast<uint8_t *>(src);
size_t iters = 0, tailBytes = 0;
/* iterate 64bit */
iters = (size / sizeof(uint64_t));
for (size_t index = 0; index < iters; ++index) {
*(reinterpret_cast<uint64_t *>(pdest)) =
*(reinterpret_cast<uint64_t *>(psrc));
pdest += sizeof(uint64_t);
psrc += sizeof(uint64_t);
}
.
.
.
but when running it on QEMU I get illegal instruction exception. When I debugged got it crashes on the next instruction (below is the asm of the main loop):
movdqu (%rsi,%rax,1),%xmm0
movups %xmm0,(%rdi,%rax,1) <----- this instruction crashes ...
add $0x10,%rax
cmp %rax,%r9
jne 0x7ffff7eca1e0 <_ZN12_GLOBAL__N_110bufferCopyEPvS0_m+64>
any ideas why ? my guess that you can write to PCI only 32/64 bit.
The compile doesn’t know my limitations, so it optimize my code and create vectorized loop (each iteration loads 128 bit and saves 128 bit). Is is making sense ?? can I write to PCI with vectorized instructions ?
Also, whether it is a missing feature in QEMU or a bug or just a recommendation, how can I prevent from the compiler to generate those vector instructions ?

Meaning of the benchmark variables in snakemake

I included a benchmark directive to some of the rules in my snakemake workflow, and the resulting files have the following header:
s h:m:s max_rss max_vms max_uss max_pss io_in io_out mean_load
The only documentation I've found mentions a "benchmark txt file (which will contain a tab-separated table of run times and memory usage in MiB)".
I can guess that columns 1 and 2 are two different ways of displaying the time taken to execute the rule (in seconds, and converted to hours, minutes and seconds).
io_in and io_out likely related to disk read and write activity, but in what units are they measured?
What are the others? Is this documented somewhere?
Edit: Looking at the source code
I've found the following piece of code in /snakemake/benchmark.py, that might well be where the benchmark data come from:
def _update_record(self):
"""Perform the actual measurement"""
# Memory measurements
rss, vms, uss, pss = 0, 0, 0, 0
# I/O measurements
io_in, io_out = 0, 0
# CPU seconds
cpu_seconds = 0
# Iterate over process and all children
try:
main = psutil.Process(self.pid)
this_time = time.time()
for proc in chain((main,), main.children(recursive=True)):
meminfo = proc.memory_full_info()
rss += meminfo.rss
vms += meminfo.vms
uss += meminfo.uss
pss += meminfo.pss
ioinfo = proc.io_counters()
io_in += ioinfo.read_bytes
io_out += ioinfo.write_bytes
if self.bench_record.prev_time:
cpu_seconds += proc.cpu_percent() / 100 * (
this_time - self.bench_record.prev_time)
self.bench_record.prev_time = this_time
if not self.bench_record.first_time:
self.bench_record.prev_time = this_time
rss /= 1024 * 1024
vms /= 1024 * 1024
uss /= 1024 * 1024
pss /= 1024 * 1024
io_in /= 1024 * 1024
io_out /= 1024 * 1024
except psutil.Error as e:
return
# Update benchmark record's RSS and VMS
self.bench_record.max_rss = max(self.bench_record.max_rss or 0, rss)
self.bench_record.max_vms = max(self.bench_record.max_vms or 0, vms)
self.bench_record.max_uss = max(self.bench_record.max_uss or 0, uss)
self.bench_record.max_pss = max(self.bench_record.max_pss or 0, pss)
self.bench_record.io_in = io_in
self.bench_record.io_out = io_out
self.bench_record.cpu_seconds += cpu_seconds
So this seems to come from functionalities provided by psutil.
I will just leave this here for future reference.
Reading through
snakemake >= 6.0.0 benchmark module
psutil's memory_info(), memory_full_info(), io_counters(), cpu_times()
as previously suggested:
colname
type (unit)
description
s
float (seconds)
Running time in seconds
h:m:s
string (-)
Running time in hour, minutes, seconds format
max_rss
float (MB)
Maximum "Resident Set Size”, this is the non-swapped physical memory a process has used.
max_vms
float (MB)
Maximum “Virtual Memory Size”, this is the total amount of virtual memory used by the process
max_uss
float (MB)
“Unique Set Size”, this is the memory which is unique to a process and which would be freed if the process was terminated right now.
max_pss
float (MB)
“Proportional Set Size”, is the amount of memory shared with other processes, accounted in a way that the amount is divided evenly between the processes that share it (Linux only)
io_in
float (MB)
the number of MB read (cumulative).
io_out
float (MB)
the number of MB written (cumulative).
mean_load
float (-)
CPU usage over time, divided by the total running time (first row)
cpu_time
float(-)
CPU time summed for user and system
Benchmarking in snakemake could certainly be better documented, but psutil is documanted here:
get_memory_info()
Return a tuple representing RSS (Resident Set Size) and VMS (Virtual Memory Size) in bytes.
On UNIX RSS and VMS are the same values shown by ps.
On Windows RSS and VMS refer to "Mem Usage" and "VM Size" columns of taskmgr.exe.
psutil.disk_io_counters(perdisk=False)
Return system disk I/O statistics as a namedtuple including the following attributes:
read_count: number of reads
write_count: number of writes
read_bytes: number of bytes read
write_bytes: number of bytes written
read_time: time spent reading from disk (in milliseconds)
write_time: time spent writing to disk (in milliseconds)
The code you found confirms that all the memory usage and IO counts are reported in MB (= bytes * 1024 * 1024).

conditional statements for linker command language LD

Are there conditional statements for the GNU LD linker command language?
Context: I am developing firmware for an arm cortex m0+, which consists of a bootloader and an application. Both are compiled and flashed to target in separate projects, but I use a framework with symbolic links to the drivers, makefile and loader scripts so that I can reuse those for every app I make without copying these files for each app.
Currently I have two loader files, for bootloader and application (makefile automatically specifies the appropriate one), with memory assigment as follows:
bootloader
MEMORY {
flash (rx) : ORIGIN = 0x00000000, LENGTH = 16K
ram (rwx) : ORIGIN = 0x1FFFF000, LENGTH = 16K
}
app
MEMORY {
flash (rx) : ORIGIN = 0x00004000, LENGTH = 112K
ram (rwx) : ORIGIN = 0x1FFFF000, LENGTH = 16K
}
Like the makefile, I want to merge these to something like this (using C expressions to clarify)
MEMORY {
#ifdef(bootloaderSymbol)
flash (rx) : ORIGIN = 0x00000000, LENGTH = 16K
#else
flash (rx) : ORIGIN = 0x00004000, LENGTH = 112K
#endif
ram (rwx) : ORIGIN = 0x1FFFF000, LENGTH = 16K
}
Although its not its primary purpose, you can always run the C preprocessor
(cpp) on your linker scripts:
#if defined(MACHINE1)
# define TARGET_ADDRESS 0x80000000
# define SDRAM_START xxx
# define SDRAM_SIZE yyy
# define ROMFLAGS rx
#elif defined(MACHINE2)
# define TARGET_ADDRESS 0x40000000
# define SDRAM_START zzz
# define SDRAM_SIZE aaa
# define ROMFLAGS rwx
#else
# error unknown machine
#endif
MEMORY
{
rom (ROMFLAGS) : ORIGIN = TARGET_ADDRESS, LENGTH = 0x00100000
ram (WX) : ORIGIN = SDRAM_START + SDRAM_SIZE - 0x00200000, LENGTH = 0x00100000
driver_ram (WX) : ORIGIN = SDRAM_START + SDRAM_SIZE - 0x00100000, LENGTH = 0x00100000
}
...
You just need to make sure your macros don't collide with linker script syntax. Then save your linker script as xxx.lk.in (instead of xxx.lk) and add a recipe to your Makefile:
xxx.lk: xxx.lk.in
$(CPP) -P $(INCLUDE) -D$(MACHINE) $< $#
All that's left to do is to add xxx.lk as dependency to your final executables build recipe. I'm using similar processes on many of my projects successfully.
I think you can try "DEFINED(symbol)" according to https://sourceware.org/binutils/docs/ld/Builtin-Functions.html
Also, please don't forget to pass "--defsym=bootloaderSymbol=1" to ld.
MEMORY {
flash (rx) : ORIGIN = DEFINED(bootloaderSymbol) ? 0x00000000 : 0x00004000, LENGTH = DEFINED(bootloaderSymbol) ? 112K : 16K
ram (rwx) : ORIGIN = 0x1FFFF000, LENGTH = 16K
}
I've gone down this same path before, and later found out there's an ld command line argument to specify segment origin, which alleviates the need to figure it out in the linker script. From man page:
-Tbss=org
-Tdata=org
-Ttext=org
Same as --section-start, with ".bss", ".data" or ".text" as the section name.
So in your case, you would have -Ttext=0 (bootloader) or -Ttext=0x00004000 (app)