Basic question regarding ROM based executable - operating-system

I have a basic doubt regarding executables stored in ROM.
As far as I know, the parts of an executable with text and RO (read-only) attributes are stored in ROM. Since ROM stands for Read Only Memory, what happens if the code needs to write into memory?
I am not able to conjure up any example to cite here (probably I am ignorant of such situation or I am missing out basic stuff ;) but any light on this topic can greatly help me to understand! :)
Last off -
1. Is there any such situation?
2. In such a case, is copying the code from ROM to RAM the answer?
An answer with an example would greatly help.
Many thanks in advance!
/MS

Read-only memory is read only because of hardware restrictions. The program might be in an EEPROM, flash memory protected from writes, a CD-ROM, or anything where the hardware physically disallows writing. If software writes to ROM, the hardware is incapable of changing the stored data, so the contents stay the same: typically the write is silently ignored, although some systems raise a bus fault instead.
So if a software program in ROM wants to write to memory, it writes to RAM. That's the only option. If a program is running from ROM and wants to change itself, it can't because it can't write to ROM. But yes, the program can run from RAM.
In fact, running from ROM is rare except in the smallest embedded systems. Operating systems copy executable code from ROM to RAM before running it. Sometimes code is compressed in ROM and must be decompressed into RAM before running. If RAM is full, the operating system uses paging to manage it. Running from ROM is so rare because ROM is slower than RAM and because code sometimes needs to be changed by the loader before running.
Note that if you have code that modifies itself, you really have to know your system. Many systems use data-execution prevention (DEP). Executable code goes in read+execute areas of RAM. Data goes in read+write areas. So on these systems, code cannot change itself in RAM unless it first asks the operating system to change the protection of the pages it lives in.

Normally only program code, constants and initialisation data are stored in ROM. A separate memory area in RAM is used for stack, heap, etc.

There are few legitimate reasons why you would want to modify the code section at runtime. The compiler itself will not generate code that requires that.
Your linker will have an option to generate a MAP file. This will tell you where all memory objects are located.
The linker chooses where to locate each object based on a linker script (which you can customise to organise memory as you require). Typically on a FLASH based microcontroller, code and constant data will be placed in ROM. Also placed in ROM are the initial values for non-zero initialised static data; these are copied to RAM before main() is called. Zero initialised static data is simply cleared to zero before main().
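As a rough illustration of that start-up step, here is a minimal sketch in C that can be run on a PC. The three arrays are hypothetical stand-ins for the regions a linker script would define; on a real target they would be linker-provided symbols, and the names used here (rom_init_image, ram_data, ram_bss, startup_init) are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-ins for linker-defined regions (assumption: a real
 * toolchain exposes these as symbols from the linker script, not arrays).
 */
static const uint8_t rom_init_image[4] = { 0x11, 0x22, 0x33, 0x44 }; /* initial values, kept in ROM */
static uint8_t ram_data[4];  /* non-zero initialised statics: the run-time copy in RAM */
static uint8_t ram_bss[8];   /* zero initialised statics */

/* What start-up code typically does before it calls main(). */
void startup_init(void)
{
    /* Copy the initial values of non-zero initialised data from ROM to RAM. */
    memcpy(ram_data, rom_init_image, sizeof ram_data);
    /* Clear the zero initialised data. */
    memset(ram_bss, 0, sizeof ram_bss);
}
```

After startup_init() returns, the variables in RAM hold their initial values and can be modified freely, while the image in ROM stays untouched.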
It is possible to arrange for the linker to locate some or all of the code in ROM and have the run-time start-up code copy it to RAM in the same way as the non-zero static data, but the code must either be relocatable or be linked to run at its RAM address in the first instance. You cannot usually just copy code intended to run from ROM to RAM and expect it to run, since it may have absolute address references in it (unless perhaps your target has an MMU and can remap the address space). Locating code in RAM on microcontrollers is normally done to increase execution speed, since RAM is typically faster than FLASH when high clock speeds are used, producing fewer or zero wait states. It may also be used when code is loaded at runtime from a filesystem rather than stored in ROM. Even when loaded into RAM, if the processor has an MMU it is likely that the code section in RAM will be marked read-only.

Harvard architecture microcontrollers
Many small microcontrollers (Microchip PIC, Atmel AVR, Intel 8051, Cypress PSoC, etc.) have a Harvard architecture.
They can only execute code from the program memory (flash or ROM).
It's possible to copy any byte from program memory to RAM.
However, (2) copying executable instructions from ROM to RAM is not the answer -- with these small microcontrollers, the program counter always refers to some address in the program memory. It's not possible to execute code in RAM.
Copying data from ROM to RAM is pretty common.
When power is first applied, a typical firmware application zeros all the RAM and then copies the initial values of non-const global and static variables out of ROM into RAM just before main() starts.
Whenever the application needs to push a fixed string out the serial port, it reads that string out of ROM.
With early versions of these microcontrollers, an external "device programmer" connected to the microcontroller was the only way to change the program.
In normal operation, the device was nowhere near a "device programmer".
If the software running on the microcontroller needed to write to program memory ROM -- sorry, too bad -- it was impossible.
Many embedded systems had non-volatile EEPROM that the code could write to -- but this was only for storing data values. The microcontroller could only execute code in the program ROM, not the EEPROM or RAM.
People did many wonderful things with these microcontrollers, including BASIC interpreters and bytecode Forth interpreters.
So apparently (1) code never needs to write to program memory.
With a few recent "self-programming" microcontrollers (from Atmel, Microchip, Cypress, etc.),
there's special hardware on the chip that allows software running on the microcontroller to erase and re-program blocks of its own program memory flash.
A few applications use this "self-programming" feature to read and write data to "extra" flash blocks -- data that is never executed, so it doesn't count as self-modifying code -- but this isn't doing anything you couldn't do with a bigger EEPROM.
So far I have only seen two kinds of software running on Harvard-architecture microcontrollers that write new executable software to their own program Flash: bootloaders and Forth compilers.
When the Arduino bootloader (bootstrap loader) runs and detects that a new application firmware image is available, it downloads the new application firmware (into RAM), and writes it to Flash.
The next time you turn on the system it's now running shiny new version 16.98 application firmware rather than clunky old version 16.97 application firmware.
(The Flash blocks containing the bootloader itself, of course, are left unchanged).
This would be impossible without the "self-programming" feature of writing to program memory.
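To make the bootloader case more concrete, here is a small simulation in C that runs on a PC. It is only a sketch under stated assumptions: the flash is a plain RAM array, the page size, page count and bootloader location are invented, and flash_write_page() and install_image() are hypothetical names, not the API of any real chip or of the Arduino bootloader. It models two properties of self-programming: a page is erased to all ones before it is programmed, and the pages holding the bootloader are never touched.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE        64u  /* hypothetical page size in bytes */
#define PAGE_COUNT       8u   /* hypothetical flash size: 8 pages */
#define BOOT_FIRST_PAGE  6u   /* assumed layout: pages 6..7 hold the bootloader */

/* RAM array standing in for the program flash. */
static uint8_t flash[PAGE_COUNT * PAGE_SIZE];

/*
 * Erase and then program one page from a PAGE_SIZE-byte buffer.
 * Returns 0 on success, or -1 if the page belongs to the bootloader
 * (or lies beyond it, which also covers out-of-range page numbers).
 */
int flash_write_page(unsigned page, const uint8_t *src)
{
    if (page >= BOOT_FIRST_PAGE)
        return -1;
    uint8_t *dst = &flash[page * PAGE_SIZE];
    memset(dst, 0xFF, PAGE_SIZE);       /* erased flash reads as all ones */
    for (size_t i = 0; i < PAGE_SIZE; i++)
        dst[i] &= src[i];               /* programming can only clear bits */
    return 0;
}

/*
 * Write a firmware image that was received into RAM, page by page.
 * Returns the number of pages written, or -1 if the image does not fit
 * below the bootloader.
 */
int install_image(const uint8_t *image, size_t len)
{
    if (len > BOOT_FIRST_PAGE * PAGE_SIZE)
        return -1;
    int pages = 0;
    for (size_t off = 0; off < len; off += PAGE_SIZE) {
        uint8_t buf[PAGE_SIZE];
        size_t n = (len - off < PAGE_SIZE) ? len - off : PAGE_SIZE;
        memset(buf, 0xFF, sizeof buf);  /* pad the last page with erased bytes */
        memcpy(buf, image + off, n);
        if (flash_write_page((unsigned)(off / PAGE_SIZE), buf) != 0)
            return -1;
        pages++;
    }
    return pages;
}
```

On real hardware the erase and program steps go through chip-specific registers or instructions, so only the overall shape carries over.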
Some Forth implementations run on a small microcontroller, compiling new executable code and using the "self-programming" feature to store it in program Flash -- a process somewhat analogous to the JVM's "just-in-time" compiling.
(All other languages seem to require a compiler far too large and complicated to run on a small microcontroller, and therefore have an edit-compile-download-run cycle that takes much more wall clock time).

Related

Loading OS image from floppy disk without INT 13 of BIOS Service

How can I load an OS image from a floppy disk into memory without the BIOS services while booting my PC?
The only way I’ve used is calling int 13h in real mode.
I have learned that I need to deal with the ‘disk controller’ directly.
Do I need to write a kind of ‘device driver’ in [BITS 16] real mode, and is that possible?
As 0andriy has commented, you will have to communicate with the floppy controller directly, bypassing the BIOS. (Which BTW, why do you want to do such a thing? The BIOS was made specifically so you don't have to do this. Is it solely because you want to, maybe to learn how to program the FDC? I'm okay with that.)
The FDC (Floppy Disk Controller) is of the ISA (Industry Standard Architecture) era, back when I/O ports were hard coded to specific addresses. The FDC came in many variants, but most followed a standard rule. The original NEC 765 was a common FDC, with later (still really old by today's standards) controllers following the 82077AA variant.
These controllers had twelve (12) registers using eight (8) I/O Byte addresses, Base + 00h to Base + 07h. (Please note that a single I/O address can be two registers if one is a read and one is a write.) You read and write to these registers to instruct the FDC to do things, such as start the motor for drive 1. (For fun: Did you know that the FDC was originally capable of handling four drives?)
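As an illustration of what "writing to a register to start a motor" looks like, here is a minimal sketch in C. It assumes the conventional primary controller base of 3F0h and the Digital Output Register (DOR) bit layout as I recall it from 82077AA-style controllers; check both against the datasheet before relying on them. Only three of the registers are listed, and the helper name fdc_dor_motor_on is made up for the example.

```c
#include <stdint.h>

#define FDC_BASE  0x3F0u           /* assumed: conventional primary FDC base address */
#define FDC_DOR   (FDC_BASE + 2u)  /* Digital Output Register (write) */
#define FDC_MSR   (FDC_BASE + 4u)  /* Main Status Register (read) */
#define FDC_FIFO  (FDC_BASE + 5u)  /* data FIFO (read/write) */

#define DOR_NOT_RESET  0x04u  /* bit 2: clearing it holds the controller in reset */
#define DOR_DMA_IRQ    0x08u  /* bit 3: enable the DMA and IRQ lines */

/*
 * Build the DOR value that selects `drive` (0..3) and turns its motor on.
 * Bits 0-1 select the drive, bits 4-7 are the four motor enables.
 */
uint8_t fdc_dor_motor_on(unsigned drive)
{
    drive &= 3u;
    return (uint8_t)((0x10u << drive) | DOR_DMA_IRQ | DOR_NOT_RESET | drive);
}
```

On real hardware the returned value would be written to FDC_DOR with a port output instruction (an outb-style helper). That part is left out here because port I/O is privileged and the helper differs between compilers.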
This isn't too difficult to do, but now you have to have some way for the ISA bus to communicate with the FDC and the main memory. In comes the DMA (Direct Memory Access). Now you have to also program the DMA to make the transfers.
Here is the catch. If you don't have all of the FDC and DMA code within the first 512 bytes of the floppy, the 512 bytes the BIOS loaded for you already, there is no way to load the remaining sectors. For example, you can't have your DMA code in the second sector of your boot code expecting to call it, since you have to use that DMA to load that sector in the first place. All FDC and DMA code, at least a minimum read service, must be in the first sector of the disk. This is quite difficult to do, reliably.
I am not saying it is impossible to do, I am just saying it is improbable. For one thing, if you can do it (reliably) in 512 bytes, I would like to see it. It might be a fun experiment. Anyway, do a search for FDC, DMA, etc., things I wrote of here. There are many examples on the web. If you wish to read a book about it, I wrote such a book a while back with all the juicy details.

Theoretical embedded linux requirements

I come from a programming background using Java, C#, C++ and JavaScript.
I got myself a Raspberry Pi (Model 1 A, the one without ethernet) and played around with it for a while. I used Raspbian and Arch Linux ARM (since it was said to be small and customizable). Unfortunately I didn't manage to configure them the way I want.
I am trying to build a nice looking (embedded) system whose only goal is to boot the Raspberry Pi fast and autostart a test application which will be written in C# (Mono), C++ (Qt), Java (Java Runtime) or something in JavaScript/HTML.
Since I was not able to get rid of all the log messages (I got rid of most), the tty login screen, or the attempts to connect to the network (although the Model 1 A does not have ethernet at all), booting was ugly and took long (more than a minute in some cases).
It seems I will have to build a minimal embedded Linux, but I lack the theory of the elements of an embedded Linux and how they fit together.
My question: What are the theoretically required parts of an embedded linux holding either mono, qt, java runtime on a raspberry pi?
So far I know the following parts:
the hardware (raspberry pi model 1 A) + sd card
the sd card holds 2 partitions, 1 boot partition (fat32), 1 data partition (ext4)
a boot loader
a linux kernel (which can be optimized to the needs of a raspi)
But what then? My research got stuck at "use a distro", which I don't want. What are the missing pieces between the kernel and starting an application?
An Embedded Linux system is comprised of many different parts that work together towards the same goal of making things work efficiently.
Ideally, that is not much different from a regular GNU/Linux system, but let's see in detail the building blocks of a generic embedded system.
For the following explanation, I am assuming as architecture ARM. What is written below may differ slightly from implementation to implementation, but is usually a common track for commercial embedded systems.
Blocks of a GNU/Linux Embedded System
Hardware
SoC
The SoC is where all the processing takes place. It is the main processing unit of the whole system and the only place that has "intelligence". It is in charge of using the other hardware and running your software.
It is made of various and heterogeneous sub-blocks:
Core + Caches + MMU - the "real" processor, e.g. ARM Cortex-A9. It's the main thing you will notice when choosing a SoC.
May be assisted by e.g. a SIMD coprocessor like NEON.
Internal RAM - generally very small. Used in the first phase of the boot sequence.
Various "Peripherals" - connected via some interconnect fabric/bus to the Core. These can range from a simple ADC to a 3D Graphics Accelerator. Examples of such IP cores are: USB, PCI-E, SGX, etc.
A low power/real time coprocessor - some systems offer one or more coprocessors intended either to help the main Core with real time tasks (e.g. industrial communication buses) or to handle low power states. Their architecture may or may not be related to the Core's.
External RAM
It is used by the SoC to store temporary data after the system has bootstrapped and during the bootstrap itself. It's usually the memory your embedded system uses during regular operation.
Non-Volatile Memory - optional
May or may not be present. In your case it's the SD card you mentioned. In other cases could be a NAND, NOR or SPI Dataflash memory (or any combination of them).
When present, it is often the regular source of data the SoC will read from and usually stores all the SW components needed for the system to work.
It may not be necessary or useful in some kinds of applications.
External Peripherals
Anything not strictly related to the above.
Could be a MAC ID EEPROM, some relays, a webcam or whatever you can possibly imagine.
Software
First of all, we introduce what is called the bootchain, which is what happens as soon as you power up your SoC and - somehow - tell it to start running. In the following list, the bootchain is the sequence of calls from point 1 to point 4.
Apart from specific/exotic implementations, it is more or less always the same:
1. Boot ROM code - a small memory contained in the SoC, usually mask ROM (i.e. programmed at the factory). The first thing the SoC will do when powered up is to execute the code in it.
This code will - generally according to external configuration pins - decide the so-called "boot strategy" or "boot order", which is where (and in what order) to look for additional code to be executed. Suitable media vary widely: USB storage devices, USB hosts, SD cards, NANDs, NORs, SPI dataflashes, Ethernet, UARTs, etc.
If none of the above contains something valid, the Boot ROM will usually issue a soft reset of the SoC, and the cycle repeats.
The code in the medium is not, of course, executed in place: it gets copied into the Internal RAM and then executed.
[The following two are contained in what we will call the bootloader medium]
2. 1st stage bootloader - it has just been copied by the Boot ROM into the SoC's Internal RAM. It must be tiny enough to fit that memory (usually well under 100kB). It is needed because the Boot ROM isn't big enough and does not know what kind of External RAM the SoC is attached to. Its main function is to initialize the External RAM and the SoC's external memory interface, as well as other peripherals that may be of interest (e.g. disabling watchdog timers). Once done, it copies the next stage to the External RAM and executes it. Depending on the context, it could be called MLO, SPL or something else.
3. 2nd stage bootloader - the "main" bootloader. Bigger (possibly 10 times) than the 1st stage one, it completes the initialization of the relevant peripherals (e.g. ethernet, additional storage media, LCD displays). It allows much more complicated logic for what to do next and offers - depending on the level of sophistication - high level facilities (filesystem/volume handling, data copy-move-interpretation, LCD output, interactive console, failsafe policies). Most of the time it loads a Linux kernel (and related files) into memory from some medium and passes relevant information to it (e.g. if the DTB is not embedded in the kernel image, newer kernels expect its physical address in the r2 register - the Kernel then reads the register and retrieves the DTB).
4. Linux Kernel - the core of the operating system. Depending on the hardware platform it may or may not be a mainline ("official") version. It is usually complemented by built-in or loadable modules (the latter from an external source - free or not). It initializes all the hardware needed for the complete system to work according to hardcoded configuration and the DT - it enables the MMU, orchestrates the whole system and accesses the hardware exclusively. According to the boot arguments (cmdline - usually passed by the previous stage) and/or to compiled options, the Kernel tries to mount a root file system. From the rootfs, it will try to load an init (namely, /sbin/init - where / is the just mounted rootfs).
5. Init and rootfs - init is the first non-Kernel task to be run, and has PID 1. It initializes literally everything you need to use your system. In production embedded systems, it also starts the main application. In such systems init is either BusyBox or a custom-crafted application.
More on rootfs and distros
The rootfs contains everything in your GNU/Linux system that is not the Kernel (apart from /lib/modules and other bits).
It contains all the applications that manage peripherals like Ethernet, WiFi, or external UMTS modems.
Contains the interactive part of the system, contains the user interface, and everything else you see when you boot a GNU/Linux system - embedded or not.
A "distro" is just a particular collection of userspace (non-Kernel) programs and libraries (usually) verified to work well with one another, put together by a particular group of people.
Desktop distros usually also ship with a custom-tailored kernel and a bootloader. Examples are Fedora, Ubuntu, Debian, etc.
In the general sense of the term, nothing stops you from creating your own distro, which is what happens every time a custom embedded system goes into production: through tools like Yocto or Buildroot (or by hand), you are able to decide the very particular collection (hence distro, distribution) of software fit for the purpose of the system.
To sum up and answer your question exactly, the missing part you are looking for is init and the process of mounting the rootfs: the Kernel mounts (i.e. makes available to itself), via its drivers and the passed/built-in parameters, a given volume/partition (the ext4 data partition you mention) at the "/" mount point.
In this volume/partition there is a /sbin/init executable, which the Kernel executes.
This is the "Big Bang" of our GNU/Linux userspace system: the place where everything visible starts. Depending on the configuration scripts (usually located under /etc/init.d), the "application" you mention is either run automatically by init or started by the user via a terminal/ssh/whatever that - again - init made available to you.
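To give an idea of what a custom-crafted init does at its core, here is a minimal sketch in C of the "start the application and wait for it" step. The function name run_and_wait is made up for the example, and this is not a complete init: a real one would also mount things like /proc, reap orphaned processes and never exit.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start one program and wait for it to finish, as a very small custom
 * init might do for the main application.
 * Returns the program's exit status (127 if it could not be executed),
 * or -1 if fork/wait failed or the program did not exit normally.
 */
int run_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: replace this process image with the application. */
        execv(path, argv);
        _exit(127);  /* only reached if execv failed */
    }
    /* Parent: block until the child terminates. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A hand-written init would typically call something like this in a loop, so that the application is restarted if it ever exits.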

Arduino flash memory limit

I have a project on Arduino Uno, and I am making it from Eclipse. The AVR compiler gives me this:
avrdude: 24348 bytes of flash written
avrdude: verifying flash memory against SunAngles.hex:
avrdude: load data flash data from input file SunAngles.hex:
avrdude: input file SunAngles.hex auto detected as Intel Hex
avrdude: input file SunAngles.hex contains 24348 bytes
avrdude: reading on-chip flash data:
Reading | ################################################## | 100% 3.45s
avrdude: verifying ...
avrdude: 24348 bytes of flash verified
avrdude done. Thank you.
The serial monitor does not print anything. If I make the project to be 23999 bytes then the serial monitor works. I have checked Eclipse's serial monitor and Arduino IDE's Serial monitor. They have the same problem. At the site it says that Arduino Uno has 32 KB flash memory and that 0.5 KB is used for the bootloader. What is happening?
In another question someone says to use Serial.print(F(something));, and they give a library for pgm. What should I do to solve this problem?
Don't forget the small size of RAM: the 328 has only 2 KB. You may just be running out of RAM. I learned that when it runs out, it just kind of sits there. At first it really looked like a flash boundary problem to me, just like your symptom.
I suggest reading the readme of the library that reports the free RAM. It mentions how "Serial.print" can consume both RAM and ROM.
I always now use
Serial.print(F("HELLO"));
versus
Serial.print("HELLO");
as it saves RAM, and the same should be true for lcd.print. I also always put a
Serial.println(freeMemory(), DEC); // Print how much RAM is available.
at the beginning of the code, and pay attention to the result. Note that there needs to be enough room left for the actual code to run and recurse.
The F() is now stock in Arduino 1.0 and replaces the need for the library function getPSTR().
The latest Arduino IDE also indicates a very rough estimate of expected RAM usage. So there is a switch for that in avr-gcc. You may also want to try using the avr-gcc 4.7.0 rather than 4.3.2 (stock for Arduino), as it claims to be more optimizing.
In case anyone still has similar issues: please read the blog post Optimizing SRAM on managing the limited Arduino memory.
From there, you will get a few things to keep in mind as you develop your sketch.
Avoid global variables as much as you possibly can. Keep them local to their functions.

Kernel Code vs User Code

Here's a passage from the book
When executing kernel code, the system is in kernel-space executing in kernel mode. When running a regular process, the system is in user-space executing in user mode.
Now what really is kernel code and what is user code? Can someone explain with an example?
Say I have an application that does printf("HelloWorld"). While executing this application, will it be user code or kernel code?
I guess that at some point user code will switch into kernel mode and kernel code will take over, but I guess that's not always the case, since I came across this:
For example, the open() library function does little except call the open() system call.
Still other C library functions, such as strcpy(), should (one hopes) make no direct use
of the kernel at all.
If it does not make use of the kernel, then how does it make everything work?
Can someone please explain the whole thing in a lucid way.
There isn't much difference between kernel and user code as such, code is code. It's just that the code that executes in kernel mode (kernel code) can (and does) contain instructions only executable in kernel mode. In user mode such instructions can't be executed (not allowed there for reliability and security reasons), they typically cause exceptions and lead to process termination as a result of that.
I/O, especially with external devices other than the RAM, is usually performed by the OS somehow and system calls are the entry points to get to the code that does the I/O. So, open() and printf() use system calls to exercise that code in the I/O device drivers somewhere in the kernel. The whole point of a general-purpose OS is to hide from you, the user or the programmer, the differences in the hardware, so you don't need to know or think about accessing this kind of network card or that kind of display or disk.
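To see where that boundary sits for output, here is a minimal sketch in C for a POSIX system. It skips printf() and its buffering and calls write() directly, which is the thin library wrapper around the system call that printf() eventually relies on for stdout. The function name print_message is made up for the example.

```c
#include <string.h>
#include <unistd.h>

/*
 * User code: hand a buffer to the kernel through the write() system call.
 * Everything up to the call runs in user mode; the copy to the terminal,
 * file or pipe behind stdout is done by kernel code.
 * Returns the number of bytes the kernel accepted, or -1 on error.
 */
ssize_t print_message(const char *msg)
{
    return write(STDOUT_FILENO, msg, strlen(msg));
}
```

Calling print_message("HelloWorld\n") computes the length in user mode with strlen() and enters the kernel only for the write itself.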
Memory accesses, OTOH, most of the time can just happen without the OS' intervention. And strcpy() works as is: read a byte of memory, write a byte of memory, oh, was it a zero byte, btw? repeat if it wasn't, stop if it was.
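That loop can be written out in a few lines of C. This is only a sketch of the idea under the name my_strcpy (made up here), not the implementation used by any particular C library, which is usually heavily optimised. Note that nothing in it calls into the kernel.

```c
#include <stddef.h>

/*
 * Copy the NUL-terminated string at src to dst, one byte at a time.
 * The caller must make sure dst is large enough. Returns dst.
 */
char *my_strcpy(char *dst, const char *src)
{
    size_t i = 0;
    do {
        dst[i] = src[i];          /* read a byte of memory, write a byte of memory */
    } while (src[i++] != '\0');   /* repeat unless the byte just copied was zero */
    return dst;
}
```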
I said "most of the time" because there's often page translation and virtual memory involved, and memory accesses may result in a switch into the kernel, so that the kernel can load something from the disk into memory and then let the instruction that caused the switch continue.

A trivial SYSENTER/SYSCALL question

If a Windows executable makes use of SYSENTER and is executed on a processor implementing AMD64 ISA, what happens? I am both new and newbie to this topic (OSes, hardware/software interaction) but from what I've read I have understood that SYSCALL is the AMD64 equivalent to Intel's SYSENTER. Hopefully this question makes sense.
If you try to use SYSENTER where it is not supported, you'll probably get an "invalid opcode" exception.
Note that this situation is unusual - generally, Windows executables do not directly contain instructions to enter kernel mode.
As far as I know, AMD64 processors use different types of modes to handle such issues.
SYSENTER works fine but is not that fast.
A very useful site to get started about the different modes:
Wikipedia
They got rid of a bunch of unused functionality when they developed the AMD64 extensions. One of the main changes is that segmentation through the cs, ds, es, and ss segment registers is effectively eliminated: in 64-bit mode their base addresses are treated as zero. Normally loading segment registers is an extremely expensive operation (the CPU has to do permission checks, which could involve multiple memory accesses). Entering kernel mode requires loading new segment register values.
The SYSENTER instruction accelerates this by having a set of "shadow registers" which it can copy directly to the (internal, hidden) segment descriptors without doing any permission checks. The vast majority of the benefit is lost with only a couple of segment registers, so most likely the reasoning for removing support for the instruction is that using regular instructions for the mode switch is faster.