How is a variable assigned a memory address?

If I write an instruction x = 7, I understand x to be some address. What then assigns a memory address to x? Is this address a virtual address that is then translated into a physical memory address?

If I write an instruction x = 7, I understand x to be some address. What then assigns a memory address to x?
It depends on the type of var x.
If x is a global or static variable, several tools cooperate to give it an address:
The compiler records in the object file that it needs to store a global variable named x occupying 4 bytes.
The linker collects all the global variables from the object files, puts them in the data segment, and chooses a position for each. For instance, x will be at #data_segment+0x1000. The linker then replaces all references to x in the code with #data_segment+0x1000.
When the program is run, the loader first asks the operating system for memory to store the different segments, including the data segment. Only then are the value of #data_segment, and hence the actual address of x, known [1].
If x is a local variable, things are slightly simpler. All local variables live on the stack, and their addresses are computed relative to the stack (or frame) pointer by the compiler. So the address of x will be something like #stack_pointer+8, and it is generated by the compiler. But its actual value is only known at execution time and depends on the stack pointer.
If x is dynamically allocated (malloc-ed), its address is only known at run time. malloc() asks the OS for chunks of memory and dynamically positions variables within them; x will be put at a position that depends on the free space in the memory managed by malloc().
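To make this concrete, here is a minimal C sketch (the variable names are mine, purely for illustration) that prints one address of each kind:

#include <stdio.h>
#include <stdlib.h>

int g = 7;                       /* global: the linker places it in the data segment */

int main(void)
{
    int x = 7;                   /* local: lives at an offset from the stack pointer */
    int *p = malloc(sizeof *p);  /* dynamic: address chosen by malloc() at run time */

    printf("global g at %p\n", (void *)&g);
    printf("local  x at %p\n", (void *)&x);
    printf("heap  *p at %p\n", (void *)p);

    free(p);
    return 0;
}

Running it twice on a modern system shows the addresses changing between runs, which is the randomization described in note [1] below.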
Is this address a virtual address that is then translated into a physical memory address?
All addresses seen by the program are virtual addresses, which the MMU translates into physical memory addresses.
[1] Virtual addresses of program segments (including the data segment) used to be the same across different executions of a program, but this is no longer true: for security reasons, they are now randomized (address space layout randomization, ASLR).

There are generally four ways this is done.
1) The variable is mapped to a hardware register. In that case x has no address.
2) The variable has an absolute address. This is usually considered bad form because code using absolute addresses cannot be relocated, meaning it has to be placed at a fixed location in the address space. However, there are cases where a variable must be at a specific location, such as some interfaces to devices.
In this case the address of x may be specified by the compiler or by the linker.
3) The variable is defined as an offset from a stack-related register. This is the method used to implement local variables in most programming languages. If you have 4-byte integers and, say, a C declaration like
int x, y;
in a function with no other variables, there would be instructions at the top of the function that look something like:
SUBL2 #8, SP ; Allocate 8 bytes from the stack
MOVL SP, BP ; Set the Base Pointer Register to the start of the allocation
where SP is the stack pointer and BP is a base pointer register.
In that case, x could be located at offset BP + 0 and y at BP + 4.
Thus something like
x = y
would look like
MOVL Y(BP), X(BP)
or, written with the numeric offsets:
MOVL 4(BP), (BP)
(using the same source-then-destination operand order as the SUBL2 above).
The memory locations of x and y are entirely determined at run time; only the offsets from the base pointer register are fixed. In fact, there can be multiple instances of x and y active at the same time, with different addresses, if their containing function is called recursively or through an interrupt (the sketch after this answer illustrates this).
4) The variable's location is an offset from another register (usually the program counter).
Let's say you are using traditional uppercase FORTRAN, where all variables are static. It is common for the compiler to determine a location for a variable but to refer to it using an offset from the program counter register (or some other register). The variable remains in a fixed place at run time, but that place can vary from one load to the next. Using such an offset makes the code position-independent, meaning it can be loaded anywhere in memory. This allows the code to be used in shared libraries that can be used by multiple programs.
Usually the compiler picks some preliminary location for the variable, and the linker then fixes it up.
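To illustrate point 3, here is a small C sketch (my own, hypothetical code): because each local x lives at an offset from the frame pointer, each recursive activation gets its own x at a different address:

#include <stdio.h>

/* Each call allocates a fresh x in its own stack frame, so several
   instances of "the same" variable are live at once, at different
   addresses (while the offset from the frame pointer stays the same). */
static void f(int depth)
{
    int x = depth;
    printf("depth %d: x at %p\n", depth, (void *)&x);
    if (depth < 3)
        f(depth + 1);
}

int main(void)
{
    f(0);
    return 0;
}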

What is the benefit of having the registers as a part of memory in AVR microcontrollers?

Larger memories have higher decoding delay; why is the register file a part of the memory then?
Does it only mean that the registers are "mapped" SRAM registers that are stored inside the microprocessor?
If not, what would be the benefit of using registers, since they won't be any faster than accessing RAM? Furthermore, what would be the use of them at all? I mean, these are just part of the memory, so I don't see the point of having them anymore. Having them would be just as costly as referencing memory.
The picture is taken from The AVR Microcontroller and Embedded Systems: Using Assembly and C by Muhammad Ali Mazidi, Sarmad Naimi, and Sepehr Naimi.
AVR has some instructions with indirect addressing, for example LD (LDD) – Load Indirect From Data Space to Register using Z:
Loads one byte indirect with or without displacement from the data space to a register. [...]
The data location is pointed to by the Z (16-bit) Pointer Register in the Register File.
So you can move data out of a register by loading its data-space address into Z, allowing indirect or indexed register-to-register moves. Certainly one can think of usages where such indirect access would save the odd instruction.
what would be the benefit of using registers as they won't be any faster than accessing RAM?
Accessing the general-purpose registers is faster than accessing RAM.
First of all, let us define what "fast" means in a microcontroller: fast means how few cycles an instruction takes to execute. Look at the AVR architecture:
The general-purpose registers (GPRs) are inputs to the ALU, and the GPRs are selected by the instruction register (2 bytes wide), which holds the next instruction fetched from code memory.
Let us examine the simple instruction ADD Rd, Rr, where Rd and Rr are any two of the GPRs, so 0 <= r, d <= 31 and each of r and d can be represented in 5 bits. The "AVR Instruction Set Manual" (page 32) gives the opcode for this simple ADD instruction as 0000 11rd dddd rrrr. Because this opcode is two bytes (the code memory width), it is fetched, decoded, and executed in one cycle (under the concept of pipelining, of course). Only one cycle!
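As a sanity check on that encoding, here is a small C sketch (the helper name is mine, not from the manual) that packs the two 5-bit register numbers into the 16-bit opcode word:

#include <stdio.h>
#include <stdint.h>

/* Pack ADD Rd, Rr into the 16-bit AVR opcode 0000 11rd dddd rrrr. */
static uint16_t encode_add(unsigned d, unsigned r)
{
    return 0x0C00                /* 0000 11.. .... ....   */
         | ((r & 0x10) << 5)     /* r4     -> bit 9       */
         | ((d & 0x10) << 4)     /* d4     -> bit 8       */
         | ((d & 0x0F) << 4)     /* d3..d0 -> bits 7..4   */
         |  (r & 0x0F);          /* r3..r0 -> bits 3..0   */
}

int main(void)
{
    printf("ADD R1, R2   -> 0x%04X\n", encode_add(1, 2));    /* 0x0C12 */
    printf("ADD R31, R31 -> 0x%04X\n", encode_add(31, 31));  /* 0x0FFF */
    return 0;
}

Both operands fit in a single 2-byte instruction word, which is exactly why the fetch-decode-execute fits in one cycle.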
I mean these are just a part of the memory so I don't see the point of having them anymore. Having them would be just as costly as referencing memory
You suggest making all of RAM an input to the ALU; that is a very bad idea, because a memory address takes 2 bytes.
If you have 2 operands per instruction, as in the ADD instruction, you would need 4 bytes just for the operands, plus at least 1 more byte for the opcode of the operator itself: 5 bytes in total, which is a waste of code memory!
Furthermore, this architecture can only fetch 2 bytes at a time (the instruction register width), so you would spend extra cycles fetching each instruction from code memory: a waste of cycles, and a slower system.
Register numbers are only 4 or 5 bits wide, depending on the instruction, allowing 2 per instruction with room to spare in a 16-bit instruction word.
Conclusion: the GPRs are crucial for saving both code memory and program execution time.
Larger memories have higher decoding delay; why is the register file a part of the memory then?
When the CPU deals with the GPRs, it only accesses the first 32 locations, not the whole data space.
Final comment
Don't trouble yourself with the timing diagrams of different RAM technologies, because you have no control over them. Who does have control? The architecture designers: they set the limit on the maximum crystal frequency you can use with their architecture, and everything will be fine. You only need to be concerned with the cycles your application consumes.

Why do registers exist and how do they work together with the CPU?

So I am currently learning operating systems and programming.
I want to know how registers work in detail.
All I know is that there is main memory, and our CPU takes addresses and instructions from main memory with the help of the address bus.
And there is also something called the MCC (Memory Controller Chip, which helps in fetching memory locations from RAM).
On the internet, it says a register is temporary storage and that data in registers can be accessed faster than data in RAM.
But I want to really understand the deep-down process of how they work, as they also come in widths of 32 bits, 16 bits, something like that. I am really confused!
I'm not a native English speaker, so pardon me for any incorrect terminology. I hope this will be a little bit helpful.
Why does registers exists
When a user program is running on the CPU, it works in a 'dynamic' sense. That is, we must store incoming source data and any intermediate data, and do specific calculations on them. Memory devices are needed. We have a choice among flip-flops, on-chip RAM/ROM, and off-chip RAM/ROM.
The term register in the programmer's model is actually a D flip-flop in the physical circuit, a memory device that can hold a single bit. An IC design consists of a standard-cell part (including the registers mentioned before, plus and/or/etc. gates) and hard macros (like SRAM). As the technology node advances, the standard cells' delays get smaller and smaller. The auto place-and-route tool places a register and its surrounding logic close together, to make sure the logic can run at the specified 3.0/4.0 GHz speed target. For some practical reasons (which I'm not quite sure of, because I don't do layout), hard macros tend to be placed further away, leading to much longer metal wires. This, plus SRAM's own characteristics, means on-chip SRAM is normally slower than D flip-flops. If the memory device is off the chip, say an external Flash chip or a KGD (known good die), it will be slower still, since the signals must traverse two more I/O devices, which have much larger delays.
how they work together with cpu
Each register is assigned a different 'address' (which may not be visible to the programmer). That is implemented by adding address decode logic. For instance, when the CPU is about to execute an instruction mov R1, 0x12, the address decode logic sees the binary code of R1 and selects only those flip-flops corresponding to R1. Then the data 0x12 is stored (written) into those flip-flops. The same goes for the read process.
Regarding "they are also of 32 bits and 16 bits something like that": the bit width is not a problem. Both flip-flops and a word in RAM can have a bit width of N, as long as the same address can select N flip-flops or N bits in RAM at one time.
Registers are small memories that reside inside the processor (what you called the CPU). Their role is to hold the operands for fast processor calculations and to store the results. A register is usually designated by a name (AL, BX, ECX, RDX, cr3, RIP, R0, R8, R15, etc.) and has a size, which is the number of bits it can store (4, 8, 16, 32, 64, 128 bits). Some registers have special meanings, and their bits control the state of the processor or provide information about it.
There are not many registers (because they are very expensive). Taken together, they hold only a few kilobytes at most, so they can't store all the code and data of your program, which can run to gigabytes. That is the role of the central memory (what you call RAM). This big memory can hold gigabytes of data, and each byte has its own address. However, it only holds data while the computer is turned on. The RAM resides outside the CPU chip and interacts with it via the Memory Controller Chip, which acts as the interface between the CPU and RAM.
On top of that, there is the hard drive that stores your data when you turn off your computer.
That is a very simple view to get you started.

When to use device and when to use constant address space qualifier in metal shading language?

I know that device address space is used when indexing a buffer and constant address space is used when many invocations of the function will access the same portion of the buffer. But I am still not very clear. Thank you!
Based on the Metal Shading Language Specification:
device Address Space
The device address space name refers to buffer memory objects allocated from the device memory pool that are both readable and writeable. A buffer memory object can be declared as a pointer or reference to a scalar, vector or user-defined structure. In an app, Metal API calls allocate the memory for the buffer object, which determines the actual size of the buffer memory. Some examples are:
// An array of a float vector with four components.
device float4 *color;

struct Foo {
    float a[3];
    int b[2];
};

// An array of Foo elements.
device Foo *my_info;
Since you always allocate texture objects from the device address space, you do not need the device address attribute for texture types.
constant Address Space
The constant address space name refers to buffer memory objects allocated from the device memory pool that are read-only. Variables in program scope must be declared in the constant address space and initialized during the declaration statement. The initializer(s) expression must be a core constant expression. Variables in program scope have the same lifetime as the program, and their values persist between calls to any of the compute or graphics functions in the program.
constant float samples[] = { 1.0f, 2.0f, 3.0f, 4.0f };
Pointers or references to the constant address space are allowed as arguments to functions. Writing to variables declared in the constant address space is a compile-time error. Declaring such a variable without initialization is also a compile-time error.

To decide which address space (device or constant) a read-only buffer passed to a graphics or kernel function uses, look at how the buffer is accessed inside the function. The constant address space is optimized for multiple instances executing a graphics or kernel function accessing the same location in the buffer. Some examples of this access pattern are accessing light or material properties for lighting/shading, a matrix from a matrix array used for skinning, or filter weights from a filter weight array for convolution. If multiple executing instances of a graphics or kernel function access the buffer using an index such as the vertex ID, fragment coordinate, or thread position in grid, the buffer must be allocated in the device address space.

How is the page number checked against the PT entries?

In the book Operating System Concepts by Silberschatz et al., and in other presentations of paging, each Page Table (PT) entry is shown to contain a Frame Number (FN). I have been trying to solve some problems on this topic, and the solutions also assume the same.
I am new to this topic, and all I've read is that the PT (or part of it) is usually stored in physical memory. This means that the PT must start at some particular physical address. This value is stored in the PTBR. So when the CPU generates a logical address, which is divided into a Page Number (PN) and an offset, the PN must be compared against its value in the PT to determine the corresponding FN, right? Suppose p=14. Now, 14, in its binary form, must be checked in the PT to find entry 14.
So, does this mean that some bits of each entry of the PT store the PN and the rest of them store the corresponding FN? Then what about the problems which calculate the PT size assuming that it's only a multiple of the number of bits of the FN and completely ignore the PN bits? Or is it that since the values in the PT are stored in increasing order of PN, some calculations are performed on the PTBR to get the particular PT entry? If so, once the calculation to access the particular PT entry has been done, how is the memory access switched from the base address in the PTBR to the 'n'th entry? What hardware is used for that switch?
Basically, what I'd like to know is, where is the index, p(page number), which checks for the corresponding FN stored? How exactly is this value checked against the PT entries to get the FN?
Physical page frames are laid out sequentially in physical memory.
A logical address consists of a logical page number and an offset into the page. For page translation, we can ignore the offset.
In the simplest case, the page table is an array of Page Table Entries. The Logical Page Number serves as an index into this table.
The Page Table Entry may or may not have a value for a Physical Page Frame.
This means that the PT must start at some particular physical address.
The address and size of the Page Table is usually specified by hardware registers.
So when the CPU generates a logical address, which is divided into a Page Number (PN) and an offset, the PN must be compared against its value in the PT to determine the corresponding FN, right?
You appear to be describing an INVERTED PAGE TABLE that is used on a few processors. In most systems the Page Number is an index into the table. There is no comparison.
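A minimal C sketch of the common (non-inverted) case, with made-up sizes (4 KiB pages, a 16-entry table); the page number is used purely as an array index, and nothing is compared:

#include <stdio.h>
#include <stdint.h>

#define OFFSET_BITS 12                  /* 4 KiB pages */
#define PAGE_SIZE   (1u << OFFSET_BITS)

typedef struct {
    unsigned valid : 1;
    unsigned frame : 20;                /* Physical Frame Number */
} pte_t;

static pte_t page_table[16];            /* what the PTBR would point at */

/* Translate a logical address by indexing the table with the page number. */
static uint32_t translate(uint32_t logical)
{
    uint32_t pn     = logical >> OFFSET_BITS;    /* page number = index */
    uint32_t offset = logical & (PAGE_SIZE - 1);
    pte_t entry = page_table[pn];                /* no search, no compare */
    if (!entry.valid) {
        fprintf(stderr, "page fault at 0x%08X\n", (unsigned)logical);
        return 0;
    }
    return ((uint32_t)entry.frame << OFFSET_BITS) | offset;
}

int main(void)
{
    page_table[14] = (pte_t){ .valid = 1, .frame = 3 };   /* p = 14 -> frame 3 */
    printf("0x%08X -> 0x%08X\n", (unsigned)0x0000E123u,
           (unsigned)translate(0x0000E123u));
    return 0;
}

So for p = 14 the hardware simply reads the entry at PTBR + 14 * sizeof(entry); the PN itself is never stored in the entry.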

Modbus server node register mapping strategy

I have a small Raspberry Pi-like gadget running Linux. I'm using uModbus to let it act as a Modbus server so that it can provide data and control points to a larger PLC system.
My device is a gateway to a series of subnodes. Let's pretend, for example, that they are micro weather sensors. For each one, the device gathers pressure, humidity, temperature, and wind speed. The subnode count is variable but can be as high as 100. It is possible that at some point an additional value or two will be added to what the sensors can read.
To flatten these values into Modbus registers, I can choose to order by subnode first or subnode second.
A subnode-first scheme might look something like
baseAddress = 01000 + (subnodeIndex * 10)
and then the individual values per subnode are just offsets from that.
The subnode-second scheme would assign a baseAddress for each value type (e.g. temperature starts at 00100, humidity at 00300, etc.).
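In C-like terms (all names and constants below are illustrative, not part of uModbus), the two schemes are:

/* Illustrative constants: up to 100 subnodes, 10 registers reserved
   per subnode (4 used today, 6 spare for future values). */
#define BASE_ADDR      1000
#define REGS_PER_NODE    10
#define MAX_NODES       100

/* Subnode-first: one subnode's values are contiguous. */
static int addr_subnode_first(int node, int value)
{
    return BASE_ADDR + node * REGS_PER_NODE + value;
}

/* Subnode-second: all subnodes' copies of one value are contiguous. */
static int addr_subnode_second(int node, int value)
{
    return BASE_ADDR + value * MAX_NODES + node;
}

With subnode-second, a client can read, say, all 100 temperatures in one block read; with subnode-first, it can read one subnode's whole record in one block.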
Is there a strong reason to prefer one scheme over the other? Or is there a scheme I'm missing? It seems that there's value in banking values together to take advantage of single block reads, but at the same time, leaving space for future expansion encourages some form of padding between values.