Virtual address vs Physical address [closed] - operating-system

I understand that physical address space depends on RAM. If we have 1GB of RAM, the physical address space size is 1GB. Whereas every process has its own virtual address space, e.g. a 32-bit machine has a 4GB virtual address space for each process.
If we write something to physical address space in a bare-metal application, say we write 1234 to address 0x1000, the data will be written at offset 0x1000 of the 1GB physical address space.
My doubt is: in virtual address space, process 1 wants to write 0x1234 at offset 0x1000, and in the same way process 2 wants to write the same data at the same offset. How will that work?
I want to understand how the multiple virtual address spaces are mapped to a single physical address space.
2.
I understand your point: if the physical address contains 40 bits, only then can it map RAM, ROM and other I/O.
Where can we look up the width (in bits) of the physical address space and the virtual address space?
Does bare-metal or microcontroller code access the physical address space directly? And does it lack multitasking and multithreading features?

Until now, what I understand is that physical address space depends on RAM. If we have 1GB of RAM, the physical address space size is 1GB.
For the physical address space; some physical addresses correspond to RAM, some are other things (ROM, devices), and a lot are literally nothing (unused). If you have 1 GiB of RAM, then physical addresses may be 40 bits and physical address space size may be 1024 GiB (where 1 GiB of it might be RAM, 128 MiB might be ROM and devices, and over 1022 GiB is literally nothing).
My doubt is: in virtual address space, process 1 wants to write 0x1234 at offset 0x1000, and in the same way process 2 wants the same data at the same offset. How will it work?
The CPU uses a set of tables (called "page tables") to convert virtual addresses into physical addresses. Typically each process has its own set of page tables, and when the OS does a task switch from one process to another it also switches from the previous process' page tables to the next process' page tables.
If process 1 writes 0x1234 at virtual address 0x1000 in its virtual address space then (using process 1's page tables) the CPU might write the value 0x1234 at physical address 0x1111000. If process 2 writes 0x4567 at virtual address 0x1000 in its virtual address space then (using process 2's page tables) the CPU might write the value 0x4567 at physical address 0x2222000. Because the virtual address spaces are different (because the processes have their own page tables) the physical addresses are different, so there's no problem (physical address 0x1111000 has nothing to do with physical address 0x2222000).
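The per-process translation described above can be sketched as a small simulation. This is only a toy model: the 4 KiB page size, the flat dictionaries standing in for page tables, and the names page_tables, translate and write are assumptions made for illustration, not how any real MMU is programmed.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages (an assumption for this sketch)

# Hypothetical per-process page tables: virtual page number -> physical frame number.
page_tables = {
    "process1": {0x1: 0x1111},
    "process2": {0x1: 0x2222},
}

physical_memory = {}  # sparse model of RAM: physical address -> value


def translate(process, vaddr):
    """Translate a virtual address using the given process's page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_tables[process][vpn]  # a KeyError here would model a page fault
    return frame * PAGE_SIZE + offset


def write(process, vaddr, value):
    """Store a value at a virtual address of the given process."""
    physical_memory[translate(process, vaddr)] = value


# Both processes write to the same virtual address.
write("process1", 0x1000, 0x1234)
write("process2", 0x1000, 0x4567)

for paddr, value in sorted(physical_memory.items()):
    print(f"physical {paddr:#x} holds {value:#x}")
```

Both writes use virtual address 0x1000, but they land in different physical locations because each lookup goes through a different table.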

Related

How does Cpu generate the logical address for the disk

If you think the question is not proper, please edit or correct it; I am asking about what I googled and extracted from the internet.
The CPU generates a logical address, which is converted into a physical address. But the question here is: how does the CPU generate the logical address for data that is stored on the disk?
It doesn't, at least not the way you're thinking it does.
Normally a program tries to access memory at a virtual address, but the CPU sees "virtual address isn't present" and complains to the OS (kernel) via a page fault. The page fault handler figures out what went wrong, loads the data from disk into RAM, maps the RAM into the virtual address space, then lets the program continue/retry as if nothing happened. The second time the CPU tries to execute the code the data is in RAM so it works fine.
Of course the OS has to know the reason why data at a virtual address wasn't present, which means that the OS has to keep track of extra information that the CPU doesn't have - if the virtual address actually isn't valid at all (e.g. NULL), or if the data is in swap space (and where), or if the data is part of a memory mapped file (and which offset of which file).
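The fault-then-retry sequence above can be sketched as a toy simulation. Everything here is hypothetical (the AddressSpace class, the ("swap", 7) location key, the dictionary used as a disk); a real kernel does this with hardware exceptions and page tables rather than Python exceptions.

```python
PAGE_SIZE = 4096


class PageFault(Exception):
    """Raised by the simulated CPU when a virtual page is not present."""


class InvalidAddress(Exception):
    """Raised by the simulated OS when nothing backs the address (e.g. NULL)."""


class AddressSpace:
    def __init__(self, backing):
        # OS-side bookkeeping the CPU never sees:
        # virtual page number -> where the page's data lives on disk.
        self.backing = backing
        # Pages currently in RAM: virtual page number -> page contents.
        self.present = {}

    def cpu_load(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.present:
            raise PageFault(vpn)
        return self.present[vpn][offset]

    def handle_page_fault(self, vpn, disk):
        if vpn not in self.backing:
            raise InvalidAddress(hex(vpn * PAGE_SIZE))
        # Load the data from disk into RAM and map it into the address space.
        self.present[vpn] = bytearray(disk[self.backing[vpn]])

    def load(self, vaddr, disk):
        try:
            return self.cpu_load(vaddr)
        except PageFault as fault:
            self.handle_page_fault(fault.args[0], disk)
            return self.cpu_load(vaddr)  # retry the faulting access


# Hypothetical disk contents, keyed by (kind, location).
disk = {("swap", 7): bytes([0xAB]) * PAGE_SIZE}
space = AddressSpace({2: ("swap", 7)})

first = space.load(2 * PAGE_SIZE + 5, disk)   # faults, gets loaded, then succeeds
second = space.load(2 * PAGE_SIZE + 6, disk)  # page is already present
print(hex(first), hex(second))
```

The backing dictionary plays the role of the "extra information" the OS keeps: an address with no entry there is treated as invalid instead of being loaded.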
There is a virtual address space and a physical address space. The virtual address space defines the address space of the program. Say there is a program of 4GB; in that case we can represent the address space for that program with 32 bits (2^32 bytes = 4GB), from 0 to 0xFFFFFFFF.
This is the space the program thinks it has.
During compilation, the program is given logical addresses based on the address space of the program.
After the program is loaded into memory, the program counter assigned to it points to these (logical/virtual) addresses, and the CPU only asks to fetch from these addresses, where the program's instructions are. The CPU doesn't know where the instructions are located in physical memory; it is up to the MMU to translate the addresses.
The main thing is that the CPU doesn't actually generate these addresses. They are the addresses that were given to the program during compilation, and the instructions in the program use them to reference each other. So the CPU just looks at what the program counter is pointing to and asks for those instructions, which are located in physical memory.
Whenever an address is needed for fetching data or an operand, or for the instruction pointed to by the PC, the CPU issues these addresses.

How are the sizes of pointers determined in computer systems? Via virtual or physical addresses?

I have an exam tomorrow on virtual memory address translation and I'm rather confused on this topic. I know the CPU will generate a virtual address to then access a physical address. So if we have a system with 32 bit virtual addresses, and 64 bit physical addresses, then the pointers for user level processes I'm guessing will be 8 bytes.
My logic is because the virtual address is being translated to the physical address, so this number will always be coming from the physical address.
No, user-space processes work only with virtual addresses (32-bit in your example).
The memory they "see" is their own private virtual address space. (They can make system calls like mmap and munmap to request that pages in that address-space be backed by files, or by anonymous RAM like for C malloc.) But they don't know anything about where in physical memory those pages are located.
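As a small illustration of a process asking for anonymous memory, Python's standard mmap module wraps the same kind of request. This is a sketch, not what malloc does internally: passing -1 as the file descriptor asks for an anonymous mapping, and the one-page length is just a choice for the example.

```python
import mmap

# Ask the OS for one page of anonymous (not file-backed) memory.
page = mmap.mmap(-1, mmap.PAGESIZE)

# The process only ever sees its own virtual mapping; which physical
# page backs it (if any yet) is the kernel's business.
page[0:5] = b"hello"
data = bytes(page[0:5])
print(data)

page.close()  # roughly the munmap step
```

Nothing in this program can tell where in physical memory the page ended up.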
The OS can even "fake it" by paging out some of their pages to swap space / page file, and handling the page fault if the process touches such a page by doing I/O to bring it back in and then waking up the process to rerun the load or store instruction that page faulted.
Hardware translates virtual addresses to physical addresses on every memory access. To make this fast, a TLB caches recently-used translations. On a TLB miss, hardware does a "page walk", reading the page tables to find the right virtual page->physical page translation.
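A toy model of the TLB-in-front-of-page-tables idea, under loud assumptions: the page table is a flat dictionary (real ones are multi-level trees walked by hardware), the TLB holds two entries, and eviction is simple first-in-first-out. The MMU class and its fields are hypothetical names for this sketch.

```python
PAGE_SIZE = 4096


class MMU:
    def __init__(self, page_table, tlb_entries=2):
        self.page_table = page_table  # virtual page number -> physical frame number
        self.tlb = {}                 # small cache of recent translations
        self.tlb_entries = tlb_entries
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.tlb:
            self.hits += 1
        else:
            self.misses += 1
            frame = self.page_table[vpn]  # the "page walk" on a TLB miss
            if len(self.tlb) >= self.tlb_entries:
                oldest = next(iter(self.tlb))  # dicts keep insertion order
                del self.tlb[oldest]
            self.tlb[vpn] = frame
        return self.tlb[vpn] * PAGE_SIZE + offset


mmu = MMU({0: 5, 1: 9, 2: 3})
for vaddr in (0x0010, 0x0020, 0x1000, 0x2000, 0x0000):
    print(f"{vaddr:#06x} -> {mmu.translate(vaddr):#x}")
print("hits:", mmu.hits, "misses:", mmu.misses)
```

The second access to page 0 hits the TLB; the last one misses again because page 0's entry was evicted in between.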
The OS manages the page tables, choosing any physical page as "backing" for a virtual page.
Physical addresses wider than virtual?
Under a multi-tasking OS, multiple processes can be running. Each one has its own 32-bit (4GiB) virtual address space.
The size of physical address space limits how much RAM you can put in a machine total, and can be different from how much any single process can use at once. Changing page tables is faster than reading from disk, so even if it can't all be mapped at once, a kernel can still make use of lots of physical RAM for pagecache (cache of file contents from disk).
More importantly, multiple processes can be running, each with their own up-to-4GiB of virtual address space backed by physical memory, up to the amount of physical RAM in the system. On a CPU with multiple cores, these can be running simultaneously, truly allowing simultaneous use of more than 4GB of RAM. But not by any single process.
x86 is a good example here: Running an x86-64 kernel with 32-bit user-space gives us pretty much the situation you describe. (A 64-bit kernel can use 64-bit virtual addresses, but nevermind that, just look at user-space.)
You can have several processes each using about 4GiB of physical RAM.
The x86-64 page-table format has room for physical addresses as wide as 52-bit, although current HW doesn't use that many. (Only as wide as the amount of RAM it actually supports attaching. Saves bits in the TLBs, and other parts of the CPU). https://en.wikipedia.org/wiki/X86-64#Architectural_features
Before x86-64, 32-bit x86 kernels could use the same page-table format but with 36-bit physical addresses, on CPUs from Pentium Pro and later (https://en.wikipedia.org/wiki/Physical_Address_Extension). That allowed up to 64GB of physical RAM. (A 32-bit kernel would typically reserve 1 or 2GB of virtual address space for itself so each process could really only use up to 3 or 2GB, but it's the same idea. Not a problem for 32-bit user-space under a 64-bit kernel though, so that made a simpler example.)
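The sizes quoted here follow directly from the address widths. A quick arithmetic check, where format_size is just a hypothetical helper for printing:

```python
def format_size(num_bytes):
    """Render a power-of-two byte count using binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    index = 0
    while num_bytes >= 1024 and index < len(units) - 1:
        num_bytes //= 1024
        index += 1
    return f"{num_bytes} {units[index]}"


widths = [
    ("32-bit virtual address", 32),
    ("36-bit PAE physical address", 36),
    ("52-bit x86-64 page-table limit", 52),
]
for label, bits in widths:
    print(f"{label}: {format_size(2 ** bits)}")
```

So 32 bits gives 4 GiB per address space, 36 bits gives the 64 GiB PAE limit, and 52 bits gives 4 PiB.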
Virtual addresses are visible to user-level processes. They should never see the physical address. So if virtual addresses are 32-bit, pointers in user-level processes are also 32-bit, i.e. 4 bytes.
The system/kernel then needs to do the translation somehow. It will know the virtual address and must translate it to the physical address, so it will eventually have a physical pointer, 64-bit = 8 byte. But once again, this address/pointer are for "internal use" only.
In practice though, you will have virtual and physical addresses of the same size, matching the word size of the CPU and its architecture (x86 vs x86_64). A virtual to physical translation will normally need to happen in a page fault, which happens when a user-level process attempts to access memory that is not loaded. To access it in the first place, it needs to have e.g. dereferenced a pointer pointing to that address, which would be done with a memory access instruction of the particular CPU architecture, which is done with word-sized addresses.
The programmer will only see virtual addresses. The physical address space is opaque to the programmer and the user. Therefore, the size of a pointer depends on the size of the virtual address. In the particular case you have given, the maximum amount of memory your system can consume is dictated by your virtual address space. This is why a 32-bit OS on 64-bit hardware is limited to a maximum of 4 gigs of memory. But in the case of a 64-bit virtual address, even if we have insufficient RAM, we can offload some of the pages to secondary storage to give the illusion that we have more RAM available. When a page that is located in secondary storage is accessed, a page fault occurs and the page is transferred to RAM.
Edit : As Peter said in the comments, the virtual address limit affects the maximum memory a Process can consume.
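To see the pointer size of the process you are actually running, you can ask the language runtime. This sketch uses Python's ctypes and struct modules; it reports the pointer width of the running interpreter build (typically 8 bytes on a 64-bit build and 4 on a 32-bit one), which says nothing about how wide the machine's physical addresses are.

```python
import ctypes
import struct

# Size of a C "void *" in the current process, in bytes.
pointer_bytes = ctypes.sizeof(ctypes.c_void_p)

# The struct module reports the same thing through its "P" format code.
assert pointer_bytes == struct.calcsize("P")

print(f"pointers in this process are {pointer_bytes} bytes ({pointer_bytes * 8}-bit)")
```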

Where are Logical addresses located?

I know that a user program generates logical addresses. Suppose there is a small code snippet in C. When an address is printed, the addresses are virtual addresses. My question is: where are those addresses fetched from? Where exactly do the allocated values and variables stay? In main memory or secondary memory? If main memory, then why is there a physical address?
User mode programs only see logical addresses. Only the operating system (kernel mode) sees physical memory.
My question is where are those addresses fetched from?
Those are the logical addresses assigned by the program loader and linker.
where exactly do the allocated values and variables stay? At main memory or secondary memory?
In a virtual memory system, it may be in main memory or secondary storage.
If main memory then why there is physical address?
It is a logical address that is mapped to a physical address using page tables.
I am kind of new to the computer architecture and Operating System,
but I will try to answer as much as I can. As far as I have
understood about the logical address (Which I still have trouble
understanding, about where it is fetched from or where it is stored.
I mean these addresses (numbers) gotta be stored somewhere,
otherwise CPU can't generate it by itself, right?), these addresses
are assigned by CPU or a processor and depends on the CPU
architecture. Each process is assigned a virtual/logical
address. And this logical address is translated to physical address
by Memory Management Unit of CPU (MMU).
Where exactly do the allocated values and variables stay? As user3344003 said, it may be in main memory or secondary storage.
If main memory then why there is physical address? The reason lies
in the concept of Virtual Memory. Each process has its own virtual
address space and a page table. A process's logical addresses are mapped
through this page table to physical memory (RAM). Whatever the
logical address is, it gets mapped to a physical address. If
physical memory gets full, the OS evicts some of the less used
or unused processes to secondary storage and puts the needed process
in RAM. That way multiple processes can run at the same time.
Every process assumes that it has all the space in RAM just for
itself. If not for virtual memory, physical memory would fill up
and a process might crash and might take the OS down as well.
Hope it helps. I am still learning, if my understanding about Logical address and virtual memory is wrong then Please comment.

Diff. between Logical memory and Physical memory

While understanding the concept of Paging in Memory Management, I came through the terms "logical memory" and "physical memory". Can anyone please tell me the diff. between the two ???
Does physical memory = Hard Disk
and logical memory = RAM
There are three related concepts here:
Physical -- An actual device
Logical -- A translation to a physical device
Virtual -- A simulation of a physical device
The term "logical memory" is rarely used because we normally use the term "virtual memory" to cover both the virtual and logical translations of memory.
In an address translation, we have a page index and a byte index into that page.
The page index to the Nth page in the process could be called a logical memory. The operating system redirects the ordinal page number into some arbitrary physical address.
The reason this is rarely called logical memory is that the page may be simulated using paging, becoming a virtual address.
Address translation is a combination of logical and virtual. The normal usage is to just call the whole thing "virtual memory."
We can imagine that in the future, as memory grows, that paging will go away entirely. Instead of having virtual memory systems we will have logical memory systems.
Not a lot of clarity here thus far, here goes:
Physical Memory is what the CPU addresses on its address bus. It's the lowest level software can get to. Physical memory is organized as a sequence of 8-bit bytes, each with a physical address.
Every application having to manage its memory at a physical level is obviously not feasible. So, since the early days, CPUs introduced abstractions of memory known collectively as "Memory Management." These are all optional, but ubiquitous, CPU features managed by your kernel:
Linear Memory is what user-level programs address in their code. It's seen as a contiguous address space, but behind the scenes each linear address maps to a physical address. This allows user-level programs to address memory in a common way and leaves the management of physical memory to the kernel.
However, it's not so simple. User-level programs address linear memory using different memory models. One you may have heard of is the segmented memory model. Under this model, programs address memory using logical addresses. Each logical address refers to a table entry which maps to a linear address space. In this way, the o/s can break up an application into different parts of memory as a security feature (details out of scope for here)
In Intel 64-bit (IA-32e, 64-bit submode), segmented memory is never used, and instead every program can address all 2^64 bytes of linear address space using a flat memory model. As the name implies, all of linear memory is available at a byte-accessible level. This is the most straightforward.
Finally we get to Virtual Memory. This is a feature of the CPU facilitated by the MMU, totally unseen to user-level programs, and managed by the kernel. It allows physical addresses to be mapped to virtual addresses, organized as tables of pages ("page tables"). When virtual memory ("paging") is enabled, tables can be loaded into the CPU, causing memory addresses referenced by a program to be translated to physical addresses transparently. Page tables are swapped in and out on the fly by the kernel when different programs are run. This allows for optimization and security in process/memory management (details out of scope for here)
Keep in mind, Linear and Virtual memory are independent features which can work in conjunction. If paging is disabled, linear addresses map one-to-one with physical addresses. When enabled, linear addresses are mapped to virtual memory.
Notes:
This is all linux/x86 specific but the same concepts apply almost everywhere.
There are a ton of details I glossed over
If you want to know more, read The Intel® 64 and IA-32 Architectures Software Developer Manual, from where I plagiarized most of this
I'd like to add a simple answer here.
Physical Memory : This is the memory that is actually present and every process needs space here to execute their code.
Logical Memory:
To a user program the memory seems contiguous. Suppose a program needs 100 MB of space in memory. To this program, a virtual address space / logical address space starts from 0 and continues to some finite number. This address is generated by the CPU, and the MMU then maps this virtual address to a real physical address through some page table, or whatever other way the mapping is implemented.
Please correct me or add some more content here. Thanks !
Physical memory is RAM; it actually belongs to main memory. A logical address is the address generated by the CPU. In paging, a logical address is mapped to a physical address with the help of page tables. A logical address contains a page number and an offset.
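The "page number plus offset" split is plain bit arithmetic. A minimal sketch, assuming 4 KiB pages (a 12-bit offset) and a hypothetical one-entry page table; split and to_physical are names made up for the example.

```python
PAGE_SHIFT = 12                # 4 KiB pages, an assumption for this sketch
PAGE_SIZE = 1 << PAGE_SHIFT
OFFSET_MASK = PAGE_SIZE - 1


def split(logical_address):
    """Return (page number, offset within the page)."""
    return logical_address >> PAGE_SHIFT, logical_address & OFFSET_MASK


def to_physical(logical_address, page_table):
    """Swap the page number for a frame number and keep the offset unchanged."""
    page, offset = split(logical_address)
    return (page_table[page] << PAGE_SHIFT) | offset


page, offset = split(0x00ABCDEF)
print(f"page {page:#x}, offset {offset:#x}")

# Hypothetical mapping: page 0xABC lives in frame 0x300.
print(f"physical {to_physical(0x00ABCDEF, {0xABC: 0x300}):#x}")
```

Only the upper bits are translated; the offset passes through as-is, which is why pages and frames must be the same size.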
An address generated by the CPU is commonly referred to as a logical address, whereas an address seen by the memory unit—that is, the one loaded into the memory-address register of the memory—is commonly referred to as a physical address
The physical address is the actual address of the frame where each page will be placed, whereas the logical address is the address generated by the CPU for each page.
What exactly is a frame?
Processes are retrieved from secondary memory and stored in main memory using the paging technique.
Processes are kept in secondary memory as non-contiguous pages, which implies they are stored in arbitrary locations.
Those non-contiguous pages are brought into main memory, each into a frame, by the paging operating system.
The operating system divides main memory into equal-sized frames, and the pages of all processes retrieved from secondary memory are stored in them concurrently.

In virtual memory, can two different processes have the same address?

This is an interview question I found in a website, the questions says: "In virtual memory, can two different processes have the same address? When you answer "No" which is correct, how one process can access another process' memory, for example the debugger can access the variables and change them while debugging?"
What I understand is :
Two different processes can have the same virtual memory address. This is because each process has its own page table. Each process thinks it has 4GB of memory on a 32-bit machine. So both P1 and P2 can access address 0xabcdef, but the physical memory location might be different. Isn't this right?
The debugger works on the same principle - 2 processes can access the same address. So it can modify variables etc on the fly.
Theoretically, every process executed by a user in any present popular OS (Windows, Linux, Unix, Solaris etc.) is initially allowed to use an address range of 4GB (0x00000000 to 0xffffffff on a 32-bit platform), whether it's a simple hello world program or a complex web container hosting the Stack Overflow site. It means every process has its range starting from the same start address and ending with the same end address, VIRTUALLY. So obviously every process has the same virtual addresses in its respective virtual address space. So the answer to your first question is YES.
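On a Unix-like system you can watch two processes use the same virtual address for different data. This is a sketch under assumptions: it needs os.fork (so it will not run on Windows), it uses a ctypes integer so that we can read a real virtual address, and same_address_different_data is a name invented here.

```python
import ctypes
import os


def same_address_different_data():
    """Fork, let the child overwrite a variable, and compare both views."""
    value = ctypes.c_int(1234)
    parent_address = ctypes.addressof(value)  # a virtual address

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: same virtual address space layout, but its own copy of the data.
        os.close(read_fd)
        value.value = 4567
        report = f"{ctypes.addressof(value)} {value.value}"
        os.write(write_fd, report.encode())
        os.close(write_fd)
        os._exit(0)

    # Parent: collect what the child saw.
    os.close(write_fd)
    child_report = os.read(read_fd, 1024).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    child_address, child_value = map(int, child_report.split())
    return parent_address, value.value, child_address, child_value


p_addr, p_val, c_addr, c_val = same_address_different_data()
print(f"parent: {p_addr:#x} holds {p_val}")
print(f"child:  {c_addr:#x} holds {c_val}")
```

The address printed by parent and child is the same number, yet the child's write does not show up in the parent, because after the write the two virtual pages are backed by different physical pages.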
The difference comes when the OS executes a process. Modern OSes are multitasking and run more than one process at any point in time, so accommodating the full 4GB of every process in main memory is not feasible at all. So OSes use a paging system, in which they divide the virtual address range (0x00000000 to 0xffffffff) into pages of 4K size (not always). Before starting the process, the OS loads the pages needed initially into main memory, and then loads other virtual pages as required. This loading of virtual memory into physical memory (main memory) is called memory mapping. In this process you map a page's virtual address range to a physical address range (like virtual address range 0x00000000 to 0x00001000 mapped to physical address range 0x00300000 to 0x00301000) based on the slots free in main memory. So at any point in time only one virtual address range will be mapped to that particular physical address range, so the answer to your second question is NO.
BUT
The shared memory concept is an exception, where processes can share some of their virtual address range with each other, and it is mapped to a common physical address range. So in this case the answer can be YES.
As an example, on Linux every executable requires the libc.so library to run. Every process loads its required libraries and allocates some virtual address page ranges for them in its address space. Now consider a scenario where you are executing hundreds of processes, each of which requires libc.so. If the OS loaded a separate copy of libc.so for every process, you can imagine the level of duplication, and it is highly possible that at any point in time you would have multiple copies of the libc.so pages in main memory. To avoid this redundancy, the OS maps libc.so into a virtual address range of every process that is backed by one fixed physical address range in main memory. So every process refers to that fixed physical address range to execute any code in libc.so. In this case the processes share some physical address ranges as well.
But there is no chance of two processes having the same physical address at the same time in the mappings for user malloc'ed virtual address ranges.
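The libc.so sharing described above can be pictured with two toy page tables that point at the same frames for the library and at different frames for private data. All frame numbers, the 0x7F000 base page, and the build_page_table helper are made up for the illustration.

```python
# Hypothetical physical frames holding the single in-memory copy of libc.so code.
LIBC_FRAMES = [0x300, 0x301]


def build_page_table(private_frames, libc_base_vpn):
    """Map private pages from virtual page 0 upward, then map the shared library."""
    table = {vpn: frame for vpn, frame in enumerate(private_frames)}
    for index, frame in enumerate(LIBC_FRAMES):
        table[libc_base_vpn + index] = frame
    return table


p1 = build_page_table([0x10, 0x11], libc_base_vpn=0x7F000)
p2 = build_page_table([0x20, 0x21], libc_base_vpn=0x7F000)

shared = set(p1.values()) & set(p2.values())
print("frames shared by both processes:", sorted(hex(f) for f in shared))
print("private frame for page 0:", hex(p1[0]), "vs", hex(p2[0]))
```

Only the library frames appear in both tables; the private (malloc-style) pages of each process map to frames the other process never sees.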
Hope it helps.
1)
Same physical memory address at the same time: NO
Same virtual memory address at the same time: YES (each one maps to a different physical address, or swap space)
2) I think debuggers don't directly access the process being debugged, but communicate with the runtime in the debugged process to make those changes.
That said, maybe the OS or processor instructions provide a way to access or modify another process's memory if you have the right to. That doesn't mean it has the SAME address; it only means process 1 can say "access memory #address1 in Process2". Someone (processor / OS / runtime) will do that for process 1.
Yes, it's definitely possible for the same address to map to different physical memory depending on the process that's referencing it. This is in fact the case under Windows.
Each process has an address space of 4GB on a 32-bit system. Where this real 4GB lives is managed by the OS. So in principle two different processes can have the same addresses, each local to its process.
Now when one process has to read the memory of another process, it has to either communicate with the other process (memory-mapped files etc.) or use debug APIs like OpenProcess/ReadProcessMemory.
What I am sure of is that one process cannot directly go and read the virtual memory of another process, at least in Win32, without the help of the OS.
Sometimes I feel like the "elder" in the Minolta commercial... In the 1960's Multics was created using Virtual Memory. The last Multics system was shut down October 30, 2000 at 17:08Z.
In Multics, only one copy of any program was present in memory, regardless of how many users were running it. So that means that each user process had both the same physical and virtual address for the program.
When I look at the Windows Task Manager and see multiple copies of a program (e.g. svchost.exe) I wonder why / how the revolutionary concepts in Multics were lost.