How are the sizes of pointers determined in computer systems? Via virtual or physical addresses?

I have an exam tomorrow on virtual memory address translation and I'm rather confused on this topic. I know the CPU generates a virtual address which is then translated to access a physical address. So if we have a system with 32-bit virtual addresses and 64-bit physical addresses, I'm guessing the pointers for user-level processes will be 8 bytes.
My reasoning is that since the virtual address is translated to the physical address, the pointer size will always come from the physical address.

No, user-space processes work only with virtual addresses (32-bit in your example).
The memory they "see" is their own private virtual address space. (They can make system calls like mmap and munmap to request that pages in that address-space be backed by files, or by anonymous RAM like for C malloc.) But they don't know anything about where in physical memory those pages are located.
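As a minimal sketch (assuming Linux/POSIX, where MAP_ANONYMOUS is available), this is what requesting anonymous-RAM backing for part of the virtual address space looks like; note the program only ever sees a virtual address:

    /* Ask the kernel to back four pages of our *virtual* address space
     * with anonymous RAM. We get a virtual address back and never learn
     * which physical pages (if any, yet) stand behind it. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void) {
        size_t len = 4096 * 4;
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        memset(p, 0xAB, len);               /* touching it faults pages in */
        printf("virtual address: %p\n", p); /* physical location: unknown to us */
        munmap(p, len);
        return 0;
    }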
The OS can even "fake it" by paging out some of their pages to swap space / page file, and handling the page fault if the process touches such a page by doing I/O to bring it back in and then waking up the process to rerun the load or store instruction that page faulted.
Hardware translates virtual addresses to physical addresses on every memory access. To make this fast, a TLB caches recently-used translations. On a TLB miss, hardware does a "page walk", reading the page tables to find the right virtual page->physical page translation.
The OS manages the page tables, choosing any physical page as "backing" for a virtual page.
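To make the mechanics concrete, here is a hedged sketch of what that translation computes, assuming 4 KiB pages and pretending the table is a single level (real page tables, as on x86, are multi-level):

    /* Split a 32-bit virtual address into its virtual page number (what
     * the page tables translate) and its page offset (copied through
     * unchanged): paddr = (pfn << 12) | offset. */
    #include <stdio.h>

    #define PAGE_SHIFT 12                /* 4 KiB pages */
    #define PAGE_SIZE  (1u << PAGE_SHIFT)

    int main(void) {
        unsigned vaddr  = 0x12345678u;   /* arbitrary example address */
        unsigned vpn    = vaddr >> PAGE_SHIFT;
        unsigned offset = vaddr & (PAGE_SIZE - 1);
        printf("vpn=0x%x offset=0x%x\n", vpn, offset);
        return 0;
    }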
Physical addresses wider than virtual?
Under a multi-tasking OS, multiple processes can be running. Each one has its own 32-bit (4GiB) virtual address space.
The size of physical address space limits how much RAM you can put in a machine total, and can be different from how much any single process can use at once. Changing page tables is faster than reading from disk, so even if it can't all be mapped at once, a kernel can still make use of lots of physical RAM for pagecache (cache of file contents from disk).
More importantly, multiple processes can be running, each with their own up-to-4GiB of virtual address space backed by physical memory, up to the amount of physical RAM in the system. On a CPU with multiple cores, these can be running simultaneously, truly allowing simultaneous use of more than 4GB of RAM. But not by any single process.
x86 is a good example here: Running an x86-64 kernel with 32-bit user-space gives us pretty much the situation you describe. (A 64-bit kernel can use 64-bit virtual addresses, but nevermind that, just look at user-space.)
You can have several processes each using about 4GiB of physical RAM.
The x86-64 page-table format has room for physical addresses as wide as 52-bit, although current HW doesn't use that many. (Only as wide as the amount of RAM it actually supports attaching. Saves bits in the TLBs, and other parts of the CPU). https://en.wikipedia.org/wiki/X86-64#Architectural_features
Before x86-64, 32-bit x86 kernels could use the same page-table format but with 36-bit physical addresses, on CPUs from Pentium Pro and later.
https://en.wikipedia.org/wiki/Physical_Address_Extension. That allowed up to 64GB of physical RAM. (A 32-bit kernel would typically reserve 1 or 2GB of virtual address space for itself so each process could really only use up to 3 or 2GB, but it's the same idea. Not a problem for 32-bit user-space under a 64-bit kernel though, so that made a simpler example.)

Virtual addresses are visible to user-level processes. They should never see the physical address. So if virtual addresses are 32-bit, pointers in user-level processes are also 32-bit, i.e. 4 bytes.
The system/kernel then needs to do the translation somehow. It will know the virtual address and must translate it to the physical address, so it will eventually have a physical pointer, 64-bit = 8 bytes. But once again, this address/pointer is for "internal use" only.
In practice, though, virtual and physical addresses often have the same size, matching the word size of the CPU and its architecture (x86 vs x86_64). Translation from virtual to physical happens on every memory access, and a page fault is raised when a user-level process attempts to access memory that is not loaded. To access it in the first place, the process must have dereferenced a pointer pointing to that address, which is done with a memory access instruction of the particular CPU architecture, using word-sized addresses.
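A quick way to see this from user space, as a sketch (building with -m32 assumes a gcc/clang toolchain that has 32-bit support installed):

    /* Prints 4 when built as a 32-bit binary (e.g. gcc -m32), 8 when
     * built as a 64-bit binary: pointer size follows the *virtual*
     * address width of the build target, never the physical one. */
    #include <stdio.h>

    int main(void) {
        printf("sizeof(void *) = %zu bytes\n", sizeof(void *));
        return 0;
    }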

The programmer will only see virtual addresses. The physical address space is opaque to the programmer and the user. Therefore, the size of a pointer depends on the size of the virtual address. In the particular case you have given, the maximum amount of memory your system can consume is dictated by your virtual address space. This is why a 32-bit OS on 64-bit hardware is limited to a maximum of 4 GiB of memory. But in the case of a 64-bit virtual address, even if we have insufficient RAM, we can offload some of the pages to secondary storage to give the illusion that we have more RAM available. If a page is located in secondary storage, a page fault occurs and the page is transferred to RAM.
Edit: As Peter said in the comments, the virtual address limit affects the maximum memory a process can consume.

Related

Memory Address Translation in OS

Is Memory address translation only useful when the total size of virtual memory
(summed over all processes) needs to be larger than physical memory?
Basically, the size of virtual memory depends on what you call "virtual memory". If you mean the virtual memory of one process, its size is limited by the processor's address bits, not by the amount of physical memory. If you mean the whole virtual memory of all processes, it can (technically) grow without bound, because every process gets a whole address space of its own. The virtual address space of one process cannot be bigger than what the processor can address, because it has a limited number of address bits. In modern long mode the processor has only 48 bits to address memory at the byte level. That allows a very large address space, yet most systems will have only 8GB to 32GB of RAM.
Technically, on an 8GB RAM computer every process could have 8GB allocated. I say technically because, eventually, the computer would be constantly evicting page frames from RAM, putting so much overhead on the OS that the system would freeze. In the end, the total size of the virtual memory of every process is limited by the capacity of your system (and OS) to run an efficient page-swapping algorithm (and by your willingness to tolerate a slow system).
Now to answer your question: paging (virtual memory) is also used to avoid fragmentation and to secure the system. With the old segmentation model, fragmentation was an issue because a complex algorithm had to determine which part of memory a process gets. With paging, the smallest granularity of memory is 4KB. This makes everything much easier because a small process just gets a 4KB page and can work in that page the way it wants, while a bigger process gets several pages and can allocate more by making a system call. Paging mostly solves external fragmentation, because a process can get a page anywhere it's available and it makes no difference (apart from the latency of accessing high memory vs low memory). Internal fragmentation, however, remains an issue with paging.
Paging also secures the system. With segmentation you had several levels of ring protection. With paging you have only user or supervisor. With segmentation, memory is not well protected because one process can access the memory of another process in the same segment. With paging, there are two different protections. The first protection is the ring itself (user vs supervisor); the second is the page tables. The page tables isolate one process from another because memory accesses are translated to other positions in RAM. It is the job of the OS to fill the page tables properly so that one process doesn't have access to the physical memory allocated to another process. The user vs supervisor bit in the page tables prevents a process from accessing the kernel except via a system call interface (the syscall instruction in x86 assembly).
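As a rough illustration, the user/supervisor protection described above lives in the page-table entries themselves. The bit positions below are the real low flag bits of a classic 32-bit x86 page-table entry; everything else in the sketch is simplified:

    #include <stdio.h>

    #define PTE_PRESENT (1u << 0) /* page is mapped */
    #define PTE_WRITE   (1u << 1) /* writable (otherwise read-only) */
    #define PTE_USER    (1u << 2) /* user-mode access allowed (else supervisor only) */

    int main(void) {
        /* A page the user process may read and write: */
        unsigned user_pte_flags = PTE_PRESENT | PTE_WRITE | PTE_USER;
        /* A kernel page: present and writable, but without PTE_USER any
         * user-mode access faults and the OS reports a protection error. */
        unsigned kernel_pte_flags = PTE_PRESENT | PTE_WRITE;
        printf("user flags: 0x%x, kernel flags: 0x%x\n",
               user_pte_flags, kernel_pte_flags);
        return 0;
    }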

Logical Address Space is Larger than Physical and Backing store combined

When a virtual address space is larger than physical memory, the OS can use swapping to evict page frames (e.g. LRU eviction). The CPU generates a page fault, and the page that is on disk is then swapped into main memory. What happens when the virtual address space is so large that neither primary memory nor disk has enough storage to hold it? What happens when a page frame is not on the disk either? Is another page fault raised?
What happens when the virtual address space is so large that neither primary memory nor disk has enough storage to hold it?
A virtual memory system maintains an image of the logical address space in secondary storage. A well-designed operating system is not going to allow a process to map a logical address that does not have a backing already in secondary storage. When your application calls a system service to map pages to the logical address space, the call will fail if there is no secondary storage available for the pages.
What happens when a page frame is not on the disk either?
There are some poorly designed operating systems that will map pages without having secondary storage behind them. You call the system service to map pages, and it succeeds even if the pages could not be backed by secondary storage.
In that case, you get a memory exception upon access (and get no hint in your application that the real problem was a memory allocation failure).
Is another page fault called?
No.
In a logical memory system (as supported by most processors) a page has two states:
1. Mapped
2. Unmapped
In a virtual memory system, there are three states:
1. Mapped
2. Unmapped and valid
3. Unmapped and invalid
When a page fault occurs, the processor just knows the page is not mapped to memory. The operating system then has to figure out whether the page is in secondary storage somewhere. If it is not, the operating system causes the process to see an exception. If it is, the operating system loads and maps the page, then lets the process continue on its merry way.
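A hedged sketch, in C, of the decision just described; every type and function name here is hypothetical, declared only so the three-state logic reads clearly:

    struct vm_page;   /* hypothetical types, for illustration only */
    struct process;

    enum page_state { MAPPED, UNMAPPED_VALID, UNMAPPED_INVALID };

    /* hypothetical helpers the sketch assumes the OS provides: */
    enum page_state lookup_state(struct vm_page *page);
    void load_from_backing_store(struct vm_page *page);
    void map_page(struct vm_page *page);
    void deliver_exception(struct process *proc);
    void resume(struct process *proc);

    void handle_page_fault(struct vm_page *page, struct process *proc) {
        switch (lookup_state(page)) {
        case UNMAPPED_VALID:               /* backing exists in secondary storage */
            load_from_backing_store(page); /* do the I/O */
            map_page(page);                /* fill in the translation */
            resume(proc);                  /* rerun the faulting instruction */
            break;
        case UNMAPPED_INVALID:             /* no backing anywhere */
            deliver_exception(proc);       /* process sees e.g. SIGSEGV */
            break;
        case MAPPED:                       /* spurious fault; nothing to do */
            resume(proc);
            break;
        }
    }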
When a virtual address space is larger than the physical memory, OS can use swapping to evict page frames (e.g. LRU eviction)
Let's assume that a virtual address is 48-bit (so the size of one virtual address space is 256 TiB), and you're running 123 processes, each with its own virtual address space. This adds up to a total of 31488 TiB of virtual address space. Note: this is "very normal" for a modern 80x86 PC running a modern OS (Windows, Linux, ...).
Out of this 31488 TiB:
- Almost all of it will be unused and marked as "not present". If software tries to access it you get a page fault, the page fault handler realizes it's a bug, and you probably end up with a SIGSEGV (or "blue screen of death" or ...). Because it isn't being used, the OS doesn't need any RAM or any disk space for it.
- Some of it will be the same things loaded into RAM once and then mapped into many virtual address spaces. This is extremely common for the kernel itself and for shared libraries/DLLs. It also includes cases where the same RAM is used for the virtual file system cache and for memory-mapped files, or the same RAM is mapped into 2 or more processes as "shared memory", or when the same RAM is mapped into 2 or more virtual address spaces as "copy on write" (e.g. in the aftermath of fork()).
- Some will be "allocate on write": literally the same page full of zeros mapped at many virtual addresses in many virtual address spaces, where if you write to it you get a page fault and the page fault handler allocates a new page of RAM for the page you tried to write to. This allows the OS to pretend that a huge amount of virtual space is allocated and filled with zeros without using any RAM or any disk space until it actually is modified (see the sketch below).
- Some will be (modified) data that is unique to a specific process.
The end result is that the 31488 TiB of total virtual space might only need a few GiB of RAM (and probably won't use swap space at all).
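Here is a small demo of that "allocate on write" trick (a sketch, assuming a 64-bit Linux system with the default over-commit policy): reserve a huge virtual range, dirty only a few pages, and the resident set stays tiny while the virtual size is enormous.

    /* Sketch: 8 GiB of virtual space, almost none of it backed by RAM.
     * Untouched pages stay mapped to the shared zero page; only the 16
     * pages we memset get real page frames allocated on first write. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        size_t len = (size_t)8 << 30;      /* 8 GiB of virtual space */
        char *p = malloc(len);
        if (!p) { perror("malloc"); return 1; }
        memset(p, 1, 4096 * 16);           /* dirty only 16 pages */
        puts("compare RSS vs VSZ with: ps -o rss,vsz -p <pid>");
        getchar();                         /* pause so you can look */
        free(p);
        return 0;
    }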
Over-commit
The OS does a pile of tricks to pretend memory was allocated when it actually wasn't. This creates the potential for a worst case where all the memory the OS pretends is allocated actually does need to be allocated. There are 2 ways to deal with this:
a) Refuse to let processes allocate more if you can't cover the worst case (e.g. return a "not enough memory" error when a process tries to allocate more than the OS can supply). This is bad because the worst case is extremely unlikely and you end up with software failing for no reason ("not enough memory" when there's actually plenty of memory to cover current requirements).
b) Allow processes to allocate more (within reason), even if you can't cover the worst case. This works fine most of the time, but if the worst case actually happens something has to break (e.g. the OS terminates a process to free up some RAM).
The best option (in my opinion) is the first option (don't allow over-commit), but with a large amount of swap space. Essentially, this is like "allow over-commit of RAM, but don't allow over-commit of swap space + RAM"; the OS will probably be running slowly (due to excessive swap space use) before it has to start telling processes "no more memory", and most of the time everything will be in RAM (with swap space ideally only used to cover the unlikely worst case).

Location of OS Kernel Data

I'm a beginner with operating systems, and I had a question about the OS Kernel.
I'm used to the standard notion of each user process having a virtual address space of stack, heap, data, and code. My question is that when a context switch occurs to the OS Kernel, is the code run in the kernel treated as a process with a stack, heap, data, and code?
I know there is a dedicated kernel stack, which the user program can't access. Is this located in the user program address space?
I know the OS needs to maintain some data structures in order to do its job, like the process control block. Where are these data structures located? Are they in user-program address spaces? Are they in some dedicated segment of memory for kernel data structures? Are they scattered all around physical memory wherever there is space?
Finally, I've seen some diagrams where OS code is located in the top portion of a user program's address space. Is the entire OS kernel located here? If not, where else does the OS kernel's code reside?
Thanks for your help!
Yes, the kernel has its own stack, heap, data structures, and code separate from those of each user process.
The code running in the kernel isn't treated as a "process" per se. The code is privileged meaning that it can modify any data in the kernel, set privileged bits in processor registers, send interrupts, interact with devices, execute privileged instructions, etc. It's not restricted like the code in a user process.
All of kernel memory and user process memory is stored in physical memory in the computer (or perhaps on disk if data has been swapped from memory).
The key to answering the rest of your questions is to understand the difference between physical memory and virtual memory. Remember that if you use a virtual memory address to access data, that virtual address is translated to a physical address before the data is fetched at the determined physical address.
Each process has its own virtual address space. This means that some virtual address a in one process can map to a different physical address than the same virtual address a in another process. Virtual memory has many important uses, but I'm not going to go into them here. The important point is that virtual memory enforces memory isolation. This means that process A cannot access the memory of process B. All of process A's virtual addresses map to some set of physical addresses and all of process B's virtual addresses map to a different set of physical addresses. As long as the two sets of physical addresses do not overlap, the processes cannot see or modify the memory of each other. User processes cannot access physical memory addresses directly - they can only make memory accesses with virtual addresses.
There are times when two processes may have some virtual addresses that do map to the same physical addresses, such as if they both mmap the same file, both use a shared library, etc.
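A short demonstration of this isolation (POSIX assumed): after fork(), parent and child print the same virtual address for the same variable, yet see different values once the child writes, because each process ends up with its own physical copy of that page (copy-on-write).

    /* Same virtual address in both processes, different contents: the
     * page tables of parent and child map it to different frames once
     * the child's write triggers copy-on-write. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int value = 1;

    int main(void) {
        pid_t pid = fork();
        if (pid == 0) {                       /* child */
            value = 2;
            printf("child:  &value=%p value=%d\n", (void *)&value, value);
            return 0;
        }
        wait(NULL);                           /* parent */
        printf("parent: &value=%p value=%d\n", (void *)&value, value);
        return 0;
    }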
So now to answer your question about kernel address spaces and user address spaces.
The kernel can have a separate virtual address space from each user process. This is as simple as changing the page directory pointer in the cr3 register (in an x86 processor) on each context switch. Since the kernel has a different virtual address space, no user process can access kernel memory as long as none of the kernel's virtual memory addresses map to the same physical addresses as any of the virtual addresses in any address space for a user process.
This can lead to a minor problem. If a user process makes a system call and passes a pointer as a parameter (e.g. a pointer to a buffer in the read system call), how does the kernel know which physical address corresponds to that buffer? The virtual address in the pointer maps to a different physical address in kernel space, so the kernel cannot just dereference the pointer. There are two options:
The kernel can traverse the user process page directory/tables to find the physical address that corresponds to the buffer. The kernel can then read/write from/to that physical address.
The kernel can instead include all of its mappings in the user address space (at the top of the user address space, as you mentioned). Now, when the kernel receives a pointer through the system call, it can just access the pointer directly since it is sharing the address space with the process.
Kernels generally go with the second option, since it's more convenient and more efficient. Option 1 is less efficient because each time a context switch occurs, the address space changes, so the TLB needs to be flushed and you lose all of your cached mappings. I'm simplifying things a bit here, since kernels have started doing things differently given the recently discovered Meltdown vulnerability.
This leads to another problem. If the kernel includes its mappings in the user process address space, what stops the user process from accessing kernel memory? The kernel sets protection bits in the page table that cause the processor to prohibit the user process from accessing the virtual addresses that map to physical addresses that contain kernel memory.
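A hedged sketch of the kind of check this implies when a system call receives a user pointer. It is loosely modelled on the idea behind Linux's copy_from_user, but every name below, and the 3 GiB / 1 GiB split, are illustrative assumptions rather than a real kernel API:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define USER_SPACE_TOP 0xC0000000u   /* assumed 3 GiB user / 1 GiB kernel split */

    /* Hypothetical helper: returns 0 on success, -1 if the user-supplied
     * range escapes user space (and must therefore be rejected). */
    int copy_in_from_user(void *dst, const void *user_src, size_t n) {
        uintptr_t start = (uintptr_t)user_src;
        if (start >= USER_SPACE_TOP || n > USER_SPACE_TOP - start)
            return -1;               /* range reaches into kernel addresses */
        /* A real kernel must also survive a page fault during this copy;
         * that machinery is omitted from the sketch. */
        memcpy(dst, user_src, n);
        return 0;
    }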
Take a look at these slides for more information.
I'm used to the standard notion of each user process having a virtual address space of stack, heap, data, and code. My question is that when a context switch occurs to the OS Kernel, is the code run in the kernel treated as a process with a stack, heap, data, and code?
On every modern operating system I am aware of, there is NEVER a context switch to the kernel. The kernel executes in the context of a process (some systems use the fiction of a reduced process context).
The "kernel" executes when a process enters kernel mode through an exception or an interrupt.
Each process (thread) normally has its own kernel-mode stack used after an exception. Usually there is a single interrupt stack for each processor.
https://books.google.com/books?id=FSX5qUthRL8C&pg=PA322&lpg=PA322&dq=vax+%22interrupt+stack%22&source=bl&ots=CIaxuaGXWY&sig=S-YsXBR5_kY7hYb6F2pLGjn5pn4&hl=en&sa=X&ved=2ahUKEwjrgvyX997fAhXhdd8KHdT7B8sQ6AEwCHoECAEQAQ#v=onepage&q=vax%20%22interrupt%20stack%22&f=false
I know there is a dedicated kernel stack, which the user program can't access. Is this located in the user program address space?
Each process has its own kernel stack. It is often in the user space with protected memory but could be in the system space. The interrupt stack is always in the system space.
Where are these data structures located? Are they in user-program address spaces?
They are generally in the system space. However, some systems do put some structures in the user space in protected memory.
Are they in some dedicated segment of memory for kernel data structures?
If they are in the user space, they are generally for an access mode more privileged than user mode and less privileged than kernel mode.
Are they scattered all around physical memory wherever there is space?
Things can be spread over physical memory pretty much at random.
The data structures in question are usually regular C structures situated in the RAM allotted to the kernel by the kernel allocator.
They are not usually accessible from regular processes because of the normal mechanisms for memory protection and paging (virtual memory).
A kind of exception to this are kernel threads, which have no userspace address space; the code they execute is always kernel code working with kernel-space data structures, hence within the isolated kernel memory.
Now for the interesting part: 64-bit Linux uses a thing called the Direct Map for memory organization, which means the full amount of physical memory available is mapped in the kernel page tables as one contiguous chunk. This is not true for 32-bit, where HIGHMEM was used to work around the limitation of 4GB address spaces.
Since the kernel has all the physical RAM visible and available to its own allocator, the kernel data structures in question can be situated pretty much at random with respect to physical addresses.
You can google these terms to gain additional information:
PTI (page table isolation)
__copy_from_user (esp. on esoteric architectures where this function is not just a bitwise copy)
EPT (Intel nested paging in virtual machines)

Diff. between Logical memory and Physical memory

While understanding the concept of paging in memory management, I came across the terms "logical memory" and "physical memory". Can anyone please tell me the difference between the two?
Does physical memory = Hard Disk
and logical memory = RAM
There are three related concepts here:
Physical -- An actual device
Logical -- A translation to a physical device
Virtual -- A simulation of a physical device
The term "logical memory" is rarely used because we normally use the term "virtual memory" to cover both the virtual and logical translations of memory.
In an address translation, we have a page index and a byte index into that page.
The page index of the Nth page in the process could be called a logical address. The operating system redirects the ordinal page number to some arbitrary physical address.
The reason this is rarely called logical memory is that the page may itself be simulated using paging, becoming a virtual address.
Address translation is a combination of logical and virtual. The normal usage is to just call the whole thing "virtual memory."
We can imagine that in the future, as memory grows, paging will go away entirely. Instead of having virtual memory systems we will have logical memory systems.
Not a lot of clarity here thus far, here goes:
Physical Memory is what the CPU addresses on its address bus. It's the lowest level software can get to. Physical memory is organized as a sequence of 8-bit bytes, each with a physical address.
Every application having to manage its memory at a physical level is obviously not feasible. So, since the early days, CPUs introduced abstractions of memory known collectively as "Memory Management." These are all optional, but ubiquitous, CPU features managed by your kernel:
Linear Memory is what user-level programs address in their code. It's seen as a contiguous addresses space, but behind the scenes each linear address maps to a physical address. This allows user-level programs to address memory in a common way and leaves the management of physical memory to the kernel.
However, it's not so simple. User-level programs address linear memory using different memory models. One you may have heard of is the segmented memory model. Under this model, programs address memory using logical addresses. Each logical address refers to a table entry which maps to a linear address space. In this way, the o/s can break up an application into different parts of memory as a security feature (details out of scope for here)
In Intel 64-bit (IA-32e, 64-bit submode), segmented memory is never used, and instead every program can address all 2^64 bytes of linear address space using a flat memory model. As the name implies, all of linear memory is available at a byte-accessible level. This is the most straightforward.
Finally we get to Virtual Memory. This is a feature of the CPU facilitated by the MMU, totally unseen to user-level programs, and managed by the kernel. It allows physical addresses to be mapped to virtual addresses, organized as tables of pages ("page tables"). When virtual memory ("paging") is enabled, tables can be loaded into the CPU, causing memory addresses referenced by a program to be translated to physical addresses transparently. Page tables are swapped in and out on the fly by the kernel when different programs are run. This allows for optimization and security in process/memory management (details out of scope for here)
Keep in mind, Linear and Virtual memory are independent features which can work in conjunction. If paging is disabled, linear addresses map one-to-one with physical addresses. When enabled, linear addresses are mapped to virtual memory.
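To tie these together, a sketch of how classic (non-PAE) 32-bit x86 paging carves up a linear address when paging is enabled: a 10-bit page-directory index, a 10-bit page-table index, and a 12-bit offset into the 4 KiB page.

    /* Decodes a 32-bit linear address into its two-level paging indices.
     * The 10/10/12 split is the real classic x86 layout; the example
     * address itself is arbitrary. */
    #include <stdio.h>

    int main(void) {
        unsigned linear = 0xBFFFF123u;
        unsigned dir    = (linear >> 22) & 0x3FFu; /* page-directory index */
        unsigned table  = (linear >> 12) & 0x3FFu; /* page-table index */
        unsigned offset =  linear        & 0xFFFu; /* byte within 4 KiB page */
        printf("dir=%u table=%u offset=0x%x\n", dir, table, offset);
        return 0;
    }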
Notes:
This is all linux/x86 specific but the same concepts apply almost everywhere.
There are a ton of details I glossed over
If you want to know more, read The Intel® 64 and IA-32 Architectures Software Developer Manual, from where I plagiarized most of this
I'd like to add a simple answer here.
Physical Memory: This is the memory that is actually present, and every process needs space here to execute its code.
Logical Memory:
To a user program, memory appears contiguous. Suppose a program needs 100 MB of space in memory: to this program, a virtual address space / logical address space starts at 0 and continues up to some finite number. This address is generated by the CPU, and the MMU then maps the virtual address to a real physical address through a page table or whatever other way the mapping is implemented.
Please correct me or add some more content here. Thanks !
Physical memory is RAM; it actually belongs to main memory. A logical address is the address generated by the CPU. In paging, a logical address is mapped to a physical address with the help of page tables. A logical address contains a page number and an offset.
An address generated by the CPU is commonly referred to as a logical address, whereas an address seen by the memory unit—that is, the one loaded into the memory-address register of the memory—is commonly referred to as a physical address
The physical address is the actual address of the frame where each page will be placed, whereas the logical address is the address generated by the CPU for each page.
What exactly is a frame?
Processes are retrieved from secondary memory and stored in main memory using the paging technique.
Processes are kept in secondary memory as non-contiguous pages, which means they are stored at scattered locations.
Main memory is divided by the operating system into equal-sized blocks called frames, and each page retrieved from secondary memory is placed into a free frame.
Because the frame size is uniform across main memory, pages from all the processes retrieved from secondary memory can be stored there concurrently.

Virtual Memory Space

What does the virtual memory space size depend on? Does it depend on the RAM or on the architecture or something else.
Basically it depends on the architecture (32-bit, 64-bit, and so on).
This is a very simplistic explanation of things, but the so-called "architecture" limits the size of the virtual address space. For example, a 32-bit architecture can address 2^32 memory locations (4 GiB).
The size of the RAM limits the amount of physical memory that can be used, but not the virtual address space. (Potentially, the hard drive can be used to extend the available physical memory.)
Anyway, I recommend reading the wiki page on virtual memory.
Very simply, virtual memory is just a way of letting your software use more memory addresses than there is actual physical memory. When the data being accessed isn't already in physical memory, it's transparently read in from disk; and when some more physical memory is needed to do that, some of the current contents of physical memory are temporarily written, or "swapped", out to disk (e.g. the least-recently-used memory). In other words, some of the physical memory becomes a kind of cache for a larger virtual memory space that includes the hard disk.