Does the MMU mediate everything between the operating system and physical memory or is it just an address translator? - operating-system

I'm trying to understand how does an operating system work when we want to assign some value to a particular virtual memory address.
My first question concerns whether the MMU handles everything between the CPU and the RAM. Is this true? From what one can read from Wikipedia, I'd say so:
A memory management unit (MMU), sometimes called paged memory
management unit (PMMU), is a computer
hardware component responsible for
handling accesses to memory requested
by the CPU.
If that is the case, how can one tell the MMU I want to get 8 bytes, 64 or 128bytes, for example? What about writing?
If that is not the case, I'm guessing the MMU just translates virtual addresses to physical ones?
What happens when the MMU detects there will be what we call a page-fault? I guess it has to tell it to the CPU so the CPU loads the page itself off disk, or is the MMU able to do this?
Thanks

Devoured Elysium,
I'll attempt to answer your questions one by one but note, it might be a good idea to get your hands on a textbook for an OS course or an introductory computer architecture course.
The MMU consists of some hardware logic and state whose purpose is, indeed, to produce a physical address and provide/receive data to and from the memory controller. Actually, the job of memory translation is one that is taken care of by cooperating hardware and software (OS) mechanisms (at least in modern PCs). Once the physical address is obtained, the CPU has essentially done its job and now sends the address out on a bus which is at some point connected to the actual memory chips. In many systems this bus is called the Front-Side Bus (FSB), which is in turn connected to a memory controller. This controller takes the physical address supplied by the CPU and uses it to interact with the DRAM chips, and ultimately extract the bits in the correct rows and columns of the memory array. The data is then sent back to the CPU, which can now operate on it. Note that I'm not including caching in this description.
So no, the MMU does not interact directly with RAM, which I assume you are using to mean the physical DRAM chips. And you cannot tell the MMU that you want 8 bytes, or 24 bytes, or whatever, you can only supply it with an address. How many bytes that gets you depends on the machine you're on and whether it's byte-addressable or word-addressable.
Your last question urges me to remind you: the MMU is actually a part of the CPU--it sits on the same silicon die (although this was not always the case).
Now, let's take your example with the page fault. Suppose our user-level application wants to, like you said, set someAddress = 10, I'll take it in steps. Let's assume someAddress is 0xDEADBEEF and let's ignore caches for now.
1) The application issues a store instruction to 0xsomeAddress, which, in x86 might look something like
mov %eax, 0xDEADBEEF
where 10 is the value in the eax register.
2) 0xDEADBEEF in this case is a virtual address, which must be translated. Most of the time, the virtual to physical address translation will be available in a hardware structure called the Translation Lookaside Buffer (TLB), which will provide this translation to us very fast. Typically, it can do so in one clock cycle. If the translation is in the TLB, called a TLB hit, execution can continue immediately (i.e. the physical address corresponding to 0xDEADBEEF and the value 10 are sent out to the memory controller to be written).
3) Let's suppose though, that the translation wasn't available in the TLB (called a TLB miss). Then we must find the translation in the page tables, which are structures in memory whose structure is defined by the hardware and managed by the OS. They simply contain entries that map a virtual address to a physical one (more accurately, a virtual page number to a physical page number). But these structures also reside in memory, and so must have addresses! The hardware contains a special register called cr3 which contains the physical address of the current page table. We can index into this page table using our virtual address, so the hardware takes the value in cr3, computes an address by adding an offset, and goes off to memory to fetch the page table entry (PTE). This PTE will (hopefully) contain the physical address corresponding to 0xDEADBEEF, in which case we put this mapping in the TLB (so we don't have to walk the page table again) and continue on our way.
4) But oh no! What if there is no PTE in the page tables for 0xDEADBEEF? This is a page fault, and this is where the Operating System comes into play. The PTE we got out of the page table existed, as in it was (let's assume) a valid memory address to access, but the OS had not created a VA->PA mapping for it yet, so it would have had a bit set to indicate that it is invalid. The hardware is programmed in such a way that when it sees this invalid bit upon an access, it generates an exception, in this case a page fault.
5) The exception causes the hardware to invoke the OS by jumping to a well known location--a piece of code called a handler. There can be many exception handlers, and a page fault handler is one of them. The page fault handler will know the address that caused the fault because it's stored in a register somewhere, and so will create a new mapping for our virtual address 0xDEADBEEF. It will do so by allocating a free page of physical memory and then saying "all virtual addresses between VA x and VA y will map to some address within this newly allocated page of physical memory". 0xDEADBEEF will be somewhere in that range, so the mapping is now securely in the page tables, and we can restart the instruction that caused the page fault (the mov).
6) Now, when we go through the page tables again, we will find a mapping and the PTE we pull out will have a nice physical address, the one we want to store to. We provide this with the value 10 to the memory controller and we're done!
Caches will change this game quite a bit, but I hope this serves to illustrate how paging works. Again, it would benefit you greatly to check out some OS/Computer Architecture books. I hope this was clear.

There are data structures that describe which virtual addresses correspond to which physical addresses. The OS creates and manages these data structures, and the CPU uses them to translate virtual addresses into physical addresses.
For example, the OS might use these data structures to say "virtual addresses in the range from 0x00000000 to 0x00000FFF correspond to physical addresses 0x12340000 to 0x12340FFFF"; and if software tries to read 4 bytes from the virtual address 0x00000468 then the CPU will actually read 4 bytes from the physical address 0x12340468.
Typically everything is effected by the virtual->physical translation (except for when the CPU is accessing the data structures that describe the translation). Also, usually there's some sort of translation cache build into the CPU to help reduce the overhead involved.

Related

What does it mean to implement the Virtual Memory Using Paging?

I know what the term Virtual Memory means and what how does the Paging System work, but I want to understand , how Virtual memory is implemented via Paging ?
So let me put the following example , if program wants to run does it first brought to the Virtual Memory before its pages are brought to Main Memory Frames? is that the relation between the Virtual memory and the Paging System ?
Kind regards
When you run a C program, or a program in some other "down to the metal" language, and it dereference a wild pointer (a pointer to some nonexistent memory location) you get an error message saying Segmentation Fault, Page Fault, or something similar. (But the term Page Fault is sometimes reserved to mean a Segmentation Fault that the operating system can resolve, read on.)
A lot happens before you see that message. Whenever a program tries to use a nonexistent memory address, its host computer raises a fault. It looks something like an interrupt from a device. The OS then looks at an internal data structure describing the virtual memory of the process to determine whether the address in question is part of the virtual memory address range of the program. If it is, the OS either retrieves the page -- the chunk of memory holding the address in question -- from disk, or if it's a response to a request for new data memory, delivers a new page full (usually) of zeros. It then updates the computer's virtual-to-physical address translation registers and restarts the instruction that generated the fault. Zap! the illusion of lots of memory--virtual memory.
Only if the address was not part of the program's declared memory space does the fault make it to an error message visible to the programmer or user.
This is grossly oversimplified: thousands of hardware and software developers have been working on this problem space continuously for over half a century now, and it has many variations and refinements.
The sequence of events that start a new process varies from OS to OS. They all involving loading at least one page and jumping to it: setting the computer's program counter register to point to it.

Page Fault - How does os search for the page in secondary storage?

My question is when a page fault occurs and the required page is not in RAM ,after that how does the os know where to look for the given page in the entire secondary memory to bring it to the RAM? So is the logical address the address of the secondary memory store or is the required secondary storage address stored in the page table itself or some other way?
I feel like i am probably missing something very basic here but this doubt came in my mind and a quick google search is not providing any answers.
My question is when a page fault occurs and the required page is not in RAM ,after that how does the os know where to look for the given page in the entire secondary memory to bring it to the RAM?
If there were 50 different operating systems that supported an average of 10 different architectures each, there would be up to a maximum of 500 different answers; where one of the answers would be "all software uses physical addresses and there is no virtual memory and there is no secondary memory" and another answer would be "a virtual address is a location on the disk and RAM is just used as disk cache to speed it up" (see https://en.wikipedia.org/wiki/Single-level_store ).
For most typical modern operating systems running on most typical architectures; if you worked out all of the information the kernel needs to know about each virtual page (e.g. what the page is pretending to be, what the page actually is, location on disk if any, location in RAM if any, something to keep track of "least recently used", something to keep track of "number of copy-on-write copies", etc); then you could scatter all the information across multiple different data structures such that:
some of the data structures are used/required by the CPU itself and some aren't
the same information may or may not be in 2 or more data structures at the same time
some data structures have an entry for each virtual page and some just have an entry for each range of multiple pages
some data structures are arrays/tables, some are trees, some are trees of tables, and some are something else.
some use "virtual address" or "virtual page number" as a key to find information; and some use something else (e.g. inverted page tables on PowerPC and Itanium use "physical address" as an index because using what you're trying to find as an index is the least intelligent thing you could possibly do, so why not?).
some of the data structures may be in the kernel and some may not be (e.g. the L4 micro-kernel manages virtual memory mapping purely in user-space via. an "abstract hierarchical address space" model).
In general; the information about where a page's data is in (each different piece of?) secondary memory (if there is secondary memory) will be stored in one or more places in one or more things.
Note that when a page fault occurs the page fault handler typically needs to make multiple decisions; possibly starting with figuring out what made the access (a process, the kernel itself?) and whether the access should be allowed or denied, then figuring out what to do about it (send SIGSEGV? do a kernel panic? fetch data into the CPU's TLB? invalidate stale data from CPU's TLB? do copy-on-write cloning? fetch data from swap space? fetch data from file?); so the page fault handler ends up finding multiple different pieces of data from (potentially) multiple different places.
A Concrete Example
For my OS designs (which are based on asynchronous message passing and use micro-kernels); a micro-kernel is small enough that it can be custom designed and optimised for a specific architecture (without any regard to portability). The operating system design is intended for distributed systems, and for that reason shared memory (and fork()) are not supported (you don't want page fault handler to have to fetch data from a remote computer over a congested network connection to do a "copy on write"); and the only case for "copy on write" is memory mapped files where the page is shared by one or more processes and the (local) VFS cache.
For 64-bit 80x86, the CPU requires a tree of 4 levels of tables (page tables, page directories, page directory pointer tables and page map level 4), and to improve efficiency (reduce memory consumption and reduce cache misses, etc) I use these tables as much as possible.
For page table entries (or page directory entries if 2 MiB pages are being used); if the page is not present there are 63 bits that are ignored by the CPU that the OS can use for its own purposes; and if the page is present then (depending on which features CPU supports) there are at least 9 bits that the OS can use for its own purposes and flags that the CPU uses (e.g. the "read, write, no-execute" flags) can be used to augment the OS's own information.
When a page is not present, the 63-bits are split into 2 fields - one 8 bit field to keep track of the virtual type of the page (if it's supposed to act like RAM, if it's supposed to be executable, if it's supposed to use "write-back" caching, etc), and one 55 bit "where" field. If the highest bit in the "where" field is set the page was sent to swap space and the other 54 bits are a "swap space handle" (allowing for a maximum of "2**54 * 4 KiB" of swap space); and if the highest bit in the "where" field is clear then the other 54 bits are a "memory mapped file handle". If a page fault occurs because of a "not present" page, the page fault handler uses the 8-bit field to determine if the access should be allowed or denied (or if it's already being handled due to a different thread accessing it already), then (if the access should be allowed) the page fault handler tells the scheduler to put the thread in a "WAITING FOR PAGE" state and marks the page as "being fetched" (so that other threads that belong to the same process know that it's being fetched already), then uses the "where" field to either send a request message asking for the page's data to the Swap Manager (which is a process in user-space), or to find a "memory mapped file descriptor" structure in kernel space that contains more information (that didn't fit in the page table entry) to determine the offset of the page within the file and a file handle, and send a request to the VFS for the page's data (the VFS or Virtual File System is another process in user-space). Later; when Swap Manager or VFS send a reply message containing the page's data back to the kernel, the kernel fixes up the page table entry (putting the page of data from the message into the virtual address space) and tells scheduler to unblock the thread/s (shift them from the "WAITING FOR PAGE" state to the "READY TO RUN" state).
For both of these cases (memory mapped file and swap space) if the access was an "allowed read" then the page is mapped as read only (regardless of whether the page is supposed to be writeable). If the access was an "allowed write", or if a later "allowed write" is done to a page that was previously fetched and mapped as read only; then if the page's data came from swap space the page fault handler informs the Swap Manager that the copy of the page in swap space can be discarded (can't be re-used if the same page is sent to swap space later), and if the page's data came from a memory mapped file the page fault handler informs the VFS that there's one less process with a copy of that page and copies the "copy on write" page's data to a newly allocated page.
When a page is "present", it may still be part of a memory mapped file and there may still be a copy in swap space; but there isn't enough space in the page table entry to store the "where" field. In this case, if the page is in swap space and in RAM, the Swap Manager has to accept "Process ID + virtual address" instead of a "swap space handle" (which causes a little extra overhead in Swap Manager because it has to convert "Process ID + virtual address" into "swap space handle" itself). If the page is a "copy on write" memory mapped file, then the page fault handler searches the process' list of "memory mapped file descriptors" (which causes a little extra overhead).
Note that (in theory) when an OS is running low on free RAM it wants to select a "least likely to be needed soon" page to send to swap space, but this isn't easy/practical so most operating systems use "least recently used" instead.
My kernels don't do this at all. Instead they just send "random" pages to the Swap Manager, and (initially) the Swap Manager keeps the data in RAM and doesn't send it to any of the swap providers to store; and the Swap Manager uses "least recently sent to swap manager" to figure out which pages to send to a swap provider to store. A page that is used often may be sent to swap manager many times without ever actually being sent to a swap provider (and without causing slow disk IO for frequently used pages). Also note that, because "copy on write memory mapped file" is the only case that "copy on write" is used and because there is no other form of shared memory, the VFS can keep track of how many processes are sharing a copy of pages itself and the kernel never need to keep track of how many processes are sharing a copy of any page (like most kernels for most operating systems do).

Is my understanding of the relationship between virtual addresses and physical addresses correct?

I've been researching (on SO and elsewhere) the relationship between virtual addresses and physical addresses. I would appreciate it if someone could confirm if my understanding of this concept is correct.
The page table is classified as 'virtual space' and contains the virtual addresses of each page. It then maps to the 'physical space', which contains the physical addresses of each page.
A wikipedia diagram to make my explanation clearer:
https://upload.wikimedia.org/wikipedia/commons/3/32/Virtual_address_space_and_physical_address_space_relationship.svg
Is my understanding of this concept correct?
Thank you.
Not entirely correct.
Each program has its own virtual address space. Technically, there is only one address space, the physical random-access memory. Therefore it's called "virtual" because to the user program it seems as if it has its own address space.
Now, take the instruction mov 0x1234, %eax (AT&T) or MOV EAX, [0x1234] (Intel) as an example:
The CPU sends the virtual address 0x1234 to one of its parts, the MMU.
The MMU obtains the corresponding physical address from the page table. This process of adjusting the address is also lovingly called "massaging."
The CPU retrieves the data from the RAM location the physical address refers to.
The concrete translation process depends heavily on the actual architecture and CPU.
The page table is classified as 'virtual space' and contains the virtual addresses of each page. It then maps to the 'physical space', which contains the physical addresses of each page.
This is not really correct. The page table defines a logical address space consisting of pages. The page table maps the logical pages to physical page frames they indicate that the page frame does not [yet] exist in memory in which case you have a virtual mapping. A page is VIRTUAL when memory is simulated using disk space.
In the olde days, page tables always established a virtual address space. Now it is becoming increasingly common (e.g. embedded system) to use logical address translation without virtual memory (paging). Thus, the terms "virtual memory" and "logical memory" are frequently conflated.
The physical address space exists only to the operating system. The process sees only a logical address space.
That is a bit of an oversimplification because the process becomes the operating system after an exception or interrupt and the kernel operates within a common logical address range. However, the operating system kernel does have to manage physical memory to some degree.
For example, some aspect of the page tables must use physical addresses. If the page tables used all logical addresses, then you'd have a chicken and egg problem for address translation. Various hardware systems address this problem in different ways.
Finally, the diagram you link to is a very poor illustration.

What exactly happens inside the OS to cause the segmentation fault

I have read a lot about the virtual address and paging.
Let me first tell you people what i understood. When a process wants to execute something it tries to load the data from the hard disk to memory. To do this it uses a virtual address. So our MMU validates the virtual address looks into the TLB to find the corresponding physical page, if it doesn't find there it looks into Inverted Page Table and at the end it looks into Page table if it doesn't find an entry over there it generates a page fault and all the swapping of page is done and all the tables will be updated.
And as I read all the processes have different page tables and same virtual address. so if I try access an array element a[1000] which was defined as int a[100] I am sure that am gonna get a segmentation fault cause that instruction might be trying to access a memory that doesn't belong to it. but how OS comes to know that a[1000] doesn't belong to the running process by just using the concept of virtual address and physical pages. Am I missing something here or my entire understanding is wrong?
I know we can say a memory access is illegal if a process is trying to access a read only or sup true memory segment.
at the end the boiling question is how OS decides which memory is allocated to which process and how it decides that this access of memory is illegal.
What is a segmentation fault on Linux?
this link didn't help much .
thanks a lot in advance for all you lovely people's inputs :)
Actually each and every process in the Linux kernel has some well defined structure associated with it called task_struct which stores all the information about the corresponding process like its parent, its PID, its child, its address space, pending signals, threads associated with it etc. Now the address space entries tell the kernel while executing the process the legal address space for that process.Every process has its own address space allotted to it by the kernel right from the very beginning when it is created. So when a executing process tries to access the space outside its legal memory a fault is generated(called segmentation fault in Unix/Linux) and the process is terminated by giving a signal to it by the kernel. Its important for memory protection to be achieved in OS.
On x86, linux uses a combination of segmentation and paging, so the address generated by program first looks up for the corresponding segments base and limit registers values. This gives the virtual address which is then translated using the page table. When you try to access a memory which has not been allocated, the accessed page is beyond what the limit register allows, hence generating a segmentation fault.

Providing physical address directly to the TLB in x86-64

Is it possible to provide physical address for a given virtual address in a direct way to the TLB on x86-64 architectures in long mode?
For example, lets say, I put zeros in PML4E, so a page fault exception will be triggered because an invalid address will be found, during the exception can the CPU tell the TLB by using some instruction that this virtual address is located at X physical page frame?
I want to do this because by code I can easily tell where the physical address would be, and this way avoid expensive page walk.
No, you need to put a page to the TLB. To be precise, you need to create/update appropriate PTE (with PDE and PDPE if needed). Everything around MMU management is somehow based on page tables and TLB. Even user/supervisor protection mode is done as a special flag of mapped page.
Why do you think that "page walk" is expensive operation? It is not expensive at all. To determine the PTE that must be updated you need to dereference only 4 pointers: PML4E -> PDPE -> PDE -> PTE. These entries are just indices in related tables. To get PML4E you need to use 39-47 bits of address taken during page fault handling and use the value as an index in PML4 table. To get PDPE you need 30-39 bits of an address as an index in PDE table and so on. It's not the thing that can slow down your system. I think allocation of a physical page takes more time than that.