I've read three different OS reference books (Stallings, Tanenbaum, and Silberschatz), but none of them, as far as I've understood, clearly states what an outer page table entry contains in a two-level paging scheme. In general, the outer page table lets us "find" the inner page table. But does it contain a raw physical address or something else?
I'm currently taking a university OS course, and for an exercise I have to calculate the size (in bytes) of an outer page table with 16 entries. The only way I can see for an outer page table to lead us to an inner page table is to give us a physical address (here a physical address is 12 bits long), so I deduced that the outer page table is 16*(12+1) bits long (the +1 is for the validity bit). However, the correction states that the size is actually 16*(4+1) bits, because the outer page table returns the number of the inner page table. (I frankly don't know how the number of the page table would allow us to retrieve it, and I couldn't get a meaningful explanation.)
I got an answer from a university professor: there are several ways to do it. In my exercise, the outer page table was assumed to be an "array of arrays"; this is simple to implement but unrealistic, as an array of arrays can be too big to store efficiently. A more realistic approach is to store physical address pointers to the inner page tables. As the OS can't predict beforehand where these addresses will be stored, it has to use system-reserved memory that is fixed in advance (the "low memory"; I don't know the correct English terminology for this), and thus it is able to use raw physical addresses.
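The two sizes discussed above differ only in what each outer entry stores. Here is a minimal sketch of the arithmetic, assuming (as in the exercise) 16 entries, one validity bit per entry, and either a 4-bit inner-table number or a 12-bit physical address as the payload. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bits of an outer page table whose entries each hold a payload
 * (an inner-table number or a physical address) plus one validity bit. */
size_t outer_table_bits(size_t entries, size_t payload_bits)
{
    assert(entries > 0 && payload_bits > 0);
    return entries * (payload_bits + 1);
}
```

Under these assumptions, outer_table_bits(16, 4) gives 80 bits (10 bytes) and outer_table_bits(16, 12) gives 208 bits (26 bytes). A table number alone could be enough if the inner tables sit in fixed-size slots of one contiguous array, which seems to be what the "array of arrays" layout implies.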
The top-level page table will not use all of its entries. What happens when I access an entry whose present/absent bit is 0?
The book says the handling is to force a page fault, but this page fault is different from what I expected: it may terminate the process outright. Why not remap the page instead?
In addition, if accesses to the top-level page table are random (I think the virtual address space can be large), as in the figure, where the top-level page table obviously has many empty entries, isn't it easy to hit an unmapped one?
I think the key point of the problem is this: are the virtual addresses fixed once the program is compiled? Can the operating system try to ensure that the program does not go out of bounds and access unmapped addresses?
I think I have found the answer. After compilation, the program is assigned logical addresses, and the entries of the top-level page table correspond to the program's logical memory layout. For example, the program text is at the bottom and corresponds to entry 0, and the last entry of the page table corresponds to the program's stack area. Once the program accesses a page table entry that has no mapping, it means it has accessed an address region that it should not access. Is my idea correct?
I'm sorry if my English makes this difficult to understand. Thank you for your answer!
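The idea can be sketched as a lookup that refuses to continue when the top-level entry is marked absent. This is only an illustration, assuming a 32-bit address split into 10 + 10 + 12 bits; the struct and function names are hypothetical and not taken from any book or kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOP_ENTRIES 1024

struct top_entry {
    bool present;            /* the present/absent bit */
    const uint32_t *table;   /* second-level table, valid only if present */
};

/* Returns true and stores the second-level entry in *out when the
 * top-level entry is present; returns false to signal a fault. */
bool lookup(const struct top_entry *top, uint32_t vaddr, uint32_t *out)
{
    assert(top != NULL && out != NULL);
    uint32_t i1 = vaddr >> 22;            /* top 10 bits */
    uint32_t i2 = (vaddr >> 12) & 0x3FFu; /* next 10 bits */
    if (!top[i1].present)
        return false;  /* unmapped region: the OS decides how to react */
    *out = top[i1].table[i2];
    return true;
}
```

What happens on a false return is a policy choice of the operating system: it may map a page if the address belongs to a region the process is allowed to use, or terminate the process otherwise.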
This question was asked in our exam, and I couldn't answer it. I still wonder what the answer is.
Suppose there is 2^N bit virtual addressing, 2^M bit physical addressing, and a 2^L KB page size. In single-level paging, what is the page size?
This is unanswerable unless you make random/unfounded assumptions.
For a silly example, you could assume that there are only 2 physical pages and only one virtual page (M == N+1), and that the page table is the same size as a page (and only has one page table entry), and therefore a page table entry consumes 2^L bits, where one of these bits is used to select the physical page and all of the other bits are used for other purposes (access permissions, tracking accessed/dirty, spare bits for the OS to use however it likes, ...).
What is the relationship between an address space and a page table? I know that each process should have a page table that maps virtual addresses to physical addresses. But what does an address space do? In OS161, an address space looks like:
    struct addrspace {
        vaddr_t as_vbase1;      /* virtual base of segment 1 */
        paddr_t as_pbase1;      /* physical base of segment 1 */
        size_t as_npages1;      /* number of pages in segment 1 */
        vaddr_t as_vbase2;      /* virtual base of segment 2 */
        paddr_t as_pbase2;      /* physical base of segment 2 */
        size_t as_npages2;      /* number of pages in segment 2 */
        paddr_t as_stackpbase;  /* physical base of the stack */
    };
We translate a virtual address (vaddr) to a physical address using (assuming vaddr is in segment 1):
    paddr = vaddr - as_vbase1 + as_pbase1
It seems that we can get the physical address from the virtual address using the addrspace. If we can use the addrspace to do the virtual-to-physical mapping, why do we need a page table?
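The base-and-offset translation above can be written out for both segments as follows. This is a sketch rather than OS/161's actual code: the 32-bit typedefs, the 4096-byte PAGE_SIZE, and the helper name as_translate are assumptions, and the stack is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t vaddr_t;   /* assumed 32-bit addresses */
typedef uint32_t paddr_t;
#define PAGE_SIZE 4096u     /* assumed page size */

struct addrspace {
    vaddr_t as_vbase1; paddr_t as_pbase1; size_t as_npages1;
    vaddr_t as_vbase2; paddr_t as_pbase2; size_t as_npages2;
    paddr_t as_stackpbase;
};

/* Hypothetical helper: translate vaddr if it falls in segment 1 or 2. */
bool as_translate(const struct addrspace *as, vaddr_t vaddr, paddr_t *paddr)
{
    assert(as != NULL && paddr != NULL);
    vaddr_t vtop1 = as->as_vbase1 + (vaddr_t)(as->as_npages1 * PAGE_SIZE);
    vaddr_t vtop2 = as->as_vbase2 + (vaddr_t)(as->as_npages2 * PAGE_SIZE);

    if (vaddr >= as->as_vbase1 && vaddr < vtop1) {
        *paddr = vaddr - as->as_vbase1 + as->as_pbase1;
        return true;
    }
    if (vaddr >= as->as_vbase2 && vaddr < vtop2) {
        *paddr = vaddr - as->as_vbase2 + as->as_pbase2;
        return true;
    }
    return false;  /* not in either segment */
}
```

Note that a single subtraction and addition per segment only works if every segment is one contiguous block of physical memory. A page table lifts that restriction by recording a separate physical frame for each virtual page.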
Looking forward to your help! Thanks!
Firstly, thanks a lot for this question. Even though I am still a newbie to OS161 and struggling to understand the code, I will tell you what I have understood till now. Please feel free to correct me.
We need a page table to keep track of all the pages assigned to our process, not just because we need a translation from virtual to physical addresses.
The page table also keeps track of whether each page is in memory; if the required page is on disk, accessing it triggers a page fault. In that case we should allocate a new page, load the page from disk, update the page table, and update the TLB entries.
A page goes through different states such as free, dirty (must be written to disk), etc. Certain pages are never supposed to be swapped out and always remain in memory. The page table keeps track of these states too.
This article might also help you.
Why one page table per process
I'm familiar with the MIPS architecture, which has a software-managed TLB. So how and where you (the operating system) want to store the page tables and the page table entries is completely up to you. For example, I did a project with a single inverted page table; I saw others using 2-level page tables per process.
But what's the story with x86? From what I know, the TLB is hardware-managed. Does x86 basically tell you, "Hey, this is where the page table entries you're currently using need to go [physical address range]"? But wait, I've always thought x86 uses multi-level page tables, so would it tell you where to put the 1st level or something...? I'm confused.
Thanks for any help.
Once paging is enabled in protected mode, the CR3 register points to a "page directory" (you can put it anywhere you want in physical memory before you enable paging, as long as it is page-aligned), which is a page of memory (remember, a "small" page is 4 KiB, and a "large" page is 4 MiB) with 1024 page directory entries (PDEs) that point to "page tables". Each entry holds the top 20 bits of a pointer (the physical address of the page table, which is 4 KiB-aligned), plus a bunch of flags in the bottom 12 bits (present, permission, dirty, etc.).
(The 1024 just comes from the fact that a page is 4096 bytes and a pointer is 4 bytes.)
Each "page table" is itself a page of 1024 "page table entries" (PTEs) that point to physical pages in memory, along with a bunch of (almost the same) flags.
So, to translate a 32-bit virtual address, you take the top 10 bits of the address as an index into the table at CR3 (since there are 2^10 entries), and -- if that PDE is further subdivided (meaning it isn't a "large" page, which you can figure out from the flags) -- you take the top 20 bits of the PDE, look up the page table at that address, and index into it with the virtual address's next-topmost 10 bits. Then the topmost 20 bits of the PTE give you the physical page, assuming the flags in its bottom 12 bits tell you the page is actually present; the bottom 12 bits of the virtual address are the offset within that page.
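The walk described above can be simulated in software. The sketch below assumes non-PAE 32-bit paging and models physical memory as an array of 32-bit words (mem, indexed by physical address divided by 4), which is purely an illustration device; the function name walk is hypothetical. Bit 0 is the present flag, and bit 7 of a PDE marks a 4 MiB page (which on real hardware also requires the PSE feature to be enabled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRY_PRESENT 0x001u  /* bit 0: entry is present */
#define PDE_LARGE     0x080u  /* bit 7: PDE maps a 4 MiB page */

/* Translate vaddr using the page directory at physical address cr3.
 * mem[i] holds the 32-bit word at physical address 4*i (simulation only). */
bool walk(const uint32_t *mem, uint32_t cr3, uint32_t vaddr, uint32_t *paddr)
{
    assert(mem != NULL && paddr != NULL);

    /* Top 10 bits of the virtual address index the page directory. */
    uint32_t pde = mem[((cr3 & 0xFFFFF000u) >> 2) + (vaddr >> 22)];
    if (!(pde & ENTRY_PRESENT))
        return false;                      /* would raise a page fault */

    if (pde & PDE_LARGE) {                 /* 4 MiB page: no page table */
        *paddr = (pde & 0xFFC00000u) | (vaddr & 0x003FFFFFu);
        return true;
    }

    /* Next 10 bits index the page table named by the PDE's top 20 bits. */
    uint32_t pte = mem[((pde & 0xFFFFF000u) >> 2) + ((vaddr >> 12) & 0x3FFu)];
    if (!(pte & ENTRY_PRESENT))
        return false;                      /* would raise a page fault */

    /* Top 20 bits of the PTE plus the 12-bit page offset. */
    *paddr = (pte & 0xFFFFF000u) | (vaddr & 0xFFFu);
    return true;
}
```

On real hardware the MMU performs these steps itself; the operating system only fills in the tables and loads CR3.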
If you're using Physical Address Extension (PAE), then you get another level in the hierarchy at the very top.
Note: for your own sanity (and maybe the CPU's), you'd probably want to map the page directory and the page table to themselves, otherwise things get confusing fast. :)
The TLB is hardware-managed -- so the caching of the page tables is transparent -- but there is an instruction, INVLPG, that invalidates the TLB entry for a given page. (I don't know exactly when you should use it and when you shouldn't.)
Source: http://wiki.osdev.org/Paging
Is it possible to provide physical address for a given virtual address in a direct way to the TLB on x86-64 architectures in long mode?
For example, let's say I put zeros in the PML4E, so a page fault exception is triggered because an invalid entry is found. During the exception, can the CPU tell the TLB, using some instruction, that this virtual address is located at physical page frame X?
I want to do this because in code I can easily tell what the physical address would be, and this way I could avoid an expensive page walk.
No. To get a translation into the TLB, you need to create or update the appropriate PTE (along with the PDE and PDPE if needed). Everything in MMU management is based on page tables and the TLB. Even user/supervisor protection is implemented as a flag on the mapped page.
Why do you think a "page walk" is an expensive operation? It is not expensive at all. To determine the PTE that must be updated, you need to dereference only 4 pointers: PML4E -> PDPE -> PDE -> PTE. Each entry is found by indexing the corresponding table. To get the PML4E, you use bits 39-47 of the faulting address as an index into the PML4 table. To get the PDPE, you use bits 30-38 of the address as an index into the PDP table, and so on. This is not something that can slow down your system. I think allocating a physical page takes more time than that.
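The index extraction described above can be sketched as follows, assuming 4-level long-mode paging with 4 KiB pages (9 index bits per level and a 12-bit offset). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct walk_indices {
    unsigned pml4;    /* bits 39-47: index into the PML4 table */
    unsigned pdpt;    /* bits 30-38: index into the PDP table */
    unsigned pd;      /* bits 21-29: index into the page directory */
    unsigned pt;      /* bits 12-20: index into the page table */
    unsigned offset;  /* bits 0-11: offset within the 4 KiB page */
};

/* Split a virtual address into the four table indices and the page offset. */
void split_vaddr(uint64_t vaddr, struct walk_indices *out)
{
    assert(out != NULL);
    out->pml4   = (unsigned)((vaddr >> 39) & 0x1FFu);
    out->pdpt   = (unsigned)((vaddr >> 30) & 0x1FFu);
    out->pd     = (unsigned)((vaddr >> 21) & 0x1FFu);
    out->pt     = (unsigned)((vaddr >> 12) & 0x1FFu);
    out->offset = (unsigned)(vaddr & 0xFFFu);
}
```

A page fault handler could use these indices to locate, level by level, the entry it needs to create or update for the faulting address.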