Suppose I have a 2-entry TLB and am using LRU clock replacement. Further suppose that I have a TLB miss and it's a page fault, so I load a page into memory and update the TLB; now my TLB has 1 entry. Next, I have another TLB miss and it's a page fault, so I load a page into memory and update the TLB; now my TLB is full.
Now, let us say that there is another TLB miss and it's a page fault, so I need to evict an entry from my TLB using the clock algorithm.
My question is: is the reference bit for both entries in the TLB 1? (i.e., when we add a new TLB entry, do we set the reference bit to 1 initially, or is it initialized to 0?) If not, is the first TLB entry's reference bit 0 and the second one's 1? (Would this be because during the second page fault we set the reference bit for the first entry to 0?)
Lastly, which entry would be evicted (the first or the second)?
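Since the question hinges on exactly how the reference bit is initialized, here is a minimal sketch of clock (second-chance) replacement under one common convention: the reference bit is set to 1 when an entry is inserted and on every hit, and the clock hand only clears bits while hunting for a victim. The names (clk_entry, tlb_insert, pick_victim) and the structure are made up for illustration; textbooks and hardware differ on the convention, which is really what the question is about.

    #include <stdio.h>

    /* Minimal sketch of clock (second-chance) replacement for a 2-entry TLB.
     * Assumption (not stated in the question): the reference bit is set to 1
     * both when an entry is inserted and when it is hit; the clock hand only
     * clears bits while searching for a victim. */
    #define TLB_SIZE 2

    struct clk_entry { int vpn; int valid; int ref; };

    static struct clk_entry tlb[TLB_SIZE];
    static int hand;                          /* clock hand */

    static int pick_victim(void)
    {
        for (;;) {
            if (tlb[hand].ref == 0) {         /* second chance used up: evict   */
                int victim = hand;
                hand = (hand + 1) % TLB_SIZE; /* next scan starts after victim  */
                return victim;
            }
            tlb[hand].ref = 0;                /* clear ref bit: second chance   */
            hand = (hand + 1) % TLB_SIZE;
        }
    }

    static void tlb_insert(int vpn)
    {
        int slot = -1;
        for (int i = 0; i < TLB_SIZE; i++)    /* prefer a free slot             */
            if (!tlb[i].valid) { slot = i; break; }
        if (slot < 0)
            slot = pick_victim();             /* TLB full: run the clock        */

        tlb[slot].vpn = vpn;
        tlb[slot].valid = 1;
        tlb[slot].ref = 1;                    /* assumed: new entries start at 1 */
        printf("vpn %d -> slot %d\n", vpn, slot);
    }

    int main(void)
    {
        tlb_insert(10);  /* first miss/page fault                               */
        tlb_insert(11);  /* second miss/page fault: TLB now full                */
        tlb_insert(12);  /* third miss: both ref bits are 1, the hand clears    */
                         /* them in turn, wraps, and evicts slot 0 (first entry) */
        return 0;
    }

Under that convention, both entries have a reference bit of 1 at the third miss; the hand clears both bits, wraps around, and evicts the first entry. If instead new entries started with the bit cleared, the outcome would differ, so the answer depends on which convention your course or hardware uses.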
I am a novice in operating systems and curious about page faults.
I am reading Operating System Concepts (10th edition) and it says that handling a page fault follows this sequence:
1. Trap to the operating system
2. Save the user registers and process state
3. Determine that the interrupt was a page fault
4. Check that the page reference was legal and determine the location of the page on the disk
5. Issue a read from the disk to a free frame:
a. Wait in a queue for this device until the read request is serviced
b. Wait for the device seek and/or latency time
c. Begin the transfer of the page to a free frame
6. While waiting, allocate the CPU to some other user
7. Receive an interrupt from the disk I/O subsystem (I/O completed)
8. Save the registers and process state for the other user
9. Determine that the interrupt was from the disk
10. Correct the page table and other tables to show that the page is now in memory
11. Wait for the CPU to be allocated to this process again
12. Restore the user registers, process state, and new page table, and then resume the interrupted instruction
I cannot think of any tables in step 10 other than the page table.
Can you give me some examples?
Modern CPUs expect several levels of page tables. On x86-64, the MMU (memory management unit) in the CPU takes the address stored in the CR3 register (loaded there by the OS at boot and on each context switch). This is the address of the first table (the PML4). The PML4 table contains the addresses of the second-level tables, and so on. The virtual address is split into 5 parts: the first 4 parts (9 bits each) are indexes into the respective tables, and the last part (12 bits) is the offset within the page in main memory. Today, only 48 bits of the 64-bit virtual address are typically used.
When you use virtual addressing, the CPU translates the virtual address with the MMU before putting it on the address bus. It basically walks the tables in memory, consuming one part of the virtual address at a time, until it reaches the final entry and obtains the physical frame.
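To make the split concrete, here is a small sketch (the example address and variable names are made up for illustration, not part of the answer above) that breaks a 48-bit x86-64 virtual address into the four 9-bit table indexes and the 12-bit page offset used by 4-level paging with 4 KB pages:

    #include <stdio.h>
    #include <stdint.h>

    /* Sketch: split a 48-bit x86-64 virtual address (4 KB pages, 4-level
     * paging) into the four 9-bit table indexes and the 12-bit page offset.
     * The names are illustrative, not an actual OS interface. */
    int main(void)
    {
        uint64_t vaddr = 0x00007f3a12345678ULL;    /* arbitrary example address */

        unsigned pml4_idx = (vaddr >> 39) & 0x1FF; /* bits 47..39 */
        unsigned pdpt_idx = (vaddr >> 30) & 0x1FF; /* bits 38..30 */
        unsigned pd_idx   = (vaddr >> 21) & 0x1FF; /* bits 29..21 */
        unsigned pt_idx   = (vaddr >> 12) & 0x1FF; /* bits 20..12 */
        unsigned offset   =  vaddr        & 0xFFF; /* bits 11..0  */

        printf("PML4 %u, PDPT %u, PD %u, PT %u, offset 0x%x\n",
               pml4_idx, pdpt_idx, pd_idx, pt_idx, offset);
        return 0;
    }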
When a page fault occurs, the CPU raises an exception. The page fault's vector number is linked to a specific handler in the IDT by the OS at boot. The address of the IDT in RAM is stored in the IDTR register (loaded with the LIDT instruction). The handler performs the steps mentioned in your example: it corrects the page tables and returns.
Regarding TLB flushing on a process context switch: why does each process start from scratch in the TLB when it is given the CPU?
Why don't we pre-fill the first few page table entries into the TLB? It could work in the same fashion as locality of reference in memory management, i.e., when a process starts executing, it is very likely to begin with the first instruction of the first few pages that are loaded in main memory.
That could reduce the cost of filling up the TLB during execution and speed up the system.
When the CPU generates a virtual address, the corresponding page is first looked up in the TLB; if it is not present in the TLB, it is looked up in the next level of the memory hierarchy (the page table), and the translation is then placed in the TLB according to a suitable replacement algorithm.
The system can't predict in which frame the page containing the so-called 'instruction 1' is placed. If it could, there would be no need for page replacement algorithms at all; it could simply load the required pages sequentially: the page with the first instruction, the page with the second instruction, and so on.
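Here is a compact, self-contained sketch of that lookup order; the flat page-table array, the names, and the round-robin fill are simplifications chosen for brevity, not an actual MMU interface:

    #include <stdio.h>
    #include <stdint.h>

    /* Sketch of the lookup order described above: check the TLB first, fall
     * back to the page table on a miss, then install the translation in the
     * TLB via a replacement policy. */
    #define TLB_ENTRIES 4
    #define NPAGES      16

    struct mapping { uint32_t vpn, pfn; int valid; };

    static struct mapping tlb[TLB_ENTRIES];
    static uint32_t pgtable[NPAGES];      /* vpn -> pfn, assumed already set up */
    static int next_fill;                 /* simple round-robin replacement     */

    static uint32_t translate(uint32_t vaddr)
    {
        uint32_t vpn = vaddr >> 12, off = vaddr & 0xFFF;

        for (int i = 0; i < TLB_ENTRIES; i++)        /* 1. TLB lookup           */
            if (tlb[i].valid && tlb[i].vpn == vpn) {
                printf("TLB hit  for vpn %u\n", vpn);
                return (tlb[i].pfn << 12) | off;
            }

        printf("TLB miss for vpn %u\n", vpn);
        uint32_t pfn = pgtable[vpn];                 /* 2. walk the page table  */

        tlb[next_fill].vpn = vpn;                    /* 3. install in the TLB,  */
        tlb[next_fill].pfn = pfn;                    /*    evicting whichever   */
        tlb[next_fill].valid = 1;                    /*    slot the policy picks */
        next_fill = (next_fill + 1) % TLB_ENTRIES;
        return (pfn << 12) | off;
    }

    int main(void)
    {
        for (uint32_t v = 0; v < NPAGES; v++)        /* fake page table         */
            pgtable[v] = v + 100;

        translate(0x2002);                           /* miss, then cached       */
        translate(0x2002);                           /* hit                     */
        return 0;
    }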
Suppose that you have a 64-bit system and that your OS is scheduling two processes on it. Assume that the core has access to a fully associative 4-entry TLB for a 4 KB page size. Furthermore, assume that the core has a 64-byte direct-mapped cache with 16-byte cache lines. Now suppose that your processes, A and B, have the following page tables:
Process A Page Table
Now suppose that your OS schedules process A, and in it a memory reference to the following virtual address is made:
0x2002
For the memory reference presented above, detail all TLB accesses (whether they are hits or misses) and all cache accesses (whether they are hits or misses). Assume hardware page table walks and a physically addressed cache.
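The full answer cannot be reconstructed here because the page table contents did not survive in the post, but the field breakdown under the stated parameters is mechanical: 4 KB pages give a 12-bit page offset, and a 64-byte direct-mapped cache with 16-byte lines has 4 sets, so an address splits into a 4-bit block offset, a 2-bit set index, and a tag. A small sketch of that arithmetic for 0x2002 (the variable names are mine):

    #include <stdio.h>
    #include <stdint.h>

    /* Field breakdown for the stated parameters: 4 KB pages (12-bit offset)
     * and a 64-byte direct-mapped cache with 16-byte lines (4 sets). The
     * cache is physically addressed, but the block-offset and set-index bits
     * lie inside the page offset, so they can be read straight off the
     * virtual address. The tag needs the physical frame number, which the
     * missing page table would provide. */
    int main(void)
    {
        uint32_t vaddr = 0x2002;

        uint32_t vpn       = vaddr >> 12;        /* virtual page number: 0x2    */
        uint32_t page_off  = vaddr & 0xFFF;      /* page offset: 0x002          */
        uint32_t block_off = vaddr & 0xF;        /* byte within 16-byte line: 2 */
        uint32_t set_index = (vaddr >> 4) & 0x3; /* which of the 4 sets: 0      */

        printf("VPN=0x%x page_off=0x%x block_off=%u set=%u\n",
               vpn, page_off, block_off, set_index);
        return 0;
    }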
This may depend on the OS, but in general, as I understand it, when a page fault occurs (the desired page is not in main memory) the OS will issue a read to bring the page in from disk. I am wondering: does the OS dispatch another process during the disk I/O? If it does, will there be a complete flush of the TLB on the context switch?
More or less, but a page fault doesn't always mean the page is on the disk (it could also not exist at all, be a lazy-allocation page, be a copy-on-write page that was written to, exist but be marked unreadable/unwritable, etc.). But if the page really is on disk, the OS is probably going to schedule another thread at least, because disk I/O takes approximately forever.
The amount of flushing necessary depends on what it switches to; switching between threads from the same address space doesn't imply a TLB flush. If a TLB flush is necessary, it's probably not a complete flush, because of global pages (so typically you're not flushing out TLB entries for kernel pages). There is also PCID to avoid complete flushes (flushing can be limited to specified process-context IDs), but that's quite recent, and tricky to use since there are only 4096 different IDs.
Process-specific pages are marked as non-global entries by the nG (non-global) bit in the TLB entry, and such entries also store an identifier for the owning task (the Address Space ID, or ASID, in ARM's terminology).
The article linked below lays out this concept clearly:
"For non-global entries, when the TLB is updated and the entry is marked as non-global, a value is stored in the TLB entry in addition to the normal translation information. This value is called the Address Space ID (ASID), which is a number assigned by the OS to each individual task. Subsequent TLB look-ups only match on that entry if the current ASID matches with the ASID that is stored in the entry. This permits multiple valid TLB entries to be present for a particular page marked as non-global, but with different ASID values. In other words, we do not necessarily need to flush the TLBs when we context switch."
Source: https://developer.arm.com/documentation/den0024/a/The-Memory-Management-Unit/Context-switching
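To illustrate the matching rule described in that quote, here is a small sketch; the struct layout and helper names are illustrative and do not reflect the actual ARM TLB entry format:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Sketch of the quoted matching rule: a non-global (nG = 1) TLB entry
     * only hits if its stored ASID equals the current task's ASID; a global
     * entry hits regardless. */
    struct tlb_line {
        uint32_t vpn, pfn;
        uint16_t asid;    /* only meaningful when nG is set   */
        bool     nG;      /* non-global: private to one task  */
        bool     valid;
    };

    static bool tlb_match(const struct tlb_line *e, uint32_t vpn, uint16_t cur_asid)
    {
        if (!e->valid || e->vpn != vpn)
            return false;
        if (!e->nG)                       /* global mapping, e.g. kernel pages  */
            return true;
        return e->asid == cur_asid;       /* per-task mapping: ASIDs must match */
    }

    int main(void)
    {
        /* Two valid entries for the same virtual page, owned by different
         * tasks: both can live in the TLB at once. */
        struct tlb_line a = { 0x10, 0x111, 1, true, true };
        struct tlb_line b = { 0x10, 0x222, 2, true, true };

        printf("task with ASID 1: a=%d b=%d\n", tlb_match(&a, 0x10, 1), tlb_match(&b, 0x10, 1));
        printf("task with ASID 2: a=%d b=%d\n", tlb_match(&a, 0x10, 2), tlb_match(&b, 0x10, 2));
        return 0;
    }

The point of the example is that two valid entries for the same virtual page can coexist as long as their ASIDs differ, which is why a full TLB flush is not required on every context switch.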
Can anyone please clarify the difference between virtual memory and swap space?
And why do we say that for a 32-bit machine the maximum accessible virtual memory is only 4 GB?
There's an excellent explanation of virtual memory over on Super User.
Simply put, virtual memory is a combination of RAM and disk space that running processes can use.
Swap space is the portion of virtual memory that is on the hard disk, used when RAM is full.
As for why a 32-bit CPU is limited to 4 GB of virtual memory, it's addressed well here:
By definition, a 32-bit processor uses 32 bits to refer to the location of each byte of memory. 2^32 = 4.2 billion, which means a memory address that's 32 bits long can only refer to 4.2 billion unique locations (i.e. 4 GB).
There is some confusion regarding the term Virtual Memory; it actually refers to the following two very different concepts:
Using disk pages to extend the conceptual amount of physical memory a computer has - the correct term for this is actually Paging
An abstraction used by various OS/CPUs to create the illusion of each process running in a separate contiguous address space.
Swap space, OTOH, is the name of the portion of disk used to store additional RAM pages when not in use.
An important realization to make is that the former is transparently possible due to the hardware and OS support of the latter.
In order to make better sense of all this, you should consider how "Virtual Memory" (as in definition 2) is supported by the CPU and OS.
Suppose you have a 32-bit pointer (64-bit pointers are similar, but use slightly different mechanisms). Once "Virtual Memory" has been enabled, the processor considers this pointer to be made up of three parts:
The highest 10 bits are a Page Directory Entry
The following 10 bits are a Page Table Entry
The last 12 bits make up the Page Offset
Now, when the CPU tries to access the contents of a pointer, it first consults the Page Directory, a table of 1024 entries (on the x86 architecture, its location is held in the CR3 register). The 10-bit Page Directory Entry field is an index into this table, and the selected entry points to the physical location of a Page Table. This, in turn, is another table of 1024 entries, each of which holds a pointer into physical memory plus several important control bits (we'll get back to these later). Once a page has been found, the last 12 bits are used to find an address within that page.
There are many more details (TLBs, Large Pages, PAE, Selectors, Page Protection) but the short explanation above captures the gist of things.
Using this translation mechanism, an OS can use a different set of physical pages for each process, thus giving each process the illusion of having all the memory for itself (as each process gets its own Page Directory).
On top of this Virtual Memory the OS may also add the concept of Paging. One of the control bits discussed earlier allows the OS to specify whether an entry is "Present". If it isn't present, an attempt to access that entry results in a Page Fault exception. The OS can capture this exception and act accordingly. OSs supporting swapping/paging can thus decide to load the page from the swap space, fix the translation tables, and then issue the memory access again.
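Here is a tiny sketch of that two-level walk including the Present-bit check. The arrays, the flag value, and the way a "fault" is handled are assumptions made purely for illustration (a real PDE stores the physical address of its page table, and the real walk is done by the hardware):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Simplified sketch of the 10/10/12 walk described above, including the
     * Present bit check. Here the page table is found by index rather than
     * through the PDE's stored address, and a "fault" simply exits. */
    #define PRESENT 0x1u

    static uint32_t page_directory[1024];       /* 1024 PDEs: flags only here  */
    static uint32_t page_tables[1024][1024];    /* 1024 PTs of 1024 PTEs each  */

    static uint32_t walk(uint32_t vaddr)
    {
        uint32_t pde_idx = vaddr >> 22;            /* highest 10 bits          */
        uint32_t pte_idx = (vaddr >> 12) & 0x3FF;  /* next 10 bits             */
        uint32_t offset  = vaddr & 0xFFF;          /* last 12 bits             */

        if (!(page_directory[pde_idx] & PRESENT)) {
            fprintf(stderr, "page fault at PDE level\n");  /* the OS would     */
            exit(1);                                       /* handle this, e.g.*/
        }                                                  /* load from swap   */
        uint32_t pte = page_tables[pde_idx][pte_idx];
        if (!(pte & PRESENT)) {
            fprintf(stderr, "page fault at PTE level\n");
            exit(1);
        }
        return (pte & ~0xFFFu) | offset;           /* frame address + offset   */
    }

    int main(void)
    {
        /* Map virtual page 0x2 (address 0x2000) to physical frame 0x5000. */
        page_directory[0] = PRESENT;
        page_tables[0][2] = 0x5000 | PRESENT;
        printf("virtual 0x2002 -> physical 0x%x\n", walk(0x2002));
        return 0;
    }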
This is where the two terms combine: an OS supporting Virtual Memory and Paging can give processes the illusion of having more memory than is actually present by paging (swapping) pages in and out of the swap area.
As to your last question (why is it said that a 32-bit CPU is limited to 4 GB of Virtual Memory): this refers to the "Virtual Memory" of definition 2 and is an immediate result of the pointer size. If the CPU can only use 32-bit pointers, you have only 32 bits to express different addresses, which gives you 2^32 bytes = 4 GB of addressable memory.
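Spelling the arithmetic out (a trivial check, nothing beyond what the answer already says):

    #include <stdio.h>
    #include <stdint.h>

    /* The 4 GB figure is just the size of the 32-bit address space. */
    int main(void)
    {
        uint64_t addresses = 1ULL << 32;               /* 2^32 distinct addresses */
        printf("%llu bytes = %llu GiB\n",
               (unsigned long long)addresses,          /* 4294967296 bytes        */
               (unsigned long long)(addresses >> 30)); /* = 4 GiB                 */
        return 0;
    }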
Hope this makes things a bit clearer.
IMHO it is terribly misleading to treat swap space as equivalent to virtual memory. VM is a much more general concept than swap space. Among other things, VM allows processes to reference virtual addresses during execution, which are translated into physical addresses with the support of hardware and page tables. Thus processes need not be concerned with how much physical memory the system has, or where the instruction or data actually resides in the physical memory hierarchy. VM provides this mapping. The referenced item (instruction or data) may be resident in L1, or L2, or RAM, or finally on disk, in which case it is loaded into main memory.
Swap space is just a place on secondary storage where pages are stored while they are inactive. If there is not enough RAM, the OS may decide to swap out pages of a process to make room for other processes' pages. The processor never executes instructions or reads/writes data directly from swap space.
Notice that it would be possible to have swap space in a system with no VM; that is, processes that directly access physical addresses could still have portions of their memory on disk.
Though the thread is quite old and has already been answered, I would still like to share this link, as it is the simplest explanation I have found so far. The link below has diagrams for better visualization.
Key Difference: Virtual memory is an abstraction of the main memory. It extends the available memory of the computer by storing the inactive parts of the RAM's contents on a disk. Whenever that content is required, it is fetched back into RAM. Swap memory, or swap space, is the part of the hard disk drive that is used for virtual memory; because of this, the two terms are sometimes used interchangeably.
Virtual memory is quite different from physical memory. Programmers get direct access to virtual memory rather than physical memory. Virtual memory is an abstraction of the main memory. It is used to hide the details of the real physical memory of the system. It extends the available memory of the computer by storing the inactive parts of the RAM's contents on a disk. When that content is required, it is fetched back into RAM. Virtual memory creates the illusion of a whole address space with addresses beginning at zero. It is mainly valued because it reduces the demands on physical memory. It is composed of the available RAM and disk space.
Swap memory is generally called swap space. Swap space refers to the portion of virtual memory that is reserved as a temporary storage location. Swap space is used when the available RAM is not able to meet the system's memory requirements. For example, in the Linux memory system, the kernel locates each page either in physical memory or in swap space. The kernel also maintains a table that records which pages have been swapped out and which are in physical memory.
Pages that have not been accessed for a long time are sent to the swap space. This process is referred to as swapping out. If such a page is needed again, it is swapped back into physical memory, possibly by swapping out a different page. Thus, one can conclude that swap memory and virtual memory are interconnected, since swap memory is used to implement the technique of virtual memory.
difference-between-virtual-memory-and-swap-memory
"Virtual memory" is a generic term. In Windows, it is called as Paging or pagination. In Linux, it is called as Swap.