About operating systems: page-table entry status bits - operating-system

In the movie The Social Network, when Mark Zuckerberg was in class, the teacher asked this question:
Suppose we're given a computer with a 16-bit virtual address and a page size of 256 bytes. The system uses one-level page tables that start at address hex 400. Maybe you want DMA (Direct Memory Access) on your 16-bit system. Who knows? The first pages are reserved for hardware flags, etc. Assume page-table entries have eight status bits. The eight status bits would then be ...
Mark Zuckerberg answered:
One valid bit, one modified bit, one reference bit and five permission bits.
How did he get this?

http://chomaloma.blogspot.com.au/2011/02/social-network-inaccuracies-regarding.html
That does explain it a little

Intel nomenclature in parentheses. The 'valid' (present), 'modified' (dirty) and 'reference' (accessed) bits are the minimum set of bits you need for a demand paging manager and MMU.
The 'valid' (present) bit is used by the MMU to know whether the page is mapped to a valid physical address.
The 'modified' (dirty) bit is used by the demand paging manager to determine if the page being evicted needs to be written to backing media. As accessing backing media can be considered an expensive operation, you really want to keep this to a minimum--especially when writing to it as that is generally slower than reading from it.
The 'reference' (accessed) bit is useful to the demand paging manager to figure out how to age the pages it controls. You don't want to evict the most frequently used pages as that would require saving and/or loading them repeatedly from backing store (which has already been stated as SLOW).
The remaining five bits are gravy. They are free to use as permission and/or option bits. For example, can the page be accessed by supervisor and/or user threads? Is the page available for write, or is it read-only? What is the caching strategy to be used on the page?
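For illustration, a 16-bit page-table entry along those lines might be laid out like this in C; the bit positions and the particular permission/option bits are invented for the example, not taken from any real system:

    #include <stdint.h>

    /* Hypothetical layout of a 16-bit page-table entry: an 8-bit physical
     * frame number plus the eight status bits described above. Bit positions
     * and the specific permission/option bits are invented for illustration. */
    #define PTE_VALID      (1u << 15)  /* 'present': mapping is usable       */
    #define PTE_MODIFIED   (1u << 14)  /* 'dirty': page has been written     */
    #define PTE_REFERENCED (1u << 13)  /* 'accessed': page has been touched  */
    #define PTE_USER       (1u << 12)  /* permission: user-mode access       */
    #define PTE_WRITABLE   (1u << 11)  /* permission: write allowed          */
    #define PTE_EXECUTABLE (1u << 10)  /* permission: execute allowed        */
    #define PTE_NO_CACHE   (1u << 9)   /* option: caching disabled           */
    #define PTE_WRITE_THRU (1u << 8)   /* option: write-through caching      */
    #define PTE_FRAME_MASK 0x00FFu     /* low 8 bits: physical frame number  */

    static inline uint16_t pte_frame(uint16_t pte) { return pte & PTE_FRAME_MASK; }
    static inline int      pte_valid(uint16_t pte) { return (pte & PTE_VALID) != 0; }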
Hope this helps.
Sparky

How did he get the answer?---That is just movie BS.
If you take the number of bits in the address and subtract the number of bits used to represent a page, you get the number of bits available for the processor to use as system status bits.
With that information, he could identify the number of system status bits.
The usage of those bits is another story. The allocation of system status bits is system dependent. Maybe such systems exist, but I don't know of any 16-bit virtual-addressing system, so he's not referring to any specific type of system.
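Spelled out as code, that arithmetic (a sketch of the movie's numbers only, not of any real system) looks like this:

    #include <stdio.h>

    /* The movie arithmetic, spelled out: a 256-byte page needs an 8-bit offset,
     * so a 16-bit virtual address leaves 8 bits of virtual page number; by the
     * reasoning above, that leaves 8 bits of the entry free for status. */
    int main(void) {
        int address_bits = 16;
        int page_size    = 256;

        int offset_bits = 0;
        while ((1 << offset_bits) < page_size)
            offset_bits++;                                       /* 8 */

        int page_number_bits = address_bits - offset_bits;       /* 8 */
        int status_bits      = address_bits - page_number_bits;  /* 8 */

        printf("offset=%d page-number=%d status=%d bits\n",
               offset_bits, page_number_bits, status_bits);
        return 0;
    }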
A reference bit is not used by all systems (e.g. VMS). That's not even mandatory.
Hollywood magic.

Related

Why is Page Size specified as part of Instruction Set Architecture?

I am trying to understand why Page Size is specified as part of an ISA.
More specifically, I am looking for details where any of the hardware modules (MMU, TLB) (apart from the Operating System) use the Page Size information to provide a certain functionality.
Please let me know the reasons Page Size has to be part of the ISA instead of just being decided by the OS.
Thanks.
The TLB hardware has to know the page size to figure out whether a translation applies to an address or not. e.g. given a translation, does an address 2500 bytes above it use that translation or not?
Or to put it another way, the TLB has to know which address bits are part of the page offset (within a page), and which bits need translating from virtual to physical.
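As a sketch, assuming 4 KiB pages, the split the TLB relies on looks like this:

    #include <stdint.h>
    #include <stdio.h>

    /* Assuming 4 KiB pages: the low 12 bits are the offset within the page
     * and the rest is the virtual page number the TLB has to match. */
    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1u << PAGE_SHIFT)

    int main(void) {
        uint64_t vaddr  = 0x00007f1234567000ULL;
        uint64_t vpn    = vaddr >> PAGE_SHIFT;        /* bits the TLB translates   */
        uint64_t offset = vaddr & (PAGE_SIZE - 1);    /* bits passed through as-is */
        /* An address 2500 bytes higher uses the same translation only if its VPN matches. */
        int same_page = ((vaddr + 2500) >> PAGE_SHIFT) == vpn;
        printf("vpn=%#llx offset=%#llx same_page=%d\n",
               (unsigned long long)vpn, (unsigned long long)offset, same_page);
        return 0;
    }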
Also, on architectures with HW page walk, the whole page table format is part of the ISA, and the typical design uses the virtual page number as an index to find the right entry (e.g. x86-64's 4-level page tables). Not a linear or binary search through the page table to find an entry that contains the virtual address being searched for. Normally this same design is used for page tables walked by software, AFAIK.
It is possible to build a TLB where each entry has a mask to control how many address bits it matches, i.e. where a single TLB can have entries for pages of multiple sizes. This only works if pages have power-of-2 sizes and are naturally aligned (i.e. the start address of a page is always some multiple of its size, so zeroing the low bits of an address inside a page gives you the page-start address).
You could potentially use this with an extent-based page-table format, where you have one entry for each contiguous mapping instead of one entry for each page.
Page-walks would probably be more costly, having to check more entries for more mappings, but the same number of TLB entries could cover more address space.
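A sketch of such a masked TLB entry; the structure and field names are invented for illustration:

    #include <stdint.h>

    /* Sketch of a TLB entry that can match pages of several power-of-2 sizes.
     * 'mask' selects the page-number bits (e.g. ~0xFFFull for 4 KiB,
     * ~0x1FFFFFull for 2 MiB); the page must be naturally aligned. */
    struct tlb_entry {
        uint64_t vtag;   /* virtual start of the mapping (offset bits zero)  */
        uint64_t ptag;   /* physical start of the mapping                    */
        uint64_t mask;   /* which address bits participate in the comparison */
    };

    int tlb_match(const struct tlb_entry *e, uint64_t vaddr) {
        return ((vaddr ^ e->vtag) & e->mask) == 0;
    }

    uint64_t tlb_translate(const struct tlb_entry *e, uint64_t vaddr) {
        return e->ptag | (vaddr & ~e->mask);   /* keep the within-page offset bits */
    }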
In many cases OSes want to be able to easily unmap or even page out unused pages, and this conflicts with using huge pages that cover a mix of hot and cold data or especially code. (But normal fixed-size hugepages have this problem, too, so x86-64's 2M and 1G hugepages aren't always a win vs. standard 4k pages.)
Page size isn't a part of the ISA (what a compiler would normally emit) for x86_64. The instruction set architecture for x86_64 is formally known as Intel® 64 Architecture, and it is briefly described in section 2.2.10 (volume 1) of the Intel® 64 and IA-32 Architectures Software Developer’s Manual. It describes what an application program can see and do. There is something similar for ARMv8.
Instead, page size is left to the OS, and it isn't a part of the ISA. This is because page sizes can vary amongst implementations and can vary according to mode settings (4K/2M/4M/1G). x86_64 implementations present something like an ISA to the OS which Intel refers to as the system programming level (what an OS would use). That's described in volume 3 of Intel's Software Developer's Manual.
That level describes page sizes and modes. But a 'correct' application program should run with different page sizes on different systems in different page size modes.
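That is why a portable program asks the OS for the page size at run time rather than assuming one; for example, on POSIX systems:

    #include <stdio.h>
    #include <unistd.h>

    /* A portable application doesn't hard-code the page size; on POSIX systems
     * it asks the OS at run time. */
    int main(void) {
        long page = sysconf(_SC_PAGESIZE);
        printf("page size: %ld bytes\n", page);
        return 0;
    }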

How can I limit the number of blocks written in a Write_10 command?

I have a product that is basically a USB flash drive based on an NXP LPC18xx microcontroller. I'm using a library provided from the manufacturer (LPCOpen) that handles the USB MSC and the SD card media (which is where I store data).
Here is the problem: Internally the LPC18xx has a 64 kB buffer (limited by hardware) used to cache reads/writes, which means it can only cache up to 128 blocks (512 B each) of memory. The SCSI Write-10 command has a total-blocks field that can be up to 256 blocks (128 kB). When originally testing the product on Windows 7 it never writes more than 128 blocks at a time, but when tested on Linux it sometimes writes more than 128 blocks, which causes the microcontroller to crash.
Is there a way to tell the host OS not to request more than 128 blocks? I see references[1] to a Read Block Limits command (05h), but it doesn't seem to be widely supported. Also, what sense key would I return on the Write-10 command to tell Linux the write is too large? I also see references to a Block Limits VPD page in some device spec sheets but cannot find a lot of documentation about how it is implemented.
[1]https://en.wikipedia.org/wiki/SCSI_command
Let me offer a disclaimer up front that this is what you SHOULD do, but none of this may work. A cursory search of the Linux SCSI driver didn't show me what I wanted to see. So, I'm not at all sure that "doing the right thing" will get you the results you want.
Going by the book, you've got to do two things: implement the Block Limits VPD page and handle too-large transfer sizes in WRITE and READ.
First, implement the Block Limits VPD page, which you can find in late revisions of SBC-3 floating around on the Internet (like this one: http://www.13thmonkey.org/documentation/SCSI/sbc3r25.pdf). It's probably worth going to the t10.org site, registering, and then downloading the last revision (http://www.t10.org/cgi-bin/ac.pl?t=f&f=sbc3r36.pdf).
The Block Limits VPD page has a maximum transfer length field that specifies the maximum number of blocks that can be transferred by all the READ and WRITE commands, and basically anything else that reads or writes data. Of course the downside of implementing this page is that you have to make sure that all the other fields you return are correct!
Second, when handling READ and WRITE, if the command's transfer length exceeds your maximum, respond with an ILLEGAL REQUEST key, and set the additional sense code to INVALID FIELD IN CDB. This behavior is indicated by a table in the section that describes the Block Limits VPD, but only in late revisions of SBC-3 (I'm looking at 35h).
You might just start with returning INVALID FIELD IN CDB, since it's the easiest course of action. See if that's enough?
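For illustration only, here is a rough sketch of both pieces. The function names and the surrounding firmware hooks are made up (they are not LPCOpen APIs), and the field offsets should be double-checked against SBC-3 before use:

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    #define MAX_TRANSFER_BLOCKS 128u          /* 128 * 512 B = 64 kB buffer */

    /* INQUIRY VPD page B0h (Block Limits): advertise the transfer limit. */
    int build_block_limits_vpd(uint8_t *buf, size_t len) {
        if (len < 0x40) return -1;
        memset(buf, 0, 0x40);
        buf[0] = 0x00;                        /* direct-access block device  */
        buf[1] = 0xB0;                        /* page code: Block Limits     */
        buf[3] = 0x3C;                        /* page length                 */
        buf[8]  = (MAX_TRANSFER_BLOCKS >> 24) & 0xFF;   /* MAXIMUM TRANSFER   */
        buf[9]  = (MAX_TRANSFER_BLOCKS >> 16) & 0xFF;   /* LENGTH, bytes 8-11,*/
        buf[10] = (MAX_TRANSFER_BLOCKS >> 8)  & 0xFF;   /* big-endian, in     */
        buf[11] =  MAX_TRANSFER_BLOCKS        & 0xFF;   /* logical blocks     */
        return 4 + 0x3C;
    }

    /* Called from the WRITE(10)/READ(10) path with the CDB's TRANSFER LENGTH. */
    int check_transfer_length(uint32_t blocks,
                              uint8_t *sense_key, uint8_t *asc, uint8_t *ascq) {
        if (blocks > MAX_TRANSFER_BLOCKS) {
            *sense_key = 0x05;                /* ILLEGAL REQUEST             */
            *asc       = 0x24;                /* INVALID FIELD IN CDB        */
            *ascq      = 0x00;
            return -1;                        /* report CHECK CONDITION      */
        }
        return 0;
    }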

How to log a page reference string of a process?

Operating System question:
Say we have a process running in a paged memory system, and we want to track which pages it accesses in the specific order it does so. How could we do this?
I was thinking we could write the page to the string every time it needs to be loaded into the TLB, but then my OS wouldn't be able to track the ordering of references (and number of references) to each page already in the TLB, unless somehow on every memory access I could check the TLB. Overall I'm finding the problem to be a bit confusing...
Assume each page table entry is 64 bits, 20 bits for the virtual page, and 20 bits for the corresponding physical frame. A couple of bits are status/privilege, but there are a few "free" bits to work with.
Thanks.
EDIT - an example: if the operating system has page sizes of 1000, and the process accesses some addresses like 1234, 5660, 1220, 7442, ... then the page reference string would look like 1,5,1,7,...
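In code, that example is just integer division by the page size:

    #include <stdio.h>

    /* The example above: with a page size of 1000, each referenced address maps
     * to page number (address / 1000), giving the reference string 1,5,1,7. */
    int main(void) {
        int page_size = 1000;
        int refs[] = {1234, 5660, 1220, 7442};
        for (int i = 0; i < 4; i++)
            printf("%d%s", refs[i] / page_size, i < 3 ? "," : "\n");
        return 0;
    }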
One option is to mark all pages of the process as inaccessible and whenever there's a page fault, mark the faulting page(s) as accessible, record the page number(s) in your "string" and then let the process execute one instruction and repeat everything from the beginning (mark all as inaccessible, etc).
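As a rough illustration of the same idea from user space, assuming a POSIX system: the sketch below protects a region, records the faulting page in the SIGSEGV handler, and unprotects it. Unlike the full technique above it only records the first touch of each page; re-protecting after every instruction would additionally require single-stepping.

    #define _GNU_SOURCE
    #include <signal.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define REGION_PAGES 16

    static char *region;
    static long  page_size;

    static void on_fault(int sig, siginfo_t *info, void *ctx) {
        (void)sig; (void)ctx;
        uintptr_t page = ((uintptr_t)info->si_addr - (uintptr_t)region) / page_size;
        char msg[64];
        int n = snprintf(msg, sizeof msg, "page %lu referenced\n", (unsigned long)page);
        write(STDOUT_FILENO, msg, n);              /* record the reference           */
        mprotect(region + page * page_size,        /* unprotect just this page so    */
                 page_size, PROT_READ | PROT_WRITE); /* the faulting access can retry */
    }

    int main(void) {
        page_size = sysconf(_SC_PAGESIZE);
        region = mmap(NULL, REGION_PAGES * page_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        struct sigaction sa = {0};
        sa.sa_sigaction = on_fault;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);

        region[3 * page_size] = 1;                 /* prints "page 3 referenced"    */
        region[0] = 2;                             /* prints "page 0 referenced"    */
        region[3 * page_size + 8] = 3;             /* already accessible: no fault  */
        return 0;
    }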
You may not always be able to do the above if the code that's doing all of this is not running in the kernel. This depends a lot on the hardware and OS. You may get close to it on Windows, though; see this question and my answer to it.

Providing physical address directly to the TLB in x86-64

Is it possible to provide physical address for a given virtual address in a direct way to the TLB on x86-64 architectures in long mode?
For example, let's say I put zeros in the PML4E, so a page fault exception will be triggered because an invalid address will be found. During the exception, can the CPU tell the TLB by using some instruction that this virtual address is located at X physical page frame?
I want to do this because in code I can easily tell where the physical address would be, and this way avoid an expensive page walk.
No, you can't insert a translation into the TLB directly. To be precise, you need to create/update the appropriate PTE (along with the PDE and PDPE if needed). Everything around MMU management is somehow based on page tables and the TLB. Even user/supervisor protection is done as a special flag of the mapped page.
Why do you think that a page walk is an expensive operation? It is not expensive at all. To determine the PTE that must be updated you need to dereference only 4 pointers: PML4E -> PDPE -> PDE -> PTE. These entries are just indices into the related tables. To get the PML4E you use bits 39-47 of the address taken during page fault handling as an index into the PML4 table. To get the PDPE you use bits 30-38 as an index into the PDPT, and so on. It's not the thing that can slow down your system. I think allocation of a physical page takes more time than that.
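For reference, the four indices that walk uses can be extracted from a canonical 48-bit virtual address like this (standard x86-64 layout with 4 KiB pages: 9 bits per level plus a 12-bit offset):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint64_t va = 0x00007f1234567abcULL;
        unsigned pml4 = (va >> 39) & 0x1FF;   /* index into the PML4 */
        unsigned pdpt = (va >> 30) & 0x1FF;   /* index into the PDPT */
        unsigned pd   = (va >> 21) & 0x1FF;   /* index into the PD   */
        unsigned pt   = (va >> 12) & 0x1FF;   /* index into the PT   */
        uint64_t off  =  va        & 0xFFF;   /* offset within page  */
        printf("PML4=%u PDPT=%u PD=%u PT=%u offset=%#llx\n",
               pml4, pdpt, pd, pt, (unsigned long long)off);
        return 0;
    }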

Does the MMU mediate everything between the operating system and physical memory or is it just an address translator?

I'm trying to understand how does an operating system work when we want to assign some value to a particular virtual memory address.
My first question concerns whether the MMU handles everything between the CPU and the RAM. Is this true? From what one can read from Wikipedia, I'd say so:
A memory management unit (MMU), sometimes called paged memory management unit (PMMU), is a computer hardware component responsible for handling accesses to memory requested by the CPU.
If that is the case, how can one tell the MMU I want to get 8 bytes, 64 or 128 bytes, for example? What about writing?
If that is not the case, I'm guessing the MMU just translates virtual addresses to physical ones?
What happens when the MMU detects there will be what we call a page-fault? I guess it has to tell it to the CPU so the CPU loads the page itself off disk, or is the MMU able to do this?
Thanks
Devoured Elysium,
I'll attempt to answer your questions one by one but note, it might be a good idea to get your hands on a textbook for an OS course or an introductory computer architecture course.
The MMU consists of some hardware logic and state whose purpose is, indeed, to produce a physical address and provide/receive data to and from the memory controller. Actually, the job of memory translation is one that is taken care of by cooperating hardware and software (OS) mechanisms (at least in modern PCs). Once the physical address is obtained, the CPU has essentially done its job and now sends the address out on a bus which is at some point connected to the actual memory chips. In many systems this bus is called the Front-Side Bus (FSB), which is in turn connected to a memory controller. This controller takes the physical address supplied by the CPU and uses it to interact with the DRAM chips, and ultimately extract the bits in the correct rows and columns of the memory array. The data is then sent back to the CPU, which can now operate on it. Note that I'm not including caching in this description.
So no, the MMU does not interact directly with RAM, which I assume you are using to mean the physical DRAM chips. And you cannot tell the MMU that you want 8 bytes, or 24 bytes, or whatever, you can only supply it with an address. How many bytes that gets you depends on the machine you're on and whether it's byte-addressable or word-addressable.
Your last question urges me to remind you: the MMU is actually a part of the CPU--it sits on the same silicon die (although this was not always the case).
Now, let's take your example with the page fault. Suppose our user-level application wants to, like you said, set someAddress = 10, I'll take it in steps. Let's assume someAddress is 0xDEADBEEF and let's ignore caches for now.
1) The application issues a store instruction to someAddress (0xDEADBEEF), which in x86 might look something like
mov %eax, 0xDEADBEEF
where 10 is the value in the eax register.
2) 0xDEADBEEF in this case is a virtual address, which must be translated. Most of the time, the virtual to physical address translation will be available in a hardware structure called the Translation Lookaside Buffer (TLB), which will provide this translation to us very fast. Typically, it can do so in one clock cycle. If the translation is in the TLB, called a TLB hit, execution can continue immediately (i.e. the physical address corresponding to 0xDEADBEEF and the value 10 are sent out to the memory controller to be written).
3) Let's suppose though, that the translation wasn't available in the TLB (called a TLB miss). Then we must find the translation in the page tables, which are structures in memory whose structure is defined by the hardware and managed by the OS. They simply contain entries that map a virtual address to a physical one (more accurately, a virtual page number to a physical page number). But these structures also reside in memory, and so must have addresses! The hardware contains a special register called cr3 which contains the physical address of the current page table. We can index into this page table using our virtual address, so the hardware takes the value in cr3, computes an address by adding an offset, and goes off to memory to fetch the page table entry (PTE). This PTE will (hopefully) contain the physical address corresponding to 0xDEADBEEF, in which case we put this mapping in the TLB (so we don't have to walk the page table again) and continue on our way.
4) But oh no! What if there is no PTE in the page tables for 0xDEADBEEF? This is a page fault, and this is where the Operating System comes into play. The PTE we got out of the page table existed, as in it was (let's assume) a valid memory address to access, but the OS had not created a VA->PA mapping for it yet, so it would have had a bit set to indicate that it is invalid. The hardware is programmed in such a way that when it sees this invalid bit upon an access, it generates an exception, in this case a page fault.
5) The exception causes the hardware to invoke the OS by jumping to a well known location--a piece of code called a handler. There can be many exception handlers, and a page fault handler is one of them. The page fault handler will know the address that caused the fault because it's stored in a register somewhere, and so will create a new mapping for our virtual address 0xDEADBEEF. It will do so by allocating a free page of physical memory and then saying "all virtual addresses between VA x and VA y will map to some address within this newly allocated page of physical memory". 0xDEADBEEF will be somewhere in that range, so the mapping is now securely in the page tables, and we can restart the instruction that caused the page fault (the mov).
6) Now, when we go through the page tables again, we will find a mapping and the PTE we pull out will have a nice physical address, the one we want to store to. We provide this with the value 10 to the memory controller and we're done!
Caches will change this game quite a bit, but I hope this serves to illustrate how paging works. Again, it would benefit you greatly to check out some OS/Computer Architecture books. I hope this was clear.
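As a rough summary of steps 1-6, here is a toy model of the flow; the one-entry "TLB", the tiny flat "page table" and the fault handler are stand-ins for the real hardware and OS mechanisms, not actual interfaces:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12
    #define NUM_PAGES  16

    static uint64_t page_table[NUM_PAGES];               /* 0 means "not mapped" */
    static struct { bool valid; uint64_t vpn, pfn; } tlb;

    static void page_fault_handler(uint64_t vpn) {       /* the "OS": steps 4-5  */
        static uint64_t next_free_frame = 0x100;
        page_table[vpn] = next_free_frame++;
        printf("page fault: mapped vpn %llu\n", (unsigned long long)vpn);
    }

    static uint64_t translate(uint64_t vaddr) {
        uint64_t vpn    = vaddr >> PAGE_SHIFT;
        uint64_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);
        if (tlb.valid && tlb.vpn == vpn)                  /* step 2: TLB hit      */
            return (tlb.pfn << PAGE_SHIFT) | offset;
        if (page_table[vpn] == 0)                         /* step 3: walk misses  */
            page_fault_handler(vpn);                      /* steps 4-5: OS fills  */
        tlb.valid = true;                                 /* refill the TLB       */
        tlb.vpn = vpn;
        tlb.pfn = page_table[vpn];
        return (tlb.pfn << PAGE_SHIFT) | offset;          /* step 6: done         */
    }

    int main(void) {
        printf("%#llx\n", (unsigned long long)translate(0x3ABC)); /* faults first */
        printf("%#llx\n", (unsigned long long)translate(0x3DEF)); /* TLB hit      */
        return 0;
    }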
There are data structures that describe which virtual addresses correspond to which physical addresses. The OS creates and manages these data structures, and the CPU uses them to translate virtual addresses into physical addresses.
For example, the OS might use these data structures to say "virtual addresses in the range from 0x00000000 to 0x00000FFF correspond to physical addresses 0x12340000 to 0x12340FFF"; and if software tries to read 4 bytes from the virtual address 0x00000468 then the CPU will actually read 4 bytes from the physical address 0x12340468.
Typically everything is affected by the virtual->physical translation (except for when the CPU is accessing the data structures that describe the translation). Also, usually there's some sort of translation cache built into the CPU to help reduce the overhead involved.
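In code, the example mapping above boils down to simple base-plus-offset arithmetic:

    #include <stdint.h>
    #include <stdio.h>

    /* The example mapping above: virtual 0x00000000..0x00000FFF corresponds to
     * physical 0x12340000..0x12340FFF, so virtual 0x00000468 goes to 0x12340468. */
    int main(void) {
        uint32_t virt_base = 0x00000000, phys_base = 0x12340000;
        uint32_t vaddr = 0x00000468;
        uint32_t paddr = phys_base + (vaddr - virt_base);
        printf("virtual 0x%08x -> physical 0x%08x\n", vaddr, paddr);
        return 0;
    }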