How can virtual address space be paged? - operating-system

While I was reading this Wikipedia article, http://en.wikipedia.org/wiki/Memory_management_unit#How_it_works, I came across the statement that the virtual address space (the range of addresses used by the processor) is divided into pages. But I had learnt that only physical memory (RAM) is divided into pages. So how is the virtual address space of a process divided?
Also, the definition of virtual address space given here is the range of addresses used by the processor. Does "range of addresses used by the processor" mean the width of the processor's address bus? So if I have a processor with a 32-bit address bus and 4 GB (2^32) of RAM, are my physical and virtual address spaces the same?
Bear with me if the questions are too naive. I still don't have a clear picture of address spaces. Thanks in advance.

The answer is specific to each OS, but in general terms it means that although each process gets, say, 32 bits' worth of addressable memory, this memory space is divided into ranges, or pages, of a certain size.
Simplistically speaking, when your process accesses an address, that location falls within a certain page. The OS will ensure that there is physical memory mapped to that location. However, it may not be at the same address in physical RAM.
When some other process accesses that location, the OS will map in a page of physical RAM there so that the location is addressable for that process too.
All the while, physical memory pages are being mapped to and from disk (so that you can have more memory than 32 bits' worth), and virtual memory pages are being mapped to physical pages as just described.
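Here is a minimal sketch of what that mapping looks like, assuming a hypothetical 32-bit virtual address, a 4 KB page size, and a made-up frame number standing in for whatever the page table actually holds:

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch: split a 32-bit virtual address into a page number and an
       offset (4 KB pages assumed).  The OS keeps a per-process table that
       maps each virtual page number to a physical frame; only the page
       number is translated, the offset is carried over unchanged. */
    #define PAGE_SIZE   4096u
    #define OFFSET_BITS 12u                          /* log2(4096) */

    int main(void) {
        uint32_t vaddr  = 0x0040A7C3u;               /* some virtual address */
        uint32_t vpn    = vaddr >> OFFSET_BITS;      /* virtual page number  */
        uint32_t offset = vaddr & (PAGE_SIZE - 1);   /* offset within page   */

        uint32_t pfn   = 0x1234u;                    /* pretend the page table
                                                        maps vpn to this frame */
        uint32_t paddr = (pfn << OFFSET_BITS) | offset;

        printf("vaddr=0x%08X -> vpn=0x%05X offset=0x%03X -> paddr=0x%08X\n",
               vaddr, vpn, offset, paddr);
        return 0;
    }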
I really recommend reading the links in this question https://stackoverflow.com/questions/1437914/best-book-on-operating-systems

Related

How to design memory map (physical RAM and ROM address bus connections) for 8086 microprocessor?

It is desired to design a 64 KB memory decoder for the 8086 microprocessor using two 16 KB SRAM and two 16 KB EPROM chips.
Assuming the microprocessor wants to write to memory address 0A557H, which chip select and location select will the system use?
I know that the 8086 microprocessor has 20 address lines, and that 14 of them could be used for location select if we were considering only one 16 KB RAM, leaving 6 lines for chip select.
Location select: A0-A13 and Chip select: A14-A19
I am confused about how to go about solving this question, as we expect to see questions like this in exams.
How to design memory map...?
You can put each of the two 16 KB RAMs and two 16 KB ROMs almost anywhere in the 1 MB address space (as long as no two overlap each other), so you might develop some other criteria to help you choose.
  It would make most sense for the RAMs to be consecutive with each other, so that you'd have one larger 32 KB RAM area rather than two 16 KB RAM sections separated in the address space; that gives software more flexibility and ease of use.
  Software memory model: the small memory model for x86 means that the program needs to store only 16 bits for pointers.  The way code in ROM interacts with code in RAM may or may not suggest placing both within the same 16-bit address range.
  Expandability: if the system later allows expansion of either the amount of RAM or ROM, there is an argument for leaving a gap between them for the other to expand into, at the cost of having to deal with a larger portion of the 1 MB address space, possibly requiring pointers wider than 16 bits in some cases.
  Using the least number of address lines would suggest using the lowest part of the 1MB address space.
Other criteria can be identified also, I'm sure.
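For the concrete address in the question, here is a sketch of the decode, under the assumption (not fixed by the question) that the four 16 KB chips are mapped consecutively from 00000H, i.e. chip 0 at 00000H-03FFFH, chip 1 at 04000H-07FFFH, chip 2 at 08000H-0BFFFH, chip 3 at 0C000H-0FFFFH:

    #include <stdint.h>
    #include <stdio.h>

    /* A14-A19 select the chip, A0-A13 select the location within it,
       assuming the consecutive layout described above. */
    int main(void) {
        uint32_t addr = 0x0A557u;            /* the 20-bit address 0A557H   */
        uint32_t chip = addr >> 14;          /* value decoded from A14..A19 */
        uint32_t loc  = addr & 0x3FFFu;      /* value on A0..A13            */
        printf("address %05XH -> chip %u, location %04XH\n",
               (unsigned)addr, (unsigned)chip, (unsigned)loc);
        /* prints: address 0A557H -> chip 2, location 2557H */
        return 0;
    }

With that layout, 0A557H falls in the third 16 KB block (chip 2) at location 2557H; a different placement of the chips would of course give a different chip select.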

How to find bits of physical address and virtual address?

A system has 512 MB physical memory and 4 GB virtual memory, with a page size of 8 KB.
Find the number of bits for the physical address and the virtual address.
Find the number of pages and the number of frames.
Find the number of bits for the physical address and the virtual address.
The physical address space needs to be large enough for more than just RAM (you need space for ROM and memory-mapped devices, and often there's "unused" space for future expansion). There isn't enough information to say anything other than "physical addresses probably have more than 29 bits", but in that case it's likely that the hardware designer would round it up to a nice power of 2, so 32 is the most likely number of bits in a physical address.
Virtual addresses would have 32 bits (because "log2(4 GiB) == 32").
Find the number of pages and the number of frames.
"512 MiB / 8 KiB" works out to 65536 physical pages of RAM alone (excluding ROM, memory mapped devices, etc).
"4 GiB / 8 KiB" works out to a maximum of 524288 virtual pages per virtual address space.

Why is there an OS segment reserved in virtual memory

Why is there a portion of virtual memory reserved for OS? Why is it limited to a certain size? This seems to be a universally known fact because when I googled I didn't find anyone asking similar questions.
If the OS segment (the part in VM reserved for OS) is accessed, what happens?
How does the OS segment affect the translation between virtual and physical memory?
For example, if your virtual memory is 128 KB, the first 32 KB is allocated to seg 0 and the last 32 KB to seg 1. Then you reserve the first 16 KB for the OS seg. What happens to seg 0? Does its size shrink to 16 KB because 16 KB has been turned into the OS seg? Or does it stay the same?
Why is there a portion of virtual memory reserved for OS? Why is it limited to a certain size? This seems to be a universally known fact because when I googled I didn't find anyone asking similar questions.
The reason some area of the logical address space is reserved for the OS is that the same physical memory is shared by all processes, and it needs to be at the same location in every process.
When an interrupt occurs, any process can be running. So the kernel mode handler needs to be in the same location.
Usually the reserved OS area is so large that the actual OS will never come close to using it all. So it is not really limited in size.
If the OS segment is accessed, what happens?
That depends upon how it is accessed. If a process accesses it in kernel mode (system call, interrupt, exception), that is normal. If it accesses the reserved area in user mode, that usually triggers an access violation of some kind. Some systems may make some areas of system memory readable from user mode, but it is usually all write-protected.
How does the OS segment affect the translation between virtual and physical memory?
This is system dependent. Some systems make the user page tables pageable. The user page tables can then be in pageable areas in the system address space. In other words, the page tables are in virtual/logical memory, giving an additional translation step for user addresses that does not occur for system addresses.
Doing the same for the system address space would cause a chicken and egg problem. In such a system, the system page tables would be in physical locations (another reason everyone uses the same address range for system space).
Other systems use physical addresses for all page tables. In that case, the translation is the same for user and system addresses.
For example, if your virtual memory is 128 KB, the first 32 KB is allocated to seg 0 and the last 32 KB to seg 1. Then you reserve the first 16 KB for the OS seg. What happens to seg 0? Does its size shrink to 16 KB because 16 KB has been turned into the OS seg? Or does it stay the same?
This is not a good example. Virtual memory is never this small. Imagine a 32-bit system instead. The virtual address space is 4 GB. The system assigns the first 3 GB to the user space and the last 1 GB to the system space.
All processes share the same 1 GB system space. They each have their own, unique 3 GB user space.
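As an illustration of that split, here is a trivial check of whether an address belongs to the OS region, assuming the classic 3 GB / 1 GB layout used by 32-bit Linux (the exact boundary is OS- and configuration-dependent):

    #include <stdint.h>
    #include <stdio.h>

    #define KERNEL_BASE 0xC0000000u   /* assumed 3 GB / 1 GB split */

    static int is_kernel_address(uint32_t vaddr) {
        return vaddr >= KERNEL_BASE;  /* last 1 GB belongs to the OS */
    }

    int main(void) {
        printf("%d\n", is_kernel_address(0x08048000u)); /* 0: user space   */
        printf("%d\n", is_kernel_address(0xC0100000u)); /* 1: kernel space */
        return 0;
    }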

Is there any limit on a process’s virtual memory? If so what?

I was asked this question in an interview. I answered that there is no limit, since virtual memory is itself an imaginary thing, so there is no limit on it.
But I couldn't find a proper answer by googling.
Kindly help me out and explain what limits there are on virtual memory.
The maximum theoretical size for virtual memory is given by the size of a pointer. The largest number that can be represented by the pointer is the maximum theoretical size of virtual memory. The units are the minimal addressable memory unit (typically bytes).
Real operating systems sometimes impose additional restrictions.
There are a number of restrictions on virtual memory.
The address range of the underlying hardware.
Any subdivisions of the address space. Some ranges may be reserved (for example, system and user address spaces), and some may be invalid altogether. Example: the VAX divides the 32-bit address space evenly into two user spaces, a system space, and a reserved (unusable) space.
Limits the operating system imposes on page table size. Most systems have a parameter and/or account setting limiting this.
The size of the page file.
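On POSIX systems, one of those limits can be queried directly: the per-process address-space limit, RLIMIT_AS. This is just one of the restrictions listed above; the hardware address range and the page-file size are separate constraints. A minimal sketch:

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void) {
        struct rlimit rl;
        if (getrlimit(RLIMIT_AS, &rl) == 0) {
            if (rl.rlim_cur == RLIM_INFINITY)
                printf("address-space limit: unlimited (soft)\n");
            else
                printf("address-space limit: %llu bytes (soft)\n",
                       (unsigned long long)rl.rlim_cur);
        }
        return 0;
    }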

Why do x86-64 systems have only a 48 bit virtual address space?

In a book I read the following:
32-bit processors have 2^32 possible addresses, while current 64-bit processors have a 48-bit address space
My expectation was that if it's a 64-bit processor, the address space should also be 2^64.
So I was wondering what is the reason for this limitation?
Because that's all that's needed. 48 bits give you an address space of 256 terabyte. That's a lot. You're not going to see a system which needs more than that any time soon.
So CPU manufacturers took a shortcut. They use an instruction set which allows a full 64-bit address space, but current CPUs use only the lower 48 bits. The alternative was wasting transistors on handling a bigger address space which wasn't going to be needed for many years.
So once we get near the 48-bit limit, it's just a matter of releasing CPUs that handle the full address space, but it won't require any changes to the instruction set, and it won't break compatibility.
Any answer referring to the bus size and physical memory is slightly mistaken, since OP's question was about virtual address space not physical address space. For example the supposedly analogous limit on some 386's was a limit on the physical memory they could use, not the virtual address space, which was always a full 32 bits. In principle you could use a full 64 bits of virtual address space even with only a few MB of physical memory; of course you could do so by swapping, or for specialized tasks where you want to map the same page at most addresses (e.g. certain sparse-data operations).
I think the real answer is that AMD was just being cheap and hoped nobody would care for now, but I don't have references to cite.
Read the limitations section of the wikipedia article:
A PC cannot contain 4 petabytes of memory (due to the size of current memory chips if nothing else) but AMD envisioned large servers, shared memory clusters, and other uses of physical address space that might approach this in the foreseeable future, and the 52 bit physical address provides ample room for expansion while not incurring the cost of implementing 64-bit physical addresses
That is, there's no point implementing full 64 bit addressing at this point, because we can't build a system that could utilize such an address space in full - so we pick something that's practical for today's (and tomorrow's) systems.
The internal native register/operation width does not need to be reflected in the external address bus width.
Say you have a 64 bit processor which only needs to access 1 megabyte of RAM. A 20 bit address bus is all that is required. Why bother with the cost and hardware complexity of all the extra pins that you won't use?
The Motorola 68000 was like this: 32 bits internally, but with a 23-bit address bus (and a 16-bit data bus). The CPU could access 16 megabytes of RAM (the 23 address lines plus the upper/lower data strobes give 24-bit byte addressing), and loading the native data type (32 bits) took two memory accesses (each carrying 16 bits of data).
There is a more severe reason than just saving transistors in the CPU address path: if you increase the size of the address space, you need to increase the page size, increase the size of the page tables, or have a deeper page table structure (that is, more levels of translation tables). All of these things increase the cost of a TLB miss, which hurts performance.
From my point of view, this is a result of the page size. Each 4096-byte page table can contain at most 4096/8 = 512 entries, and 2^9 = 512, so each level of the page table consumes 9 address bits. With four levels plus the 12-bit page offset, that gives 9 * 4 + 12 = 48.
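A sketch of that decomposition for a 48-bit x86-64 virtual address with 4 KB pages and four page-table levels (the example address is arbitrary):

    #include <stdint.h>
    #include <stdio.h>

    /* 9 index bits per level (512 entries of 8 bytes per 4 KB table)
       plus a 12-bit page offset = 4*9 + 12 = 48 bits. */
    int main(void) {
        uint64_t va     = 0x00007f1234567abcULL;
        unsigned offset = (unsigned)(va & 0xFFF);          /* bits 0-11  */
        unsigned pt     = (unsigned)((va >> 12) & 0x1FF);  /* bits 12-20 */
        unsigned pd     = (unsigned)((va >> 21) & 0x1FF);  /* bits 21-29 */
        unsigned pdpt   = (unsigned)((va >> 30) & 0x1FF);  /* bits 30-38 */
        unsigned pml4   = (unsigned)((va >> 39) & 0x1FF);  /* bits 39-47 */
        printf("pml4=%u pdpt=%u pd=%u pt=%u offset=0x%03X\n",
               pml4, pdpt, pd, pt, offset);
        return 0;
    }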
Many people have this misconception, but if you read the following carefully, it should clear things up.
Saying that a processor is 32-bit or 64-bit does not mean it has a 32-bit or 64-bit address bus. I repeat: it does not.
A 32-bit processor means it has a 32-bit ALU (Arithmetic and Logic Unit), i.e. it can operate on a 32-bit binary operand (simply put, a binary number with 32 digits), and similarly a 64-bit processor can operate on a 64-bit operand. So whether a processor is 32-bit or 64-bit does not tell you the maximum amount of memory that can be installed; it only tells you how large the operands can be. (As an analogy, a 10-digit calculator can compute results of up to 10 digits; it cannot give you 11-digit or larger results. The analogy is in decimal rather than binary, but it keeps things simple.)
What you are asking about is the address space, that is, the maximum amount of directly addressable memory (RAM). The maximum possible size of RAM is determined by the width of the address bus, not by the width of the data bus or the ALU on which the processor's "size" (32/64-bit) is defined. Yes, if a processor has a 32-bit address bus it can address 2^32 bytes = 4 GB of RAM (and for a 64-bit address bus it would be 2^64 bytes), but calling a processor 32-bit or 64-bit has no relevance to this address space; the 32/64-bit label depends only on the size of the ALU. Of course, the data bus and address bus may be the same width, and then it may seem that a 32-bit processor means it can access 2^32 bytes = 4 GB of memory, but that is only a coincidence and it does not hold in general. For example, the Intel 8086 is a 16-bit processor (it has a 16-bit ALU), so by that reasoning it should access only 2^16 bytes = 64 KB of memory, but that is not true: it can access up to 1 MB because it has a 20-bit address bus. You can google it if you have any doubts. :)
I think I have made my point clear. Now, coming to your question: since a 64-bit processor does not have to have a 64-bit address bus, there is nothing wrong with a 64-bit processor having a 48-bit address bus. They kept the address space smaller to make the design and fabrication cheaper, as nobody is going to use such a big memory (2^64 bytes); 2^48 bytes is more than enough nowadays.
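To make the 8086 example above concrete: a 16-bit CPU reaches a 20-bit (1 MB) address space by combining two 16-bit values, segment and offset, as segment * 16 + offset (the values below are just for illustration):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint16_t segment  = 0x1234;
        uint16_t offset   = 0x0010;
        uint32_t physical = ((uint32_t)segment << 4) + offset;  /* 0x12350 */
        printf("%04X:%04X -> %05X\n", segment, offset, (unsigned)physical);
        return 0;
    }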
To answer the original question: There was no need to add more than 48 Bits of PA.
Servers need the maximum amount of memory, so let's try to dig deeper.
1) The largest (commonly used) server configuration is an 8-socket system. An 8S system is nothing but 8 server CPUs connected by a high speed coherent interconnect (or simply, a high speed "bus") to form a single node. There are larger clusters out there, but they are few and far between; we are talking about commonly used configurations here. Note that in real-world usage, a 2-socket system is one of the most commonly used servers, and 8S is typically considered very high end.
2) The main types of memory used by servers are byte-addressable regular DRAM memory (e.g. DDR3/DDR4 memory), Memory Mapped IO - MMIO (such as memory used by an add-in card), and the Configuration Space used to configure the devices present in the system. The first type of memory is usually the biggest (and hence needs the most address bits). Some high end servers use a large amount of MMIO as well, depending on the actual configuration of the system.
3) Assume each server CPU (socket) can house 16 DDR4 DIMMs, with a maximum DDR4 DIMM size of 256 GB. (Depending on the server generation, the number of possible DIMMs per socket is actually less than 16, but keep reading for the sake of the example.)
So each socket can theoretically have 16 * 256 GB = 4096 GB = 4 TB.
For our example 8S system, the DRAM size can be a maximum of 8 * 4 TB = 32 TB. This means that the maximum number of bits needed to address this DRAM space is 45 (since 32 TB = 2^45 bytes).
We won't go into the details of the other types of memory (MMIO, MMCFG, etc.), but the point here is that the most "demanding" type of memory for an 8-socket system with the largest DDR4 DIMMs available today (256 GB DIMMs) uses only 45 bits.
For an OS that supports 48 bits (WS16 for example), there are 48 - 45 = 3 remaining bits.
Which means that if we used the lower 45 bits solely for 32 TB of DRAM, we would still have 2^3 = 8 times that much addressable space, which can be used for MMIO/MMCFG, for a total of 256 TB of addressable space.
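A back-of-the-envelope check of those numbers:

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint64_t dram = 32ULL << 40;                /* 32 TB of DRAM               */
        unsigned bits = 0;
        while ((1ULL << bits) < dram) bits++;       /* smallest n with 2^n >= dram */
        printf("bits needed for 32 TB: %u\n", bits);               /* 45     */
        printf("2^48 total = %llu TB\n",
               (unsigned long long)((1ULL << 48) >> 40));          /* 256 TB */
        return 0;
    }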
So, to summarize:
1) 48 bits of Physical address is plenty of bits to support the largest systems of today that are "fully loaded" with copious amounts of DDR4 and also plenty of other IO devices that demand MMIO space. 256TB to be exact.
Note that this 256 TB address space (= 48 bits of physical address) does NOT include disk drives like SATA drives, because they are NOT part of the address map; it includes only the memory that is byte-addressable and exposed to the OS.
2) CPU hardware may choose to implement 46, 48 or > 48 bits depending on the generation of the server. But another important factor is how many bits does the OS recognize.
Today, WS16 supports 48 bit Physical addresses (=256 TB).
What this means to the user is, even though one has a large, ultra modern server CPU that can support >48 bits of addressing, if you run an OS that only supports 48 bits of PA, then you can only take advantage of 256 TB.
3) All in all, there are two main factors that determine how many address bits (= how much memory capacity) you can take advantage of:
a) How many bits does your CPU HW support? (This can be determined by CPUID instruction in Intel CPUs).
b) What OS version are you running and how many bits of PA does it recognize/support.
The min of (a,b) will ultimately determine the amount of addressable space your system can take advantage of.
I have written this response without looking into the other responses in detail. Also, I have not delved in detail into the nuances of MMIO, MMCFG and the entirety of the address map construction. But I do hope this helps.
Thanks,
Anand K Enamandram,
Server Platform Architect
Intel Corporation
It's not true that only the low-order 48 bits of a 64 bit VA are used, at least with Intel 64. The upper 16 bits are used, sort of, kind of.
Section 3.3.7.1 Canonical Addressing in the Intel® 64 and IA-32 Architectures Software Developer’s Manual says:
a canonical address must have bits 63 through 48 set to zeros or ones (depending on whether bit 47 is a zero or one)
So bits 47 through 63 form a "super-bit": either all 1s or all 0s. If an address isn't in canonical form, the implementation should fault.
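A small sketch of that canonicality check (it assumes the usual two's-complement arithmetic right shift):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Canonical means bits 48-63 all equal bit 47 (sign extension). */
    static bool is_canonical(uint64_t va) {
        int64_t sext = (int64_t)(va << 16) >> 16;   /* sign-extend from bit 47 */
        return (uint64_t)sext == va;
    }

    int main(void) {
        printf("%d\n", is_canonical(0x00007FFFFFFFFFFFULL)); /* 1: canonical     */
        printf("%d\n", is_canonical(0xFFFF800000000000ULL)); /* 1: canonical     */
        printf("%d\n", is_canonical(0x0000800000000000ULL)); /* 0: non-canonical */
        return 0;
    }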
On AArch64, this is different. According to the ARMv8 Instruction Set Overview, it's a 49-bit VA.
The AArch64 memory translation system supports a 49-bit virtual address (48 bits per translation table). Virtual addresses are sign- extended from 49 bits, and stored within a 64-bit pointer. Optionally, under control of a system register, the most significant 8 bits of a 64-bit pointer may hold a “tag” which will be ignored when used as a load/store address or the target of an indirect branch
A CPU is considered "N-bit" based mainly on its data-bus size and on a large part of its internal architecture: registers, accumulators, the Arithmetic Logic Unit (ALU), the instruction set, etc. For example, the good old Motorola 6800 (or Intel 8050) CPU is an 8-bit CPU. It has an 8-bit data bus, an 8-bit internal architecture, and a 16-bit address bus.
An N-bit CPU may, however, have some entities that are not N bits in size. Consider, for example, the improvements of the 6809 over the 6800 (both of them 8-bit CPUs with an 8-bit data bus). Among the significant enhancements introduced in the 6809 were two 8-bit accumulators (A and B, which could be combined into a single 16-bit register, D), two 16-bit index registers (X and Y), and two 16-bit stack pointers.