What's the difference between "virtual memory" and "swap space"?

Can anyone please clarify the difference between virtual memory and swap space?
And why do we say that on a 32-bit machine the maximum accessible virtual memory is only 4 GB?

There's an excellent explanation of virtual memory over on Super User.
Simply put, virtual memory is a combination of RAM and disk space that running processes can use.
Swap space is the portion of virtual memory that is on the hard disk, used when RAM is full.
As for why a 32-bit CPU is limited to 4 GB of virtual memory, it's addressed well here:
By definition, a 32-bit processor uses 32 bits to refer to the location of each byte of memory. 2^32 = 4,294,967,296, which means a memory address that's 32 bits long can only refer to about 4.3 billion unique locations (i.e. 4 GB).

There is some confusion regarding the term Virtual Memory, because it actually refers to the following two very different concepts:
Using disk pages to extend the conceptual amount of physical memory a computer has - the correct term for this is actually Paging
An abstraction used by various OS/CPUs to create the illusion of each process running in a separate contiguous address space.
Swap space, on the other hand, is the name of the portion of disk used to store RAM pages that are not currently in use.
An important realization to make is that the former is transparently possible due to the hardware and OS support of the latter.
In order to make better sense of all this, you should consider how the "Virtual Memory" (as in definition 2) is supported by the CPU and OS.
Suppose you have a 32-bit pointer (64-bit pointers are similar, but use slightly different mechanisms). Once "Virtual Memory" has been enabled, the processor considers this pointer to be made up of three parts:
The highest 10 bits are a Page Directory Entry
The following 10 bits are a Page Table Entry
The last 12 bits make up the Page Offset
Now, when the CPU tries to access the contents of a pointer, it first consults the Page Directory - a table of 1024 entries whose location, on the x86 architecture, is held in the CR3 register. The 10-bit Page Directory Entry is an index into this table, and it points to the physical location of a Page Table. This, in turn, is another table of 1024 entries, each of which contains a pointer into physical memory plus several important control bits (we'll get back to these later). Once the page has been found, the last 12 bits are used to find an address within that page.
There are many more details (TLBs, Large Pages, PAE, Selectors, Page Protection) but the short explanation above captures the gist of things.
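To make the translation layout concrete, here is a minimal C sketch (assuming classic x86 two-level paging with 4 KB pages) that splits a 32-bit virtual address into the three parts described above; the example address is arbitrary.

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t vaddr = 0xC0123ABC;  /* arbitrary example address */

    uint32_t pde_index = (vaddr >> 22) & 0x3FF;  /* highest 10 bits */
    uint32_t pte_index = (vaddr >> 12) & 0x3FF;  /* next 10 bits    */
    uint32_t offset    = vaddr & 0xFFF;          /* lowest 12 bits  */

    printf("PDE index: %u, PTE index: %u, offset: 0x%03X\n",
           (unsigned)pde_index, (unsigned)pte_index, (unsigned)offset);
    return 0;
}
```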
Using this translation mechanism, an OS can use a different set of physical pages for each process, thus giving each process the illusion of having all the memory for itself (as each process gets its own Page Directory).
On top of this Virtual Memory the OS may also add the concept of Paging. One of the control bits discussed earlier allows the OS to specify whether an entry is "Present". If it isn't present, an attempt to access that entry results in a Page Fault exception. The OS can capture this exception and act accordingly. OSs supporting swapping/paging can thus decide to load a page from the Swap Space, fix the translation tables, and then issue the memory access again.
This is where the two terms combine: an OS supporting Virtual Memory and Paging can give processes the illusion of having more memory than is actually present by paging (swapping) pages in and out of the swap area.
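Here is a minimal sketch of the decision a page-fault handler makes around the Present bit. The entry layout follows the x86 format described above, but the handler itself and the commented-out swap_in() helper are hypothetical names used only for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

#define PTE_PRESENT 0x1  /* bit 0 of an x86 page-table entry */

/* Hypothetical handler sketch: on a page fault, inspect the Present bit
   and decide whether to swap the page in or report an invalid access. */
bool handle_page_fault(uint32_t pte, uint32_t fault_addr) {
    (void)fault_addr;  /* a real handler would hand this to the swapper */
    if (!(pte & PTE_PRESENT)) {
        /* Not present: load the page from swap space, update the entry,
           and retry the faulting instruction. */
        /* swap_in(fault_addr);  -- hypothetical OS routine */
        return true;   /* retry the access */
    }
    /* Present but faulted anyway: a protection violation. */
    return false;      /* deliver an exception to the process */
}

int main(void) {
    uint32_t not_present = 0x0;  /* toy entry with the Present bit clear */
    bool retry = handle_page_fault(not_present, 0xC0123ABC);
    return retry ? 0 : 1;
}
```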
As to your last question (why a 32-bit CPU is said to be limited to 4 GB of virtual memory): this refers to the "Virtual Memory" of definition 2, and it is an immediate result of the pointer size. If the CPU can only use 32-bit pointers, you only have 32 bits to express different addresses, which gives you 2^32 bytes = 4 GB of addressable memory.
Hope this makes things a bit clearer.

IMHO it is terribly misleading to treat swap space as equivalent to virtual memory. VM is a concept much more general than swap space. Among other things, VM allows processes to reference virtual addresses during execution, which are translated into physical addresses with the support of hardware and page tables. Thus processes need not be concerned with how much physical memory the system has, or where the instruction or data actually resides in the physical memory hierarchy. The referenced item (instruction or data) may be resident in L1, or L2, or RAM, or finally on disk, in which case it is loaded into main memory.
Swap space is just a place on secondary storage where pages are kept while they are inactive. If there is not sufficient RAM, the OS may decide to swap out pages of a process to make room for another process's pages. The processor never executes instructions or reads/writes data directly from swap space.
Notice that it would be possible to have swap space in a system with no VM; that is, processes that directly access physical addresses could still have portions of their memory on disk.
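On Linux you can observe the RAM/swap split directly from a program. A minimal sketch using the Linux-specific sysinfo() call (the field names come from <sys/sysinfo.h>):

```c
#include <stdio.h>
#include <sys/sysinfo.h>  /* Linux-specific */

int main(void) {
    struct sysinfo si;
    if (sysinfo(&si) != 0) { perror("sysinfo"); return 1; }
    /* mem_unit scales the counters to bytes. */
    printf("Total RAM:  %lu MB\n", si.totalram  * si.mem_unit / (1024UL * 1024));
    printf("Total swap: %lu MB\n", si.totalswap * si.mem_unit / (1024UL * 1024));
    printf("Free swap:  %lu MB\n", si.freeswap  * si.mem_unit / (1024UL * 1024));
    return 0;
}
```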

Though the thread is quite old and has already been answered, I would still like to share this link, as it is the simplest explanation I have found so far. The link below also has diagrams for better visualization.
Key difference: Virtual memory is an abstraction of the main memory. It extends the available memory of the computer by storing the inactive parts of the RAM's content on disk. Whenever that content is required, it is fetched back into RAM. Swap memory, or swap space, is a part of the hard disk drive that is used for virtual memory. For this reason, the two terms are often used interchangeably.
Virtual memory is quite different from physical memory. Programmers get direct access to virtual memory rather than physical memory. Virtual memory is an abstraction of the main memory that hides the details of the system's real physical memory. It extends the available memory of the computer by storing the inactive parts of the RAM's content on a disk and fetching that content back into RAM when it is required. Virtual memory creates the illusion of a whole address space beginning at address zero. It is mainly valued as an optimization that reduces physical-memory requirements. It is composed of the available RAM and disk space.
Swap memory is generally called swap space. Swap space refers to the portion of virtual memory which is reserved as a temporary storage location. It is utilized when the available RAM cannot meet the system's memory requirements. For example, in the Linux memory system, the kernel locates each page either in physical memory or in the swap space, and it maintains a table tracking which pages have been swapped out and which are in physical memory.
Pages that have not been accessed for a long time are sent to the swap space area; this process is referred to as swapping out. If such a page is required again, it is swapped back into physical memory by swapping out a different page. Thus, one can conclude that swap memory and virtual memory are interconnected, as swap space is used to implement the virtual memory technique.
difference-between-virtual-memory-and-swap-memory

"Virtual memory" is a generic term. In Windows, it is called as Paging or pagination. In Linux, it is called as Swap.

Related

Memory Address Translation in OS

Is memory address translation only useful when the total size of virtual memory (summed over all processes) needs to be larger than physical memory?
Basically, the size of virtual memory depends on what you call "virtual memory". If you mean the virtual memory of one process, its size is bounded by the processor's address width. If you mean the whole virtual memory of all processes, then virtual memory can (technically) have a nearly unbounded size, because every process can have a whole address space to itself. The virtual address space of one process cannot be arbitrarily large, because the processor has a limited number of bits with which to address memory. In modern long mode, the processor uses only 48 bits to address memory at the byte level. That allows a very large address space, but most systems will have only 8 GB to 32 GB of RAM.
Technically, on an 8 GB RAM computer, every process could have 8 GB allocated. I say technically because, eventually, the computer would constantly be removing page frames from RAM, and that would put so much overhead on the OS and the computer that your system would freeze. In the end, the total size of the virtual memory of all processes is limited by the capacity of your system (and OS) to run an efficient page-swapping algorithm (and by your willingness to tolerate a slow system).
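A quick way to see that address space is cheap while RAM is not: on Linux you can reserve a large range without consuming physical memory until it is touched. A minimal sketch (assumes a 64-bit system; MAP_NORESERVE is Linux-specific):

```c
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 1UL << 33;  /* 8 GB of address space */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    printf("Reserved 8 GB of virtual address space at %p\n", p);
    /* No physical pages are consumed until something writes to them. */
    munmap(p, len);
    return 0;
}
```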
Now to answer your question: paging (virtual memory) is also used to avoid fragmentation and to secure the system. With the old segmentation model, fragmentation was an issue because you had to run a complex algorithm to determine which part of memory a process gets. With paging, the smallest granularity of memory is 4 KB. This makes everything much easier, because a small process just gets a 4 KB page and can work within that page however it wants, while a bigger process gets several pages and can allocate more by making a system call. Basically, paging solves the issue of external fragmentation, because a process can get a page anywhere it's available and it will not make a difference (except for high versus low memory, due to access latency). Internal fragmentation remains an issue with paging.
Paging also secures the system. With segmentation you had several levels of ring protection; with paging you have only user or supervisor. With segmentation, memory is not well protected, because one process can access the memory of another process in the same segment. With paging, there are two different protections. The first protection is the ring itself (user vs supervisor); the second is the page tables. The page tables isolate one process from another because memory accesses are translated to other positions in RAM. It is the job of the OS to fill the page tables properly so that one process doesn't have access to the physical memory allocated to another process. The user vs supervisor bit in the page tables prevents a process from accessing the kernel except via a system-call interface (the syscall instruction on x86).

Logical Address Space is Larger than Physical and Backing store combined

When a virtual address space is larger than the physical memory, the OS can use swapping to evict page frames (e.g. LRU eviction). The CPU generates a page fault, and the page that is on disk is then swapped into main memory. What happens when the virtual address space is so large that neither primary memory nor the disk has enough storage to hold it? What happens when a page frame is not on the disk either? Is another page fault raised?
What happens when the virtual address space is so large that neither primary memory nor the disk has enough storage to hold it?
A virtual memory system maintains an image of the logical address space in secondary storage. A well-designed operating system is not going to allow a process to map a logical address that does not already have backing in secondary storage. When your application calls a system service to map pages into the logical address space, the call will fail if there is no secondary storage available for those pages.
What happens when a page frame is not on the disk either?
There are some poorly designed operating systems that will map pages without having secondary storage behind them. You call the system service to map pages, and it succeeds even if the pages could not be backed by secondary storage.
In that case, you get a memory exception upon access (and your application gets no hint that the real problem was a memory allocation failure).
Is another page fault called?
No.
In a logical memory system (as supported by most processors) a page has two states:
1. Mapped
2. Unmapped
In a virtual memory system, there are three states:
1. Mapped
2. Unmapped and valid
3. Unmapped and invalid
When a page fault occurs, the processor just knows the page is not mapped to memory. The operating system then has to figure out whether the page is in secondary storage somewhere. If it is not, the operating system causes the process to see an exception. If it is, the operating system loads and maps the page, then lets the process continue on its merry way.
When a virtual address space is larger than the physical memory, OS can use swapping to evict page frames (e.g. LRU eviction)
Let's assume that a virtual address is 48 bits (so the size of one virtual address space is 256 TiB), and you're running 123 processes, each with its own virtual address space. This adds up to a total of 31488 TiB of virtual address space. Note: this is "very normal" for a modern 80x86 PC running a modern OS (Windows, Linux, ...).
Out of this 31488 TiB:
almost all of it will be unused and marked as "not present". If software tries to access it you get a page fault, the page fault handler realizes it's a bug, and you probably end up with a SIGSEGV (or "blue screen of death" or ...). Because it isn't being used, the OS doesn't need any RAM or any disk space for it.
some of it will be the same things loaded into RAM once and then mapped into many virtual address spaces. This is extremely common for the kernel itself and for shared libraries/DLLs. It also includes cases where the same RAM is used for the virtual file system cache and for memory mapped files, or the same RAM is mapped into 2 or more processes as "shared memory", or when the same RAM is mapped into 2 or more virtual address spaces as "copy on write" (e.g. in the aftermath of fork()).
some will be "allocate on write" - literally the same page full of zeros mapped at many virtual addresses in many virtual address spaces, where if you write to it you get a page fault and the page fault handler allocates a new page of RAM for the page you tried to write to. This allows the OS to pretend that a huge amount of virtual space is allocated and filled with zeros without using any RAM or any disk space (until it actually is modified).
some will be (modified) data that is unique to a specific process.
The end result is that the 31488 TiB of total virtual space might only need a few GiB of RAM (and probably won't use swap space at all).
Over-commit
The OS does a pile of tricks to pretend memory was allocated when it actually wasn't. This creates the potential for a worst case where all the memory the OS pretends is allocated actually does need to be allocated. There are 2 ways to deal with this:
a) Refuse to let processes allocate more if you can't cover the worst case (e.g. return a "not enough memory" error when a process tries to allocate more than the OS can supply). This is bad because the worst case is extremely unlikely and you end up with software failing for no reason ("not enough memory" when there's actually plenty of memory to cover current requirements).
b) Allow processes to allocate more (within reason), even if you can't cover the worst case. This works fine most of the time, but if the worst case actually happens something has to break (e.g. the OS terminates a process to free up some RAM).
The best option (in my opinion) is the first option (don't allow over-commit), but with a large amount of swap space. Essentially, this is like "allow over-commit of RAM, but don't allow over-commit of swap space + RAM", where the OS will probably be running slowly (due to excessive swap space use) before it has to start telling processes "no more memory", and where most of the time everything will be in RAM (and ideally swap space is only used to cover the unlikely worst case).
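On Linux this policy is an actual tunable. A minimal sketch that reads it from procfs (the documented values for vm.overcommit_memory are 0 = heuristic over-commit, 1 = always over-commit, 2 = strict accounting):

```c
#include <stdio.h>

int main(void) {
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (!f) { perror("fopen"); return 1; }
    int mode;
    if (fscanf(f, "%d", &mode) == 1)
        printf("vm.overcommit_memory = %d\n", mode);
    fclose(f);
    return 0;
}
```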

Are pages only secondary memory like hard drives, or are they used for RAM too?

I'm confused by this.
Are pages only memory units that exist in secondary memory or do they also exist in RAM too?
A memory page is the smallest unit of memory used by a virtual memory manager. A page can be backed by physical RAM, or by swap space or a page file on a hard drive. Pages backed by RAM have much faster I/O, but as RAM fills up the OS may have to swap pages out to the hard drive.
Pages do not exist [physically] at all. A page is simply a redirection mechanism.
The operating system sets up a linear, logical address space for each process. The logical address space is organized into pages that in turn may map to:
A physical page frame of memory
Nowhere
Somewhere on disk, managed by the operating system
Paging is a memory management scheme by which a computer stores and retrieves data from secondary storage for use in main memory. Pages are used in RAM too, as a solution to external fragmentation. External fragmentation is a situation where the total free space is enough to hold another process, but the available space is not contiguous. Compaction is one solution, but only for processes that can be relocated at run time. So paging is the real solution for external fragmentation: we implement a page table which gives the illusion that the process has been given contiguous memory. Every address from the CPU is broken down into a page number and an offset.
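To illustrate the last sentence, a minimal sketch that splits an address into page number and offset using the system's page size (the example address is arbitrary):

```c
#include <stdio.h>
#include <unistd.h>

int main(void) {
    long page_size = sysconf(_SC_PAGESIZE);  /* typically 4096 bytes */
    unsigned long vaddr = 0x12345;           /* arbitrary example address */
    printf("page size:   %ld bytes\n", page_size);
    printf("page number: %lu\n", vaddr / (unsigned long)page_size);
    printf("offset:      %lu\n", vaddr % (unsigned long)page_size);
    return 0;
}
```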

Difference between Logical memory and Physical memory

While studying the concept of paging in memory management, I came across the terms "logical memory" and "physical memory". Can anyone please tell me the difference between the two?
Does physical memory = hard disk and logical memory = RAM?
There are three related concepts here:
Physical -- An actual device
Logical -- A translation to a physical device
Virtual -- A simulation of a physical device
The term "logical memory" is rarely used because we normally use the term "virtual memory" to cover both the virtual and logical translations of memory.
In an address translation, we have a page index and a byte index into that page.
The page index to the Nth page of the process could be called logical memory. The operating system redirects the ordinal page number to some arbitrary physical address.
The reason this is rarely called logical memory is that the page may be simulated using paging, becoming a virtual address.
Address translation is a combination of logical and virtual. The normal usage is to just call the whole thing "virtual memory."
We can imagine that in the future, as memory grows, paging will go away entirely. Instead of having virtual memory systems we will have logical memory systems.
Not a lot of clarity here thus far, here goes:
Physical Memory is what the CPU addresses on its address bus. It's the lowest level software can get to. Physical memory is organized as a sequence of 8-bit bytes, each with a physical address.
Every application having to manage its memory at a physical level is obviously not feasible. So, since the early days, CPUs introduced abstractions of memory known collectively as "Memory Management." These are all optional, but ubiquitous, CPU features managed by your kernel:
Linear Memory is what user-level programs address in their code. It's seen as a contiguous addresses space, but behind the scenes each linear address maps to a physical address. This allows user-level programs to address memory in a common way and leaves the management of physical memory to the kernel.
However, it's not so simple. User-level programs address linear memory using different memory models. One you may have heard of is the segmented memory model. Under this model, programs address memory using logical addresses. Each logical address refers to a table entry which maps to a linear address space. In this way, the o/s can break up an application into different parts of memory as a security feature (details out of scope for here)
In Intel 64-bit (IA-32e, 64-bit submode), segmented memory is never used, and instead every program can address all 2^64 bytes of linear address space using a flat memory model. As the name implies, all of linear memory is available at a byte-accessible level. This is the most straightforward.
Finally we get to Virtual Memory. This is a feature of the CPU facilitated by the MMU, totally unseen to user-level programs, and managed by the kernel. It allows physical addresses to be mapped to virtual addresses, organized as tables of pages ("page tables"). When virtual memory ("paging") is enabled, tables can be loaded into the CPU, causing memory addresses referenced by a program to be translated to physical addresses transparently. Page tables are swapped in and out on the fly by the kernel when different programs are run. This allows for optimization and security in process/memory management (details out of scope for here)
Keep in mind, Linear and Virtual memory are independent features which can work in conjunction. If paging is disabled, linear addresses map one-to-one with physical addresses. When enabled, linear addresses are mapped to virtual memory.
Notes:
This is all linux/x86 specific but the same concepts apply almost everywhere.
There are a ton of details I glossed over
If you want to know more, read The Intel® 64 and IA-32 Architectures Software Developer Manual, from where I plagiarized most of this
I'd like to add a simple answer here.
Physical Memory: This is the memory that is actually present, and every process needs space here to execute its code.
Logical Memory:
To a user program the memory appears contiguous. Suppose a program needs 100 MB of space in memory: to this program, a virtual address space / logical address space starts from 0 and continues up to some finite number. These addresses are generated by the CPU, and the MMU then maps each virtual address to a real physical address through a page table or whatever other way the mapping is implemented.
Please correct me or add some more content here. Thanks!
Physical memory is RAM; it actually belongs to main memory. A logical address is the address generated by the CPU. In paging, a logical address is mapped to a physical address with the help of page tables. A logical address contains a page number and an offset.
An address generated by the CPU is commonly referred to as a logical address, whereas an address seen by the memory unit—that is, the one loaded into the memory-address register of the memory—is commonly referred to as a physical address
The physical address is the actual address of the frame where each page will be placed, whereas the logical address is the address generated by the CPU for each page.
What exactly is a frame?
Processes are retrieved from secondary memory and stored in main memory using the paging technique.
Processes are kept in secondary memory as non-contiguous pages, which implies they are stored in scattered locations.
Those non-contiguous pages are brought into main memory as frames by the paging mechanism of the operating system.
The operating system divides main memory into equal-sized frames, and processes retrieved from secondary memory are stored across them.
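To tie the terms together, here is a toy C sketch of the logical-to-physical mapping described above; the page-table contents and the example address are made-up values, and a real MMU does this in hardware:

```c
#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096
/* Toy page table: logical page number -> physical frame number. */
static const uint32_t page_table[] = { 7, 3, 12, 5 };

int main(void) {
    uint32_t logical  = 2 * PAGE_SIZE + 0x1AB;  /* page 2, offset 0x1AB */
    uint32_t page     = logical / PAGE_SIZE;
    uint32_t offset   = logical % PAGE_SIZE;
    uint32_t physical = page_table[page] * PAGE_SIZE + offset;
    printf("logical 0x%X -> physical 0x%X (frame %u)\n",
           (unsigned)logical, (unsigned)physical, (unsigned)page_table[page]);
    return 0;
}
```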

Where are multiple stacks and heaps put in virtual memory?

I'm writing a kernel and need (and want) to put multiple stacks and heaps into virtual memory, but I can't figure out how to place them efficiently. How do normal programs do it?
How (or where) are stacks and heaps placed into the limited virtual memory provided by a 32-bit system, such that they have as much growing space as possible?
For example, when a trivial program is loaded into memory, the layout of its address space might look like this:
[ Code Data BSS Heap-> ... <-Stack ]
In this case the heap can grow as big as virtual memory allows (e.g. up to the stack), and I believe this is how the heap works for most programs. There is no predefined upper bound.
Many programs have shared libraries that are put somewhere in the virtual address space.
Then there are multi-threaded programs that have multiple stacks, one for each thread. And .NET programs have multiple heaps, all of which have to be able to grow one way or another.
I just don't see how this is done reasonably efficiently without putting a predefined limit on the size of all the heaps and stacks.
I'll assume you have the basics of your kernel done: a trap handler for page faults that can map a virtual memory page to RAM. One level up, you need a virtual memory address space manager from which usermode code can request address space. Pick a segment granularity that prevents excessive fragmentation; 64 KB (16 pages) is a good number. Allow usermode code to both reserve and commit space. A simple bitmap of 4 GB / 64 KB = 64K x 2 bits to keep track of segment state gets the job done. The page fault trap handler also needs to consult this bitmap to know whether a page request is valid or not.
A stack is a fixed-size VM allocation, typically 1 megabyte. A thread usually needs only a handful of its pages, depending on function nesting level, so reserve the 1 MB and commit only the top few pages. When the thread nests deeper, it will trip a page fault and the kernel can simply map the extra page to RAM to allow the thread to continue. You'll want to mark the bottom few pages as special; when the thread page faults on those, you declare this website's name.
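In user space, the same reserve-plus-guard idea can be sketched with mmap() and mprotect() on Linux (the 4 KB page size and placing the guard at the low end, where a downward-growing stack would overflow, are assumptions):

```c
#include <stdio.h>
#include <sys/mman.h>

#define STACK_SIZE (1024 * 1024)  /* 1 MB reservation */
#define PAGE_SIZE  4096           /* assumed page size */

int main(void) {
    /* Reserve the full stack range... */
    void *base = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }
    /* ...then revoke access to the lowest page so that running off the
       end of the stack faults instead of silently corrupting memory. */
    if (mprotect(base, PAGE_SIZE, PROT_NONE) != 0) { perror("mprotect"); return 1; }
    printf("stack reserved at %p (guard page at the base)\n", base);
    munmap(base, STACK_SIZE);
    return 0;
}
```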
The most important job of the heap manager is to prevent fragmentation. The best way to do that is to create a lookaside list that partitions heap requests by size. Everything less than 8 bytes comes from the first list of segments, 8 to 16 bytes from the second, 16 to 32 from the third, and so on, increasing the bucket size as you go up. You'll have to play with the bucket sizes to get the best balance. Very large allocations come directly from the VM address manager.
The first time an entry in the lookaside list is hit, you allocate a new VM segment. You subdivide the segment into smaller blocks with a linked list. When such an allocation is released, you add the block to the list of free blocks. All blocks have the same size regardless of the program request so there won't be any fragmentation. When the segment is fully used and no free blocks are available you allocate a new segment. When a segment contains nothing but free blocks you can return it to the VM manager.
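The size-to-bucket mapping can be sketched in a few lines; the power-of-two boundaries here are an assumption, and a real allocator would tune them as the answer suggests:

```c
#include <stdio.h>
#include <stddef.h>

/* Map an allocation request to a size-class bucket:
   <= 8 bytes -> bucket 0, <= 16 -> bucket 1, <= 32 -> bucket 2, ... */
static int bucket_for(size_t size) {
    int bucket = 0;
    size_t limit = 8;
    while (size > limit) {
        limit *= 2;
        bucket++;
    }
    return bucket;
}

int main(void) {
    printf("7 bytes   -> bucket %d\n", bucket_for(7));    /* 0 */
    printf("16 bytes  -> bucket %d\n", bucket_for(16));   /* 1 */
    printf("100 bytes -> bucket %d\n", bucket_for(100));  /* 4 */
    return 0;
}
```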
This scheme allows you to create any number of stacks and heaps.
Simply put, as your system resources are always finite, you can't go limitless.
Memory management always consists of several layers, each with its own well-defined responsibility. From the perspective of the program, only the application-level manager is visible, and it is usually concerned with its own single allocated heap. A level above could deal with creating multiple heaps, if needed, out of its one global heap and assigning them to subprograms (each with its own memory manager). Above that could be the standard malloc()/free() that it uses, and above those the operating system dealing with pages and actual memory allocation per process (which is basically not concerned with multiple heaps, or even user-level heaps in general).
Memory management is costly, and so is trapping into the kernel. Combining the two could impose a severe performance hit, so what appears to be the actual heap management from the application's point of view is actually implemented in user space (in the C runtime library) for the sake of performance (and other reasons out of scope for now).
When loading a shared (DLL) library, if it is loaded at program startup, it will most probably be loaded alongside the CODE/DATA/etc. sections, so no heap fragmentation occurs. On the other hand, if it is loaded at runtime, there's pretty much no other choice than using up heap space.
Static libraries are, of course, simply linked into the CODE/DATA/BSS/etc sections.
At the end of the day, you'll need to impose limits on heaps and stacks so that they're not likely to overflow, but you can allocate others.
If one needs to grow beyond that limit, you can either:
Terminate the application with an error
Have the memory manager allocate/resize/move the memory block for that stack/heap, and most probably defragment the heap (at its own level) afterwards; that's why free() usually performs poorly.
Considering a pretty large 1 KB stack frame on every call as an average (which might happen if the application developer is inexperienced), a 10 MB stack would be sufficient for 10240 nested calls. BTW, besides that, there's pretty much no need for more than one stack and heap per thread.