Need of Virtual memory? [closed] - operating-system

I was recently asked: in a computer system where the primary memory (RAM) is comparable in size to the secondary memory (HDD), is there still a need to implement virtual memory?
Since paging and segmentation add processing overhead (for example on every context switch), would the benefits of virtual memory outweigh the overhead they require?
Can someone help me with this question?

It is true that with virtual memory, you are able to have your programs commit (i.e. allocate) a total of more memory than is physically available. However, this is only one of many benefits of having virtual memory, and it's not even the most important one. Personally, when I use a PC, I periodically check the task manager to see how close I come to using my actual RAM. If I constantly go over, I go and buy more RAM.
The key attribute of all OSes that use virtual memory is that every process has its own isolated address space. That means you can have a machine with 1 GB of RAM and 50 running processes, but each one will still have 4 GB of addressable memory space (32-bit OS assumed). Why is that important? It's not that you can "fake things out" and use RAM that isn't there; as soon as you go over and swapping starts, your virtual memory manager will begin thrashing and performance will come to a halt. A much more important implication is that if each program has its own address space, there's no way it can write to some random memory location and affect another program.
That's the main advantage: stability/reliability. In Windows 95, you could write an application that would crash the entire operating system. In W2K+, it is simply impossible to write a program that paves all over its own address space and crashes anything other than itself.
There are a few other advantages as well. When executables and DLLs are loaded into RAM, the virtual memory manager can detect when the same binary is loaded more than once and will make multiple processes share the same physical RAM. At the virtual memory level, it appears as if each process has its own copy, but at a lower level it all gets mapped to one spot. This speeds up program startup and also optimizes memory usage, since each DLL is only loaded once.
Virtual memory managers also allow you to perform file I/O by simply mapping files to pages in the virtual address space. In addition to being an interesting alternative for working with files, this also allows for shared memory segments, which is physical RAM with read/write pages intentionally shared between processes for extremely efficient inter-process communication (IPC).
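As a rough illustration of the file-mapping point above, here is a minimal sketch (assuming a POSIX system such as Linux; the file name "data.bin" is made up) of reading a file through the virtual address space with mmap() instead of read() calls:

    /* Sketch: letting the virtual memory manager do the file I/O via mmap().
     * Assumes a POSIX system; "data.bin" is an example file name. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("data.bin", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        /* Map the file's pages into our address space; pages are faulted
         * in from disk only when we actually touch them. */
        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* The file now looks like ordinary memory. */
        printf("first byte: 0x%02x\n", (unsigned char)p[0]);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }

The same mechanism, with MAP_SHARED and an anonymous or shm_open()-backed mapping, is the usual way the shared memory segments mentioned above are set up for IPC.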
With all these benefits, if we consider that most of the time you still want to shoot for having more physical RAM than your total commit size, and that modern CPUs have support for virtual address mapping built directly into the hardware, the overhead of having a virtual memory manager is actually very minimal. On the other hand, in environments where many applications from many different vendors run concurrently, isolated process address spaces are priceless.

I'm going to dump my understanding of this matter, with absolutely no background credentials to back it up. Gonna get downvoted? :)
First up, by saying primary memory is comparable to secondary memory, I assume you mean in terms of capacity (after all, accessing RAM is faster than accessing storage).
Now, as I understand it,
Random access memory is limited by the address space, which is the set of addresses the operating system can use to store things. A 32-bit operating system is limited to roughly 4 GB of RAM, while 64-bit operating systems are theoretically limited to about 16 exabytes of address space, although Windows 7 caps physical RAM at 192 GB for Ultimate edition, and Windows Server 2008 allows up to 2 TB.
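To make the address-space arithmetic concrete, here is a small illustrative sketch (real CPUs implement fewer than 64 physical address bits) that derives the theoretical address space from the pointer width of the machine it runs on:

    /* Sketch: pointer width determines the theoretical virtual address space.
     * 2^32 bytes = 4 GiB; 2^64 bytes = 16 EiB (about 18.4 exabytes). */
    #include <stdio.h>

    int main(void)
    {
        unsigned bits = (unsigned)(sizeof(void *) * 8);

        /* Use long double so 2^64 does not overflow a 64-bit integer. */
        long double bytes = 1.0L;
        for (unsigned i = 0; i < bits; i++)
            bytes *= 2.0L;

        printf("pointer width : %u bits\n", bits);
        printf("address space : %.0Lf bytes (%.1Lf GiB)\n",
               bytes, bytes / (1024.0L * 1024.0L * 1024.0L));
        return 0;
    }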
Of course, there are still multiple factors, such as
cost to manufacture RAM (a single 8 GB module is still in the hundreds of dollars)
DIMM slots on motherboards (I've seen boards with 4 slots)
But for the purpose of this discussion let us ignore these limitations, and talk just about space.
Let us talk about how applications nowadays deal with memory. An application does not know how much memory exists; for the most part, it simply requisitions it from the operating system. The operating system is the one responsible for managing which address ranges have been allocated to each application that is running. If it does not have enough, well, bad things happen.
But surely, with a theoretical address space measured in exabytes, you'd never run out?
Well, a famous person supposedly said long ago that we'd never need more than 640 KB of RAM.
Because most applications nowadays are greedy (they take as much as the operating system is willing to give), if you ran enough applications on a powerful enough computer, you could theoretically exceed the capacity of the physical memory. In that case, virtual memory would be required to make up the extra required memory.
So to answer your question: (in my humble opinion formed from limited knowledge on the matter,) yes you'd still need to implement virtual memory.
Obviously take all this and do your own research. I'm turning this into a community wiki so others can edit it or just delete it if it is plain wrong :)

Virtual memory working: it may not answer your whole question, but it seems like the answer to me.

Related

The need of virtual memory on 64 bit processors

What is the need for virtual memory on a 64-bit microprocessor? As far as I know it can address around 16 exabytes of memory, so why do we still need paging?
In addition to providing virtual memory, paging is used to enforce memory protection, providing separation between different applications and between applications and the operating system. Paging also allows different applications to use the same linear address to access different memory locations.
The memory pager is also capable of doing other very useful things, such as mapping a file to memory and reading only the blocks that are actually used from disk, mapping the same data into multiple processes with copy-on-write, giving each program only as much physical RAM as it actually uses, and implementing shared memory, memory-mapped I/O and virtualization.
The main reason to have virtual memory is to be able to work with more data than the system has physical memory, but most of the underlying infrastructure (with the significant exception of the algorithm for paging to disk) would be needed anyway, and it has hardware support.
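As a small sketch of the protection point above (assuming a POSIX system; the address used is arbitrary and almost certainly unmapped), touching memory the pager has not mapped for this process is caught by the MMU and turned into a fault instead of silently corrupting some other program's data:

    /* Sketch: the MMU turns an access to unmapped memory into a SIGSEGV
     * instead of letting the process scribble over memory it does not own. */
    #include <stdio.h>

    int main(void)
    {
        volatile int *wild = (volatile int *)0xDEADBEEF;  /* arbitrary, unmapped */
        printf("about to write to %p...\n", (void *)wild);
        *wild = 1;                    /* the kernel delivers SIGSEGV here */
        printf("this line is never reached\n");
        return 0;
    }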
In the future we may see paging to disk go away. One remaining problem is that we have 64-bit processors in systems with only, say, 8 GB of physical memory; as soon as you need more than 8 GB, you have to resort to paging out to disk. It should not be that long until we have computer systems with terabytes of memory, and paging to disk will not be necessary.
In that case we will need new operating systems, and even new computer systems, to take advantage of such large memories.

How do computers prevent programs from interfering with each other?

For example, I heard in class that global variables are just put in a specific location in memory. What is to prevent two programs from accidentally using the same memory location for different variables?
Also, do both programs use the same stack for their arguments and local variables? If so, what's to prevent the variables from interleaving with each other and messing up the indexing?
Just curious.
Most modern processors have a memory management unit (MMU) that provides the OS with the ability to create protected, separate memory sections for each process, including a separate stack for each process. With the help of the MMU, the processor can restrict each process to modifying or accessing only memory that has been allocated to it. This prevents one process from writing into another process's memory space.
Most modern operating systems use these features of the MMU to provide protection for each process.
Here are some useful links:
Memory Management Unit
Virtual Memory
This is something that modern operating systems do by loading each process into a separate virtual address space. Multiple processes may reference the same virtual address, but the operating system, helped by modern hardware, will map each one to a separate physical address and make sure that one process cannot access physical memory allocated to another process1.
1 Debuggers are a notable exception: operating systems often provide special mechanisms for debuggers to attach to other processes and examine their memory space.
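To illustrate the answers above, here is a minimal sketch (POSIX, using fork()) showing that two processes can use the exact same virtual address for a global variable without interfering, because the OS maps that address to different physical pages in each process:

    /* Sketch: the same virtual address holds different values in two
     * processes, because each process has its own page mappings. */
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int global = 1;   /* same virtual address in parent and child */

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {      /* child process */
            global = 42;     /* copy-on-write gives the child its own page */
            printf("child : &global=%p value=%d\n", (void *)&global, global);
            return 0;
        }
        wait(NULL);          /* let the child finish first */
        printf("parent: &global=%p value=%d\n", (void *)&global, global);
        return 0;
    }

Both lines print the same address, yet the parent still sees 1 while the child saw 42.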
The short answer to your question is that the operating system deals with these issues. They are very serious issues, and a significant part of an operating system's job is keeping everything in a separate space. The operating system tracks all the running programs and makes sure each is using its own space. This keeps the stacks separate too: each program runs on its own stack assigned by the OS. How the OS does this assignment is actually a complex task.

Application to manage its own Virtual Memory

I have a slight doubt regarding virtual memory.
Normally, it is up to the OS to provide virtual memory, using disk space to expand the amount of memory which appears to be available for applications.
The OS will free physical memory by copying the data to disk and restoring it when needed.
However, is it possible for an application to manage its own "virtual memory" rather than relying on the OS, for example by writing objects to a file and then destroying them?
If so, is it more advantageous to let the application manage its own virtual memory, or to let the OS provide it?
Most applications would not even be able to tell that they are being managed using virtual memory, because the operating system performs address translation on every memory request made by your application.
This is a task definitely best left to the operating system unless you are working in a very low-level environment (in which case you are probably writing your own operating system anyway).
Aside from the fact that this requires kernel privileges to accomplish, you would need to take care not to corrupt other processes' memory.
The operating system is the best place for this kind of logic.
It is not just not advantageous for the application to manage its own virtual memory; it is not possible with standard operating systems (Windows, Unix, Linux, Mac OS X, etc.).
Translation from virtual addresses to physical addresses is done by the memory management unit of the system, which is hardware, not part of the operating system software.
The only part of the process done by operating system software is the handling of page faults (swapping units of virtual memory to and from backing store), which happens when the address translation finds a reference to a virtual address that is not currently mapped into physical memory.
What could be advantageous is for an application to minimize its use of virtual memory by writing its own data out to disk rather than allocating larger amounts of virtual memory. However, this will only yield a benefit if the application's disk I/O is more efficient than the operating system page handler's disk I/O, which is an unlikely scenario these days.
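For completeness, here is a rough sketch of the questioner's idea (the struct and the scratch-file name are made up): the application itself "pages out" an object by writing it to a file and freeing the RAM, then reloads it on demand. This is the kind of work the OS pager already does, usually more efficiently and at page granularity.

    /* Sketch: application-level "paging" of one object to a scratch file. */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct { double values[1024]; } BigObject;   /* example payload */

    /* Write the object to disk so its RAM can be released. */
    static int spill(const BigObject *obj, const char *path)
    {
        FILE *f = fopen(path, "wb");
        if (!f) return -1;
        size_t n = fwrite(obj, sizeof *obj, 1, f);
        fclose(f);
        return n == 1 ? 0 : -1;
    }

    /* Bring the object back into memory when it is needed again. */
    static BigObject *reload(const char *path)
    {
        FILE *f = fopen(path, "rb");
        if (!f) return NULL;
        BigObject *obj = malloc(sizeof *obj);
        if (obj && fread(obj, sizeof *obj, 1, f) != 1) { free(obj); obj = NULL; }
        fclose(f);
        return obj;
    }

    int main(void)
    {
        BigObject *obj = calloc(1, sizeof *obj);
        if (!obj) return 1;
        obj->values[0] = 3.14;

        spill(obj, "object.swap");    /* application-level "page out" */
        free(obj);                    /* RAM goes back to the system  */

        obj = reload("object.swap");  /* application-level "page in"  */
        if (obj) { printf("restored value: %f\n", obj->values[0]); free(obj); }
        return 0;
    }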

Is Virtual memory really useful all the time?

Virtual memory is a good concept currently used by modern operating systems. But I was stuck answering a question and was not sure enough about it. Here is the question:
Suppose there are only a few applications running on a machine, such that the physical memory of the system is more than the memory required by all the applications. To support virtual memory, the OS needs to do a lot of work. So if the running applications all fit in the physical memory, is virtual memory really needed?
(Furthermore, the applications running together will always fit in RAM.)
Even when the memory usage of all applications fits in physical memory, virtual memory is still useful. VM can provide these features:
Privileged memory isolation (no app can touch the kernel or memory-mapped hardware devices)
Interprocess memory isolation (one app can't see another app's memory)
Static memory addresses (e.g. every app has main() at address 0x08000000)
Lazy memory (e.g. pages in the stack are allocated and zeroed only when first accessed; see the sketch after this list)
Redirected memory (e.g. memory-mapped files)
Shared program code (if more than one instance of a program or library is running, its code only needs to be stored in memory once)
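The lazy-memory point is easy to see in practice. Here is a minimal Linux-flavoured sketch (MAP_ANONYMOUS is assumed to be available) that reserves 1 GiB of address space but causes only a handful of physical pages to be allocated, because pages are backed by RAM only when first touched:

    /* Sketch: "lazy memory" - reserve 1 GiB, but touch only 8 pages.
     * Watch RSS in top/ps stay tiny while the mapping exists. */
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        size_t size = (size_t)1 << 30;            /* 1 GiB of address space */
        char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        long page = sysconf(_SC_PAGESIZE);
        for (int i = 0; i < 8; i++)
            p[(size_t)i * page] = 1;              /* each touch faults in one page */

        printf("reserved %zu bytes, touched %ld bytes; check RSS in top/ps\n",
               size, 8 * page);
        pause();                                  /* keep the process alive */
        return 0;
    }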
While not strictly needed in this scenario, virtual memory is about more than just providing "more" memory than is physically available (swapping). For example, it helps avoid memory fragmentation (from an application's point of view) and, depending on how dynamic/shared libraries are implemented, it can help avoid relocation (relocation is when the dynamic linker needs to adapt pointers in a library or executable that was just loaded).
A few more points to consider:
Buggy apps that don't handle failures in the memory allocation code
Buggy apps that leak allocated memory
Virtual memory reduces the severity of these bugs.
The other replies list valid reasons why virtual memory is useful, but I would like to answer the question more directly: no, virtual memory is not needed in the situation you describe, and not using virtual memory can be the right trade-off in such situations.
Seymour Cray took the position that "Virtual memory leads to virtual performance," and most (all?) Cray vector machines lacked virtual memory. This usually leads to higher performance at the process level (no translations needed, processes are contiguous in RAM) but can lead to poorer resource usage at the system level (the OS cannot utilize RAM fully since it gets fragmented at the process level).
So if a system is targeting maximum performance (as opposed to maximum resource utilization), skipping virtual memory can make sense.
When you experience the severe performance (and stability) problems often seen on modern Unix-based HPC cluster nodes when users oversubscribe RAM and the system starts to page to disk, there is a certain sympathy for the Cray model, where the process either starts and runs at maximum performance, or it doesn't start at all.

Why can't we build virtualization that runs on several machines? [closed]

You're probably familiar with virtualization, which takes a single host and is able to "emulate" many instances by sharing the resources among them all. You have probably heard about Xen.
Is it completely insane to imagine the "opposite" of Xen: a layer that would abstract several hosts into a single running instance? I believe this would allow building apps which wouldn't need to care much about a "clustering" layer themselves.
I wonder what are the technical limits to this, because I'm pretty sure some people are already working on it somewhere :)
The goal is NOT to achieve any kind of failure recovery. I believe this can (and should?) be handled at a higher level. For example, if someone is able to run a MySQL server on a gigantic instance (made of say 50 hosts), then one can easily use MySQL's replication features to replicate the database over a similar virtual instance.
Good question. Microsoft Azure is attempting to address this by allowing you to put applications "in the cloud" and not have to be as concerned with scalability up/down, redundancy, data storage, etc. But this is not accomplished at the hypervisor level.
http://www.microsoft.com/windowsazure/
Hardware-wise, there are some downsides to having everything be one big VM rather than many smaller ones. For one thing, software doesn't always understand how to handle all the resources. For example, some applications still can't handle multiple processor cores. I've seen informal benchmarks showing that IIS performs better spreading the same resources over multiple instances rather than one giant instance.
From a management perspective, it is probably better to have multiple VMs in certain cases. Imagine that a bad deployment corrupts a node. If that were your one and only (albeit giant) node, now your whole application is down.
You're probably talking about the concept Single System Image.
There used to be a Linux implementation, openMosix, which has since shut down. I don't know of any replacements. openMosix made it very easy to create and use SSI on a standard Linux kernel; too bad it got overtaken by events.
I do not know enough about Xen to know if it is possible but with VMware you can create pools of resources which come from many physical hosts. Then you can assign the resources to your VMs. That could be many VMs or just one VM.
Aggregation: Transform Isolated Resources into Shared Pools
Simulating a single core over multiple physical cores is very inefficient. You can do it, but it'll be slower than a cluster. Two physical cores can talk to each other in near-real-time; if they're on separate machines, it's like clocking your interconnect down by a factor of 10 or more, even if the two cores (and their RAM) are communicating over a fibre-optic network.
Two cores on the same die can communicate faster than two distinct CPUs on the same motherboard; if they are on separate machines, that's slower again; with multiple machines, slower still.
Basically you can, but there is a net performance loss compared to the net performance gain you would be hoping to achieve.
Real-life example: I had a bunch of VMs on a dual quad-core server (~2.5 GHz/core) performing way, way below what they should have been. On closer inspection, it turned out that the hypervisor was emulating a single 3.5-4 GHz core whenever the load on an individual VM exceeded 2.5 GHz; after limiting each VM to 2.5 GHz, performance went back to what was expected.
I agree with saidimu: you are talking about the Single System Image concept. In addition to the openMosix project, there have been several commercial implementations of the same idea (one contemporary example is ScaleMP). It's not a new idea.
I just wanted to elaborate on some of the technical points of SSI.
Basically, the reason it's not done is that the performance is generally absolutely unpredictable or terrible. There is a concept in computer systems known as NUMA, which basically means that the cost of accessing different pieces of memory is not uniform. This can apply to huge systems where CPU memory accesses may be routed to different chips, or to cases where memory is accessed remotely over a network (as in SSI). Typically, the operating system will attempt to compensate for this by laying out programs and data in memory in such a way that a program can run as quickly as possible, i.e. the code and data will all be placed in the same NUMA "region" and be scheduled on the closest possible CPU.
However, in cases where you are running big applications (attempting to use all the memory in your SSI), there is little the operating system can do to reduce the impact of remote memory fetches. MySQL is not aware that accessing page 0x1f3c will cost 8 nanoseconds, while accessing page 0x7f46 will stall it for hundreds of microseconds, possibly milliseconds, while the memory is fetched over the network. This means that non-NUMA-aware applications will run like crap (seriously, very badly) in this kind of environment. As far as I know, most contemporary SSI products rely on the fastest possible interconnects (such as InfiniBand) between machines to achieve even passable performance.
This is also why frameworks that expose the true cost of accessing data to the programmer (such as MPI, the Message Passing Interface) have gained more traction than SSI or DSM (distributed shared memory) approaches. In fact, there is basically no way for a programmer to optimize an application to run well in an SSI environment, which just sucks.
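To show what "exposing the cost to the programmer" looks like, here is a minimal MPI sketch in C (run with something like mpirun -np 2 ./a.out): data moves between nodes only when the code explicitly sends it, so remote accesses never hide behind a page fault.

    /* Sketch: explicit data movement with MPI point-to-point messages. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int value = 0;
        if (rank == 0) {
            value = 42;
            /* Rank 0 explicitly ships the data to rank 1. */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }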