I have read that Operating System is loaded in main memory (RAM) when computer boots up. Also, application programs are loaded into main memory (RAM) for execution. How do both of these run simultaneously in main memory? Does the operating system stop its execution when an application program is running?
I don't know of a good overview of these areas, so I'll try to help.
Memory (RAM) can be visualized as a collection of lockers. Each locker can store something independently of all other lockers. Each locker has a number, so you can find a particular locker easily. In RAM, the locker is a byte that can store a value between zero and 255, and the locker number is an address. Better than a locker; you can open the byte at address zero, then the byte at address 1000000 instantly. You don't have to walk down a long hallway. That is what the R in RAM refers to: Random, as in Random Access Memory. Essentially every location takes the same amount of time to access.
Machines have a lot of RAM, on the order of billions of bytes. Even very big operating systems do not need all of RAM; if they require 50 million bytes, that is only 50 / 1000 or 5% of what is now considered a small system. That leaves 950 million bytes for programs to use. If every program was as big as the operating system, you could run 950/50 = 19 of them. There are tricks to permit running even more.
One of the fundamental jobs of the operating system is to provision resources like RAM to applications, and make sure that applications cannot snoop on or modify each others RAM without prior arrangement. To do this, the operating system typically uses a trick where program addresses are indirectly translated to RAM addresses under control of the operating system. This way, all applications can think they have ram at (say) address 4194304. This trick is called an MMU (Memory Management Unit), and the details start to explode at this point.
Review:
RAM is a collection of places to store numbers, and each storage place has a unique address.
There is lots of RAM, so we just have to divvy it up between applications.
We can keep applications RAM separate and secret from other applications.
The operating system only uses a relatively small amount of RAM.
Related
Is Memory address translation only useful when the total size of virtual memory
(summed over all processes) needs to be larger than physical memory?
Basically, the size of virtual memory depends on what you call "virtual memory". If you call virtual memory the virtual memory of one process then virtual memory has the same size than physical memory. If you call virtual memory the whole virtual memory of all processes than virtual memory can (technically) have an infinite size. This is because every process can have a whole address space. The virtual address space of one process cannot be bigger than physical memory because the processor has limited bits to address this memory. In modern long mode the processor has only 48 bits to address RAM at the byte level. This gives a very big amount of RAM but most systems will have 8GB to 32GB.
Technically on a 8GB RAM computer, every process could have 8GB allocated. I say technically because eventually, the computer will constantly be removing page frames from RAM and that will put too much overhead on the OS and on the computer which will make your system freeze. In the end, the size of the sum of the virtual memory of every process is limited by the capacity of your system (and OS) to have an efficient page swapping algorithm (and on your willingness to have a slow system).
Now to answer your question, paging (virtual memory) is used also to avoid fragmentation and for securing the system. With the old segmentation model, fragmentation was an issue because you had to run a complex algorithm to determine which part of memory a process gets. With paging, the smallest granularity of memory is 4KB. This makes everything much easier because a small process just gets a 4KB page and the process can work in that page the way it wants. While a bigger process will get several pages and can allocate more pages by doing a system call. There is still the issue of external fragmentation but it is mostly due to latency of accessing high memory vs low memory. Basically, paging solves the issue of external fragmentation because a process can get a page anywhere (where it's available) and it will not make a difference (except for high vs low memory). There is still the issue of internal fragmentation with paging.
Paging also secures the system. With segmentation you had several levels of ring protection. With paging you have only user or supervisor. With segmentation, the memory is not well protected because one process can access the memory of another process in the same segment. With paging, there are 2 different protections. The first protection is the ring itself (user vs supervisor) the second are the page tables. The page tables isolate one process from another because the memory accesses are translated to other positions in RAM. It is the job of the OS to fill the page tables properly so that one process doesn't have access to the physical memory allocated to another process. The user vs supervisor bit in the page tables, prevent one process from accessing the kernel except via a system call interface (the instruction syscall in assembly for x86).
I am having a bit of trouble understanding how applications and data are accessed by the CPU from RAM after the application has been loaded into RAM and a file opened (thus data for the file also stored in RAM).
By my understanding, a CPU just gets instructions from RAM as the program counter ticks or carries out tasks after an interrupt. How then does it access the application and data. Is it that it doesn't and still just gets instructions (for example to load a file on the hard drive to be opened in the application) and processes any requests made by the application which are stored in RAM as instructions thereafter (like saving a file). Or does the application and data relating to an opened file (for example) just stay in RAM and not get accessed by the CPU at all.
Similarly, after reading an article, it said that a copy of the operating system is stored in RAM. The CPU can then access the operating system. (I thought the CPU just worked with instructions from RAM). How does it then communicate with the operating system and how are interrupts sent to the CPU, from the copy of the OS in RAM or from the OS in the hard drive.
Sorry if this is really confusing, alot i didn't understand.
Root of your question: Lack of clear differentiation between Computer's Hardware and Computer's Software.
Components of a Computer System
Just so that we are clear about both of them and that we understand their nature, let me state as follows:
Hardware: It includes CPU, RAM, Disk, Register, Graphics Card, Network Card, Memory BUS and everything that you can touch and call to be the 'Computer'. It is the body.
Software: It includes Operating System, Program, CPU instruction, Compiler, Programming Language and almost everything intangible about the computer. It is the soul.
Firmware: It is that basic code which is absolutely essential for hardware's working. This is stored on a Read Only Memory installed in the hardware itself. This piece of software is vital for hardware therefore is considered in the mid of hardware and software and hence called Firmware.
We will start with understanding from the time when we say that the computer is up and running and is properly executing our instructions. But at that time you will say - How did I reach here? So I will mention a few points about the startup of the computer.
When the power button is pressed...
...the most primitive and basic input output system (therefore called BIOS), which is hard written on the computer hardware begins execution. This is written on Read Only Memory and this starts the process to get the machine to stand on its own. And it loads the software (Operating System) from one piece of hardware (disks) into another piece of hardware (RAM and CPU registers) enabling the software to work properly with hardware.
Now the body and soul are together and the individual (machine) can work.
Until now, OS is already in RAM and CPU. (Read When the power button is pressed if you doubt it.) Let's handle your question paragraph by paragraph now -
First Paragraph
I am having a bit of trouble understanding how applications and data
are accessed by the CPU from RAM after the application has been loaded
into RAM and a file opened (thus data for the file also stored in
RAM).
The explanation is as follows:
The exact issue here is your thinking that it is CPU and RAM that access the data. CPU and RAM are only executing units.
It is OS (software) that accesses the data by means of CPU and RAM (hardware). It is in the realm of OS where applications are executed.
This is why you can install Linux and Windows on same hardware but cannot execute .exe files in Linux because OS does the execution and not RAM/CPU.
Further, how do CPU and RAM and disk physically interact to bring in the data, execute it, save it back etc. is in the domain of hardware. That would require explanation which involves logic gates (AND, OR, NOT...), diodes, circuitry and a hell lot of other things which an Electronics guy can explain.
Second Paragraph
By my understanding, a CPU just gets instructions from RAM as the
program counter ticks or carries out tasks after an interrupt. How
then does it access the application and data. Is it that it doesn't
and still just gets instructions (for example to load a file on the
hard drive to be opened in the application) and processes any
requests made by the application which are stored in RAM as
instructions thereafter (like saving a file).
As you have guessed it - CPU doesn't get instructions, Operating System does it through CPU. Also, just the way brain doesn't directly instruct the hands and legs to move and instead uses nerves for interaction, the CPU doesn't tell the disks to give/take the data. CPU works with RAM and registers only. Multiple units of hardware work in conjunction to provide a path for data and instruction to travel. The important pieces of involved hardware are:
Processor (CPU and registers built in the CPU)
Cache
Memory (RAM)
Disk
Tape
I like the image provided in this answer. This image not only lists the hardware pieces but also illustrates the mammoth difference in the execution speed of these pieces.
Let's move on to the...
Third Paragraph
Similarly, after reading an article, it said that a copy of the
operating system is stored in RAM. The CPU can then access the
operating system. (I thought the CPU just worked with instructions
from RAM). How does it then communicate with the operating system and
how are interrupts sent to the CPU, from the copy of the OS in RAM or
from the OS in the hard drive.
By now you already know that indeed OS is present in RAM and CPU registers. That is where it lives. That is from where it tells the CPU how to work. If OS would be small enough (or if Registers and Caches would be big enough), the OS would live even closer to CPU.
The CPU does not communicate with the OS. It can't. It is the worker that is controlled by a boss. OS is that boss.
CPU cannot access Operating System. CPU is the body, OS is the soul. Soul tells the body what to do, not vice-versa.
CPU doesn't work with instructions from RAM. It merely executes the instructions given by the Operating System (which may be living in RAM). So even when there is an instruction to load some module of OS into the RAM, it is not RAM/CPU but OS itself that issues that instruction.
Interrupts are of two types - Hardware and Software - and your query is about the software interrupts. Since the executive part of OS is in the RAM, in simple words we can say that interrupts are sent to CPU from OS living in RAM.
Conclusions
The lack of distinction between hardware and software is the basic cause of your confusions. Take some course about Operating Systems on Coursera or Academic Earth for deeper understanding.
It is confusing indeed. Let me try to explain.
CPU and RAM
The CPU is hardwired to the RAM via the 'motherboard', and they work together. The CPU can perform many instructions, but it has to be told what to do by instructions in RAM. The CPU is basically in a loop: all it does it fetch the next instruction from RAM and execute it, over and over.
So how does this RAM get filled with instructions?
BIOS (basic input/output system)
When the computer first boots up, a portion of RAM is filled with data from a chip on the motherboard (the BIOS chip), and the CPU is turned on and starts processing. These are the factory settings.
The data from the BIOS chip that is copied to RAM consists of a library of instructions to access hardware devices (hard disks, CD/ROM, USB storage, network cards etc.),
and a program using that library to load what is called the bootsector, the first sector on the boot device, into RAM, and transfer control to it (with a jump instruction).
BOOTLOADER
The bootsector data that the BIOS program loaded from the boot device is very small - only 440 bytes - but with the help of the BIOS library, this is enough to be able to load more sectors and execute these. The bootsector and the data it loads is called the bootloader, which is in charge of loading the Operating System.
In effect, the bootloader is a more dynamic version of the BIOS: the BIOS program resides in flash memory, whereas the bootloader resides on hard disks, USB sticks, SSD drives etc., and thus can be larger and more complex.
OPERATING SYSTEM
In it's turn, The operating system (OS) is simply a more advanced version of the bootloader, as it can load and run multiple programs from multiple locations at the same time.
--
The BIOS knows about drives.
The Bootloader knows about drives and partitions.
The OS knows about drives, partitions, and file systems.
CPU,as you've noticed, reads the program from RAM, instruction by instruction. When an instruction is executed, it might refer to data stored in memory, which it either fetches explicitly to the registers (internal storage of the CPU, quite small - on x86_64 that's like several 64-bit registers + other stuff like segment registers, IP, SP etc) with a separate instruction, or the data read from the memory (we are talking about small amount of data). That's all it really does.
Loading a file from a disk would be done by asking the appropriate controller to fetch the data into a specific place in memory. CPU is connected to buses which will carry instructions to appropriate controllers.
As to interrupts these are special things - CPU has several interrupt lines which can be activated by various devices, for example your network card. When it receives such an interrupt, it is usually handled by an interrupt handler, which is just a program located in a well-known place in memory. They can be registered by, for example, operating system. Each interrupt line has its own interrupt handler. When interrupt happens, the CPU saves the current state of the program it happens to be executing, handles interrupt, restores the state and resumes the program.
You seem to be asking about addressing modes. At the risk of gross oversimplification (ignoring caching, segments, and logical memory), memory stored as a sequential array accessed by an integer address.
The CPU has a number of internal storage areas called registers. We will call them R0 to Rn. The processor assigns some registers dedicated purposes. One of those registers is the PC.
One common addressing mode is deferred. I indicate this mode as (Rn). An instruction like this:
MOV (R0), R1
uses the value contained in R0 as a memory address, fetches the value stored that memory location, and stores a copy of that value in R1.
An instruction sequence like this:
MOV (R0), R1
MOV (R2), R3
is stored in memory as data (ignoring protection), code, data, and variables all use the same type of memory. In other words, any memory location can be interpreted as code, data, or variable.
The CPU executes the next instruction located at (PC). After executing the instruction, the CPU automatically increments the PC to point to the next instruction.
I'm studying OS and I have a doubt about physical and logical addresses.
If logical addresses do not exist in real and are only used to indicate physical addresses, why do we use logical addresses at all? Why not directly physical address?
Thanks in advance!
One good example of why an operating system uses a logical address is the concept of virtual memory. For example, a process can be running in Windows which requires 100MB of RAM to execute. Without virtual memory, if this amount of RAM were not available, the process could not run. With virtual memory, the Windows OS can tell the process that the memory it needs is available. However, the OS cannot expose 100MB of physical memory because it does not exist. Instead, the OS will expose 100MB of logical memory. Some or all of this memory may not map to a physical address. Instead, it might map to disk or another location.
Another reason is fragmentation.
Lets say you have 100 MB of memory and the first three processes need 20 MB each. You give them the memory they want, they run and then the second one terminates. You are left with 60 MB of free memory, but any process that wants a sequential address space of 50 MB can't have it.
Using logical addresses gives you that ability.
One of the major benefits of using logical addresses (and in fact, similar thinking is true for most naming abstractions such as domain names in the networking context, file descriptors, etc.) is that programs can be written in a way that is agnostic to the actual layout of physical memory. That is, you can write your program such that data is stored in say, address 0xdeadbeef (this is a logical address), and your program would work just fine (the MMU or a boot loader that performs binary translation would convert the logical address into an appropriate physical address at run-time or at boot time).
In the above scenario, if your program were written to use physical addresses from the get go, you would run the risk of running into conflicts with other processes that use the same address to store data (e.g., other instances of your program).
I have a question,
is it possible for multiple processor machine to access data from RAM ( single ram system ) ?
for eg machine has 2 processors p1, p2 which are executing in parallel , is it possible that they access same ram for read and write ( ofcos write is not on same location )
i understand that in multi core machines it will not be possible since data bus is shared.
As long as the RAM is mapped to all cores or processors (such as in a multi-threaded application) it may be accessed from any core or processor.
There is no difference if you're discussing single processor/single core, single processor/multi-core, multi-processor/(each with a) single-core or multi-processor/multi-core. Since they have no system RAM of their own - the RAM in caches is not system RAM - the only RAM available to all of them is the system RAM.
The only difference between multi-processor/single core (as in older systems) and single processor/multi-core (modern systems) is that the former needs to coordinate RAM accesses with off-chip logic whereas for the latter all coordination is on-chip and sometimes even on-die which of course results in much faster and more electronically efficient RAM accesses.
In the case of AMD's multi-processor/multi-core solutions each processor owns part of the system RAM. The processors themselves are interconnected with high-speed data (HyperTransport) channels to facilitate accesses to RAM not owned by the processor accessing it.
In any case it is up to the programmer to decide how the processors/cores access the RAM. Sure they can read and/or write to the same location if that is what the programmer wants.
Can any one please make me clear what is the difference between virtual memory and swap space?
And why do we say that for a 32-bit machine the maximum virtual memory accessible is 4 GB only?
There's an excellent explantation of virtual memory over on superuser.
Simply put, virtual memory is a combination of RAM and disk space that running processes can use.
Swap space is the portion of virtual memory that is on the hard disk, used when RAM is full.
As for why 32bit CPU is limited to 4gb virtual memory, it's addressed well here:
By definition, a 32-bit processor uses
32 bits to refer to the location of
each byte of memory. 2^32 = 4.2
billion, which means a memory address
that's 32 bits long can only refer to
4.2 billion unique locations (i.e. 4 GB).
There is some confusion regarding the term Virtual Memory, and it actually refers to the following two very different concepts
Using disk pages to extend the conceptual amount of physical memory a computer has - The correct term for this is actually Paging
An abstraction used by various OS/CPUs to create the illusion of each process running in a separate contiguous address space.
Swap space, OTOH, is the name of the portion of disk used to store additional RAM pages when not in use.
An important realization to make is that the former is transparently possible due to the hardware and OS support of the latter.
In order to make better sense of all this, you should consider how the "Virtual Memory" (as in definition 2) is supported by the CPU and OS.
Suppose you have a 32 bit pointer (64 bit points are similar, but use slightly different mechanisms). Once "Virtual Memory" has been enabled, the processor considers this pointer to be made as three parts.
The highest 10 bits are a Page Directory Entry
The following 10 bits are a Page Table Entry
The last 12 bits make up the Page Offset
Now, when the CPU tries to access the contents of a pointer, it first consults the Page Directory table - a table consisting of 1024 entries (in the X86 architecture the location of which is pointed to by the CR3 register). The 10 bits Page Directory Entry is an index in this table, which points to the physical location of the Page Table. This, in turn, is another table of 1024 entries each of which is a pointer in physical memory, and several important control bits. (We'll get back to these later). Once a page has been found, the last 12 bits are used to find an address within that page.
There are many more details (TLBs, Large Pages, PAE, Selectors, Page Protection) but the short explanation above captures the gist of things.
Using this translation mechanism, an OS can use a different set of physical pages for each process, thus giving each process the illusion of having all the memory for itself (as each process gets its own Page Directory)
On top of this Virtual Memory the OS may also add the concept of Paging. One of the control bits discussed earlier allows to specify whether an entry is "Present". If it isn't present, an attempt to access that entry would result in a Page Fault exception. The OS can capture this exception and act accordingly. OSs supporting swapping/paging can thus decide to load a page from the Swap Space, fix the translation tables, and then issue the memory access again.
This is where the two terms combine, an OS supporting Virtual Memory and Paging can give processes the illusion of having more memory than actually present by paging (swapping) pages in and out of the swap area.
As to your last question (Why is it said 32 bit CPU is limited to 4GB Virtual Memory). This refers to the "Virtual Memory" of definition 2, and is an immediate result of the pointer size. If the CPU can only use 32 bit pointers, you have only 32 bit to express different addresses, this gives you 2^32 = 4GB of addressable memory.
Hope this makes things a bit clearer.
IMHO it is terribly misleading to use the concept of swap space as equivalent to virtual memory. VM is a concept much more general than swap space. Among other things, VM allows processes to reference virtual addresses during execution, which are translated into physical addresses with the support of hardware and page tables. Thus processes do not concern about how much physical memory the system has, or where the instruction or data is actually resident in the physical memory hierarchy. VM allows this mapping. The referenced item (instruction or data) may be resident in L1, or L2, or RAM, or finally on disk, in which case it is loaded into main memory.
Swap space it is just a place on secondary memory where pages are stored when they are inactive. If there is no sufficient RAM, the OS may decide to swap-out pages of a process, to make room for other process pages. The processor never ever executes instruction or read/write data directly from swap space.
Notice that it would be possible to have swap space in a system with no VM. That is, processes that directly access physical addresses, still could have portions of it on
disk.
Though the thread is quite old and has already been answered. Still would like to share this link as this is the simplest explanation I have found so far. Below link has got diagrams for better visualization.
Key Difference: Virtual memory is an abstraction of the main memory. It extends the available memory of the computer by storing the inactive parts of the content RAM on a disk. Whenever the content is required, it fetches it back to the RAM. Swap memory or swap space is a part of the hard disk drive that is used for virtual memory. Thus, both are also used interchangeably.
Virtual memory is quiet different from the physical memory. Programmers get direct access to the virtual memory rather than physical memory. Virtual memory is an abstraction of the main memory. It is used to hide the information of the real physical memory of the system. It extends the available memory of the computer by storing the inactive parts of the RAM's content on a disk. When the content is required, it fetches it back to the RAM. Virtual memory creates an illusion of a whole address space with addresses beginning with zero. It is mainly preferred for its optimization feature by which it reduces the space requirements. It is composed of the available RAM and disk space.
Swap memory is generally called as swap space. Swap space refers to the portion of the virtual memory which is reserved as a temporary storage location. Swap space is utilized when available RAM is not able to meet the requirement of the system’s memory. For example, in Linux memory system, the kernel locates each page in the physical memory or in the swap space. The kernel also maintains a table in which the information regarding the swapped out pages and pages in physical memory is kept.
The pages that have not been accessed since a long time are sent to the swap space area. The process is referred to as swapping out. In case the same page is required, it is swapped in physical memory by swapping out a different page. Thus, one can conclude that swap memory and virtual memory are interconnected as swap memory is used for the technique of virtual memory.
difference-between-virtual-memory-and-swap-memory
"Virtual memory" is a generic term. In Windows, it is called as Paging or pagination. In Linux, it is called as Swap.