Where are Logical addresses located? - operating-system

I know that user program generates logical addresses.Suppose there is a small code snippet in C .When address is printed,the addresses are virtual addresses.My question is where are those addresses fetched from?where exactly do the allocated values and variables stay?At main memory or secondary memory?If main memory then why there is physical address?

User mode programs only see logical addresses. Only the operating system (kernel mode) sees physical memory.
My question is where are those addresses fetched from?
Those are the logical addresses assigned by the program loader ad linker.
where exactly do the allocated values and variables stay?At main memory or secondary memory?
In a virtual memory system, it may be in main memory or secondary storage.
If main memory then why there is physical address?
It is a logical address that is mapped to a physical address using page tables.

I am kind of new to the computer architecture and Operating System,
but I will try to answer as much as I can. As far as I have
understood about the logical address (Which I still have trouble
understanding, about where it is fetched from or where it is stored.
I mean these addresses (numbers) gotta be stored somewhere,
otherwise CPU can't generate it by itself, right?), these addresses
are assigned by CPU or a processor and depends on the CPU
architecture. Each process is assigned a virtual/logical
address. And this logical address is translated to physical address
by Memory Management Unit of CPU (MMU).
Where exactly do the allocated values and variables stay? As user3344003 said, it may be in main memory or secondary storage.
If main memory then why there is physical address? The reason lies
in the concept of Virtual Memory. Each process has its own virtual
address and a page table. Process's logical address are mapped
through this page table to the Physical memory (RAM). Whatever that
logical address is, it gets mapped to Physical address. If the
Physical memory gets full, then OS evicts some of the less used
or unused process to Secondary storage and puts the needed process
in the RAM. That way multiple process can run at the same time.
Every process assumes that they have all the space in RAM just for
themselves. If not for virtual memory, then physical memory would be
full and process might crash and may shut down the OS as well.
Hope it helps. I am still learning, if my understanding about Logical address and virtual memory is wrong then Please comment.

Related

How does Cpu generate the logical address for the disk

If you think question is not proper please edit or make it correct, i am asking what i google and extract from the internet.
Cpu generates the logical address which is converted into physical address but the question here is how does the cpu generates the logical address for the data that is stored on the disk.
Cpu generates the logical address which is converted into physical address but the question here is how does the cpu generates the logical address for the data that is stored on the disk.
It doesn't, at least not the way you're thinking it does.
Normally a program tries to access memory at a virtual address, but the CPU sees "virtual address isn't present" and complains to the OS (kernel) via. a page fault. The page fault handler figures out what went wrong, loads the data from disk into RAM, maps the RAM into the virtual address space, then lets the program continue/retry as if nothing happened. The second time the CPU tries to execute the code the data is in RAM so it works fine.
Of course the OS has to know the reason why data at a virtual address wasn't present, which means that the OS has to keep track of extra information that the CPU doesn't have - if the virtual address actually isn't valid at all (e.g. NULL), or if the data is in swap space (and where), or if the data is part of a memory mapped file (and which offset of which file).
There is Virtual address space and a physical address space. virtual address space defines the address space of the program. say there is a program of 4GB. in that case we can represent the address space for that program as 32 bits. (2^32 = 4GB) from 0 to 0xFFFFFFFF.
this is the space the program thinks it has.
while compilation of the program, the program is given logical addresses based on the address space of the program.
after loading in to the memory. the program counter that is assign to this program will point to these addresses (logical/virtual addresses) and cpu will only want to fetch these addresses where the program instruction's are. cpu doesn't know where are the instructions located in the memory. that is up to MMU to translate the addresses.
the main thing is CPU doesn't actually generate these addresses, these are the addresses that where given to the program while compilation, using these, the instructions in the program reference each other. so cpu just see what program counter is pointing and generate / asks for these instruction. which are located in the physical memory.
when ever a address for fetching data or operand , instruction pointed by the PC, cpu call for these addresses.

What is the purpose of Logical addresses in operating system? Why they are generated

I want to know that why the CPU generates logical addresses and then maps them into Physical addresses with the help of memory manager? Why do we need them.
Virtual addresses are required to run several program on a computer.
Assume there is no virtual address mechanism. Compilers and link editors generate a memory layout with a given pattern. Instruction (text segment) are positioned in memory from address 0. Then are the segments for initialized or uninitialized data (data and bss) and the dynamic memory (heap and stack). (see for instance https://www.geeksforgeeks.org/memory-layout-of-c-program/ if you have no idea on memory layout)
When you run this program, it will occupy part of the memory that will no longer be available for other processes in a completely unpredictable way. For instance, addresses 0 to 1M will be occupied, or 0 to 16k, or 0 to 128M, it completely depends on the program characteristics.
If you now want to run concurrently a second program, where will its instructions and data go to memory? Memory addresses are generated by the compiler that obviously do not know at compile time what will be the free memory. And remember memory addresses (for instructions or data) are somehow hard-coded in the program code.
A second problem happens when you want to run many processes and that you run out of memory. In this situations, some processes are swapped out to disk and restored later. But when restored, a process will go where memory is free and again, it is something that is unpredictable and would require modifying internal addresses of the program.
Virtual memory simplifies all these tasks. When running a process (or restoring it after a swap), the system looks at free memory and fills page tables to create a mapping between virtual addresses (manipulated by the processor and always unchanged) and physical addresses (that depends on the free memory on the computer at a given time).
Logical address translation serves several functions.
One of these is to support the common mapping of a system address space to all processes. This makes it possible for any process to handle interrupt because the system addresses needed to handle interrupts are always in the same place, regardless of the process.
The logical translation system also handles page protection. This makes is possible to protect the common system address space from individual users messing with it. It also allows protecting the user address space, such as making code and data read only, to check for errors.
Logical translation is also a prerequisite for implementing virtual memory. In an virtual memory system, each process's address space is constructed in secondary storage (ie disk). Pages within the address space are brought into memory as needed. This kind of system would be impossible to implement if processes with large address spaces had to be mapped contiguously within memory.

Location of OS Kernel Data

I'm a beginner with operating systems, and I had a question about the OS Kernel.
I'm used to the standard notion of each user process having a virtual address space of stack, heap, data, and code. My question is that when a context switch occurs to the OS Kernel, is the code run in the kernel treated as a process with a stack, heap, data, and code?
I know there is a dedicated kernel stack, which the user program can't access. Is this located in the user program address space?
I know the OS needs to maintain some data structures in order to do its job, like the process control block. Where are these data structures located? Are they in user-program address spaces? Are they in some dedicated segment of memory for kernel data structures? Are they scattered all around physical memory wherever there is space?
Finally, I've seen some diagrams where OS code is located in the top portion of a user program's address space. Is the entire OS kernel located here? If not, where else does the OS kernel's code reside?
Thanks for your help!
Yes, the kernel has its own stack, heap, data structures, and code separate from those of each user process.
The code running in the kernel isn't treated as a "process" per se. The code is privileged meaning that it can modify any data in the kernel, set privileged bits in processor registers, send interrupts, interact with devices, execute privileged instructions, etc. It's not restricted like the code in a user process.
All of kernel memory and user process memory is stored in physical memory in the computer (or perhaps on disk if data has been swapped from memory).
The key to answering the rest of your questions is to understand the difference between physical memory and virtual memory. Remember that if you use a virtual memory address to access data, that virtual address is translated to a physical address before the data is fetched at the determined physical address.
Each process has its own virtual address space. This means that some virtual address a in one process can map to a different physical address than the same virtual address a in another process. Virtual memory has many important uses, but I'm not going to go into them here. The important point is that virtual memory enforces memory isolation. This means that process A cannot access the memory of process B. All of process A's virtual addresses map to some set of physical addresses and all of process B's virtual addresses map to a different set of physical addresses. As long as the two sets of physical addresses do not overlap, the processes cannot see or modify the memory of each other. User processes cannot access physical memory addresses directly - they can only make memory accesses with virtual addresses.
There are times when two processes may have some virtual addresses that do map to the same physical addresses, such as if they both mmap the same file, both use a shared library, etc.
So now to answer your question about kernel address spaces and user address spaces.
The kernel can have a separate virtual address space from each user process. This is as simple as changing the page directory pointer in the cr3 register (in an x86 processor) on each context switch. Since the kernel has a different virtual address space, no user process can access kernel memory as long as none of the kernel's virtual memory addresses map to the same physical addresses as any of the virtual addresses in any address space for a user process.
This can lead to a minor problem. If a user process makes a system call and passes a pointer as a parameter (e.g. a pointer to a buffer in the read system call), how does the kernel know which physical address corresponds to that buffer? The virtual address in the pointer maps to a different physical address in kernel space, so the kernel cannot just dereference the pointer. There are two options:
The kernel can traverse the user process page directory/tables to find the physical address that corresponds to the buffer. The kernel can then read/write from/to that physical address.
The kernel can instead include all of its mappings in the user address space (at the top of the user address space, as you mentioned). Now, when the kernel receives a pointer through the system call, it can just access the pointer directly since it is sharing the address space with the process.
Kernels generally go with the second option, since it's more convenient and more efficient. Option 1 is less efficient because each time a context switch occurs, the address space changes, so the TLB needs to be flushed and now you lose all of your cached mappings. I'm simplifying things a bit here since kernels have started doing things differently given the recent Meltdown vulnerability discovered.
This leads to another problem. If the kernel includes its mappings in the user process address space, what stops the user process from accessing kernel memory? The kernel sets protection bits in the page table that cause the processor to prohibit the user process from accessing the virtual addresses that map to physical addresses that contain kernel memory.
Take a look at these slides for more information.
I'm used to the standard notion of each user process having a virtual address space of stack, heap, data, and code. My question is that when a context switch occurs to the OS Kernel, is the code run in the kernel treated as a process with a stack, heap, data, and code?
One every modern operating system I am aware there is NEVER a context switch to the kernel. The kernel executes in the context of a process (some systems user the fiction of a reduced process context.
The "kernel" executes when a process enters kernel mode through an exception or an interrupt.
Each process (thread) normally has its own kernel mode stack used after an exception. Usually there is a single single interrupt stack for each processor.
https://books.google.com/books?id=FSX5qUthRL8C&pg=PA322&lpg=PA322&dq=vax+%22interrupt+stack%22&source=bl&ots=CIaxuaGXWY&sig=S-YsXBR5_kY7hYb6F2pLGjn5pn4&hl=en&sa=X&ved=2ahUKEwjrgvyX997fAhXhdd8KHdT7B8sQ6AEwCHoECAEQAQ#v=onepage&q=vax%20%22interrupt%20stack%22&f=false
I know there is a dedicated kernel stack, which the user program can't access. Is this located in the user program address space?
Each process has its own kernel stack. It is often in the user space with protected memory but could be in the system space. The interrupt stack is always in the system space.
Where are these data structures located? Are they in user-program address spaces?
They are generally in the system space. However, some systems do put some structures in the user space in protected memory.
Are they in some dedicated segment of memory for kernel data structures?
If they are in the user space, they are generally for an access mode more privileged than user mode and less privileged than kernel mode.
Are they scattered all around physical memory wherever there is space?
Thinks can be spread over physical memory pretty much at random.
The data structures in questions are usually regular C structures situated in the RAM allotted to the kernel by the kernel allocator
They are not usually accessible from regular processes becuase of normal mechanisms for memory protection and paging (virtual memory)
A kind of exception to this are kernel threads which have no userspace address space so the code they execute is always the kernel code working with the kernel space data structures hence with the isolated kernel memory
Now for the interesting part: 64-bit Linux uses a thing called Direct Map for memory organization, which means that the full amount of physical memory available is mapped in the kernel page tables as just one contiguous chunk. This is not true for 32-bit as the HIGHMEM was used to avoid the limitation of 4GB address spaces
Since the kernel has all the physical RAM visible and available to its own allocator, the kernel data structures in question can be situated pretty randomly with respect to the physical addresses
You can google on there terms to gain additional information:
PTI (page table isolation)
__copy_from_user (esp. on esoteric architectures where this function is not just a bitwise copy)
EPT (Intel nested paging in virtual machines)

Operating Systems, Memory management

I'm studying OS and I have a doubt about physical and logical addresses.
If logical addresses do not exist in real and are only used to indicate physical addresses, why do we use logical addresses at all? Why not directly physical address?
Thanks in advance!
One good example of why an operating system uses a logical address is the concept of virtual memory. For example, a process can be running in Windows which requires 100MB of RAM to execute. Without virtual memory, if this amount of RAM were not available, the process could not run. With virtual memory, the Windows OS can tell the process that the memory it needs is available. However, the OS cannot expose 100MB of physical memory because it does not exist. Instead, the OS will expose 100MB of logical memory. Some or all of this memory may not map to a physical address. Instead, it might map to disk or another location.
Another reason is fragmentation.
Lets say you have 100 MB of memory and the first three processes need 20 MB each. You give them the memory they want, they run and then the second one terminates. You are left with 60 MB of free memory, but any process that wants a sequential address space of 50 MB can't have it.
Using logical addresses gives you that ability.
One of the major benefits of using logical addresses (and in fact, similar thinking is true for most naming abstractions such as domain names in the networking context, file descriptors, etc.) is that programs can be written in a way that is agnostic to the actual layout of physical memory. That is, you can write your program such that data is stored in say, address 0xdeadbeef (this is a logical address), and your program would work just fine (the MMU or a boot loader that performs binary translation would convert the logical address into an appropriate physical address at run-time or at boot time).
In the above scenario, if your program were written to use physical addresses from the get go, you would run the risk of running into conflicts with other processes that use the same address to store data (e.g., other instances of your program).

Diff. between Logical memory and Physical memory

While understanding the concept of Paging in Memory Management, I came through the terms "logical memory" and "physical memory". Can anyone please tell me the diff. between the two ???
Does physical memory = Hard Disk
and logical memory = RAM
There are three related concepts here:
Physical -- An actual device
Logical -- A translation to a physical device
Virtual -- A simulation of a physical device
The term "logical memory" is rarely used because we normally use the term "virtual memory" to cover both the virtual and logical translations of memory.
In an address translation, we have a page index and a byte index into that page.
The page index to the Nth path in the process could be called a logical memory. The operating system redirects the ordinal page number into some arbitrary physical address.
The reason this is rarely called logical memory is that the page made be simulated using paging, becoming a virtual address.
Address transition is a combination of logical and virtual. The normal usage is to just call the whole thing "virtual memory."
We can imagine that in the future, as memory grows, that paging will go away entirely. Instead of having virtual memory systems we will have logical memory systems.
Not a lot of clarity here thus far, here goes:
Physical Memory is what the CPU addresses on its address bus. It's the lowest level software can get to. Physical memory is organized as a sequence of 8-bit bytes, each with a physical address.
Every application having to manage its memory at a physical level is obviously not feasible. So, since the early days, CPUs introduced abstractions of memory known collectively as "Memory Management." These are all optional, but ubiquitous, CPU features managed by your kernel:
Linear Memory is what user-level programs address in their code. It's seen as a contiguous addresses space, but behind the scenes each linear address maps to a physical address. This allows user-level programs to address memory in a common way and leaves the management of physical memory to the kernel.
However, it's not so simple. User-level programs address linear memory using different memory models. One you may have heard of is the segmented memory model. Under this model, programs address memory using logical addresses. Each logical address refers to a table entry which maps to a linear address space. In this way, the o/s can break up an application into different parts of memory as a security feature (details out of scope for here)
In Intel 64-bit (IA-32e, 64-bit submode), segmented memory is never used, and instead every program can address all 2^64 bytes of linear address space using a flat memory model. As the name implies, all of linear memory is available at a byte-accessible level. This is the most straightforward.
Finally we get to Virtual Memory. This is a feature of the CPU facilitated by the MMU, totally unseen to user-level programs, and managed by the kernel. It allows physical addresses to be mapped to virtual addresses, organized as tables of pages ("page tables"). When virtual memory ("paging") is enabled, tables can be loaded into the CPU, causing memory addresses referenced by a program to be translated to physical addresses transparently. Page tables are swapped in and out on the fly by the kernel when different programs are run. This allows for optimization and security in process/memory management (details out of scope for here)
Keep in mind, Linear and Virtual memory are independent features which can work in conjunction. If paging is disabled, linear addresses map one-to-one with physical addresses. When enabled, linear addresses are mapped to virtual memory.
Notes:
This is all linux/x86 specific but the same concepts apply almost everywhere.
There are a ton of details I glossed over
If you want to know more, read The Intel® 64 and IA-32 Architectures Software Developer Manual, from where I plagiarized most of this
I'd like to add a simple answer here.
Physical Memory : This is the memory that is actually present and every process needs space here to execute their code.
Logical Memory:
To a user program the memory seems contiguous,Suppose a program needs 100 MB of space in memory,To this program a virtual address space / Logical address space starts from 0 and continues to some finite number.This address is generated by CPU and then The MMU then maps this virtual address to real physical address through some page table or any other way the mapping is implemented.
Please correct me or add some more content here. Thanks !
Physical memory is RAM; Actually belongs to main memory. Logical address is the address generated by CPU. In paging,logical address is mapped into physical address with the help of page tables. Logical address contains page number and an offset address.
An address generated by the CPU is commonly referred to as a logical address, whereas an address seen by the memory unit—that is, the one loaded into the memory-address register of the memory—is commonly referred to as a physical address
The physical address is the actual address of the frame where each page will be placed, whereas the logical address is the address generated by the CPU for each page.
What exactly is a frame?
Processes are retrieved from secondary memory and stored in main memory using the paging storing technique.
Processes are kept in secondary memory as non-contiguous pages, which implies they are stored in random locations.
Those non-contiguous pages are retrieved into main Memory as a frame by the paging operating system.
The operating system divides the memory frame size equally in main memory, and all processes retrieved from secondary memory are stored concurrently.