When a page fault occurs in demand paging, where is the faulting page brought in from: the backing store or virtual memory? - operating-system

I'm confused about the difference between the backing store and virtual memory. I'm currently studying memory management in OS.

In some contexts (e.g. window systems) the term "backing store" refers to off-screen copies of bitmapped displays; I've rarely seen it used with virtual memory.
In a virtual memory system, every page of a process's address-space regions has a disk mapping somewhere. That is a prerequisite. That disk image of the process is the "virtual memory", and it is where the faulting page is loaded from.

Related

How is a program loaded in OS

I am reading about logical and physical addressing. I am confused: when a binary file is run, does it pass first through the CPU, where the logical addresses are generated, or is it copied directly to physical memory?
when a binary file is run, does it pass first through the CPU where the logical addresses are generated or is it directly copied to the physical memory?
Typically some code somewhere loads the executable file's headers into memory, then uses information from the headers to figure out where the various pieces of the file (sections, e.g. .text, .data, etc.) should be in virtual memory and what each virtual page's virtual permissions should be (whether writes are allowed, whether execution is allowed).
After this, areas of the virtual address space are set up. Often this is done by memory-mapping the relevant parts of the file into the virtual address space, without actually loading them into physical memory. In this case each page's actual permissions don't reflect the page's virtual permissions. For example, a "read/write" page might be "not present" initially; when software tries to read from the page you'll get a page fault, and the page fault handler might fetch the page from disk and change it to "present, read only". Later, when software tries to write to the page, you might get a second page fault, and the page fault handler might do a "copy on write" so that anything else using the same physical page isn't affected, then make the new copy "read/write" so that it matches the original virtual permissions.
While this is happening, the OS could (depending on the amount of free physical RAM and whether storage devices have more important data to transfer) be prefetching the file data from disk (e.g. into the VFS cache), and could be "opportunistically" updating the process's page tables to avoid the overhead of page faults for pages that have already been prefetched.
However, if the OS knows that the file is on unreliable and/or removable media, it may decide that using memory-mapped files is a bad idea and may actually load the needed sections of the executable into memory before executing it. An OS could also have other features that cause the file to be loaded into RAM before it's executed. For example, if the OS checks that an executable file's digital signature is correct before allowing the file to be executed, then the entire file probably needs to be loaded into memory so that the digital signature can be checked, and in that case the entire file is likely to still be in memory when the virtual address space is set up.
You need to read an entire book on these topics and spend several weeks on it.
Operating Systems: Three Easy Pieces is a good book, and it is freely downloadable.
Once you have read it, perhaps look also into osdev.org for practical things. And don't forget free-software OSes such as Linux, e.g.
https://kernelnewbies.org/
Be aware of copy-on-write and virtual address space...
An executable file is generally an interpreted program itself, executed by the loader. The executable contains instructions that tell the loader how the program should exist in VIRTUAL memory; that is, the instructions in the executable define the initial VIRTUAL representation of the process address space.
So when the executable starts, there is only a virtual representation of the address space in secondary storage. As the program begins executing, it page-faults repeatedly to load pages into memory. After the initial load, the page-fault rate dies down.
The executable NORMALLY only contains logical addresses.

why do we need page cache when there is virtual memory

Simple question about OS. I know virtual memory handles the memory mapping for us. When some data we need is not in memory, the VM system pages it in, copying the data into main memory; and if we are short of memory it also pages out some stale pages to disk. My question is: since virtual memory already handles this, why do we need a page cache? It seems to me the VM already makes main memory a cache for the disk.
Virtual memory: see the whole address space (RAM) as your own, bigger than the actual memory (swapping to disk as necessary).
Page cache: caching the contents of files that are stored on some drive (a filesystem thing).

How does my operating system get information about disk size, RAM size, CPU frequency, etc

I can see in my OS information about my hard disk, RAM and CPU, but I've never told my OS this info.
How does my OS know it?
Is there some place in the hard disk or CPU or RAM that stores this kind of information?
Is there some standard about the format of this kind of information?
SMBIOS (formerly known as DMI) contains much of this information. SMBIOS is a data structure/API that is part of the BIOS/UEFI firmware and contains info like the brand and model of the computer, etc.
The rest is gathered by the OS querying hardware directly.
Answer grabbed from superuser by Mokubai.
You don't need to tell it, because each device already knows how (or has a way) to identify itself.
If you start from the idea that every device is accessed via address and data lines (in some cases only data lines), then you come to the realisation that on those data lines you need some kind of "protocol" that determines just how you talk to those devices.
In amongst that protocol you have commands that say "read this", "send that" or "put this over there". It is also relatively easy to have a command that says "identify yourself" which, rather than reading a block of disk or memory or painting a pixel a particular colour, returns a premade string or set of strings that tell the driver or operating system what that device is. Using a series of identity commands you can discover a device's type, its capabilities and which driver might be able to work with it.
You don't need to tell a device what it is, because it already knows. And you don't need to tell the operating system what it is because it can ask the device itself.
You don't tell people what they're called and how they talk, you ask them.
Each device has its own protocol for these messages, and devices don't store the details of other devices, because doing so would be insane and near useless given that you can remove any device at any time. Your hard drive doesn't need to store information about your memory or graphics card, beyond what the driver the operating system uses to talk to it needs.
The PC UEFI specification defines a core set of system facilities that every computer has, allowing the processor to be powered up and a program stored in an EEPROM to begin the absolute basic system probing necessary to determine the processor, set up the RAM, find a disk and a display, and thus continue to boot the computer.
From there the UEFI firmware hands over to the operating system, which has more detailed probing and identification procedures, but it all starts from the most basic "I have a processor, what is around me?" situation.

Implementing larger page tables

I have studied in my operating systems class that larger page tables can be implemented by paging the page tables themselves. When virtual memory is implemented this way, does memory access really become slow when the page tables are kept on a secondary storage device? I want to know the reason why keeping large page tables on a secondary storage device slows down memory access.
Even smaller page tables can be (and have been) implemented using paging. The architectural issue is how to get around the chicken and egg problem of page tables being in virtual memory while referring to physical memory. A number of techniques have been developed to deal with that.
Paged page tables only slow down memory access when a page table access causes a page fault. Once the page fault is serviced, subsequent references do not trigger a page fault (unless the table is paged out again). Paging the page tables does not slow down the system constantly.

In Mongodb how does the cache get warmed up?

I'm wondering how the cache (memory) gets warmed up. I understand that MongoDB uses memory mapped files and the OS's virtual memory to swap pages in and out as needed. What I don't understand is how it gets warmed up on startup.
Upon startup, does mongod map all of the pages in the database into virtual memory, or is there some other mechanism to load pages that are not yet mapped, which get mapped as queries are run against the database?
Similarly, is the size of the database limited to the amount of virtual memory available to the system? I understand that on a 64-bit system this is a lot. Is there a mechanism other than memory mapping for pages to be moved to and from disk?
Memory mapping means that there is a representation of all the on-disk files available, but only a portion of those files may be present in RAM. When a given page is needed and is not in RAM, it can be read from disk into RAM to be accessed.
Regarding limitations, you can see them on the MongoDB limits page
MongoDB does not do any specific "warming" of pages on startup, as it has no concept of which pages would be useful and which would not.
If you wish to "warm" certain collections manually before using them, you should look at the touch command.