How is a program loaded in OS

How is a program loaded in OS - operating-system

I am reading about logical and physical addressing. I am confused that when a binary file is run, does it pass first through the CPU where the logical address are generated or is it directly copied to the physical memory ?

when a binary file is run, does it pass first through the CPU where the logical address are generated or is it directly copied to the physical memory ?
Typically some code somewhere loads the executable file's headers into memory, and then uses information from the headers to figure out where various pieces of the file (sections - e.g. .text, .data, etc) should be in virtual memory and what each virtual page's virtual permissions should be (if writes are allowed, if execution is allowed).
After this, areas of the virtual address space are set up. Often this is done by memory mapping the relevant part of the file into the virtual address space, without actually loading them into physical memory. In this case each page's actual permissions don't reflect the page's virtual permissions (e.g. a "read/write" page might be "not present" initially, and when software tries to read from the page you'll get a page fault and the page fault handler might fetch the page from disk and change the page to "present, read only"; and then later when software tries to write to the page you might get a second page fault and the page fault handler might do a "copy on write" so that anything else using the same physical page isn't effected and then make the new copy "read/write" so that it matches the original virtual permissions).
While this is happening; the OS could (depending on amount of free physical RAM and whether storage devices have more important data to transfer) be prefetching the file data from disk (e.g. into VFS cache), and could be "opportunistically" updating the process' page tables to avoid the overhead of page faults for pages that have been prefetched.
However; if the OS knows that the file was on unreliable and/or removable media it may decide that using memory mapped files is a bad idea and may actually load the needed executable's sections into memory before executing it; and an OS could have other features that cause the file to be loaded into RAM before it's executed (e.g. if the OS checks that an executable file's digital signature is correct before allowing the file to be executed, then the entire file probably needs to be loaded into memory to allow the digital signature can be checked, and in that case the entire file is likely to still be in memory when virtual address space is set up).

You need to read an entire book on these topics and spend several weeks on that
But Operating Systems: Three Easy Pieces is a good book, and it is freely downloadable.
Once you have read it, look perhaps also into osdev.org for practical things. And don't forget free software OSes such as Linux, e.g.
https://kernelnewbies.org/
Be aware of copy-on-write and virtual address space...

An executable files is generally an interpreted program itself that is executed by the loader. The executable contains instructions that tell the loader how the program should exist in VIRTUAL memory. By that, I mean the instructions in the executable define how the initial VIRTUAL representation of process address space.
So when the executable starts, there is only a virtual representation of the address space in secondary storage. As the program executes, it starts page faulting repeatedly to load pages into memory. After the initial load, the page fault rate dies down.
The executable NORMALLY only contains logical addresses.

Related

What is the difference between File System and File Management in an operating system?

I’ve found about file management explanations and file system explanations and “file system is part of file management” explanations. But I am wondering if they are the same or two different things? Because I cannot seem to find an article about them.

A modern Operating System, to be portable, must be file system independent. i.e.: It should not matter what type of storage format a given media device contains. At the same time, a media device must contain a specific type of storage format to contain files and folders, and at the same time be Operating System independent.
For example, an OS should be able to handle any file locally, allowing the actual transfer of these files from physical media to the OS (and visa-versa) to be managed by the file system manager. Therefore, an OS can be completely independent of how the file was stored on the media.
With this in mind, there are at least two layers, usually more, of management between the file being viewed and the file on the physical media. Here is a (simple) list of layers that might be used from top down.
OS App viewing the file
OS File Manager
OS File System Manager (allowing multiple file systems)
Specific File System Driver
Media Device Driver
When a call to read a file is made, the app (1) calls the OS File manager (2), which in turn--due to the opening of the file--calls the correct OS File System Manager (3), which then calls the Specific File System Driver (4), which then calls the Media Device Driver (5) for the actual access.
Please note that any or all could have a working cache manager which means calls are processed and returned without calling lower layers. e.g.: each read more than requested in anticipation of another read.
By having multiple layers like this, you can have any (physical) file system and/or media device you wish and the OS would be none the wiser. All you need is a media driver for the specific physical device and a file system manager for the physical format of the contents of the media. As long as these layers all support the common service calls, any format of media and content on that media will be allowed by the OS.

When we have page fault in demand paging , where do we get the faulty page from? From the Backing store or from the virtual memory?

I'm confused between the difference of backing store and virtual memory. Currently studying memory management in OS

Generally, the term "backing store" refers to bitmapped displays. I've never seen it used with virtual memory.
In a virtual memory system, every page of the process address space region has a disk mapping somewhere. That is a prerequisite. That disk image of the process is the "virtual memory." That is where the page is loaded from.

How does my operating system get information about disk size, RAM size, CPU frequency, etc

I can see from my OS the informations about my hard disk, RAM and CPU. But I've never told my OS these info.
How does my OS know it?
Is there some place in the hard disk or CPU or RAM that stores this kind of information?
Is there some standard about the format of this kind of information?

SMBIOS (formerly known as DMI) contains much of this information. SMBIOS is a a data structure/API that is part of the BIOS/UEFI firmware and contains info like brand and model of the computer, etc.
The rest is gathered by the OS querying hardware directly.

Answer grabbed from superuser by Mokubai.
You don't need to tell it because each device already knows (or has a way) to identify itself.
If you get the idea that every device is accessed via address and data lines, and in some cases only data lines then you come to the relaisation that in those data lines you need some kind of "protocol" that determines just how you talk to those devices.
In amongst that protocol you have commands that say "read this" and "send that" or "put this over there". It is also relatively easy to have a command that says "identify yourself" which, rather than reading a block of disk or memory or painting a pixel a particular colour, will return a premade string or set of strings that tell the driver or operating system what that device is. Using a series of identity commands you could discover a device type, it's capabilities and what driver might be able to work with it.
You don't need to tell a device what it is, because it already knows. And you don't need to tell the operating system what it is because it can ask the device itself.
You don't tell people what they're called and how they talk, you ask them.
Each device has it's own protocol for these messages, and they don't store the details of other devices because to do so would be insane and near useless given that you can remove any device at any time. Your hard drive doesn't need to store information about your memory or graphics card except for the driver that the operating system uses to talk to it with.
The PC UEFI specification would define a core set of system specifications that every computer has, allowing the processor to be powered up and for a program stored in an EEPROM to begin the asbolute basic system probing necessary to determine the processor, set up the RAM, find a disk and display and thus continue to boot the computer.
From there the UEFI system would hand over to the operating system which would have more detailed probing and identification procedures, but it all starts at the most basic "I have a processor, what is around me?" situation.

Can you modify your own page tables (OS related)?

I am currently learning about Virtual Memory within the OS. I recently learned that access rights are stored in the page tables and so I am wondering if you can modify your own page tables? Does the hardware enforce protection from this?

Yes, you can modify your page tables—to some degree. Most operating system have system services to allow you to map and unmap pages to your address space (thus modifying your page tables).
Because page tables are stored in the system address space invariably with access limited to kernel mode, you have to modify the page tables in kernel mode. That means doing it through a system service that executes in kernel mode.
Of course you are limited to the types of modification you can make by the system services.

No, you (as user code) cannot directly modify the page tables for your processs, or any other process.
The page tables are manages exclusively by the kernel. They are stored in physical memory which is not mapped into userspace.
The hardware (specifically the MMU) enforces this protection just as it protects all of the kernel data and code.

In Mongodb how does the cache get warmed up?

I'm wondering how the cache (memory) gets warmed up. I understand that MongoDB uses memory mapped files and the OS's virtual memory to swap pages in and out as needed. What I don't understand is how it gets warmed up on startup.
Upon startup does mongod map all of the pages in the database to virtual memory or is there some other mechanism to load pages that are not yet mapped which get mapped as queries are run against the database?
Similarly, is the size of the database limited to the amount of virtual memory available to the system. I understand that on a 64-bit system this is a lot. Is there another mechanism other than memory mapping for pages to moved to and from disk?

Memory mapping means that there is a representation of all the on disk files available but only a portion of these files may be present in RAM. When a given page is needed (and it is not in RAM) it can be read from disk into RAM to be accessed.
Regarding limitations, you can see them on the MongoDB limits page
MongoDB does not do any specific "warming" of pages on startup, as it does not have any concept of which pages would be useful and which not.
If you wish to "Warm" certain collections manually before using them you should look at the touch command.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse