What is the largest file size one can store on a disk that uses inodes and disk blocks


I have a question here, but I do not know how to calculate the maximal size of a file that one can store on a disk that uses inodes and disk blocks.
Assuming a page size of 4096 bytes, a page table entry that points to a frame takes 8 bytes (4
bytes for the pointer plus some flags), and a page table entry that points to another page table
takes 4 bytes, how many levels of page tables would be required to map a 32-bit address space if
each level page table must fit into a single page?
What is the maximal file size one can store on a disk that uses inodes and disk blocks that store 4096 bytes? Each inode can store 10 entries, and the first inode reserves the last two entries for cascading inodes.
For the first part of the question, I got the total number of levels is 3, but I do not know how to do the second part.
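As a sanity check on the first part, here is a minimal sketch of the level count. It assumes the lowest-level tables hold the 8-byte entries that point to frames and every higher level holds 4-byte entries that point to other tables; the function name and parameters are made up for illustration.

```python
def page_table_levels(va_bits: int = 32, page_size: int = 4096,
                      leaf_entry: int = 8, inner_entry: int = 4) -> int:
    """Count the page table levels needed when each table must fit in one page."""
    offset_bits = page_size.bit_length() - 1                  # log2(4096) = 12
    leaf_bits = (page_size // leaf_entry).bit_length() - 1    # 512 entries -> 9 bits
    inner_bits = (page_size // inner_entry).bit_length() - 1  # 1024 entries -> 10 bits

    remaining = va_bits - offset_bits - leaf_bits  # bits left after the leaf level
    levels = 1
    while remaining > 0:
        remaining -= inner_bits                    # each extra level resolves up to 10 bits
        levels += 1
    return levels


print(page_table_levels())  # 3
```

With these numbers the 20-bit page number splits as 9 bits (leaf) + 10 bits + 1 bit, which is where the third level comes from.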

What you're describing sounds like the EXT filesystem.
EXT3 uses a total of 15 pointers.
The first 12 entries are direct: they point directly to data blocks. The third-to-last entry is a level 1 indirect: it points to a block filled entirely with direct pointers to data blocks. The second-to-last entry is a level 2 indirect: it points to a block completely full of level 1 indirects. The last entry is a level 3 indirect: it points to a block full of level 2 indirects.
The maximum file size on this system is usually a restriction of the operating system, and is usually between 16GB and 2TB.
The theoretical maximum is 12I + I^2/P + I^3/P^2 + I^4/P^3, where I is the block size in bytes (typically 4096, though different values are possible), and P is the pointer size in bytes (4), so that I/P is the number of pointers that fit in one block. This yields a maximum theoretical size of 4,402,345,721,856 bytes.
[Figure: EXT3 inode pointer structure]
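The formula above can be checked with a short sketch. It assumes 12 direct pointers plus one single, one double and one triple indirect pointer, as described; the function and parameter names are only illustrative.

```python
def max_file_size(block_size: int = 4096, pointer_size: int = 4, direct: int = 12) -> int:
    """Theoretical maximum file size in bytes for an EXT-style inode."""
    pointers_per_block = block_size // pointer_size  # 1024 with the defaults
    # Data blocks reachable through each kind of pointer.
    blocks = (
        direct                       # direct pointers
        + pointers_per_block         # single indirect
        + pointers_per_block ** 2    # double indirect
        + pointers_per_block ** 3    # triple indirect
    )
    return blocks * block_size


print(max_file_size())  # 4402345721856
```

This is only the limit imposed by the pointer structure; as noted above, the operating system usually imposes a lower one.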

Related

How large does the FAT structure and how large is the file?

Consider the following parameters of a FAT based filesystem:
Blocks are 8KB (2^13 bytes) large
FAT entries are 32 bits wide, of which 24 bits are used to store a block address
A. How large does the FAT structure need to be to accommodate a 1GB (2^30 bytes) disk?
B. What is the largest theoretical file size supported by the FAT structure from part (A)?

A. How large does the FAT structure need to be to accommodate a 1GB (2^30 bytes) disk?
The FAT file system splits the space into clusters, then has a table (the File Allocation Table or FAT, in effect a "cluster allocation table") with an entry for each cluster (to say if it's free, faulty or which cluster is the next cluster in a chain of clusters). To work out the size of the "cluster allocation table", divide the total size of the volume by the size of a cluster (to determine how many clusters there are, and so how many entries are in the "cluster allocation table"), then multiply by the size of one entry, then maybe round up to a multiple of the cluster size or not (depending on which answer you want: actual size or space consumed).
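A minimal sketch of that calculation, using the numbers from the question (8 KB clusters, 32-bit entries) and assuming the whole 2^30-byte disk is one volume. The function name and the rounding flag are illustrative only.

```python
def fat_size_bytes(volume_bytes: int, cluster_bytes: int, entry_bytes: int,
                   round_to_cluster: bool = False) -> int:
    """Size of the allocation table: one entry per cluster in the volume."""
    clusters = volume_bytes // cluster_bytes
    size = clusters * entry_bytes
    if round_to_cluster:
        # Round up to a whole number of clusters (space consumed on disk).
        size = -(-size // cluster_bytes) * cluster_bytes
    return size


# 2^30 / 2^13 = 2^17 clusters, times 4 bytes per entry.
print(fat_size_bytes(2**30, 2**13, 4))  # 524288 (512 KiB)
```

Here 512 KiB is already a whole number of 8 KB clusters (64 of them), so rounding does not change the result.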
B. What is the largest theoretical file size supported by the FAT structure from part (A)?
The largest file size supported is determined by either (whichever is smaller):
the size of the "file size" field in the file's directory entry (which is 32-bit for FAT32, so the limit would be just under 4 GiB, i.e. 2^32 - 1 bytes); or
the total size of the space minus the space consumed by the hidden/reserved/system area, cluster allocation table, directories and faulty clusters.
For a 1 GiB volume formatted with FAT32, the max. size of a file would be determined by the latter ("total space - sum of areas not usable by the file").
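The "whichever is smaller" rule can be sketched as below. The overhead figure is a stated assumption: it counts only the allocation table for 8 KB clusters with 4-byte entries and ignores reserved areas, directories and faulty clusters, so the real limit would be somewhat lower.

```python
def max_file_bytes(volume_bytes: int, overhead_bytes: int, size_field_bits: int = 32) -> int:
    """Smaller of the directory-entry limit and the space actually available."""
    field_limit = 2 ** size_field_bits - 1        # largest value the size field can hold
    available = volume_bytes - overhead_bytes     # space left for file data
    return min(field_limit, available)


volume = 2**30                        # 1 GiB volume
fat_bytes = (volume // 2**13) * 4     # assumed overhead: the allocation table only
print(max_file_bytes(volume, fat_bytes))  # 1073217536
```

For a volume this small the available space is the binding limit; the 32-bit size field only matters once the volume has more than about 4 GiB of usable space.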
Note that if you have a 1 GiB disk, this might (e.g.) be split into 4 partitions and a FAT file system might be given a partition with a fraction of 1 GiB of space. Even if there is only one partition for the "whole" disk, typically (assuming "MBR partitions" and not the newer "GPT partitions" which takes more space for partition tables, etc) the partition begins on the second track (the first track is "reserved" for MBR, partition table and maybe "boot manager") or a later track (e.g. to align the start of the partition to a "4 KiB physical sector size" and avoid performance problems caused by "512 logical sector size").
In other words, the size of the disk has very little to do with the size of the volume used for FAT; and when questions only tell you the size of the disk and don't tell you the size of the partition/volume you can't provide accurate answers.
What you could do is state your assumptions clearly in your answer, for example:
"I assume that a "1 GB" disk is 1000000 KiB (1024000000 bytes, and not 1 GiB or 1073741824 bytes, and not 1 GB or 1000000000 bytes); and I assume that 1 MiB (1024 KiB) of disk space is consumed by the partition table and MBR and all remaining space is used for a single FAT partition; and therefore the FAT volume itself is 998976 KiB."

Page table entry size - why a power of 2?

I solved some question, where the page table entry size needed only 26 bits - 22 for the physical address, and 4 for dirty bits and such. However it was rounded up to 32 - because 26 is not a power of 2. Must be something simple I'm missing but why do we have to do that? Thanks!

I think here that you need to realize that the page table entry needs to be accessed like any other piece of data. Typically, this means that it needs to fit into a byte or a word.
Now bytes only hold 8 bits, so that is not enough room. For many machines (and I suspect, your machine too), words are 32 bits.
Thus the page table entry is allocated 32 bits of space.
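As a small illustration of that rounding, here is a sketch that picks the smallest standard width that can hold the entry. The set of widths (8, 16, 32, 64 bits) is an assumption about typical machines, not something taken from the question.

```python
def stored_width_bits(needed_bits: int) -> int:
    """Smallest standard access width (in bits) that can hold needed_bits."""
    for width in (8, 16, 32, 64):
        if needed_bits <= width:
            return width
    raise ValueError("entry does not fit in a 64-bit word")


print(stored_width_bits(26))  # 32
```

The 6 leftover bits are simply unused (or available for more flags).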

The number of entries in a page table and the size of an entry are two different things; the size of the table is the number of entries times the size of a single entry. A page table entry tells you which virtual page maps to which physical page, so the number of entries you need in a page table is the number of virtual pages you have, which can be calculated by dividing the total addressable space by the size of a page. (For example, a 32-bit address and a 4K page size give 2 to the power 20 entries, so the virtual page number is 20 bits.) The size of the physical part of an entry is determined by the available physical memory; usually the page size remains the same. This way you can calculate the bits needed for a single entry. Then you multiply this by the number of entries and you have the total size.
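A rough sketch of that calculation follows. The physical memory size and the 4 flag bits in the example call are assumptions chosen to reproduce the 22 + 4 = 26 bits from the question; the function name is illustrative.

```python
def page_table_size(va_bits: int, page_bytes: int, phys_bytes: int, flag_bits: int = 0):
    """Return (number of entries, bits per entry, total bits) before any rounding."""
    offset_bits = page_bytes.bit_length() - 1    # log2(page size)
    entries = 1 << (va_bits - offset_bits)       # one entry per virtual page
    frames = phys_bytes // page_bytes            # number of physical frames
    frame_bits = (frames - 1).bit_length()       # bits needed to number every frame
    entry_bits = frame_bits + flag_bits
    return entries, entry_bits, entries * entry_bits


# 32-bit virtual addresses, 4 KB pages, 2^34 bytes of physical memory, 4 flag bits.
print(page_table_size(32, 4096, 2**34, flag_bits=4))  # (1048576, 26, 27262976)
```

In practice each entry would then be padded to a convenient width such as 32 bits, which makes the table 2^20 * 4 bytes = 4 MB.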

How PAGE table size is calculated here

I am having a hard time calculating the page table size described at the link below:
http://www.embedded-bits.co.uk/2011/mmucode/
As we know page table entries in this table are 4 bytes long and that there is a maximum of 4096 entries (one for each 1MB of the address space) we can calculate the size of the table as 16KB
Now the total size of the page table is 4096 entries * 4 bytes per entry = 16384 bytes = 16 KB.
But according to the statement above, each of the 4096 entries corresponds to 1 MB of address space, which means 1 entry = 1 MB.
Since there are 4096 entries, the space required to store them would be 4096 MB, but the page table size is only 16 KB.
Also, how many virtual addresses does this 1 MB section have, 250000?
EDIT:
Sorry if this is an even more stupid question on my part. I tried to understand it again. This 1 MB section is part of physical memory, not of the virtual memory/page table (as I understood earlier).
Now each entry is 4 bytes long; does that mean 4 virtual addresses are going to cover a 1 MB section of physical memory?

Not quite sure what the question is asking but in a typical AArch32 MMU (barring the newer extensions), you would have first and second level translation tables in a typical configuration.
The first level translation table splits the entire address space into 1MB sections and typically contains pointers to level 2 tables which split those 1MB sections into granular pages (traditionally 4KB on most machines). 2nd level translation tables store physical addresses that correspond to those virtual addresses (each entry stores a physical address along with some flags). The virtual address is determined by its position within the L1 and L2 tables. To further clarify, all addresses stored in page tables (including addresses to L2 tables in the L1 table) are physical.
The first level translation table has a fixed size, but depending on which architecture you're running you can change that size (this is used by a lot of ARM kernels to provide a user/kernel address space split). Each gigabyte of virtual address space requires 4096 bytes in the L1 table (1024 entries of 4 bytes, each covering 1MB).
To clarify even further: as far as I remember, you can use L1 entries to map 1MB chunks directly without using L2 tables; the type of an L1 entry is denoted by its lower bits.
(Sorry if I got the unit suffixes wrong)
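To make the distinction concrete, here is a small sketch of the first-level arithmetic. It assumes a 32-bit address space, 1MB sections and 4-byte entries as in the quoted article; the helper names are hypothetical. An entry describes a 1MB region, it does not store it, which is why 4096 entries need only 16KB.

```python
SECTION_BYTES = 1 << 20    # each first-level entry covers 1 MB of virtual address space
ENTRY_BYTES = 4            # each entry itself occupies 4 bytes
ADDRESS_SPACE = 1 << 32    # 32-bit virtual address space


def l1_table_bytes(covered_bytes: int = ADDRESS_SPACE) -> int:
    """Bytes of first-level table needed to cover covered_bytes of address space."""
    return (covered_bytes // SECTION_BYTES) * ENTRY_BYTES


def l1_index(virtual_address: int) -> int:
    """Which first-level entry a virtual address falls under (its top 12 bits)."""
    return virtual_address >> 20


def section_offset(virtual_address: int) -> int:
    """Byte offset of the address inside its 1 MB section (its low 20 bits)."""
    return virtual_address & (SECTION_BYTES - 1)


print(l1_table_bytes())          # 16384 (16 KB for the full 4 GB)
print(l1_table_bytes(1 << 30))   # 4096 per gigabyte
print(l1_index(0x12345678))      # 291 (0x123)
```

A single 1MB section therefore spans 2^20 = 1,048,576 byte addresses, all of which share one 4-byte entry.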

Virtual Memory page table growth

When processes are allowed to grow larger than memory, page tables also grow very large. How could we organize page tables and TLB to keep access times as quick as possible for codes with good locality? For example, assume physical memory is 512K, each page is 1K, and a TLB of size 128. If we assume most processes are 256K or less, then we could allocate a fixed-size page table with 256 entries. Now in the unexpected case, where the page table grows larger than 256 entries, how should we organize it? What implications does your design have on average access time and on the maximum virtual memory size of a program?

The solution used on x86 is to have "sparse" page tables, that is there isn't a full table to contain a mapping for each page. Rather a two level mechanism is used:
The virtual memory is 4 GB large. A single page has size 4 KB. Using a one level approach would thus require a table of 4 GB / 4 KB = 1024 * 1024 entries. If an entry consumed 4 bytes, then every process would need 4 MB just to store its table.
Using a two level approach we have a page directory with 1024 entries, each of size 4 bytes (making it fit perfectly into a single 4 KB page). Thus each entry in that directory manages 4 GB / 1024 = 4 MB. If (and only if) there should be a mapping of some pages of virtual memory to physical memory in that 4 MB range, then the entry points to an instance of another structure, a page table. That contains 1024 entries, too, so each one manages 4 MB / 1024 = 4 KB, exactly one page.
If there's a process that just needs a single page to operate, then using the single level approach we need 4 MB to store its virtual memory configuration. Using the two level mechanism described above, we need 4 KB for the page directory and 4 KB for the page table containing the mapping for that single page. Thus only 8 KB are used to store the virtual memory configuration.
If the process needs additional memory at runtime, and if that memory is at a (virtual) address not within the 4 MB range managed by its page table, then a second page table needs to be provided, increasing the memory used to store the mappings by another 4 KB.
Using this two level approach slightly increases access times for pages not in the TLB, because the memory management unit needs to access two memory locations (the page directory, and afterwards the respective page table) to be able to compute the physical address.
The TLB is unaffected by this: It stores mappings of single pages. How these mappings have been established isn't relevant to its operation.
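The memory cost of the two approaches can be compared with a short sketch, assuming 4 KB pages, 4-byte entries and a 4 GB address space as above. The function names are made up, and a process is represented simply by the set of virtual page numbers it has mapped.

```python
PAGE = 4 * 1024                      # page size in bytes
ENTRY = 4                            # bytes per entry
ENTRIES_PER_TABLE = PAGE // ENTRY    # 1024 entries fit in one page


def single_level_bytes(address_space: int = 1 << 32) -> int:
    """One entry for every page in the address space, mapped or not."""
    return (address_space // PAGE) * ENTRY


def two_level_bytes(mapped_pages) -> int:
    """Page directory plus one page table per 4 MB range that has a mapping."""
    tables_needed = {vpn // ENTRIES_PER_TABLE for vpn in mapped_pages}
    return PAGE + len(tables_needed) * PAGE


print(single_level_bytes())        # 4194304 (4 MB)
print(two_level_bytes([0]))        # 8192 (directory + one page table)
print(two_level_bytes([0, 1024]))  # 12288 (second page table for the next 4 MB range)
```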
Let's apply this to the example configuration you gave above:
A single page has 1 KB size. Most processes, as you said, will have 256 KB or less memory. But we want to be able to have processes using more virtual memory.
If we choose to have the last level handle a full 256 KB, then we have
256 KB / 1 KB = 256 entries. Assuming a 32 bit architecture, this in turn means we can have each entry with size of 4 byte (to hold an address). 256 entries * 4 Byte = 1 KB and thus a full page. Nice.
To be able to handle more virtual memory than 256 KB we add another layer. Because it's easy, we let this level use tables with 256 entries (of 4 bytes each), too, to make such a table exactly fit into a page.
This gives us a virtual memory of 256 * 256 KB (64 MB). A virtual address in that system would then be 26 bits long:
DDDDDDDDTTTTTTTTPPPPPPPPPP
D := Index to page directory, highest level.
8 bit to be able to index 256 entries.
T := Index to page table, lower level.
8 bit to be able to index 256 entries.
P := Offset inside page.
10 bit to be able to address 1024 bytes.
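The layout above can be sketched as a small decoder that splits a 26-bit virtual address into its three fields. The constant and function names are illustrative only.

```python
DIR_BITS = 8       # D: index into the page directory
TABLE_BITS = 8     # T: index into the page table
OFFSET_BITS = 10   # P: offset inside the 1 KB page


def split_address(va: int):
    """Split a 26-bit virtual address into (directory index, table index, offset)."""
    if not 0 <= va < 1 << (DIR_BITS + TABLE_BITS + OFFSET_BITS):
        raise ValueError("address does not fit in 26 bits")
    offset = va & ((1 << OFFSET_BITS) - 1)
    table = (va >> OFFSET_BITS) & ((1 << TABLE_BITS) - 1)
    directory = va >> (OFFSET_BITS + TABLE_BITS)
    return directory, table, offset


print(split_address((3 << 18) | (5 << 10) | 7))  # (3, 5, 7)
```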
A process using less than 256 KB needs then 2 KB to manage its memory configuration. Each additional 256 KB of virtual memory needed add another 1 KB of configuration memory.
Assuming the TLB can hold 128 entries (your question is a bit unclear here) it would need 128 * (16 + X - 10) bits, where X is the number of bits used to address physical memory. (Though this depends on the actual implementation. I was thinking about 16 bits per entry to store the indices of the paging structures + the upper bits of the physical address, not counting the 10 bits of offset.)
I hope this answers your question. An actual implementation will need to make design choices based on a lot of constraints.

What is page table entry size?

I found this example.
Consider a system with a 32-bit logical address space. If the page
size in such a system is 4 KB (2^12), then a page table may consist of
up to 1 million entries (2^32/2^12). Assuming that
each entry consists of 4 bytes, each process may need up to 4 MB of physical address space for the page table alone.
What is the meaning of "each entry consists of 4 bytes", and why may each process need up to 4 MB of physical address space for the page table?

A page table is a table of conversions from virtual to physical addresses that the OS uses to artificially increase the total amount of main memory available in a system.
Physical memory is the actual bits located at addresses in memory (DRAM), while virtual memory is where the OS "lies" to processes by telling them where it's at, in order to do things like allow for 2^64 bytes of address space even though far less RAM is physically installed. (2^32 bytes is 4 gigabytes; 2^64 bytes is 16 exabytes.)
In the example, the page table takes up to 4096 KB (4 MB) for each process, but the number of page table entries actually needed depends on how much address space the process uses. Page tables can also be allocated with smaller or larger amounts of memory; 4 KB is simply the usual size of a page.
Note that a page table is a table of page table entries. The two have different sizes: in this example an entry is 4 bytes and the full table is 4096 KB (4 MB), and the table grows by adding more entries.

As for why a PTE (page table entry) is 4 bytes:
Several answers say it's because the address space is 32 bits and the PTE needs 32 bits to hold the address.
But a PTE doesn't contain the complete address of a byte, only the physical page number. The rest of the bits contain flags or are left unused. It need not be 4 bytes exactly.

1) Because 4 bytes (32 bits) is exactly the right amount of space to hold any address in a 32-bit address space.
2) Because 1 million entries of 4 bytes each makes 4MB.

Your first doubt is about the line "each entry consists of 4 bytes". To understand this, first ask: what does a page table contain? The answer is PTEs (page table entries). So these 4 bytes are the size of each PTE, which holds the physical frame number of the page (and maybe 1-2 other fields, such as flag bits, if required/desired).
So, now that you know what a page table contains, you can easily calculate the memory space it takes: the total number of PTEs times the size of a PTE.
Which will be: 1M * 4 bytes = 4 MB
Hope this clears your doubt. :)

The page table entry size is the number of bits required to represent any frame number. For example, if you have a physical memory with 2^32 frames, then you need 32 bits to represent a frame number. These 32 bits are stored in the page table in 4 bytes (32/8).
Now, since the number of pages is 1 million, the total size of the page table =
page table entry size * number of pages
= 4 B * 1 million
= 4 MB.
Hence, 4 MB would be required to store the table in main memory (physical memory).

So, the entry refers to page table entry (PTE). The data stored in each entry is the physical memory address (PFN). The underlying assumption here is the physical memory also uses a 32-bit address space. Therefore, PTE will be at least 4 bytes (4 * 8 = 32 bits).
In a 32-bit system with a memory page size of 4KB (2^2 * 2^10 B), the maximum number of pages a process could have is 2^(32-12) = 1M. Each process thinks it has access to all physical memory. In order to translate all 1M virtual pages to physical pages, a process may need to store 1M PTEs, that is 4MB.

Honestly a bit new to this myself, but to keep things short, it looks like 4MB comes from the fact that there are 1 million entries (each PTE stores a physical page number, assuming it exists); that is, 2^20 = 1M PTEs. 1M * 4 bytes = 4MB, so each process will require that much for its page table.

The size of a page table entry depends upon the number of frames in physical memory. Since this text is from "Operating System Concepts" by Galvin, it is assumed here that the number of pages and the number of frames are the same. Assuming that, the number of pages/frames comes out to be 2^20. Since the page table stores the frame number of the respective page, each page table entry has to be at least 20 bits to map 2^20 frame numbers to pages. Here 4 bytes (32 bits) are taken as an upper limit, because the page table stores not only the frame numbers but also additional bits for protection and security, e.g. the valid/invalid bit. So to map pages to frames we need only 20 bits; the rest are extra bits that store protection and security information.
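To illustrate how 20 bits of frame number and some protection bits can share one 32-bit entry, here is a minimal sketch. The layout (frame number in the upper 20 bits, flags in the lower 12) and the flag names are hypothetical, chosen only for the example; real architectures define their own formats.

```python
FRAME_BITS = 20       # enough to number 2^20 frames
FLAG_BITS = 12        # remaining bits of the 32-bit entry
VALID = 1 << 0        # hypothetical valid/invalid bit
WRITABLE = 1 << 1     # hypothetical protection bit


def make_pte(frame: int, flags: int) -> int:
    """Pack a frame number and flag bits into one 32-bit entry."""
    if not 0 <= frame < 1 << FRAME_BITS:
        raise ValueError("frame number needs more than 20 bits")
    if not 0 <= flags < 1 << FLAG_BITS:
        raise ValueError("flags need more than 12 bits")
    return (frame << FLAG_BITS) | flags


def frame_of(pte: int) -> int:
    """Extract the frame number from an entry."""
    return pte >> FLAG_BITS


def is_valid(pte: int) -> bool:
    """Check the (hypothetical) valid bit."""
    return bool(pte & VALID)


pte = make_pte(0xABCDE, VALID | WRITABLE)
print(hex(pte), hex(frame_of(pte)), is_valid(pte))  # 0xabcde003 0xabcde True
```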
