How to convert a Virtual address to Physical Address? - operating-system

If I have the virtual address 0xF3557100, how do I convert it to a physical address, and what are the values of the offset, page directory, and page table?
The PTE (page table entry) for that address has the value 0x87124053.
Thanks.

Sadly, what you are asking is system dependent. You would need to know the size of the page to begin with.
In the simplest case, the lowest order bits corresponding to the page size are the offset and the remaining high order bits specify the page table entry.
You say that you have the value of the page table entry. You then need to know the structure of the page table entry. Some part of that will indicate the physical address. Other parts will define page attributes.
In short, we'd need to know a whole lot more information.

In general, you cannot translate a VA to a PA from this information alone.
Each architecture defines a constant PAGE_SHIFT. Since your address is 32 bits wide, PAGE_SHIFT is most likely 12, as it is on most 32-bit architectures.
PAGE_SHIFT determines the width of the offset, so the offset is 12 bits and the page size is 4096 bytes. An architecture can support more than one PAGE_SHIFT value, but we assume the 12-bit offset here because it is the usual default on most systems.
The PTE contains the physical page (frame) number along with status and protection information. The lower 12 bits of the PTE hold the status and protection flags, and the upper 20 bits hold the PPN. In principle, the virtual page number is mapped to a physical frame number, and the offset is the same in both addresses. So drop the lowest 12 bits of the PTE and append the lowest 12 bits of the VA.
The offset from the VA is 0x100 and the PPN from the PTE is 0x87124, so the physical address is 0x87124100.
According to the 10-10-12 split (there is no general rule for this division):
offset = 12 bits
page table = page directory = 10 bits
Now you can easily calculate the relevant bit values from the given address.
1111001101 0101010111 000100000000
page directory index = 1111001101 (0x3CD)
page table index = 0101010111 (0x157)
page offset = 000100000000 (0x100)
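The same arithmetic can be checked with a few lines of Python. This is only a sketch under the assumptions made above (32-bit address, 4096-byte pages, 10-10-12 split, PPN in the upper 20 bits of the PTE); the function names are made up for illustration.

```python
# Sketch only: assumes 32-bit addresses, 4 KiB pages and a 10-10-12 split.
PAGE_SHIFT = 12
INDEX_BITS = 10
OFFSET_MASK = (1 << PAGE_SHIFT) - 1   # 0xFFF
INDEX_MASK = (1 << INDEX_BITS) - 1    # 0x3FF


def split_virtual_address(va):
    """Return (page directory index, page table index, page offset)."""
    pd_index = (va >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    pt_index = (va >> PAGE_SHIFT) & INDEX_MASK
    offset = va & OFFSET_MASK
    return pd_index, pt_index, offset


def physical_address(va, pte):
    """Combine the frame number from the PTE with the offset from the VA."""
    frame_base = pte & ~OFFSET_MASK       # drop the status/protection bits
    return frame_base | (va & OFFSET_MASK)


va, pte = 0xF3557100, 0x87124053
pd_index, pt_index, offset = split_virtual_address(va)
print(hex(pd_index), hex(pt_index), hex(offset))  # 0x3cd 0x157 0x100
print(hex(physical_address(va, pte)))             # 0x87124100
```

If the real system uses a different page size or PTE layout, only the constants at the top would need to change.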

Related

What is page table No (0 or 1 or 2) in 2-level paging?

A system with 2-level page table in the form of (p1, p2, offset), and the content of page table
No 1 of level 2 is: {(32, v),(45, i),(12, v),(5, v),(34, i)}. Suppose the frame size is 4KB; the
address register is 32 bit, the number of bits for level 2 (p2) of the address register is 10.
Given a reference =4225332, calculate its physical address, e.g., 12764?
I don't understand the meaning of page table No 1 of level 2. In some problems, it could be No 0 or No 2. Could you help me explain that and solve this problem? Thanks a lot!
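As a starting point, here is a minimal sketch of how such a reference splits into (p1, p2, offset). It assumes that 4KB frames give a 12-bit offset, that p2 is 10 bits as stated, and that p1 takes the remaining upper bits. The helper name split_reference is made up for illustration. The value of p1 selects which level-2 page table is consulted, which is one possible reading of "page table No 1". The sketch stops at the split and does not look up a frame.

```python
# Sketch only: assumes p1 takes the bits left after p2 (10) and offset (12).
OFFSET_BITS = 12   # 4 KB frames
P2_BITS = 10       # given in the question


def split_reference(ref):
    """Split a 32-bit reference into (p1, p2, offset)."""
    offset = ref & ((1 << OFFSET_BITS) - 1)
    p2 = (ref >> OFFSET_BITS) & ((1 << P2_BITS) - 1)
    p1 = ref >> (OFFSET_BITS + P2_BITS)
    return p1, p2, offset


p1, p2, offset = split_reference(4225332)
print(p1, p2, offset)  # 1 7 2356
```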

How to compute the address at which the page table entry is stored?

Suppose a system has:
20-bit virtual addresses,
1024 byte pages,
24-bit physical addresses,
4 byte page table entries,
a page table base pointer set to physical (byte) address 0x1000,
a single-level page table structure.
Based on the above information, what is the address at which the page table entry for the virtual address 0x1000 is stored? (Note that page table entries are larger than one byte.) Write your answer as a hexadecimal number.
1024 (2^10) byte pages -> page offset = 10 bits
virtual address 0x1000 -> binary 100 0000000000 (VPN | offset)
VPN: 100 -> 0x4
PTE address = 0x1000 (page table base pointer) + 0x4 (VPN) * 4 (size of PTE) = 0x1010
So the correct answer is 0x1010.
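The calculation above can be written as a small Python sketch. The constants come straight from the question; the function name pte_address is made up for illustration.

```python
# Sketch only: single-level page table, values taken from the question.
PAGE_SIZE = 1024            # bytes, so the page offset is 10 bits
PTE_SIZE = 4                # bytes per page table entry
PAGE_TABLE_BASE = 0x1000    # physical byte address of the page table


def pte_address(va):
    """Return the physical address at which the PTE for `va` is stored."""
    offset_bits = PAGE_SIZE.bit_length() - 1   # log2(1024) = 10
    vpn = va >> offset_bits
    return PAGE_TABLE_BASE + vpn * PTE_SIZE


print(hex(pte_address(0x1000)))  # 0x1010
```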

Are the Physical Page Numbers in this diagram the same between all?

I'm currently reading a text book on xv6, and understand this so far ...
Virtual Address: First 20 bits to index into a PTE. The PTE takes these 20 bits and turns them into a Physical Page Number: PPN. The remaining 12 bits are used for offset, which will be the same in both virtual and physical addresses.
Paging: Paging hardware uses first 10 bits of 20 bits in the virtual address to select a page directory entry (PDE). If a PDE is present, uses next 10 bits of virtual address to select a page table entry (PTE). Something like this ...
00 0000 0011 | 00 0000 0010 | 0000 0000 0101
Page Dir. (3) | Page Table E. (2) | Offset (5)
Question: Is the PPN shown in the diagrams the same all across? I also know that a page directory entry and a page table entry differ by only 1 bit, which is set to 0 or 1 depending on whether you are at the page directory or the page table. Is the PPN common between all 3 then? (Physical Address, Page Table, Page Directory).
Hopefully, this answers your question. If you access a 32-bit address, 12-bits are saved for the offset into the page. They play no part in address translation.
The CR3 register points to a page directory. Although not specified in your diagram, I believe this points to a physical page frame. That page frame contains an array of directory entries. The top 10 bits in your address are an index into that array.
So now you have a structure like the one in your diagram. That structure contains a pointer to a physical page frame (PPN) containing a page table. Again, this is a physical address that would be padded with zeros. You use the value in the PPN field to find the page table.
Your page table is an array of structures that look just like the directory. What is misleading in your diagram is that the D bit may or may not be set in a page table while it is always clear in a directory. The next 10 bits in your address are an index into this table. Use those to locate the desired page table entry.
As before, you have a PPN. On this second iteration, it is again a pointer to a physical address, BUT now it is the actual memory page you want to access. Pad the 20 bits of the PPN with zeros and add the lower 12 bits of your address, and you have the physical address.
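The walk described above can be sketched in Python. This is an illustration only: physical memory is modelled as a dict of 32-bit words, entries are assumed to be 4 bytes with the present flag in bit 0, and the table contents and frame numbers below are made-up values, not anything from the diagram.

```python
# Sketch only: physical memory is modelled as a dict of 32-bit words,
# and all table contents below are made-up illustration values.
PTE_P = 0x001             # present bit
FLAGS_MASK = 0xFFF        # low 12 bits of an entry hold flags


def translate(phys_mem, cr3, va):
    """Walk page directory and page table; return the physical address."""
    pd_index = (va >> 22) & 0x3FF
    pt_index = (va >> 12) & 0x3FF
    offset = va & 0xFFF

    pde = phys_mem.get((cr3 & ~FLAGS_MASK) + 4 * pd_index, 0)
    if not pde & PTE_P:
        raise LookupError("page directory entry not present")

    pte = phys_mem.get((pde & ~FLAGS_MASK) + 4 * pt_index, 0)
    if not pte & PTE_P:
        raise LookupError("page table entry not present")

    return (pte & ~FLAGS_MASK) | offset


# Address from the question: directory index 3, table index 2, offset 5.
va = (3 << 22) | (2 << 12) | 5
phys_mem = {
    0x1000 + 4 * 3: 0x2000 | PTE_P,   # PDE 3 -> page table in frame 0x2
    0x2000 + 4 * 2: 0x5000 | PTE_P,   # PTE 2 -> data page in frame 0x5
}
print(hex(translate(phys_mem, 0x1000, va)))  # 0x5005
```

Note that the PPN stored in the directory entry (the frame of the page table) and the PPN stored in the table entry (the frame of the data page) are different values, even though both entries share the same layout.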

Making sense of Postgres row sizes

I have a large (>100M rows) Postgres table with structure {integer, integer, integer, timestamp without time zone}. I expected the size of a row to be 3*integer + 1*timestamp = 3*4 + 1*8 = 20 bytes.
In reality the row size is pg_relation_size(tbl) / count(*) = 52 bytes. Why?
(No deletes are done against the table: pg_relation_size(tbl, 'fsm') ~= 0)
Calculation of row size is much more complex than that.
Storage is typically partitioned in 8 kB data pages. There is a small fixed overhead per page, possible remainders not big enough to fit another tuple, and more importantly dead rows or a percentage initially reserved with the FILLFACTOR setting.
And there is even more overhead per row (tuple): an item identifier of 4 bytes at the start of the page, the HeapTupleHeader of 23 bytes and alignment padding. The start of the tuple header as well as the start of tuple data are aligned at a multiple of MAXALIGN, which is 8 bytes on a typical 64-bit machine. Some data types require alignment to the next multiple of 2, 4 or 8 bytes.
Quoting the manual on the system table pg_type:
typalign is the alignment required when storing a value of this type.
It applies to storage on disk as well as most representations of the
value inside PostgreSQL. When multiple values are stored
consecutively, such as in the representation of a complete row on
disk, padding is inserted before a datum of this type so that it
begins on the specified boundary. The alignment reference is the
beginning of the first datum in the sequence.
Possible values are:
c = char alignment, i.e., no alignment needed.
s = short alignment (2 bytes on most machines).
i = int alignment (4 bytes on most machines).
d = double alignment (8 bytes on many machines, but by no means all).
Read about the basics in the manual here.
Your example
This results in 4 bytes of padding after your 3 integer columns, because the timestamp column requires double alignment and needs to start at the next multiple of 8 bytes.
So, one row occupies:
23 -- heaptupleheader
+ 1 -- padding or NULL bitmap
+ 12 -- 3 * integer (no alignment padding here)
+ 4 -- padding after 3rd integer
+ 8 -- timestamp
+ 0 -- no padding since tuple ends at multiple of MAXALIGN
Plus item identifier per tuple in the page header (as pointed out by @A.H. in the comment):
+ 4 -- item identifier in page header
------
= 52 bytes
So we arrive at the observed 52 bytes.
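The breakdown above can be reproduced with a short Python sketch. It assumes a typical 64-bit build (MAXALIGN of 8, integer aligned to 4, timestamp aligned to 8), only handles fixed-width columns without NULLs, and uses made-up helper names; it is not how PostgreSQL itself computes sizes.

```python
# Sketch only: sizes and alignments assume a typical 64-bit build (MAXALIGN = 8).
MAXALIGN = 8
TUPLE_HEADER = 23      # HeapTupleHeader, in bytes
ITEM_IDENTIFIER = 4    # per-tuple item identifier, in bytes


def align_up(n, boundary):
    """Round n up to the next multiple of boundary."""
    return -(-n // boundary) * boundary


def row_size(columns):
    """columns: list of (size, alignment) for fixed-width NOT NULL columns."""
    data = 0
    for size, alignment in columns:
        data = align_up(data, alignment) + size   # pad, then place the datum
    tuple_size = align_up(TUPLE_HEADER, MAXALIGN) + data
    return align_up(tuple_size, MAXALIGN) + ITEM_IDENTIFIER


INTEGER = (4, 4)     # typalign 'i'
TIMESTAMP = (8, 8)   # typalign 'd'
print(row_size([INTEGER, INTEGER, INTEGER, TIMESTAMP]))  # 52
```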
The calculation pg_relation_size(tbl) / count(*) is a pessimistic estimation. pg_relation_size(tbl) includes bloat (dead rows) and space reserved by fillfactor, as well as overhead per data page and per table. (And we didn't even mention compression for long varlena data in TOAST tables, since it doesn't apply here.)
You can install the additional module pgstattuple and call SELECT * FROM pgstattuple('tbl_name'); for more information on table and tuple size.
Related:
Table size with page layout
Calculating and saving space in PostgreSQL
Each row has metadata associated with it. The correct formula is (assuming naïve alignment):
3 * 4 + 1 * 8 == 20 bytes of data
24 bytes == row overhead
total size per row: 24 + 20 = 44 bytes, before alignment padding and the item identifier
Or roughly 52 bytes once those are included. I actually wrote postgresql-varint specifically to help with this exact use case. You may want to look at a similar post for additional details re: tuple overhead.

Does not using NULL in PostgreSQL still use a NULL bitmap in the header?

Apparently PostgreSQL stores a couple of values in the header of each database row.
If I don't use NULL values in that table - is the null bitmap still there?
Does defining the columns with NOT NULL make any difference?
It's actually more complex than that.
The null bitmap needs one bit per column in the row, rounded up to full bytes. It is only there if the actual row includes at least one NULL value and is fully allocated in that case. NOT NULL constraints do not directly affect that. (Of course, if all fields of your table are NOT NULL, there can never be a null bitmap.)
The "heap tuple header" (per row) is 23 bytes long. Actual data starts at a multiple of MAXALIGN (Maximum data alignment) after that, which is typically 8 bytes on 64-bit OS (4 bytes on 32-bit OS). Run the following command from your PostgreSQL binary dir as root to get a definitive answer:
./pg_controldata /path/to/my/dbcluster
On a typical Debian-based installation of Postgres 12 that would be:
sudo /usr/lib/postgresql/12/bin/pg_controldata /var/lib/postgresql/12/main
Either way, there is one free byte between the header and the aligned start of the data, which the null bitmap can utilize. As long as your table has 8 columns or fewer, NULL storage is effectively free (as far as disk space is concerned).
After that, another MAXALIGN (typically 8 bytes) is allocated for the null bitmap to cover another (typically) 64 fields. Etc.
This is valid for at least versions 8.4 - 12 and most likely won't change.
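To illustrate the rule, here is a minimal Python sketch of the header size as a function of column count. It assumes the 23-byte fixed header and a MAXALIGN of 8 mentioned above; the function name header_size is made up for illustration.

```python
# Sketch only: assumes a 23-byte fixed header and MAXALIGN = 8.
MAXALIGN = 8
FIXED_HEADER = 23


def header_size(n_columns, has_nulls):
    """Bytes before the tuple data starts, including any null bitmap."""
    bitmap = (n_columns + 7) // 8 if has_nulls else 0   # one bit per column
    unaligned = FIXED_HEADER + bitmap
    return -(-unaligned // MAXALIGN) * MAXALIGN          # round up to MAXALIGN


for n in (8, 9, 72, 73):
    print(n, header_size(n, has_nulls=True))
```

With these assumptions, rows of up to 8 columns get a 24-byte header with or without NULLs, 9 to 72 columns need 32 bytes once a NULL is present, and the next step up happens at 73 columns.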
The null bitmap is only present if the HEAP_HASNULL bit is set in t_infomask. If it is present it begins just after the fixed header and occupies enough bytes to have one bit per data column (that is, t_natts bits altogether). In this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When the bitmap is not present, all columns are assumed not-null.
http://www.postgresql.org/docs/9.0/static/storage-page-layout.html#HEAPTUPLEHEADERDATA-TABLE
So for every 8 columns you use one byte of extra storage per row. For every million rows or so, that takes up about one megabyte of storage. That does not really seem important. I would define the tables how they need to be defined and not worry about null headers.