Memory blocks and tags - cpu-architecture

Suppose that we have a CPU with a cache that consists of 128 blocks. Each block can hold 8 bytes of memory. How can I find which block each address belongs to? Also, what is each address's tag?
The following is my way of thinking.
Take the 32bit address 1030 for example. If I do 1030 * 4 = 4120 I have the address in a byte format. Then I turn it in a 8byte format 4120 / 8 = 515.
Then I do 515 % 128 = 3 which is (8byte address)%(number of blocks) to find the block that this address is on (block no.3).
Then I do 515 / 128 = 4 to find the position of the address within block no. 3. So tag = 4.
Is my way of thinking correct?
Any comment is welcomed!

What we know generically:
A cache decomposes addresses into fields, namely: a tag field, an index field, and a block offset field.  For any given cache the field sizes are fixed, and knowing their widths (number of bits) allows us to decompose an address the same way the cache does.
An address as a simple number:
+---------------------------+
| address |
+---------------------------+
We would view addresses as unsigned integers, and the number of bits used for the address is the address space size.  As decomposed into fields by the cache:
+----------------------------+
| tag | index | offset |
+----------------------------+
Each field uses an integer number of bits for its width.
What we know from your problem statement:
the block size is 8 bytes, therefore
the block offset field width is log2( block size in bytes )
the address space (total number of bits in an address) is 32 bits, therefore
tag width + index width + offset width = 32
Since information about associativity is not given, we should assume the cache is direct mapped.  No information to the contrary is provided, and direct mapped caches are common early in coursework.  I'd verify this, or else state the direct mapped assumption explicitly.
there are 128 blocks, therefore, for a direct mapped cache
there are 128 index positions in the cache array.
(for 2-way or 4-way we would divide by 2 or 4, respectively)
Given 128 index positions in the cache array
the index field width is log2( number of index positions )
Knowing the index field width, the block offset field width, and total address width, we can compute the tag field width
tag field width = 32 - index field width - block offset field width
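The width calculation above can be sketched in Python. This is only a minimal sketch: it assumes power-of-two sizes and a direct mapped cache unless told otherwise, and the function name `cache_field_widths` is made up for illustration.

```python
def cache_field_widths(address_bits, num_blocks, block_size_bytes, ways=1):
    """Return (tag, index, offset) field widths in bits.

    Assumes power-of-two sizes. ways=1 means direct mapped, which is an
    assumption here because the problem does not state the associativity.
    """
    num_index_positions = num_blocks // ways
    offset_bits = block_size_bytes.bit_length() - 1      # log2(block size)
    index_bits = num_index_positions.bit_length() - 1    # log2(index positions)
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


print(cache_field_widths(32, 128, 8))          # (22, 7, 3)
print(cache_field_widths(32, 128, 8, ways=2))  # (23, 6, 3)
```

For the cache in the question this gives a 22-bit tag, a 7-bit index, and a 3-bit block offset.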
Only when you have such field widths does it make sense to attempt to decode a given address and extract the fields' actual values for that address.
Because there are three fields, the preferred approach to extraction is to simply write out the address in binary and group the bits according to the fields and their widths.
(Division and modulus can be made to work, but with (a) three fields and (b) the index field in the middle, the math is arguably more complex: to get the index we have to divide (to remove the block offset) and then take the modulus (to remove the tag bits). The result is equivalent to the other approach.)
Comments on your reasoning:
You need to know if 1030 is in decimal or hex.  It is unusual to write an address in decimal notation, since hex notation converts into binary notation (and hence the various bit fields) so much more easily.  (Some educational computers use decimal notation for addresses, but they generally have a much smaller address space, like 3 decimal digits, and certainly not a 32-bit address space.)
Take the 32bit address 1030 for example. If I do 1030 * 4 = 4120 I have the address in a byte format.
Unless something is really out of the ordinary, the address 1030 is already in byte format — so don't do that.
Then I turn it in a 8byte format 4120 / 8 = 515.
The 8 bytes of a cache block determine the block offset field used when decoding an address.  You need to decode the address into 3 fields, not necessarily divide it.
Again, the key is to first compute the block offset width, then the index width, then the tag width.  Take a given address, convert it to binary, and group the bits to get the tag, index, and block offset values in binary (then maybe convert those values to hex, or decimal if you must).
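As a rough illustration of that grouping, here is a small Python sketch that extracts the three fields with shifts and masks. It assumes a direct mapped cache with 128 blocks of 8 bytes (3 offset bits, 7 index bits) and that the address is already a byte address; `decode` is a hypothetical helper name. Both readings of "1030" are shown, since the question does not say whether it is decimal or hex.

```python
OFFSET_BITS = 3   # log2(8-byte block)
INDEX_BITS = 7    # log2(128 blocks), assuming a direct mapped cache


def decode(address):
    """Split a byte address into (tag, index, block offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)                  # low 3 bits
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)   # next 7 bits
    tag = address >> (OFFSET_BITS + INDEX_BITS)                  # remaining high bits
    return tag, index, offset


print(decode(1030))    # decimal 1030 -> (1, 0, 6)
print(decode(0x1030))  # hex 0x1030   -> (4, 6, 0)
```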

Related

Find physical address from logical address given page table

Following is a page table -
[image: page table]
Assume that a page is of size 16000 bytes. How do I calculate the physical address for say the logical address 1000.
Here is what I have worked out so far.
Logical memory = 8 pages
Logical memory size = 8 x 16000 bytes
Physical memory = 8 frames
physical memory size = 8 x 16000 bytes
Now, given a logical address of 1000, it will map to the first page, which is in frame 3.
So frame 0, frame 1, and frame 2 together take up 16000 x 3 bytes.
1000 will be at location 16000 x 3 + 1000,
so the physical address will be byte 49000.
Is this a correct approach?
Is this a correct approach?
Yes. To clarify:
Given a logical address; split it into pieces like:
offset_in_page = logical_address % page_size;
page_table_index = logical_address / page_size;
Then get the physical address of the page from the page table:
physical_address_of_page = page_table[page_table_index].physical_address = page_table[page_table_index].frame * page_size;
Then add the offset within the page to get the final physical address:
physical_address = physical_address_of_page + offset_in_page;
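The steps above can be sketched in Python as follows. This is a minimal sketch, not a model of real hardware: the page table is a hypothetical dictionary, and only the "page 0 is in frame 3" mapping is taken from the question.

```python
PAGE_SIZE = 16000  # bytes, as in the question (not a power of two)

# Hypothetical page table: page number -> frame number.
# Only the page 0 -> frame 3 mapping comes from the question.
page_table = {0: 3}


def translate(logical_address):
    """Translate a logical address to a physical address."""
    page_table_index, offset_in_page = divmod(logical_address, PAGE_SIZE)
    frame = page_table[page_table_index]  # raises KeyError if the page is unmapped
    return frame * PAGE_SIZE + offset_in_page


print(translate(1000))  # 49000
```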
Notes:
a CPU (or MMU) would do various checks using other information in the page table entry (e.g. check if the page is present, check if you're writing to a "read-only" page, etc). When doing the conversion manually you'd have to do these checks too (e.g. when converting a logical address into a physical address the correct answer can be "there is no physical address because the page isn't present").
modulo and division (and multiplication) are expensive. In real hardware the page size will always be a power of 2 so that the modulo and division can be replaced with masks and shifts. The page size will never be 16000 bytes (but may be 16384 bytes, i.e. 0x4000 bytes or "2 to the power of 14" bytes, so that the CPU can do offset_in_page = logical_address & 0x3FFF; and page_table_index = logical_address >> 14;). For similar reasons, page table entries are typically constructed by using OR to merge the physical address of a page with other flags (present/not present, writable/read-only, ...), and AND is used to extract the physical address from a page table entry (like physical_address_of_page = page_table[page_table_index] & 0xFFFFC000;), so there won't be any "frame number" involved in any calculations.
for real systems (and realistic theoretical examples) it's much easier to use hexadecimal for addresses (it makes the masks and shifts easier to do in your head, e.g. 0x1234567 & 0x03FFFF = 0x0034567 is easy). For this reason (and similar reasons, like determining location in caches, physical address routing and decoding in buses, etc.) logical and physical addresses should never use decimal.
for real systems, there are almost always multiple levels of page tables. In this case the approach is mostly the same: you split the logical address into more pieces (e.g. maybe offset_in_page and page_table_index and page_directory_index) and do more table lookups (e.g. maybe page_table = page_directory[page_directory_index].physical_address; then physical_address_of_page = page_table[page_table_index].physical_address;).
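To illustrate the note about masks and shifts, here is a small Python sketch assuming a 16384-byte (2^14) page size. The names `PAGE_SHIFT`, `OFFSET_MASK`, and `split_address` are chosen for this example only. It checks that the shift and mask give the same result as division and modulo.

```python
PAGE_SHIFT = 14                  # assumed: 2**14 = 16384-byte pages
PAGE_SIZE = 1 << PAGE_SHIFT      # 0x4000
OFFSET_MASK = PAGE_SIZE - 1      # 0x3FFF


def split_address(logical_address):
    """Return (page_table_index, offset_in_page) using a shift and a mask."""
    return logical_address >> PAGE_SHIFT, logical_address & OFFSET_MASK


addr = 0x1234567
# For a power-of-two page size, shift/mask is equivalent to divide/modulo.
assert split_address(addr) == divmod(addr, PAGE_SIZE)
print([hex(x) for x in split_address(addr)])  # ['0x48d', '0x567']
```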

Hierarchical paging with 2 levels

Consider a paging system with the page table being stored in memory. The logical address space used is 32 bit and the page size is 8KB. This will result in a very large page table(s) and therefore the system uses hierarchical paging with two levels. The number of entries in the outer page table is 256.
Specify the number of bits in each of the three fields composing the logical address namely, the outer page, the inner page, and the offset.
I found some information on finding the page offset, Page offset = log2(page size in bytes), so for this case, it would be 13, but I haven't found much information on how to find the number of bits for the outer page and inner page. Can anyone shine some light on this problem for me?
Thank you.
I might not be entirely correct, but since VPN to PPN translation is one of my favorite parts of OS, I decided to share my understanding. Maybe this picture can help to understand how a virtual address is translated into a physical address.
In this example the page directory contains 1024 entries, so you will need 10 bits to be able to select the entry you need. This entry contains the address of the inner table. Then, as the inner page table also contains 1024 entries, once you know its address you still need to find the index of the entry that holds the physical page address. So the next 10 bits are used for calculating that index. Finally, once the page table entry gives you the physical address of the page, the offset gives the exact physical address. If this is not very clear I can go into more detail.
In your case, as you have 8KB pages, the last 13 bits will be used for the offset, as you said. If the outermost page table contains 256 entries, then you will need 8 bits (log2(256)) to be able to determine the index of its entry. The rest depends on the number of entries in the inner table. Or, if the size of an entry is defined, the number of entries can be calculated from it. If we assume that the remaining 11 bits are used entirely for the inner table, then it would have to contain 2048 entries, as based on my understanding one instance of a page table fits and fills one physical page.
The lowest bits of a logical address will be used for "offset in 8192-byte page". You'd need 13 bits for that (because 1 << 13 = 8192 or because log2(8192) = 13).
The highest bits of a logical address will be used for "index into 256-entry outer page table". You'd need 8 bits for that (because 1 << 8 = 256 or because log2(256) = 8).
If a logical address is 32 bits and the lowest 13 bits and highest 8 bits are used for other things; how many bits are left over for the index into the inner page table?
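The split described above can be sketched in Python. This is a sketch under the assumption, also made in the other answer, that all bits not used by the offset or the outer index go to the inner page table index; `split_two_level` is a hypothetical helper name.

```python
ADDRESS_BITS = 32
OFFSET_BITS = 13   # log2(8192-byte page)
OUTER_BITS = 8     # log2(256 outer page table entries)
# Assumption: every remaining bit indexes the inner page table.
INNER_BITS = ADDRESS_BITS - OUTER_BITS - OFFSET_BITS


def split_two_level(address):
    """Split a 32-bit logical address into (outer, inner, offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    inner = (address >> OFFSET_BITS) & ((1 << INNER_BITS) - 1)
    outer = address >> (OFFSET_BITS + INNER_BITS)
    return outer, inner, offset


print(INNER_BITS)                                       # 11
print([hex(x) for x in split_two_level(0xDEADBEEF)])    # ['0xde', '0x56d', '0x1eef']
```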

What is the maximum size of symbol data type in KDB+?

I cannot find the maximum size of the symbol data type in KDB+.
Does anyone know what it is?
If you are talking about the physical length of a symbol: symbols exist as interned strings in kdb, so the maximum string length limit would apply. As strings are just lists of characters in kdb, the maximum size of a string would be the maximum length of a list. In 3.x this would be 2^64 - 1. In previous versions of kdb this limit was 2,000,000,000.
However, there is a 2TB maximum serialized size limit that would likely kick in first. You can roughly work out the size of a sym by serializing it:
q)count -8!`
10
q)count -8!`a
11
q)count -8!`abc
13
So each character adds a single byte, which gives a length limit of roughly 10^12 characters.
If you mean the maximum number of symbols that can exist in memory, then the limit is 1.4B.

Hardware Support for Paging

"The address consists of 16 bits, and the page size is 8KB. The page table thus consists of eight entries that are kept in fast registers."
How do we get the total entries in the page table as 8?
According to the calculation it should be 1.
Total Entries in the Page Table= ((2^16)/(2^3*2^10*2^3))=1.
(The first 2^3 is for 8 in 8KB, the second one is for bytes to bits conversion and 2^10 is for "Kilo" in 8KB.)
Thanks
Memory is byte-addressable; hence you do not need to divide by 2^3 for a bytes-to-bits conversion.
To explain further, a 16-bit address means that the processor generates memory addresses of length 16 bits, which are used to address the byte, half-word, or word starting (or ending, depending on the endianness of the machine) at that 16-bit value.
Now, the page size is the total size of a page in bits, which in this case is 2^16 bits. But as memory is byte-addressable, the number of processor addresses in one page will be 2^16/2^3, i.e. 2^13 addresses.
Hence the number of page table entries is 2^16/2^13 = 8.
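The same arithmetic as a short Python sketch, assuming byte-addressable memory and a single-level page table (variable names are illustrative only):

```python
ADDRESS_BITS = 16
PAGE_SIZE_BYTES = 8 * 1024   # 8KB; memory is assumed to be byte-addressable

offset_bits = PAGE_SIZE_BYTES.bit_length() - 1   # log2(8192) = 13
page_number_bits = ADDRESS_BITS - offset_bits    # 16 - 13 = 3
num_entries = 1 << page_number_bits              # 2**3 = 8

print(offset_bits, page_number_bits, num_entries)  # 13 3 8
```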

Why are there limits on domain name lengths?

From what I know, domains seem to be keys into a hash of the DNS where the value is the resource records for the domain name. Why are they limited in length? The specifications I found say that a domain name:
+ Has a maximum label length of 63 characters
+ Has a maximum of 127 labels
+ Cannot be more than 255 bytes of data
And there are also all sorts of restrictions upon special character ordering, etc. Why is that?
label length
The 63-byte limit exists because, in the DNS protocol, each label is stored as a length byte followed by the label data. The length is a single byte, but the two high bits of the length field are reserved for something else (compression), leaving 6 bits for the length itself: 2^6 = 64 possible values, 0..63.
To simplify implementations, the total length of a domain name (i.e.,
label octets and label length octets) is restricted to 255 octets or
less.
I did not find a limit of 127 labels in the specifications. It arises simply from the fact that the whole domain name is at most 255 bytes and each label takes at least 2 bytes (a single letter and the dot, or the length byte and the letter).
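The way these limits interact can be checked with a short Python sketch. It assumes plain ASCII names and counts the wire format as one length octet per label plus the label bytes, plus one final zero-length octet for the root. The helper names `wire_length` and `fits_limits` are invented for this example.

```python
MAX_LABEL_OCTETS = 63   # 6 usable bits in the length octet
MAX_NAME_OCTETS = 255   # total, including length octets and the root label


def wire_length(name):
    """Octets the name occupies on the wire (ASCII names assumed)."""
    labels = name.rstrip(".").split(".")
    # One length octet per label, the label itself, plus the root's zero octet.
    return sum(1 + len(label.encode("ascii")) for label in labels) + 1


def fits_limits(name):
    """True if every label and the whole name are within the limits."""
    labels = name.rstrip(".").split(".")
    labels_ok = all(1 <= len(label) <= MAX_LABEL_OCTETS for label in labels)
    return labels_ok and wire_length(name) <= MAX_NAME_OCTETS


print(wire_length("example.com"))           # 13
print(fits_limits(".".join(["a"] * 127)))   # True: 127 * 2 + 1 = 255 octets
print(fits_limits(".".join(["a"] * 128)))   # False: 128 * 2 + 1 = 257 octets
```

With one-letter labels, 127 labels is the most that fits in 255 octets, which matches the derived limit described above.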