How do I find the Cache Block Address when I am given the Memory Block Address, the Number of Blocks, and the type of Associative Cache? - cpu-architecture

So I have been given the following:
Memory Block Address as 462 (in decimals)
Direct Mapped Cache
2^2 Blocks
The question is to find the Cache Block Address.
My teacher gave the equation: Cache Block Address = Memory Block Address % No. of Blocks
When I use this equation I get Cache Block Address = 462%4 = 2
But the answer given is 231 which I am assuming is obtained by 462/log2(4)
Can someone help me with this? I am a bit confused.
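For reference, here is a minimal sketch of the teacher's formula, assuming a direct-mapped cache with 4 blocks as used in the calculation above. The function names are illustrative and not from any course material; the tag is included only to show what the remaining part of the block address would be.

```python
def direct_mapped_index(block_address: int, num_blocks: int) -> int:
    """Cache block (line) that a memory block maps to in a direct-mapped cache."""
    return block_address % num_blocks


def direct_mapped_tag(block_address: int, num_blocks: int) -> int:
    """Remaining upper part of the block address, stored as the tag."""
    return block_address // num_blocks


index = direct_mapped_index(462, 4)
tag = direct_mapped_tag(462, 4)
print(index, tag)  # 2 115
```

With these inputs the modulo formula gives 2, which matches the hand calculation in the question.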

Related

Find physical address from logical address given page table

Following is a page table -
[image: page table]
Assume that a page is of size 16000 bytes. How do I calculate the physical address for, say, the logical address 1000?
Here is what I have worked out so far.
Logical memory = 8 pages
Logical memory size = 8 x 16000 bytes
Physical memory = 8 frames
physical memory size = 8 x 16000 bytes
Now given a logical address of 1000 it will map to the first page which is in frame 3
so frame0, frame1 and frame2 together occupy 16000 x 3 bytes.
1000 will be at location 16000 x 3 + 1000
so the physical address will be = 49000 byte
Is this a correct approach?
Yes. To clarify:
Given a logical address, split it into pieces like:
offset_in_page = logical_address % page_size;
page_table_index = logical_address / page_size;
Then get the physical address of the page from the page table:
physical_address_of_page = page_table[page_table_index].physical_address = page_table[page_table_index].frame * page_size;
Then add the offset within the page to get the final physical address:
physical_address = physical_address_of_page + offset_in_page;
Notes:
a CPU (or MMU) would do various checks using other information in the page table entry (e.g. check if the page is present, check if you're writing to a "read-only" page, etc). When doing the conversion manually you'd have to do these checks too (e.g. when converting a logical address into a physical address the correct answer can be "there is no physical address because the page isn't present").
modulo and division (and multiplication) are expensive. In real hardware the page size will always be a power of 2 so that the modulo and division can be replaced with masks and shifts. The page size will never be 16000 bytes (but may be 16384 bytes or 0x4000 bytes or "2 to the power of 14" bytes, so that the CPU can do offset_in_page = logical_address & 0x3FFF; and page_table_index = logical_address >> 14;). For similar reasons, page table entries are typically constructed by using OR to merge the physical address of a page with other flags (present/not present, writable/read-only, ...) and AND will be used to extract the physical address from a page table entry (like physical_address_of_page = page_table[page_table_index] & 0xFFFFC000;) and there won't be any "frame number" involved in any calculations.
for real systems (and realistic theoretical examples) it's much easier to use hexadecimal for addresses (to make it easier to do the masks and shifts in your head, e.g. 0x1234567 & 0x03FFFF = 0x0034567 is easy). For this reason (and similar reasons, like determining location in caches, physical address routing and decoding in buses, etc) logical and physical addresses should never use decimal.
for real systems, there are almost always multiple levels of page tables. In this case the approach is mostly the same - you split the logical address into more pieces (e.g. maybe offset_in_page and page_table_index and page_directory_index) and do more table lookups (e.g. maybe page_table = page_directory[page_directory_index].physical_address; then physical_address_of_page = page_table[page_table_index].physical_address;).
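The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page table is a hypothetical dictionary in which only the mapping of page 0 to frame 3 comes from the question, a missing entry stands for "page not present", and the function names are made up for this example.

```python
PAGE_SIZE = 16000  # bytes, as given in the question (not a power of two)

# Hypothetical page table: page index -> frame number.
# Only the entry 0 -> 3 is taken from the question.
page_table = {0: 3}


def translate(logical_address, table, page_size=PAGE_SIZE):
    """Translate using division and modulo, as in the manual calculation."""
    page_index, offset = divmod(logical_address, page_size)
    frame = table.get(page_index)
    if frame is None:
        raise LookupError(f"page {page_index} is not present")
    return frame * page_size + offset


def translate_pow2(logical_address, table, page_shift=14):
    """Same idea for a power-of-two page size (2**14 = 16384 bytes),
    using a shift and a mask instead of division and modulo."""
    offset = logical_address & ((1 << page_shift) - 1)
    page_index = logical_address >> page_shift
    frame = table.get(page_index)
    if frame is None:
        raise LookupError(f"page {page_index} is not present")
    return (frame << page_shift) | offset


print(translate(1000, page_table))       # 49000
print(translate_pow2(1000, page_table))  # 50152
```

The first call reproduces the 49000 worked out in the question. The second shows how the result changes if the page size were 16384 bytes instead.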

PCIe TLP write packet address only 31:2 bits

Let's take a sample write packet : Suppose that the CPU wrote the
value 0x12345678 to the physical address 0xfdaff040 using 32-bit
addressing
This example is from this site (I didn't understand the explanations in the original post)
Why does the address start at the second bit [31:2]?
Why isn't the address the same?
An address of an aligned, 32-bit chunk always has two zero bits at the end of the address. You can think of this as either writing the address of the chunk to the 32-bit slot or as writing the address divided by four to bits 2 through 31 of the address. The result is the same either way since dividing by four is equivalent to shifting two bit positions to the right.
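As a quick check of that equivalence, here is a small sketch using the address from the example packet. The helper names are hypothetical and this is not a model of a full TLP header, only of the 30-bit address field.

```python
def to_address_field(byte_address):
    """Return the value carried in address bits [31:2] for a DWORD-aligned address."""
    if byte_address & 0x3:
        raise ValueError("address is not aligned to 4 bytes")
    return byte_address >> 2  # same as dividing by four


def from_address_field(field):
    """Rebuild the byte address by restoring the two implied zero bits."""
    return field << 2


field = to_address_field(0xFDAFF040)
print(hex(field))                      # 0x3f6bfc10
print(hex(from_address_field(field)))  # 0xfdaff040
```

Nothing is lost by dropping the two low bits, because they are always zero for an aligned 32-bit access.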

How is a variable assigned a memory address?

If I write an instruction x = 7, I understand x to be some address. What then assigns a memory address to x? Is this address a virtual address that is then translated into a physical memory address?
If I write an instruction x = 7, I understand x to be some address. What then assigns a memory address to x?
It depends on the type of var x.
if x is a global or static variable, several tools will cooperate to give it an address
the compiler will write in the object file that it needs to store a global var named x with 4 bytes.
the linker will collect all the global vars in object files, put them in the data segment, and choose a position for them. For instance, x will be at #data_segment+0x1000. The linker will then replace all references to x in the code with #data_segment+0x1000
when it runs the program, the loader will first ask the operating system for memory to store the different segments, including the data segment. One then knows the value of #data_segment and the actual address of x [1].
if x is a local variable, things are slightly simpler. All local vars are on the stack and their address is computed relative to the stack (or frame) pointer by the compiler. So the address of x will be something like #stack_pointer+8 and it is generated by the compiler. But its actual value is only known at execution and depends on the stack pointer.
if x is dynamically allocated (malloc-ed), its address is only known at run-time. malloc() asks the OS for chunks of memory and dynamically positions vars in them. x will be put at a position that depends on the free space in the memory managed by malloc()
Is this address a virtual address that is then translated into a physical memory address?
All addresses seen by the program are virtual addresses, which the hardware converts to physical memory addresses.
[1] Virtual addresses of program segments (including the data segment) used to be constant for different executions of the program, but this is no longer true. For security reasons, they are randomized.
There are generally four ways this is done.
1) The variable is mapped to a hardware register. In that case x has no address.
2) The variable has an absolute address. This is usually considered bad form because code using absolute addresses cannot be relocated, meaning it has to be placed in a fixed location in the address space. However, there are cases where a variable must be at a specific location, such as some interfaces to devices.
In this case the address of x may be specified by the compiler or by the linker.
3) The variable is defined as an offset from a stack-related register. This is the method used to implement local variables in most programming languages. If you have 4-byte integers and, say, a C declaration like
int x, y ;
in a function with no other variables, there would be instructions at the top of the function that look something like:
SUBL2 #8, SP ; Allocate 8 bytes from the stack
MOVL SP, BP ; Set the Base Pointer Register to the start of the allocation
where SP is the stack pointer and BP is some base pointer register.
In that case, x could be located at offset BP + 0 and y could be at BP + 4.
Thus something like
x = y
would look like
MOVL Y(BP), X(BP)
or written as:
MOVL 4(BP), (BP)
The memory locations of x and y are entirely determined at run time. Only the offset from the base pointer register is known. In fact, there could be multiple x and y active at the same time, having different addresses, if their containing function is called recursively or through an interrupt.
4) The memory location is another register offset (usually the program counter).
Let's say you are using traditional uppercase FORTRAN where all variables are static. It is common for the compiler to determine a location for a variable but refer to it using an offset from the program counter register (or some other register). The variable remains in a fixed place relative to the code at run time, but its absolute location can vary from one load to the next. Using such an offset allows the code to be position independent, meaning it can be loaded anywhere in memory. This allows the code to be used in shared libraries that can be used by multiple programs.
Usually the compiler sets some location for the variable and then that gets fixed by the linker.
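To illustrate case 3, here is a toy simulation (not real stack code) of why the same compile-time offset yields a different address in each active call. It assumes the 8-byte frame from the example above with x at BP + 0, and the starting stack pointer value is an arbitrary, hypothetical number.

```python
X_OFFSET = 0    # x lives at BP + 0
Y_OFFSET = 4    # y lives at BP + 4
FRAME_SIZE = 8  # two 4-byte integers


def addresses_of_x(depth, stack_pointer=0x7FFF0000):
    """Return the address of x in each of `depth` nested calls."""
    addresses = []
    for _ in range(depth):
        stack_pointer -= FRAME_SIZE   # like SUBL2 #8, SP
        base_pointer = stack_pointer  # like MOVL SP, BP
        addresses.append(base_pointer + X_OFFSET)
    return addresses


for address in addresses_of_x(3):
    print(hex(address))
# 0x7ffefff8
# 0x7ffefff0
# 0x7ffeffe8
```

The offset of x never changes, but each nested call gets its own frame, so each active x has its own address.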

Finding minimum page size to allow TLB access to overlap with tag fetch [duplicate]

Homework question, so please just nudge me in the right direction.
Consider a system with physically-addressed caches, and assume that 40-bit virtual addresses and 32-bit physical addresses are used, and the memory is byte-addressable. Further assume that the cache is 4-way set-associative, the cache line size is 64 Bytes and the total size of the cache is 64 KBytes.
What should be the minimum page size in this system to allow for the overlap of the TLB access and the cache access?
I've been stuck on this question and have no idea how to even begin. Can someone give me a hint towards finding the solution?
I think the most important piece of information in the question is
overlap of the TLB access and the cache access
This means, we access the Cache at the same time we access the TLB. In practice, what we really do is, we index the cache with the index bits from the virtual address and by the time we have located the entry in the cache, we will have the data (physical address) from the TLB. Then we can do the tag comparison with physical address. In other words cache acts as a Virtually indexed, Physically tagged (VIPT) cache.
Even though the scheme sounds efficient, the thing to look out for is that the bits used to access the cache before translation (the set index bits plus the block offset bits) must all fall within the page offset, because those are the only bits that are identical in the virtual and the physical address. Put simply, the page size puts an upper limit on the size of one way of the cache.
Now coming back to your question:
It's a 64 KByte cache, 4-way set associative, with 64-byte cache lines.
Number of sets = (64 KBytes / 4) / 64 Bytes = 2^8 sets, so 8 index bits
Block offset = log2(64) = 6 bits
Index + offset = 8 + 6 = 14 bits, i.e. one way of the cache covers 2^14 Bytes = 16 KBytes
That means if a page is 16 KBytes or bigger, we can use this mechanism. If a page is smaller than 16 KBytes, then we cannot assume the index bits of the virtual address and the physical address are going to be the same.
What should be the minimum page size in this system to allow for the
overlap of the TLB access and the cache access?
16 KBytes
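Here is a minimal sketch of that calculation. It assumes that both the set-index bits and the byte-within-line offset bits must come from the page offset, and that all sizes are powers of two. The function name is illustrative.

```python
def min_page_size(cache_size, associativity, line_size):
    """Smallest page size (in bytes) for which the set index and the
    block offset both fit inside the page offset."""
    num_sets = cache_size // (associativity * line_size)
    index_bits = num_sets.bit_length() - 1    # log2 for a power of two
    offset_bits = line_size.bit_length() - 1  # log2 for a power of two
    return 1 << (index_bits + offset_bits)


print(min_page_size(64 * 1024, 4, 64))  # 16384
```

Under these assumptions the result always equals the cache size divided by the associativity, here 64 KBytes / 4 = 16 KBytes.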

Memory segmentation and segment registers

I do not understand memory segmentation very well. If we have 1 MB of memory, does segmentation divide it into segments of 64 KB? Is this right?
So is there a specific segment for every segment register (CS, DS, SS, ES) that cannot be changed?
image for helping understand question
I guess you're referring to the old real mode of x86.
The values in the segment registers are not fixed. The idea is that the x86 had 16 bits of architectural address space, which was very limiting (64 KB), yet 20 bits of physical address space. Typical addresses would be 16 bits, but the value in a segment register supplies the most significant 16 bits of a 20-bit address. This means a segment must start on a 2^4 = 16-byte boundary. The hardware treats the segment register, shifted left by 4 bits, as a 20-bit base address, and the other address (e.g. the address of an instruction) is treated as an offset added to it.
Edit: One thing you might be asking is if the segments are mutually exclusive. The segments could overlap partially or completely. This made them quite powerful and quite dangerous.
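A minimal sketch of the real-mode address calculation, to make the overlap concrete. The function name is hypothetical, and the mask to 20 bits assumes the original 8086 behaviour where addresses wrap around at 1 MB.

```python
def real_mode_physical(segment, offset):
    """Physical address = segment * 16 + offset, kept to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# Two different segment:offset pairs that name the same byte,
# which is what it means for segments to overlap.
print(hex(real_mode_physical(0x1234, 0x0010)))  # 0x12350
print(hex(real_mode_physical(0x1235, 0x0000)))  # 0x12350
```

Because consecutive segment values are only 16 bytes apart while each segment spans up to 64 KB, many segments cover any given byte.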