Compute the average load time - cpu-architecture

A computer has a cache, main memory and a hard disk. If a referenced
word is in the cache, it takes 15 ns to access it. If it is in main memory
but not in the cache, it takes 85 ns to load (the block containing) it
into the cache (this includes the time to originally check the cache),
and then the reference lookup is started again. If the word is not in
main memory, it takes 10 ms to load (the block containing) it from
the disk into main memory, and then the reference lookup is started
again. The cache hit ratio is 0.4. In the case of a cache miss, the
probability that the word is in the main memory is 0.7. Compute the
average load time.
My Answer
Given:
Cache access time = 15 ns
Cache hit rate = 0.4
Cache miss rate = 1 – 0.4 = 0.6
RAM access time = 85 ns
RAM hit rate = 0.7
RAM miss rate = 1 – 0.7 = 0.3
Disk access time = 10 ms = 10,000,000 ns
> Average access time = (cache hit rate x cache access time) + (cache
> miss rate x RAM hit rate x RAM access time) + (cache miss rate x RAM
> miss rate x disk access time)
> = (0.4 * 15) + (0.6)(0.7 * 85) + (0.6)(0.3)(10,000,000)
> = 1,800,041.7 ns

I hope you are enjoying the computer systems class at Birkbeck... ;P
I think you missed something though:
(1) You are assuming the 10 ms includes the initial check of the cache (he specified that for the 85 ns but not for the 10 ms, so I would add it to be on the safe side).
(2) It says the reference lookup is started again after loading into the cache and main memory respectively... So from the question I understand that words can only be accessed from the cache (otherwise why bother with the 85 ns?). Hence, I think you need to add the time it takes to load the word into the cache from main memory when it is initially retrieved from the disk. Also, although I am not entirely sure on this one as it is a bit ambiguous, I think you need to add another 15 ns for the word to be accessed in the cache after it's loaded from main memory...
Interested to hear some thoughts

Kindly share the answers to the first three questions in the assignment here if you have done them.. :P
For this question answer is :
Cache access time = 15 ns
Memory access time = 85 ns + 15 ns = 100 ns
Disk access time = 10 x 10^6 ns + 100 ns = 10,000,100 ns
Average load time = 0.4 x 15 ns + 0.6[0.7 x 100 ns + 0.3(10,000,100 ns)]
Average load time = 6 + 0.6(70 + 3,000,030) = 6 + 1,800,060
Average load time = 1,800,066 ns ≈ 1.8 ms

If it is in main memory but not in the cache, it takes 85 ns to load (the block containing) it into the cache (this includes the time to originally check the cache).
You don't need to add 85 (memory) and 15 (cache).
For this question answer is :
Cache access time = 15 ns
Memory access time = 85 ns
Disk access time = 10 x 10^6 ns + 85 ns = 10,000,085 ns
Average load time = 0.4 x 15 ns + 0.6[0.7 x 85 ns + 0.3(10,000,085 ns)]
Average load time = 6 + 0.6(59.5 + 3,000,025.5) = 6 + 1,800,051
Average load time = 1,800,057 ns ≈ 1.8 ms
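A quick Python sketch of this last interpretation (assuming a main-memory hit costs the full 85 ns with the cache check included, and a disk miss costs the 10 ms disk load plus the 85 ns memory path):

```python
# Sketch of the thread's final calculation. Assumption: a main-memory
# hit costs the full 85 ns (the cache check is included in it), and a
# disk access costs 10 ms plus the 85 ns memory path.
CACHE_NS = 15
MEM_NS = 85
DISK_NS = 10_000_000  # 10 ms

cache_hit = 0.4
mem_hit = 0.7  # probability the word is in RAM, given a cache miss

disk_path_ns = DISK_NS + MEM_NS  # 10,000,085 ns
avg_ns = (cache_hit * CACHE_NS
          + (1 - cache_hit) * (mem_hit * MEM_NS
                               + (1 - mem_hit) * disk_path_ns))
print(avg_ns)  # 1800057.0 ns, i.e. about 1.8 ms
```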

Related

Calculating memory stalls while adding second level cache

I am trying to calculate the memory stall cycles per instruction when adding a second-level cache.
I have the following given values:
Direct Mapped cache with 128 blocks
16 KB cache
2ns Cache access time
1 GHz Clock Rate
1 CPI
80 clock cycles Miss Penalty
5% Miss rate
1.8 Memory Accesses per instruction
16 bit memory address
L2 Cache
4% Miss Rate
6 clock cycles miss penalty
As I understand it, the way to calculate the Memory stall cycles is by using the following formula:
Memory stall cycles = Memory accesses x Miss rate x Miss penalty
Which can be simplified as:
Memory stall cycles = instructions per program x misses per instruction x miss penalty
What I did was to multiply 1.8 x (.05 +.04) x (80 + 6) = 13.932
Would this be correct or am I missing something?
First of all, I am not sure about the given parameters for miss penalty for L1 and L2 (L1 being 80 cycles and L2 being 6 cycles).
Anyway using the data as it is:
You issue 1 instruction per clock
There are 1.8 memory accesses per instruction.
There is a 5% chance that an access misses L1 and a further 4% chance that it also misses L2. You would only access main memory if you miss in both L1 and L2: 0.05 * 0.04 = 0.002 = 0.2%. This means that per memory access, you are likely to reach main memory 0.2% of the time.
Since you have 1.8 memory accesses per instruction, you are likely to access main memory 0.002 * 1.8 = 0.0036 = 0.36% of the time per instruction.
When you encounter a miss in both L1 and L2, you get stalled for 80 + 6 = 86 cycles (ignoring any optimizations).
Per instruction, you encounter main memory accesses only 0.36% of the time. Hence the memory stall cycles per instruction are 0.0036 * 86 = 0.3096.
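The arithmetic above, as a small Python check (this follows the answer's model, which only charges a stall when an access misses both levels and then costs the full 86 cycles):

```python
# The answer's model: a stall is charged only when an access misses
# both L1 and L2, and then costs the full 80 + 6 = 86 cycles.
accesses_per_instr = 1.8
l1_miss = 0.05
l2_miss = 0.04
penalty_cycles = 80 + 6

both_miss = l1_miss * l2_miss               # 0.002 per access
per_instr = both_miss * accesses_per_instr  # 0.0036 per instruction
stall_cycles = per_instr * penalty_cycles
print(stall_cycles)  # ≈ 0.3096
```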

Operating Systems Virtual Memory

I am a student taking an Operating Systems course for the first time. I have a doubt about the calculation of the performance degradation when using demand paging. In the Silberschatz book on operating systems, the following lines appear:
"If we take an average page-fault service time of 8 milliseconds and a
memory-access time of 200 nanoseconds, then the effective access time in
nanoseconds is
effective access time = (1 - p) x (200) + p x (8 milliseconds)
= (1 - p) x 200 + p x 8,000,000
= 200 + 7,999,800 x p.
We see, then, that the effective access time is directly proportional to the
page-fault rate. If one access out of 1,000 causes a page fault, the effective
access time is 8.2 microseconds. The computer will be slowed down by a factor
of 40 because of demand paging! "
How did they calculate the slowdown here? Is 'performance degradation' and slowdown the same?
This whole thing is nonsensical. It assumes a fixed page-fault rate p, which is not realistic in itself. That rate is the fraction of memory accesses that result in a page fault.
1 - p is the fraction of memory accesses that do not result in a page fault.
T = (1 - p) x 200 ns + p x (8 ms) is then the average time of a memory access.
Expanded:
T = 200 ns + p x (8 ms - 200 ns)
T = 200 ns + p x 7,999,800 ns
The whole thing is rather silly.
All you really need to know is that a nanosecond is a billionth of a second and a millisecond is a thousandth of a second.
Using these figures, the 8 ms page-fault service time is a factor of 40,000 larger than the 200 ns memory access time.
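As for the original question about the factor of 40: plug p = 1/1000 into the book's formula and divide by the fault-free access time. A small Python check (the book's 40 is roughly 41 rounded down):

```python
# Where the book's "factor of 40" comes from: effective access time at
# a fault rate of 1 in 1,000, divided by the no-fault access time.
mem_ns = 200
fault_ns = 8_000_000  # 8 ms page-fault service time, in ns
p = 1 / 1000

eat_ns = (1 - p) * mem_ns + p * fault_ns  # ≈ 8199.8 ns ≈ 8.2 microseconds
slowdown = eat_ns / mem_ns                # ≈ 41, which the book rounds to 40
print(eat_ns, slowdown)
```

So yes, the "performance degradation" and the slowdown are the same thing here: the ratio of the effective access time with faults to the plain 200 ns access.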

Paging and TLB operating system

A hierarchical memory system that uses cache memory has a cache access time of 50 nanoseconds and a main memory access time of 300 nanoseconds; 75% of memory requests are reads, the hit ratio for read accesses is 0.8, and the write-through scheme is used. What will be the average access time of the system for read and write requests combined?
A 157.5 ns
B 110 ns
C 75 ns
D 82.5 ns
Answer is A, 157.5 ns for read and write.
Explanation:
1) average_read_access_time = 0.8 x 50 + 0.2 x (50 + 300) = 110 ns.
2) For the combined read & write average, take 75% of the read access time plus 25% of the write time, which under write-through goes straight to main memory:
0.75 x 110 + 0.25 x 300 = 157.5 ns.
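A small Python check of this calculation (write-through is modeled here, as in the explanation, by sending every write straight to main memory):

```python
# Average access time for the mixed read/write workload above.
cache_ns, mem_ns = 50, 300
read_frac = 0.75
read_hit, read_miss = 0.8, 0.2

# A read miss pays the cache lookup plus the main memory access.
read_ns = read_hit * cache_ns + read_miss * (cache_ns + mem_ns)
write_ns = mem_ns  # write-through: every write goes to main memory
avg_ns = read_frac * read_ns + (1 - read_frac) * write_ns
print(read_ns, avg_ns)  # ≈ 110.0 and 157.5
```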

Average memory access time

I would like to know whether I solved the equation below correctly.
Find the average memory access time for a processor with a 3 ns clock cycle time, a miss penalty of 40 clock cycles, a miss rate of 0.08 misses per instruction, and a cache access time of 1 clock cycle.
AMAT = Hit Time + Miss Rate * Miss Penalty
Hit Time = 3ns, Miss Penalty = 40ns, Miss Rate = 0.08
AMAT = 3 + 0.08 * 40 = 6.2ns
Check the "Miss Penalty": it is given as 40 clock cycles, not 40 ns. Be more careful to avoid trivial mistakes.
The question that you tried to answer cannot actually be answered, since you are given 0.08 misses per instruction but you don't know the average number of memory accesses per instruction. In an extreme case, if only 8 percent of instructions accessed memory, then every memory access would be a miss.
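For what it's worth, here is the arithmetic redone in Python with the miss penalty converted from cycles to time, under the (debatable, per the previous comment) assumption that the 0.08 can be treated as a per-access miss rate:

```python
# Redoing the AMAT arithmetic with the miss penalty converted from
# cycles to nanoseconds. Assumption (debatable, see the comment above):
# the 0.08 is treated as misses per memory access.
clock_ns = 3
hit_time_ns = 1 * clock_ns       # cache access = 1 cycle = 3 ns
miss_penalty_ns = 40 * clock_ns  # 40 cycles = 120 ns, not 40 ns
miss_rate = 0.08

amat_ns = hit_time_ns + miss_rate * miss_penalty_ns
print(amat_ns)  # ≈ 12.6
```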

Interrupt time in DMA operation

I'm facing difficulty with the following question:
Consider a disk drive with the following specifications.
16 surfaces, 512 tracks/surface, 512 sectors/track, 1 KB/sector, rotation speed 3000 rpm. The disk is operated in cycle stealing mode whereby whenever 1 byte word is ready it is sent to memory; similarly for writing, the disk interface reads a 4 byte word from the memory in each DMA cycle. Memory Cycle time is 40 ns. The maximum percentage of time that the CPU gets blocked during DMA operation is?
The solution to this question provided online is:
Revolutions Per Min = 3000 RPM
or 3000/60 = 50 RPS
In 1 Round it can read = 512 KB
No. of tracks read per second = (2^19/2^2)*50
= 6553600 ............. (1)
Interrupt = 6553600 takes 0.2621 sec
Percentage Gain = (0.2621/1)*100
= 26 %
I have understood up to (1).
Can anybody explain to me where the 0.2621 has come from? How is the interrupt time calculated? Please help.
Reversing from the numbers you've given, that's 6553600 * 40 ns, which gives 0.2621 sec.
One quite obvious problem is that the comments in the calculations are somewhat wrong. It's not
Revolutions Per Min = 3000 RPM ~ or 3000/60 = 50 RPS
In 1 Round it can read = 512 KB
No. of tracks read per second = (2^19/2^2)*50 <- WRONG
The numbers are 512K / 4 * 50, so it's in bytes. How could that be called the 'number of tracks'? Reading a full track takes 1 full rotation, so the number of tracks readable in 1 second is 50, as there are 50 RPS.
However, the total bytes readable in 1 s is then just 512K * 50, since 512K is the amount of data on a track. But then it is further divided by 4.
So, I guess, the actual comments should be:
Revolutions Per Min = 3000 RPM ~ or 3000/60 = 50 RPS
In 1 Round it can read = 512 KB
Interrupts per second = (2^19/2^2) * 50 = 6553600 (*)
Interrupt triggers one memory op, so then:
total wasted: 6553600 * 40ns = 0.2621 sec.
However, I don't really like how the 'number of interrupts per second' is calculated. I currently don't see/feel/guess how or why it's just bytes/4.
The only VAGUE explanation of that "divide by 4" I can think of is:
At each byte written to the controller's memory, an event is triggered. However, the DMA controller can read only PACKETS of 4 bytes, so the hardware DMA controller must WAIT until there are at least 4 bytes ready to be read. Only then does the DMA kick in and halt the bus (or part of it) for the duration of the one memory cycle needed to copy the data. While the bus is frozen, the processor MAY have to wait. It doesn't NEED to; it can be doing its own ops and working on cache, but if it tries touching memory, it will need to wait until the DMA finishes.
However, I don't like a few things in this "explanation". I cannot guarantee you that it is valid. It really depends on what architecture you are analyzing and how the DMA/CPU/BUS are organized.
The only mistake is that it's not
no. of tracks read
It's actually the no. of interrupts that occurred (the no. of times the DMA came up with its data; these are the times the CPU will be blocked).
But again, I don't know why it has been multiplied by 50, probably because of the 1 second, but I wish to solve this without multiplying by 50.
My Solution:-
Here, in 1 rotation the interface can read 512 KB of data. 1 rotation time = 0.02 sec, so one byte's preparation time = 39.1 ns, and for 4 B it takes 156.4 ns. Memory cycle time = 40 ns. So the % of time the CPU gets blocked = 40/(40 + 156.4) = 0.2036 ≈ 20%. But in the answer booklet the options are given as A) 10 B) 25 C) 40 D) 50. Tell me if I'm doing something wrong?
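Both readings of the question can be put side by side in a few lines of Python. The 26% comes from counting 4-byte transfers over one second; the ~20% comes from comparing the 40 ns memory cycle with the time the disk needs to produce the next 4 bytes. (Binary K is assumed here, so the second figure lands near 20.8% rather than the 20.4% above, which used decimal K.)

```python
# Both readings of the DMA question. Assumes one 40 ns memory cycle per
# 4-byte DMA transfer and 512 KB (binary K) of data per rotation.
BYTES_PER_ROT = 512 * 1024
ROT_PER_SEC = 3000 / 60  # 50 rotations per second
CYCLE_NS = 40

# Reading 1 (the posted solution): count 4-byte transfers in 1 second.
transfers_per_sec = BYTES_PER_ROT / 4 * ROT_PER_SEC   # 6,553,600
blocked_frac_1 = transfers_per_sec * CYCLE_NS * 1e-9  # ≈ 0.262 -> ~26%

# Reading 2 (the last comment): compare one memory cycle with the time
# the disk needs to deliver the next 4 bytes.
byte_ns = 1e9 / (ROT_PER_SEC * BYTES_PER_ROT)         # ≈ 38.1 ns per byte
blocked_frac_2 = CYCLE_NS / (CYCLE_NS + 4 * byte_ns)  # ≈ 0.208 -> ~21%
print(blocked_frac_1, blocked_frac_2)
```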