Calculate the time for executing instructions with a pipeline - cpu-architecture

Suppose that one instruction requires 10 clock cycles from the fetch stage to the write-back stage, and we want to calculate the time required to execute 1,000,000 instructions. Each clock cycle takes 2 ns.
(a) Calculate the time required.
The answer says it is 1,000,009 * 2 ns, where the extra 9 clock cycles account for filling the pipeline. Why is this? I thought that since an instruction fetch happens every clock cycle, it would be 1,000,000 * 2 ns.

Cycle:    1  2  3  4  5  6  7  8  9  10 11 12
Instr 1:  1  2  3  4  5  6  7  8  9  10
Instr 2:     1  2  3  4  5  6  7  8  9  10
Instr 3:        1  2  3  4  5  6  7  8  9  10
Consider these three instructions. The first instruction takes 10 clock cycles, but the next two together add only 2 more clock cycles, because once the pipeline is full a new instruction completes every cycle. So the remaining 999,999 instructions add another 999,999 clock cycles, and 1,000,000 instructions take 10 + 999,999 = 1,000,009 clock cycles in total.
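In general, N instructions on a k-stage pipeline need k + (N - 1) cycles: k cycles for the first instruction, then one more per remaining instruction. A minimal sketch of that arithmetic (Python, purely illustrative):

def pipeline_time_ns(n_instructions, stages, cycle_ns):
    # First instruction takes `stages` cycles to reach write-back;
    # after that, one instruction completes per cycle.
    cycles = stages + (n_instructions - 1)
    return cycles * cycle_ns

# 1,000,009 cycles * 2 ns = 2,000,018 ns (about 2 ms)
print(pipeline_time_ns(1_000_000, 10, 2))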

Related

Is it possible to calculate AWT and ATA for SJF (shortest job first) based on priority?

As far as I know, in SJF the process whose burst time (BT) is lowest is executed first, taking arrival time (AT) into account.
But if a priority is also attached, how do I reach the solution? (A sketch of one way to compute it follows the process table below.)
Process   Arrival Time   Burst Time   Priority
P1        7              3            2
P2        5              2            4
P3        4              5            1
P4        0              4            3
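As a starting point, here is a minimal sketch (Python, purely illustrative) that computes AWT and ATA for the table above under plain non-preemptive SJF, with priority used only to break burst-time ties. Both of those choices are assumptions; the question does not define how priority and SJF should interact.

# Non-preemptive SJF over the table above; process tuples are
# (name, arrival_time, burst_time, priority). Priority only breaks
# burst-time ties, and a larger number is assumed to mean higher
# priority (assumptions, not given in the question).
def sjf_awt_ata(processes):
    pending = sorted(processes, key=lambda p: p[1])    # order by arrival time
    t = 0
    waiting, turnaround = {}, {}
    while pending:
        ready = [p for p in pending if p[1] <= t] or [pending[0]]
        name, arrival, burst, prio = min(ready, key=lambda p: (p[2], -p[3]))
        t = max(t, arrival) + burst                    # run the chosen job to completion
        turnaround[name] = t - arrival
        waiting[name] = turnaround[name] - burst
        pending.remove((name, arrival, burst, prio))
    n = len(waiting)
    return sum(waiting.values()) / n, sum(turnaround.values()) / n

procs = [("P1", 7, 3, 2), ("P2", 5, 2, 4), ("P3", 4, 5, 1), ("P4", 0, 4, 3)]
print(sjf_awt_ata(procs))   # (2.0, 5.5) under these assumptions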

Perl CPU profiling

I want to profile my Perl script for CPU time. I found Devel::NYTProf and Devel::SmallProf, but the first one cannot show the CPU time and the second one works badly; at least I couldn't find what I need.
Can you advise any tool for my purposes?
UPD: I need per-line profiling, since my script takes a lot of CPU time and I want to improve the relevant part of it.
You could try your system's time utility rather than the shell's built-in one (the leading \ is not a typo):
$ \time -v perl collatz.pl
13 40 20 10 5 16 8 4 2 1
23 70 35 106 53 160 80 40
837799 525
Command being timed: "perl collatz.pl"
User time (seconds): 3.79
System time (seconds): 0.06
Percent of CPU this job got: 97%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.94
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 171808
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 9
Minor (reclaiming a frame) page faults: 14851
Voluntary context switches: 16
Involuntary context switches: 935
Swaps: 0
File system inputs: 1120
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

Round Robin Scheduling: What happens when all jobs arrive at the same time?

Problem:
Five batch jobs A through E arrive at a computer center at almost the same time. They have estimated running times of 10, 6, 2, 4, and 8 minutes. Their (externally determined) priorities are 3, 5, 2, 1, and 4, respectively, with 5 being the highest priority. Determine the mean process turnaround time. Ignore process switching overhead. For Round Robin scheduling, assume that the system is multiprogramming and that each job gets its fair share of the CPU. All jobs are completely CPU bound.
Solution #1: The following solution comes from this page:
For round robin, during the first 10 minutes, each job gets 1/5 of the
CPU. At the end of the 10 minutes, C finishes. During the next 8
minutes, each job gets 1/4 of the CPU, after which time D finishes.
Then each of the three remaining jobs get 1/3 of the CPU for 6
minutes, until B finishes and so on. The finishing times for the five
jobs are 10, 18, 24, 28, and 30, for an average of 22 minutes.
Solution #2: The following solution comes from Cornell University (here), which is different (and this one makes more sense to me):
Remember that the turnaround time is the amount of time that elapses
between the job arriving and the job completing. Since we assume that
all jobs arrive at time 0, the turnaround time will simply be the
time that they complete.
(a) Round Robin: The table below gives a breakdown of which jobs will be processed during each time quantum. A * indicates that the job completes during that quantum.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
A B C D E A B C* D E A B D E A B D* E A B E A B* E A E A E* A A*
The results are different: In the first one C finishes after 10 minutes, for example, whereas in the second one C finishes after 8 minutes.
Which one is correct, and why? I'm confused. Thanks in advance!
Q1: I believe that the "fair share" requirement means you can assume the time is evenly divided amongst running processes, and thus the particular order won't matter. You could also think of this as the quantum being so low that any variation introduced by a particular ordering is too small to worry about.
Q2: From the above, assuming the time is evenly divided, it will take 10 minutes for all processes to get 2 minutes of their own, at which point C will be done.
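If it helps, here is a minimal sketch (Python, purely illustrative) of that fair-share view: every runnable job gets an equal slice of the CPU, as if the quantum were negligibly small. It reproduces the finish times 10, 18, 24, 28, 30 and the 22-minute mean from solution #1.

# Ideal processor sharing: every runnable job receives the same CPU share.
def fair_share_finish_times(burst_minutes):
    remaining = dict(burst_minutes)        # job -> CPU minutes still needed
    t = 0
    finish = {}
    while remaining:
        share = min(remaining.values())    # CPU time until the next job finishes
        t += share * len(remaining)        # wall-clock time to hand out that share
        for job in [j for j, r in remaining.items() if r == share]:
            finish[job] = t
            del remaining[job]
        for job in remaining:
            remaining[job] -= share
    return finish

jobs = {"A": 10, "B": 6, "C": 2, "D": 4, "E": 8}
finish = fair_share_finish_times(jobs)
print(finish)                                # C: 10, D: 18, B: 24, E: 28, A: 30
print(sum(finish.values()) / len(finish))    # mean turnaround = 22.0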

Round Robin Scheduling: Two different solutions - How is that possible?

Problem:
Five batch jobs A through E arrive at a computer center at almost the same time. They have estimated running times of 10, 6, 2, 4, and 8 minutes. Their (externally determined) priorities are 3, 5, 2, 1, and 4, respectively, with 5 being the highest priority. Determine the mean process turnaround time. Ignore process switching overhead. For Round Robin scheduling, assume that the system is multiprogramming and that each job gets its fair share of the CPU. All jobs are completely CPU bound.
Solution #1: The following solution comes from this page:
For round robin, during the first 10 minutes, each job gets 1/5 of the
CPU. At the end of the 10 minutes, C finishes. During the next 8
minutes, each job gets 1/4 of the CPU, after which time D finishes.
Then each of the three remaining jobs get 1/3 of the CPU for 6
minutes, until B finishes and so on. The finishing times for the five
jobs are 10, 18, 24, 28, and 30, for an average of 22 minutes.
Solution #2: The following solution comes from Cornell University (it can be found here) and is obviously different from the previous one, even though the problem is given in exactly the same form (this solution, by the way, makes more sense to me):
Remember that the turnaround time is the amount of time that elapses
between the job arriving and the job completing. Since we assume that
all jobs arrive at time 0, the turnaround time will simply be the time
that they complete.
(a) Round Robin: The table below gives a breakdown of which jobs will be processed during each time quantum. A * indicates that the job completes during that quantum.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
A B C D E A B C* D E A B D E A B D* E A B E A B* E A E A E* A A*
The results are different: In the first one C finishes after 10 minutes, for example, whereas in the second one C finishes after 8 minutes.
Which one is correct, and why? I'm confused. Thanks in advance!
The problems are different. The first problem does not specify a time quantum, so you have to assume the quantum is very small compared to a minute. The second problem clearly specifies a one-minute scheduler quantum.
The mystery with the second solution is why it assumes the tasks run in letter order. I can only assume that this is an assumption made throughout the course, and so students would be expected to know to make it here.
In fact, there is no such thing as a 'correct' RR algorithm. RR is merely a family of algorithms, based on the common concept of scheduling several tasks in a circular order. Implementations may vary (for example, you may consider task priorities or you may discard them, or you may manually set the priority as a function of task length or whatever else).
So the answer is that both algorithms seem to be correct; they are just different.
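For what it's worth, here is a minimal sketch (Python, purely illustrative) of solution #2's model: a one-minute quantum, jobs serviced in letter order, and all jobs arriving at time 0. It reproduces the completion times from the table above (C=8, D=17, B=23, E=28, A=30, mean 21.2), versus a mean of 22 for the fair-share model.

from collections import deque

# Round robin with a fixed quantum; all jobs arrive at time 0, and the
# ready queue starts in letter order (the assumption behind solution #2).
def rr_finish_times(burst_minutes, quantum=1):
    remaining = dict(burst_minutes)
    queue = deque(burst_minutes)           # insertion order = letter order
    t = 0
    finish = {}
    while queue:
        job = queue.popleft()
        run = min(quantum, remaining[job])
        t += run
        remaining[job] -= run
        if remaining[job] == 0:
            finish[job] = t                # job completes during this quantum
        else:
            queue.append(job)              # otherwise back to the end of the queue
    return finish

jobs = {"A": 10, "B": 6, "C": 2, "D": 4, "E": 8}
finish = rr_finish_times(jobs)
print(finish)                                # C: 8, D: 17, B: 23, E: 28, A: 30
print(sum(finish.values()) / len(finish))    # mean turnaround = 21.2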

Can't understand Belady's anomaly

So Belady's anomaly states that when using a FIFO page replacement policy, adding more page space gives us more page faults.
My intuition says that we should have fewer or, at most, the same number of page faults as we add more page space.
If we think of a FIFO queue as a pipe, adding more page space is like making the pipe bigger:
____
O____O size 4
________
O________O size 8
So, why would you get more page faults? My intuition says that with a longer pipe, you'd take a bit longer to start having page faults (so, with an infinite pipe you'd have no page faults) and then you'd have just as many page faults and just as often as with a smaller pipe.
What is wrong with my reasoning?
The reason that, with FIFO, increasing the number of page frames can increase the fault rate for some access patterns is that with more frames, a recently requested page can still be sitting near the eviction end of the FIFO queue: a hit does not refresh a page's position, so the page can be thrown out right before it is requested again.
Consider the third time that "3" is requested in the Wikipedia example here:
http://en.wikipedia.org/wiki/Belady%27s_anomaly
Page faults are marked with an "f".
1:
Page requests   3   2   1   0   3   2   4   3   2   1   0   4
Newest page     3f  2f  1f  0f  3f  2f  4f  4   4   1f  0f  0
                -   3   2   1   0   3   2   2   2   4   1   1
Oldest page     -   -   3   2   1   0   3   3   3   2   4   4
2:
Page requests   3   2   1   0   3   2   4   3   2   1   0   4
Newest page     3f  2f  1f  0f  0   0   4f  3f  2f  1f  0f  4f
                -   3   2   1   1   1   0   4   3   2   1   0
                -   -   3   2   2   2   1   0   4   3   2   1
Oldest page     -   -   -   3   3   3   2   1   0   4   3   2
In the first example (with fewer pages), there are 9 page faults.
In the second example (with more pages), there are 10 page faults.
When using FIFO, increasing the size of the cache changes the order in which items are removed, which in some cases can increase the fault rate.
Belady's Anomaly does not say anything about the general trend of fault rates with respect to cache size, so your reasoning (viewing the cache as a pipe) is not wrong in the general case.
In summary:
Belady's Anomaly points out that, because a larger FIFO cache can let an item drift toward eviction later (and in a different order) than a smaller cache, a particular (and possibly rare) access pattern can give the larger cache a higher fault rate.
This statement is wrong because it is overgeneralized:
Belady's Anomaly states that when using a FIFO page replacement policy, adding more page space gives us more page faults
This is a corrected version:
"Belady's Anomaly states that when using a FIFO page replacement policy, when adding more page space, some memory access patterns will actually result in more page faults."
In other words... whether the anomaly is observed depends on the actual memory access pattern.
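If you want to reproduce the counts above yourself, here is a minimal sketch (Python, purely illustrative) that counts FIFO page faults for the Wikipedia reference string with 3 and then 4 frames:

from collections import deque

# FIFO page replacement: evict the page that has been resident the longest.
# A hit does not change the queue, which is exactly what the anomaly exploits.
def fifo_faults(requests, frames):
    resident = deque()                 # leftmost = oldest resident page
    faults = 0
    for page in requests:
        if page not in resident:       # page fault
            faults += 1
            if len(resident) == frames:
                resident.popleft()     # evict the oldest page
            resident.append(page)
    return faults

requests = [3, 2, 1, 0, 3, 2, 4, 3, 2, 1, 0, 4]
print(fifo_faults(requests, 3))   # 9 faults
print(fifo_faults(requests, 4))   # 10 faults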
Belady's anomaly can occur in page replacement algorithms that do not have the stack property: the set of pages resident with fewer frames should always be a subset of the set of pages resident with more frames, i.e. on increasing the number of page frames, the pages that were present before must still be present. FIFO (and even random replacement) can violate this, but LRU and the optimal algorithm cannot, so the anomaly can occur with FIFO or random replacement but not with LRU or OPT.
I could not understand Belady's anomaly even after reading the Wikipedia article and the accepted answer. After writing out the trace I kind of got it; here I'd like to share my understanding.
Keys to understanding Belady's anomaly:
Unlike LRU, FIFO just pushes out the oldest element regardless of how often it is used, so staying in the FIFO queue longer means eventually falling victim to eviction.
From here on, I refer to the 3-frame and 4-frame FIFO queues as FIFO3 and FIFO4.
To understand the Wikipedia example, I divided it into two parts: when FIFO3 catches up with FIFO4, and when FIFO3 overtakes FIFO4.
How FIFO3 catches up with FIFO4 by the 9th request
Look at page 3 in both FIFOs. In FIFO3, 3 is evicted on the 4th request and comes back on the 5th, so it is still there on the 8th and a cache hit happens.
In FIFO4, 3 is a HIT on the 5th request, but because that was a hit, 3 is not re-inserted and is evicted on the 7th, right before the next reference to 3 on the 8th.
Page 2 behaves the same way as 3: the second 2 (6th request) is a MISS in FIFO3 and a HIT in FIFO4, but the third 2 (9th request) is a HIT in FIFO3 and a MISS in FIFO4.
It might help to think of it like this: in FIFO3, 3 was reloaded on the 5th request and so stayed until the 8th; in FIFO4, 3 was old and was evicted on the 7th, right before the next reference to 3.
How FIFO3 overtakes FIFO4
Because FIFO4 has two cache misses on the 8th and 9th requests, 4 is pushed closer to the eviction end of the queue and is evicted on the 11th request.
FIFO3 still retains 4 on the 12th request, because the 8th and 9th requests were cache hits there, so 4 was not pushed toward eviction.
I think this is why Wikipedia's article says "Penny Wise, Pound Foolish".
Conclusion
FIFO is a simple and naive algorithm that does not take frequency into account.
You might get a better understanding by applying LRU (Least Recently Used) to Wikipedia's example. With LRU, 4 frames does better than 3 frames on this example.
Belady's anomaly happens in a FIFO scheme only when the page currently being referenced is the page that was most recently removed from main memory; only in that case can adding more page space result in more page faults.