Analysing Performance - muneeb-mbytes/computerArchitectureCourse GitHub Wiki
Analyzing Delays
Component delay and impact on clock period
- PC: a single register, so its delay is just the register delay, which is taken as 0.
- Adder delay t+: the delay of the adders that compute PC + 4 and PC + 4 + offset.
- ALU delay: tA.
- Mux delay: 0.
- Register file delay tR (the register file is an array of registers, one of which is selected according to the address), program memory delay ti, and data memory delay tM. These are the remaining component delays that enter the analysis.
- Interconnecting wires also have delay, which can become significant when the gate logic is very quick; it is ignored here.
So, to calculate the impact on the clock period, we need to consider each instruction or group of instructions separately.
Instructions like add, subtract, AND and OR follow 2 paths:
- Computing the sum, difference, AND or OR, which goes through the instruction memory, the register file, the ALU and back to the register file: ti + tR + tA + tR.
- Through the adder to compute PC + 4: t+.
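The two paths above can be compared with a small numeric sketch. The delay values below are hypothetical, chosen only for illustration, and the variable names (`t_i`, `t_R`, `t_A`, `t_plus`) are assumed stand-ins for ti, tR, tA and t+.

```python
# Hypothetical component delays in picoseconds (illustrative values only,
# not taken from the course material).
t_i = 200     # instruction (program) memory
t_R = 100     # register file, read or write
t_A = 150     # ALU
t_plus = 80   # adder that computes PC + 4

# Path 1: instruction memory -> register file -> ALU -> register file.
datapath_delay = t_i + t_R + t_A + t_R

# Path 2: adder computing PC + 4.
pc_update_delay = t_plus

# Both paths run in the same cycle, so the slower one sets the delay.
r_type_delay = max(datapath_delay, pc_update_delay)
print(r_type_delay)
```

With these assumed numbers the register-to-register path dominates, which is the usual case because the PC + 4 adder works in parallel with the much longer data path.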
Delays for different instructions
- Delay for {add, sub, AND, OR}: the clock period is the maximum of { (ti + tR + tA + tR), (t+) }, where t+ is the delay of the adder computing PC + 4, which is very quick.
- Delay for {sw}, the store word instruction: the clock period is the maximum of { (t+), (ti + tR + tA + tM) }. Here we assume the same time for reading and writing, both for the memory and for the register file, but in general they need not be the same.
- Delay for {lw}, the load word instruction: the clock period is the maximum of { (t+), (ti + tR + tA + tM + tR) }.
- Delay for {beq}, the branch if equal instruction: here we need to consider 3 paths, and the clock period is the maximum over all three:
  - One path goes through both adders: t+ + t+.
  - Another path goes through the instruction memory, from which the offset is picked and then added to PC + 4: ti + t+.
  - The third path is the one along which the comparison is done: ti + tR + tA.
- Delay for {j}, the jump instruction: the clock period is the maximum of { (t+), (ti) }.
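The per-instruction expressions listed above can be evaluated side by side. The following is a minimal sketch: the function name `instruction_delays` and the numeric delays are hypothetical, introduced only for illustration, and the same read/write time is assumed for the register file and for memory, as in the sw case.

```python
def instruction_delays(t_i, t_R, t_A, t_M, t_plus):
    """Return the delay of each instruction group as the maximum over its paths."""
    return {
        "add/sub/AND/OR": max(t_i + t_R + t_A + t_R, t_plus),
        "sw": max(t_plus, t_i + t_R + t_A + t_M),
        "lw": max(t_plus, t_i + t_R + t_A + t_M + t_R),
        "beq": max(t_plus + t_plus, t_i + t_plus, t_i + t_R + t_A),
        "j": max(t_plus, t_i),
    }


# Hypothetical delays in picoseconds (illustrative values only).
delays = instruction_delays(t_i=200, t_R=100, t_A=150, t_M=200, t_plus=80)

for name, delay in delays.items():
    print(f"{name}: {delay} ps")
```

Keeping each path as an explicit sum inside `max` mirrors the expressions in the list, so a different set of component delays can be tried by changing only the arguments.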
Summarizing,
The clock period has to be large enough to accommodate all these possibilities. Hence it is the maximum of all the delays listed above.
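As a worked form of this statement, the overall maximum can be written out and simplified. This is a sketch that assumes every component delay is non-negative; the symbols $T_{clk}$, $t_+$ and $t_i$ are notation chosen here for the clock period, the adder delay and the instruction memory delay.

```latex
\begin{aligned}
T_{clk} \ge \max\{\,
  & t_i + t_R + t_A + t_R,\;        && \text{(add, sub, AND, OR)} \\
  & t_i + t_R + t_A + t_M,\;        && \text{(sw)} \\
  & t_i + t_R + t_A + t_M + t_R,\;  && \text{(lw)} \\
  & t_+ + t_+,\; t_i + t_+,\; t_i + t_R + t_A,\; && \text{(beq)} \\
  & t_+,\; t_i \,\}                 && \text{(PC + 4, j)}
\end{aligned}
```

Every data-path term other than the lw term is a partial sum of the lw term, and $t_+ \le t_+ + t_+$, so under the non-negativity assumption the bound reduces to:

```latex
T_{clk} \ge \max\{\, t_+ + t_+,\; t_i + t_+,\; t_i + t_R + t_A + t_M + t_R \,\}
```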