Processor Design ‐4 - muneeb-mbytes/computerArchitectureCourse GitHub Wiki
Clock period in a single cycle design:
All stages of an instruction are completed within one long clock cycle.
If all instructions must complete within one clock cycle, then the cycle time has to be large enough to accommodate the slowest instruction.
For example, lw $t0, -4($sp) needs 8ns, assuming the delays shown here.
- reading the instruction memory: 2ns
- reading the base register $sp: 1ns
- computing the memory address $sp-4: 2ns
- reading the data memory: 2ns
- storing data back to $t0: 1ns
Total: 8ns
If we make the cycle time 8ns, then every instruction will take 8ns, even if it doesn’t need that much time.
For example, the instruction add $s4, $t1, $t2 really needs just 6ns.
- reading the instruction memory: 2ns
- reading registers $t1 and $t2: 1ns
- computing $t1 + $t2: 2ns
- storing the result into $s4: 1ns
Total: 6ns
With these same component delays, a sw instruction would need 7ns, and beq would need just 5ns.
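The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the dictionary and function names are hypothetical, and the component lists for sw and beq are an assumed breakdown chosen to match the 7ns and 5ns figures quoted above.

```python
# Component delays in ns, taken from the example above.
DELAY_NS = {
    "instruction_memory": 2,
    "register_read": 1,
    "alu": 2,
    "data_memory": 2,
    "register_write": 1,
}

# Components each instruction uses. The lw and add lists follow the text;
# the sw and beq lists are an assumption consistent with 7ns and 5ns.
COMPONENTS = {
    "lw": ["instruction_memory", "register_read", "alu",
           "data_memory", "register_write"],
    "add": ["instruction_memory", "register_read", "alu", "register_write"],
    "sw": ["instruction_memory", "register_read", "alu", "data_memory"],
    "beq": ["instruction_memory", "register_read", "alu"],
}


def instruction_time_ns(name):
    """Time an instruction really needs: the sum of its component delays."""
    return sum(DELAY_NS[c] for c in COMPONENTS[name])


def single_cycle_period_ns():
    """The single-cycle clock period is set by the slowest instruction."""
    return max(instruction_time_ns(name) for name in COMPONENTS)


def wasted_time_ns(name):
    """Time an instruction sits idle within the single long cycle."""
    return single_cycle_period_ns() - instruction_time_ns(name)


for name in COMPONENTS:
    print(name, instruction_time_ns(name), "ns needed,",
          wasted_time_ns(name), "ns wasted")
```

With these delays, lw sets the period at 8ns, so add idles for 2ns and beq for 3ns of every cycle.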
Multi-cycle Design:
In a multi-cycle design, the execution of each instruction is broken down into several cycles, and the design decides which tasks are completed in each cycle. For simplicity, consider a basic approach where each major action takes a single clock cycle. In this scenario, the largest of the action delays (t_I, t_M, t_+ and so on) dictates the clock period. The number of clocks needed varies across instruction types such as R class, lw, sw, beq, and j:
- R class requires four clocks
- lw requires five
- sw requires four
- beq requires three
- j requires only one
The action delays differ slightly, which leaves small dead times in some cycles, but overall performance still improves because less time is wasted in total.
With this approach, however, lw can take longer in total than it did before: it previously took one long clock and now takes five shorter clocks. Some time is wasted here as well, but on the whole this approach still gives better performance.
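A rough sketch of this trade-off is given below. The clock counts come from the list above. The 2ns clock period is an assumption (the largest component delay in the earlier single-cycle example), and the instruction mix is purely hypothetical, chosen only to show the calculation. All names are illustrative.

```python
# Clocks per instruction type, from the list above.
CLOCKS = {"R": 4, "lw": 5, "sw": 4, "beq": 3, "j": 1}

# Assumed clock period: the largest component delay (2ns) from the
# earlier single-cycle example.
PERIOD_NS = 2

# Single-cycle period from the earlier example, for comparison.
SINGLE_CYCLE_NS = 8

# Hypothetical instruction mix in percent (NOT from the source).
MIX_PERCENT = {"R": 45, "lw": 25, "sw": 10, "beq": 15, "j": 5}


def multi_cycle_time_ns(instr, period_ns=PERIOD_NS):
    """Total time of one instruction in the multi-cycle design."""
    return CLOCKS[instr] * period_ns


def average_time_ns(mix_percent, period_ns=PERIOD_NS):
    """Average time per instruction, weighted by the given mix."""
    total = sum(pct * CLOCKS[instr] * period_ns
                for instr, pct in mix_percent.items())
    return total / 100


print("lw:", multi_cycle_time_ns("lw"), "ns vs", SINGLE_CYCLE_NS, "ns")
print("average:", average_time_ns(MIX_PERCENT), "ns vs", SINGLE_CYCLE_NS, "ns")
```

Under these assumptions lw takes 10ns instead of 8ns, while the shorter instructions pull the average below 8ns. How large the gain is depends entirely on the real instruction mix and clock period.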
Here we have assumed that the delays (t_I, t_M, etc.) are almost equal. If they differ widely, the cycles become unbalanced and more time is wasted.
The following methods can be used to keep the delays balanced:
Multiple actions in a period:
If two actions each take very little time, both can be performed in a single period instead of giving each action its own period.
Multiple periods for an action:
If an action takes much longer than the others, it can be spread over more than one period. Find a suitable clock period so that the dead time, that is, the wasted time, is minimized.
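One way to compare candidate clock periods is sketched below, assuming a long action may occupy several whole periods. The action delays and candidate periods are hypothetical values, not figures from the source, and the sketch ignores both the merging of short actions and any per-cycle overhead from extra clocks.

```python
def cycles_needed(delay_ns, period_ns):
    """Whole clock periods an action occupies (integer ceiling division)."""
    return (delay_ns + period_ns - 1) // period_ns


def dead_time_ns(delays_ns, period_ns):
    """Total idle time when every action is rounded up to whole periods."""
    return sum(cycles_needed(d, period_ns) * period_ns - d
               for d in delays_ns)


def best_period_ns(delays_ns, candidates_ns):
    """Candidate period with the least total dead time."""
    return min(candidates_ns, key=lambda p: dead_time_ns(delays_ns, p))


# Hypothetical unbalanced delays in ns: one action (6ns) is much slower.
delays = [2, 1, 2, 6, 1]
candidates = [2, 3, 6]

for period in candidates:
    print(period, "ns period:", dead_time_ns(delays, period), "ns dead time")
print("best:", best_period_ns(delays, candidates), "ns")
```

With one action per period, the clock would have to be 6ns and 18ns would be wasted across the five actions. Letting the slow action take three 2ns periods cuts the dead time to 2ns in this example.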
Single cycle datapath
The steps to improve resource utilisation are as follows:
- The ALU performs all arithmetic and logic operations
- The two memory blocks (IM, instruction memory, and DM, data memory) are merged into one
- The single memory stores the program as well as the data
- The two adders are removed and the connections are rearranged
- Registers are introduced