Processor Design ‐4 - muneeb-mbytes/computerArchitectureCourse GitHub Wiki

Clock period in a single cycle design:

All stages of an instruction are completed within one long clock cycle.

If all instructions must complete within one clock cycle, then the cycle time has to be large enough to accommodate the slowest instruction.

single clock cycle drawio

For example, lw $t0, -4($sp) needs 8ns, assuming the delays shown here.

  reading the instruction memory            2ns

  reading the base register $sp             1ns

  computing memory address $sp-4            2ns

  reading the data memory                   2ns

  storing data back to $t0                  1ns                

Total 8ns

If we make the cycle time 8ns then every instruction will take 8ns, even if they don’t need that much time.

For example, the instruction add $s4, $t1, $t2 really needs just 6ns.

   reading the instruction memory             2 ns

   reading registers $t1 and $t2              1 ns
 
   computing $t1 + $t2                        2 ns

   storing the result into $s4                1 ns

With these same component delays, a sw instruction would need 7ns, and beq would need just 5ns.
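The arithmetic above can be checked with a short sketch. The component delays come from the text; the dictionary keys and the per-instruction step lists are illustrative names chosen here, not part of the original notes.

```python
# Component delays in ns, taken from the example above.
DELAY_NS = {
    "instr_mem": 2,  # reading the instruction memory
    "reg_read": 1,   # reading the source register(s)
    "alu": 2,        # ALU operation or address computation
    "data_mem": 2,   # reading or writing the data memory
    "reg_write": 1,  # storing the result back into a register
}

# Components each instruction passes through (illustrative names).
STEPS = {
    "lw": ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "sw": ["instr_mem", "reg_read", "alu", "data_mem"],
    "add": ["instr_mem", "reg_read", "alu", "reg_write"],
    "beq": ["instr_mem", "reg_read", "alu"],
}


def latency_ns(instr):
    """Time an instruction really needs: the sum of its component delays."""
    return sum(DELAY_NS[step] for step in STEPS[instr])


latencies = {instr: latency_ns(instr) for instr in STEPS}

# The single-cycle clock must fit the slowest instruction.
cycle_time_ns = max(latencies.values())

# Time each instruction idles inside the fixed cycle.
wasted_ns = {instr: cycle_time_ns - t for instr, t in latencies.items()}

for instr in STEPS:
    print(instr, latencies[instr], "ns needed,", wasted_ns[instr], "ns idle")
```

Under these delays the cycle time is set by lw, and every other instruction idles for the difference between its own latency and that cycle time.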

Multi-cycle Design:

In a multi-cycle design, the execution of each instruction is broken into several clock cycles, so the design must decide which tasks are completed in each cycle. For simplicity, consider a basic approach where each major action takes a single clock cycle. In this scenario, the largest of the action delays (ti, tM, t+, and so on) dictates the clock period. For different instruction types, such as R class, lw, sw, beq, and j, the number of clocks needed varies:

  • R class requires four clocks
  • lw requires five
  • sw requires four
  • beq requires three
  • and j requires only one.

clck multi drawio

The action times differ slightly, so each cycle contains a small dead time, but overall performance still improves because less time is wasted in total.

In this approach, however, lw now takes longer in total than before: it used to take one long clock and now takes five shorter ones, as the figure shows. Some time is wasted here as well, but on the whole this approach still gives better performance.

Here the delays (ti, tM, etc.) were assumed to be almost equal. If they differ widely, the clock period is still set by the largest one, so the shorter actions leave large dead times and the delays are unbalanced.
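A rough sketch can put numbers on this trade-off. It assumes the component delays from the single-cycle example stand in for the action delays, and it uses a hypothetical instruction mix that is not taken from these notes. The clocks per instruction class are the ones listed above.

```python
# Action delays in ns, reused from the single-cycle example (an assumption:
# the notes only name the delays symbolically as ti, tM, t+, ...).
ACTION_DELAYS_NS = [2, 1, 2, 2, 1]

# Clocks per instruction class, from the list above.
CLOCKS = {"R": 4, "lw": 5, "sw": 4, "beq": 3, "j": 1}

# Hypothetical instruction mix in percent, for illustration only.
MIX_PERCENT = {"R": 45, "lw": 25, "sw": 10, "beq": 15, "j": 5}

# Single cycle: one long clock that fits the slowest instruction (lw).
single_period_ns = sum(ACTION_DELAYS_NS)

# Multi-cycle: one action per clock, so the slowest action sets the period.
multi_period_ns = max(ACTION_DELAYS_NS)

# Total time per instruction class in the multi-cycle design.
multi_time_ns = {cls: n * multi_period_ns for cls, n in CLOCKS.items()}

# Average time per instruction, weighted by the assumed mix.
avg_multi_ns = sum(MIX_PERCENT[c] * multi_time_ns[c] for c in CLOCKS) / 100
avg_single_ns = single_period_ns  # every instruction takes the full cycle

print("lw: single", single_period_ns, "ns, multi", multi_time_ns["lw"], "ns")
print("average: single", avg_single_ns, "ns, multi", avg_multi_ns, "ns")
```

With these assumed numbers lw becomes slower while the average time per instruction drops only slightly. The gain is small because every 1 ns action leaves half of a 2 ns period idle, which is the imbalance described above.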

Untitled Diagram-Page-4 drawio12

To balance the delays, the following methods can be used:

Multiple actions in a period:

If two actions each take very little time, both can be performed in one period instead of giving each action its own period.

Multiple periods for an action:

If one action takes much longer than the others, it can be spread over several periods. The clock period is then chosen so that the dead time, that is, the wasted time, is minimized.
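Both methods can be sketched with a simple greedy model. The delays below are hypothetical, chosen only to be unbalanced, and the function names are illustrative. The model also ignores any per-clock overhead, so it compares candidate periods without claiming to pick the best one for real hardware.

```python
def one_action_per_period(delays_ns):
    """Baseline: each action gets one period as long as the slowest action."""
    period = max(delays_ns)
    total = len(delays_ns) * period
    return len(delays_ns), total - sum(delays_ns)


def balanced_schedule(delays_ns, period_ns):
    """Return (clocks, dead_time_ns) for actions executed in order.

    A short action shares the current period if it still fits (multiple
    actions in a period); a long action spans several periods (multiple
    periods for an action).
    """
    clocks = 0
    room = 0  # unused time left in the current period
    for d in delays_ns:
        if d <= room:
            room -= d  # fits in what is left of the current period
        else:
            n = -(-d // period_ns)  # integer ceiling of d / period_ns
            clocks += n
            room = n * period_ns - d
    return clocks, clocks * period_ns - sum(delays_ns)


# Hypothetical unbalanced action delays for one instruction, in ns.
delays = [4, 1, 1, 4, 1]

print("one action per period:", one_action_per_period(delays))
for period in (4, 2):
    print("period", period, "ns:", balanced_schedule(delays, period))
```

For these assumed delays, letting the two short actions share a period and splitting the long actions over two shorter periods both reduce the dead time compared with one action per period.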

Single cycle datapath

The steps to improve resource utilisation are as follows:

FIRST drawio

  • A single ALU does all arithmetic and logic operations

  • To improve utilisation, the two memory blocks, IM (instruction memory) and DM (data memory), are merged into one

  • The single memory will store program as well as data

  • The two adders are removed and connections are rearranged

  • Registers are introduced

register_introduced drawio