Pipelining - muneeb-mbytes/computerArchitectureCourse GitHub Wiki
In a Single cycle design, if we start with an address in the PC (Program Counter), the information flows through all of the stages within one clock cycle:
- Stage 1: Fetch
- Stage 2: Decode
- Stage 3: Execute
The instruction ends at the end of the cycle
If Tf = 5 ns, Td = 5 ns and Te = 5 ns:
Total time T = Tf + Td + Te = 15 ns
This total is the clock period of a single cycle design.
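The arithmetic above can be checked with a minimal Python sketch. The names `SINGLE_CYCLE_STAGES_NS` and `single_cycle_time_ns` are made up for this illustration, and the workload size passed in is hypothetical.

```python
# Stage latencies in nanoseconds, taken from the example above.
SINGLE_CYCLE_STAGES_NS = {"fetch": 5, "decode": 5, "execute": 5}

# In a single cycle design, one clock period must cover every stage in sequence.
single_cycle_period_ns = sum(SINGLE_CYCLE_STAGES_NS.values())


def single_cycle_time_ns(n_instructions):
    """Time to run n instructions one after another (hypothetical workload)."""
    return n_instructions * single_cycle_period_ns


print(single_cycle_period_ns)    # 15
print(single_cycle_time_ns(4))   # 60
```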
TO ENHANCE YOUR KNOWLEDGE:
Why is Instruction Memory external to the Processor?
Processor is "HARDWARE"
Let's say you want to use FACEBOOK.
- You'll download FACEBOOK, which comes with a .exe file that is in binary (0s and 1s).
- Later, when you install and use FACEBOOK, these 0s and 1s are loaded into instruction memory.
Some time later, you get bored of FACEBOOK, so you delete it and download INSTAGRAM.
- When you install and use INSTAGRAM, the instruction memory is loaded with the 0s and 1s of INSTAGRAM.
Now, if the Instruction Memory were inside the processor, you would need to change the hardware every time you installed a new application.
But because the instruction memory is outside the processor, you can run any application with the same hardware.
The processor goes out and FETCHes the instruction from Instruction Memory, then DECODEs and EXECUTEs it.
Because it has to go outside the processor, FETCHing takes more time than the other stages.
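To increase the processor speed, we use Amdahl's Law: the overall gain is limited by the part that is not improved, so speed up the part that takes the most time.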
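As a rough sketch of Amdahl's Law: if a fraction f of the run time is made s times faster, the overall speedup is 1 / ((1 - f) + f / s). The function name `amdahl_speedup` and the numbers used below are assumptions chosen only for illustration; they are not measurements of any processor.

```python
def amdahl_speedup(improved_fraction, improvement_factor):
    """Overall speedup when `improved_fraction` of the time runs `improvement_factor` times faster."""
    if not 0.0 <= improved_fraction <= 1.0:
        raise ValueError("improved_fraction must be between 0 and 1")
    if improvement_factor <= 0:
        raise ValueError("improvement_factor must be positive")
    unchanged = 1.0 - improved_fraction
    return 1.0 / (unchanged + improved_fraction / improvement_factor)


# Hypothetical case: half of the time is spent in one part, and that part is made 2x faster.
print(amdahl_speedup(0.5, 2))

# Even with an enormous improvement to that half, the overall speedup stays below 2x.
print(amdahl_speedup(0.5, 1e9))
```

The second call shows why it pays to improve the part that takes the most time: the untouched part caps the overall gain.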
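Multicycle (pipelined) Design:
- All stages operate concurrently, each on a different instruction
If Tf = 5 ns, Td = 3 ns and Te = 4 ns:
Clock period T = MAX(Tf, Td, Te) = 5 ns, because the slowest stage sets the clock. Once all stages are busy, one instruction finishes every 5 ns.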
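The timing example above can be sketched in Python for a design where all stages operate concurrently. This assumes an ideal pipeline: no stalls and no extra delay from the buffers between stages. The names `STAGE_LATENCY_NS`, `clock_period_ns` and `total_time_ns` are hypothetical.

```python
# Stage latencies in nanoseconds, from the example above.
STAGE_LATENCY_NS = {"fetch": 5, "decode": 3, "execute": 4}


def clock_period_ns(latencies):
    """The slowest stage sets the clock when all stages run concurrently."""
    return max(latencies.values())


def total_time_ns(n_instructions, latencies):
    """Ideal time for n instructions: fill the stages once, then one result per clock."""
    if n_instructions == 0:
        return 0
    n_stages = len(latencies)
    return (n_stages + n_instructions - 1) * clock_period_ns(latencies)


print(clock_period_ns(STAGE_LATENCY_NS))    # 5
print(total_time_ns(1, STAGE_LATENCY_NS))   # 15
print(total_time_ns(4, STAGE_LATENCY_NS))   # 30
```

Note that a single instruction still needs 3 clocks of 5 ns each to pass through all stages. The benefit appears when many instructions overlap.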
For example, in Multi Cycle design
To manufacture a CAR, there are 3 units
Unit 1: Door
Unit 2: Seat
Unit 3: Wheel
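Let's say each unit takes 1 hour.
The first car is finished after 3 hours, and after that one car is finished every hour, so in 6 hours, 4 cars will be manufactured.
For the same example, if we consider the Single cycle design, a new car is started only after the previous one is completely finished.
So in 6 hours, only 2 cars will be manufactured.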
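The car example can be replayed hour by hour with a small Python sketch. It assumes each unit takes exactly 1 hour and that a new car is always ready to enter the first unit. The function names are made up for this illustration.

```python
UNITS = ("door", "seat", "wheel")


def cars_with_overlap(total_hours, units=UNITS):
    """All units work at the same time, each on a different car."""
    line = [None] * len(units)   # which car sits in each unit (None = empty)
    finished = []
    next_car = 1
    for _ in range(total_hours):
        # Every car moves one unit forward and a new car enters the first unit.
        line = [next_car] + line[:-1]
        next_car += 1
        # After this hour of work, the car in the last unit is complete.
        if line[-1] is not None:
            finished.append(line[-1])
    return finished


def cars_without_overlap(total_hours, units=UNITS):
    """A new car starts only after the previous car has passed through every unit."""
    return total_hours // len(units)


print(len(cars_with_overlap(6)))   # 4
print(cars_without_overlap(6))     # 2
```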
In the Multi cycle design, the output of each stage has to be stored somewhere before it is taken into the next stage.
In the above diagram, we can see that some buffers are used.
This means the contents of each stage are stored before being given to the next stage.
These buffers are registers. The one that holds the fetched instruction is called the IR (Instruction Register).
How does this IR help?
Let's say our PC (Program Counter) is incrementing.
- While the first instruction is being decoded, the PC already points to the next instruction (PC = PC + 4), which is being fetched.
- So to avoid confusion, we store the instruction that was fetched first in the IR.
- Now, to decode the first instruction, we give the decoder the data in the IR.
- So the PC can keep on incrementing without any problem.
- To build an IR, we just need storage elements (flip-flops).
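The role of the IR described above can be sketched as a tiny cycle-by-cycle simulation in Python. The program contents, the `INSTRUCTION_MEMORY` dictionary and the `run` function are all hypothetical. The sketch models only the fetch and decode stages, and assumes both registers update together at the end of each cycle, as flip-flops on the same clock would.

```python
# Hypothetical program: address -> instruction name.
INSTRUCTION_MEMORY = {0: "ADD", 4: "SUB", 8: "AND", 12: "OR"}


def run(cycles):
    """Return (cycle, pc, instruction being fetched, instruction being decoded) per cycle."""
    pc = 0
    ir = None   # the IR is empty until the first fetch has completed
    trace = []
    for cycle in range(cycles):
        decoding = ir                             # decode reads the value held in the IR
        fetching = INSTRUCTION_MEMORY.get(pc)     # fetch reads memory at the current PC
        trace.append((cycle, pc, fetching, decoding))
        # Clock edge: the IR captures the fetched instruction and the PC moves on.
        ir = fetching
        pc = pc + 4
    return trace


for row in run(3):
    print(row)
# (0, 0, 'ADD', None)
# (1, 4, 'SUB', 'ADD')
# (2, 8, 'AND', 'SUB')
```

In cycle 1 the PC has already moved to address 4, yet the decoder still sees ADD because the IR kept a copy of it.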
Nearly all the processors that you see today are Pipelined Processors.