Processor Design ‐ 6 - muneeb-mbytes/computerArchitectureCourse GitHub Wiki

Implementation of the controller

The controller can be implemented using either a PLA or a ROM. Implementation via PLA takes place in two parts:

PLA 1. Generation of control signals for datapath

The tables for micro operations are organized by group: PCgrp, MEMgrp, RFgrp and ALUgrp. A total of 10 control states (cs0 to cs9) are required for all the necessary micro operations, as given in the transition diagram below:

(Figure: cs transition diagram)

The table below shows the relationship between control states and control signals:

(Figure: cs table)

Representing 10 control states requires 4 bits, so the input to PLA1 is 4 bits wide.

(Figure: PLA1)

The output signals are grouped into 4, 4, 5 and 6 signals for PCgrp, MEMgrp, RFgrp and ALUgrp respectively, one group per set of micro operations.
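The sizing of PLA1 can be checked with a short sketch. The state count and group widths come from the text above; the names `state_bits`, `GROUP_WIDTHS`, `pla1_inputs` and `pla1_outputs` are made up for this illustration.

```python
NUM_CONTROL_STATES = 10  # cs0 .. cs9, as stated in the text

# Output widths per micro operation group, as stated in the text.
GROUP_WIDTHS = {"PCgrp": 4, "MEMgrp": 4, "RFgrp": 5, "ALUgrp": 6}


def state_bits(num_states: int) -> int:
    """Smallest number of bits that can encode num_states distinct states."""
    return max(1, (num_states - 1).bit_length())


pla1_inputs = state_bits(NUM_CONTROL_STATES)   # width of the state input
pla1_outputs = sum(GROUP_WIDTHS.values())      # total control signals

print(pla1_inputs, pla1_outputs)  # prints: 4 19
```

Three bits would only cover 8 states, which is why 10 states need a 4-bit input.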

PLA 2. Generation of next control state

The table below gives the next state for each present state and instruction group:

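| Present state | R-class | sw | lw | beq | j |
|---|---|---|---|---|---|
| cs0 | cs1 | cs1 | cs1 | cs1 | cs1 |
| cs1 | cs2 | cs4 | cs4 | cs8 | cs9 |
| cs2 | cs3 | x | x | x | x |
| cs3 | cs0 | x | x | x | x |
| cs4 | x | cs5 | cs6 | x | x |
| cs5 | x | cs0 | x | x | x |
| cs6 | x | x | cs7 | x | x |
| cs7 | x | x | cs0 | x | x |
| cs8 | x | x | x | cs0 | x |
| cs9 | x | x | x | x | cs0 |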

Note: x represents a don't care. The table above can be written more compactly as a 1D table instead of a 2D one by listing the input combinations vertically:

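| Present state | Instruction group | Next state |
|---|---|---|
| cs0 | x | cs1 |
| cs1 | R-class | cs2 |
| cs1 | sw/lw | cs4 |
| cs1 | beq | cs8 |
| cs1 | j | cs9 |
| cs2 | x | cs3 |
| cs3 | x | cs0 |
| cs4 | sw | cs5 |
| cs4 | lw | cs6 |
| cs5 | x | cs0 |
| cs6 | x | cs7 |
| cs7 | x | cs0 |
| cs8 | x | cs0 |
| cs9 | x | cs0 |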

From the above table we can infer that for cs0, regardless of the instruction group, the next state must be cs1. After cs1, the next state depends on the instruction group. The first two columns serve as the inputs to PLA2, which generates the required next state.

(Figure: PLA2)

We connect the two PLAs through a 4-bit state register, which holds the present control state, is updated every clock cycle, and drives both PLAs.

(Figure: PLA Controller)
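As a rough illustration of what PLA2 and the state register do together, the next-state table can be stepped through in software. This is only a behavioural sketch, not a hardware description; the function names `next_state` and `trace`, and the use of `None` to stand for a don't care, are choices made for this example.

```python
# Next-state table from the text: (present state, instruction group) -> next state.
# A key whose group is None stands for a don't care (x) on the instruction group.
NEXT_STATE = {
    ("cs0", None): "cs1",
    ("cs1", "R-class"): "cs2",
    ("cs1", "sw"): "cs4",
    ("cs1", "lw"): "cs4",
    ("cs1", "beq"): "cs8",
    ("cs1", "j"): "cs9",
    ("cs2", None): "cs3",
    ("cs3", None): "cs0",
    ("cs4", "sw"): "cs5",
    ("cs4", "lw"): "cs6",
    ("cs5", None): "cs0",
    ("cs6", None): "cs7",
    ("cs7", None): "cs0",
    ("cs8", None): "cs0",
    ("cs9", None): "cs0",
}


def next_state(present, group):
    """Look up the next state; fall back to the don't-care entry if present."""
    if (present, group) in NEXT_STATE:
        return NEXT_STATE[(present, group)]
    return NEXT_STATE[(present, None)]  # raises KeyError for undefined inputs


def trace(group):
    """States visited while executing one instruction, starting from cs0."""
    state, visited = "cs0", ["cs0"]
    while True:
        state = next_state(state, group)
        if state == "cs0":  # back at fetch: the instruction is finished
            return visited
        visited.append(state)


for g in ("R-class", "sw", "lw", "beq", "j"):
    print(g, trace(g))
```

For example, `trace("lw")` visits cs0, cs1, cs4, cs6 and cs7, so a load takes five control states while a branch or jump takes three.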

ROM: A suitable alternative to PLA?

Size of a ROM

A ROM (Read Only Memory) is a general purpose component that takes n inputs as an address and generates m outputs. The number of words a ROM holds is $2^n$, one row for every possible input combination, because a ROM stores all possible minterms.

(Figure: ROM truth table)

Size of a PLA

The number of terms a PLA holds is k (in the form of a compact truth table). This is because a PLA stores only the product terms needed to represent the required logical function, and a single term with don't cares can cover several minterms.

(Figure: PLA Size)

Comparison

Since a PLA holds only the required terms, it never needs more rows than a ROM implementing the same function, and it usually needs far fewer. Take the following example. For a PLA:

| Inputs | Outputs |
|---|---|
| 0X110X | 1010 |

For a ROM:

| Inputs | Outputs |
|---|---|
| 001100 | 1010 |
| 001101 | 1010 |
| 011100 | 1010 |
| 011101 | 1010 |
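The step from one PLA term to the ROM rows it stands for can be sketched as follows. Each X doubles the number of fully specified inputs, so a term with two don't cares covers four ROM addresses. The helper name `expand_term` is made up for this example.

```python
from itertools import product


def expand_term(pattern):
    """List every fully specified input covered by a PLA term with X don't cares."""
    choices = [("0", "1") if ch in "xX" else (ch,) for ch in pattern]
    return ["".join(bits) for bits in product(*choices)]


rom_rows = expand_term("0X110X")
print(rom_rows)  # prints: ['001100', '001101', '011100', '011101']
```

All four addresses must store the same output word 1010 in the ROM, whereas the PLA stores it once.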

Control state transitions:

| Present state | Opcode | Next state |
|---|---|---|
| 0000 | XXXXXX | 0001 |
| 0001 | 000000 | 0010 |
| 0001 | 10X011 | 0100 |
| 0001 | 000100 | 1000 |
| 0001 | 000010 | 1001 |
| 0010 | XXXXXX | 0011 |
| 0011 | XXXXXX | 0000 |
| 0100 | 101011 | 0101 |
| 0100 | 100011 | 0110 |
| 0101 | XXXXXX | 0000 |
| 0110 | XXXXXX | 0111 |
| 0111 | XXXXXX | 0000 |
| 1000 | XXXXXX | 0000 |
| 1001 | XXXXXX | 0000 |

For a ROM, a total of 10 address inputs are required: 4 for the present state and 6 for the opcode. The memory therefore needs to hold $2^{10}$ words, i.e., 1024 words. The ROM must also store the same output at more than one address: for example, in state 0001 the opcodes 100011 and 101011 each need their own word containing next state 0100, whereas the PLA covers both with the single term 10X011. In comparison, a PLA only needs to hold k = 14 terms. Hence, a PLA is far more compact than a ROM for this controller.
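The comparison can be reproduced with a small sketch. The term list below is derived from the symbolic next-state table earlier on this page, under two assumptions made for this example: control state cs*i* is encoded as the 4-bit binary value of *i*, and the opcodes are the standard MIPS ones (R-class 000000, lw 100011, sw 101011, beq 000100, j 000010). The names `TERMS`, `matches` and `pla_next_state` are illustrative.

```python
STATE_BITS, OPCODE_BITS = 4, 6

# (present state, opcode pattern, next state); X marks a don't-care bit.
TERMS = [
    ("0000", "XXXXXX", "0001"),
    ("0001", "000000", "0010"),
    ("0001", "10X011", "0100"),
    ("0001", "000100", "1000"),
    ("0001", "000010", "1001"),
    ("0010", "XXXXXX", "0011"),
    ("0011", "XXXXXX", "0000"),
    ("0100", "101011", "0101"),
    ("0100", "100011", "0110"),
    ("0101", "XXXXXX", "0000"),
    ("0110", "XXXXXX", "0111"),
    ("0111", "XXXXXX", "0000"),
    ("1000", "XXXXXX", "0000"),
    ("1001", "XXXXXX", "0000"),
]


def matches(pattern, bits):
    """True if every non-X bit of the pattern equals the corresponding input bit."""
    return all(p in "xX" or p == b for p, b in zip(pattern, bits))


def pla_next_state(state, opcode):
    """Next state of the first matching term, or None if no term matches.

    Returning None is a software convenience to flag undefined inputs.
    """
    for present, pattern, nxt in TERMS:
        if present == state and matches(pattern, opcode):
            return nxt
    return None


rom_words = 2 ** (STATE_BITS + OPCODE_BITS)  # one word per address
pla_terms = len(TERMS)                       # one row per product term

print(rom_words, pla_terms)  # prints: 1024 14
```

Both `pla_next_state("0001", "100011")` and `pla_next_state("0001", "101011")` return `"0100"` from the same term, which is the saving the text describes.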

Microprogrammed controller

In this style of design, the controller is treated as a small computer with a memory block (the micro program memory) that generates 19 bits of datapath control signals and 2 bits that control the sequencing of the micro program being executed. The micro program counter (micro PC) is a 4-bit register that steps through the words of the micro program memory so that the right signals are generated at the right time. The micro sequencer ensures that the right address is loaded into the micro PC.

Referring to the first diagram on this page, branching takes place in 2 places: from cs1 to cs2, cs4, cs8 or cs9, and from cs4 to cs5 or cs6. In micro programming, a multi-way branch is called a dispatch, so our design has 2 dispatches. The value loaded into the micro PC is one of four choices: micro PC + 1, reset, dispatch 1 or dispatch 2. The dispatches generate the necessary address when branching takes place.
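The selection made by the micro sequencer can be sketched as a four-way choice, which is why a 2-bit sequencing field is enough. The sketch below assumes that the micro instruction for control state cs*i* is stored at address *i*; that layout, the dispatch table contents derived from it, and the name `next_upc` are assumptions for illustration only.

```python
# Dispatch tables: instruction group -> micro PC address.
# Addresses assume the micro instruction for cs<i> sits at address i (assumption).
DISPATCH1 = {"R-class": 2, "sw": 4, "lw": 4, "beq": 8, "j": 9}
DISPATCH2 = {"sw": 5, "lw": 6}

UPC_BITS = 4  # the micro PC is a 4-bit register


def next_upc(upc, seq_ctrl, group):
    """Value loaded into the micro PC for one of the four sequencing choices."""
    if seq_ctrl == "seq":
        return (upc + 1) % (1 << UPC_BITS)  # step to the next word
    if seq_ctrl == "reset":
        return 0                            # back to the fetch micro instruction
    if seq_ctrl == "dispatch1":
        return DISPATCH1[group]
    if seq_ctrl == "dispatch2":
        return DISPATCH2[group]
    raise ValueError(f"unknown sequencing control: {seq_ctrl}")


# Walk a load instruction through the sequencer.
upc = 0
for ctrl in ("seq", "dispatch1", "dispatch2", "seq", "reset"):
    upc = next_upc(upc, ctrl, "lw")
    print(ctrl, "->", upc)
```

For `lw` this walks the micro PC through 1, 4, 6, 7 and back to 0.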

Microprogram

Take the example of the following micro program:

first: fetch, PCinc, seq, rs2A, rt2B, Paddr, dispatch1

1a: arith, seq, res2rd, reset

1b: Maddr, dispatch2

2a: m_wr, reset

2b: m_rd, seq, mem2rt, reset

1c: branch, reset

1d: jump, reset

The first micro instruction is common to all instructions. In a label such as 1a, the 1 is the dispatch number and the a identifies the branch. Micro programs are written in symbolic form, and a micro assembler translates this into the contents that go into the control store of the micro program memory. There are 2 styles in which micro programs can be written:
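A toy micro assembler gives a feel for the translation from symbolic form to control store contents. This sketch makes one assumption that the text does not state explicitly: each sequencing keyword (`seq`, `reset`, `dispatch1`, `dispatch2`) closes one micro instruction, so a labelled line may hold more than one control word. The function `assemble` and its output format are hypothetical.

```python
SEQ_CONTROLS = {"seq", "reset", "dispatch1", "dispatch2"}

MICROPROGRAM = """
first: fetch, PCinc, seq, rs2A, rt2B, Paddr, dispatch1
1a: arith, seq, res2rd, reset
1b: Maddr, dispatch2
2a: m_wr, reset
2b: m_rd, seq, mem2rt, reset
1c: branch, reset
1d: jump, reset
"""


def assemble(source):
    """Return (control_store, labels) for a symbolic micro program.

    control_store is a list of (micro operations, sequencing control) pairs;
    labels maps each label to the address of its first micro instruction.
    """
    control_store, labels = [], {}
    for line in source.strip().splitlines():
        label, body = line.split(":", 1)
        labels[label.strip()] = len(control_store)
        ops = []
        for token in (t.strip() for t in body.split(",")):
            if token in SEQ_CONTROLS:
                # Assumption: a sequencing keyword ends the micro instruction.
                control_store.append((tuple(ops), token))
                ops = []
            else:
                ops.append(token)
    return control_store, labels


control_store, labels = assemble(MICROPROGRAM)
for address, (ops, ctrl) in enumerate(control_store):
    print(address, ops, ctrl)
```

Under that assumption the program assembles to 10 control words, one per control state, and the labels 1a to 1d and 2a, 2b give the addresses that the two dispatch tables would hold.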

Horizontal microprogramming

The micro program written above follows this style. In this style, several micro operations can be performed concurrently within the same micro instruction. It generally performs better, but the control signals are minimally encoded, so each micro instruction is wide.

Vertical microprogramming

This style of programming supports less concurrency among micro operations and has lower memory requirements (maximal encoding), but at the cost of performance.
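The width trade-off between the two styles can be illustrated with a rough calculation. The group widths are the ones given for PLA1; the rest is an assumption made purely for illustration: if at most one signal in each group were active at a time, each group could be replaced by an encoded field with one extra code for "no signal active". The real signals may not be mutually exclusive, in which case this encoding would not apply.

```python
# Control signals per group, as given for PLA1 on this page.
GROUP_WIDTHS = {"PCgrp": 4, "MEMgrp": 4, "RFgrp": 5, "ALUgrp": 6}


def encoded_bits(num_signals):
    """Bits needed to name one of num_signals signals, plus a 'none' code.

    Equals ceil(log2(num_signals + 1)) for num_signals >= 1.
    """
    return num_signals.bit_length()


# Horizontal: one bit per control signal, no decoding needed.
horizontal_bits = sum(GROUP_WIDTHS.values())

# Vertical (hypothetical): one encoded field per group, decoded after the control store.
vertical_bits = sum(encoded_bits(n) for n in GROUP_WIDTHS.values())

print(horizontal_bits, vertical_bits)  # prints: 19 12
```

The narrower word saves control store bits, but only one signal per field can be asserted in a micro instruction and a decoder sits in the path, which is where the loss of concurrency and performance comes from.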

Microcode: Trade-offs

Microcode is generally easier to design and write. It is also easier to change, since the values are held in memory, and it can emulate other architectures and make use of internal registers. The disadvantage is that the implementation requires an off-chip ROM, which makes it slower.