Processor Design ‐ 6 - muneeb-mbytes/computerArchitectureCourse GitHub Wiki

Implementation of the controller

The controller can be implemented using either a PLA or a ROM. Implementation via PLA takes place in two parts:

PLA 1. Generation of control signals for datapath

The micro-operation tables are organized by group: PCgrp, MEMgrp, RFgrp and ALUgrp. Ten control states (cs0 to cs9) cover all the necessary micro-operations, as shown in the transition diagram below:

[Figure: control state transition diagram]

The table below shows the relationship between control states and control signals:

[Figure: control state table]

Representing 10 control states requires 4 bits, so the input to PLA1 is 4 bits wide.

[Figure: PLA1]

The output signals are grouped by micro-operation group: 4 bits for PCgrp, 4 for MEMgrp, 5 for RFgrp and 6 for ALUgrp, for a total of 19 output bits.
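The bit widths quoted above follow from simple arithmetic. The Python sketch below is only an illustration; the names `state_bits` and `GROUP_WIDTHS` are made up for this example and do not come from the course material.

```python
# Number of control states in the transition diagram (cs0..cs9).
NUM_STATES = 10

# Output bits per micro-operation group, as listed in the text.
GROUP_WIDTHS = {"PCgrp": 4, "MEMgrp": 4, "RFgrp": 5, "ALUgrp": 6}

def state_bits(num_states: int) -> int:
    """Smallest number of bits that can encode num_states distinct states."""
    return max(1, (num_states - 1).bit_length())

pla1_inputs = state_bits(NUM_STATES)        # width of the state input to PLA1
pla1_outputs = sum(GROUP_WIDTHS.values())   # total control-signal outputs

print(pla1_inputs, pla1_outputs)  # 4 19
```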

PLA 2. Generation of next control state

The table below gives the next state for each present state and instruction group:

Present state	R-class	sw	lw	beq	j
cs0	cs1	cs1	cs1	cs1	cs1
cs1	cs2	cs4	cs4	cs8	cs9
cs2	cs3	x	x	x	x
cs3	cs0	x	x	x	x
cs4	x	cs5	cs6	x	x
cs5	x	cs0	x	x	x
cs6	x	x	cs7	x	x
cs7	x	x	cs0	x	x
cs8	x	x	x	cs0	x
cs9	x	x	x	x	cs0

Note: x represents a don't care. The table can be written more compactly as a one-dimensional list that enumerates the input combinations vertically:

Present state	Instruction group	Next State
cs0	x	cs1
cs1	R-class	cs2
cs1	sw/lw	cs4
cs1	beq	cs8
cs1	j	cs9
cs2	x	cs3
cs3	x	cs0
cs4	sw	cs5
cs4	lw	cs6
cs5	x	cs0
cs6	x	cs7
cs7	x	cs0
cs8	x	cs0
cs9	x	cs0

From the above table we can infer that for cs0, regardless of the instruction group, the next state must be cs1. After cs1, the next state depends on the instruction group. The first two columns serve as inputs to PLA2, which generates the required next state.

[Figure: PLA2]

The two PLAs are connected through a 4-bit state register that holds the control state and is updated every clock cycle. The state register contains the present control state and drives both PLAs.

[Figure: PLA controller]
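The behaviour of PLA2 together with the state register can be sketched in software. The Python model below is a rough illustration, not the hardware itself: the dictionary mirrors the compact next-state table, and the helper names `next_state` and `run_instruction` are hypothetical.

```python
# Compact next-state table: (present state, instruction group) -> next state.
# "x" marks rows where the instruction group is a don't care.
NEXT_STATE = {
    ("cs0", "x"): "cs1",
    ("cs1", "R-class"): "cs2",
    ("cs1", "sw"): "cs4",
    ("cs1", "lw"): "cs4",
    ("cs1", "beq"): "cs8",
    ("cs1", "j"): "cs9",
    ("cs2", "x"): "cs3",
    ("cs3", "x"): "cs0",
    ("cs4", "sw"): "cs5",
    ("cs4", "lw"): "cs6",
    ("cs5", "x"): "cs0",
    ("cs6", "x"): "cs7",
    ("cs7", "x"): "cs0",
    ("cs8", "x"): "cs0",
    ("cs9", "x"): "cs0",
}

def next_state(state: str, group: str) -> str:
    """Model of PLA2: try an exact row first, then the don't-care row."""
    if (state, group) in NEXT_STATE:
        return NEXT_STATE[(state, group)]
    return NEXT_STATE[(state, "x")]

def run_instruction(group: str) -> list[str]:
    """Step the state register from cs0 until control returns to cs0."""
    state = "cs0"
    trace = [state]
    while True:
        state = next_state(state, group)
        if state == "cs0":
            return trace
        trace.append(state)

print(run_instruction("lw"))  # ['cs0', 'cs1', 'cs4', 'cs6', 'cs7']
```

Each call returns the sequence of control states one instruction passes through, so the length of the list is the number of clock cycles that instruction takes in this model.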

ROM: A suitable alternative to PLA?

Size of a ROM

A ROM (Read Only Memory) is a general-purpose component that takes n inputs as an address and produces m outputs. A ROM holds $2^n$ words, one per address, because it stores an entry for every possible minterm of its inputs.

[Figure: ROM truth table]
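As a quick check of the $2^n$ relationship, the short Python sketch below computes the word count and total bit count of a ROM. The function name `rom_size` and the choice of n = 6, m = 4 are assumptions made only for this example.

```python
def rom_size(n_inputs: int, m_outputs: int) -> tuple[int, int]:
    """Return (number of words, total bits) for a ROM with n inputs and m outputs."""
    words = 2 ** n_inputs            # one word per input combination
    return words, words * m_outputs  # each word is m bits wide

words, bits = rom_size(6, 4)
print(words, bits)  # 64 256
```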

Size of a PLA

A PLA holds k terms, where k is the number of rows in the compact truth table. This is because a PLA implements only the product terms that the required logic function needs, rather than every possible minterm.

[Figure: PLA size]

Comparison

Because a PLA holds only the product terms it needs, it typically takes far less space than a ROM that implements the same function. Take the following example. For a PLA:

inputs	outputs
0X110X	1010

For a ROM, the same term occupies four separate words:

inputs	outputs
001100	1010
001101	1010
011100	1010
011101	1010
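The expansion from one PLA term to several ROM words can be reproduced mechanically. The following Python sketch, using a hypothetical helper called `expand`, lists every fully specified input covered by a pattern that contains don't cares.

```python
from itertools import product

def expand(pattern: str) -> list[str]:
    """Return every fully specified input covered by a pattern with X don't cares."""
    choices = [("0", "1") if ch in "xX" else (ch,) for ch in pattern]
    return ["".join(bits) for bits in product(*choices)]

rows = expand("0X110X")
print(rows)  # ['001100', '001101', '011100', '011101']
```

A pattern with d don't cares expands into $2^d$ ROM words, which is why a single PLA row here corresponds to four ROM rows.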

Control state transitions:

Present State	Opcode	Next State
0000	XXXXXX	0001
0001	000000	0010
0001	10X011	0100
0001	000100	1000
0001	000010	1001
0010	XXXXXX	0011
0011	XXXXXX	0000
0100	101011	0101
0100	100011	0110
0101	XXXXXX	0000
0110	XXXXXX	0111
0111	XXXXXX	0000
1000	XXXXXX	0000
1001	XXXXXX	0000

For a ROM, a total of 10 inputs are required: 4 for the present state and 6 for the opcode. The memory must hold $2^{10}$ words, i.e., 1024 words. The ROM must also store the same output in more than one word: for example, in present state 0001 the opcodes 100011 and 101011 must both produce next state 0100. In comparison, a PLA needs to hold only k = 14 terms. Hence, a PLA is far more compact than a ROM for this controller.
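The 14-versus-1024 comparison can be checked by expanding the PLA terms into a full ROM image. The Python sketch below is illustrative only: the term list follows the compact next-state table with the opcodes quoted in the text, and filling unused addresses with 0000 is an assumption made here, not something the source specifies.

```python
from itertools import product

# PLA2 product terms: (present state, opcode pattern, next state); X = don't care.
TERMS = [
    ("0000", "XXXXXX", "0001"),
    ("0001", "000000", "0010"),  # R-class
    ("0001", "10X011", "0100"),  # lw (100011) and sw (101011)
    ("0001", "000100", "1000"),  # beq
    ("0001", "000010", "1001"),  # j
    ("0010", "XXXXXX", "0011"),
    ("0011", "XXXXXX", "0000"),
    ("0100", "101011", "0101"),  # sw
    ("0100", "100011", "0110"),  # lw
    ("0101", "XXXXXX", "0000"),
    ("0110", "XXXXXX", "0111"),
    ("0111", "XXXXXX", "0000"),
    ("1000", "XXXXXX", "0000"),
    ("1001", "XXXXXX", "0000"),
]

def expand(pattern: str) -> list[str]:
    """List every fully specified address covered by a pattern with X don't cares."""
    choices = [("0", "1") if ch == "X" else (ch,) for ch in pattern]
    return ["".join(bits) for bits in product(*choices)]

def build_rom(terms, n_inputs: int = 10) -> dict[str, str]:
    """Fill a 2^n-word ROM image; unlisted addresses default to 0000 (assumed)."""
    rom = {format(a, f"0{n_inputs}b"): "0000" for a in range(2 ** n_inputs)}
    for state, opcode, nxt in terms:
        for address in expand(state + opcode):
            rom[address] = nxt
    return rom

rom = build_rom(TERMS)
print(len(TERMS), len(rom))  # 14 1024
```

In this model, the single PLA term for `10X011` becomes two separate ROM words that store the same next state.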

Microprogrammed controller

In this style of design, the controller is treated as a small computer with a memory block (the micro program memory) that generates 19 bits of datapath control signals and 2 bits that control the sequencing of the micro program executed by the controller. The micro program counter (micro PC) is a 4-bit register that steps through the words of the micro program memory so that the right signals are generated at the right time. The micro sequencer ensures that the right address is loaded into the micro PC.

In reference to the first diagram on this page, branching takes place in 2 places: from cs1 to cs2, cs4, cs8 or cs9, and from cs4 to cs5 or cs6. In micro programming, a multi-way branch is called a dispatch, so this design has 2 dispatches. The next value of the micro PC is one of: micro PC + 1, reset, dispatch 1 or dispatch 2. The dispatches generate the target address when branching takes place.
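The four-way choice made by the micro sequencer can be sketched as a small selection function. In the Python model below, the 2-bit encodings of the sequencing field and the dispatch target addresses are assumptions for illustration: it supposes that micro-address k holds the micro-instruction for control state csk.

```python
# Hypothetical 2-bit encodings of the sequencing field; the text only says
# that two bits choose among these four options.
SEQ, RESET, DISPATCH1, DISPATCH2 = range(4)

# Dispatch tables, assuming micro-address k corresponds to control state csk.
DISPATCH1_TABLE = {"R-class": 2, "sw": 4, "lw": 4, "beq": 8, "j": 9}
DISPATCH2_TABLE = {"sw": 5, "lw": 6}

def next_micro_pc(micro_pc: int, seq_ctrl: int, group: str) -> int:
    """Model of the micro sequencer: pick the next micro PC value."""
    if seq_ctrl == SEQ:
        return micro_pc + 1            # continue with the next word
    if seq_ctrl == RESET:
        return 0                       # return to the fetch micro-instruction
    if seq_ctrl == DISPATCH1:
        return DISPATCH1_TABLE[group]  # multi-way branch after decode
    if seq_ctrl == DISPATCH2:
        return DISPATCH2_TABLE[group]  # sw/lw split after address computation
    raise ValueError(f"unknown sequencing control: {seq_ctrl}")

print(next_micro_pc(1, DISPATCH1, "beq"))  # 8
```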

Microprogram

Take the example of the following micro program:

first:	fetch, PCinc, seq
	rs2A, rt2B, Paddr, dispatch1
1a:	arith, seq
	res2rd, reset
1b:	Maddr, dispatch2
2a:	m_wr, reset
2b:	m_rd, seq
	mem2rt, reset
1c:	branch, reset
1d:	jump, reset

The micro-instructions under the label first are executed for every instruction. In a label such as 1a, the digit is the dispatch number and the letter identifies the branch taken by that dispatch. Micro programs are written in symbolic form; a micro assembler translates them into the contents of the control store of the micro program memory. There are 2 styles in which micro programs can be written:
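To make the role of a micro assembler concrete, here is a minimal Python sketch that assigns control-store addresses to the symbolic micro program and records where each label lands. It is not the course's tool: the function `assemble` is hypothetical, and the listing is assumed to be split so that every micro-instruction ends with exactly one sequencing field (seq, reset, dispatch1 or dispatch2).

```python
SEQUENCING = {"seq", "reset", "dispatch1", "dispatch2"}

# Symbolic micro program, assumed to be one micro-instruction per line.
MICROPROGRAM = """
first: fetch, PCinc, seq
       rs2A, rt2B, Paddr, dispatch1
1a:    arith, seq
       res2rd, reset
1b:    Maddr, dispatch2
2a:    m_wr, reset
2b:    m_rd, seq
       mem2rt, reset
1c:    branch, reset
1d:    jump, reset
"""

def assemble(text: str):
    """Return (control store, label table) for a symbolic micro program."""
    store, labels = [], {}
    for line in text.strip().splitlines():
        line = line.strip()
        if ":" in line:
            label, line = line.split(":", 1)
            labels[label.strip()] = len(store)  # label marks the next address
        fields = [field.strip() for field in line.split(",")]
        *signals, sequencing = fields           # last field steers the micro PC
        if sequencing not in SEQUENCING:
            raise ValueError(f"missing sequencing field in: {line!r}")
        store.append((signals, sequencing))
    return store, labels

store, labels = assemble(MICROPROGRAM)
print(len(store), labels["1b"], labels["2b"])  # 10 4 6
```

Under these assumptions the ten micro-instructions line up with the ten control states, and the label addresses are the values a dispatch would load into the micro PC.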

Horizontal microprogramming

The micro program shown above follows this style. Several micro-operations can be performed concurrently within the same micro-instruction. This style generally performs better, but because the control signals are minimally encoded, each micro-instruction is wider and the control store is larger.

Vertical microprogramming

This style supports less concurrency among micro-operations. Its control fields are maximally encoded, which lowers the memory requirement, but at the cost of performance.
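The memory trade-off between the two styles can be illustrated with a rough width calculation. The Python sketch below reuses the control-signal group sizes of this design and assumes, purely for illustration, that a vertical encoding lets each group activate at most one of its signals per micro-instruction; the function names are hypothetical.

```python
# Control-signal counts per group, taken from the PLA1 output grouping.
GROUP_WIDTHS = {"PCgrp": 4, "MEMgrp": 4, "RFgrp": 5, "ALUgrp": 6}

def horizontal_bits(groups: dict[str, int]) -> int:
    """Minimal encoding: one bit per control signal."""
    return sum(groups.values())

def vertical_bits(groups: dict[str, int]) -> int:
    """Maximal encoding (assumed): each group becomes one binary-coded field
    that selects either 'no signal' or exactly one of its signals."""
    # width + 1 options need width.bit_length() bits.
    return sum(width.bit_length() for width in groups.values())

print(horizontal_bits(GROUP_WIDTHS), vertical_bits(GROUP_WIDTHS))  # 19 12
```

Under this assumption the encoded word is narrower, but each field would need a decoder and could no longer assert two signals from the same group at once, which is where the loss of concurrency and performance comes from.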

Microcode: Trade-offs

Microcode is generally easier to design and write. Because the control values are held in memory, they are easier to change, and the controller can emulate other architectures and make use of internal registers. The disadvantage is that the implementation requires an off-chip ROM, which makes it slower.
