Magic‐1 microcode - retrotruestory/M1DEV GitHub Wiki
Microcode
In other examples of homebuilt CPUs that I've seen, remarkably little microcode was necessary. This is largely the result of having fairly wide and regular instruction encodings. In this way, bit patterns within the instruction word itself can be used to generate the control signals (RISC-like). On the other extreme, you could use the instruction word simply as an index to the starting address of the corresponding microcode program used to carry it out and not use any portion of the instruction bits to assist control signal generation. What I've done is something close to the latter. Wherever convenient, I'm using instruction bits to assist signal generation, but primarily I'm using the instruction byte as a direct index into the microcode.
My original plan had me using the 8-bit opcode as an index into a PROM of starting addresses in the main microcode store. However, that added a lookup latency in the control signal generation path, so I decided to just burn more PROM bits and have the opcode be a direct starting address.
So, for microcode I'm using five 512x8-bit PROMs. The low half of the PROM will be devoted to the first microinstruction of each instruction. Each microinstruction contains a "next" field, which will route microexecution into an appropriate spot in the other half of the microcode store. Within the sequencer there is the ability to conditionally branch. I recently also allowed a 1-deep subroutine call/return mechanism, but decided to drop it when it became clear that I had lots of unused space in the microcode PROMs. Not especially elegant, but it certainly simplified the sequencer. Oh, there's also a special microinstruction for fetching, and lots of nasty logic dealing with faults and interrupts. Interrupts will be recognized at instruction boundaries. Faults will immediately suppress clocking of results and will transfer control to a fault microcode sequence at the beginning of the next T cycle.
As far as the parts used, I'm going with 74s472's, which are expensive and hard to find. For this reason, I put together an EPROM daughter card to try things out before I burn the real PROMs. The daughter card uses fast 60ns 27C256 EPROMs and also provides a hex display to show the address of the next microinstruction.
So far, the part of the M1 design that I'm most embarrassed about is the large amount of microcode I'm using. I can see how I could significantly reduce it (in particular, by factoring in T-state pulses to give particular bits different meaning during different T-states). However, given that I'm going to be hand-wiring every connection, I think it best to trade off microcode bits for reduced complexity.
Let me try to explain the microcode a little better. The first thing you want to look at is the mcode_rec_t structure that is defined on the microcode web page. This structure represents a single microcode instruction, and is exactly 40 bits (or 5 bytes) wide. This corresponds to the five 8-bit microcode EPROMs. The first 8 bits of mcode_rec_t are the "next" field, which is stored in U5 on the microcode schematic sheet (microcode EPROM 0) and is programmed using the file workspace/M1Dev/PromData/Actual/prom23_0.hex. The next 8 bits are the latch bits through emdrlo, which correspond to U6 (EPROM 1, prom23_1.hex). ... and so on.
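To make the byte-per-PROM layout concrete, here is a minimal C sketch that slices each 40-bit microinstruction into the five bytes that would land in the five PROM images. It assumes the word is held as five raw bytes in field order, so byte 0 is the "next" field (EPROM 0). The names mcode_word_t and split_into_proms are hypothetical; the real mcode_rec_t bitfield layout is the one defined on the microcode page, and it is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MCODE_WORDS 512   /* two banks of 256 microinstructions */
#define MCODE_PROMS 5     /* five byte-wide PROMs = 40 bits per word */

/* Hypothetical stand-in for mcode_rec_t: five raw bytes in field order.
 * bytes[0] is the "next" field; the meaning of bytes[1..4] is defined
 * by the real structure on the microcode page. */
typedef struct {
    uint8_t bytes[MCODE_PROMS];
} mcode_word_t;

/* Fill prom[p][addr] with byte p of microinstruction addr, so that each
 * PROM image holds one 8-bit slice of every 40-bit word. */
void split_into_proms(const mcode_word_t *store, size_t count,
                      uint8_t prom[MCODE_PROMS][MCODE_WORDS])
{
    assert(count <= MCODE_WORDS);
    for (size_t addr = 0; addr < count; addr++) {
        for (size_t p = 0; p < MCODE_PROMS; p++) {
            prom[p][addr] = store[addr].bytes[p];
        }
    }
}
```

Under this assumption, prom[0] would correspond to the contents of prom23_0.hex, prom[1] to prom23_1.hex, and so on.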
There are 512 microcode instruction words, broken down into a 1st bank of 256 and a 2nd bank of 256. Each Magic-1 instruction can be thought of as a one-byte opcode optionally followed by additional bytes. The one-byte opcode is decoded by using it as a direct index into the 1st bank of 256 microinstructions. The first machine cycle of a new instruction always happens in that 1st bank of 256 words, while all subsequent machine cycles happen in the 2nd bank of 256 microcode instructions.
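The two-bank addressing can be sketched in C as follows. This is only an illustration, assuming the 9-bit microcode address is the opcode itself for the first machine cycle and 0x100 plus the 8-bit "next" field for every later cycle. The function names are hypothetical and are not taken from the Magic-1 sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BANK_SIZE 256u

/* First machine cycle: the opcode is the address, so it lands in the 1st bank. */
uint16_t first_uaddr(uint8_t opcode)
{
    return opcode;
}

/* Later machine cycles: the "next" field selects a word in the 2nd bank. */
uint16_t later_uaddr(uint8_t next_field)
{
    return (uint16_t)(BANK_SIZE + next_field);
}

/* Record the microcode addresses visited by one instruction: the opcode
 * in the 1st bank, then each "next" field in the 2nd bank.
 * Returns the number of addresses written to out. */
size_t trace_uaddrs(uint8_t opcode, const uint8_t *next_fields, size_t n,
                    uint16_t *out, size_t out_cap)
{
    assert(out_cap >= n + 1);
    out[0] = first_uaddr(opcode);
    for (size_t i = 0; i < n; i++) {
        out[i + 1] = later_uaddr(next_fields[i]);
    }
    return n + 1;
}
```

For example, opcode 0x09 starts at microaddress 0x009, and a "next" field of 0x39 continues at 0x139.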
All microcode instructions feed into and drive the control logic on the control card.
Now, the order of the instructions is sometimes important. By carefully ordering the instructions, we can reduce the amount of microcode needed. For example, consider the following Magic-1 instructions:
0x09: pop MSW
0x0a: pop C
0x0b: pop PC
0x0c: pop DP
0x0d: pop SP
0x0e: pop A
0x0f: pop B
Look at the v2.3 Microcode web page and note that all of the above instructions use the same microcode:
TO_Z(R_SP),DATA,NEXT(Pop16) ; on 1st page
then going to instruction 0x139 in 2nd half:
0x139: LDHI,NEXT(FALLTHROUGH)
0x13a: READLO,INC_TO_Z(R_MAR),L(R_SP,LWORD),DATA,NEXT(FALLTHROUGH)
0x13b: TO_Z(R_MDR),L(R_IR_REG,LWORD),NEXT(PCtoMAR)
0x1c2: TO_Z(R_PC),CODE,NEXT(Fetch)
The important part here is the L(R_IR_REG,LWORD) in microinstruction 0x13b. The L(reg,size) macro defines which register latch signal will be asserted on the control card (see U7:LATCH[0..3] on the control card). Those bits are then fed into U36 and U42 on the Field Decode sheet to produce the register latch signals. Look at the microcode page to find the LATCH register defines: R_MSW is 1, R_C is 2, and so on. When writing microcode, you could cause SP to latch data on the Z bus by generating L(R_SP,LWORD).
However, look at U42A on the Field Decode sheet. If the LATCH[0..3] bits are all 1 (i.e. 15, or R_IR_REG), then instead of using the register defined in the LATCH bits, the low 3 bits of the instruction register are used. Looking again at the pop instructions and the least significant 3 bits of the instruction encoding:
0x09: pop MSW, low 3 bits - 001, R_MSW (1)
0x0a: pop C, low 3 bits - 010, R_C (2)
0x0b: pop PC, low 3 bits - 011, R_PC (3)
0x0c: pop DP, low 3 bits - 100, R_DP (4)
0x0d: pop SP, low 3 bits - 101, R_SP (5)
0x0e: pop A, low 3 bits - 110, R_A (6)
0x0f: pop B, low 3 bits - 111, R_B (7)
So, by taking advantage of bits from the instruction encoding, we can allow multiple instructions to share the same microcode. Note that you don't have to use this trick - any of the pop instructions could be moved to a different location - but you would not be able to use shared microcode for them.
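The register selection trick can be modelled with a few lines of C. This is a sketch of the behaviour described for U42A, not a transcription of the actual logic: the register codes are the ones listed in the text, while the function name latch_target and its signature are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* LATCH register codes named in the text. */
enum {
    R_MSW = 1, R_C = 2, R_PC = 3, R_DP = 4,
    R_SP = 5, R_A = 6, R_B = 7,
    R_IR_REG = 15   /* all four LATCH bits set: take the register from IR */
};

/* Return the register code that actually gets latched.
 * latch_field is the 4-bit LATCH value from the microinstruction;
 * ir is the opcode byte held in the instruction register. */
uint8_t latch_target(uint8_t latch_field, uint8_t ir)
{
    assert(latch_field <= 15);
    if (latch_field == R_IR_REG) {
        return ir & 0x07;   /* low 3 bits of the instruction register */
    }
    return latch_field;     /* register named directly by the microcode */
}
```

With L(R_IR_REG,LWORD) in the shared microcode, opcode 0x09 selects R_MSW and opcode 0x0f selects R_B, which is why all seven pop instructions can run the same sequence.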
In other words, it is easy to replace a microcode instruction but not always easy to move it to a different opcode value. To add a new microcode instruction, you could just replace an old one. For example, TRAPO is not currently used.
Hope this helps, ...Bill
The microcode consists of 512 40-bit words, and is divided into 2 blocks of 256 microinstructions. The first half of the microcode is directly indexed by the opcode value. For example, the opcode for the MEMCOPY instruction is 0xE8, so when the instruction is loaded into the instruction register it directly tells where to begin executing microinstructions. Note, though, that all subsequent microinstruction execution happens in the second group of 256 microinstructions.
Also, to simplify the microcode sequencing mechanism, each microinstruction contains the address of the next microinstruction to execute. Conditional execution is handled by either using the NEXT field or resetting to a next of 0 (to terminate execution of this instruction and proceed to the next instruction).
Anyway, I hope this helps.
…Bill
I don't think it would be correct to think of the microcode as something that is "loaded". It is always there, from the start.