Specifications - jpursey/oz-3 GitHub Wiki
Design philosophy
The OZ-3 is a virtual CPU, building on the general architecture ideas and quirks of the OZ-1 and OZ-2 (see History).
Just quirky enough
The goal of the OZ-3 is to be the heart of a "computer" exposed more or less directly in games and toy applications. It has the following top-level goals:
- Old school: The OZ-3 is intended to "feel" like an old 16-bit style CPU from the 1980s and 1990s. However, it deliberately is not an actual copy, so programmers (gamers) have a sense of discovery and can't just take existing code and run it.
- Unique: It has unique or uncommon features that diverge from most/any other CPU. In the OZ-3, the prime example is the separate and configurable memory banks for code, stack, and memory.
- Simple: While the CPU inspiration is certainly on the Z80/x86 side, there is more uniformity in the register layout and opcodes. This allows for fewer workarounds when programming (for instance, due to limited addressing modes supported by an opcode). It also makes programs shorter as a result, which is important with the relatively small memory footprint.
- Less boilerplate: Assembly programming is always going to be verbose, but the OZ-3 has a generous amount of opcode space, and so in true CISC style, it supports several complex opcodes for commonly needed tasks (pushing/popping register sets, counting, looping, etc). This also adds to the discovery and options when it comes to optimization.
- Extensible: The CPU is intended to be embeddable within a game, and so the architecture and runtime are designed to support external devices, coprocessors (including expanding opcodes), etc. without needing to change the core.
- Fun: The processor is ridiculous and would never exist as a real processor. The bus requirements combined with the limited instructions are just silly. The cycle counts are self-consistent and kinda real-ish, but not really. Overall, the design is likely actively hostile to an efficient hardware implementation. But that's not the point. It has the veneer of being real, but the decisions are based more on what would be fun -- especially in the context of a game.
Backward compatibility
While its intended use case is the same as the OZ-1 and OZ-2, it is not backward compatible at all. The obvious reason for this is that there are no existing OZ-1 and OZ-2 programs -- so why bother. However, this is a nostalgia passion project, and backward compatibility can be "fun" (after all, I did make the OZ-2 backward compatible with the OZ-1). So the more nitty-gritty reasons for breaking backward compatibility are:
- Simplify the organization of the opcodes (grouping similar opcodes together).
- Replace the "error" flag with an "overflow" flag in the Z80 style, which better supports signed integer use cases. The "error" flag was very unconventional, ambiguous (what was the error?), and didn't really fit with the goals of the machine anyway.
- Change floating point operations to be more of a coprocessor-style approach (separate registers and parallel execution).
- Remove the separate concept of an "address stack" and just use the single "stack" memory. The (now single) stack also follows the more typical pattern where pushing onto the stack decrements the stack pointer.
- Unify the memory model by adding 16 banks of memory (64 KiW each). The code, stack, and data can all be assigned to separate banks of memory (or the same bank). This also plays nicely with multiple cores.
- Overhaul addressing modes, supporting index style addressing directly, and remove the ability to do memory-to-memory operations. A register always holds a result, and a `MOV` to memory is only supported from a register. Explicit memory-to-memory block operations are supported via a DMA coprocessor.
Hardware specification
Overview
The OZ-3 is modelled as a 16-bit processor with limited support for 32-bit operations (generally requiring additional cycles and, depending on the address mode, potentially additional memory accesses). The following lists the various OZ-3 hardware components and how they work together.
- Main memory: The OZ-3 supports up to 16 banks of memory, each with 64 KiW of data. The memory is general purpose, and is used for code, data, and device memory mapping. Each memory bank has a dedicated 16-bit address bus and 16-bit data bus, which is shared across any cores, coprocessors, and external devices that support memory mapping. Each memory bank is logically divided into 16 pages (4 KiW each), and is statically configured with a 16-bit read mask and a 16-bit write mask indicating the read/write access permissions for each page. The masks cannot be changed by OZ-3 code at runtime; they are part of the system configuration.
- Cores: The OZ-3 CPU supports up to 8 cores that may run in parallel. Each core has its own dedicated set of registers, interrupt vector, and ability to fetch, decode, and execute opcodes. However, all cores share main memory, ports, and coprocessors. Each core is associated with up to four memory banks simultaneously (one for code, one for the stack, and two for data).
- Ports: The OZ-3 supports 256 logical I/O ports. Each I/O port has a 16-bit input data line and a 16-bit output data line, with a separate status flag for each indicating a write/read occurred.
- Interrupts: The OZ-3 has 32 interrupts triggered via dedicated interrupt lines. Cores, coprocessors, and devices can trigger interrupts on the lines they are connected with. Multiple devices and coprocessors can share the same line, however as they share a line they will be unable to raise interrupts simultaneously. Each interrupt is independently handled by each OZ-3 core, with the address stored in an interrupt vector settable by individual cores.
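The per-page read and write masks described for main memory can be illustrated with a minimal sketch. It assumes that bit N of a mask governs page N (the specification does not state the bit order), and the names `page_of`, `can_access`, `READ_MASK`, and `WRITE_MASK` are hypothetical, chosen only for this example.

```python
PAGE_SHIFT = 12  # 64 KiW / 16 pages = 4 KiW = 2**12 words per page


def page_of(address: int) -> int:
    """Return the page index (0-15) that contains a 16-bit word address."""
    return (address & 0xFFFF) >> PAGE_SHIFT


def can_access(mask: int, address: int) -> bool:
    """True if the page holding `address` is enabled in `mask`.

    Assumption: bit N of the mask corresponds to page N.
    """
    return bool((mask >> page_of(address)) & 1)


# Hypothetical bank configuration: all pages readable, pages 0-7 writable.
READ_MASK = 0xFFFF
WRITE_MASK = 0x00FF
```

With this configuration, `can_access(WRITE_MASK, 0x7FFF)` is true (page 7) while `can_access(WRITE_MASK, 0x8000)` is false (page 8).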
Registers
Registers define the working set memory and status for each OZ-3 core. They are private to the core, and so are only directly accessible by code running within the core.
- General purpose registers: The general registers in the OZ-3 are fully interchangeable across opcodes for a given bit-depth. Notably (compared to the Z80/8080 and friends) there are no "accumulator" or "index" registers.
  - 16-bit registers: The OZ-3 has 8 general purpose 16-bit registers: R0 to R7. These registers may also be used for addressing memory (with a scalar offset). When used in this way, registers R0-R3 refer to the `DATA` memory bank, registers R4-R5 refer to the `EXTRA` memory bank (data bank #2), and registers R6-R7 refer to the `STACK` memory bank.
  - 32-bit registers: The OZ-3 combines the 16-bit registers into pairs, yielding 4 general purpose 32-bit registers: D0 to D3. These are aliased over the 16-bit registers in little endian fashion, with R0 being the low 16 bits of D0 and R1 being the high 16 bits of D0.
- Special purpose registers: The OZ-3 also has several dedicated special purpose registers. These refer to addresses in each primary data bank and may be offset with a scalar value when referencing memory. Unlike the general purpose registers, they also have meaning tied to dedicated opcodes, and are manipulated directly by the OZ-3 as part of general operation.
  - Program Counter: The PC register indicates where the next instruction is located that the OZ-3 core will execute in the `CODE` memory bank. The PC register wraps around if execution passes the top of memory.
  - Stack Pointer: The SP register indicates where the top of the stack is in the `STACK` memory bank. Like the Z80, the top of the stack points to the address of the last pushed value (not the one after it). The stack pointer register starts at zero and wraps around.
  - Data Pointer: The DP register indicates where the bottom of main memory is in the `DATA` memory bank. Unlike other registers, this is the only register that can be offset with another register (not just a scalar value).
- Cache registers: The OZ-3 has two additional 16-bit temporary registers, C0 and C1, and a corresponding 32-bit variant aliased over them (CC). These cannot be directly accessed by code, but are used to temporarily store immediate data, or when operations involving the ALU are needed (the ALU only operates on registers).
- Bank Map: The BM register is a 16-bit register which indicates the banks being used by the core. OZ-3 code can configure the bank map with the `RST` instruction. There are four bank assignments, each represented by 4 bits. Bank assignments can overlap (they all can refer to the same physical memory bank).
  - `CODE`: This specifies the bank the PC register references, and where instructions and their immediate operands are fetched from.
  - `STACK`: This specifies the bank the SP register references, and where stack data is stored and retrieved. General purpose registers R6-R7 also refer to addresses in this bank.
  - `DATA`: This specifies the primary data bank for all direct and indirect addressing of main memory (data that isn't the stack). The DP register references this bank, as well as the general purpose registers R0-R3. In many cases, the `DATA` bank and the `STACK` bank may map to the same physical memory.
  - `EXTRA`: This specifies the secondary data bank for general purpose use. The general purpose registers R4-R5 refer to this bank. This may be used to access additional memory, or to provide additional register addressing into the same bank as is mapped to `STACK` or `DATA`.
- Status: The ST register is a 16-bit register that contains all status flags for the core. See Flags for details.
- Interrupts:
  - Interrupt trigger: The IT register is a 32-bit register that contains a bit indicating whether an interrupt was raised for that index. It is set by the triggering code (usually an external device or coprocessor), and then it is cleared by the interrupt handler itself.
  - Interrupt vector: Each core has its own vector of 32 16-bit addresses which specify where the interrupt handler is located. If the address is zero, then no handler will be called (it does not call address zero). Otherwise the address is called within the specified `CODE` bank when the interrupt fires.
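As a rough illustration of the register aliasing and the register-to-bank mapping described above, the following sketch models R0-R7 as a list of 16-bit values. The `Registers` class and `bank_for_register` function are hypothetical names for this example, and it assumes that D1-D3 follow the same low/high pairing that the text spells out for D0.

```python
class Registers:
    """Eight 16-bit registers R0-R7 with 32-bit views D0-D3 aliased over them."""

    def __init__(self) -> None:
        self.r = [0] * 8  # R0..R7, each held as a 16-bit value

    def read_d(self, n: int) -> int:
        """Read Dn: R(2n) is the low word, R(2n+1) is the high word."""
        return (self.r[2 * n + 1] << 16) | self.r[2 * n]

    def write_d(self, n: int, value: int) -> None:
        """Write Dn by splitting the 32-bit value across the register pair."""
        self.r[2 * n] = value & 0xFFFF
        self.r[2 * n + 1] = (value >> 16) & 0xFFFF


def bank_for_register(n: int) -> str:
    """Logical bank addressed when Rn is used as a memory address."""
    if not 0 <= n <= 7:
        raise ValueError("register index must be 0-7")
    if n <= 3:
        return "DATA"
    if n <= 5:
        return "EXTRA"
    return "STACK"
```

Writing `0x12345678` to D0 through `write_d(0, ...)` leaves `0x5678` in R0 and `0x1234` in R1, matching the little endian aliasing.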
Flags
Each OZ-3 core has several flags stored in the `ST` status register:
Bit | Flag | Name | Description |
---|---|---|---|
0 | Z | Zero | Set when an operation results in zero |
1 | S | Sign | Set when an operation results in the high bit set |
2 | C | Carry | Set when an operation causes an unsigned overflow |
3 | O | Overflow | Set when an operation causes a signed overflow |
4 | I | Interrupt | When set, interrupts are enabled |
8 | T | Trap | When set, the core automatically halts after each instruction |
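To make the flag definitions concrete, here is a minimal sketch of how Z, S, C, and O could be derived for a 16-bit addition. The instruction set is still marked TODO, so which opcodes update which flags is an assumption here; `add16` and the `FLAG_*` constants are hypothetical names, with bit positions taken from the table above.

```python
from typing import Tuple

FLAG_Z = 1 << 0  # Zero
FLAG_S = 1 << 1  # Sign
FLAG_C = 1 << 2  # Carry (unsigned overflow)
FLAG_O = 1 << 3  # Overflow (signed overflow)


def add16(a: int, b: int) -> Tuple[int, int]:
    """Add two 16-bit values and return (result, flags)."""
    a &= 0xFFFF
    b &= 0xFFFF
    total = a + b
    result = total & 0xFFFF
    flags = 0
    if result == 0:
        flags |= FLAG_Z
    if result & 0x8000:
        flags |= FLAG_S
    if total > 0xFFFF:
        flags |= FLAG_C
    # Signed overflow: both operands share a sign that the result does not.
    if (~(a ^ b) & (a ^ result)) & 0x8000:
        flags |= FLAG_O
    return result, flags
```

For example, `add16(0x7FFF, 1)` yields `0x8000` with S and O set (a signed overflow with no carry), while `add16(0xFFFF, 1)` yields `0` with Z and C set.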
Instructions
TODO
Synchronization
The OZ-3 has several cores, coprocessors, and devices all with access to overlapping resources like memory, ports, the interrupt vector, and coprocessors. This means there are lots of race conditions and possible synchronization issues, which is not fun. The OZ-3 provides very strong synchronizing guarantees at the operation level, with each operation effectively operating in an atomic-like fashion. It optimizes for ease of understanding over maximizing (virtual) performance in the interest of fun. In fact, due to the heavy synchronization guarantees, it provides an interesting opportunity for programmers (gamers) to find optimal ways to use the shared resources.
Concurrent memory access
Each core and coprocessor may be configured to be able to access one or more of the available memory data banks. Fetch and store actions are implicitly fully synchronized, such that no two components are accessing the same memory bank at the same time. This is true for the duration the operation requires the memory bank (both reads and writes). Any other cores and coprocessors will be put into a wait state until the preceding operation is complete for the specified memory bank operation.
After the operation is complete, the highest priority core or coprocessor will be released from its wait state to run. Priorities are 16-bit signed integer values set statically by the machine configuration. The eight general purpose cores have priorities matching their ID (0-7). Each coprocessor has its own defined priority (see Coprocessors).
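The wait-and-release behavior can be sketched as a small lock object per memory bank. This is only an illustration: `BankLock` is a hypothetical name, and the sketch assumes that a numerically larger priority value wins, which the text above does not state explicitly.

```python
from typing import Dict, Optional


class BankLock:
    """Tracks which component holds a memory bank and who is waiting for it."""

    def __init__(self) -> None:
        self.owner: Optional[str] = None
        self.waiting: Dict[str, int] = {}  # component name -> priority

    def acquire(self, name: str, priority: int) -> bool:
        """Take the bank if it is free; otherwise enter the wait state."""
        if self.owner is None:
            self.owner = name
            return True
        self.waiting[name] = priority
        return False

    def release(self) -> Optional[str]:
        """Free the bank and hand it to the highest-priority waiter, if any.

        Assumption: the largest priority value is the highest priority.
        """
        self.owner = None
        if self.waiting:
            winner = max(self.waiting, key=lambda n: self.waiting[n])
            del self.waiting[winner]
            self.owner = winner
        return self.owner
```

If core 0 holds the bank while core 3 (priority 3) and a coprocessor with a hypothetical priority of 100 are waiting, the first `release()` hands the bank to the coprocessor and the next one to core 3.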
As a simple example, a core may execute a `MOV R1 (1234)` instruction (copy the 16-bit value at address 1234 to register R1). This will result in the following instruction microcode:
Microcode | Cycles | Description |
---|---|---|
`LK.C` | --- | Lock CODE memory bank. |
`ADR.C(PC)` | 1 | Set CODE address bus to `PC`. |
`ID` | 1 | Read opcode from CODE data bus, and decode it. This also advances PC by 2. |
`LDW.C(A)` | 1 | Read 16-bit word containing 1234 from the CODE data bus into register `A`. |
`UL.C` | --- | Unlock CODE memory bank. |
`LK.D` | --- | Lock DATA memory bank. |
`ADR.D(A)` | 1 | Set DATA address bus to 1234 stored in `A`. |
`LDW.D(R1)` | 1 | Read 16-bit word from DATA data bus into register `R1`. |
`UL.D` | --- | Unlock DATA memory bank. |
The total execution time of this instruction is 5 cycles (memory bank locks and unlocks are logically instantaneous). Other cores and coprocessors are blocked from accessing the CODE memory bank for the first three cycles and the DATA memory bank for the last two cycles. If the OZ-3 core has mapped the CODE and DATA memory banks to the same physical bank, then other cores and coprocessors will be blocked from that memory bank for the entire 5 cycles of execution.
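The cycle accounting for this example can be checked with a short sketch that walks the microcode and tallies how long each physical bank stays locked. The list `MOV_R1_ADDR`, the function `locked_cycles`, and the `bank_map` parameter are hypothetical names for this illustration; the microcode and cycle counts are copied from the table.

```python
from collections import Counter

# (microcode, cycles) pairs; lock and unlock steps take no time.
MOV_R1_ADDR = [
    ("LK.C", 0), ("ADR.C(PC)", 1), ("ID", 1), ("LDW.C(A)", 1), ("UL.C", 0),
    ("LK.D", 0), ("ADR.D(A)", 1), ("LDW.D(R1)", 1), ("UL.D", 0),
]


def locked_cycles(microcode, bank_map):
    """Return (total cycles, {physical bank: cycles locked}).

    bank_map maps logical bank letters ("C", "D") to physical bank numbers.
    """
    held = set()          # logical banks currently locked
    total = 0
    per_bank = Counter()
    for op, cycles in microcode:
        if op.startswith("LK."):
            held.add(op[3:])
        elif op.startswith("UL."):
            held.discard(op[3:])
        total += cycles
        # Charge the cycles to every distinct physical bank that is held.
        for physical in {bank_map[b] for b in held}:
            per_bank[physical] += cycles
    return total, dict(per_bank)
```

With CODE and DATA on separate banks (`{"C": 0, "D": 1}`) the result is 5 total cycles, with bank 0 locked for 3 and bank 1 locked for 2. Mapping both to bank 0 locks that bank for all 5 cycles.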
Concurrent port access
A port is a connection between a core and/or coprocessor and a single device. The only contention comes from multiple cores or coprocessors attempting to read or write to a port at the same time. Like main memory, all port reads and writes are synchronized. Cores and coprocessors will enter a (likely very brief) wait state if they attempt to read or write a port that is already being accessed. There is no synchronization between instructions, however, so if it is important for a core or coprocessor to read/write both the status and value of a port together, it must use the status-based `INS`/`OUTS` instructions instead of the `IN`/`OUT` instructions.
Interrupts
Coprocessors, cores, and external devices can all trigger interrupts, and so can attempt to trigger interrupts at the same time. Triggering an interrupt takes no time (in a simulated sense), is uniquely triggered per core, and it can only be cleared by the core itself, so there is no race condition for the actual setting and clearing of the interrupt trigger. However, there are race conditions when triggering an interrupt while an interrupt is in progress. The OZ-3 takes a relatively simple approach to this:
1. Interrupt trigger: Any time (0 cycles)
   - Sets the associated bit for the interrupt in the `IT` register (interrupt vector trigger).
   - Duplicate triggers (before handling) of the same interrupt are ignored (coprocessors and devices are notified whether this happens when they raise the interrupt).
2. Interrupt detection: After each instruction completes (0 cycles)
   - If the interrupt enable flag `I` is not set, then control flow continues (interrupts are disabled).
   - If no bits in the `IT` register are set, then control flow continues (there are no interrupts).
   - Continues to interrupt handler mapping (#3).
3. Interrupt handler mapping: (1+ cycles)
   - The lowest-index set interrupt is determined and its `IT` flag is reset.
   - If the handler is not set (add 1 cycle):
     - If no bit in `IT` is set, control flow continues with normal code execution.
     - If any bit in `IT` is set, restart interrupt handler mapping (#3).
   - Continues to interrupt handling (#4).
4. Interrupt handling: (3 cycles)
   - The current `PC` and `ST` registers are pushed onto the stack.
   - Status flag `I` is cleared. This disables interrupts.
   - The code can then optionally call `EI` and `DI` to enable and disable interrupts for the duration of this interrupt handler.
5. Interrupt handling end: (3 cycles)
   - The `PC` and `ST` registers are popped from the stack.
   - This will restore interrupt handling to what it was before the interrupt began.
Coprocessors
The OZ-3 core supports additional low-level functionality through coprocessors. Coprocessors run independently of the general purpose cores, but provide additional functionality through dedicated reserved opcodes. Each coprocessor has its own separate opcodes which are fetched and decoded by an OZ-3 core, and then passed with their arguments to the coprocessor for execution. Coprocessors may have direct access to all shared resources (memory, ports, interrupts). Some standard OZ-3 coprocessors are defined here. However, specific virtual computer implementations may also define their own coprocessors, with their own behavior and opcodes.
DMA processor
The OZ-3 supports a coprocessor core for doing block memory copies of pages between memory banks. It has eight 16-bit DMA request registers, which are mapped 1-to-1 with the general purpose cores of the OZ-3.
TODO
Math processor
The OZ-3 also supports a separate math coprocessor that supports basic floating point operations and various higher level functions.
TODO