Conceptual - adesutherland/CREXX GitHub Wiki

Conceptual Model

Describes the overall cREXX architecture and its major building blocks. It assigns capabilities and technologies to individual components and makes a first pass at the application build, deployment and support processes.

REXX Language Levels

As described in the Contextual Model, the cREXX project will be delivered in 4 project phases. The project will deliver different flavours (levels) of REXX, starting with a core REXX subset (Level A) delivered in Phase 0; this will be extended in later phases.

The following diagram details the REXX Language Levels.

/images/rexx-levels.svg

Level A is important because it is both

  • the proof of concept of the project, proving the architecture and the motivation of the team to deliver, and

  • the language that much of the cREXX components will initially be implemented in.

Level A will be extended to Level C (Classic REXX) and potentially to Level E (OORexx). It is expected that Classic REXX will be extended (as a compatible superset) to bring in new language features (e.g. USE).

As an example to explain the approach: Level A will only have low-level built-in functions, implemented directly in the bytecode interpreter. Level C will have the complete set of standard classic REXX functions; these will be implemented in REXX Level A.

Level B represents a clean break - a modernised minimal REXX bringing in object-oriented capabilities and type safety. It will be a REXX but likely not compatible with Level A. This in turn will be extended to

  • a full General purpose REXX (Level G), and

  • a specialist REXX designed for computer language engineering (Level L)

Almost all of the cREXX components will eventually be implemented in REXX Level L.

cREXX Components

The following diagram details the components that make up cREXX.

/images/crexx-components.svg

  • 0. cREXX End2end Controller. Orchestrates and wraps the cREXX components. The command line tool itself.
  • 1. REXX Parser. Parses REXX source to an internal representation.
  • 2. REXX Validator. REXX Validation
  • 3. REXX Optimiser. REXX Optimisations
  • 4. REXX Assembler Generator. Generates REXX Assembler.
  • 5. REXX Assembler. Converts REXX Assembler to REXX Bytecode.
  • 6. REXX Bytecode Interpreter. The Bytecode Interpreter.
  • 7. REXX Runtime Library. Runtime
  • 8. REXX LLVM IR Generator. Converts REXX Assembler to LLVM IR

Finally, the LLVM Optimiser and Assembler. The LLVM infrastructure, which generates native binaries.

cREXX Process Flow

The following diagram shows how the cREXX components work to process REXX.

/images/crexx-flow.svg

One of the project principles is that the interfaces between the components be documented. The following are the data formats that support end-to-end processing.

  • A. REXX Source Code.
  • B. REXX Abstract Syntax Tree.
  • C. REXX Symbol Table.
  • D. REXX Assembler Code.
  • E. REXX Bytecode.
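To make the hand-offs between formats A to E concrete, here is a minimal Python sketch that chains stand-in stages together. Everything in it is a hypothetical illustration: the toy language (only `say <text>`), the function and class names, the `SAY #n` assembler syntax and the opcode number are assumptions made for this example, not the actual cREXX formats.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for formats B and C.
# Format A (source) and D (assembler) are plain strings; E (bytecode) is bytes.
@dataclass
class Ast:
    nodes: list  # list of (instruction, operand) tuples

@dataclass
class SymbolTable:
    literals: list = field(default_factory=list)

def parse(source: str) -> Ast:
    """A -> B: toy parser that splits each line into keyword and operand."""
    nodes = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        keyword, _, operand = line.partition(" ")
        nodes.append((keyword.lower(), operand.strip()))
    return Ast(nodes)

def validate(ast: Ast) -> SymbolTable:
    """B -> C: reject unknown instructions and collect distinct literals."""
    table = SymbolTable()
    for keyword, operand in ast.nodes:
        if keyword != "say":
            raise SyntaxError(f"unknown instruction: {keyword}")
        if operand not in table.literals:
            table.literals.append(operand)
    return table

def generate_assembler(ast: Ast, table: SymbolTable) -> str:
    """B + C -> D: emit one made-up `SAY #n` line per instruction."""
    return "\n".join(f"SAY #{table.literals.index(op)}" for _, op in ast.nodes)

OPCODE_SAY = 0x01  # invented opcode number

def assemble(asm: str) -> bytes:
    """D -> E: encode each line as the byte pair (opcode, literal index)."""
    out = bytearray()
    for line in asm.splitlines():
        mnemonic, arg = line.split()
        if mnemonic != "SAY":
            raise SyntaxError(f"unknown mnemonic: {mnemonic}")
        out += bytes([OPCODE_SAY, int(arg.lstrip("#"))])
    return bytes(out)

def compile_source(source: str):
    """Run the stages in order, handing each format to the next stage."""
    ast = parse(source)
    table = validate(ast)
    asm = generate_assembler(ast, table)
    return asm, assemble(asm), table
```

Because each stage only sees the documented format it consumes, any one of these functions could be swapped for a different implementation without touching the others.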

cREXX Specifications

The following diagram details the required cREXX specifications.

/images/crexx-specifications.svg

  • S0. cREXX Tools Reference.
  • S1. REXX Language Specification.
  • S2. REXX Internal Structures Specification.
  • S3. REXX Assembler Specification.
  • S4. REXX Bytecode Specification.
  • S5. REXX Library Specification.
  • S6. REXX on LLVM Specification.
  • S7. REXX External Interfaces Specification.

Technology Mapping

Because the components are decoupled, coordinated externally, and have defined interfaces, they can be implemented separately and in different technologies. In time there could be alternative implementations of the same component; indeed, the need to support different levels of REXX makes this inevitable.

The following diagram shows the technology of each component through the 4 project phases. The aim for the final phase, Phase 3, is to have REXX implemented in REXX.

/images/crexx-technology.svg

Component Reuse

As described previously, cREXX will support multiple levels of REXX and target multiple platforms.

To achieve this aim efficiently the architecture is designed to maximise reuse of the same components. This is detailed in the following diagram.

/images/crexx-reuse.svg

Even components that cannot logically be shared among REXX levels, for example the Parser and Runtime Library, can be physically implemented so that a single physical component supports multiple levels.

Platform-specific integration (e.g. to provide system IO) will be provided by the Bytecode Interpreter (or the LLVM infrastructure). Specifically this means that

  • System integration will be implemented via low-level Bytecode instructions
  • Level A and B languages may have a very minimal Runtime Library, as features are built into the Bytecode instructions
  • The Runtime Library can therefore be implemented in REXX
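The following Python sketch illustrates the idea behind these points: the only platform-specific step lives inside the interpreter as an instruction, and a "runtime library" routine is built purely from instructions. The opcode names (`PUSH`, `CONCAT`, `SYS_WRITE`, `HALT`), the `say` helper and the stack-machine design are all assumptions chosen for brevity; they are not the cREXX instruction set or VM design.

```python
import io
import sys

# Invented opcodes, for illustration only.
PUSH, CONCAT, SYS_WRITE, HALT = range(4)

def run(program, out=sys.stdout):
    """Execute a list of (opcode, operand) pairs on a tiny stack machine."""
    stack = []
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == PUSH:
            stack.append(arg)
        elif op == CONCAT:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == SYS_WRITE:
            # The only platform-specific step: the interpreter owns the I/O.
            out.write(stack.pop() + "\n")
        elif op == HALT:
            break
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack

def say(*parts):
    """A hypothetical library routine expressed purely as bytecode."""
    program = [(PUSH, "")]
    for part in parts:
        program += [(PUSH, part), (CONCAT, None)]
    program += [(SYS_WRITE, None), (HALT, None)]
    return program

# Usage: capture the output in a buffer instead of writing to the terminal.
buffer = io.StringIO()
run(say("Hello, ", "cREXX"), out=buffer)
```

Porting this toy to another platform would only mean changing the `SYS_WRITE` branch; the `say` routine itself contains nothing platform-specific.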

Component Details

REXX Parser

The target for Phase 3 is to have a PEG-based parser (which conceptually can be considered to be a recursive descent parser with unlimited lookahead). Conceptually (and as a key assumption for the grammar specifications) it is a PEG parser supporting left recursion. This will be implemented either by a Packrat parser or by using the new right-to-left Pika parser algorithm.
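As a rough illustration of the Packrat idea (recursive descent with ordered choice, backtracking, and memoisation of every rule result by position), here is a minimal Python sketch. The grammar, the class name `PackratParser` and the `hits` counter are assumptions made for this example. Note that this simple form does not support left recursion; the grammar below is deliberately right-recursive, and handling left recursion needs an extended Packrat scheme or a Pika-style parser.

```python
class PackratParser:
    """Memoising PEG parser for the toy grammar:
         expr <- term '+' expr / term
         term <- NUMBER / '(' expr ')'
    """

    def __init__(self, text):
        self.text = text
        self.memo = {}  # (rule name, position) -> (value, next position) or None
        self.hits = 0   # number of results served from the memo table

    def apply(self, rule, pos):
        """Evaluate a rule at a position at most once; reuse the result afterwards."""
        key = (rule.__name__, pos)
        if key in self.memo:
            self.hits += 1
        else:
            self.memo[key] = rule(pos)
        return self.memo[key]

    def expr(self, pos):
        # First alternative: term '+' expr
        first = self.apply(self.term, pos)
        if first is not None:
            value, nxt = first
            if nxt < len(self.text) and self.text[nxt] == "+":
                rest = self.apply(self.expr, nxt + 1)
                if rest is not None:
                    return value + rest[0], rest[1]
        # Second alternative: term (backtracks to pos; served from the memo)
        return self.apply(self.term, pos)

    def term(self, pos):
        # First alternative: NUMBER
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        if end > pos:
            return int(self.text[pos:end]), end
        # Second alternative: '(' expr ')'
        if pos < len(self.text) and self.text[pos] == "(":
            inner = self.apply(self.expr, pos + 1)
            if inner and inner[1] < len(self.text) and self.text[inner[1]] == ")":
                return inner[0], inner[1] + 1
        return None

def parse(text):
    """Parse the whole input and return the sum it denotes."""
    parser = PackratParser(text)
    result = parser.apply(parser.expr, 0)
    if result is None or result[1] != len(text):
        raise SyntaxError(f"cannot parse {text!r}")
    return result[0]
```

For example, `parse("1+(2+3)+4")` evaluates to 10. When the first alternative of `expr` fails after already matching a `term`, the second alternative reuses the memoised `term` result rather than re-scanning the input, which is what keeps backtracking cheap.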

This parsing capability will be provided in REXX Level L as an extension to the PARSE command.

For Phases 0-2 a 3-stage parsing strategy will be used, as detailed in the following diagram.

/images/parser-conceptual.svg

This 3-stage approach will deliver a performant solution that is powerful enough to handle REXX complexities.

Note that this requires the Lexer to be a state machine. A stateful lexer is generally considered an anti-pattern, so the use of state will be kept to the minimum needed. That said, the ANSI REXX standard itself specifies a state machine lexer.
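To show why some lexer state is hard to avoid, consider REXX comments, which may be nested: the lexer has to remember the nesting depth, which a stateless pattern match cannot do. The Python sketch below is a hand-written illustration under simplifying assumptions. The state names, token names and the `lex` function are hypothetical, only single-quoted strings are handled, and doubled-quote escapes are ignored. It is not the cREXX lexer.

```python
def lex(source):
    """Tiny mode-switching lexer with NORMAL, COMMENT (nestable) and STRING states."""
    tokens = []
    state = "NORMAL"
    depth = 0  # comment nesting depth while in COMMENT
    start = 0  # where the current multi-character token began
    i = 0
    while i < len(source):
        ch = source[i]
        two = source[i:i + 2]
        if state == "COMMENT":
            if two == "/*":
                depth += 1
                i += 2
            elif two == "*/":
                depth -= 1
                i += 2
                if depth == 0:
                    state = "NORMAL"
            else:
                i += 1
        elif state == "STRING":
            if ch == "'":
                tokens.append(("STRING", source[start:i]))
                state = "NORMAL"
            i += 1
        else:  # NORMAL
            if two == "/*":
                state, depth = "COMMENT", 1
                i += 2
            elif ch == "'":
                state, start = "STRING", i + 1
                i += 1
            elif ch.isalnum():
                start = i
                while i < len(source) and source[i].isalnum():
                    i += 1
                tokens.append(("SYMBOL", source[start:i].upper()))
            elif ch.isspace():
                i += 1
            else:
                tokens.append(("OP", ch))
                i += 1
    if state != "NORMAL":
        raise SyntaxError(f"unterminated {state.lower()}")
    return tokens
```

The state here is limited to one mode variable and one counter, which is in the spirit of keeping lexer state to the minimum needed.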

The 3rd stage will allow error messages to be as useful as possible and compliant with the specifications.

This will be implemented using the RE2C lexer generator and the Lemon parser generator.