Logical AST Specification - adesutherland/CREXX GitHub Wiki
CREXX Logical AST Specification
Page Status: This page is ready for review for Phase 0 (PoC). It may contain errors and may change based on feedback and implementation experience.
The AST specification covers all phases of the project and all REXX levels. This means that a single language processor (compiler backend) can be used across the project.
The initial specification handles only the Phase 0 scope and language levels A and B, but the intention is to extend it, where possible without breaking existing code (that is, in a backwards-compatible way).
The AST notation is described here, essentially:
(root child1 child2 child3) - simple tree with 3 children
(root child1 (subroot subchild1 subchild2) child3) - a tree with a subtree
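As an illustration of the notation only, the following minimal Python sketch reads a tree written this way into nested lists. The function name `parse_ast` and the nested-list representation (first element is the node, the rest are its children) are assumptions made for this example; they are not part of the CREXX implementation.

```python
def parse_ast(text):
    """Parse '(root child1 (subroot a b) child3)' into nested lists.

    Hypothetical helper: a subtree becomes a list whose first element is
    the node name; a leaf stays a plain string.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == ")":
            raise ValueError("unexpected ')'")
        if token != "(":
            return token  # a leaf node
        node = []
        while pos < len(tokens) and tokens[pos] != ")":
            node.append(read())
        if pos >= len(tokens):
            raise ValueError("missing ')'")
        pos += 1  # consume the closing ')'
        if not node:
            raise ValueError("empty tree")
        return node

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing input after the tree")
    return tree


tree = parse_ast("(root child1 (subroot subchild1 subchild2) child3)")
print(tree)
```

The nested-list form is convenient for small checks of the node formats on this page, since each format can be compared against ordinary Python lists.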
AST Node List
Node Type | Description | Terminal |
---|---|---|
ABS_POS | Absolute Pos for Parsing | No |
ADDRESS | Address Instruction | No |
ARG | Argument input for Parsing | Yes |
ASSIGN | Assign Instruction | No |
BY | BY part of REPEAT part of DO | No |
CALL | Call Instruction | No |
CONST_SYMBOL | Constant Symbol | Yes |
DO | Do Instruction | No |
ENVIRONMENT | Environment for Address | Yes |
ERROR | Error Marker | No |
FOR | FOR part of REPEAT part of DO | No |
FUNCTION | Function | No |
IF | If Instruction | No |
INSTRUCTIONS | Instruction List | No |
ITERATE | Iterate Instruction | No |
LABEL | Label | Yes |
LEAVE | Leave Instruction | No |
NUMBER | Number | Yes |
OP_ADD | Add/Subtraction Op | No |
OP_AND | And Op | No |
OP_COMPARE | Compare Op | No |
OP_CONCAT | Concat Op | No |
OP_MULT | Multiply/Divide Op | No |
OP_OR | Or Op | No |
OP_POWER | Power Op | No |
OP_PREFIX | Prefix Op (`+`, `-`, `\`) | No |
OP_SCONCAT | Concat with space Op | No |
OPTIONS | Options for Parsing | No |
PARSE | Parse Instruction | No |
PATTERN | Pattern for Parsing | No |
PROCEDURE | Procedure | No |
PROGRAM_FILE | AST Root for a file | No |
PULL | Pull input for Parsing | Yes |
REL_POS | Relative Pos for Parsing | No |
REPEAT | Repeat part of DO loop | No |
RETURN | Return Instruction | No |
REXX | Language Level and Options | No |
SAY | Say Instruction | No |
SIGN | Rel pos direction for Parsing | Yes |
STRING | String | Yes |
TARGET | Target for Parsing | No |
TEMPLATES | Template List for Parsing | No |
TO | TO part of REPEAT part of DO | No |
TOKEN | Generic token | Yes |
UPPER | Upper Option for Parsing | Yes |
VAR_SYMBOL | Variable | Yes |
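One way to use the Terminal column is as a consistency check on a tree: a node type marked "Yes" should never appear with children. The sketch below assumes a nested-list representation (first element is the node type, the rest are children); the names `TERMINALS` and `terminals_with_children` are hypothetical and exist only for this example.

```python
# Node types marked "Yes" in the Terminal column of the table above.
TERMINALS = {
    "ARG", "CONST_SYMBOL", "ENVIRONMENT", "LABEL", "NUMBER", "PULL",
    "SIGN", "STRING", "TOKEN", "UPPER", "VAR_SYMBOL",
}


def terminals_with_children(tree):
    """Return the terminal node types that wrongly appear with children.

    Assumed representation: a subtree is a list [node_type, child, ...];
    a leaf is a plain string.
    """
    found = []
    if isinstance(tree, list) and tree:
        head, *children = tree
        if head in TERMINALS and children:
            found.append(head)
        for child in children:
            found.extend(terminals_with_children(child))
    return found


good = ["SAY", ["OP_ADD", "NUMBER", "NUMBER"]]
bad = ["SAY", ["NUMBER", "STRING"]]  # NUMBER is terminal but has a child
print(terminals_with_children(good))
print(terminals_with_children(bad))
```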
Format of Non-Terminal Nodes
File Scope
(PROGRAM_FILE REXX INSTRUCTIONS?)
Language Options
(REXX level:CONST_SYMBOL options:CONST_SYMBOL*)
Error
(ERROR TOKEN+)
This node is not an actual token; instead it is inserted into the token stream, and the actual offending TOKEN(s) are added as children.
Instructions
(INSTRUCTIONS instruction*)
Where instruction is one of:
- ADDRESS, ASSIGN, CALL, DO, IF, INSTRUCTIONS, ITERATE, LABEL, LEAVE, PARSE, PROCEDURE, RETURN, SAY
Expressions
Expression (expr) nodes are one of:
expr <-
(OP_OR expr expr) /
(OP_AND expr expr) /
(OP_COMPARE expr expr) /
(OP_CONCAT expr expr) /
(OP_SCONCAT expr expr) /
(OP_ADD expr expr) /
(OP_MULT expr expr) /
(OP_POWER expr expr) /
(OP_PREFIX expr) /
(FUNCTION expr*) /
CONST_SYMBOL /
VAR_SYMBOL /
NUMBER /
STRING
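The expr rule above can be read as a recursive shape check. The following sketch is one possible reading, assuming trees are held as nested lists (first element is the node type) and leaves as plain strings; `is_expr`, `BINARY_OPS` and `LEAVES` are names invented for this example.

```python
# Operators that take exactly two expr children in the rule above.
BINARY_OPS = {
    "OP_OR", "OP_AND", "OP_COMPARE", "OP_CONCAT",
    "OP_SCONCAT", "OP_ADD", "OP_MULT", "OP_POWER",
}
# Terminal alternatives of the expr rule.
LEAVES = {"CONST_SYMBOL", "VAR_SYMBOL", "NUMBER", "STRING"}


def is_expr(node):
    """Return True if node matches the expr rule (assumed representation)."""
    if isinstance(node, str):
        return node in LEAVES
    if not isinstance(node, list) or not node:
        return False
    head, *children = node
    if head in BINARY_OPS:
        return len(children) == 2 and all(is_expr(c) for c in children)
    if head == "OP_PREFIX":
        return len(children) == 1 and is_expr(children[0])
    if head == "FUNCTION":
        return all(is_expr(c) for c in children)  # zero or more arguments
    return False


# A tree of the shape one might expect for the REXX expression: a + b * 2
tree = ["OP_ADD", "VAR_SYMBOL", ["OP_MULT", "VAR_SYMBOL", "NUMBER"]]
print(is_expr(tree))
```

Note that the check only covers node shapes; it says nothing about which operator (for example add versus subtract under OP_ADD) a node carries, since that detail is not specified in this section.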
Address
(ADDRESS ENVIRONMENT? expr?)
Assignment
(ASSIGN VAR_SYMBOL expr)
Call
(CALL CONST_SYMBOL expr*)
Do
(DO (REPEAT assignment (TO expr)? (BY expr)? (FOR expr)?) instructions*);
Note that a simple DO / END maps to (INSTRUCTIONS instruction*).
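To make the optional TO, BY and FOR parts concrete, the sketch below assembles a DO node and prints it in the notation used on this page. It assumes nested lists as the tree representation and treats the loop body as a list of child nodes appended as given; `make_do` and `render` are hypothetical helpers, not CREXX functions.

```python
def render(node):
    """Render a nested-list tree in the (root child ...) notation."""
    if isinstance(node, str):
        return node
    return "(" + " ".join(render(child) for child in node) + ")"


def make_do(assign, body, to=None, by=None, for_=None):
    """Build (DO (REPEAT assign (TO e)? (BY e)? (FOR e)?) body...)."""
    repeat = ["REPEAT", assign]
    # Optional parts are added in the order given by the DO format.
    for name, expr in (("TO", to), ("BY", by), ("FOR", for_)):
        if expr is not None:
            repeat.append([name, expr])
    return ["DO", repeat, *body]


# Shape one might expect for: do i = 1 to 10 by 2; say i; end
loop = make_do(
    ["ASSIGN", "VAR_SYMBOL", "NUMBER"],
    [["INSTRUCTIONS", ["SAY", "VAR_SYMBOL"]]],
    to="NUMBER",
    by="NUMBER",
)
print(render(loop))
```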
If
(IF expr true:INSTRUCTIONS false:INSTRUCTIONS?)
Iterate
(ITERATE VAR_SYMBOL?)
Label
LABEL
Leave
(LEAVE VAR_SYMBOL?)
Parse
(PARSE (OPTIONS UPPER?) in (TEMPLATES template+))
in <- ARG / PULL;
template <- target / pattern / abs_pos / rel_pos;
target <- (TARGET VAR_SYMBOL?);
pattern <- (PATTERN STRING/VAR_SYMBOL);
abs_pos <- (ABS_POS NUMBER/VAR_SYMBOL);
rel_pos <- (REL_POS SIGN NUMBER/VAR_SYMBOL);
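The PARSE rules can be checked the same way as the other formats. This sketch assumes nested lists as the tree representation, reads `STRING/VAR_SYMBOL` and `NUMBER/VAR_SYMBOL` as "one or the other", and represents an OPTIONS node without UPPER as `["OPTIONS"]`; `is_template` and `is_parse` are names made up for the example.

```python
def is_template(node):
    """Check one template: target, pattern, abs_pos or rel_pos."""
    if not isinstance(node, list) or not node:
        return False
    head, *kids = node
    if head == "TARGET":
        return kids in ([], ["VAR_SYMBOL"])  # VAR_SYMBOL is optional
    if head == "PATTERN":
        return kids in (["STRING"], ["VAR_SYMBOL"])
    if head == "ABS_POS":
        return kids in (["NUMBER"], ["VAR_SYMBOL"])
    if head == "REL_POS":
        return kids in (["SIGN", "NUMBER"], ["SIGN", "VAR_SYMBOL"])
    return False


def is_parse(node):
    """Check (PARSE (OPTIONS UPPER?) in (TEMPLATES template+))."""
    if not isinstance(node, list) or len(node) != 4 or node[0] != "PARSE":
        return False
    _, options, source, templates = node
    return (
        options in (["OPTIONS"], ["OPTIONS", "UPPER"])
        and source in ("ARG", "PULL")
        and isinstance(templates, list)
        and len(templates) >= 2  # TEMPLATES plus at least one template
        and templates[0] == "TEMPLATES"
        and all(is_template(t) for t in templates[1:])
    )


# Shape one might expect for: parse upper arg a "," b
tree = ["PARSE", ["OPTIONS", "UPPER"], "ARG",
        ["TEMPLATES",
         ["TARGET", "VAR_SYMBOL"],
         ["PATTERN", "STRING"],
         ["TARGET", "VAR_SYMBOL"]]]
print(is_parse(tree))
```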
Procedure
(PROCEDURE LABEL instructions);
Return
(RETURN expr?)
Say
(SAY expr?)