Conceptual REXX Level B - adesutherland/CREXX GitHub Wiki

Conceptual REXX Level B

Page Status: This page is ready for review for Phase 0 (PoC). It may contain errors and may change based on feedback and implementation experience.

This level is equivalent to REXX Level A except that the syntax has been updated without attempting backward compatibility.

Like Level A, Level B does not include the full REXX Runtime Library (built-in functions); it includes only low-level functions, each one implemented as a bytecode instruction.

Level B forms the foundation implementation for higher REXX levels not requiring Classic REXX compatibility. See REXX Language Levels.

It is expected that "REXX built in REXX" components will use REXX Level B.

The key differences with Level A (and therefore Classic REXX) are:

  • Keywords are reserved
  • Typed language
  • Object support
  • Tighter syntax rules designed to highlight potential REXX bugs quickly
  • Other Changes

Where REXX Level C (Classic REXX) is built on Level A (i.e. with the complete classic runtime library), REXX Level G (TypeREXX) is built on Level B.

Language Level

By default the language processor assumes Level C (Classic REXX) syntax.

  • This default behaviour may be overridden by the "level" command line option -level xyz, or by build options or other site wide configuration (to be defined).

Level B Programs must start with the command

OPTIONS LEVELB

This must be the first command (i.e. non-whitespace and non-comment text in the file) and have no other options. There can only be one OPTIONS LEVELB command in the program file.
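The placement rule above can be sketched as a small check. This is only an illustrative model in Python, not the cREXX tokeniser: the function names are hypothetical, comments are assumed to be non-nested /* ... */ blocks, and newlines or semicolons are assumed to end a command.

```python
import re

def first_command(source: str) -> list[str]:
    """Return the upper-cased words of the first non-comment, non-blank command."""
    # Drop /* ... */ comments (assumption: nested comments are not handled here).
    text = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    # Assumption: a newline or a semicolon ends a command.
    for clause in re.split(r"[;\n]", text):
        words = clause.split()
        if words:
            return [word.upper() for word in words]
    return []

def is_level_b(source: str) -> bool:
    """True only if the program opens with OPTIONS LEVELB and no other options."""
    return first_command(source) == ["OPTIONS", "LEVELB"]
```

For instance, a file whose first command is SAY, or OPTIONS LEVELB followed by another option word, would be rejected by this check.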

OPTIONS LEVELA etc. will be available for REXX Level A and related levels (including Level C - Classic REXX). However, using OPTIONS LEVELC in Classic REXX programs is not recommended unless there is a specific use case: other REXX processors will not understand OPTIONS LEVELC, and Level C is the default anyway.

In Phase 0 & 1, LEVEL and Module options (see next) will be processed by the tokeniser, hence the constrained syntax.

Module Management

The cREXX processor's other REXX language sub-options can be loaded using the command

OPTIONS option1 option2 ...

This must be directly after the OPTIONS LEVELB command (except comments / whitespace).

It is envisaged that the REXX options will either be built into the language processor or be supplied by external files. These will affect program parsing and therefore need to be included before other commands in the program.

Note: Currently no sub-options have been defined.

Keywords

Keywords (instructions and operators) cannot be used for variables or constant names.

Typed Variables

This approach is intended to be as simple and REXX-like as possible while introducing type safety, which allows more bugs to be highlighted before runtime.

Declaration

Variables will be implicitly declared to be of a certain type on their first assignment, for example:

  • Integer

    Variable = 0;

  • Float

    Variable = 0.0;

  • String

    Variable = ""

We will also allow

  • Integer: Variable = .INT, where INT is the built-in integer type (value 0).
  • Class: Instance = .ACLASS creates an instance via the default factory
  • Class: Instance = .ACLASS() also creates an instance via the default factory
  • Class: Instance = .ACLASS(ARG1,ARG2) creates an instance via a factory function with 2 arguments

Once a variable has been assigned a type, the type cannot be changed.
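A minimal Python sketch of this rule follows. The class and method names are hypothetical, and the behaviour on a later assignment of a different type is an assumption: here the value is converted to the declared type and an error is raised if that conversion fails.

```python
class Variables:
    """Toy variable pool: the first assignment fixes each variable's type."""

    def __init__(self):
        self._types = {}
        self._values = {}

    def assign(self, name, value):
        # The first assignment records the type; later ones must fit it.
        declared = self._types.setdefault(name, type(value))
        if not isinstance(value, declared):
            try:
                # Assumption: values are converted to the declared type.
                value = declared(value)
            except (TypeError, ValueError) as exc:
                raise TypeError(
                    f"{name} is {declared.__name__}; cannot take {value!r}"
                ) from exc
        self._values[name] = value
        return value

    def type_of(self, name):
        return self._types[name]
```

In this model, assigning 0 to a variable and later assigning the string "12" keeps the variable an integer holding 12, while assigning "abc" raises an error.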

Function Arguments

Arguments can be passed by reference or by value.

By example:

ARG a1 = 0, a2 = .int, expose a3 = .aclass, ?a4 = .aclass, a5 = .string[]

  • Arg a1 is an optional integer (and 0 if not specified in the call)
  • Arg a2 is a mandatory integer (pass by value)
  • Arg a3 is a mandatory instance of class aclass (pass by reference)
  • Arg a4 is an optional instance of class aclass (pass by value); it takes its value from the default factory if not specified in the call
  • Arg a5 is an array of strings and is one way to allow an arbitrary number of strings to be passed to the procedure (see also Ellipsis later)

Ellipsis (...) and ARG Operators

The last argument declaration can be an ellipsis ('...'), which shows that 0 or more arguments can be provided. For example:

ARG a1 = 0, a2 = .int, ... = .string

The '...' shows that an arbitrary number of .string arguments can be added to the end of the call.

The ? operator exists to query whether an argument was supplied:

  • ?a1 returns true if the optional arg a1 was specified.
  • ?a2 will always be true as a2 is not optional.

The pseudo-array arg allows access to the '...' arguments. Also see the Arrays section.

  • arg[1] or arg.1 gives the first '...' argument. These can signal OUTOFRANGE
  • arg[0], arg[], arg.0 or arg. return the number of '...' arguments
  • The type of this pseudo-array is the type of the '...' argument

The compatibility arg() operator is designed to provide some compatibility with classic REXX; by example:

  • arg() is equivalent to arg.0 etc. Type Integer.
  • arg(1) is equivalent to arg.1 etc. The type of this operator is the same as the '...' argument and like arg.1 can signal OUTOFRANGE
  • arg(4,E), arg(4,"E"), arg(4,Exxx), arg(4,"Exxx") etc. all return 1 (true) if there were 4 or more '...' arguments given or 0 (false) otherwise. E is Exists.
  • Likewise arg(4,'O') etc. (O is Omitted) is equivalent to ~arg(4,'E').
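The access rules listed above can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than the cREXX implementation: EllipsisArgs and OutOfRange are hypothetical names, indexing with 0 stands in for arg.0, and the option is passed as a string of which only the first letter is examined.

```python
class OutOfRange(Exception):
    """Stands in for the OUTOFRANGE signal."""

class EllipsisArgs:
    """Model of the arg pseudo-array and the arg() compatibility operator."""

    def __init__(self, *values):
        self._values = list(values)

    def __getitem__(self, index):
        # arg[0] is the number of '...' arguments; arg[n] is the n-th (1-based).
        if index == 0:
            return len(self._values)
        if not 1 <= index <= len(self._values):
            raise OutOfRange(index)
        return self._values[index - 1]

    def __call__(self, index=None, option=None):
        if index is None:
            return self[0]              # arg() is equivalent to arg.0
        if option is None:
            return self[index]          # arg(n) is equivalent to arg.n
        exists = index <= len(self._values)
        flag = option[:1].upper()       # "E", "Exists", "exxx" all select E
        if flag == "E":
            return 1 if exists else 0
        if flag == "O":
            return 0 if exists else 1
        raise ValueError(f"unknown option {option!r}")
```

With three '...' arguments, arg() gives 3, arg(3,"E") gives 1, and arg(4,"O") gives 1, while arg(4) raises the out-of-range error.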

Calling Convention

In cRexx the caller passes the argument registers by reference. A register flag indicates whether an argument value has been given or whether a default value should be used. The caller is responsible for any register copies needed for "pass by value" registers that are changed by the procedure.
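The convention described above can be illustrated with a toy model. Everything here is an assumption made for illustration: Register, call and add_step are hypothetical names, and the set of by-value registers that the procedure changes is simply handed to the caller, where a real compiler would presumably derive it.

```python
from dataclasses import dataclass

@dataclass
class Register:
    value: object = None
    given: bool = False   # flag: was an argument value supplied by the caller?

def call(procedure, registers, copy):
    """Caller side: copy the by-value registers the procedure changes,
    then pass every register by reference."""
    passed = [
        Register(reg.value, reg.given) if position in copy else reg
        for position, reg in enumerate(registers)
    ]
    return procedure(*passed)

def add_step(step, total):
    """Callee: 'step' is by value with default 1, 'total' is by reference."""
    amount = step.value if step.given else 1
    step.value = amount * 2     # changes its by-value argument
    total.value += amount       # change is visible to the caller
```

Because the caller copies register 0 before the call, the change to step stays local to the procedure, while the update to total is seen by the caller.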

Type promotion/safety

REXX will automatically convert / promote variable types between Integer, Float and String - this may cause a compile time or run time error (signal).

Constants

Constants are symbols that are unassigned REXX variables. They are read-only and have a value equal to their name.

If a symbol is used as a constant it cannot be subsequently used as a variable:

SAY CONSTANT_SYM; /* Prints "CONSTANT_SYM" */
CONSTANT_SYM = 1; /* "CONSTANT_SYM is a constant" error */ 
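The rule shown above can be sketched as a small symbol table. This is an illustrative Python model with hypothetical names (Symbols, ConstantError); it assumes that the first read of an unassigned symbol is what marks it as a constant.

```python
class ConstantError(Exception):
    """Raised when a symbol already used as a constant is assigned."""

class Symbols:
    def __init__(self):
        self._variables = {}
        self._constants = set()

    def read(self, name):
        if name in self._variables:
            return self._variables[name]
        # Unassigned symbol: it becomes a constant whose value is its name.
        self._constants.add(name)
        return name

    def assign(self, name, value):
        if name in self._constants:
            raise ConstantError(f"{name} is a constant")
        self._variables[name] = value
```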

Objects

This proposal is a simple but powerful object scheme which aligns to classic REXX's philosophy.

  • Object access syntax extends Classic REXX's stem concept
    • This includes class methods that can accept arbitrary compound values (i.e. in stem.xxx, the xxx)
  • Objects are created by a class factory function
  • Classes and Objects can implement one or more interface specifications
  • There is no class hierarchy
  • Abstract classes are supported but they need the missing methods added on creation. The aim is to have a simple syntax to support "anonymous" classes and "injection" patterns.
  • Classic STEMs will be implemented by a REXX class, which will use a foundation/native STEM-like array object as the base mechanism for storing collections

Object Access Syntax

Stem-like syntax:

ACLASS.ATTRIBUTE

or

ACLASS.METHOD()

Defining Classes

ACLASS: CLASS EXPOSE APUBLIC_MEMBER APUBLIC_VIRTUAL_MEMBER APUBLIC_METHOD

    ACLASS: FACTORY /* Default Factory */
        RETURN INSTANCE

    SPECIAL: FACTORY /* Special Factory - usage: Instance = .ACLASS.SPECIAL */
        RETURN INSTANCE

    AMEMBER = 0; /* Integer member */
    APUBLIC_MEMBER = 0.0; /* PUBLIC MEMBER */

    APUBLIC_METHOD: PROCEDURE = 0.0 /* Returns a float */
        ARG ...
        RETURN ret

    APRIVATE_METHOD: PROCEDURE = 0 /* Returns an int */
        ARG ...
        RETURN ret

    APUBLIC_VIRTUAL_MEMBER: GETMEMBER = .STRING
        ...
        RETURN val

    APUBLIC_VIRTUAL_MEMBER: SETMEMBER
        ARG val = .STRING
        ...
        RETURN

    *: GETMEMBER
    *: SETMEMBER
 
END

This also creates a default interface called ACLASS.

Defining Interfaces

ABEHAVOUR: INTERFACE /* Note: EXPOSE not needed as all members are exposed in an interface */

    APUBLIC_MEMBER = .FLOAT; 

    APUBLIC_METHOD: PROCEDURE = 0.0 /* Returns a float */
        ARG ...
END

Using an Interface

ACLASS: CLASS EXPOSE .ABEHAVOUR
  ...
END

Also for Anonymous Classes

Instance = .ABEHAVOUR WITH
    APUBLIC_MEMBER = 3.5;

    APUBLIC_METHOD: PROCEDURE = 0.0 /* Returns a float */
        ...
        RETURN result
END

Interfaces are most useful with ARG: a function can receive an object with a known specification without caring about the implementation.

ARG Arg1 = .ABEHAVOUR

Note that a class can support multiple interfaces as long as there is no "overlap", i.e. no exposed method or attribute is duplicated across them.
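The "no overlap" rule can be checked mechanically. The following Python sketch assumes each interface is represented simply as a set of exposed member names; find_overlap and the interface names used with it are hypothetical.

```python
def find_overlap(interfaces):
    """Map each member exposed by more than one interface to those interfaces.

    interfaces: dict of interface name -> set of exposed member names.
    An empty result means a class may implement all of them together.
    """
    owners = {}
    for interface, members in interfaces.items():
        for member in members:
            owners.setdefault(member, []).append(interface)
    return {
        member: sorted(names)
        for member, names in owners.items()
        if len(names) > 1
    }
```

For example, two interfaces that both expose APUBLIC_METHOD would be reported as overlapping on that member, so a class could not expose both.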

Defining a Singleton

STRATEGY: INTERFACE EXPOSE CONFIG_VALUE DO_SOMETHING
   CONFIG_VALUE = .INT
   DO_SOMETHING: PROCEDURE = 0

   ...

END

AN_IMPLEMENTATION: CLASS EXPOSE .STRATEGY
  CONFIG_VALUE = 12
  DO_SOMETHING: PROCEDURE = 0
     ... Actual Implementation
  ...
END

Then the following sets a singleton instance for the interface.

STRATEGY = .AN_IMPLEMENTATION()  

We can then do

SAY .STRATEGY.CONFIG_VALUE
.STRATEGY.DO_SOMETHING()

Abstract Classes

An abstract class is a class with a method declared (directly or via an interface) but with no body:

AACLASS: CLASS EXPOSE APUBLIC_METHOD

    AACLASS: FACTORY /* Default Factory */
        RETURN INSTANCE

    APUBLIC_METHOD: METHOD = 0.0 /* Returns a float - abstract, no body */
END

When the class is instantiated, the missing methods need to be added:

Instance = .AACLASS WITH
    APUBLIC_METHOD: METHOD = 0.0 /* Returns a float */
        ... /* Here is the missing functionality */
        RETURN result
    END

Why? The use case is injection - e.g. logic for a UI Widget

MyForm.ADD_WIDGET(.BUTTON WITH 
          Colour = "RED"
          Label = "Alarm"
          ONPRESS: METHOD
              /* Panic when the red button is pressed! */
              Call Panic
              RETURN
          END
      )
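The injection idea can be modelled outside REXX as "instantiation fails unless every missing method is supplied". The Python sketch below is only an analogy under that assumption; instantiate and AbstractClassError are hypothetical names, and members are passed as keyword arguments in place of the WITH block.

```python
import types

class AbstractClassError(Exception):
    """Raised when an abstract method is not supplied at creation."""

def instantiate(abstract_methods, **members):
    """Create an object, requiring every abstract method to be injected."""
    missing = sorted(set(abstract_methods) - set(members))
    if missing:
        raise AbstractClassError(f"missing methods: {', '.join(missing)}")
    return types.SimpleNamespace(**members)

# Usage mirroring the button example: the behaviour is injected on creation.
button = instantiate(
    ["on_press"],
    colour="RED",
    label="Alarm",
    on_press=lambda: "panic",
)
```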

Classes as Classes - Advanced Usage

The .CLASS class represents the class itself (rather than an object / instance) and is expected to be used only in specialist use cases, for example passing a class to a function or introspection.

class = .CLASS(.MYCLASS); /* class is a variable of type CLASS and has a value of .MYCLASS */
class = .CLASS(aninstance); /* class is a variable of type CLASS and has a value of the class of aninstance */

num_of_methods = class.method.0 /* Introspection will look like this - details TBC */

CALL SPECIAL_FUNCTION class /* We can pass class as an argument */
EXIT

SPECIAL_FUNCTION: PROCEDURE
   ARG c = .CLASS        /* Any class at all */
   ARG c = .ANINTERFACE  /* or perhaps, any class implementing ANINTERFACE */         

   /* Lets make an instance of this class */
   Instance = C.FACTORY;
   Instance2 = C.FACTORY("SPECIALFACTORY", arg1, arg2); /* Gets complex when you want to use a specific factory ... */
   /* Note using a string for the method name - otherwise we would have to have a METHOD object and that is losing 
      sight of the need to try and keep it simple ... */

Arrays

These look like STEMs but are actually (in their simplest default form) 1-based dynamic arrays, with .0 giving the array size.

Designed to be simple and fast and cover many basic use cases.

array.1 = "Value"
array.2 = "Last Value"

In this case

array.0 has value 2

This usage implicitly declares the array as a 1-dimensional, 1-based array of type string, which grows dynamically.

A 2-dimensional array of type float could be implicitly declared with

array.1.1 = 0.0

Alternative Syntax

Square brackets can be used e.g.

array[2] 

or

array.2 

are equivalent.

Explicit Declaration

Arrays can be explicitly declared

array = .int[10]            /* 1 Dimensional, 1 base, 10 elements */
array = .int[10,10]         /* 2 Dimensional, 1 base, 10x10 elements */
array = .int[0 to 10]       /* 1 Dimensional, 0 base, 11 elements */
array = .int[-2 to *]       /* 1 Dimensional, -2 base, dynamic growth (-2, -1, 0, 1, ...) */
array = .string[0 to 5, 4 to 10]  /* 2 Dimensional, 0 base and 4 base, 6x7 elements */
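To make the bounds in these declarations concrete, here is a one-dimensional Python model. It is a sketch, not the cREXX array object: the BoundedArray name is hypothetical, and it assumes that fixed-size arrays start filled with a default value, that growing arrays fill gaps with that default, and that out-of-bounds access is an error.

```python
class BoundedArray:
    """1-dimensional array with a lower bound and an optional upper bound.

    BoundedArray(1, 10)     models .int[10]
    BoundedArray(0, 10)     models .int[0 to 10]
    BoundedArray(-2, None)  models .int[-2 to *]
    """

    def __init__(self, low=1, high=None, default=0):
        self.low, self.high, self.default = low, high, default
        # Assumption: fixed-size arrays are pre-filled with the default.
        self._items = [] if high is None else [default] * (high - low + 1)

    def _offset(self, index):
        if index < self.low or (self.high is not None and index > self.high):
            raise IndexError(f"index {index} is outside the declared bounds")
        return index - self.low

    def __setitem__(self, index, value):
        offset = self._offset(index)
        if offset >= len(self._items):
            # Dynamic growth: pad any gap with the default value.
            self._items.extend([self.default] * (offset + 1 - len(self._items)))
        self._items[offset] = value

    def __getitem__(self, index):
        offset = self._offset(index)
        if offset >= len(self._items):
            raise IndexError(f"index {index} has not been assigned")
        return self._items[offset]

    def __len__(self):
        # Plays the role of array.0, the current number of elements.
        return len(self._items)
```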

Arrays are Objects

Arrays can be passed to functions

/* Array as an object */
options levelb
import rxfnsb

/* Array Returned */
array = create()
call print_elements array

/* Array exposed (pass by ref) */
call change array, "change line 2", 2
call print_elements array

/* Copy an array */
array2 =  array
call print_elements array2

return

print_elements: procedure
  arg input = .string[]
  say "Printing Array"
  do i = 1 to input.
    say i" =" input.i
  end
  return

change: procedure = .void
  arg expose input = .string[], line="changed line", record = 1
  input.record = line
  return

create: procedure = .string[]
  x.1  = "Line 1"
  x.2  = "Line 2"
  return x

Outputs

Printing Array
1 = Line 1
2 = Line 2
Printing Array
1 = Line 1
2 = change line 2
Printing Array
1 = Line 1
2 = change line 2

The Classic STEM object

Implemented in REXX, of course, as a class.

Note: With the correct algorithm this should be performant. If there are performance issues, this class will be implemented natively. Either way it will look like a REXX class to users.

Declaring

stem_instance = .STEM; /* By default stems store Strings */

or

stem_instance = .STEM(.Int); /* Stem storing Integers */
stem_instance = .STEM("Default Value"); /* Stem storing Strings with a default value */

Then usage is "Classic"

stem_instance.INDEX = VALUE
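As a rough model of a stem declared with a default value, the Python sketch below uses a hash map keyed by the tail. The Stem name and the choice to treat tails as plain strings are assumptions; how an unset element behaves when no default is declared is not modelled here.

```python
class Stem:
    """Associative store with a default value for unset elements."""

    def __init__(self, default):
        self._default = default
        self._items = {}

    def __setitem__(self, tail, value):
        # Assumption: tails are compared as strings, so 1 and "1" are the same.
        self._items[str(tail)] = value

    def __getitem__(self, tail):
        return self._items.get(str(tail), self._default)
```

A dictionary gives constant-time average lookups, which is one way the class could stay performant without a native implementation.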

Other Standard Classes

TBC

Tighter Syntax Rules

Syntax changes to be confirmed once the Phase 0 PoC implementation has been reviewed.

Other Changes

Changes to be confirmed once the Phase 0 PoC implementation has been reviewed.

Known changes to date:

  • % (integer division) and // (remainder) are deprecated; the alternative operators idiv and mod have been added as their respective replacements. Note that REXX and C-like languages have divergent meanings for %.
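The divergence is easy to trip over: in Classic REXX 7 % 2 is 3 (integer division), whereas in C-like languages 7 % 2 is 1 (remainder). The Python sketch below models the replacements, assuming idiv and mod keep the Classic REXX semantics of % and // (truncation toward zero, remainder taking the sign of the dividend); that carry-over is an assumption, not something this page confirms.

```python
def idiv(a: int, b: int) -> int:
    """Integer division truncating toward zero (Classic REXX %)."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient

def mod(a: int, b: int) -> int:
    """Remainder with the sign of the dividend (Classic REXX //)."""
    return a - idiv(a, b) * b
```

Note that Python's own // and % floor rather than truncate, which is why the helpers do not simply wrap them.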

Phase 0 (PoC Prototype) Scope

The following elements are supported in Phase 0:

  • REXXLEVEL
  • Integer and String Variables
    • Excluding Objects / Containers / Stems
  • Expressions
    • Operators + - / *
    • Concat || Abuttal
    • Comparison EQ NE LT GT
  • Comments
  • Assignments
  • Grouping
    • IF THEN ELSE
    • DO END (Instruction Grouping)
    • DO TO ("for loop")
    • LEAVE
    • ITERATE
    • NOP
  • Input / Output
    • PARSE PULL
    • SAY
    • ADDRESS COMMAND
  • Functions
    • CALL / PROCEDURE / RETURN
    • PARSE ARG
    • Core Functions: CONFIG_SUBSTR CONFIG_LENGTH CONFIG_PULL
    • REXX BiF: SUBSTR, WORD, WORDS, LENGTH