ArchitectureGlossary - TypeCobolTeam/TypeCobol GitHub Wiki
Some definitions useful to understand the main concepts of TypeCobol:
| Concept | Definition |
|---|---|
| Token | Elementary piece of source text. A token cannot be subdivided. Tokens are produced during the Scanning step. |
| SyntaxProperty | Associates a Token with a value. It is especially useful for codegen and languageServer. |
| SymbolDefinition | One or more tokens used to define something. It can be a variable, a program name, a file name, ... |
| SymbolReference | One or more tokens used to reference something. It can be a variable, a program name, a file name, ... |
| URI | Deprecated. It duplicates SymbolReference/SymbolDefinition and will be deleted. |
| StorageArea | Represents a memory zone associated with a reference. |
| Variable | Can be a StorageArea or a constant value (like a literal). |
| CodeElement | An object which represents a Cobol statement, e.g. MoveStatement, InitializeStatement. CodeElements cannot be linked together: an IfStatement cannot contain another statement. You have to use a Node for that. |
| Node | A Node is usually associated with a CodeElement. Nodes can be linked together in a tree structure, so you can have an IfNode which contains other Nodes. |

Example:

    01 Var1 pic X.
    01 Var2 pic X.
    move Var1 to Var2

On the first line, 01, Var1, pic, X and . are tokens. The move statement is made of 4 significant tokens (move, Var1, to and Var2), all of which are separated by space-separator tokens.

See classes: TypeCobol.Compiler.Scanner.Token and TypeCobol.Compiler.Scanner.Scanner
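
The token split described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Token` class, the three token kinds and the regular expression are hypothetical and far simpler than what TypeCobol.Compiler.Scanner.Scanner actually does.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str  # hypothetical kinds: "word", "period" or "space"
    text: str


# One named alternative per token kind (simplified, not the real Cobol rules).
_TOKEN_RE = re.compile(r"(?P<space>\s+)|(?P<period>\.)|(?P<word>[A-Za-z0-9-]+)")


def scan(line):
    """Split one source line into a flat list of tokens, spaces included."""
    tokens = []
    pos = 0
    while pos < len(line):
        match = _TOKEN_RE.match(line, pos)
        if match is None:
            raise ValueError(f"unexpected character {line[pos]!r} at column {pos}")
        tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def significant(tokens):
    """Return the text of every token that is not a space separator."""
    return [token.text for token in tokens if token.kind != "space"]


move_tokens = scan("move Var1 to Var2")
print(significant(move_tokens))  # the 4 significant tokens
print(len(move_tokens))          # 4 significant tokens + 3 space separators
```

Keeping the space separators in the token list, rather than discarding them, mirrors the statement above that the significant tokens are separated by space-separator tokens.
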

Example:

    01 Var1 PIC X(02) GLOBAL.

The GLOBAL keyword will be turned into a SyntaxProperty<bool> which associates the GLOBAL keyword with the logical value true. It materializes the presence of the GLOBAL modifier for this data definition.

See classes: TypeCobol.Compiler.CodeElements.SyntaxProperty and TypeCobol.Compiler.Parser.CodeElementBuilder
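
As a rough illustration of the idea, the following Python sketch pairs a value with the token it came from. All names here (`Token`, `SyntaxProperty`, `split_words`, `global_property`) are hypothetical stand-ins written for this page, not the C# API of TypeCobol.

```python
import re
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Token:
    text: str
    column: int  # 0-based start column in the source line


@dataclass(frozen=True)
class SyntaxProperty(Generic[T]):
    value: T      # the interpreted value, e.g. True
    token: Token  # the token that produced the value


def split_words(line):
    """Very rough word split: runs of characters other than spaces and periods."""
    return [Token(m.group(), m.start()) for m in re.finditer(r"[^\s.]+", line)]


def global_property(tokens) -> Optional[SyntaxProperty[bool]]:
    """Return a SyntaxProperty for the GLOBAL keyword, or None if it is absent."""
    for token in tokens:
        if token.text.upper() == "GLOBAL":
            return SyntaxProperty(True, token)
    return None


prop = global_property(split_words("01 Var1 PIC X(02) GLOBAL."))
print(prop.value, prop.token.text, prop.token.column)
```

Because the property keeps a reference to its token, a tool that only holds the value can still find the exact source position it came from. This is presumably what makes the structure convenient for code generation and language-server features.
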

Example:

    IDENTIFICATION DIVISION.
    PROGRAM-ID. MyPgm1.

The token MyPgm1 will be turned into a SymbolDefinition with a type of ProgramName. The SymbolDefinition class associates a name (materialized by a token) with basic type information indicating the nature of the definition.

See classes: TypeCobol.Compiler.CodeElements.SymbolDefinition and TypeCobol.Compiler.Parser.CobolWordsBuilder

Example:

    PERFORM para-1 THRU para-4

Both para-1 and para-4 tokens will be turned into SymbolReference instances. Just like SymbolDefinition, a type is associated with the reference to describe the nature of the name used. Here the names are ambiguous because para-1 and para-4 may refer to paragraphs or sections.

See classes: TypeCobol.Compiler.CodeElements.SymbolReference and TypeCobol.Compiler.Parser.CobolWordsBuilder
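
The relationship between definitions and possibly ambiguous references can be sketched as follows. This is a minimal Python model, not TypeCobol code: the `SymbolType` members, the `candidate_types` field and the `resolve` helper are assumptions made for the example, and the real classes may represent ambiguity differently.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SymbolType(Enum):
    PROGRAM_NAME = auto()
    PARAGRAPH_NAME = auto()
    SECTION_NAME = auto()


@dataclass(frozen=True)
class SymbolDefinition:
    name: str
    type: SymbolType


@dataclass(frozen=True)
class SymbolReference:
    name: str
    candidate_types: frozenset  # more than one entry means the reference is ambiguous


def resolve(reference, definitions):
    """Return the definitions whose name and type are compatible with the reference."""
    return [
        d for d in definitions
        if d.name.lower() == reference.name.lower()  # Cobol names are case-insensitive
        and d.type in reference.candidate_types
    ]


definitions = [
    SymbolDefinition("MyPgm1", SymbolType.PROGRAM_NAME),
    SymbolDefinition("para-1", SymbolType.PARAGRAPH_NAME),
]
# para-1 in PERFORM para-1 THRU para-4 may name a paragraph or a section.
perform_target = SymbolReference(
    "para-1", frozenset({SymbolType.PARAGRAPH_NAME, SymbolType.SECTION_NAME})
)
print(resolve(perform_target, definitions))
```

Modeling the reference with a set of candidate types keeps the ambiguity explicit until the definitions are known, at which point only the matching paragraph definition remains.
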

Example:

    01 var1 PIC X.
    01 var2 PIC X.
    MOVE var1 TO var2

Both var1 and var2 are Variable objects using a StorageArea. A StorageArea object is defined primarily by a SymbolReference and two booleans, IsReadFrom and IsWrittenTo, indicating how the data is used in the current context. So here var1 is read and var2 is written.

See classes: TypeCobol.Compiler.CodeElements.StorageArea and TypeCobol.Compiler.Parser.CobolExpressionsBuilder
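
A minimal Python sketch of the read/write flags might look like this. The field names are snake_case renderings of IsReadFrom and IsWrittenTo; the `storage_areas_for_move` helper is hypothetical, and the symbol reference is reduced to a plain string for brevity.

```python
from dataclasses import dataclass


@dataclass
class StorageArea:
    symbol_reference: str        # simplified: just the referenced name
    is_read_from: bool = False   # the statement reads this memory zone
    is_written_to: bool = False  # the statement writes this memory zone


def storage_areas_for_move(sender, receiver):
    """Describe how MOVE sender TO receiver uses its two operands."""
    return (
        StorageArea(sender, is_read_from=True),
        StorageArea(receiver, is_written_to=True),
    )


source, target = storage_areas_for_move("var1", "var2")
print(source)
print(target)
```
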

Example:

    MOVE 'A' TO var1

Both the literal 'A' and var1 are represented as Variable objects for this statement. Variable objects are abstractions for all items manipulated inside a given Cobol statement. Here 'A' will be defined with an AlphanumericValue and the IsLiteral property will be true, whereas var1 will get a StorageArea pointing to the 'var1' symbol.

See classes: TypeCobol.Compiler.CodeElements.Variable and TypeCobol.Compiler.Parser.CobolExpressionsBuilder
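
The "literal or storage area" duality can be sketched as a small Python class. This is an illustrative model only: the `operand` helper and the quote-based literal detection are assumptions for the example, and only alphanumeric literals are covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageArea:
    symbol_name: str


@dataclass(frozen=True)
class Variable:
    storage_area: Optional[StorageArea] = None
    alphanumeric_value: Optional[str] = None

    @property
    def is_literal(self):
        """True when the variable carries a constant value instead of a storage area."""
        return self.alphanumeric_value is not None


def operand(text):
    """Build a Variable from operand text: quoted text is a literal, anything else a name."""
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "'\"":
        return Variable(alphanumeric_value=text[1:-1])
    return Variable(storage_area=StorageArea(text))


sender = operand("'A'")     # literal operand of MOVE 'A' TO var1
receiver = operand("var1")  # named operand of the same statement
print(sender.is_literal, sender.alphanumeric_value)
print(receiver.is_literal, receiver.storage_area.symbol_name)
```
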

Example:

    IF var1 = 'C' THEN
        DISPLAY "OK"
    ELSE
        DISPLAY "KO"
    END-IF

IF var1 = 'C' THEN, DISPLAY "OK", ELSE, DISPLAY "KO" and END-IF are all distinct CodeElement instances. Usually a CodeElement spans a single line of code. It is an intermediate structured representation between Tokens and Nodes. Unlike a Node, a CodeElement has no parent-children relationships, but it already captures the nature of statements or declarations. Here the two DISPLAY lines are described as DisplayStatement code elements, both using a literal variable.

CodeElements are produced during the SyntaxCheck step.

See classes: TypeCobol.Compiler.CodeElements.CodeElement and its derived classes, TypeCobol.Compiler.Parser.CobolStatementsBuilder, TypeCobol.Compiler.Parser.CodeElementsParserStep
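
To make the "flat, no parent-children" property concrete, here is a small Python sketch that turns the five lines into a flat list of typed elements. It assumes one code element per line and classifies by the first word only. IfStatement and DisplayStatement are named on this page, while ElseCondition and IfStatementEnd are illustrative names that may not match the real classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeElement:
    type: str  # nature of the statement, e.g. "DisplayStatement"
    text: str  # source text covered by the element


# Hypothetical mapping from leading keyword to code element type.
_TYPES_BY_KEYWORD = {
    "IF": "IfStatement",
    "DISPLAY": "DisplayStatement",
    "ELSE": "ElseCondition",
    "END-IF": "IfStatementEnd",
}


def to_code_elements(lines):
    """Return one CodeElement per non-empty line; unknown keywords raise KeyError."""
    elements = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        keyword = text.split()[0].upper()
        elements.append(CodeElement(_TYPES_BY_KEYWORD[keyword], text))
    return elements


SOURCE = [
    "IF var1 = 'C' THEN",
    '    DISPLAY "OK"',
    "ELSE",
    '    DISPLAY "KO"',
    "END-IF",
]
for element in to_code_elements(SOURCE):
    print(element.type, "|", element.text)
```

The result is a plain list: nothing in it records that the first DISPLAY belongs to the IF branch. That nesting information only appears once Nodes are built.
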

Example:

    IF var1 = 'C' THEN
        DISPLAY "OK"
    ELSE
        DISPLAY "KO"
    END-IF

Using the same example as above, the whole code is turned into a Node. The If node produced here has 3 children: a Then node, an Else node and an End node. The Then and Else nodes each contain one child, which is their respective display statement. The condition is located on the root If node.

Nodes are the final structured representation of a Cobol program. In fact, the whole source file is represented as a single Node object: this root node is the AST (Abstract Syntax Tree) of the supplied source code.

Nodes are produced during the SemanticCheck step.

See classes: TypeCobol.Compiler.Nodes.Node and its derived classes, TypeCobol.Compiler.CupParser.NodeBuilder.ProgramClassBuilder, TypeCobol.Compiler.Parser.ProgramClassParserStep
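
The tree shape described above can be reproduced with a short Python sketch. It is a simplified model, not the TypeCobol builder: the `Node` class, the `(kind, text)` input pairs and `build_if_tree` are hypothetical, and only a single, non-nested IF is handled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    condition: str = ""  # only used on the root If node
    children: list = field(default_factory=list)


def build_if_tree(code_elements):
    """Turn a flat sequence of (kind, text) pairs for one IF ... END-IF into a tree."""
    root = None
    branch = None  # the Then or Else node currently receiving statements
    for kind, text in code_elements:
        if kind == "if":
            root = Node("If", condition=text)
            branch = Node("Then")
            root.children.append(branch)
        elif root is None:
            raise ValueError("the sequence must start with an 'if' element")
        elif kind == "else":
            branch = Node("Else")
            root.children.append(branch)
        elif kind == "end-if":
            root.children.append(Node("End"))
            branch = None
        elif branch is None:
            raise ValueError("statement found after the end of the IF")
        else:
            branch.children.append(Node(text))
    return root


def render(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


FLAT = [
    ("if", "var1 = 'C'"),
    ("display", 'DISPLAY "OK"'),
    ("else", "ELSE"),
    ("display", 'DISPLAY "KO"'),
    ("end-if", "END-IF"),
]
tree = build_if_tree(FLAT)
print("\n".join(render(tree)))
```

The input is the same kind of flat sequence that code elements form, and the output has the If node at the root with Then, Else and End as its three children, which is the structure the text describes.
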