HeurLang (Language for Defining Heuristics of the Initial Box Model) - mn-mikke/Model-driven-Pretty-Printer-for-Xtext-Framework GitHub Wiki

Although, the formatting configuration of generic pretty printer expressed through [ModelLang](https://github.com/mn-mikke/Model-driven-Pretty-Printer-for-Xtext-Framework/wiki/ModelLang-(Language-for-Defining-Box-Models\)) makes work easier but still the user has to create a formatting configuration from scratch. Thus this language is able to define some heuristic rules serving to determine how an initial box model for a particular language described by Xtext's grammar look like. In other words, this language introduces new formatting rules that are independent on a grammar and serve to as an input for a transformation into formatting rules of the [ModelLang](https://github.com/mn-mikke/Model-driven-Pretty-Printer-for-Xtext-Framework/wiki/ModelLang-(Language-for-Defining-Box-Models\)) language.

Design

The following text describes what could look like a language defining formatting rules that are independent on a grammar of a language. Firstly, it should be selected an extension for files containing a code of this language. The extension should express the essence of this language like the extensions for two previous languages. This language essentially allows for defining Pretty-Printer's Default formatting configuration, so the extension .ppd will be a good choice. Further, because this language will be able to define some formatting rules, it will have to use the operators defined by utilizing the [MetaModLang](https://github.com/mn-mikke/Model-driven-Pretty-Printer-for-Xtext-Framework/wiki/MetaModLang-(Language-for-Defining-Box--Meta-models\)) language. Since it would be beneficial to use constants defined by that language and the grammar of that language contains grammar rules allowing for defining usages of these constants, the grammar of this language should be inherited from the grammar of the [MetaModLang](https://github.com/mn-mikke/Model-driven-Pretty-Printer-for-Xtext-Framework/wiki/MetaModLang-(Language-for-Defining-Box--Meta-models\)) language. Moreover, it would be good to reference a file containing a box meta-model via URI imports in order to be consistent. The heuristic rules will be transformed into formatting rules referencing terminal rules defining terminals and parser rules defining nonterminals so the heuristic rules should be separated into a group dedicated for terminals and a group dedicated for nonterminals in order to make a specification more comprehensible.

Heuristic Rules for Nonterminals

Although, this language will be able to use operators of a referenced box meta-model, any defining elements of any grammar rule will not be available. Now has to be decided how to specify places inside definitions of grammar rules that that should be enclosed by an usage of an operator. The one solution could be to define an usage of an operator and specify where this usage will be applied. The specification of locations can be realized only by some pattern determining whether the usage can be placed at a given locations or not similarly like regular expressions determine whether a piece of text is suitable or not.

Now it is necessary to design what these patterns should look like and how should work its evaluation. Firstly, it is good to realize that pieces forming these patterns has to reflect defining elements and also a code described by a grammar rule in order to the patterns be comprehensible. The keyword is only defining element occurring in a specification of a grammar rule whilst also in a code parsed by the grammar rule. Thus the patterns will contain keywords. Rule calls, cross references and composite defining elements do not directly reflects a code parsed by the grammar rule, so they will be represented in patterns by numbers and intervals expressing the multiplicity of their occurrence in the node of a tree structure defining a grammar rule.

Listing 1

A possible section of rules for nonterminals in a file defining a heuristic for generating an initial box model. The first rule has a special significance because its usage of an operator serving to enclose the root of a formatting rule. If the question mark is present, the usage could be rewritten by another one which is defined by another rule. The second rule encloses a node of the tree structure by its usage of operator when the first and the last child of the node are parenthesis. The third rule encloses a node by its usages of operators if the node has tree children and the second is an equal character. The last rule encloses a node by its usage of a operator if the node has at least two children and the last is a colon.

Non-terminals:
    root? : <V>
    ['(',*,')']: <H>
    [1,'=',1]: <I>,<H>
    [1-*,':']: <H>

Some proposal has been typified what the section of heuristic rules dedicated for non-terminals should look like. Now it should be designed how this section be transformed into language-depended formatting rules. The most feasible solution is to transform a model generated from grammar rules of a given language into a box model representing formatting rules without usages of operators firstly. A part of the box model without usages of operators related to nonterminals has a similar structure like an equivalent part of the grammar model, so patterns of heuristic rules expresses the same thing in a box model as well as in a grammar model. Thus it can be searched places inside a box model that subsequently serve for injecting usages of operators into the box model.

The searching places and injecting usages of operator could work simultaneously by utilizing the DFS algorithm with some following modifications. The traversal of tree structure of a formatting rule begins at the root. If the node is an ordered group, the first heuristic rule is taken and it is tested whether some sub-sequence of node's children satisfies its pattern. If such a sequence exists, it is subsequently enclosed by usages of parameters defined in the heuristic rule. It causes a division of ordered group into tree pieces. The first one is formed by a sequence of children that are in front of matched sequence. The second is the matched sequence. And the last represents a sequence of children following after the matched sequence. Now the current node has this tree children. These new tree ordered groups can be reflected into the identifiers of its in order to not change a path which identifiers express. Further, the traversal continues with these new nodes, where the usages of operators are skipped at the second node and the traversal must be careful to not use the same rule on the whole second node. If the matched sequence does not exists, this process is repeated with the next heuristic rule. Further, if there is no heuristic rule left, the traversal continues with node's children with all heuristic rules. if the node is another type of composite element, it can be matched sequences with only one child because the order of its children does not reflect a parsed code. When the node is a leaf or the algorithm has already visit all node's children, the matching can not take place and the algorithm backtracks.

Listing 2

The listing describes two grammar rules of a language allowing for defining some objects and its attributes. The object references some box in which is contained. There are also formatting rules generated from these grammar rules by utilizing heuristic rules defined in the Listig 1.

// Grammar rules
Object:
    'obj' name=ID '(' box=[Box] ')' ':'
	attributes+=Attribute*
   ';'
;

Attribute:
    name=ID '=' value=Value
;

// Generated formatting rules
PBOX[Object]:
    <V>[
        <H>['obj' name:ID <H>['(' box:[Box] ')'] ':']
        attributes:Attribute
        ';'
    ]
;

PBOX[Attribute]:
    <I>[<H>[name:ID '=' value:Value]]
;

Heuristic Rules for Terminals

However, nonterminal heuristic rules have to be independent on grammar rules because parser rules are unknown until an user defines them, a situation with terminals is inverse. In most cases a language developer uses default terminal rules so heuristic rules should defines default usages of operators for these terminals. Of course, a developer can add new terminals or use only terminals with only different names but there is the default formatting rule in each definition of formatting rules. The transformation from heuristic rules into formatting rules must only test whether a terminal defined in a heuristic rule exists in a grammar. If the terminal does not exist, the formatting rule is not generated.

Listing 3

A possible section of rules for terminals in a file defining a heuristic for generating an initial box model. The first three heuristic rules define usages of operators for formatting rules that will reference terminals defined before colons. The fourth rule defines an usage of an operator for a formatting rule of the keyword whose regular expression is enclosed in square brackets. The last rule defines an usage of an operator for a default formatting rule.

Terminals: 
    INT: <F w=bold>
    ML_COMMENT: <F i=italic, c="#00FF00">, <MC>
    SL_COMMENT: <F i=italic, c="#00FF00">, <SC>
    Keyword["^\\\\w.*$"]: <F c="#FF0000">
    default: <F>

Complex Listing

This listing depicts what a file containing code written in HeurLang may look like. Some heuristic rules dedicated for generating an initial box model for the grammar from the listing on this page.

operators "platform:/resource/gpp/settings/operators.ppo"

Terminals: 
    INT: <F c="#7F7F7F">
    ML_COMMENT: <F i=italic, c="#00FF00">,<MC>
    SL_COMMENT: <F i=italic, c="#00FF00">,<SC>
    Keyword["^\\\\w.*$"]: <F w=bold,c="#7F0055">
    default: <F>

Non-terminals:
    root? : <V>
    ['package'|'import',1-*] : <H>
    [1-*,')','{'] : <H>
    ['{',*,'}'] : <V>
    [*,'class',*] : <H>
    [',',1] : <H>

Realization

The steps to implement the language designed in the previous section are straight-forward or similar to realization steps described in the previous two chapters but there are some questions concerning what implementation resources should be used for the realization.

Model Traversal

When it is needed to transform a model into an another model, the model has to be somehow traversed. The traversal is formed from many steps where the step represents to perform certain actions and choose a next element of model. It requires to know a type of a currently visited element and find actions competent for the type. The action searching can be realized in Java only by if statements or a switch statement deciding by element's type thus requiring to write a lot of code. The facilitation could be to write the model traversal in the Xtend2 language. This language contains a concept called dispatch methods. These methods behave polymorphically based on types of method's parameters. In other words, When the method has more overloads and it is called with a parameter that have got a specific type and it is somehow retyped, it is chosen a variant of the method with a specific type. Moreover, the types of parameters of method's overloads do not have to be in the inheritance hierarchy. The Xtend2 language can easily call a Java code because a code written in Xtend2 is transformed into Java.

Listing 4

A comparison of behavior of overloaded Java methods and dispatch methods of the Xtend2 language.

class A {}
class B extends A {}

// Java code
class Main
{
    public static void main(String[] args)
    {
        Main main = new Main();
        B obj = new B();
        main.foo((A)obj);
    }
	
    void foo(A par)
    {
        System.out.println("A");
    }
	
    void foo(B par)
    {
        System.out.println("B");
    }
}

// Result: A

// Xtend 2 code
class Main
{
    def static void main(String[] args)
    {
        Main main = new Main();
        B obj = new B();
        main.foo((A)obj);
    }
	
    def dispatch void foo(A par)
    {
        System.out.println("A");
    }
	
    def dispatch void foo(B par)
    {
        System.out.println("B");
    }
}

// Result: B

Model Transformation into a Text Representation

A tool for traversing models and translate them into another models were introduced. It can be obtained an initial box model from heuristic and grammar rules by this way. Because of a language developer be allowed to change the box model, it has to be transformed into a corresponding domain-specific language. Standardly, the XPand language and the Acceleo language serve for this purpose. The languages allow for defining templates representing the mapping of a certain model into a certain language. An another option is to use the Xtend2 language. that contains a similar concept of templates. Moreover, the language is able to work as a classical imperative and object-oriented language. Since the Xtend2 language. language is used for traversing models and these two parts of realization are related between themselves, it should be the Xtend2 language used again.

Listing 5

An listing of a method that works as a template. The method transforms imports of a box model into their text declarations. The templates are indicated in the Xtend2 language by the sequence of three apostrophes and the template can reference an outside variables by a code enclosed in "«" and "»".

def placeImports(BoxModel boxModel)'''
     «FOR imp:boxModel.operatorsSection.imports»
         import "«imp.importURI»"
     «ENDFOR»
'''
⚠️ **GitHub.com Fallback** ⚠️