What: Annotations are consolidated into a single node. SingleMemberAnnotation, NormalAnnotation and MarkerAnnotation are removed in favour of Annotation. The Name node is removed, replaced by a ClassOrInterfaceType.
Why: Those different node types implement a syntax-only distinction, whose only effect is to give semantically equivalent annotations several possible representations. For example, @A and @A() are semantically equivalent, yet they were parsed as MarkerAnnotation and NormalAnnotation respectively. Similarly, @A("") and @A(value="") were parsed as SingleMemberAnnotation and NormalAnnotation respectively. Consolidating the nodes also makes parsing much simpler. The nested ClassOrInterfaceType is used to share the disambiguation logic.
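The equivalence described above can be illustrated with a small, self-contained sketch. SketchAnnotation and its factory methods are hypothetical names made up for this example, not PMD classes. The sketch only shows why a single representation is enough: the marker and single-member forms are shorthands for the normal form.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical model of a consolidated annotation node; not PMD's real API.
final class SketchAnnotation {
    private final String typeName;
    private final Map<String, String> members;

    private SketchAnnotation(String typeName, Map<String, String> members) {
        this.typeName = typeName;
        this.members = Map.copyOf(members);
    }

    // @A(name = expr, ...) and also @A()
    static SketchAnnotation normal(String typeName, Map<String, String> members) {
        return new SketchAnnotation(typeName, members);
    }

    // @A is shorthand for @A()
    static SketchAnnotation marker(String typeName) {
        return normal(typeName, Map.of());
    }

    // @A(expr) is shorthand for @A(value = expr)
    static SketchAnnotation singleMember(String typeName, String valueExpr) {
        return normal(typeName, Map.of("value", valueExpr));
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof SketchAnnotation)) {
            return false;
        }
        SketchAnnotation that = (SketchAnnotation) other;
        return typeName.equals(that.typeName) && members.equals(that.members);
    }

    @Override
    public int hashCode() {
        return Objects.hash(typeName, members);
    }
}
```

With one representation, a rule that looks for @A(value="") also matches @A("") without any special-casing.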
What: Annotations are now nested within the node to which they are applied. E.g. if a method is annotated, the Annotation node is now a child of a ModifierList, inside the MethodDeclaration.
Why: Fixes a lot of inconsistencies: sometimes the annotations were inside the node, and sometimes just somewhere in the parent, with no real structure.
Annotations are no longer placed at arbitrary positions in the enum body.
Types
Type and ReferenceType
What: Those two nodes are turned into interfaces, implemented by concrete syntax nodes. See their javadoc for exactly which nodes implement them.
Why:
Some syntactic contexts only allow reference types, others allow any kind of type. If you want to match all types of a program, then matching Type would be the intuitive solution. But in 6.0.x, it wouldn't have sufficed, since in some contexts, no Type node was pushed, only a ReferenceType.
Regardless of the original syntactic context, any reference type is a type, and searching for ASTType should yield all the types in the tree.
Using interfaces allows abstracting behaviour and makes for a nicer and safer API.
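A minimal sketch of the idea, using hypothetical stand-in types (SketchType, SketchReferenceType and the two concrete classes are invented for this example and are not PMD's node classes). It shows that when the reference-type interface extends the type interface, searching for the general interface finds every type, whatever context it was parsed in.

```java
import java.util.List;

// Hypothetical stand-ins for illustration; these are not PMD's node classes.
interface SketchType {
    String text();
}

// Every reference type is also a type.
interface SketchReferenceType extends SketchType {
}

final class SketchPrimitiveType implements SketchType {
    private final String text;

    SketchPrimitiveType(String text) {
        this.text = text;
    }

    @Override
    public String text() {
        return text;
    }
}

final class SketchClassType implements SketchReferenceType {
    private final String text;

    SketchClassType(String text) {
        this.text = text;
    }

    @Override
    public String text() {
        return text;
    }
}

final class TypeSketch {
    private TypeSketch() {
    }

    // Matches every type node, reference or primitive.
    static long countTypes(List<?> nodes) {
        return nodes.stream().filter(SketchType.class::isInstance).count();
    }

    // Matches only the nodes that are reference types.
    static long countReferenceTypes(List<?> nodes) {
        return nodes.stream().filter(SketchReferenceType.class::isInstance).count();
    }
}
```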
Code
Old AST
New AST
// in the context of a variable declaration
List<String> strs;
TypeArgument is removed. Instead, the TypeArguments node directly contains a sequence of Type nodes. To support this, the new node type WildcardType captures the syntax previously parsed as a TypeArgument.
The WildcardBounds node is removed. Instead, the bound is a direct child of the WildcardType.
Why: Wildcard types are types in their own right, and having a node to represent them removes several levels of nesting.
What: Remove the Name node in imports and package declaration nodes.
Why: Name is a TypeNode, but it's equivalent to AmbiguousName in that it describes nothing about what it represents. The name in an import may represent a method name, a type name, a field name... It's too ambiguous to resolve in the parser, and can simply be the image of the import, package, or module declaration.
ImportDeclaration
  + Name "java.util.ArrayList"
ImportDeclaration[@Static=true()]
  + Name "java.util.Comparator.reverseOrder"
ImportDeclaration[@ImportOnDemand=true()]
  + Name "java.util"
What: AccessNode is now based on a node: ModifierList. That node represents modifiers occurring before
a declaration. It provides a flexible API to query modifiers, both explicit and implicit. All declaration
nodes now have such a modifier list, even if it's implicit (no explicit modifiers).
Why: AccessNode gave a lot of irrelevant methods to its subtypes. E.g. ASTFieldDeclaration::isSynchronized
makes no sense. Now, these irrelevant methods don't clutter the API. The API of ModifierList is both more
general and more flexible.
Old AST:
TypeDeclaration
  + Annotation
    + MarkerAnnotation
      + Name "A"
  + ClassOrInterfaceDeclaration[@Public=true()]
    + ClassOrInterfaceBody
New AST:
TypeDeclaration
  + ClassOrInterfaceDeclaration
    + ModifierList[@Modifiers=("public")]
      + MarkerAnnotation "A"
    + ClassOrInterfaceBody
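The distinction between explicit and implicit modifiers can be sketched as follows. SketchModifier and SketchModifierList are hypothetical names invented for this example, and the method names are assumptions rather than the actual ModifierList API. The sketch only illustrates the idea of a modifier list that can answer both "was this written in the source?" and "does this apply to the declaration?".

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical modifier enum; not PMD's real API.
enum SketchModifier { PUBLIC, PRIVATE, STATIC, FINAL, ABSTRACT }

// Hypothetical model of a modifier list that tracks explicit and implicit modifiers.
final class SketchModifierList {
    private final Set<SketchModifier> explicit = EnumSet.noneOf(SketchModifier.class);
    private final Set<SketchModifier> effective = EnumSet.noneOf(SketchModifier.class);

    SketchModifierList(Set<SketchModifier> explicit, Set<SketchModifier> implicit) {
        this.explicit.addAll(explicit);
        // Effective modifiers are the explicit ones plus those implied by context.
        this.effective.addAll(explicit);
        this.effective.addAll(implicit);
    }

    // Modifiers written in the source code.
    Set<SketchModifier> getExplicitModifiers() {
        return Collections.unmodifiableSet(explicit);
    }

    // Modifiers that apply to the declaration, written or not.
    Set<SketchModifier> getEffectiveModifiers() {
        return Collections.unmodifiableSet(effective);
    }

    boolean has(SketchModifier modifier) {
        return effective.contains(modifier);
    }

    boolean hasExplicitly(SketchModifier modifier) {
        return explicit.contains(modifier);
    }
}
```

For instance, a field declared as `int X = 1;` inside an interface has no explicit modifiers, but is implicitly public, static and final. In this sketch that is modelled by passing an empty explicit set and those three modifiers as the implicit set.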
Flattened body declarations
What: Removes ClassOrInterfaceBodyDeclaration, TypeDeclaration, and AnnotationTypeMemberDeclaration.
These were unnecessary since annotations are nested (see Annotation nesting above).
Why: This flattens the tree, makes it less verbose and simpler.
What: Removes the generic Name node and uses ClassOrInterfaceType instead where appropriate. Also
uses specific node types for different directives (requires, exports, uses, provides).
What: Simplify and align the grammar used for method and constructor declarations. The methods in an annotation
type are now also method declarations.
Why: The method declaration had a nested node "MethodDeclarator", which was not available for constructor
declarations. This made it difficult to write rules that concern both methods and constructors without
explicitly differentiating between the two.
What: A separate node type ReceiverParameter is introduced to differentiate it from formal parameters.
Why: A receiver parameter is not a formal parameter, even though it looks like one: it doesn't declare a variable,
and doesn't affect the arity of the method or constructor. It's so rarely used that giving it its own node avoids
matching it by mistake and simplifies the API and grammar of the ubiquitous FormalParameter and VariableDeclaratorId.
What: The AST representation of a try-with-resources statement has been simplified.
It now uses LocalVariableDeclaration, unless the concise try-with-resources form is used.
Why: Simpler integration of try-with-resources into the symbol table and type resolution.
What: Merge AST nodes for postfix and prefix expressions into the single UnaryExpression node. The merged nodes are:
PreIncrementExpression
PreDecrementExpression
UnaryExpression
UnaryExpressionNotPlusMinus
Why: Those nodes were asymmetric, and inconsistently nested within UnaryExpression. By definition they're all unary, so using a single node is appropriate.
Old AST:
UnaryExpression[@Image=null]
  + UnaryExpressionNotPlusMinus[@Image="~"]
    + PrimaryExpression
      + PrimaryPrefix
        + Name "a"
UnaryExpression[@Image="+"]
  + PrimaryExpression
    + PrimaryPrefix
      + Name "a"
New AST:
UnaryExpression[@Operator="~"]
  + VariableAccess "a"
UnaryExpression[@Operator="+"]
  + VariableAccess "a"
Binary operators are left-recursive
What: For each operator, there were separate AST nodes (like AdditiveExpression, AndExpression, ...).
These are now unified into an InfixExpression, which gives access to the operator via getOperator()
and to the operands (getLhs(), getRhs()). Additionally, the resulting AST is not flat anymore,
but a more structured tree.
Why: Having different AST node types doesn't add any information that the operator doesn't already provide.
Because expressions are now parsed left-recursively, the new structure makes the AST more JLS-like.
This makes things easier for the type mapping algorithms. It also preserves the information about which
operands are used with which operator. This information was lost in PMD 6 when more than two operands were
used and the tree was flattened.
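The left-recursive shape can be sketched with a small stand-alone model. SketchExpr, SketchVariable, SketchInfix and InfixSketch are hypothetical names invented for this example. The accessor names getOperator(), getLhs() and getRhs() are borrowed from the description above, but the classes are not PMD's. The sketch assumes all operators have the same precedence and are left-associative, and folds a flat operand list into nested binary nodes.

```java
import java.util.List;

// Hypothetical expression model; not PMD's node classes.
interface SketchExpr {
    String render();
}

final class SketchVariable implements SketchExpr {
    private final String name;

    SketchVariable(String name) {
        this.name = name;
    }

    @Override
    public String render() {
        return name;
    }
}

final class SketchInfix implements SketchExpr {
    private final SketchExpr lhs;
    private final String operator;
    private final SketchExpr rhs;

    SketchInfix(SketchExpr lhs, String operator, SketchExpr rhs) {
        this.lhs = lhs;
        this.operator = operator;
        this.rhs = rhs;
    }

    String getOperator() {
        return operator;
    }

    SketchExpr getLhs() {
        return lhs;
    }

    SketchExpr getRhs() {
        return rhs;
    }

    @Override
    public String render() {
        return "(" + lhs.render() + " " + operator + " " + rhs.render() + ")";
    }
}

final class InfixSketch {
    private InfixSketch() {
    }

    // Folds "a op1 b op2 c ..." into ((a op1 b) op2 c), assuming equal precedence.
    static SketchExpr foldLeft(List<String> operands, List<String> operators) {
        if (operands.isEmpty() || operators.size() != operands.size() - 1) {
            throw new IllegalArgumentException("need n operands and n - 1 operators");
        }
        SketchExpr result = new SketchVariable(operands.get(0));
        for (int i = 0; i < operators.size(); i++) {
            // The tree built so far becomes the left operand of the next operator.
            SketchExpr right = new SketchVariable(operands.get(i + 1));
            result = new SketchInfix(result, operators.get(i), right);
        }
        return result;
    }
}
```

Under these assumptions, the operands "a", "b", "c" with operators "+" and "-" produce a top-level node for "-" whose left operand is the node for "a + b", so each operator keeps exactly the two operands it applies to.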