Property framework 7 0 0 - pmd/pmd GitHub Wiki

⚠️ This page is outdated. Please refer to PMD 7 Tracking Issue #3898 for the current status.


Properties framework 7.0.0 changes

7.0.0, with Java 8 and a new ruleset schema, give us an opportunity to largely improve the properties framework. See the advantages.

TOC:

Value validation

Lambdas give us an opportunity to add validation capabilities to the property descriptors very cleanly. Consider the following interface:

public interface PropertyConstraint<T> extends Predicate<T> {
  
  /**
  * Returns a diagnostic message if the value
  * has a problem. Otherwise returns an empty
  * optional.
  */
  default Optional<String> validate(T value) {
    return pred.test(value) ? Optional.empty() : Optional.of("Constraint violated on value '" + value + "' (" + getConstraintDescription() + ")")
  }
  
  /**
  * Returns a description of the constraint
  * imposed by this validator on the values.
  * E.g. "Should be positive", or "Should be one of A | B | C."
  *
  * <p>This is used to generate documentation.
  *
  * @return A description of the constraint
  */
  String getConstraintDescription();
}

Add add a method on property builders that can add a constraint of the correct type. Then:

  • We get a powerful way to add validation to properties
  • It's uniformly documentable
  • We can drastically reduce the width and height of the property class hierarchy:
    • The following interfaces are not needed anymore:
      • NumericPropertyDescriptor's validity range can be mapped to a constraint
      • EnumeratedPropertyDescriptor's valid values can be mapped to a constraint
      • PackagedPropertyDescriptor's package restrictions can be mapped to a constraint
    • Removing these interfaces allow us to scrap:
      • The abstract property classes both for single- and multi-value variants
      • The abstract property builder classes both for single- and multi-value variants
      • The separate property builders using PropertyDescriptorField, which were designed to handle additional configuration
    • Mapping these restrictions to constraints make them transparently documentable

To reduce syntactic overhead, we can have some utility classes return common validators, e.g.

public class NumericConstraints { 
 public static <N extends Number> PropertyConstraint<N> positive() {
   return PropertyConstraint.fromPredicate(n -> n.intValue() > 0, "Should be positive");
 }
} 

New property syntax in XML

See Schema changes

Consequences on the framework's architecture

  • No need to care about delimiter logic
    • the divide between multi-/single-value properties can be eliminated, dividing by two the number of classes needed to implement the framework
    • The isMultiValue method of PropertyDescriptor can be removed
  • Each XML syntax needs to be reified, so that PropertyDescriptors can declare what syntaxes they support. I imagine these parsers could have the following interface
abstract class PropertyXmlSyntax<E> {
  E parse(Element elt);
  Element serialize(E val);
  
  // turns this parser into a parser of a collection of elements
  // i.e. use the parser for the `<value>` elt to parser the `<item>`s of a `<seq>`
  // This method would be implemented
  
  // We can use that to parse properties having as value anything derived from Collection (ie. List, Set, etc),
  // of any component type provided we already have a parser for it, (this object), and a supplier for actual collection type.
  final <T extends Collection<? extends E>> PropertyXmlSyntax<T> toSeq(collectionMaker: Supplier<T>) {
    // not abstract, returns a new PropertySeqSyntax
  }
  
  // plus some methods for documentation
} 

Deriving subclasses would only be possible within PMD, so that available syntaxes are in a closed set, managed by us.

NOTE: this assumes RulesetFactory uses a DOM parser!! I'm not sure we can modularise this with a SAX parser...

PropertyDescriptor is enriched with the following method:

interface PropertyDescriptor<T> {
  PropertyXmlSyntax<T> getXmlSyntaxInterface();
}
  • This method replaces String asDelimitedString(T value) and T valueFrom(String propertyString), which are not general enough anymore

  • One can define a property for literally any type they want, provided they have a PropertyXmlSyntax for it.

    • The <value> can accomodate anything, but collections properties may want to use the <seq> syntax for better readability. Similarly, maps may want to use an alternate map syntax if we introduce one, etc.
    • Since this is true, the type() method of PropertyDescriptor should be dropped. It's used to document the way the value is supposed to be provided, but since generic type arguments are erased, it won't be of use. Rather, the things to document would be the PropertyXmlSyntaxes.

Property creation

  • Since we can remove so many redundant concrete classes with the changes described above, their usages should be replaced by usage of the PropertyDescriptor interface:
    • e.g. IntegerProperty -> PropertyDescriptor<Integer> in user code
  • Shorthand for commonly used properties could be defined on a PropertyFactoryClass. E.g.
    • new IntegerProperty(...) or IntegerProperty.named(..) is replaced by PropertyFactory.intProperty(...), which returns a builder.
    • The interface of the builders is uniform, whatever the type of property. They are practically identical to the current interface, except the builders' build method's return type is PropertyDescriptor
  • I see no reason to forbid creating completely new property types, given a suitable PropertyXmlSyntax. This is not a priority and could be left for future work.

Blank properties

The change revolves around two points:

  • Allowing a property to have no default value. Such a property must be given a value by the user in the XML, otherwise the rule cannot run since it's not fully configured. We call these properties "blank properties", since they have no default value.
    • This needs a new method isBlank on the PropertyDescriptor interface. If it returns true, then defaultValue's result can be given no meaningful interpretation
    • Builders may be enriched with a new method, and users would only be allowed to either specify isBlank or a defaultValue
  • Reporting blank properties that were not overridden. This can be done easily using Rule.dysfunctionReason, ensuring the rule is not run and a configuration error is reported

Use case

  • Rules with blank properties allow us to define families of rules that do basically the same thing, but with so wildly different parameters that we can't find a sensible default value. This allows us to ship more generic rules, which are more reusable than several specific rules. An example for that is the class hierarchy naming conventions described in #973:

Atm, some rules in the code style category try to enforce a naming convention (eg a suffix) that should be respected by all the subtypes of a type. These are:

  • LocalHomeNamingConvention
  • LocalInterfaceSessionNamingConvention
  • MDBAndSessionBeanNamingConvention
  • RemoteInterfaceNamingConvention
  • RemoteSessionInterfaceNamingConvention
  1. I think they are too specific for PMD
  2. They are all implemented the same way, and it could be nice to abstract that away. PMD itself uses similar conventions of its own, eg the name of all interfaces extending Node should end with "Node"

It would be nice to have one rule to rule them all, which would be configurable.

...

Current rule equality is defined by rule name + language:

https://github.com/pmd/pmd/blob/7a6bc498264ad87f3783e6dfde0b7e4bc9d4bcbf/pmd-core/src/main/java/net/sourceforge/pmd/RuleSet.java#L196

So we may have an "abstract" rule with eg the following XPath, modulo some other configuration properties:

//ClassOrInterfaceDeclaration[ExtendsList/ClassOrInterfaceType[pmd-java:typeIs($superType) and not(matches(@Image, $regexConvention))]

And a user could "instantiate" it several times in the same ruleset by referencing it, overriding the name and setting the properties. The only loose point here is that the properties can have no sensible default value, so the best behaviour would be to fail if they're not set IMO. This feature is lacking from the properties framework for now.

Deprecation support

We currently deprecate properties by prefixing their name with "deprecated!", which is not cool and error prone. We could add an isDeprecated method to PropertyDescriptor instead.

We could also allow having name aliases to properties, to greatly reduce the cost of property renaming.

All-in-all

  • The currently ~60 classes framework can be implemented with just ~10 classes
  • Any property type can be supported provided there is an XML syntax for it
  • New XML syntaxes can be added without touching anything else, should the need arise
  • Arbitrary constraints on the values of properties can be defined
    • They're fully and uniformly documentable (for now they're not documented at all)
  • Blank properties allow more abstraction in the rules we ship, allowing us to ship and maintain less rules
⚠️ **GitHub.com Fallback** ⚠️