Taken Ideas Archive - pmd/pmd GitHub Wiki

This document provides brief descriptions of candidate projects for undergrad students through programs such as Google Summer of Code, or ITBA's final project.

The aim is to clearly define each project's objective, the tasks identified, and the expected output.

Table of Contents:

  1. Full Antlr Support
  2. Auto fixable issues
  3. Complete type resolution for Java
  4. Metrics Framework
  5. Automated regression tests against real-world projects
  6. Support Java 10
  7. UI - New improved Designer

Full Antlr Support

Rationale

PMD currently supports languages with grammars built on JavaCC, by providing base AST node classes, and bridging the XPath capabilities of Saxon to the AST.

The purpose of this project is to provide equivalent support for grammars built with Antlr 4. We expect developers to be able to write rules either as AST visitors or as XPath expressions, with minimal differences from what they currently do for JavaCC. This would allow PMD to fully support languages that are currently supported only by CPD.

Impact

This opens the door for PMD to fully support dozens of new languages. That would go a long way toward growing our community and providing a homogeneous toolset for full-stack developers.

Expected deliverables

PMD supports one additional language using an Antlr-based grammar, with a few rules as a proof of concept.

At least 1 rule is written using an XPath expression, and at least 1 rule is written as an AST visitor in Java.

Swift would be a good candidate, but others would work too.
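
As a rough illustration of what "the same rule, written two ways" means, here is a minimal sketch that uses a plain DOM document as a stand-in for an AST. It does not use PMD's node classes or its Saxon bridge. The node names (`File`, `Function`), the `paramCount` attribute, and the "more than 5 parameters" rule are all assumptions made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Toy stand-in for an AST; PMD's own node classes and XPath bridge differ. */
final class ToyAstRules {

    /** Builds File -> Function(loadUser, 2 params), Function(process, 7 params). */
    static Document sampleAst() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element file = doc.createElement("File");
            doc.appendChild(file);
            file.appendChild(function(doc, "loadUser", 2));
            file.appendChild(function(doc, "process", 7));
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    private static Element function(Document doc, String name, int paramCount) {
        Element fn = doc.createElement("Function");
        fn.setAttribute("name", name);
        fn.setAttribute("paramCount", Integer.toString(paramCount));
        return fn;
    }

    /** The rule expressed declaratively as an XPath query over the tree. */
    static int violationsByXPath(Document ast) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//Function[@paramCount > 5]", ast, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** The same rule expressed as a recursive tree walk, as a visitor would do. */
    static int violationsByTraversal(Node node) {
        int count = 0;
        if (node instanceof Element && "Function".equals(node.getNodeName())
                && Integer.parseInt(((Element) node).getAttribute("paramCount")) > 5) {
            count++;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += violationsByTraversal(child);
        }
        return count;
    }
}
```

Both methods flag only the `process` function on the sample tree. The point of the project is that rule authors should get both styles for Antlr grammars without having to care which parser generator produced the tree.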

Desirable skills

Intermediate Java, Understanding compiler construction

What the student will learn

Better understanding of AST construction. Deep knowledge of XPath. Understanding the building blocks of programming languages and their caveats. Static code analysis basics.

Proposed mentors

Status

In development by ITBA students

Auto fixable issues

Rationale

PMD is currently able to traverse the AST to detect issues. Upon doing so, it knows which tokens are involved in the violation, along with the file, lines and columns involved.

The goal of this project is to model fixes for violations, in such a way that rules not only detect violations but can optionally provide fixes for them.

Fixes would also need to be implemented for a number of existing rules. Some could be taken from the current plugins that integrate PMD into IDEs, since several of them already provide fixes of their own.

These fixes could be exposed in a number of ways:

  1. Through an API, so that IDE integrations can use them.
  2. Through reports, so that tools such as Arcanist can use them.
  3. Directly applied onto the files, when PMD is run with a proper flag.
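
One possible way to model a fix is as a list of text replacements over character offsets, which suits all three channels above: an API can hand the edits to an IDE, a report can serialize them, and a command-line flag can apply them in place. The sketch below is only an assumption about what such a model could look like. `TextEdit` and `FixApplier` are hypothetical names, not PMD classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical text-level fix: replace source[start, end) with replacement. */
final class TextEdit {
    final int start;          // inclusive offset into the source
    final int end;            // exclusive offset into the source
    final String replacement;

    TextEdit(int start, int end, String replacement) {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid range");
        }
        this.start = start;
        this.end = end;
        this.replacement = replacement;
    }
}

final class FixApplier {
    /** Applies non-overlapping edits back to front so earlier offsets stay valid. */
    static String apply(String source, List<TextEdit> edits) {
        List<TextEdit> sorted = new ArrayList<>(edits);
        sorted.sort(Comparator.comparingInt((TextEdit e) -> e.start).reversed());

        StringBuilder out = new StringBuilder(source);
        int previousStart = source.length();
        for (TextEdit edit : sorted) {
            if (edit.end > previousStart) {
                throw new IllegalArgumentException("overlapping or out-of-range edits");
            }
            out.replace(edit.start, edit.end, edit.replacement);
            previousStart = edit.start;
        }
        return out.toString();
    }
}
```

For example, a rule that flags a stray semicolon in `int x = 1;;` could emit `new TextEdit(10, 11, "")`. Rejecting overlapping edits keeps the behavior predictable when several rules propose fixes for the same file.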

Impact

The cost of adoption for new users is greatly lowered by making it possible to quickly fix violations, even on legacy projects. It also reduces the burden of keeping code in shape by automating the process even further.

Expected outcome

PMD is capable of providing fixes for violations through reports, through report listeners, and by applying them directly to the codebase via a new flag (available from both the CLI and Ant). At least a couple of existing rules start providing autofixes (they can be imported from existing tools such as the unofficial Eclipse Plugin). The new flags and APIs are documented. The Eclipse plugin exposes these fixes as quick fixes.

Desirable skills

Intermediate Java

What the student will learn

Deep understanding of the AST representation of source code.

Proposed mentors

Status

In development by ITBA students

Complete type resolution for Java

Rationale

For some time PMD has supported Type Resolution, that is, detecting the type associated with different nodes in the AST. However, this support is only partial for Java.

The goal of this project is to fully implement Type Resolution for all nodes with associated types, and write rules demonstrating such support (for instance, implementing rules equivalent to FindBugs' GC_UNRELATED_TYPES).

There are some type resolution specific rules (see typeresolution ruleset). Merge these back into the standard rules. In general, a Rule should use Type Resolution when it can, and fall back on non-TypeResolution approach otherwise. No need for separate Rules for TypeResolution/non-TypeResolution.

As type resolution depends on a correctly set auxclasspath option, PMD should verify this, failing fast if not set at all, and providing proper warnings on missing classes.

Additionally, it would be interesting to explore changes to how type resolution is performed. Currently a full AST analysis is performed if any rule declares use of type resolution. However, such rules usually care about only a small portion of the AST. Changing this full tree analysis to an on-demand subtree analysis (with dynamic programming) could speed up execution by reducing the analysis scope.
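
The on-demand idea can be sketched as memoized, top-down resolution: a node's type is computed only when a rule asks for it, the computation recurses into just the children it needs, and each result is cached. The example below uses a toy expression language with hypothetical `TypedNode` and `LazyTypeResolver` classes and deliberately simplified typing rules. It does not reflect PMD's actual type resolution code.

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical AST node for a toy expression language. */
final class TypedNode {
    final String label;
    final List<TypedNode> children;

    TypedNode(String label, TypedNode... children) {
        this.label = label;
        this.children = Arrays.asList(children);
    }
}

/** Resolves types lazily and caches each result per node (memoization). */
final class LazyTypeResolver {
    private final Map<TypedNode, String> cache = new IdentityHashMap<>();
    int computations = 0;   // exposed only to make the caching observable

    String typeOf(TypedNode node) {
        String cached = cache.get(node);
        if (cached != null) {
            return cached;
        }
        computations++;
        String type = compute(node);
        cache.put(node, type);
        return type;
    }

    // Toy rules: literals have fixed types; '+' yields String if either side is a String.
    private String compute(TypedNode node) {
        switch (node.label) {
            case "IntLiteral":
                return "int";
            case "StringLiteral":
                return "java.lang.String";
            case "Plus": {
                String left = typeOf(node.children.get(0));
                String right = typeOf(node.children.get(1));
                boolean concat = left.equals("java.lang.String") || right.equals("java.lang.String");
                return concat ? "java.lang.String" : "int";
            }
            default:
                return "unknown";
        }
    }
}
```

Nodes that no rule ever queries are never visited, and a subtree shared by several queries is resolved once. Whether this beats a single eager pass in practice is exactly what the proposed performance analysis would need to establish.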

Ideally, all the features of type resolution are exposed to XPath based rules as well.

Impact

Full Type Resolution greatly empowers PMD to perform far more complex and accurate analysis, while keeping the blazing speed it currently offers. This would make it possible to import rules from other tools, allowing users to reduce their toolsets while maximizing customization and performance.

Expected outcome

All appropriate nodes of the AST can know their produced types. At least one new rule is produced to show this working. A proper performance analysis has been performed to evaluate on-demand type resolution.

Desirable skills

Intermediate Java, Understanding compiler construction

What the student will learn

Detailed knowledge of Java's type system and resolution rules. Dealing with the Java Language Specification. Profiling and performance tuning in Java.

Proposed mentors

Status

Developed as part of GSoC 2017 by Bendegúz Nagy. The scope was downsized as it became apparent the complexity was higher than anticipated, but great improvements were developed and shipped.

Metrics Framework

Rationale

PMD currently has a code size ruleset for Java, and a couple extra rules such as GodClassRule that detect violations based on metrics. However, this is implemented by the rules themselves in isolation.

The goal would be to provide a metrics framework directly in the core that gathers metrics on compilation units and makes them freely accessible to rules. Extra metrics and rules may be added.

The metric gathering should probably work like Data Flow Analysis and type resolution: in a first stage before rules are applied, and only if an active rule needs it.

The metrics are from the book "Object-Oriented Metrics in Practice" (Lanza; Marinescu). There are in total 24 different metrics. As a proof of concept the GodClassRule should be refactored to use the new metrics framework and one more rule should be implemented, e.g. to detect Feature Envy methods.
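
The reuse the framework is after can be pictured as a shared cache keyed by metric and node, so that two rules asking for the same metric on the same class pay for its computation once. This is a minimal sketch under that assumption. `MetricsCache` and its `get` method are hypothetical and are not the API that was eventually shipped.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

/** Hypothetical per-analysis cache of metric results, shared by all rules. */
final class MetricsCache<N> {
    private final Map<String, Map<N, Integer>> results = new HashMap<>();
    int misses = 0;   // counts actual computations, to make reuse observable

    /** Returns the cached value, computing it with the given metric on first use. */
    int get(String metricName, N node, ToIntFunction<N> metric) {
        Map<N, Integer> perMetric = results.computeIfAbsent(metricName, k -> new HashMap<>());
        Integer cached = perMetric.get(node);
        if (cached == null) {
            misses++;
            cached = metric.applyAsInt(node);
            perMetric.put(node, cached);
        }
        return cached;
    }
}
```

A God Class rule and a Feature Envy rule could both call `cache.get(...)` with the same metric name for the same class node, and only the first call would run the metric.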

Impact

A whole family of rules can be implemented, while reusing metrics computation to keep performance.

Expected outcomes

GodClassRule uses the new metrics framework, and other rules that would benefit from the framework have been identified. The metrics framework is documented.

Desirable skills

Intermediate Java

What the student will learn

Better understanding of code metrics available in Object Oriented Programming and their use.

Proposed mentors

Status

Developed as part of GSoC 2017 by Clément Fournier. Fully functional, implemented on Java and Apex, old rules replaced by new ones backed by the metrics framework starting from PMD 6.0.0.

Automated regression tests against real-world projects

See also #360

Rationale

As we work on PMD, we add new rules, update the language grammars to newer versions, and improve existing rules and analysis subsystems (Control Flow, Symbol Table, Type Resolution, etc.). We try to test these changes as extensively as possible, but we cannot anticipate every case, and defects slip into releases. At the very least these are false positives or false negatives, but they can range all the way up to application crashes.

Being able to easily test against standard projects gives us a better insight into how our changes:

  • Affect stability
  • Affect performance
  • Affect analysis accuracy

Impact

More stable releases, higher confidence, and better insight into how much the analysis results improve, gained by looking at report diffs.

Expected outcome

A list of standard projects is defined (Spring Framework, Hibernate, Solr, etc.). A specific version of each is cloned in a Travis build job and analyzed, and the results are compared against a baseline from the last published snapshot.

Failures during analysis fail the build. Analysis diffs are commented back to the Pull Request (using Danger or similar) for the reviewer's information.

Once the PR is merged, a new snapshot is uploaded, and a new baseline is stored.

The analysis is automated via a Travis job that is executed automatically for each push to the repository and for each pull request. It should also be possible to run the same analysis locally, and vendor lock-in to Travis should be avoided.
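
The baseline comparison itself is simple set arithmetic once each violation is reduced to a stable key. A minimal sketch follows, assuming violations are identified by a string such as `file:line:rule`. The key format and the `ReportDiff` class are assumptions for illustration only.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical diff between a baseline report and the report of the current build. */
final class ReportDiff {
    final Set<String> added;     // violations that appear only in the current report
    final Set<String> removed;   // violations that appear only in the baseline

    private ReportDiff(Set<String> added, Set<String> removed) {
        this.added = added;
        this.removed = removed;
    }

    static ReportDiff between(Set<String> baseline, Set<String> current) {
        Set<String> added = new TreeSet<>(current);
        added.removeAll(baseline);
        Set<String> removed = new TreeSet<>(baseline);
        removed.removeAll(current);
        return new ReportDiff(added, removed);
    }
}
```

A pull request comment could then list `added` and `removed` for each standard project. Keys based on line numbers shift whenever the analyzed project changes, which is one reason to pin each project to a specific version.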

Desirable skills

Basic Shell Scripting, Basic Ruby

What the student will learn

Build automation, programming skills.

Proposed mentors

Status

Developed as part of GSoC 2018 by Binguo Bao.

Support Java 10

Implement Java 10's Local-Variable Type Inference

Rationale

PMD's best-supported language is Java. To keep it that way, we want to support the upcoming Java 10 feature of Local-Variable Type Inference, which will ship with Java 10 (general availability is March 2018). This project will also improve type resolution, since until now the type of a variable has been taken directly from its declaration.

The local-variable inference looks like this:

var list = new ArrayList<String>();  // infers ArrayList<String>
var stream = list.stream();          // infers Stream<String>
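
Beyond plain local declarations, `var` is also permitted for the variable of an enhanced `for` loop and for the index of a traditional `for` loop, so the parser and type resolver need to cover those positions too. The class and method names below are made up for illustration, and the block requires Java 10 or later.

```java
import java.util.ArrayList;
import java.util.List;

final class VarExamples {

    static int totalLength(List<String> words) {
        var total = 0;                      // inferred as int
        for (var word : words) {            // inferred as String
            total += word.length();
        }
        return total;
    }

    static List<String> firstN(List<String> words, int n) {
        var result = new ArrayList<String>();               // inferred as ArrayList<String>
        for (var i = 0; i < n && i < words.size(); i++) {   // inferred as int
            result.add(words.get(i));
        }
        return result;
    }

    // Not allowed, and should be rejected by a conforming parser:
    //   var x;          (no initializer)
    //   var y = null;   (no type to infer)
    //   fields and method parameters declared with var
}
```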

Impact

Supporting this new language feature and keeping up to date with Java development helps PMD stay relevant.

Expected Outcome

PMD is able to parse new Java 10 source code. Correct type resolution of such local variables is demonstrated in unit and integration tests. Existing rules have been analyzed and, where impacted, adjusted.

Desirable Skills

Java, Understanding compiler construction

What the student will learn

Using a fresh Java 10 feature, Dealing with Java Language Specification

Proposed Mentors

Stretch goal(s)

  • Reviewing the current grammar and comparing it with the Java Language Specification. The AST structure generated by our grammar currently has some differences, which we should fix step by step.
  • Have a look at Java 11. It might include JEP 323 which combines the new local variable syntax with lambdas.

References

Status

Implemented.

UI - New improved Designer

Rationale

The Designer is a UI developer tool that helps in writing new PMD rules or examining existing ones. It supports all the languages PMD supports and displays the source as an abstract syntax tree (AST). You can write XPath expressions that are executed directly against the AST in order to develop new XPath-based rules.

With PMD 6, we now have a JavaFX-based Designer, which provides a modern base for further enhancements. Issue #714 already lists a couple of improvements.

Impact

The Designer is essential in gaining new contributions and rules. It is an important tool throughout the process of developing new rules and fixing bugs in existing ones. Since the Designer is language agnostic, it is the one tool that can be used for all languages PMD supports.

Expected Outcome

An easy-to-use Designer GUI application. Updated documentation on how to use it, such as an example guide to creating a new rule with the help of the Designer.

Desirable Skills

Java, JavaFX, XPath, Interest in getting to know additional languages (Apex, PLSQL, JavaScript, ...)

What the student will learn

JavaFX GUI design, Integration of an existing Java tool (in this case PMD) via its API

Proposed Mentors

Status

Implemented.
