Cache enabled analysis - SonarSource/sonar-java GitHub Wiki

Cache enabled analysis

Introduction

SonarQube 9.4 introduces a cache API that lets analyzers store and read information between analyses. The sonar-java analyzer wraps this API so that internal checks and custom plugins can use it through an additional integration point.

Adding cache support to my check

The plugin API adds an extra hook in the check lifecycle. Where you may have been used to checking a parsed source file in the scanFile method of your JavaFileCheck, the analyzer now allows you to interact with the file before it is parsed in scanWithoutParsing.

Impact on check lifecycle

The scanWithoutParsing method is called if the cache is enabled (i.e., the server supports caching). It is expected to return true if scanning the file without a parsed tree was sufficient for the check to do its job, and false otherwise.

It is called on all checks that implement EndOfAnalysisCheck or are defined outside of the sonar.java.checks package (i.e., custom rules). Let's call that set of checks unskippableChecks.

If all unskippableChecks successfully scan a file without parsing, then the file in question will not be parsed and scanned further. If one of the unskippableChecks cannot successfully scan the file without parsing, then the file will be parsed and scanned by all unskippableChecks.

Depending on the behavior of the checks, a rule writer may find themselves in one of the following situations.

Consider a source set containing a single file, Domain.java, and two checks, FirstCheck and SecondCheck. For the sake of simplicity, let's assume both checks also implement EndOfAnalysisCheck.

On a happy path, all checks successfully scan without parsing and there is no need to (parse and) scan the file.

+-------------------------------------------------------------------+
|             |     scanWithoutParsing   |          scan            |
|             |--------------------------|--------------------------|
|             | FirstCheck | SecondCheck | FirstCheck | SecondCheck |
|-------------|------------|-------------|------------|-------------|
| Domain.java |   success  |   success   |     -      |      -      |
+-------------------------------------------------------------------+

If one of the checks fails to scan without parsing, the file is (parsed and) scanned with its tree by every check, including the checks that already succeeded in scanWithoutParsing. Rule developers must therefore guard against reporting the same issue twice.

+-------------------------------------------------------------------+
|             |     scanWithoutParsing   |          scan            |
|             |--------------------------|--------------------------|
|             | FirstCheck | SecondCheck | FirstCheck | SecondCheck |
|-------------|------------|-------------|------------|-------------|
| Domain.java |    fail    |   success   |    run     |     run     |
+-------------------------------------------------------------------+
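The two scenarios above can be modeled with a small, self-contained sketch. The types below (SketchCheck, RecordingCheck, LifecycleSketch) are hypothetical stand-ins written for illustration, not the sonar-java interfaces, and the orchestration is only a reading of the rules described in this section.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a check; this is not the sonar-java interface. */
interface SketchCheck {
  boolean scanWithoutParsing(String fileName);
  void scanFile(String fileName);
}

/** Records which hooks were invoked so the lifecycle can be observed. */
final class RecordingCheck implements SketchCheck {
  private final boolean succeedsWithoutParsing;
  final List<String> calls = new ArrayList<>();

  RecordingCheck(boolean succeedsWithoutParsing) {
    this.succeedsWithoutParsing = succeedsWithoutParsing;
  }

  @Override
  public boolean scanWithoutParsing(String fileName) {
    calls.add("scanWithoutParsing");
    return succeedsWithoutParsing;
  }

  @Override
  public void scanFile(String fileName) {
    calls.add("scanFile");
  }
}

final class LifecycleSketch {
  /** Returns true if the file had to be parsed and scanned with its tree. */
  static boolean analyze(String fileName, List<? extends SketchCheck> unskippableChecks) {
    boolean allSucceeded = true;
    for (SketchCheck check : unskippableChecks) {
      // No short-circuit: every check gets its turn, as in the second table.
      allSucceeded &= check.scanWithoutParsing(fileName);
    }
    if (allSucceeded) {
      return false; // happy path: nothing left to do for this file
    }
    for (SketchCheck check : unskippableChecks) {
      check.scanFile(fileName); // includes checks that already succeeded above
    }
    return true;
  }

  public static void main(String[] args) {
    RecordingCheck first = new RecordingCheck(false);
    RecordingCheck second = new RecordingCheck(true);
    boolean parsed = analyze("Domain.java", List.of(first, second));
    System.out.println("parsed=" + parsed + ", SecondCheck calls=" + second.calls);
  }
}
```

In the failing scenario, SecondCheck sees the same file in both hooks, which is exactly where double reporting can creep in.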

Use of the caches

The scanWithoutParsing method takes an InputFileScannerContext as a parameter. This structure is similar to the JavaFileScannerContext that is passed to the scan method, but it does not offer any tree-based operation. It does, however, give access to the CacheContext, which provides the ReadCache and the WriteCache.

The ReadCache contains data saved in the last analysis.

The WriteCache contains data the rule wishes to persist for the next analysis. In particular, any information that is not written to the WriteCache during the current analysis will be lost, even if it was available in the ReadCache.

Beyond this, developers are responsible for the data they write to the cache, from validating the staleness of cached data to maintaining the (de-)serialization.
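To make the read/write cycle concrete, here is a minimal sketch that uses in-memory maps in place of the real caches. SketchCaches, TodoCountCheck, the key prefix and every method signature are assumptions made for this example; the actual plugin API differs and should be looked up in the analyzer sources.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** In-memory stand-in for the cache pair; method names are assumptions. */
final class SketchCaches {
  private final Map<String, byte[]> readCache;                    // data from the last analysis
  private final Map<String, byte[]> writeCache = new HashMap<>(); // data for the next analysis

  SketchCaches(Map<String, byte[]> fromLastAnalysis) {
    this.readCache = new HashMap<>(fromLastAnalysis);
  }

  /** Returns the cached bytes, or null when nothing was stored under the key. */
  byte[] read(String key) {
    return readCache.get(key);
  }

  void write(String key, byte[] data) {
    // Models a write-once cache: a second write to the same key is rejected.
    if (writeCache.putIfAbsent(key, data) != null) {
      throw new IllegalArgumentException("Key already written: " + key);
    }
  }

  Map<String, byte[]> forNextAnalysis() {
    return writeCache;
  }
}

/** Hypothetical check that caches how many TODO markers a file contains. */
final class TodoCountCheck {
  private static final String KEY_PREFIX = "sketch:TodoCountCheck:";
  private final Set<String> handledFiles = new HashSet<>();
  int lastCount = -1;

  boolean scanWithoutParsing(SketchCaches caches, String fileName) {
    byte[] cached = caches.read(KEY_PREFIX + fileName);
    if (cached == null) {
      return false; // nothing cached: ask for a regular scan
    }
    try {
      lastCount = Integer.parseInt(new String(cached, StandardCharsets.UTF_8));
    } catch (NumberFormatException e) {
      return false; // unreadable entry: fall back to a regular scan
    }
    // Carry the entry over, otherwise it is gone after this analysis.
    caches.write(KEY_PREFIX + fileName, cached);
    handledFiles.add(fileName);
    return true;
  }

  void scanFile(SketchCaches caches, String fileName, String source) {
    if (!handledFiles.add(fileName)) {
      return; // already handled from the cache: avoid a second write and a second report
    }
    lastCount = source.split("TODO", -1).length - 1;
    caches.write(KEY_PREFIX + fileName, Integer.toString(lastCount).getBytes(StandardCharsets.UTF_8));
  }
}

final class CacheUseSketch {
  public static void main(String[] args) {
    SketchCaches first = new SketchCaches(new HashMap<>());
    TodoCountCheck check = new TodoCountCheck();
    if (!check.scanWithoutParsing(first, "Domain.java")) {
      check.scanFile(first, "Domain.java", "// TODO refactor\nclass Domain {}");
    }
    // The next analysis only sees what the previous one wrote.
    SketchCaches second = new SketchCaches(first.forNextAnalysis());
    boolean skipped = new TodoCountCheck().scanWithoutParsing(second, "Domain.java");
    System.out.println("second analysis skipped parsing: " + skipped);
  }
}
```

The handledFiles set is one possible way to keep scanFile from repeating work when another check forces the file to be parsed after this check already succeeded from the cache.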

Defensive caching guidelines

  • Ensure that the same issue is not reported twice in a scan
  • Ensure that cached data is consistent by protecting against (partially) missing or corrupted data from the cache
  • If the work you are doing is made easier by parsing the file, you can still access the caches in the scan method
  • The cache is shared: use keys that are unique to your rule for storage and retrieval
  • Be reasonable in the amount of information stored
  • A key can be written only once in a WriteCache, so make that single write count
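Some of these guidelines can be sketched in plain Java. The example below assumes a hypothetical rule key and an entry layout of "<sha256 of the source>;<issue count>", neither of which comes from sonar-java. It shows one way to namespace keys and to reject missing, malformed or stale entries before trusting them.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Optional;

final class DefensiveCacheSketch {
  // Hypothetical rule key, used as a namespace inside the shared cache.
  private static final String RULE_KEY = "mycompany:MyCustomRule";

  static String cacheKey(String filePath) {
    return RULE_KEY + ":" + filePath;
  }

  static String sha256(String text) {
    try {
      byte[] digest = MessageDigest.getInstance("SHA-256")
        .digest(text.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 is not available", e);
    }
  }

  /** Assumed entry layout: "<sha256 of the source>;<issue count>". */
  static byte[] encode(String source, int issueCount) {
    return (sha256(source) + ";" + issueCount).getBytes(StandardCharsets.UTF_8);
  }

  /** Returns the cached count only if the entry is present, well-formed and fresh. */
  static Optional<Integer> decode(byte[] cached, String currentSource) {
    if (cached == null) {
      return Optional.empty(); // missing entry
    }
    String[] parts = new String(cached, StandardCharsets.UTF_8).split(";", -1);
    if (parts.length != 2 || !parts[0].equals(sha256(currentSource))) {
      return Optional.empty(); // malformed entry, or the file changed since it was cached
    }
    try {
      int count = Integer.parseInt(parts[1]);
      return count >= 0 ? Optional.of(count) : Optional.empty();
    } catch (NumberFormatException e) {
      return Optional.empty(); // corrupted payload
    }
  }

  public static void main(String[] args) {
    byte[] entry = encode("class Domain {}", 3);
    System.out.println(cacheKey("src/Domain.java"));
    System.out.println("fresh: " + decode(entry, "class Domain {}").isPresent());
    System.out.println("stale: " + decode(entry, "class Domain { int x; }").isPresent());
  }
}
```

Whenever decode returns an empty Optional, the safe reaction in this sketch would be to return false from scanWithoutParsing and let the regular scan rebuild the entry.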