two kinds of analysis - jspecify/jspecify GitHub Wiki

This page discusses the differences between two kinds of analysis performed by analysis tools, and some of the consequences for each kind.

Background terminology

The code that an analyzer or compiler is looking for findings in is called the "code under analysis" ("CUA"). Any other code that is not being analyzed itself, but from which information might be drawn to aid in processing the CUA, is called "dependency code".

Primary analysis vs. annotation validation

JSpecify exists to support primary analysis (the first example of which is nullness analysis). Primary analysis looks for actual bugs that might already exist in your code.

But to accomplish that, we have a set of annotations we ask you to add to your code. Then since those annotations could themselves be misused, we now have a few more bugs to look for: misuses of the annotations themselves! We might perhaps call this "secondary analysis", but we'll more plainly call it annotation validation. Its purpose is to help you to provide information correctly to the primary analysis.

It's often useful to differentiate these two kinds of behavior -- even if in practice a single tool might perform both. To recap:

  • KIND 1: Primary analysis: given the information conveyed by annotations, look for potential real problems in the code.
    • Example: null is being returned for a non-null return type.
  • KIND 2: Annotation validation: look for where these annotations are being used incorrectly.
    • Example: a return type is annotated @Nullable @NonNull

(In a way we've labeled Kinds 1 and 2 backwards, out of "chronological" order. But it would also be confusing if something called "primary" were not "Kind 1".)

Outside the realm of static analysis, there's an analogous distinction between the (Kind 2) EqualsTester utility, which checks whether an equals method itself is broken, and (Kind 1) an assertion, like assertEquals(e, a) (or assertThat(a).isEqualTo(e)), which assumes equals is correct and uses it to find other problems.

How this complicates matters

If we were defining a brand new programming language, we could make the compiler itself find all Kind 2 issues, report them as errors, and refuse to produce an output file. This is brilliant, because then you never have to worry about them again!

But this is Java. Even if code in dep.jar includes JSpecify annotations, it might have simply been compiled by vanilla javac and that's it. Only a small number of Kind 2 problems would have been caught. And even if it was analyzed, that analyzer might have been noncomformant, or its findings might have been ignored. Or the annotations might have been applied to the code from the outside (using "stub files" or "external annotation data files").

Now suppose your code depends on dep.jar. In doing primary analysis on this code, the analyzer will sometimes need information about code in dep.jar. If it sees JSpecify annotations, it'll use them. But there's no guarantee they were used correctly at all!

In dep.jar:

void bar(@Nullable @NonNull foo);

In your code:

bar(null);

The annotations on the bar() signature are, of course, a logical contradiction, and JSpecify does not define a precedence rule to settle the issue either. Kind 2 analysis of dep should have caught that.

But now we're doing primary analysis of your code, where the bar() method is not part of the CUA, only a dependency. Officially, the annotations are both classified as "unrecognized" or "irrelevant", and the meaning is as if neither was present.

A perhaps interesting example

Consider this code in null-marked context:

interface StringValidator {
  /**
   * Returns true if {@code s} is considered valid; always returns
   * {@code false} for {@code null}.
   */
  boolean m(@Nullable String s);
}

class NonemptyValidator implements StringValidator {
  @Override public boolean isValid(String s) {
    return !s.isEmpty();
  }
}

Notice that there is a Kind 1 problem here: the subtype is contractually required to accept null but does not. But also notice that the annotations are the most correct expression of how the code is actually behaving, so there is no Kind 2 problem to report. In fact the annotations are successfully revealing the existence of the Kind 1 problem.

An open question

But it's presently unclear to what degree this Kind 1 analyzer or library is responsible for recognizing this fact and modifying its behavior accordingly (potentially even reporting the original problem itself, though it is outside the CUA).

One resolution would be: officially, this Kind 1 analyzer or library should take all available information into account, thereby recognize the flaw in bar's specification, and behave as though neither annotation were present. (It could perhaps also optionally issue a finding to the effect of "you are depending on a malformed API", but we would not require this.) So, if the analyzer doesn't do that, it would be nonconformant with JSpecify.

Another option would be to say that the Kind 1 analyzer or library is entitled to trust that annotations were used correctly, so it would have "garbage in, garbage out" behavior if they were not.

Note that in either case, there is a well-defined answer for what the actual meaning of the annotations is (they conflict, therefore carry no meaning). It is only that in the second option, failing to validate one's own annotations could result in analyzers of your dependents making erroneous conclusions.

(As much as possible, we don't want tool authors to have to worry about this—or even about the subtleties of correct code. To that end, we plan to provide libraries for tool authors.)

A three-party scenario

Suppose these three snippets appear in three different artifacts, each dependent on the ones before it:

// In A.jar (dependency)
@interface One {
  String somethingRequired();
}

// In B.jar (dependency)
@Implies(One.class)
@interface Two {}

// In C.java (CUA)
@Two void three();

(TODO)