nullness borderline cases - jspecify/jspecify GitHub Wiki

In base Java, especially before generic types, the trade-offs of choosing one type over another were quite stark. If your method needs to accept values up to 2^31 - 1, it can accept int, but if it needs to accept 2^31 as well, you're stuck using long. Java erased types are absolutely strict.

With generic types, it gets a bit more fuzzy. We can make judgment calls, and suppress warnings when the compiler isn't sure we're safe. For example, the implementation of Collections.<T>emptySet() can return the same Set<Object> instance every time (suppressing the unchecked-cast warning), even though the types aren't actually assignable.

In our case, most of the time it will be easy to decide what nullness a type usage should have, just by considering whether it's good or bad for null to "pass through" that point. But there are many kinds of "borderline" cases where the answer isn't clear. If the tools were as strict as Java is with erased types, we'd have no choice: we'd simply have to choose nullable for every borderline case. But instead, we have the same flexibility we just described for generic types, so we're empowered to make the decision that works out best for users.

Here we have notes about how to handle situations where it is tough to tell which of @Nullable or @NonNull applies.

The Method.invoke case

A java.lang.reflect.Method instance might represent a static method or an instance method. Only in the first case is null an acceptable value for invoke's first argument (the receiver)! Yet we have to annotate the method one way for all.

Interestingly, though, the invoke method never has to accept null. In the static case, it promises to ignore the argument, and so it became conventional to pass null. But code can just as easily pass a dummy value like "(ignored)". Given that, we can catch more bugs by leaving the parameter non-nullable.
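To illustrate (a minimal sketch; per the reflection API, the receiver argument to invoke is ignored for static methods, so any dummy object works):

```java
import java.lang.reflect.Method;

public class InvokeDemo {
  public static void main(String[] args) throws Exception {
    Method abs = Math.class.getMethod("abs", int.class);

    // Math.abs is static, so invoke ignores its receiver argument.
    // Instead of the conventional null, pass a non-null dummy value:
    Object result = abs.invoke("(ignored)", -5);

    System.out.println(result); // prints 5
  }
}
```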

(TODO: a better example of the general issue, that doesn't have this get-out-of-jail card.)

The "poly-null" case

See PolyNull.

The AtomicReference case

A simplified look at AtomicReference's API:

class AtomicReference<V extends @Nullable Object> {
  AtomicReference(V initialValue) {...}

  V get() {...}
  void set(V newValue) {...}
}

This says that we can:

  • Create an AtomicReference<String>, where we'll have to pass a non-null string, and can only set non-null strings thereafter, and therefore values returned by get will be non-null too
  • Create an AtomicReference<@Nullable String>, where we can pass null, and set null, but of course values returned by get will be nullable too

So far... this sounds perfect! Both ways of using the class make perfect sense.

Unfortunately, there's a second constructor, for creating an uninitialized atomic reference:

class AtomicReference<V extends @Nullable Object> {
  AtomicReference(V initialValue) {...}
  AtomicReference() {...} // the value remains `null`!

  V get() {...}
  void set(V newValue) {...}
}

Ideally we'd like this no-arg AtomicReference constructor to refuse to create atomic references with a non-null type argument: new AtomicReference<@Nullable String>() should work, but new AtomicReference<String>() should not. But Java generics don't work this way, and JSpecify takes after Java generics.

That means that even for an AtomicReference<String>, get could still return null. Just avoiding the no-arg constructor makes you technically safe, but how would the analyzer know that? It seems we're forced to have get return @Nullable V. But then that erases most of the benefits of using an AtomicReference</*non-null*/ String> in the first place.
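A minimal sketch of the problem (JSpecify annotations omitted; under a nullness analysis, the type argument String would be read as non-null):

```java
import java.util.concurrent.atomic.AtomicReference;

public class UninitializedDemo {
  public static void main(String[] args) {
    // The type argument is String, which a nullness checker would
    // read as non-null, yet the no-arg constructor leaves the value null:
    AtomicReference<String> ref = new AtomicReference<>();

    String s = ref.get(); // null, despite the non-null-looking type
    System.out.println(s == null); // prints true
  }
}
```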

By including this constructor, AtomicReference has made itself vulnerable to the very same create-initialize gap that plagues object fields and array components. Like those cases, a bug results if, before the value is initialized, either the value is read or the instance escapes the context where it was created.

The logical way to avoid bugs here is to use initialization analysis -- whatever catches this bug for fields and arrays should be equally able to include this case as well. It doesn't make sense to saddle the nullness analysis with the responsibility for catching such bugs, because, as just discussed, it is incapable of doing so well.

Therefore, the original code should be considered acceptable as it was. Of course, using initialization analysis (that recognizes this case) is recommended!

The findViewById case

Sometimes called the Map.get case.

TODO; for now see issue 71.

The parseInt case

This is a method that:

  • definitely throws upon null input
  • but throws the same kind of exception as for other invalid inputs

Integer.parseInt(String) is a good example: it throws NumberFormatException whether the input is null or "xyz".

Now, JSpecify specifications have nothing to do with exception types. Since this method has no desire to receive null, the "correct" nullness for this parameter is non-null.

But annotating "correctly" doesn't always produce the best outcomes for users. It helps to look at the real-world consequences on users. This generally means balancing the overall harm from false positives against the overall harm from false negatives.

Two kinds of callers

A parsing method has callers of two kinds: those with trusted input, and those with untrusted input. "Trusted" means the caller is sufficiently confident it will parse correctly. Logically that includes trusting it to be non-null. (Yes, the trust could be misplaced, and the code could propagate an unexpected unchecked exception, but this is no different from other cases of "programmer error".)

Setting aside nullness annotations/analysis, processing untrusted input looks like this:

try {
  return Integer.parseInt(input);
} catch (NumberFormatException e) {
  return -1; // or whatever other handling (that isn't just rethrowing `e`)
}

... whereas processing trusted input looks like this:

return Integer.parseInt(input);

Nullness analysis

Now what about nullness analysis? Assume that the parsing method is now @NullMarked. On the caller side, assume the analyzer considers the input expression to be nullable (if it were known to be non-null, there would be nothing to discuss: the code above would both compile and run just fine).

Now we have two choices:

Make parseInt parameter non-null

In both the trusted and untrusted cases above, the line return Integer.parseInt(input) must change to either

return Integer.parseInt(requireNonNull(input)); // with `import static java.util.Objects.requireNonNull;`

... or, to preserve the former behavior more precisely and avoid an unnecessary check:

@SuppressWarnings("nullness")
int result = Integer.parseInt(input);
return result;

Neither change appreciably alters behavior at runtime, so the finding was a false positive.

Make parseInt parameter nullable

Both original code snippets can continue to be used as-is. There is no finding and no false positive. What about false negatives? Well, the untrusted case will be handled just fine at runtime. But since input is a nullable expression, there's a good chance it does have some risk of being null, and we missed out on giving a signal to the "trusting" programmer that they might be a little too trusting in this case.
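To make the untrusted case concrete (a small sketch; the wrapper name parseOrDefault is ours, not part of any API):

```java
public class ParseDemo {
  // Untrusted-input caller: null is handled exactly like any other
  // unparseable input, because Integer.parseInt throws
  // NumberFormatException for null too.
  static int parseOrDefault(String input) {
    try {
      return Integer.parseInt(input);
    } catch (NumberFormatException e) {
      return -1;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseOrDefault("42"));  // prints 42
    System.out.println(parseOrDefault("xyz")); // prints -1
    System.out.println(parseOrDefault(null));  // prints -1: handled fine at runtime
  }
}
```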

Conclusion

Considering the incidence and severity of all the false positives from the non-null parameter, and the incidence and severity of the false negatives from the nullable parameter, which seems more damaging overall? Answering that question should be a useful guide to which way you should annotate the method.

The "something has already gone wrong" case

You own a library API. A particular type usage in that API's member signatures includes null only in exceptional scenarios that your clients should never find themselves in, according to the API's broader contract (whether or not that contract is enforced, or even enforceable).

In these cases, “something has already gone wrong” (it might not be this client’s fault, but it doesn’t matter). Notice this is not the “Map.get case”.

Now, if this were the analogous question from base Java of choosing one type over another, we would have no freedom: javac forces us to use the more general type. But our case is much more like that of parameterized types: we too can suppress warnings, so we do have the flexibility to choose the lesser of two evils:

  • The type can be technically over-specific (non-null).
  • The type can be over-general in practice (nullable).

Splitting the cases

The consequences of each option depend on how the type is being used:

  • A provided type (where values flow out from a library to its clients)
  • An accepted type (where values flow into a library from its clients)
  • (I’ll disregard “inout” types here, as they should be rare and they have much bigger problems.)

TODO: a good example of each case

Case 1: provided type

Should you make this provided type:

  • "technically over-specific"? (non-null)
    • General consequence: in the errant case, can cause null pollution in the client. An immediate defensive null check would catch it, but nothing would prompt the user to do that.
    • Consequence to Kotlin client: in the errant case, at least that runtime null check will happen, so we “fail faster” and more consistently.
  • "over-general in practice"? (nullable)
    • General consequence: well-behaved clients suffer “type dullening” (like happens when calling a polynull method without polynull support). To reassert non-nullness they must suppress a finding, or make an unnecessary-feeling runtime check.
    • Consequence to Kotlin client: same, but that first remedy is impossible; happily, the second remedy is super easy (!!).

Case 2: accepted type

Should you make this accepted type:

  • "technically over-specific"? (non-null)
    • General consequence: in the errant case, the client would have to suppress a finding in order to pass null to the library.
    • Consequence to Kotlin client: that's impossible, so there is simply no way to do this, short of altering the library jar!
  • "over-general in practice"? (nullable)
    • General consequence: well-behaved clients may accidentally send null, which we can only hope is rebuffed at runtime.
    • Consequence to Kotlin client: same
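A sketch of the runtime "rebuff" for an accepted type (the Registry class and its put method are hypothetical, illustrating one way a library can defend an over-general nullable parameter):

```java
import java.util.Objects;

public class Registry {
  // Hypothetical accepted type annotated nullable in the signature;
  // the library still rebuffs null eagerly at runtime rather than
  // letting it pollute internal state.
  public static void put(String name) {
    Objects.requireNonNull(name, "name must not be null");
    // ... store name ...
  }

  public static void main(String[] args) {
    put("ok"); // fine
    try {
      put(null); // a well-behaved client that slipped: rebuffed immediately
    } catch (NullPointerException expected) {
      System.out.println("rebuffed: " + expected.getMessage());
    }
  }
}
```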

Conclusion

Again, we can't pronounce one answer as always better than the other. But hopefully understanding these consequences will help you make your decision.

TODO: the "everything is projected" case
