type system - jspecify/jspecify Wiki

The type system model

The basic model should be enough for most users to handle most situations (we think). But at some point it might fall short, or you might crave a deeper understanding of "how to think like an analyzer".

It's now time to understand the Jspecify type system!

What's a type system?

At the most general level, we can describe a type system as any system that defines:

  • Types, which are uniquely identifiable sets of values. For example, int may denote the type that has as its value set all positive and negative whole numbers (integers) up to certain lower and upper bound values. Some types are predefined in a type system while others may be constructed during the type system's operation based on relationships between existing types.
  • Kinds, which can be thought of as sets of types. In other words, kinds are the "type of types".
  • Relations, which may be established between types or kinds.
  • Operations, which allow the system to establish relations between kinds or types, transform kinds and types, or draw conclusions based on the relations between kinds and types.

Of course, Java itself defines a type system, spelled out in detail in the Java Language Specification (JLS). To better understand how JSpecify builds on and extends the Java type system, it may be helpful to gain a basic understanding of how Java's type system works, which is covered in the next section.

Some Background: Java's Type System

We can explore Java's type system step-by-step, examining how javac implements each of the features outlined in our general description of type systems:

Types in the Java Type System

We noted that, first of all, a type system defines things called types, which are uniquely identifiable sets of values. We'll soon see that the Java language defines several types, but we will start with one example: int. In Java, int is a predefined type that encompasses all the positive and negative whole numbers (integers) up to a lower (negative) bound and upper (positive) bound. Types like int also have an associated size which determines how much memory they occupy on a system. This is important in other parts of the compiler, but we can ignore this detail when it comes to the type system.

Kinds in the Java Type System

What we call kinds in this document are sets of types or "the type of types".

We can clarify this notion with an example from the Java type system. The type we explored earlier, called int, belongs to a kind in the Java type system called Primitive Types. This is a set of types, such as int, char, and short that share certain characteristics. Note that a kind is not the same thing as the intersection of the values of these types! While the compiler will check in certain places that a particular kind of type is used, the fact that int and char belong to the same kind does not mean we can use them interchangeably or that we could pass any value from their intersection. A different kind, the intersection type will let us accomplish this; more on it shortly. The Java type system has the following kinds:

  • No Type: This is a special kind of types that has no members. It's used for constructs that do not have meaning in the type system, such as packages and, in Java's case, void.
  • Primitive Types: these are built-in types; their shared characteristic is that they do not support a special operation called dereferencing.
  • Reference Types: these are types that support a special operation called dereferencing.
  • Null Types: this kind contains only one type, which in turn contains one value null. As we'll see, this type is of major significance when it comes to JSpecify's nullness effort.
  • Array Types: this kind only contains a single type, which contains as its values arrays.
  • Executable Types: this kind contains types that have invocable constructs as values, such as methods and constructors.
  • Intersection Types: this kind contains types that are derived from the intersection of the value sets of two or more types. More on these shortly.
  • Union Types: this kind contains types that are derived from the union of the set of values of two or more types. More on these shortly.
  • Type Variables: this is a special kind of type that we'll discuss later.
  • Wildcard Type: this is a special kind of type that we'll discuss later.
  • Declared Type: this is the kind assigned to user defined types (UDFs), in the case of Java, these are classes and interfaces.

Kinds as Functions

While we've described kinds as fixed sets of types so far, it's also possible to think of them as functions that, given some number of argument types produce a new type.

This is easiest to see with the intersection and union types. We can view the intersection kind as a function that, given two types, A and B returns a new type C that has the value set that is the intersection of the value sets of both the input types A and B. Similarly, the union type, given two input types, returns a type that has as its value set the union of the value sets of the given input types. These concepts are actually instances of two deeper mathematical notions called product and coproduct which are the fundamental building blocks that permit defining new types from a set of initial primitive types.

Similarly, the array type can be viewed as a function that takes a single type as an argument, and returns the type of arrays of arbitrary length containing values from the value set of the input type.

Not all types fit this treatment nicely. For example, it's hard to view the primitive types as a function; even if we view them as a nullary function with no inputs, we'd get an invalid function as a result since they have more than one output, e.g. int, char. Instead, this is a special case in which these are initial types determined by the compiler and assigned to parts of a program by the type system. We need to start somewhere. Another alternative would be to introduce separate kinds for each primitive that are constant nullary functions to themselves. Then, the kind of primitive types is a partial unary function over these kinds that is undefined for non-primitive type inputs.

Relations in the Java Type System

TODO: Kind relations (e.g. many kinds are subkinds of the reference kind); type relations (subtyping).

Operations in the Java Type System

TODO: Basic unification/type checking.

The JSpecify Nullness Type System

TODO: one way of pitching the main idea: the Null Type is no longer a subtype of reference type; in some ways it becomes proper kind function from types to types (the null kind function T+null)