-
Notifications
You must be signed in to change notification settings - Fork 330
How NullAway Works
This page gives some further details on the inner workings of NullAway.
The most complex part of the NullAway implementation is its intra-procedural type inference, which infers nullability information for local variables and some types of expressions based on code within the same method. The inference can infer different types for locals / expressions at different program points, e.g.:
void foo(@Nullable Object x) {
if (x != null) {
// inference learns x is @NonNull here
x.toString();
}
// x is still @Nullable here, hence error is reported
x.hashCode();
}The inference also learns nullability facts for more complex expressions, e.g.:
class ListNode {
int data;
@Nullable ListNode next;
@Nullable ListNode getNext() { return next; }
void doStuff() {
if (this.next != null && this.next.getNext() != null) {
// NullAway infers this.next.getNext() to be @NonNull here
this.next.getNext().toString();
}
}
}Type inference is implemented as a dataflow analysis, leveraging the Checker Framework's dataflow library. An abstract state is a mapping from access paths to a corresponding nullability state. For further background on access paths see Section 5.2 of this document. NullAway allows for zero-argument method calls (like getters) to be included in access paths along with field names. NullAway also makes the simplifying (but unsound) assumption that callees perform no mutation, and hence does not invalidate nullability state for access paths across method calls.
The dataflow transfer functions are implemented here. For the most part, the transfer functions simply propagate information learned from assignments and null checks for any expression representable as an access path. Some care must be taken in constructing the initial store when analyzing the body of a lambda expression. Unlike normal methods, lambda parameters inherit the nullability of their corresponding functional interface methods, to eliminate verbosity, so some logic is required to "infer" the initial nullability of the lambda parameters.
For matching of containsKey() and get() calls
for maps, NullAway
additionally tracks the receiver and first argument to such calls as an access
path. So, if there is a call ap1.get(ap2) underneath a
conditional if (ap1.containsKey(ap2)), where ap1 and ap2
are representable access paths, NullAway treats the get() call as
returning @NonNull.
Given the results of type inference, in some cases, error checking
involves directly checking the conditions that would lead to
each
possible
error message.
The logic is in the
main
NullAway class.
E.g., to
check
fields assignments,
we ensure that if the field is @NonNull, the right-hand side
expression of the assignment is not @Nullable. Checking of
dereferences,
parameter passing and
returns,
and
method overriding is
similarly straightforward.
Checking
for
proper initialization of @NonNull fields is
more involved. Here, we must again leverage the results of dataflow
analysis, to show that the relevant field is @NonNull at the exit
node of the relevant constructor / initializer (i.e., it is
initialized on all paths). Similarly, dataflow analysis is used to
check
for
read-before-init errors,
by inferring field nullability at the program point before the read.
Our dataflow analysis has been designed such that we can analyze a
method once (an expensive operation) and re-use the result for type
inference and all initialization checking. NullAway also has targeted
inter-procedural reasoning to allow, e.g., for field initialization in
a private method that is always invoked from an initializer.
Some other special cases are worth mentioning. Java 8 lambdas are
treated as overrides of the corresponding functional interface
methods, so override checking proceeds similarly to that of normal
methods, but with inherited nullability for unannotated parameters (as
discussed in the above section on inference). Method references are
also checked against the expected functional interface method, but
careful handling of corner cases
like
unbound instance methods is
required. Finally, unboxing of a null value leads to an NPE, so we
introduce checks for
nullability
for various operations that
cause unboxing.
The main way to extend the behavior of NullAway beyond the core inference and checking described above, is to use a handler.
Through the NullAway codebase, there are multiple extension points, at which the core code calls specific methods of the Handler interface for all registered handler objects. To extend or override the default behavior for NullAway, it often suffices to scan that interface for the corresponding extension point method(s), then subclass BaseNoOpHandler overriding only said method(s), and finally add an instance of the new handler at the end of the list returned by Handlers.buildDefault().
For example, consider LibraryModelsHandler, which is used to provide models for unannotated third-party library methods. This handler first uses a service loader to find custom models for methods, in addition to those declared inside the DefaultLibraryModels class. It then registers itself as a handler overriding, among others, Handler.onDataflowVisitMethodInvocation. This particular extension hook allows it to change, on a callsite by callsite basis, the nullability assumption for the method's return value. By default, NullAway assumes unannotated methods return @NonNull. However, the LibraryModelsHandler changes that value for those methods for which an explicit library model exists which shows the return method to be @Nullable, and the NullAway core dataflow simply moves forward propagating that @Nullable value in those cases.
LibraryModelsHandler overrides similar extension points to mark the arguments to certain library methods as @NonNull and check for correct overriding of modeled library methods. Because of the handlers mechanism, the core NullAway classes don't need to know about the existence of library models at all, they only need the general hooks to override the default return nullability and argument nullability of unannotated methods. These same hooks can be used by other handlers that derive this information from sources other than explicit library models (e.g. ContractHandler below, or even our experimental bytecode analysis).
Some other NullAway extensions implemented as handlers include:
-
RxNullabilityPropagator: This is a handler which propagates nullability information across call boundaries for select methods inside an Rx stream. For example, this allows handling the following code:observable.filter(o -> o != null).map(o -> o.toString()). This is an example of handlers being used to add limited forms of inter-procedural inference, which is prohibitively expensive in the general case. -
ContractHandler: Adds support for a subset of the specifications of the@Contractannotation (e.g.@Contract("!null -> !null")means that if the method's sole argument is@NonNull, then the dataflow analysis should also assume the return is@NonNull). -
ApacheThriftIsSetHandler: Correctly interpretsisSetXXXX()calls inside code generated by the Apache Thrift library as checking the nullability of propertyXXXX. -
OptionalEmptinessHandler: Implements our support for preventing.get(...)calls onOptionalvalues without checking that theOptionalis non-empty, effectively extending NullAway to handle accesses to potentially empty optional values the same it way it handles dereference of@Nullablevalues, without any optional-specific changes to the core tool.
Note that handlers overriding the same extension method are chained together, and the order of the handler chain in Handlers.buildDefault() can matter in certain cases. Please refer to the documentation of the corresponding Handler interface method. In general, due to the performance focus of NullAway, handler chains should remain shallow for most extension methods, and handlers are expected to be as performant as core code. One important advantage of handlers, besides helping to keep the core code readable, is that they allow turning on and off specific non-core NullAway features by simply adding or removing the corresponding handler from Handlers.buildDefault().
Work is underway to implement support for the JSpecify specification of the semantics of Java nullability annotations. See the JSpecify Support page for details from a user perspective; here we give more details on the implementation thus far. You should be familiar with the JSpecify user guide before reading this and you may need to reference the full spec. Most code for implementing JSpecify support is in the com.uber.nullaway.generics package, specifically the GenericsChecks class.
Null-markedness: NullAway supports the @NullMarked and @NullUnmarked annotations, including nesting them. The core logic for checking the markedness of a particular method / class is in the CodeAnnotationInfo class. The main NullAway class calls into CodeAnnotationInfo methods as it traverses ASTs to determine what checks to perform. It also tries to optimize for the common case where a class is entirely @NullMarked or @NullUnmarked; see the NullMarking enum and how it is used.
Ensuring type arguments respect bounds: For a generic type with explicit nullability in type arguments (e.g., Foo<@Nullable String>), we must check that a type argument is @Nullable only if the corresponding type variable has a @Nullable upper bound (e.g., class Foo<T extends @Nullable Object>, but not just class Foo<T>). See GenericsChecks.checkInstantiationForParameterizedTypedTree and GenericsChecks.checkGenericMethodCallTypeArguments for the relevant logic.
Note that getting the fully-annotated type of an expression within NullAway can be challenging (even ignoring inference), as sometimes javac maintains annotations inconsistently or not at all (particularly for NewClassTrees). See the GenericsChecks.getTreeType method for our additional logic to get or reconstruct annotated types in more cases.
Determining when an expression is @Nullable Some core NullAway logic for determining whether an expression or location is @Nullable must be enhanced for JSpecify. E.g., a class Foo<T> may have an instance method T getVal() { ... }. When getVal() is invoked with a receiver of type Foo<@Nullable String>, NullAway must reason that the invocation expression is @Nullable. Similar cases arise when determining if a method parameter is @Nullable or not. There are various methods in GenericsChecks for assisting with this reasoning, including getGenericReturnNullnessAtInvocation, getGenericMethodParameterNullness, and getGenericParameterNullnessAtInvocation.
Subtyping checks: NullAway must perform subtyping checks at (pseudo-)assignments and at method overrides. With JSpecify support, these subtyping checks must also account for nullability of generic type arguments and array contents. The code for the additional subtyping checks required by JSpecify is separated from the code checking subtyping of top-level nullability (i.e., that a @Nullable expression is not assigned into a @NonNull location), since JSpecify mode is still optional. The main logic for the additional JSpecify subtyping checks are in GenericsChecks.subtypeParameterNullability, which in turn calls GenericsChecks.identicalTypeParameterNullability, which uses the CheckIdenticalNullabilityVisitor. These routines are called from various places to handle regular assignments, parameter passing, returns, etc. For method overriding checks see GenericsChecks.checkTypeParameterNullnessForMethodOverriding.
Sometimes when performing a subtyping check, we need to "view" a type as one of its supertypes. E.g., if we have class Foo<T extends @Nullable Object>, and class Bar extends Foo<@Nullable String>, we may have to check if a Bar expression is assignable to a Foo<String> variable (it is not). To do so, we need to first "view" the Bar expression as Foo<@Nullable String> and then compare the nullability of the type arguments. The TypeSubstitutionUtils class contains methods to help with such conversions (the asSuper method), along with viewing a method as a member of some class (memberType) and substituting type arguments into a generic type (subst). (The class is named TypeSubstitutionUtils since all of these operations require performing substitutions.) The logic in this class handles cases where @Nullable type arguments are written in extends and implements clauses, with some additional related logic in ClassDeclarationNullnessAnnotUtils. It also handles restoring explicit nullability annotations (see restoreExplicitNullabilityAnnotations). This restoration handles cases like when a method parameter p has type @NonNull T where T is a type variable and the @NonNull annotation is explicit. Even if the type argument for T is @Nullable at some call site, that parameter must still be treated as @NonNull due to the explicit annotation.
Inference of Type Argument Nullability For generic methods, Java often infers the type arguments for calls to such methods. For JSpecify support, Nullaway must extend that inference to also infer nullability of the type arguments. We have an initial implementation of inference based on a simple constraint solver. See GenericsChecks.inferGenericMethodCallType, ConstraintSolver, and ConstraintSolverImpl. The basic approach is to generate subtyping constraints for a call based on where its result is being assigned and what parameters are being passed, and then solve the constraints by propagating nullability requirements, while still respecting type variable bounds. This code is still in flux and the implementation approach may change in the future.