Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove explicit nullable annotations if they depend on the generic type argument #337

Closed
DavyLandman opened this issue Aug 12, 2019 · 41 comments

Comments

@DavyLandman
Copy link

Hi Ben,

I was pleasantly surprised by CheckerFramework annotations on the caffeine code, since we also switched a while back, it makes life easier. 👍

A pattern I saw and I wanted to check your opinion on is for example Cache::get:

  @Nullable
  V get(@NonNull K key, @NonNull Function<? super K, ? extends V> mappingFunction);

Now reading the documentation, if the mapping function never returns null, neither should the get function, right?

If that is so, than I think the @Nullable should be removed from the return type. Since CF will just propagate the nullability of the V type parameter. If V is @Nullable, then the result will be, but if you provide a mapping function that has a @NonNull return type, it's a bit strange to still have to handle the null case.

I hope this question makes sense?

@cruftex
Copy link
Contributor

cruftex commented Aug 12, 2019

Mind the JavaDoc on the method:

@return the current (existing or computed) value associated with the specified key, or null if
the computed value is null

The function is nonnull, not the result of it.

@DavyLandman
Copy link
Author

I did not mean the mapping function, I'm talking specifically about the V generic type parameter that can be either @NonNull or @Nullable depending on the client code.

Example:

public @NonNull String foo(@NonNull String bar) { // note that in CF @NonNull is assumed, just here for clarity sake
     return cache.get(bar, String::toUpperCase);
}

Since String.toUpperCase returntype is @NonNull, this (sh/c)ould be valid. Since the V type parameter would be bound to @NonNull String.

I hope this makes my question a bit more clear, sorry if I use the wrong terminology, I'm always open to learn.

(currently, because of the @Nullable annotation, the example becomes something like:)

public @NonNull String foo(@NonNull String bar) { // note that in CF @NonNull is assumed, just here for clarity sake
     return Objects.requireNonNull(cache.get(bar, String::toUpperCase), "Unexpected null result from cache");
}

@ben-manes
Copy link
Owner

Yes, if V is never null then the results is non-null. How would we express this, though? It’s the same as computeIfAbsent.

I originally meant it for documentation as a hint to the reader, but static analysis tools have gotten better. I think dropping the annotation would imply non-null by default.

@cruftex
Copy link
Contributor

cruftex commented Aug 12, 2019

.... I'm always open to learn.

@DavyLandman: Actually I learned something now. Thanks for the more detailed explanation!

....If that is so, than I think the @nullable should be removed from the return type. Since CF will just propagate the nullability of the V type parameter.

The nullability might be part of the return type of the mapping function in CF, but there is no way to propagate this. Implementations of the same method signature can legally return always null.

There is no contract on the method that CF can use to determine how the returned value relates to the mapper function.

@DavyLandman
Copy link
Author

DavyLandman commented Aug 13, 2019

Yes, if V is never null then the results is non-null. How would we express this, though? It’s the same as computeIfAbsent.

Indeed, similar to that, maybe we can do the same as the CF jdk annotations for map?

default V computeIfAbsent(K key, Function<? super K, ? extends @Nullable V> mappingFunction) {

I have to admit, I would have to dive deep into the CF docs to understand how the @Nullable works together with the wildcard and it's position of the type, but at least that's how the authors of CF defined it.

I originally meant it for documentation as a hint to the reader, but static analysis tools have gotten better. I think dropping the annotation would imply non-null by default.

I understand this tradeoff, I think the example from the jdk annotations might be a good middle ground.

@ben-manes
Copy link
Owner

ben-manes commented Aug 13, 2019

Thanks. I think that's a reasonable solution.

I use ErrorProne + NullAway instead, so there will likely be other minor mistakes in the future.

@DavyLandman
Copy link
Author

Great, I imagine might be more cases like this throughout the big caffeine api?

I hadn't heard about nullaway, looks like a good alternative to checker framework. CF takes some investment and getting used too, but after the initial pain, keeping it updated is not that hard.

@ben-manes
Copy link
Owner

I wouldn't be surprised. This is a little special because its a computation, so we just need to fix each of the get(key, func) style methods. Most other usages are obvious or we can defer to the asMap view.

I tried CF once but it was too invasive for me, but I'm sure it's matured a lot since then. ErrorProne + NullAway are a bit easier to add progressively, though I suspect they are not as powerful.

@ben-manes
Copy link
Owner

ben-manes commented Apr 27, 2020

I forgot to link here earlier, but I am waiting to see how the effort in google/guava#2960 (comment) pans out.

@ben-manes
Copy link
Owner

ben-manes commented Jan 16, 2021

Reviewing this anew, and I think maybe this case should use @PolyNull.

As an example of the use of @PolyNull, method Class.cast returns null if and only if its argument is null:

@PolyNull T cast(@PolyNull Object obj) { ... }

This is like writing:

@NonNull T cast( @NonNull Object obj) { ... }
@Nullable T cast(@Nullable Object obj) { ... }

If so, then I would change this one method signature to,

@PolyNull
V get(@NonNull K key, @NonNull Function<? super K, ? extends @PolyNull V> mappingFunction);

What do you think @DavyLandman?

@vlsi
Copy link
Contributor

vlsi commented Jan 16, 2021

The following should be enough:

// `K extends Object` means K is non-null (provided default checkerframework configuration is used)
public interface Cache<K extends Object, V> {

  V get(/* @NonNull <== not needed */K key, @NonNull Function<? super K, ? extends V> mappingFunction);

@ben-manes
Copy link
Owner

I haven't used checker for a long time, but it used to default to null (not nonnull). Is that no longer the case? That was one reason that I decided not to use it as it became very verbose with that coding style.

@vlsi
Copy link
Contributor

vlsi commented Jan 16, 2021

PS. I'm surprised you have lots of @NonNull annotations. Is it intentional? I guess the typical recommendation (e.g. from checkerframework) is non-nullable by default (e.g. @DefaultQualifier(NonNull...))

@vlsi
Copy link
Contributor

vlsi commented Jan 16, 2021

@ben-manes
Copy link
Owner

For my usages, the goal wasn't actually about static analysis but for clearer documentation. The jsr305 were used that way to be more explicit for those scanning the JavaDoc that the return value is nullable. When migrating in #242, it was requested that these are more explicit to better conform to the checker framework's analysis. That made sense, but became verbose.

Would using @DefaultQualifier help reduce that verbosity by again assuming non-null unless specified? It would then go back to being more reader-friendly?

@vlsi
Copy link
Contributor

vlsi commented Jan 16, 2021

it used to default to null (not nonnull)

The defaults are described here: https://checkerframework.org/manual/#climb-to-top

They suggest non-null by default, however, things get a bit complicated with generics.

Would using @DefaultQualifier help reduce that verbosity by again assuming non-null unless specified?

Exactly.

the goal wasn't actually about static analysis but for clearer documentation

I see. However, tools like IntelliJ IDEA recognize the annotations and they might produce false warnings in case the annotations are not at their best :)
AFAIK Kotlin does not understand checkerframework annotations, however, it looks like they are going to converge on https://github.com/jspecify/jspecify

Would using @DefaultQualifier help reduce that verbosity by again assuming non-null unless specified? It would then go back to being more reader-friendly?

I guess so.

For instance, even if you write public interface Cache<K extends @NonNull Object, V> {, then it would be more-or-less clear that K is a non-nullable type, and it would avoid adding @NonNull K even without DefaultQualifier.

PS. I've added a gist on checkerframework behavior, however, it is more for machine verification rather than for "documenting nullability in code"
https://github.com/apache/calcite/blob/6908d21ddecab8a10d7754800eed8f15b9c32f78/site/develop/index.md#null-safety

@ben-manes
Copy link
Owner

Right, modern Java generally assumes non-null by default. The IDE nullness checkers often did too, or at least Eclipse's as one of the early ones. I am positive checker went against that in early versions, as I recall that in their docs in 2014 when first trying it out here. If that's changed or by using default qualifier then that is great.

I'll clean up the annotations in the v3 branch and ping you for a quick review.

@ben-manes
Copy link
Owner

@vlsi Why not include the generic bounds or use TypeUseLocation.ALL in your package?

@vlsi
Copy link
Contributor

vlsi commented Jan 16, 2021

TypeUseLocation.ALL

Generic bounds are tricky, and trying to make every bound nullable or non-nullable does not really work for cases like extends, super, etc.

I recently annotated Apache Calcite codebase (see apache/calcite#2268), and it does pass the verification. I learned that the Checker Framework's recommended approach to generics works OK, it is readable, and it is more-or-less consistent.

@ben-manes
Copy link
Owner

ben-manes commented Jan 17, 2021

@vlsi Can you please review f19f547? I removed all of the @NonNull annotations, used @PolyNull in the case above, and added the package defaults.

@ben-manes
Copy link
Owner

@vlsi I spent a few hours trying to resolve checker framework warnings, but I don't think it's worthwhile. It complains about types being nullable despite non-null if conditions asserting that to no longer be true. By checking only the types and not logic flow, hundreds of pointless warnings are produced. These have to be suppressed, which makes the code harder to read and negates any value in catching errors.

This may be a bad fit because as a data structure null is very common. In most code one can and should be very null adverse, and use a coding style that makes them very rare. Then handling and suppressing the warnings is likely a minor deal.

I'll try to port over any public api interface changes to try and make that better conform. That's your and @DavyLandman intent anyway, I just got too ambitious by trying to pass its rules too.

@DavyLandman
Copy link
Author

@ben-manes I agree, especially for map like structures, checker-framework is really hard to get right. One of the problems is caused by arrays. I think at some points maps where hard-coded in their internals.

So I think providing the right annotations on the interface is a good enough trade-off. If you want, you could take a look at the CF annotations for Map and look at how they set it up?

ben-manes added a commit that referenced this issue Feb 1, 2021
Additional wildcards are used throughput the APIs to more flexibly
accept parameters. For example this allows a wider range of method
references to be used as load functions.

The generic types now match the Checker Framework's rules [1]. This
should improve usage of the cache apis in projects that run the checker.
This project does not and its implementation classes are not compliant
for the nullness checker.

[1] https://checkerframework.org/manual/#generics-instantiation
ben-manes added a commit that referenced this issue Feb 1, 2021
Additional wildcards are used throughput the APIs to more flexibly
accept parameters. For example this allows a wider range of method
references to be used as load functions.

The generic types now match the Checker Framework's rules [1]. This
should improve usage of the cache apis in projects that run the checker.
This project does not and its implementation classes are not compliant
for the nullness checker.

[1] https://checkerframework.org/manual/#generics-instantiation
ben-manes added a commit that referenced this issue Feb 1, 2021
Additional wildcards are used throughput the APIs to more flexibly
accept parameters. For example this allows a wider range of method
references to be used as load functions.

The generic types now match the Checker Framework's rules [1]. This
should improve usage of the cache apis in projects that run the checker.
This project does not and its implementation classes are not compliant
for the nullness checker.

[1] https://checkerframework.org/manual/#generics-instantiation
ben-manes added a commit that referenced this issue Feb 1, 2021
Additional wildcards are used throughput the APIs to more flexibly
accept parameters. For example this allows a wider range of method
references to be used as load functions.

The generic types now match the Checker Framework's rules [1]. This
should improve usage of the cache apis in projects that run the checker.
This project does not and its implementation classes are not compliant
for the nullness checker.

[1] https://checkerframework.org/manual/#generics-instantiation
@Avinm
Copy link

Avinm commented Aug 19, 2021

I see that LoadingCache::get still declares the return value as @Nullable and this causes static analysers to issue a warning when using NonNullable types like Optional (which is recommended for negative caching). Is there a recommendation/workaround for this in Caffeine 2.x? On comparing, I noticed that LoadingCache::get in Guava does not have the @Nullable annotation.

@ben-manes
Copy link
Owner

Guava will throw an InvalidCacheLoadException if a null value is returned by the loader. Caffeine mirrors Map.computeIfAbsent by allowing a null to indicate a data miss, which is treated as a soft failure rather than a hard error. We actually considered that for Guava, but Josh Bloch felt strongly to fail on a null response which I think in retrospect was a mistake. The source data might be absent and exception handling is a verbose mechanism for that.

There isn't a way in the annotations to handle that LoadingCache.get(key) returns null only if the type is, since Java lacks non-null types. The equivalent method Cache.get(key, mappingFunction) is marked with @PolyNull which should work how you want.

I don't plan to spend much time on 2.x's annotations, but I am open to any refinements for 3.x.

@Avinm
Copy link

Avinm commented Aug 19, 2021

I see, thanks for the explanation. Would it be possible to use @PolyNull in LoadingCache::get? I would love to use 3.x, but sadly I am constrained to jdk8 for now :/ But this issue seems to exist for LoadingCache::get even in 3.x.

@ben-manes
Copy link
Owner

My limited understanding is it won't, because it requires checking a method argument. In v3 we have,

@PolyNull V get(K key, Function<? super K, ? extends @PolyNull V> mappingFunction);

This is evaluated as if

@NonNull V get(K key, Function<? super K, ? extends @NonNull V> mappingFunction);
@Nullable V get(K key, Function<? super K, ? extends @Nullable V> mappingFunction);

but that requires pairing with an argument, which LoadingCache.get(key) lacks. Since that relies on associating to a different class from a runtime configuration, the analyzer can't know if it might return null or not. The Java type system isn't enough, and I don't know if there is an annotation that is like PolyNull but based on the type V instead. This is outside of my area, so maybe I missed another annotation that fits this.

@Avinm
Copy link

Avinm commented Aug 19, 2021

Makes sense. And I guess the other option of removing the @Nullable altogether for LoadingCache.get was already considered and rejected?

@ben-manes
Copy link
Owner

It wouldn't accurate, because you are allowed to return null from LoadingCache.get.

For example let's say that I wanted to cache users by their email for login. If the user mistypes their email then it will be a cache miss that returns null. In Guava this would throw an exception as if an error, whereas in Map.computeIfAbsent its a null response that does not create a mapping. If we wanted negative caching, both would recommend using Optional so that an explicit mapping is established. Since the call is querying for an absent row in the db, not being found is not an exceptional condition so Guava throwing a checked exception is quite heavy handed for a common scenario. That's somewhat mitigated by getUnchecked when you know the entry must be present and don't want to annoy the call-site code.

@Avinm
Copy link

Avinm commented Aug 20, 2021

Got it. It's just unfortunate I guess. Another solution might be to define a NonNullable getOrthrow method, but that might be an overkill.

@DavyLandman
Copy link
Author

DavyLandman commented Aug 20, 2021

I think CheckerFramework also uses PolyNull:

see: https://github.com/typetools/jdk/blob/master/src/java.base/share/classes/java/util/Map.java#L1036

    default @PolyNull V computeIfAbsent(K key,
            Function<? super K, ? extends @PolyNull V> mappingFunction) {

@Avinm
Copy link

Avinm commented Aug 20, 2021

@DavyLandman : Like Ben pointed out earlier, I think that requires the type parameter be "paired with an argument, which LoadingCache.get(key) lacks"

@DavyLandman
Copy link
Author

DavyLandman commented Aug 20, 2021

@Avinm ah, my bad. I would think that the type came from the class level type bound. But I might be missing something about the LoadingCache signature.

Ah wait, is it that the closures is passed in at an earlier point? Maybe that should influence the result of the V type. So the whole loading cache result type is PolyNull.

But I admit, I cannot oversee the implications of that.

@Avinm
Copy link

Avinm commented Aug 20, 2021

The LoadingCache has class level K and V type params:
public interface LoadingCache<K, V> extends Cache<K, V>

The nullability here depends on the CacheLoader supplied while building the LoadingCache:

LoadingCache<String, Optional<String>> optionalCache = Caffeine.newBuilder().build(this::toOptional);

Optional<String> toOptional(String input) // Should never return null since that would defeat the purpose of Optional
{
    ...
}

Now its usage would cause a (slightly) ugly warning:

optionalCache.get("someStr")
             .map(str -> doSomething(str) // Invoking ".map" on optionalCache.get(..) might throw NPE :(
             ...

Could Polynull annotation somehow be used here?

@ben-manes
Copy link
Owner

ben-manes commented Aug 20, 2021

minor correction, the type signatures in 3.x were updated for checker framework,

LoadingCache<K extends Object, V extends Object> extends Cache<K, V>

and the package-info.java with the defaults,

@DefaultQualifier(value = NonNull.class, locations = TypeUseLocation.FIELD)
@DefaultQualifier(value = NonNull.class, locations = TypeUseLocation.PARAMETER)
@DefaultQualifier(value = NonNull.class, locations = TypeUseLocation.RETURN)
package com.github.benmanes.caffeine.cache;

@Avinm is on 2.x, but I think we should try to focus on 3.x and maybe backport.

@ben-manes
Copy link
Owner

@vlsi The latest errorprone release claims that it is unnecessary to use extends Object for a non-null marker.

T extends Object is redundant when using normal (non-Checker Framework checked) code.

However, T extends Object compiles to the same bytecode as T when using vanilla javac. So, when using Checker on vanilla javac’s bytecode, T extends Object does not imply non-null bounds outside the same compilation unit.

Since we don't compile using CheckerFramework's javac, should this be stripped out?

@vlsi
Copy link
Contributor

vlsi commented Aug 5, 2022

Since we don't compile using CheckerFramework's javac, should this be stripped out?

Well, in theory, <T extends Object> would help consumers who use Caffeine and compile code with the Checker Framework.

@cpovirk
Copy link
Collaborator

cpovirk commented Aug 5, 2022

I think I may have suggested the Error Prone message, and it has a mistake in it. That said, <T extends Object> in Caffeine cannot help any users unless Caffeine itself is compiled with the Checker Framework. I can elaborate more, hopefully on Monday. Sorry for the unhelpful message.

@ben-manes
Copy link
Owner

I can elaborate more, hopefully on Monday. Sorry for the unhelpful message.

hey @cpovirk, just a reminder in case you forgot about this over the weekend. No worries otherwise, I know everyone is busy and if this is only about verbosity then we discuss whenever time allows.

@cpovirk
Copy link
Collaborator

cpovirk commented Aug 11, 2022

So much for Monday... :)

The main points are:

  • class A<T> and class A<T extends Object> compile to identical bytecode.
  • Bytecode is, of course, all that consumers of Caffeine see.

Here's a demo to show that the two produce identical bytecode:

$ tail -n +1 */A.java
==> just-t/A.java <==
class A<T> {}

==> t-extends-object/A.java <==
class A<T extends Object> {}

$ ( cd just-t && javac A.java ) && ( cd t-extends-object && javac A.java ) && diff -s just-t/A.class t-extends-object/A.class
Files just-t/A.class and t-extends-object/A.class are identical

Thus, there's no benefit to writing class A<T extends Object> (except as a hint to anyone who reads the Caffeine source code :)).

At that point, the question becomes: So how does the Checker Framework distinguish between <T> and <T extends Object>? And that question has a two-part answer:

First: When the Checker Framework is getting information about an API from source code, rather than bytecode, then it can distinguish between class A<T> and class A<T extends Object>.

But that doesn't explain everything: The Checker Framework does advise using class A<T extends Object> when building a library that will later be used in bytecode form by others. And this, too, can work. How?

The key point here is that the Checker Framework modifies what bytecode javac writes. I don't want to claim that I can give a fully precise description of what you have to do to make it do that, but I can get closer than I did in the Error Prone message: If you run the Checker Framework's nullness checker, then the Checker Framework will make javac write nullness annotations to the resulting bytecode.

That's technically enough to answer the original question, but I think it would be best for me to step back and talk about the Checker Framework's writing of nullness annotations more broadly:

Why is it useful to write the annotations? After all, if the Checker Framework knows that unannotated types are non-nullable, then it doesn't need to write annotations to tell its future self that :) I know of two reasons:

  1. Not all tools make the same assumptions as the Checker Framework. Kotlin, for example, treats unannotated types as having unknown ("platform"/"flexible") nullness. By writing @NonNull annotations, the Checker Framework tells tools like Kotlin "I checked this code, and I don't know any way for null to get in here."
  2. And finally... by writing nullness annotations, the Checker Framework can write different bytecode for <T> and <T extends Object>, letting tools (including later invocations of the Checker Framework itself!) distinguish between the two:
$ for F in just-t t-extends-object; do ( cd $F && ~/checker-framework-3.24.0/checker/bin/javac -processor nullness A.java ); done; diff <(javap -cp just-t -v A) <(javap -cp t-extends-object -v A) | tail
@@ -52,7 +51,7 @@
     org.checkerframework.checker.nullness.qual.NonNull
   3: #15(): CLASS_TYPE_PARAMETER, param_index=0
     org.checkerframework.checker.initialization.qual.Initialized
-  4: #16(): CLASS_TYPE_PARAMETER_BOUND, param_index=0, bound_index=0
-    org.checkerframework.checker.nullness.qual.Nullable
+  4: #14(): CLASS_TYPE_PARAMETER_BOUND, param_index=0, bound_index=0
+    org.checkerframework.checker.nullness.qual.NonNull
   5: #15(): CLASS_TYPE_PARAMETER_BOUND, param_index=0, bound_index=0
     org.checkerframework.checker.initialization.qual.Initialized

So, if you were to build your Caffeine release with a javac invocation that runs the Checker Framework, then <T extends Object> would be different to your users than <T>.

Until then, you want <T extends @NonNull Object>⁠—or you want to just not worry about annotating type-parameter bounds yet :)

@vlsi
Copy link
Contributor

vlsi commented Aug 11, 2022

<T extends @NonNull Object> causes invalid bytecode in certain javac versions: smallrye/jandex#92 (comment), so I ended up removing extends @NonNull Object in pgjdbc: pgjdbc/pgjdbc#2010

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants