Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce FluxFlatMapUsageCheck #26

Merged
merged 8 commits into from
Jan 15, 2022

Conversation

rickie
Copy link
Member

@rickie rickie commented Dec 30, 2021

There are two problems with Flux#flatMap:

  1. It provides unbounded parallelism.
  2. It is not guaranteed to be sequential.

Neither of these behaviors is obvious, which is problematic as this method looks a lot like the simpler Java Stream#flatMap. People coming from the non-reactive world would are much more likely to actually be looking for concatMap or flatMapSequential. People looking for particular behavior when it comes to parallelism are likely to be looking for one of the flatMap overloads which define the maximum number of threads.

For these reasons, we want to ban the non-overloaded Flux#flatMap.

Copy link
Member

@werli werli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice seeing the unbounded Flux#flatMap getting attention 🚀

Comment on lines 28 to 30
explanation =
"The problem with `Flux#flatMap` is that it is not clear that it provides unbounded parallelism and is"
+ " not guaranteed to be sequential. Therefore, we disallow the use of the non-overloaded `Flux#flatMap`.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shorter:

Suggested change
explanation =
"The problem with `Flux#flatMap` is that it is not clear that it provides unbounded parallelism and is"
+ " not guaranteed to be sequential. Therefore, we disallow the use of the non-overloaded `Flux#flatMap`.",
explanation =
"`Flux#flatMap` provides unbounded parallelism and is not guaranteed to be sequential. " +
"Therefore, we disallow the use of the non-overloaded `Flux#flatMap`.",

Comment on lines 37 to 41
private static final Matcher<ExpressionTree> FLUX_FLATMAP =
instanceMethod()
.onExactClass("reactor.core.publisher.Flux")
.named("flatMap")
.withParameters("java.util.function.Function");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not the following? 🙂

Suggested change
private static final Matcher<ExpressionTree> FLUX_FLATMAP =
instanceMethod()
.onExactClass("reactor.core.publisher.Flux")
.named("flatMap")
.withParameters("java.util.function.Function");
private static final Matcher<ExpressionTree> FLUX_FLATMAP =
instanceMethod()
.onExactClass(Flux.class.getName())
.named("flatMap")
.withParameters(Function.class.getName());

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This way we have to add the imports, which in this case is not a problem. Sometimes it can be prevent us from having to specify a dependency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, reverting the Flux.class.getName() one, because indeed; the RefasterResourceCompiler says ClassNotFoundException. (Although I'm not sure why this occurs because of the RRC?) @Stephan202.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens because io.projectreactor:reactor-core is declared as a provided dependency. @werli's suggestion would require us to move the dependency to the compile scope, which would cause it to be pulled in everywhere that error-prone-contrib is used. If we'd do that for all dependencies for which we have one or more BugPatterns, then this checker would become a very "heavy" dependency to add to a project.

(Replacing provided with test is also not possible because there are Refaster templates that also depend on io.projectreactor:reactor-core.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thanks for the explanation @Stephan202 👍

Comment on lines 53 to 72
.addFix(
SuggestedFix.builder()
.replace(tree, Util.treeToString(tree, state).replace("flatMap", "concatMap"))
.build())
.addFix(
SuggestedFix.builder()
.replace(
parentExpression,
getReplacementWithConcurrencyArgument(parentExpression, state))
.build())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, I've seen ~two purposeful usages of Flux#flatMap(Function, int) in all this time.
Shouldn't we simply automatically apply the flatMap -> concatMap fix? Everyone using Flux#flatMap(Function, int) looks to know what they're doing :)

Makes me think then: If we apply such a "trivial" fix, wouldn't this qualify as a Refaster rule then? 👀

Copy link
Member Author

@rickie rickie Dec 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice thoughts 😄 . The reason for making it a BugPattern is that we want to have the two suggestions. However, when we run patch, the first suggestion will always be used, which means that we'll always replace flatmap with concatMap.

The compiler error will now look like "Did you mean <FIX 1> or <FIX 2>?". So indeed, the first one will be most used, but we still wanted to give the Flux#flatMap(Function, int) as suggestion.

Copy link
Member

@werli werli Dec 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense 👍 Thx!

@rickie rickie force-pushed the rossendrijver/disallow_flux_flatmap branch from 27b57a5 to e89e5db Compare December 31, 2021 10:46
Copy link
Member

@Stephan202 Stephan202 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a commit. Maybe we should also flag the unary Flux#flatMapSequential?

Comment on lines 37 to 41
private static final Matcher<ExpressionTree> FLUX_FLATMAP =
instanceMethod()
.onExactClass("reactor.core.publisher.Flux")
.named("flatMap")
.withParameters("java.util.function.Function");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens because io.projectreactor:reactor-core is declared as a provided dependency. @werli's suggestion would require us to move the dependency to the compile scope, which would cause it to be pulled in everywhere that error-prone-contrib is used. If we'd do that for all dependencies for which we have one or more BugPatterns, then this checker would become a very "heavy" dependency to add to a project.

(Replacing provided with test is also not possible because there are Refaster templates that also depend on io.projectreactor:reactor-core.)

Comment on lines 53 to 56
.addFix(
SuggestedFix.builder()
.replace(tree, Util.treeToString(tree, state).replace("flatMap", "concatMap"))
.build())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that tree.getExpression() doesn't contain a reference to e.g. Mono.flatMap.

A more robust solution is to match on MethodInvocationTrees rather than MemberSelectTree and then use SuggestedFixes.renameMethodInvocation:

Suggested change
.addFix(
SuggestedFix.builder()
.replace(tree, Util.treeToString(tree, state).replace("flatMap", "concatMap"))
.build())
.addFix(SuggestedFixes.renameMethodInvocation(tree, "concatMap", state))

Comment on lines 59 to 61
.replace(
parentExpression,
getReplacementWithConcurrencyArgument(parentExpression, state))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach relies on getReplacementWithConcurrencyArgument below, but if we match on MethodInvocationTrees as suggested above we can express the addition of an extra argument more straightforwardly:

Suggested change
.replace(
parentExpression,
getReplacementWithConcurrencyArgument(parentExpression, state))
.postfixWith(
Iterables.getOnlyElement(tree.getArguments()), ", " + NAME_CONCURRENCY_ARGUMENT)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that postfixWith is really useful, good one!

tags = StandardTags.LIKELY_ERROR)
public final class FluxFlatMapUsageCheck extends BugChecker implements MemberSelectTreeMatcher {
private static final long serialVersionUID = 1L;
private static final String NAME_CONCURRENCY_ARGUMENT = "MAX_CONCURRENCY";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static final String NAME_CONCURRENCY_ARGUMENT = "MAX_CONCURRENCY";
private static final String MAX_CONCURRENCY_ARG_NAME = "MAX_CONCURRENCY";

import java.util.function.Function;
import reactor.core.publisher.Flux;

/** A {@link BugChecker} which flags usages of {@link Flux#flatMap(Function)}s. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/** A {@link BugChecker} which flags usages of {@link Flux#flatMap(Function)}s. */
/** A {@link BugChecker} which flags usages of {@link Flux#flatMap(Function)}. */

"import reactor.core.publisher.Flux;",
"",
"class A {",
" void positive() {",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I like the positive/negative split, we currently don't do this most other tests, instead relying on grouping with and without // BUG: Diagnostic contains:. The alternative can make it a bit easier to see that all permutations are tested. For now I'd stick with the status quo.

" Flux.just(1).concatMap(Flux::just);",
" Flux.just(1).flatMap(Flux::just, 1);",
" Flux.just(1).flatMap(Flux::just, 1, 1);",
" Flux.just(1).flatMap(Flux::just, throwable -> Flux.empty(), Flux::empty);",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overload has the same implicit reliance on the default concurrency level as the unary variant. However, there is no more elaborate overload we can refer to. So perhaps we should just call out this observation in the main code, without further flagging this method.

@BugPattern(
name = "FluxFlatMapUsage",
summary =
"`Flux#flatMap` is not allowed, please use `Flux#concatMap` or specify an argument for the concurrency.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Not allowed" is a bit too strong. How about:

Suggested change
"`Flux#flatMap` is not allowed, please use `Flux#concatMap` or specify an argument for the concurrency.",
"`Flux#flatMap` has subtle semantics; please use `Flux#concatMap` or explicitly specify the desired amount of concurrency",

(Other summaries currently don't end with a dot.)

Comment on lines 29 to 30
"`Flux#flatMap` provides unbounded parallelism and is not guaranteed to be sequential. "
+ "Therefore, we disallow the use of the non-overloaded `Flux#flatMap`.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parallelism isn't unbounded (it's Queues.SMALL_BUFFER_SIZE by default). Since currently we don't generate documentation, let's move this explanation to the class Javadoc, where we don't need to deal with string concatenation.

(We should of course look into generating a website with documentation; later 🙃.)

String parentString = Util.treeToString(parentExpression, state);
return String.format(
"%s, %s)",
parentString.substring(0, parentString.lastIndexOf(')')), NAME_CONCURRENCY_ARGUMENT);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case the developer chooses a concurrency value of 1 then the result is equivalent to a concatMap; let's add a Refaster template for that.

@rickie
Copy link
Member Author

rickie commented Jan 2, 2022

Maybe we should also flag the unary Flux#flatMapSequential?

Based on the implementation of flatMapSequential, that might be a good idea. I'm curious why it hasn't been flagged as "potentially dangerous" yet? I think I saw it listed as alternative to flatMap because the naming was a lot clearer?

Btw: nice improvements!! 🚀

@Stephan202
Copy link
Member

I'm curious why it hasn't been flagged as "potentially dangerous" yet? I think I saw it listed as alternative to flatMap because the naming was a lot clearer?

I guess because flatMapSequential is uses less, and has one less caveat (because ordering is consistent).

@rickie rickie force-pushed the rossendrijver/disallow_flux_flatmap branch 2 times, most recently from 57d559e to edf932c Compare January 2, 2022 15:16
@@ -40,7 +40,7 @@
@BugPattern(
name = "FluxFlatMapUsage",
summary =
"`Flux#flatMap` has subtle semantics; please use `Flux#concatMap` or explicitly specify the desired amount of concurrency",
"`Flux#flatMap` and `Flux#flatMapSequential` have subtle semantics; please use `Flux#concatMap` or explicitly specify the desired amount of concurrency",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or: Flux#flatMap{,Sequential} ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, but let's split the line.

* A {@link BugChecker} which flags usages of {@link Flux#flatMap(Function)} and {@link
* Flux#flatMapSequential(Function)}.
*
* <p>{@link Flux#flatMap(Function)} and {@link Flux#flatMapSequential(Function)} eagerly perform up
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not entirely sure about this first two sentences, are they still correct 😬 .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite; will push something :)

Copy link

@EnricSala EnricSala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool 👍 Will be nice not having to worry about this one anymore, when will it be released? 😄

"import reactor.core.publisher.Flux;",
"",
"class A {",
" private static final int MAX_CONCURRENCY = 8;",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking, is this constant necessary in the before case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. The suggested fix does not introduce this constant, so strictly speaking it yields non-compilable code, which refactoringTestHelper doesn't like. So we introduce this constant to work around that.

(In theory we could update the code to suggest a MAX_CONCURRENCY constant, but that's not worth the hassle.)

Copy link
Member

@Stephan202 Stephan202 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one more commit :)

Suggested commit message:

Introduce `FluxFlatMapUsageCheck` (#26)

@werli @nathankooij anything (else) from your side?

when will it be released?

I estimate we're ~1-2 months away from officially enabling Error Prone Support in PSM... 🤞

* A {@link BugChecker} which flags usages of {@link Flux#flatMap(Function)} and {@link
* Flux#flatMapSequential(Function)}.
*
* <p>{@link Flux#flatMap(Function)} and {@link Flux#flatMapSequential(Function)} eagerly perform up
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite; will push something :)

@@ -40,7 +40,7 @@
@BugPattern(
name = "FluxFlatMapUsage",
summary =
"`Flux#flatMap` has subtle semantics; please use `Flux#concatMap` or explicitly specify the desired amount of concurrency",
"`Flux#flatMap` and `Flux#flatMapSequential` have subtle semantics; please use `Flux#concatMap` or explicitly specify the desired amount of concurrency",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, but let's split the line.

"import reactor.core.publisher.Flux;",
"",
"class A {",
" private static final int MAX_CONCURRENCY = 8;",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. The suggested fix does not introduce this constant, so strictly speaking it yields non-compilable code, which refactoringTestHelper doesn't like. So we introduce this constant to work around that.

(In theory we could update the code to suggest a MAX_CONCURRENCY constant, but that's not worth the hassle.)

Copy link
Member

@werli werli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing more from my side 👍 Nice to see flatMapSequential also being covered 🚀

Copy link
Contributor

@nathankooij nathankooij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question but already approving :) Nice work!

* <p>{@link Flux#flatMap(Function)} and {@link Flux#flatMapSequential(Function)} eagerly perform up
* to {@link reactor.util.concurrent.Queues#SMALL_BUFFER_SIZE} subscriptions. Additionally, the
* former interleaves values as they are emitted, yielding nondeterministic results. In most cases
* {@link Flux#concatMap(Function)} should be preferred, as it produces consistent results and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we then also not include {concat,flat}MapIterable for that reason? Or is this one problematic because there's no concurrency overload?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that unlike #flatMap(Function) and #concatMap(Function), with Iterable there is no notion of eager vs lazy inner subscription. The content of the Iterables are all played sequentially. Thus flatMapIterable and concatMapIterable are equivalent offered as a discoverability improvement for users that explore the API with the concat vs flatMap expectation.

Good one, I think it would make sense to just pick one (concat?) to keep usages uniform 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So IIUC:

  • {concat,flat}MapIterable do not suffer from the issue with eager subscription.
  • {concat,flat}MapIterable both emit values in a deterministic order.

Based on this I guess all we need is a Refaster check to replace flatMapIterable with concatMapIterable. I guess strictly speaking that's out of scope for this PR, but let's just add it for completeness 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a commit 😄 !

@Stephan202 Stephan202 force-pushed the rossendrijver/disallow_flux_flatmap branch from 96af19b to f68492e Compare January 15, 2022 11:32
Copy link
Member

@Stephan202 Stephan202 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased and added a small commit.

Comment on lines 160 to 164
* Prefer {@link Flux#concatMapIterable(Function)} over {@link Flux#concatMapIterable(Function)}
* to be consistent with {@link FluxFlatMapUsageCheck}.
*
* <p>NB: Both implementations emit values in a deterministic order and there is no difference
* with eager or lazy inner subscriptions. This means that both implementations are *equivalent*.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can express this more concisely:

Suggested change
* Prefer {@link Flux#concatMapIterable(Function)} over {@link Flux#concatMapIterable(Function)}
* to be consistent with {@link FluxFlatMapUsageCheck}.
*
* <p>NB: Both implementations emit values in a deterministic order and there is no difference
* with eager or lazy inner subscriptions. This means that both implementations are *equivalent*.
* Prefer {@link Flux#concatMapIterable(Function)} over {@link Flux#concatMapIterable(Function)},
* as the former has equivalent semantics but a clearer name.

@Stephan202 Stephan202 merged commit 63b4fae into master Jan 15, 2022
@Stephan202 Stephan202 deleted the rossendrijver/disallow_flux_flatmap branch January 15, 2022 11:39
@Stephan202 Stephan202 added this to the 0.1.0 milestone Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants