-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix OrderByOperator.finish to wait for finishMemoryRevoke #16431
Fix OrderByOperator.finish to wait for finishMemoryRevoke #16431
Conversation
@@ -92,6 +93,7 @@ | |||
private final IntArrayList positionCounts; | |||
private final boolean eagerCompact; | |||
|
|||
private int modificationCount; // may overflow, doesn't matter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
volatile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think PagesIndex is supposed to be thread safe?
cc @pettyjamesm |
The `Operator.startMemoryRevoke` API promises that no processing methods are called before memory revoking is done (`finishMemoryRevoke`). Since 8c9df82, 7d11573 and cf50577 `finish` is the only processing method that can be called between `startMemoryRevoke` and `finishMemoryRevoke`. `OrderByOperator` had a concurrency-like bug because it did not handle well the case when invocation order is `startMemoryRevoke`, spill completes, `finish`, `finishMemoryRevoke`. During `finish` the operator gets an iterator over `PagesIndex` which becomes invalidated when `finishMemoryRevoke` clears the `PagesIndex`. The commit fixes that by adding explicit wait for `finishMemoryRevoke` inside `OrderByOperator.finish`. The check is same as in `HashBuilderOperator`. For consistency with the other operator the `finishMemoryRevoke` field becomes `Optional`.
When `PagesIndex` is mutated, the iteration behavior is not well defined. Similarly to JDK collections, fail iteration with `ConcurrentModificationException` when underlying index has changed. Unlike the JDK collections, the check is done only at the end (not fail-fast), to minimize performance impact. Before recent fixes, `OrderByOperator` would create an iterator over an index and then clear it. This was not intentional and led to a regression when `PagesIndex` iteration code was refactored. The check introduced here would catch that situation explicitly.
ab69353
to
45989e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is test possible?
the change is nicely tested with |
CI #16437 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, although +1 to the part about adding a comment and one question about returning early vs finishing revoking and proceeding.
The
Operator.startMemoryRevoke
API promises that no processing methodsare called before memory revoking is done (
finishMemoryRevoke
). Since8c9df82, 7d11573 and cf50577
finish
is the only processingmethod that can be called between
startMemoryRevoke
andfinishMemoryRevoke
.OrderByOperator
had a concurrency-like bug because it did not handlewell the case when invocation order is
startMemoryRevoke
, spillcompletes,
finish
,finishMemoryRevoke
. Duringfinish
the operatorgets an iterator over
PagesIndex
which becomes invalidated whenfinishMemoryRevoke
clears thePagesIndex
.The commit fixes that by adding explicit wait for
finishMemoryRevoke
inside
OrderByOperator.finish
. The check is same as inHashBuilderOperator
. For consistency with the other operator thefinishMemoryRevoke
field becomesOptional
.Fixes #16406