-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix query failure race to preserve original failure cause #20582
Conversation
Prep for adding more parameterization
e523ac9
to
9a9bff2
Compare
Added a unit test. |
cancellingThread.get(10, SECONDS); | ||
anotherThread.get(10, SECONDS); | ||
|
||
// TODO queryStateMachine.getFinalQueryInfo() does not exist for cancelled queries, but may be created by anotherThread due to a race |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason transitionToFailed and transitionToCanceled have different code paths and only first sets finalQueryInfo
. that's probably a miss in 8a89c29?
cc @JamesRTaylor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in another commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if #20231 is related
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good question. don't know
9a9bff2
to
a64aeb2
Compare
@@ -1123,12 +1133,16 @@ public boolean transitionToFailed(Throwable throwable) | |||
|
|||
QueryState oldState = queryState.trySet(FAILED); | |||
if (oldState.isDone()) { | |||
QUERY_STATE_LOG.debug(throwable, "Failure after query %s finished", queryId); | |||
if (log) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: separate commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it not a separate commit already?
cancellingThread.get(10, SECONDS); | ||
anotherThread.get(10, SECONDS); | ||
|
||
// TODO queryStateMachine.getFinalQueryInfo() does not exist for cancelled queries, but may be created by anotherThread due to a race |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if #20231 is related
|
||
boolean canceled = queryState.setIf(FAILED, currentState -> !currentState.isDone()); | ||
if (canceled) { | ||
session.getTransactionId().flatMap(transactionManager::getTransactionInfoIfExist).ifPresent(transaction -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where it's done now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the common method called from here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unless i missed something, please check
mind automation |
Before the change there was a race around `QueryStateMachine.transitionToFailed`: - Thread A: failure occurs, calls `transitionToFailed` with a specific exception (the original failure cause) - Thread A: calls `cleanupQueryQuietly`, which calls listeners, which e.g. unregisters a query from `LanguageFunctionManager` - Thread B is within `StatementAnalyzer` and calls `LanguageFunctionManager.getQueryFunctions`, which fails with `IllegalStateException` - Thread B catches the exception and fails the query; calls `transitionToFailed` - Thread B sets `failureCause` to `IllegalStateException` caught above - Thread A invokes `failureCause.compareAndSet` and sees non-empty reference; the original query failure cause is lost
Before commit 8a89c29, the only difference between `transitionToCanceled` and `transitionToFailed` seemed to be logging the failure thru QUERY_STATE_LOG. After that commit, one of those methods sets `finalQueryInfo` to ensure it's set. Unify the handling to ensure `finalQueryInfo` is set for `transitionToCanceled` as well.
3b74db0
to
9257118
Compare
prev CI failed #19799 |
Before the change there was a race around
QueryStateMachine.transitionToFailed
:transitionToFailed
with a specificexception (the original failure cause)
cleanupQueryQuietly
, which calls listeners, whiche.g. unregisters a query from
LanguageFunctionManager
StatementAnalyzer
and callsLanguageFunctionManager.getQueryFunctions
, which fails withIllegalStateException
transitionToFailed
failureCause
toIllegalStateException
caught abovefailureCause.compareAndSet
and sees non-emptyreference; the original query failure cause is lost
Fixes #20551