sql: fix a possible race between Flow.Cleanup and Flow.Cancel #95522

Merged
merged 2 commits into from
Jan 20, 2023

Conversation

yuzefovich

@yuzefovich yuzefovich commented Jan 19, 2023

This commit fixes a possible race that could occur when Flow.Cleanup
is called by the main goroutine concurrently with Flow.Cancel by the
listener goroutine (which is not allowed). We already had
synchronization in place, but it was insufficient. In particular, the
following scenario could lead to a nil pointer crash:

  • the listener checks whether Cleanup has been called, it hasn't, and
    the mutex is unlocked;
  • the listener is preempted;
  • the main goroutine proceeds to perform the Cleanup. At the very end
    the flow object is unset (including ctxCancel overwritten to nil);
  • the listener resumes its execution, proceeds to call Cancel on the
    already-unset Flow object, leading to a nil pointer dereference on
    the ctxCancel call.

This is now fixed by holding the mutex through the call to Cancel in
the listener goroutine, ensuring that the Flow object is not unset
from under the listener. Additionally, this commit clarifies the
callbacks that are performed at the very beginning and very end of the
Cleanup method.
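
For readers without the diff at hand, the pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual CockroachDB code: the `flow` type, its `cleanedUp` flag, and the `cancelIfNotCleanedUp` and `cleanup` helpers are hypothetical names, and the assumption that Cleanup marks itself under the mutex before tearing down is mine.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// flow is a hypothetical stand-in for the Flow object (not the real type).
type flow struct {
	mu struct {
		sync.Mutex
		// cleanedUp is set once cleanup has started (assumed design).
		cleanedUp bool
	}
	ctxCancel context.CancelFunc
}

func newFlow() (*flow, context.Context) {
	ctx, cancel := context.WithCancel(context.Background())
	return &flow{ctxCancel: cancel}, ctx
}

// cancelIfNotCleanedUp is what the listener goroutine would call. The mutex
// is held through the ctxCancel call, so cleanup cannot unset the flow
// between the check and the cancellation. It reports whether cancel ran.
func (f *flow) cancelIfNotCleanedUp() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.mu.cleanedUp {
		return false
	}
	f.ctxCancel()
	return true
}

// cleanup is what the main goroutine would call. It marks the flow as
// cleaned up under the mutex first, and only unsets the flow at the very end.
func (f *flow) cleanup() {
	f.mu.Lock()
	f.mu.cleanedUp = true
	f.mu.Unlock()
	// ... release resources here ...
	f.ctxCancel = nil // unset the flow object at the very end
}

func main() {
	f, ctx := newFlow()
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); f.cancelIfNotCleanedUp() }()
	go func() { defer wg.Done(); f.cleanup() }()
	wg.Wait()
	// Whether the context ends up canceled depends on which goroutine won.
	fmt.Println("no panic; canceled:", ctx.Err() != nil)
}
```

In the buggy variant, the listener would unlock the mutex right after the check and call `ctxCancel` afterwards, which leaves a window for `cleanup` to set it to nil in between.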

Fixes: #95527.

Release note: None

@yuzefovich yuzefovich requested review from andreimatei, msirek and a team January 19, 2023 17:56

@cockroach-teamcity

This change is Reviewable

@yuzefovich yuzefovich force-pushed the setup-flow-follow-up branch from 5f338a4 to d525206 Compare January 19, 2023 20:07
@yuzefovich yuzefovich changed the title from "sql: address some feedback on a recent change" to "sql: fix a possible race between Flow.Cleanup and Flow.Cancel" Jan 19, 2023
@yuzefovich yuzefovich force-pushed the setup-flow-follow-up branch from d525206 to 23da047 Compare January 19, 2023 20:14
@yuzefovich

I added a second commit that fixes another (less frequent) race. RFAL.


@msirek msirek left a comment


Very nice!
:lgtm:

Reviewed 1 of 1 files at r1, 5 of 5 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei)


@andreimatei andreimatei left a comment


:lgtm:

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @yuzefovich)

@yuzefovich

TFTRs!

bors r+

@craig

craig bot commented Jan 20, 2023

Build succeeded:

@craig craig bot merged commit 0775fcc into cockroachdb:master Jan 20, 2023
@yuzefovich yuzefovich deleted the setup-flow-follow-up branch January 20, 2023 00:35
Development

Successfully merging this pull request may close these issues.

pkg/sql/logictest/tests/fakedist-disk/fakedist-disk_test: TestLogic_datetime failed
4 participants