Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changefeedccl: Test demonstrating changefeed inhibiting node shutdown. #82767

Closed
wants to merge 3 commits into from

Conversation

miretskiy
Copy link
Contributor

Full fix tracked in #82765

This test demonstrates that running changfeeds inhibit node shutdown.
The issue is on the dist flow side. This is a temporary fix.
The test itself takes more than 1 minute to run. This is the reason
why it's being pushed as a proof of concept PR.
However, it does demonstrate the issue; and that the context
cancellation upon node shutdown added in change aggregator fixes the
issue.

Once we determine the cause of this test taking so much time, then
it can also be added.

Release Notes: None

Yevgeniy Miretskiy added 3 commits June 10, 2022 18:57
Treat NodeUnavailable error as a retryable changefeed error.

NodeUnavaiable error may be returned by changefeed processors if the
node is being shutdown/drained.  However, this error return is racy.
Sometimes, the coordinator would see "rpc context cancellation" error
instead -- in those cases it would treat the error as retryable.
However, sometimes it is possible to get this error propagated
(for example: when server shutdown races with starting up kv feed,
which uses Stopper, whcih may return NodeUnavailable error if it's
being shutdown).

Release Notes: None
Informs cockroachdb#70433
Permanet fix is being tracked by the above issue.
This is a temporary fix to ensure that change aggregators
cancel their context when the server quiesces so that the
server shutdown is not inhibited by the running changefeeds.
Full fix tracked in cockroachdb#82765

This test demonstrates that running changfeeds inhibit node shutdown.
The issue is on the dist flow side.  This is a temporary fix.
The test itself takes more than 1 minute to run.  This is the reason
why it's being pushed as a proof of concept PR.
However, it does demonstrate the issue; and that the context
cancellation upon node shutdown added in change aggregator fixes the
issue.

Once we determine the cause of this test taking so much time, then
it can also be added.

Release Notes: None
@cockroach-teamcity
Copy link
Member

This change is Reviewable

miretskiy pushed a commit to miretskiy/cockroach that referenced this pull request Jun 10, 2022
Informs cockroachdb#82765

Permanet fix is being tracked by the above issue.
This is a temporary fix to ensure that change aggregators
cancel their context when the server quiesces so that the
server shutdown is not inhibited by the running changefeeds.

The test for this functionality is not being merged due
to the fact that it takes too long to run; however, the test
can be seen here: cockroachdb#82767

Release Notes (enterprise change): Ensure running changefeeds do
not inhibit node shutdown.
miretskiy pushed a commit to miretskiy/cockroach that referenced this pull request Jun 13, 2022
Informs cockroachdb#82765

Permanet fix is being tracked by the above issue.
This is a temporary fix to ensure that change aggregators
cancel their context when the server quiesces so that the
server shutdown is not inhibited by the running changefeeds.

The test for this functionality is not being merged due
to the fact that it takes too long to run; however, the test
can be seen here: cockroachdb#82767

Release Notes (bug fix): Ensure running changefeeds do
not inhibit node shutdown.
miretskiy pushed a commit to miretskiy/cockroach that referenced this pull request Jun 13, 2022
Informs cockroachdb#82765

Permanet fix is being tracked by the above issue.
This is a temporary fix to ensure that change aggregators
cancel their context when the server quiesces so that the
server shutdown is not inhibited by the running changefeeds.

The test for this functionality is not being merged due
to the fact that it takes too long to run; however, the test
can be seen here: cockroachdb#82767

Release Notes (bug fix): Ensure running changefeeds do
not inhibit node shutdown.
craig bot pushed a commit that referenced this pull request Jun 14, 2022
82489: sql: disallow adding column as primary key r=jasonmchan a=jasonmchan

Previously, the behavior of ALTER TABLE ... ADD COLUMN ... PRIMARY KEY
was undefined. With the legacy schema changer, this statement would
appear to succeed, but it would break the new schema changer. This
commit changes this by returning an explicit error.

In the future, we should support this statement if the table's primary
key was originally the default rowid primary key (see #82735).

Fixes #82489

Release note: None

82768: changefeedccl: Do not inhibit node shutdown r=miretskiy a=miretskiy

See individual commits for details:
  * Ensure running changefeed do not inhibit node shutdown (Informs #82765)
  * Treat node unavailable error as retryable.

Test not being merged as part of this PR -- see #82767

Release Notes (bug fix): Ensure running changefeeds do
not inhibit node shutdown.
Release Notes (bug fix): Treat node unavailable error as a retryable changefeed error.

Co-authored-by: Jason Chan <[email protected]>
Co-authored-by: Yevgeniy Miretskiy <[email protected]>
miretskiy pushed a commit to miretskiy/cockroach that referenced this pull request Jun 14, 2022
Informs cockroachdb#82765

Permanet fix is being tracked by the above issue.
This is a temporary fix to ensure that change aggregators
cancel their context when the server quiesces so that the
server shutdown is not inhibited by the running changefeeds.

The test for this functionality is not being merged due
to the fact that it takes too long to run; however, the test
can be seen here: cockroachdb#82767

Release Notes (bug fix): Ensure running changefeeds do
not inhibit node shutdown.
miretskiy pushed a commit that referenced this pull request Jul 5, 2022
Informs #82765

Permanet fix is being tracked by the above issue.
This is a temporary fix to ensure that change aggregators
cancel their context when the server quiesces so that the
server shutdown is not inhibited by the running changefeeds.

The test for this functionality is not being merged due
to the fact that it takes too long to run; however, the test
can be seen here: #82767

Release Notes (bug fix): Ensure running changefeeds do
not inhibit node shutdown.
@miretskiy miretskiy closed this Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants