changefeedccl: Changefeeds fail due to Replica.Send error #73016

Closed
miretskiy opened this issue Nov 20, 2021 · 1 comment · Fixed by #90810
Labels: C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.), T-cdc

Comments

Contributor

miretskiy commented Nov 20, 2021

A customer reported changefeeds terminating with the error "aborted during Replica.Send: context deadline exceeded".
This shouldn't happen. It's not clear exactly where this error originates, but we ought to handle it
the way we handle any other network-related error.

Epic CRDB-11783

Jira issue: CRDB-11384

miretskiy added the C-bug and T-cdc labels Nov 20, 2021

blathers-crl bot commented Nov 20, 2021

cc @cockroachdb/cdc

miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Oct 27, 2022
Prior to this PR, changefeeds relied on a whitelist
approach to determine which errors were retryable.
All other errors were deemed terminal, causing the changefeed
to fail.

The above approach is brittle and causes unwanted
changefeed termination.

This PR changes this approach to treat all errors as retryable,
unless otherwise indicated.  Errors that are known by the changefeed
to be fatal are handled explicitly by marking them
as terminal.  For example, a changefeed exits
if the targeted table is dropped.  On the other hand, an inability
to read this table for any reason is not treated as
terminal.

Fixes cockroachdb#90320
Fixes cockroachdb#77549
Fixes cockroachdb#63317
Fixes cockroachdb#71341
Fixes cockroachdb#73016
Informs CRDB-6788

Release note (enterprise change): Changefeed will now treat
all errors, unless otherwise indicated, as retryable errors.
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 2, 2022
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Nov 3, 2022
HonoreDB pushed a commit to HonoreDB/cockroach that referenced this issue Nov 4, 2022
craig bot pushed a commit that referenced this issue Nov 6, 2022
90810: changefeedccl: Rework error handling r=miretskiy a=miretskiy

Prior to this PR, changefeeds relied on a whitelist
approach to determine which errors were retryable.
All other errors were deemed terminal, causing the changefeed
to fail.

The above approach is brittle and causes unwanted
changefeed termination.

This PR changes this approach to treat all errors as retryable,
unless otherwise indicated.  Errors that are known by the changefeed
to be fatal are handled explicitly by marking them
as terminal.  For example, a changefeed exits
if the targeted table is dropped.  On the other hand, an inability
to read this table for any reason is not treated as
terminal.

Fixes #90320
Fixes #77549
Fixes #63317
Fixes #71341
Fixes #73016
Informs CRDB-6788
Informs CRDB-7581

Release Note (enterprise change): Changefeed will now treat
all errors, unless otherwise indicated, as retryable errors.


Co-authored-by: Yevgeniy Miretskiy <[email protected]>
@craig craig bot closed this as completed in 86fffa9 Nov 6, 2022
miretskiy pushed a commit to miretskiy/cockroach that referenced this issue Dec 6, 2022