-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
changefeedcc: Alter changefeed w/ Resolved has murky semantics. #84102
Comments
miretskiy
added
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
A-cdc
Change Data Capture
T-cdc
labels
Jul 8, 2022
cc @cockroachdb/cdc |
miretskiy
pushed a commit
to miretskiy/cockroach
that referenced
this issue
Jul 8, 2022
Address multiple source of flakes in changefeed tests. cockroachdb#83530 made a change to ensure that changefeed do not fail when they are in the transient (e.g. pause-requested) state. Unfortunately, the PR made a mistake where even if the checkpoint could not be completed because the cangefeed is in the "pause requested" state, we would still proceed to emit resolved event. This is wrong, and the resolved event should never be emitted if we failed to checkpoint. In addition, alter changefeed can be used to add new tables to existing changefeed, with initial scan. In such cases, the newly added table will emit events as of the timestamp of "alter changefeed statement". When this happens, the semantics around resolved events are murky as document in cockroachdb#84102 Address this issue by making cloud storage sink more permissive around it's handling of resolved timestamp. When completing initial scan for newly added tables, fix an "off by 1" error when frontier was advanced to the next timestamp. This was wrong since cockroachdb#82451 clarified that the rangefeed start time is exclusive. Informs cockroachdb#83882 Fixes cockroachdb#83946 Release Notes: None
craig bot
pushed a commit
that referenced
this issue
Jul 9, 2022
84109: changefeedcc: De-flake changefeed tests. r=miretskiy a=miretskiy Address multiple source of flakes in changefeed tests. #83530 made a change to ensure that changefeed do not fail when they are in the transient (e.g. pause-requested) state. Unfortunately, the PR made a mistake where even if the checkpoint could not be completed because the cangefeed is in the "pause requested" state, we would still proceed to emit resolved event. This is wrong, and the resolved event should never be emitted if we failed to checkpoint. In addition, alter changefeed can be used to add new tables to existing changefeed, with initial scan. In such cases, the newly added table will emit events as of the timestamp of "alter changefeed statement". When this happens, the semantics around resolved events are murky as documented in #84102 Address this issue by making cloud storage sink more permissive around its handling of resolved timestamp. When completing initial scan for newly added tables, fix an "off by 1" error when frontier was advanced to the next timestamp. This was wrong since #82451 clarified that the rangefeed start time is exclusive. Informs #83882 Fixes #83946 Release Notes: None Co-authored-by: Yevgeniy Miretskiy <[email protected]>
closing and making this a docs issue |
miretskiy
pushed a commit
to miretskiy/cockroach
that referenced
this issue
Jul 27, 2022
Address multiple source of flakes in changefeed tests. cockroachdb#83530 made a change to ensure that changefeed do not fail when they are in the transient (e.g. pause-requested) state. Unfortunately, the PR made a mistake where even if the checkpoint could not be completed because the cangefeed is in the "pause requested" state, we would still proceed to emit resolved event. This is wrong, and the resolved event should never be emitted if we failed to checkpoint. In addition, alter changefeed can be used to add new tables to existing changefeed, with initial scan. In such cases, the newly added table will emit events as of the timestamp of "alter changefeed statement". When this happens, the semantics around resolved events are murky as document in cockroachdb#84102 Address this issue by making cloud storage sink more permissive around it's handling of resolved timestamp. When completing initial scan for newly added tables, fix an "off by 1" error when frontier was advanced to the next timestamp. This was wrong since cockroachdb#82451 clarified that the rangefeed start time is exclusive. Informs cockroachdb#83882 Fixes cockroachdb#83946 Release Notes: None Release note (<category, see below>): <what> <show> <why>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Alter changefeed to add new table with initial scan has murky
"resolved" semantics.
Consider the case when existing changefeed, running with resolved option,
is paused , and has the checkpoint as of time T
(and thus resolved event has been emitted at time T).
We then subsequently alter the feed, and add a new table with initial scan.
All previous tables will not be scanned; but the new table will be scanned
as of time T. This means that we will be emitting events with timestamp
set to T -- which seem to violate (or at least make murky) semantics
around previously emitted resolved event. That is: we have promised that all
events up to and including T have been emitted, and yet, we are emitting
more events with timestamp T.
This might not be a violation per se -- but it is confusing.
We should do 2 things:
Jira issue: CRDB-17461
Epic CRDB-14988
The text was updated successfully, but these errors were encountered: