cdc: only backfill tables which experience schema changes #43896
Comments
If we did this, one thing we might want to do is checkpoint the progress of tables other than the one being backfilled. It might take a new feature in the span frontier to ask: hey, has the frontier for any complete table's spans moved? Could be good.
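A rough sketch of what such a per-table frontier check might look like. Everything here is hypothetical (`SpanProgress`, `tableFrontiers`, `advancedTables`, and timestamps simplified to integers); it is not the actual span frontier API, only an illustration of the idea:

```go
package main

import (
	"fmt"
	"sort"
)

// TableID is a hypothetical stand-in for a table descriptor ID.
type TableID int

// SpanProgress is a hypothetical record of how far one span has resolved.
type SpanProgress struct {
	Table    TableID
	Resolved int64 // resolved timestamp, simplified to an integer
}

// tableFrontiers returns, per table, the minimum resolved timestamp
// across all of that table's spans.
func tableFrontiers(progress []SpanProgress) map[TableID]int64 {
	out := make(map[TableID]int64)
	for _, p := range progress {
		if cur, ok := out[p.Table]; !ok || p.Resolved < cur {
			out[p.Table] = p.Resolved
		}
	}
	return out
}

// advancedTables lists, in ascending ID order, the tables whose frontier
// moved forward between prev and cur.
func advancedTables(prev, cur map[TableID]int64) []TableID {
	var out []TableID
	for id, ts := range cur {
		if ts > prev[id] {
			out = append(out, id)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	prev := map[TableID]int64{1: 10, 2: 10}
	// Table 1 is being backfilled and is stuck at 10; both of table 2's
	// spans have moved past 10.
	cur := tableFrontiers([]SpanProgress{
		{Table: 1, Resolved: 10},
		{Table: 1, Resolved: 10},
		{Table: 2, Resolved: 20},
		{Table: 2, Resolved: 25},
	})
	fmt.Println(advancedTables(prev, cur)) // tables that could be checkpointed
}
```

With a check like this, table 2's progress could be checkpointed even while table 1's backfill holds back the overall frontier.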
Hi @ajwerner, any update on this ticket? Thanks!
@cjireland, I've opened a PR to close this; it will likely be merged tomorrow or next week.
Lovely, thanks @HonoreDB. In what version do you think it will be released?
I'd think we'd call this a bug, in which case it'd be eligible for backport, so 20.2.1 and 20.1.x whenever that happens.
Thanks @ajwerner
Describe the problem
When a schema change alters the logical layout of a table, namely a column addition or removal, we send a backfill of all of the rows in that table. Currently we always perform this backfill on such schema changes, though #31213 tracks making it optional. A changefeed can watch multiple tables at a time, but the logic that performs a backfill does not distinguish which spans need to be backfilled. As a result, when a backfill occurs it emits all of the rows from all of the tables in the changefeed, not just the rows of the table whose schema changed.
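The missing distinction can be sketched as a filter over the watched spans. This is a minimal illustration under stated assumptions, not the changefeed implementation: `TableID`, `Span`, `spansToBackfill`, the table IDs, and the key strings are all hypothetical stand-ins.

```go
package main

import "fmt"

// TableID and Span are simplified, hypothetical stand-ins for the real
// descriptor ID and key span types.
type TableID int

type Span struct {
	Table TableID // table that owns this span
	Start string
	End   string
}

// spansToBackfill returns only the watched spans that belong to a table
// whose schema change requires a backfill.
func spansToBackfill(watched []Span, changed map[TableID]bool) []Span {
	var out []Span
	for _, sp := range watched {
		if changed[sp.Table] {
			out = append(out, sp)
		}
	}
	return out
}

func main() {
	// Illustrative IDs and keys: table a is 1, table b is 2.
	watched := []Span{
		{Table: 1, Start: "/Table/1", End: "/Table/2"},
		{Table: 2, Start: "/Table/2", End: "/Table/3"},
	}
	// Only table a had a column added.
	changed := map[TableID]bool{1: true}
	for _, sp := range spansToBackfill(watched, changed) {
		fmt.Printf("backfill [%s, %s)\n", sp.Start, sp.End)
	}
}
```

Here only the span for table `a` is selected; the behavior described above corresponds to passing every watched span to the backfill regardless of which table changed.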
To Reproduce
Now create a changefeed:
First we'll see the initial values:
Now add a new column to `a`:
Now we'll see the writes of the backfill (#35738):
Then, after the resolved timestamp for the schema change passes, we'll see the backfill of not just `a` but also of `b`. This is the issue:
Expected behavior
I'd expect to only see the backfill of `a`.