changefeedccl: fix initial scan checkpointing #123625

wenyihu6 · 2024-05-04T14:41:14Z

Initially, all span initial resolved timestamps are kept as zero upon resuming a
job since initial resolved timestamps are set as initial high water which
remains zero until initial scan is completed. However, since
0eda540,
we began reloading checkpoint timestamps instead of setting them all to zero at
the start. In PR #102717, we introduced a mechanism to reduce message duplicates
by re-loading job progress upon resuming which largely increased the likelihood
of this bug. These errors could lead to incorrect frontier and missing events
during initial scans. This patches changes how we initialize initial high water
and frontier by initializing it as zero if there are any zero initial high water
in initial resolved timestamps.

Fixes: #123371

Release note (enterprise change): Fixed a bug in v22.2+ where long running
initial scans may incorrectly restore checkpoint job progress and drop events
during node / changefeed restart. This issue was most likely to occur in
clusters with: 1) changefeed.shutdown_checkpoint.enabled (v23.2) is set 2)
Multiple table targets in a changefeed, or 3) Low
changefeed.frontier_checkpoint_frequency or low
changefeed.frontier_highwater_lag_checkpoint_threshold.

blathers-crl · 2024-05-04T14:41:18Z

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.}

cockroach-teamcity · 2024-05-04T14:41:23Z

This change is

rharding6373

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @wenyihu6)

pkg/ccl/changefeedccl/changefeed_stmt.go line 1375 at r1 (raw file):

	// local state; instead, this checkpoint is applied to the reloaded job
	// progress, and the resulting progress record persisted back to the jobs
	// table.

Could you update the comment to explain why we should return early if the local state's hwm is empty?

rharding6373

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @wenyihu6)

pkg/ccl/changefeedccl/changefeed_stmt.go line 1351 at r1 (raw file):

// reconcileJobStateWithLocalState ensures that the job progress information
// is consistent with the state present in the local state.
func reconcileJobStateWithLocalState(

It would be really great if you were able to make a unit test for reconciliation that exercised your fix.

andyyang890

I've been doing some code reading and I agree with Steven's suggestion to move the fix to the for loop in setupSpansAndFrontier. As you've mentioned before, there is some non-deterministic behavior here depending on the ordering of the empty timestamps in the watched spans slice and I don't think that was intentional. It seems the intent of that for loop was that initialHighwater should be the minimum InitialResolved of all the watched spans.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373 and @wenyihu6)

wenyihu6 · 2024-05-08T16:32:08Z

I agree with you and Steven that this could be another possible fix as far as I can tell. But I still want to look into more to see how it plays along with other changefeeds and backfill since this is legacy code from

cockroach/pkg/ccl/changefeedccl/changefeed_processors.go

Lines 418 to 422 in 0eda540

    
           for _, watch := range spec.Watches { 
        
           	if initialHighWater.IsEmpty() || watch.InitialResolved.Less(initialHighWater) { 
        
           		initialHighWater = watch.InitialResolved 
        
           	} 
        
           }

.

andyyang890

To add to my previous comment, the effect of your current proposed change would make it so that we don't use the saved shutdown checkpoint data (controlled by changefeed.shutdown_checkpoint.enabled) during initial scans (i.e. not reduce duplicates during initial scan restarts), but I don't think we need to be that conservative. We just need to make sure that we don't set the aggregator frontier's initial highwater to a non-empty timestamp when it should be empty.

But I still want to look into more to see how it plays along with other changefeeds and backfill since this is legacy code from

legacy code can be wrong too 😛

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373 and @wenyihu6)

rharding6373

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @wenyihu6)

pkg/ccl/changefeedccl/changefeed_processors.go line 525 at r2 (raw file):

// used to filter out some previously emitted rows, and by the cloudStorageSink
// to name its output files in lexicographically monotonic fashion.
func (ca *changeAggregator) setupSpansAndFrontier() (spans []roachpb.Span, err error) {

It would still be really nice to have a unit test to exercise this new behavior in the aggregator setup.

pkg/ccl/changefeedccl/changefeed_processors.go line 531 at r2 (raw file):

	for _, watch := range ca.spec.Watches {
		spans = append(spans, watch.Span)
		if watch.InitialResolved.IsEmpty() {

optional nit: It may be clearer if we add a if isInitialScan { continue } first. It's a bit difficult to reason about whether we may set the initialHighWater to something else in future iterations.

pkg/ccl/changefeedccl/changefeed_processors.go line 533 at r2 (raw file):

		if watch.InitialResolved.IsEmpty() {
			// Keep initialHighWater as empty if there are any zero InitialResolved
			// timestamp to indicate that initial scan is required.

nit: s/required/ongoing

andyyang890

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rharding6373 and @wenyihu6)

pkg/ccl/changefeedccl/changefeed_processors.go line 525 at r2 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

It would still be really nice to have a unit test to exercise this new behavior in the aggregator setup.

+1, might also be nice if we could modify the cdc/initial-scan-rolling-restart roachtest to work on cloud so it can run during nightly CI

asg0451

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andyyang890, @rharding6373, and @wenyihu6)

pkg/ccl/changefeedccl/changefeed_processors.go line 527 at r3 (raw file):

func (ca *changeAggregator) setupSpansAndFrontier() (spans []roachpb.Span, err error) {
	var initialHighWater hlc.Timestamp
	isInitialScan := false

nit: this implementation is overcomplicated for the logic it tries to express. assuming i'm right that it's literally just "min", consider an implementation such as:

	spans = make([]roachpb.Span, 0, len(ca.spec.Watches))
	var initialHighWater hlc.Timestamp // initialHighWater is the minimum of all initial resolved timestamps, or zero if none exist.
	for i, watch := range ca.spec.Watches {
		spans = append(spans, watch.Span)
		if i == 0 {
			initialHighWater = watch.InitialResolved
			continue
		}
		if watch.InitialResolved.Less(initialHighWater) {
			initialHighWater = watch.InitialResolved
		}
	}

wenyihu6

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andyyang890, @asg0451, and @rharding6373)

pkg/ccl/changefeedccl/changefeed_processors.go line 525 at r2 (raw file):

Previously, andyyang890 (Andy Yang) wrote…

+1, might also be nice if we could modify the cdc/initial-scan-rolling-restart roachtest to work on cloud so it can run during nightly CI

Sounds good. Working on it.

pkg/ccl/changefeedccl/changefeed_processors.go line 531 at r2 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

optional nit: It may be clearer if we add a if isInitialScan { continue } first. It's a bit difficult to reason about whether we may set the initialHighWater to something else in future iterations.

Done.

pkg/ccl/changefeedccl/changefeed_processors.go line 533 at r2 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

nit: s/required/ongoing

Done.

pkg/ccl/changefeedccl/changefeed_processors.go line 527 at r3 (raw file):

Previously, asg0451 (Miles Frankel) wrote…

nit: this implementation is overcomplicated for the logic it tries to express. assuming i'm right that it's literally just "min", consider an implementation such as:

	spans = make([]roachpb.Span, 0, len(ca.spec.Watches))
	var initialHighWater hlc.Timestamp // initialHighWater is the minimum of all initial resolved timestamps, or zero if none exist.
	for i, watch := range ca.spec.Watches {
		spans = append(spans, watch.Span)
		if i == 0 {
			initialHighWater = watch.InitialResolved
			continue
		}
		if watch.InitialResolved.Less(initialHighWater) {
			initialHighWater = watch.InitialResolved
		}
	}

Nice. I would buy that.

pkg/ccl/changefeedccl/changefeed_stmt.go line 1351 at r1 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

It would be really great if you were able to make a unit test for reconciliation that exercised your fix.

Working on the aggregator unit test.

pkg/ccl/changefeedccl/changefeed_stmt.go line 1375 at r1 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

Could you update the comment to explain why we should return early if the local state's hwm is empty?

Done.

wenyihu6 · 2024-05-10T02:11:22Z

@rharding6373 Do we want the roachtest commit to be in a separate commit so that we can backport this fix more easily?

andyyang890

Do we want the roachtest commit to be in a separate commit so that we can backport this fix more easily?

I would be in favor of that given we're adding a unit test in this PR. I made some updates to the roachtest in #123924.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asg0451, @rharding6373, and @wenyihu6)

-- commits line 4 at r9:
I think we might want to update this commit message to reflect that this bug was actually introduced in afb95b1 as a result of using the checkpoint to set InitialResolved.

pkg/ccl/changefeedccl/changefeed_processors.go line 535 at r9 (raw file):

		}
		if watch.InitialResolved.Less(initialHighWater) {
			// Keep initialHighWater as the minimum of all InitialResolved timestamps.

nit: move this comment to above the for loop

pkg/ccl/changefeedccl/changefeed_processors_test.go line 21 at r9 (raw file):

// TestSetupSpansAndFrontier tests that the setupSpansAndFrontier function
// correctly sets up frontier for the changefeed aggregator frontier.
func TestSetupSpansAndFrontier(t *testing.T) {

nice!

pkg/ccl/changefeedccl/changefeed_processors_test.go line 33 at r9 (raw file):

				{
					Span:            roachpb.Span{Key: roachpb.Key("a"), EndKey: roachpb.Key("b")},
					InitialResolved: hlc.Timestamp{},

nit: there's no need to include fields that are the type's zero value

pkg/ccl/changefeedccl/changefeed_processors_test.go line 64 at r9 (raw file):

		},
		{
			name:             "incomplete initial scan",

nit: subtest names should be unique

pkg/ccl/changefeedccl/changefeed_processors_test.go line 101 at r9 (raw file):

	} {
		t.Run(tc.name, func(t *testing.T) {
			mockChangefeedAggregator := &changeAggregator{}

nit: s/mockChangefeedAggregator/changeAggregator

This isn't a mock since you're testing the production struct

rharding6373

Yes, agree that the roachtest update can be merged separately. Nice work on this, love the new test coverage!

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andyyang890, @asg0451, and @wenyihu6)

-- commits line 4 at r11:
Reopened this issue, so we can add Fixes: #[123371](https://github.com/cockroachdb/cockroach/issues/123371)

-- commits line 4 at r11:
Please also add a release note.

pkg/ccl/changefeedccl/changefeed_processors_test.go line 21 at r9 (raw file):

Previously, andyyang890 (Andy Yang) wrote…

nice!

+1 nice test Wenyi!

pkg/ccl/changefeedccl/changefeed_processors_test.go line 46 at r11 (raw file):

		},
		{
			name:             "incomplete initial scan with non-empty initial resolved in the end",

test suggestion: add one more test with only one empty initial resolved timestamp in the first position.

rharding6373

Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @asg0451)

andyyang890

Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @asg0451, @rharding6373, and @wenyihu6)

-- commits line 21 at r15:
Thanks for adding more details!

nit: the numbered list might be readable if you do 1) changefeed.shutdown_checkpoint.enabled (v23.2) is set, 2) Multiple table targets in a changefeed, or 3) Low changefeed.frontier_checkpoint_frequency or low changefeed.frontier_highwater_lag_checkpoint_threshold.

wenyihu6 · 2024-05-10T17:45:33Z

-- commits line 21 at r15:

Previously, andyyang890 (Andy Yang) wrote…

Thanks for adding more details!

nit: the numbered list might be readable if you do 1) changefeed.shutdown_checkpoint.enabled (v23.2) is set, 2) Multiple table targets in a changefeed, or 3) Low changefeed.frontier_checkpoint_frequency or low changefeed.frontier_highwater_lag_checkpoint_threshold.

Done.

rharding6373

Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andyyang890, @asg0451, and @wenyihu6)

pkg/ccl/changefeedccl/changefeed_processors_test.go line 52 at r16 (raw file):

		},
		{
			name:             "incomplete initial scan with non-empty initial resolved in the end",

nit: s/end/middle

Initially, all span initial resolved timestamps are kept as zero upon resuming a job since initial resolved timestamps are set as initial high water which remains zero until initial scan is completed. However, since cockroachdb@0eda540, we began reloading checkpoint timestamps instead of setting them all to zero at the start. In PR cockroachdb#102717, we introduced a mechanism to reduce message duplicates by re-loading job progress upon resuming which largely increased the likelihood of this bug. These errors could lead to incorrect frontier and missing events during initial scans. This patches changes how we initialize initial high water and frontier by initializing it as zero if there are any zero initial high water in initial resolved timestamps. Fixes: cockroachdb#123371 Release note (enterprise change): Fixed a bug in v22.2+ where long running initial scans may incorrectly restore checkpoint job progress and drop events during node / changefeed restart. This issue was most likely to occur in clusters with: 1) changefeed.shutdown_checkpoint.enabled (v23.2) is set 2) Multiple table targets in a changefeed, or 3) Low changefeed.frontier_checkpoint_frequency or low changefeed.frontier_highwater_lag_checkpoint_threshold.

andyyang890

Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @asg0451, @rharding6373, and @wenyihu6)

wenyihu6 · 2024-05-10T18:52:13Z

Thanks for the reviews! Nice teamwork 👏

bors r=andyyang890, rharding6373

craig · 2024-05-10T19:19:27Z

Build succeeded:

blathers-crl · 2024-05-10T19:19:52Z

Encountered an error creating backports. Some common things that can go wrong:

The backport branch might have already existed.
There was a merge conflict.
The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.

error creating merge commit from 07845ec to blathers/backport-release-23.1-123625: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.}

rharding6373 reviewed May 6, 2024

View reviewed changes

andyyang890 requested a review from rharding6373 May 8, 2024 16:26

andyyang890 reviewed May 8, 2024

View reviewed changes

wenyihu6 force-pushed the tryagain branch from 52db77e to e6992e6 Compare May 8, 2024 22:02

rharding6373 reviewed May 8, 2024

View reviewed changes

wenyihu6 force-pushed the tryagain branch from e6992e6 to d1836d1 Compare May 9, 2024 00:26

andyyang890 reviewed May 9, 2024

View reviewed changes

asg0451 reviewed May 9, 2024

View reviewed changes

wenyihu6 force-pushed the tryagain branch 2 times, most recently from cece117 to 6c7c4c6 Compare May 9, 2024 19:53

wenyihu6 commented May 9, 2024

View reviewed changes

wenyihu6 force-pushed the tryagain branch from 6c7c4c6 to 35734c9 Compare May 10, 2024 02:10

wenyihu6 force-pushed the tryagain branch from 35734c9 to 7761d43 Compare May 10, 2024 02:40

andyyang890 requested review from asg0451 and rharding6373 May 10, 2024 05:02

andyyang890 reviewed May 10, 2024

View reviewed changes

wenyihu6 force-pushed the tryagain branch 2 times, most recently from 0b1a0f3 to df496e5 Compare May 10, 2024 12:54

rharding6373 requested a review from andyyang890 May 10, 2024 13:52

rharding6373 approved these changes May 10, 2024

View reviewed changes

wenyihu6 force-pushed the tryagain branch from df496e5 to b2d9b46 Compare May 10, 2024 14:11

wenyihu6 marked this pull request as ready for review May 10, 2024 14:12

wenyihu6 requested a review from a team as a code owner May 10, 2024 14:12

wenyihu6 force-pushed the tryagain branch from b2d9b46 to 6ae57bb Compare May 10, 2024 14:16

wenyihu6 added backport-23.1.21-rc labels May 10, 2024

rharding6373 added the backport-24.1.0-rc label May 10, 2024

rharding6373 approved these changes May 10, 2024

View reviewed changes

wenyihu6 force-pushed the tryagain branch from 22c6f1b to 1ecf552 Compare May 10, 2024 16:30

rharding6373 removed the backport-22.2.x label May 10, 2024

wenyihu6 force-pushed the tryagain branch from 1ecf552 to 4ee4967 Compare May 10, 2024 16:31

rharding6373 removed backport-23.1.21-rc labels May 10, 2024

andyyang890 requested a review from rharding6373 May 10, 2024 16:43

andyyang890 reviewed May 10, 2024

View reviewed changes

wenyihu6 force-pushed the tryagain branch 3 times, most recently from 5ac778d to 6278275 Compare May 10, 2024 17:42

wenyihu6 force-pushed the tryagain branch from 6278275 to b871775 Compare May 10, 2024 18:08

rharding6373 requested a review from andyyang890 May 10, 2024 18:10

rharding6373 approved these changes May 10, 2024

View reviewed changes

wenyihu6 force-pushed the tryagain branch from b871775 to 07845ec Compare May 10, 2024 18:11

andyyang890 approved these changes May 10, 2024

View reviewed changes

craig bot merged commit 81e0abb into cockroachdb:master May 10, 2024
21 of 22 checks passed

This was referenced May 10, 2024

release-23.2: changefeedccl: fix initial scan checkpointing #123966

Merged

release-24.1: changefeedccl: fix initial scan checkpointing #123967

Merged

release-24.1.0-rc: changefeedccl: fix initial scan checkpointing #123968

Merged

wenyihu6 mentioned this pull request May 10, 2024

release-23.1: changefeedccl: fix initial scan checkpointing #123970

Merged

blathers-crl bot mentioned this pull request May 20, 2024

release-23.1.22-rc: release-23.1: changefeedccl: fix initial scan checkpointing #124454

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

changefeedccl: fix initial scan checkpointing #123625

changefeedccl: fix initial scan checkpointing #123625

wenyihu6 commented May 4, 2024 •

edited

Loading

blathers-crl bot commented May 4, 2024

cockroach-teamcity commented May 4, 2024

rharding6373 left a comment

rharding6373 left a comment

andyyang890 left a comment

wenyihu6 commented May 8, 2024 •

edited

Loading

andyyang890 left a comment

rharding6373 left a comment

andyyang890 left a comment

asg0451 left a comment

wenyihu6 left a comment

wenyihu6 commented May 10, 2024

andyyang890 left a comment

rharding6373 left a comment

rharding6373 left a comment

andyyang890 left a comment

wenyihu6 commented May 10, 2024

rharding6373 left a comment

andyyang890 left a comment

wenyihu6 commented May 10, 2024

craig bot commented May 10, 2024

blathers-crl bot commented May 10, 2024

changefeedccl: fix initial scan checkpointing #123625

changefeedccl: fix initial scan checkpointing #123625

Conversation

wenyihu6 commented May 4, 2024 • edited Loading

blathers-crl bot commented May 4, 2024

cockroach-teamcity commented May 4, 2024

rharding6373 left a comment

Choose a reason for hiding this comment

rharding6373 left a comment

Choose a reason for hiding this comment

andyyang890 left a comment

Choose a reason for hiding this comment

wenyihu6 commented May 8, 2024 • edited Loading

andyyang890 left a comment

Choose a reason for hiding this comment

rharding6373 left a comment

Choose a reason for hiding this comment

andyyang890 left a comment

Choose a reason for hiding this comment

asg0451 left a comment

Choose a reason for hiding this comment

wenyihu6 left a comment

Choose a reason for hiding this comment

wenyihu6 commented May 10, 2024

andyyang890 left a comment

Choose a reason for hiding this comment

rharding6373 left a comment

Choose a reason for hiding this comment

rharding6373 left a comment

Choose a reason for hiding this comment

andyyang890 left a comment

Choose a reason for hiding this comment

wenyihu6 commented May 10, 2024

rharding6373 left a comment

Choose a reason for hiding this comment

andyyang890 left a comment

Choose a reason for hiding this comment

wenyihu6 commented May 10, 2024

craig bot commented May 10, 2024

blathers-crl bot commented May 10, 2024

wenyihu6 commented May 4, 2024 •

edited

Loading

wenyihu6 commented May 8, 2024 •

edited

Loading