roachtest: cdc/filtering/session fails due to duplicated events #117590

Closed
srosenberg opened this issue Jan 10, 2024 · 4 comments · Fixed by #117637
Labels
A-cdc Change Data Capture C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-cdc

Comments

@srosenberg
Member

srosenberg commented Jan 10, 2024

This came up during an ad hoc run in AWS using Graviton3 instances:

07:06:20 test_runner.go:824: [w4] test failed: cdc/filtering/session (run 1)
07:06:20 test_runner.go:840: [w4] destroying cluster srosenberg-1704783858-01-n3cpu4 [tag:] (3 nodes) because: cdc/filtering/session (1) - (assertions.go:333).Fail:
        Error Trace:    github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/cdc_filtering.go:278
                                                github.com/cockroachdb/cockroach/pkg/cmd/roachtest/tests/cdc_filtering.go:192
                                                main/pkg/cmd/roachtest/test_runner.go:1107
                                                src/runtime/asm_amd64.s:1650
        Error:          Not equal:
                        expected: []string{"A@1", "B@1", "C@1", "B@2 (before: B@1)", "C@2 (before: C@1)", "C@3 (before: C@2)", "A@4 (before: A@3)", "D@1"}
                        actual  : []string{"A@1", "B@1", "C@1", "B@2 (before: B@1)", "B@2 (before: B@1)", "C@2 (before: C@1)", "C@2 (before: C@1)", "C@3 (before: C@2)", "C@3 (before: C@2)", "A@4 (before: A@3)", "A@4 (before: A@3)", "D@1", "D@1"}

Some events were emitted twice, whereas the test assertion expects each event exactly once.
cdc_filtering_logs.tar.gz

Jira issue: CRDB-35256

Epic CRDB-13169

@srosenberg srosenberg added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-cdc labels Jan 10, 2024

blathers-crl bot commented Jan 10, 2024

cc @cockroachdb/cdc

@blathers-crl blathers-crl bot added the A-cdc Change Data Capture label Jan 10, 2024
@srosenberg
Member Author

Looking at n1, we can see that a range was split immediately after a changefeed was created:

I240109 07:05:53.804169 5274 kv/kvserver/replica_command.go:440 ⋮ [T1,Vsystem,n1,split,s1,r67/1:‹/{Table/65-Max}›] 205  initiating a split of this range at key /Table/104 [r68] (span config)

Note that replication of system ranges is happening concurrently with the changefeeds. It might be possible to exclude duplicates by waiting for replication to finish (see WaitFor3XReplication). Given the current implementation of this test, we would then expect no other range splits. Otherwise, if duplicates cannot be provably excluded, the assertion should be weakened.

CC @andyyang890 @nicktrav

@nicktrav
Collaborator

@andyyang890 - wdyt about just updating the testing logic in here to eliminate dupes? It's expected that we will be encountering them on a changefeed. This should be a pretty easy fix.
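
For illustration, eliminating duplicates before the comparison could look roughly like the sketch below. This is not the change made in #117637; `dedupeEvents` is a hypothetical helper name, and the sketch assumes that every expected event string is distinct (which holds for the expected list in the failure above), so a global first-occurrence filter is enough.

```go
package main

import "fmt"

// dedupeEvents (hypothetical helper) returns events with repeats removed,
// keeping the first occurrence of each event and preserving order.
func dedupeEvents(events []string) []string {
	seen := make(map[string]struct{}, len(events))
	unique := make([]string, 0, len(events))
	for _, e := range events {
		if _, ok := seen[e]; ok {
			continue // already emitted once; a changefeed may redeliver it
		}
		seen[e] = struct{}{}
		unique = append(unique, e)
	}
	return unique
}

func main() {
	// The "actual" slice from the failed run reported in this issue.
	actual := []string{
		"A@1", "B@1", "C@1",
		"B@2 (before: B@1)", "B@2 (before: B@1)",
		"C@2 (before: C@1)", "C@2 (before: C@1)",
		"C@3 (before: C@2)", "C@3 (before: C@2)",
		"A@4 (before: A@3)", "A@4 (before: A@3)",
		"D@1", "D@1",
	}
	// The deduplicated slice is what the test would compare against
	// the expected list.
	for _, e := range dedupeEvents(actual) {
		fmt.Println(e)
	}
}
```

If the same event string could legitimately occur more than once in the expected output, a filter like this would hide real differences, and collapsing only adjacent repeats would be the safer choice.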

@andyyang890
Collaborator

Sure, I'll do that!
