kv/kvclient/kvcoord: TestTransactionUnexpectedlyCommitted failed #130492

Closed
cockroach-teamcity opened this issue Sep 11, 2024 · 3 comments
Labels: A-kv-replication, branch-release-23.1, C-bug, C-test-failure, O-robot, T-kv

cockroach-teamcity commented Sep 11, 2024

kv/kvclient/kvcoord.TestTransactionUnexpectedlyCommitted failed with artifacts on release-23.1 @ e5d5d380b5fba8e375e874c0bec1655f9e1750bc:

=== RUN   TestTransactionUnexpectedlyCommitted
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/40ec041d13327ceafaefe721fe51f834/logTestTransactionUnexpectedlyCommitted1628962813
    test_log_scope.go:79: use -show-logs to present logs inline
    dist_sender_ambiguous_test.go:428: first range: r64:/Table/100/"{a"-b"} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
    dist_sender_ambiguous_test.go:429: second range: r65:/{Table/100/"b"-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
=== CONT  TestTransactionUnexpectedlyCommitted
    dist_sender_ambiguous_test.go:311: [op 1] (txn2) n1->n1:r64/1 batchReq={EndTxn(commit) [/Table/100/"a'"], [txn: b619fecd]}, meta={id=b619fecd key=/Table/100/"a'" pri=0.01762235 epo=0 ts=1726056905.381365072,0 min=1726056905.381365072,0 seq=2}
=== CONT  TestTransactionUnexpectedlyCommitted
    dist_sender_ambiguous_test.go:1395: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/40ec041d13327ceafaefe721fe51f834/logTestTransactionUnexpectedlyCommitted1628962813
--- FAIL: TestTransactionUnexpectedlyCommitted (62.14s)
=== RUN   TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease
    dist_sender_ambiguous_test.go:311: [op 1] (_) n1->n1:r65/1 batchReq={ResolveIntent [/Table/100/"b",/Min)}, meta={<nil>}
    dist_sender_ambiguous_test.go:402: valid lease info for r64: repl=(n1,s1):1 seq=1 start=0,0 epo=1 pro=1726056897.168158479,0
    dist_sender_ambiguous_test.go:444: condition failed to evaluate within 45s: awaiting upgrade to epoch-based lease for r65:/{Table/100/"b"-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=6, sticky=9223372036.854775807,2147483647]
    --- FAIL: TestTransactionUnexpectedlyCommitted/recovery_after_transfer_lease (45.24s)

Parameters:

  • TAGS=bazel,gss,deadlock
Help

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/kv

This test on roachdash | Improve this report!

Jira issue: CRDB-42090

cockroach-teamcity added the branch-release-23.1, C-test-failure, O-robot, release-blocker, and T-kv labels on Sep 11, 2024
cockroach-teamcity added this to the 23.1 milestone on Sep 11, 2024
@miraradeva
Contributor

This fails during the test setup because the expiration-based lease for one of the test ranges doesn't change to epoch-based. It fails with:

E240911 12:15:05.732641 1696001 kv/kvserver/queue.go:1142 ⋮ [T1,n2,replicate,s2,r65/2:‹/{Table/100/"b"-Max}›] 392  expiration of liveness record ‹liveness(nid:2 epo:1 exp:1726056911.722186448,0)› is not greater than expiration of the previous lease repl=(n2,s2):2 seq=26 start=1726056905.723725850,0 exp=1726056911.723395893,0 pro=1726056905.723395893,0 after liveness heartbeat
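
To make the comparison in that error concrete, here is a minimal sketch in Go that plugs in the two expirations from the log line. The `hlcTimestamp` type and the `canUpgradeToEpochLease` helper are hypothetical stand-ins written for illustration; they are not CockroachDB's actual types or functions, and the only rule assumed is the one the message states (the liveness expiration must be strictly greater than the previous lease's expiration).

```go
package main

import "fmt"

// hlcTimestamp is a hypothetical stand-in for a hybrid logical clock
// timestamp: wall time in nanoseconds plus a logical counter.
type hlcTimestamp struct {
	WallTime int64
	Logical  int32
}

// Less reports whether t orders strictly before o.
func (t hlcTimestamp) Less(o hlcTimestamp) bool {
	return t.WallTime < o.WallTime ||
		(t.WallTime == o.WallTime && t.Logical < o.Logical)
}

// canUpgradeToEpochLease is a hypothetical helper encoding the rule from
// the error message: the liveness record must expire strictly after the
// previous (expiration-based) lease.
func canUpgradeToEpochLease(livenessExp, prevLeaseExp hlcTimestamp) bool {
	return prevLeaseExp.Less(livenessExp)
}

func main() {
	// Values copied from the log line above, converted to nanoseconds.
	livenessExp := hlcTimestamp{WallTime: 1726056911722186448}
	prevLeaseExp := hlcTimestamp{WallTime: 1726056911723395893}

	fmt.Println(canUpgradeToEpochLease(livenessExp, prevLeaseExp)) // false
	fmt.Println(prevLeaseExp.WallTime - livenessExp.WallTime)      // 1209445
}
```

With these values the liveness record expires about 1.2 ms before the previous lease does, so the check fails.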

This was supposed to be handled by #124885, which was backported to 23.1 in #130124. In that last PR, the description lists 7 commits and I only see 6 in the PR. It looks like the PR that retries the heartbeats is missing? @nvanbenschoten was that intentional?
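
For readers unfamiliar with the idea of retrying the heartbeat, here is a rough sketch in Go of what such a retry could look like. It is an illustration under assumptions, not the code from #124885: `heartbeatFn`, `heartbeatUntilPast`, the attempt limit, and the 3-second extension per heartbeat are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// heartbeatFn is a hypothetical callback that performs one liveness
// heartbeat and returns the new liveness expiration in nanoseconds.
type heartbeatFn func() (int64, error)

// heartbeatUntilPast retries the heartbeat until the liveness expiration
// is strictly greater than the previous lease's expiration, or until
// maxAttempts is exhausted. It returns the expiration and attempts used.
func heartbeatUntilPast(prevLeaseExp int64, maxAttempts int, hb heartbeatFn) (int64, int, error) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		exp, err := hb()
		if err != nil {
			return 0, attempt, err
		}
		if exp > prevLeaseExp {
			return exp, attempt, nil
		}
	}
	return 0, maxAttempts, errors.New("liveness expiration still not greater than previous lease expiration")
}

func main() {
	// Expirations taken from the error above, in nanoseconds.
	prevLeaseExp := int64(1726056911723395893)
	next := int64(1726056911722186448)

	// Fake heartbeat: each call extends the expiration by an assumed 3s.
	hb := func() (int64, error) {
		exp := next
		next += 3000000000
		return exp, nil
	}

	exp, attempts, err := heartbeatUntilPast(prevLeaseExp, 5, hb)
	fmt.Println(exp > prevLeaseExp, attempts, err) // true 2 <nil>
}
```

In this toy run the first heartbeat reproduces the failing expiration and the second one moves past the previous lease, which is the situation a retry is meant to cover.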

@nvanbenschoten
Member

@miraradeva nice catch. This wasn't intentional. The 23.1 backport required an extra commit (f97e02c) and things must have gotten mixed up as a result. I opened #130623 to backport that fix.

@nvanbenschoten
Member

Should be fixed by #130623.

arulajmani added the C-bug and A-kv-replication labels and removed the release-blocker label on Sep 13, 2024