Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: flake in some stressrace runs due to "observed raft log position less than per-stream lower bound" #107156

Closed
knz opened this issue Jul 19, 2023 · 1 comment · Fixed by #107412
Assignees
Labels
A-admission-control C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-test-failure Broken test (automatically or manually discovered). T-kv KV Team

Comments

@knz
Copy link
Contributor

knz commented Jul 19, 2023

Observed e.g. here: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_BazelExtendedCi/10966431?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandBuildChangesSection=true&expandBuildProblemsSection=true&expandBuildTestsSection=true

(On top of the branch in #107135 - which appears unrelated).

xref #106589 and #106123.

F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1  observed raft log position less than per-stream lower bound (log-position=0/0 <= log-position=9/594)
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !goroutine 203666 [running]:
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !runtime/debug.Stack()
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  GOROOT/src/runtime/debug/stack.go:24 +0x72
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/util/log.(*loggerT).outputLogEntry(0xc01818c300, {{{0xc008a14810, 0x24}, {0xa7ff583, 0x1}, {0xa7ff582, 0x1}, {0xa7ff583, 0x1}}, 0x1773207a1003c337, ...})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/util/log/clog.go:261 +0xbc
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/util/log.logfDepthInternal({0xf2b7fa0, 0xc01f87a840}, 0x2, 0x4, 0x0, 0x0, {0xa8f4d4d, 0x46}, {0xc012181040, 0x2, ...})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/util/log/channels.go:106 +0xd3e
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/util/log.logfDepth({0xf2b7fa0, 0xc01f87a840}, 0x1, 0x4, 0x0, {0xa8f4d4d, 0x46}, {0xc012181040, 0x2, 0x2})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/util/log/channels.go:39 +0x152
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/util/log.Fatalf({0xf2b7fa0, 0xc01f87a840}, {0xa8f4d4d, 0x46}, {0xc012181040, 0x2, 0x2})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/bazel-out/k8-dbg/bin/pkg/util/log/log_channels_generated.go:848 +0x12a
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowtokentracker.(*Tracker).Track(0xc01bf3d590, {0xf2b7fa0, 0xc01f87a840}, 0x0, 0x23b, {0x0, 0x0})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95 +0xfcf
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle.(*Handle).deductTokensForInner(0xc005417ce0, {0xf2b7fa0, 0xc01f87a840}, 0x0, {0x0, 0x0}, 0x23b)
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle/kvflowhandle.go:162 +0x566
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle.(*Handle).DeductTokensFor(0xc005417ce0, {0xf2b7fa0, 0xc01f87a840}, 0x0, {0x0, 0x0}, 0x23b)
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvflowcontrol/kvflowhandle/kvflowhandle.go:141 +0x105
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.maybeDeductFlowTokens({0xf2b7fa0, 0xc018d9eea0}, {0xf2f5e78, 0xc005417ce0}, {0xc00a83b130, 0x3, 0x3}, {0xc0054e6630, 0x3, 0x3})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_proposal_buf.go:1051 +0x5d3
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.proposeBatch({0xf2b7fa0, 0xc018d9eea0}, {0x7f2c0f529b90, 0xc01a928d80}, {0xf2d69a0, 0xc00adf6c80}, {0xc0054e6630, 0x3, 0x3}, {0xc00a83b130, ...}, ...)
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_proposal_buf.go:1028 +0x6ab
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*propBuf).FlushLockedWithRaftGroup(0xc01a9295d0, {0xf2b7fa0, 0xc018d9eea0}, {0xf2d69a0, 0xc00adf6c80})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_proposal_buf.go:621 +0x2fbb
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked.func2(0xc00adf6c80)
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:796 +0x1a5
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).withRaftGroupLocked(0xc01a928d80, 0xc00f9a5f20)
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:2009 +0x15d
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReadyRaftMuLocked(_, {_, _}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:793 +0x7de
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).handleRaftReady(_, {_, _}, {{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...}, ...})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/replica_raft.go:749 +0x418
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Store).processReady(0xc001fa1500, 0x2)
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/store_raft.go:670 +0x31a
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftSchedulerShard).worker(0xc00a93e230, {0xf2b7fa0, 0xc005a8b9b0}, {0xf29db80, 0xc001fa1500}, 0xc001fa0a80)
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:418 +0x682
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*raftScheduler).Start.func2({0xf2b7fa0, 0xc005a8b9b0})
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/kv/kvserver/pkg/kv/kvserver/scheduler.go:321 +0xf7
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx.func2()
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:484 +0x512
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTaskEx
F230719 01:26:33.820848 203666 kv/kvserver/kvflowcontrol/kvflowtokentracker/tracker.go:95  [T1,n2,s2,r2/3:/System/NodeLiveness{-Max}] 1 !  github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:475 +0xc54

Jira issue: CRDB-29920

@knz knz added C-test-failure Broken test (automatically or manually discovered). A-admission-control T-kv KV Team labels Jul 19, 2023
@knz
Copy link
Contributor Author

knz commented Jul 19, 2023

@irfansharif just a 2c thought: maybe the stressrace run makes the code just slow enough that it doesn't tick enough? (I'm thinking about your comment here).

craig bot pushed a commit that referenced this issue Jul 24, 2023
107412: kvserver: fix 'observed raft log position' assertion r=irfansharif a=irfansharif

Fixes #107336.
Fixes #106123.
Fixes #107156.
Fixes #106589.

It's possible to hit this assertion under --stress --race when the proposing replica is starved enough for raft ticks that it loses leadership right when it steps proposals through raft. We're relying on undocumented API semantics in the etcd raft library whereby it mutates stepped entries with the term+index its to end up in. But that's only applicable if stepping through entries as a leader. Simply relax this assertion instead.

Release note: None

Co-authored-by: irfan sharif <[email protected]>
@craig craig bot closed this as completed in deb538f Jul 24, 2023
@tbg tbg added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jul 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-admission-control C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-test-failure Broken test (automatically or manually discovered). T-kv KV Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants