roachtest: loqrecovery/workload=movr/rangeSize=2mb failed #91016
cc @cockroachdb/replication |
Digging into this, there are 2 problems.
|
@tbg what do you think: should we disallow plan creation if we see any descriptor updates? If that were reported as an error and required 'force' to create a plan, one could then remove the replica from the plan if they know it would cause a panic. |
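To make the proposal concrete, here is a minimal sketch of such a check. It is a toy model, not CockroachDB's plan code: `ReplicaInfo`, `make_plan`, `drop_from_plan`, and the `force` parameter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReplicaInfo:
    # Hypothetical summary of one surviving replica (not a CockroachDB type).
    range_id: int
    store_id: int
    pending_descriptor_changes: int  # unapplied descriptor updates in its log


class PlanError(Exception):
    """Raised when plan creation is refused without force."""


def make_plan(survivors, force=False):
    """Return a list of (range_id, store_id) updates.

    Refuses to build a plan if any chosen replica still has unapplied
    descriptor updates, unless the caller passes force=True.
    """
    risky = [r for r in survivors if r.pending_descriptor_changes > 0]
    if risky and not force:
        ranges = ", ".join(f"r{r.range_id}" for r in risky)
        raise PlanError(f"unapplied descriptor updates on {ranges}; use force")
    return [(r.range_id, r.store_id) for r in survivors]


def drop_from_plan(plan, range_id):
    """Remove one range from a forced plan, as the comment above suggests."""
    return [update for update in plan if update[0] != range_id]
```

With this shape, an operator who forces plan creation can still drop the replica they expect to panic before applying the plan.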
Could you type out a few more details? I.e., we have stores X, Y, Z, and then the following update is made, and after the update Y will panic with which error, etc. Thanks! |
We have stores 1, 2, 3, 4 with replicas on 1, 2, 3. There's a pending descriptor change to add a learner to node 4. Nodes 2 and 3 are killed. Recovery proceeds and picks 1 despite it having an unapplied descriptor change. The update removes all other replicas from the descriptor and bumps the replica ID from 4 to 15. The node restarts and tries to apply the committed log with the descriptor change, which tries to change the replica ID back to its previous value. At this point it panics:
Note the real case in the logs has 4 voters and one added learner, but that doesn't make much difference for this case. |
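The sequence described above can be reproduced in a toy model. This is only a sketch under stated assumptions: the `Replica` class, `recover`, `apply_committed_log`, and the `ReplicaIDRegression` exception are hypothetical stand-ins rather than CockroachDB code, and the replica IDs other than 4 and 15 are made up for illustration.

```python
from dataclasses import dataclass, field


class ReplicaIDRegression(Exception):
    """Stand-in for the panic; the real panic message is not reproduced."""


@dataclass
class Replica:
    store_id: int
    descriptor: dict  # store ID -> replica ID
    # Committed but unapplied descriptor changes, oldest first.
    unapplied: list = field(default_factory=list)

    @property
    def replica_id(self):
        return self.descriptor[self.store_id]


def recover(replica, new_replica_id):
    # Recovery keeps only the survivor and bumps its replica ID.
    # Note that it leaves the unapplied log entries in place.
    replica.descriptor = {replica.store_id: new_replica_id}


def apply_committed_log(replica):
    # On restart, each pending descriptor change is applied in order.
    for new_desc in replica.unapplied:
        new_id = new_desc[replica.store_id]
        if new_id < replica.replica_id:
            raise ReplicaIDRegression(
                f"replica ID would go from {replica.replica_id} back to {new_id}"
            )
        replica.descriptor = new_desc
    replica.unapplied = []


# Store 1 survives with replica ID 4 and a pending change adding a learner
# on store 4 (IDs for stores 2, 3, 4 are illustrative).
before = {1: 4, 2: 2, 3: 3}
pending = {1: 4, 2: 2, 3: 3, 4: 5}
survivor = Replica(store_id=1, descriptor=before, unapplied=[pending])
recover(survivor, new_replica_id=15)
```

Calling `apply_committed_log(survivor)` after this raises `ReplicaIDRegression`, because the pending entry still carries the pre-recovery replica ID 4 for store 1.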
The underlying failure is tracked in a separate issue, #91271. |
roachtest.loqrecovery/workload=movr/rangeSize=2mb failed with artifacts on release-22.2.0 @ 207058d86e0b30ee27866cdf7df923791e1adf55:
Parameters: ROACHTEST_cloud=gce, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_ssd=0
Help
See: roachtest README
See: How To Investigate (internal)
Same failure on other branches
This test on roachdash | Improve this report!
Jira issue: CRDB-21074