Evict leader scheduler can not show after pd leader recovery from failure #4707

Closed
mayjiang0203 opened this issue Mar 4, 2022 · 4 comments

Comments

@mayjiang0203

Bug Report

What did you do?

1. Add evict-leader-scheduler for two TiKV stores.
2. Inject a chaos fault that takes the PD leader instance down.
3. After more than 5 minutes, show the scheduler config: no evict-leader-scheduler is found. Trying to remove it at this point also returns 404.
4. After several hours, show the schedulers again: the evict-leader-scheduler exists.
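
The pd-ctl side of the steps above can be sketched roughly as follows. This is a minimal sketch, not the actual test harness: the subcommands are the ones that appear in the test logs, while the `PD_CTL` and `PD_ADDR` variables, the default endpoint, and the helper function names are assumptions made for illustration. The chaos injection in step 2 is left out because it depends on the test environment.

```shell
#!/bin/sh
# Sketch of the pd-ctl calls behind steps 1, 3 and 4.
# Assumptions (not from the report): PD_CTL, PD_ADDR and the helper names.
PD_CTL="${PD_CTL:-pd-ctl}"
PD_ADDR="${PD_ADDR:-http://127.0.0.1:2379}"

# Run one pd-ctl subcommand against the configured PD endpoint.
pd() {
    "$PD_CTL" -u "$PD_ADDR" "$@"
}

# Step 1: add evict-leader-scheduler for each given store ID.
add_evict_leader() {
    for store_id in "$@"; do
        pd scheduler add evict-leader-scheduler "$store_id"
    done
}

# Steps 3 and 4: list all schedulers, then show the evict-leader config.
show_evict_leader() {
    pd scheduler show
    pd scheduler config evict-leader-scheduler
}

# Step 3: the removal attempt that returned 404 in the logs.
remove_evict_leader() {
    pd scheduler remove evict-leader-scheduler
}
```

For the run in the logs below, this would correspond to `add_evict_leader 4 5` before the fault, then `show_evict_leader` and `remove_evict_leader` after the PD leader has recovered.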

What did you expect to see?

In step 3, the evict-leader scheduler should be shown.

What did you see instead?

In step 3, no evict-leader scheduler is shown.

What version of PD are you using (pd-server -V)?

/ # /pd-server -V
Release Version: v5.5.0-alpha-72-gcc256b5e
Edition: Community
Git Commit Hash: cc256b5
Git Branch: master
UTC Build Time: 2022-03-01 08:26:57

Test logs:

[2022/03/04 13:20:41.459 +08:00] [INFO] [pdutil.go:105] ["/pd-ctl scheduler remove evict-leader-scheduler:[404] "[PD:scheduler:ErrSchedulerNotFound]scheduler not found""]
2022-03-04T13:20:41.729+0800 INFO k8s/client.go:107 it should be noted that a long-running command will not be interrupted even the use case has ended. For more information, please refer to https://github.com/pingcap/test-infra/discussions/129
[2022/03/04 13:20:42.163 +08:00] [INFO] [pdutil.go:105] ["/pd-ctl scheduler add evict-leader-scheduler 4:Success!"]
[2022/03/04 13:20:42.223 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-1=4007]
[2022/03/04 13:20:52.283 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-1=4007.5]
[2022/03/04 13:21:02.338 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-1=4008]
[2022/03/04 13:21:12.406 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-1=4008]
[2022/03/04 13:21:22.479 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-1=2004]
[2022/03/04 13:21:32.553 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-1=0]
2022-03-04T13:21:32.553+0800 INFO k8s/client.go:107 it should be noted that a long-running command will not be interrupted even the use case has ended. For more information, please refer to https://github.com/pingcap/test-infra/discussions/129
[2022/03/04 13:21:33.108 +08:00] [INFO] [pdutil.go:105] ["/pd-ctl scheduler add evict-leader-scheduler 5:Success!"]
[2022/03/04 13:21:33.162 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-3=5352]
[2022/03/04 13:21:43.236 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-3=5352]
[2022/03/04 13:21:53.315 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-3=5351.5]
[2022/03/04 13:22:03.380 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-3=5349.5]
[2022/03/04 13:22:13.436 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-3=5349.5]
[2022/03/04 13:22:23.498 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-3=2674]
[2022/03/04 13:22:33.564 +08:00] [INFO] [check.go:471] ["current leader:"] [tc-tikv-3=0]
[2022/03/04 13:22:33.564 +08:00] [INFO] [chaos.go:358] ["fault will last for"] [duration=2m0s]
[2022/03/04 13:22:34.056 +08:00] [INFO] [chaos.go:86] ["Run chaos"] [name="pd leader"] [selectors="[testbed-oltp-hm-7wksp/tc-pd-0]"] [experiment="{"Duration":"","Scheduler":null}"]
[2022/03/04 13:24:34.128 +08:00] [INFO] [chaos.go:151] ["Clean chaos"] [name="pd leader"] [chaosId="ns=testbed-oltp-hm-7wksp,kind=failure,name=pod-failure-qcsfgfnq,spec=&k8s.ChaosIdentifier{Namespace:"testbed-oltp-hm-7wksp", Name:"pod-failure-qcsfgfnq", Spec:FailureExperimentSpec{Duration: "", Scheduler: }}"]
2022-03-04T13:26:34.329+0800 INFO k8s/client.go:107 it should be noted that a long-running command will not be interrupted even the use case has ended. For more information, please refer to https://github.com/pingcap/test-infra/discussions/129
[2022/03/04 13:26:34.750 +08:00] [INFO] [pdutil.go:105] ["/pd-ctl scheduler config evict-leader-scheduler:[404] scheduler not found"]
2022-03-04T13:26:34.751+0800 INFO k8s/client.go:107 it should be noted that a long-running command will not be interrupted even the use case has ended. For more information, please refer to https://github.com/pingcap/test-infra/discussions/129
[2022/03/04 13:26:35.188 +08:00] [INFO] [pdutil.go:105] ["/pd-ctl scheduler remove evict-leader-scheduler:[404] "[PD:scheduler:ErrSchedulerNotFound]scheduler not found""]


@mayjiang0203 mayjiang0203 added the type/bug The issue is confirmed as a bug. label Mar 4, 2022
@mayjiang0203
Author

mayjiang0203 commented Mar 4, 2022

/type bug
/severity major
/found automation

@rleungx
Member

rleungx commented Mar 11, 2022

Is it reproducible? I think it might be that the scheduler deletion isn't persisted successfully.

@mayjiang0203 mayjiang0203 changed the title from "Evict leader scheduler can been show after pd leader recovery from failure" to "Evict leader scheduler can not show after pd leader recovery from failure" Apr 11, 2022
@rleungx
Member

rleungx commented Apr 11, 2022

This might be related to #4769.

@VelocityLight VelocityLight added affects-6.1 This bug affects the 6.1.x(LTS) versions. and removed affects-6.1 This bug affects the 6.1.x(LTS) versions. labels May 20, 2022
@VelocityLight VelocityLight added the affects-6.5 This bug affects the 6.5.x(LTS) versions. label Dec 2, 2022
@VelocityLight VelocityLight added affects-6.6 and removed affects-6.5 This bug affects the 6.5.x(LTS) versions. labels Feb 6, 2023
@VelocityLight VelocityLight added the affects-7.1 This bug affects the 7.1.x(LTS) versions. label Apr 20, 2023
@nolouch
Contributor

nolouch commented Jul 5, 2023

Closing because this issue is stale.

@nolouch nolouch closed this as completed Jul 5, 2023