Stability test failed: problem with partition settings when PD is continuously rolling upgrade #784

Closed
shuijing198799 opened this issue Aug 19, 2019 · 1 comment · Fixed by #830
Labels: test/stability (stability tests), type/bug (Something isn't working)

shuijing198799 commented Aug 19, 2019

Bug Report

What version of Kubernetes are you using?

Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:54:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.2", GitCommit:"17c77c7898218073f14c8d573582e8d2313dc740", GitTreeState:"clean", BuildDate:"2018-10-24T06:43:59Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
What version of TiDB Operator are you using?

TiDB Operator Version: version.Info{GitVersion:"v1.0.0-rc.1.54+3045bfe24cdce5", GitCommit:"3045bfe24cdce51a1551f685a1f14ff28fea7bbc", GitTreeState:"clean", BuildDate:"2019-08-16T08:03:45Z", GoVersion:"go1.12", Compiler:"gc", Platform:"linux/amd64"}

What storage classes exist in the Kubernetes cluster and what are used for PD/TiKV pods?

NAME PROVISIONER AGE
local-storage kubernetes.io/no-provisioner 132d
What's the status of the TiDB cluster pods?

All pods are running normally
What did you do?

What did you expect to see?
The stability test runs without errors.
What did you see instead?
The stability test reported the error "pd is leader, can't be deleted namespace ns2 name cluster2-pd-2".

shuijing198799 commented Aug 19, 2019

Additional:
I found that the rolling upgrade operation was performed twice in the stability test. I caught this log during the second rolling upgrade.

I0818 01:12:23.535288 1 event.go:221] Event(v1.ObjectReference{Kind:"TidbCluster", Namespace:"ns2", Name:"cluster2", UID:"0a3d8b21-c150-11e9-8fa1-525400f42bc9" , APIVersion: "pingcap.com/v1alpha1", ResourceVersion: "35283531", FieldPath:""}): type: 'Normal' reason: 'SuccessfulUpdate' update StatefulSet cluster2-pd in TidbCluster cluster2 successful
I0818 01:12:23.555194 1 tidbcluster_control.go:68] TidbCluster: [ns2/cluster2] updated successfully
I0818 01:12:23.572846 1 utils.go:185] set ns2/cluster2-pd partition to 3
I0818 01:12:23.572875 1 utils.go:185] set ns2/cluster2-pd partition to 0

For a rolling upgrade, setting the partition to 3 the first time is the correct operation. The second time, however, it should be set to 2 instead of 0. My initial suspicion is this code:

https://github.com/pingcap/tidb-operator/blob/master/pkg/manager/member/pd_upgrader.go#L60-L80

Here PD is upgraded continuously, which seems to cause problems with the revision settings.
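
To make the expected sequence concrete, here is a minimal Go sketch of how the partition would step down one ordinal at a time for a 3-replica PD StatefulSet. It is not the code in pd_upgrader.go: `nextPartition` and the `upgraded` slice are hypothetical names used only for illustration, and the sketch assumes that pods with an ordinal greater than or equal to the partition are the ones the StatefulSet controller updates.

```go
package main

import "fmt"

// nextPartition returns the partition a rolling upgrade should set next.
// upgraded[i] reports whether the pod with ordinal i already runs the new
// revision. Hypothetical helper for illustration, not tidb-operator code.
// It assumes len(upgraded) >= replicas.
func nextPartition(replicas int32, upgraded []bool) int32 {
	// Walk from the highest ordinal down and stop at the first pod that
	// still runs the old revision: that pod is the next one to upgrade.
	for i := replicas - 1; i >= 0; i-- {
		if !upgraded[i] {
			return i
		}
	}
	// Every pod is upgraded, so nothing needs to be held back.
	return 0
}

func main() {
	// No pod upgraded yet: only cluster2-pd-2 should be released.
	fmt.Println(nextPartition(3, []bool{false, false, false})) // 2
	// cluster2-pd-2 is done: release cluster2-pd-1 next.
	fmt.Println(nextPartition(3, []bool{false, false, true})) // 1
	// All pods upgraded.
	fmt.Println(nextPartition(3, []bool{true, true, true})) // 0
}
```

Under these assumptions, the value after the initial 3 is 2, which matches the expectation above. Jumping straight to 0 releases every pod at once, including the PD leader.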

@xiaojingchen, could you please look into this with me?
