[bug] operator crash #464

Closed
acekingke opened this issue Apr 25, 2022 · 3 comments · Fixed by #465

@acekingke
Contributor

acekingke commented Apr 25, 2022

Describe the problem

To Reproduce

  1. When I deleted the cluster, the operator crashed. The operator log leading up to the panic:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
2022-04-25T08:22:19.519Z        ERROR   mysqlcluster.syncer     failed to get/update node raft status  {"node": "sample-mysql-2.sample-mysql.default", "error": "failed to get raft status, err: Get \"http://sample-mysql-2.sample-mysql.default:6601/v1/raft/status\": dial tcp 10.233.92.76:6601: connect: connection refused"}
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).updateNodeStatus
        /go/src/mysqlcluster/syncer/status.go:237
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).Sync
        /go/src/mysqlcluster/syncer/status.go:151
github.com/presslabs/controller-util/syncer.Sync
        /go/pkg/mod/github.com/presslabs/[email protected]/syncer/syncer.go:82
github.com/radondb/radondb-mysql-kubernetes/controllers.(*StatusReconciler).Reconcile
        /go/src/controllers/status_controller.go:100
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
2022-04-25T08:22:21.126Z        DEBUG   controllers.Status      Schedule new cluster for reconciliation        {"key": "default/sample"}
2022-04-25T08:22:26.127Z        DEBUG   controllers.Status      Schedule new cluster for reconciliation        {"key": "default/sample"}
2022-04-25T08:22:28.528Z        ERROR   mysqlcluster.syncer     failed to check slave status  {"node": "sample-mysql-2.sample-mysql.default", "error": "Error 1045: Access denied for user 'radondb_operator'@'10.233.96.1' (using password: YES)"}
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).updateNodeStatus
        /go/src/mysqlcluster/syncer/status.go:251
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).Sync
        /go/src/mysqlcluster/syncer/status.go:151
github.com/presslabs/controller-util/syncer.Sync
        /go/pkg/mod/github.com/presslabs/[email protected]/syncer/syncer.go:82
github.com/radondb/radondb-mysql-kubernetes/controllers.(*StatusReconciler).Reconcile
        /go/src/controllers/status_controller.go:100
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
2022-04-25T08:22:28.531Z        ERROR   mysqlcluster.syncer     failed to check read only     {"node": "sample-mysql-2.sample-mysql.default", "error": "Error 1045: Access denied for user 'radondb_operator'@'10.233.96.1' (using password: YES)"}
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).updateNodeStatus
        /go/src/mysqlcluster/syncer/status.go:257
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).Sync
        /go/src/mysqlcluster/syncer/status.go:151
github.com/presslabs/controller-util/syncer.Sync
        /go/pkg/mod/github.com/presslabs/[email protected]/syncer/syncer.go:82
github.com/radondb/radondb-mysql-kubernetes/controllers.(*StatusReconciler).Reconcile
        /go/src/controllers/status_controller.go:100
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
2022-04-25T08:22:28.537Z        ERROR   mysqlcluster.syncer     failed to update labels {"pod": "sample-mysql-2", "namespace": "default", "error": "Operation cannot be fulfilled on pods \"sample-mysql-2\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).updateNodeStatus
        /go/src/mysqlcluster/syncer/status.go:278
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).Sync
        /go/src/mysqlcluster/syncer/status.go:151
github.com/presslabs/controller-util/syncer.Sync
        /go/pkg/mod/github.com/presslabs/[email protected]/syncer/syncer.go:82
github.com/radondb/radondb-mysql-kubernetes/controllers.(*StatusReconciler).Reconcile
        /go/src/controllers/status_controller.go:100
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x43d316]

goroutine 412 [running]:
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).updateNodeStatus(0xc000c91d70, 0x216af58, 0xc000c91b30, 0x217af88, 0xc000124a50, 0xc000bc5000, 0x3, 0x4, 0x0, 0x0)
        /go/src/mysqlcluster/syncer/status.go:286 +0xe9a
github.com/radondb/radondb-mysql-kubernetes/mysqlcluster/syncer.(*StatusSyncer).Sync(0xc000c91d70, 0x216af58, 0xc000c91b30, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /go/src/mysqlcluster/syncer/status.go:151 +0xdd6
github.com/presslabs/controller-util/syncer.Sync(0x216af58, 0xc000c91b30, 0x21719b8, 0xc000c91d70, 0x2168a80, 0xc00073cf00, 0x0, 0x0)
        /go/pkg/mod/github.com/presslabs/[email protected]/syncer/syncer.go:82 +0xa2
github.com/radondb/radondb-mysql-kubernetes/controllers.(*StatusReconciler).Reconcile(0xc00073cf40, 0x216af58, 0xc000c91b30, 0xc000c569a9, 0x7, 0xc000c56994, 0x6, 0xac02b40d5b2d3900, 0x0, 0x0, ...)
        /go/src/controllers/status_controller.go:100 +0x63d
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000544d20, 0x216af58, 0xc000c91b30, 0x1e6cb00, 0xc000216300)
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:298 +0x409
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000544d20, 0x216aeb0, 0xc000380000, 0xc0000b7700)
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:253 +0x35a
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2(0xc00045ba90, 0xc000544d20, 0x216aeb0, 0xc000380000)
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:214 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
        /go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:210 +0x85
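
The log shows updateNodeStatus reporting errors at status.go:237, 251, 257 and 278 and then panicking at status.go:286. One pattern that can produce this sequence is logging an error and then continuing to use a value that was never populated. The code at status.go:286 is not shown in this issue, so the following is only a minimal sketch of that failure mode and of an early-return guard. Every name in it (nodeConn, dial, nodeStatus, ReadOnly) is hypothetical and is not taken from the operator source.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeConn is a hypothetical stand-in for a per-node connection handle.
type nodeConn struct{ host string }

// ReadOnly is a hypothetical status query. It dereferences the receiver,
// so calling it on a nil *nodeConn would panic with a nil pointer dereference.
func (c *nodeConn) ReadOnly() bool { return c.host != "" }

// dial is hypothetical: it returns (nil, err) when the node is unreachable,
// as can happen while the cluster's pods are being deleted.
func dial(host string, reachable bool) (*nodeConn, error) {
	if !reachable {
		return nil, errors.New("connection refused")
	}
	return &nodeConn{host: host}, nil
}

// nodeStatus returns early on a dial error instead of logging the error
// and carrying on with the nil connection.
func nodeStatus(host string, reachable bool) (string, error) {
	conn, err := dial(host, reachable)
	if err != nil {
		return "unknown", fmt.Errorf("dial %s: %w", host, err)
	}
	if conn.ReadOnly() {
		return "read-only", nil
	}
	return "read-write", nil
}

func main() {
	for _, reachable := range []bool{true, false} {
		status, err := nodeStatus("sample-mysql-2", reachable)
		fmt.Println(status, err)
	}
}
```

With the guard in place, an unreachable node yields an "unknown" status and an error for the caller to handle, rather than a crash of the whole operator process.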

Expected behavior

Environment:

debug mode.

  • RadonDB MySQL version: operator
@acekingke acekingke added the bug Something isn't working label Apr 25, 2022
@acekingke acekingke self-assigned this Apr 26, 2022
@acekingke acekingke added this to the Next milestone Apr 26, 2022
acekingke added a commit to acekingke/radondb-mysql-kubernetes that referenced this issue Apr 26, 2022
@runkecheng
Collaborator

Cannot reproduce when deploying with the image.

@andyli029
Contributor

Debug Env

@acekingke
Contributor Author

How to reproduce in a non-debug environment:
use the following shell script:

# Run the command named by $1 a total of $2 times.
repeat () {
    for i in $(seq 1 "$2"); do
        "$1"
    done
}

# Create a 2-replica cluster, scale it to 3 replicas, then delete it right away.
# Uses yq v3 write syntax (yq w -i) and expects SIDECARIMAGE to be set.
testOperator() {
    yq w -i config/samples/mysql_v1alpha1_mysqlcluster.yaml spec.replicas 2
    yq w -i config/samples/mysql_v1alpha1_backup.yaml spec.image "$SIDECARIMAGE"
    yq w -i config/samples/mysql_v1alpha1_mysqlcluster.yaml spec.persistence.size 20Gi
    kubectl apply -f config/samples/mysql_v1alpha1_mysqlcluster.yaml
    yq w -i config/samples/mysql_v1alpha1_mysqlcluster.yaml spec.replicas 3
    kubectl apply -f config/samples/mysql_v1alpha1_mysqlcluster.yaml
    kubectl delete -f config/samples/mysql_v1alpha1_mysqlcluster.yaml
}

repeat testOperator 50

Run it repeatedly and you will eventually see the crash.

@qianfen2021 qianfen2021 added the ok to test test ok label May 7, 2022
acekingke added a commit to acekingke/radondb-mysql-kubernetes that referenced this issue Jun 15, 2022
zhl003 pushed a commit to zhl003/radondb-mysql-kubernetes that referenced this issue Aug 17, 2022