Bug Report: vitess could up the cluster on the latest tag #12678

Closed
Areso opened this issue Mar 21, 2023 · 2 comments · Fixed by planetscale/vitess-operator#398

Areso commented Mar 21, 2023

Overview of the Issue

  1. After changing the image from vtctld: vitess/lite:v16.0.0 to vtctld: vitess/lite:latest, the cluster couldn't come up because of this error:
mysqld socket file exists, but can't connect: server asked for unsupported auth method: sha256_password (errno 2012) (sqlstate HY000)

Reproduction Steps

  1. create namespace vitess
  2. apply -f no_secret_prom.yml no_secret_prom.txt
  3. apply -f cluster_k8s-mysql_to_share.yml cluster_k8s-mysql_to_share.txt
  4. The users aren't created if vitess/lite:latest is used instead of vitess/lite:v16.0.0:
mysql> select user from mysql.user;
+------------------+
| user             |
+------------------+
| mysql.infoschema |
| mysql.session    |
| mysql.sys        |
| root             |
+------------------+
4 rows in set (0.00 sec)

As a result, logins for the missing users fail:

mysql> select * from information_schema.CONNECTION_CONTROL_FAILED_LOGIN_ATTEMPTS;
+---------------------------+-----------------+
| USERHOST                  | FAILED_ATTEMPTS |
+---------------------------+-----------------+
| 'vt_allprivs'@'localhost' |              17 |
| 'vt_dba'@'localhost'      |              91 |
+---------------------------+-----------------+

So the cluster never becomes ready (the probes fail):

NAME                                                              READY   STATUS    RESTARTS      AGE
test-vttablet-eucentral1c-2380659132-c1769cbe                  2/3     Running   2 (84m ago)   85m
test-vttablet-eucentral1c-3099445110-3649f3e3                  2/3     Running   2 (84m ago)   85m
test-vttablet-eucentral1c-3197856528-2bd8bdf3                  2/3     Running   2 (84m ago)   85m

These are the failing probes:

  Warning  Unhealthy               37m (x2 over 37m)      kubelet                  Readiness probe failed: dial tcp 10.11.75.79:3306: connect: connection refused
  Warning  Unhealthy               2m45s (x244 over 37m)  kubelet                  Readiness probe failed: HTTP probe failed with statuscode: 500

Binary Version

vtctlclient --server 127.0.0.1:15999 --version
Version: 15.0.2 (Git revision a914f409c823ba1fe74d816885e3e17c57e63f08 branch 'HEAD') built on Wed Dec 14 17:57:23 UTC 2022 by runner@fv-az577-365 using go1.18.9 linux/amd64

This is strange, because the currently running vtctld pod looks like this:

knv describe pod test-eucentral1c-vtctld-362d5fd2-84d978447-d674f
Name:                 test-eucentral1c-vtctld-362d5fd2-84d978447-d674f
Namespace:            vitess
Priority:             1000
Priority Class Name:  vitess
Service Account:      default
Node:                 ip-10-96-154-27.eu-central-1.compute.internal/10.96.154.27
Start Time:           Tue, 21 Mar 2023 14:19:07 +0200
Labels:               planetscale.com/cell=eucentral1c
                      planetscale.com/cluster=test
                      planetscale.com/component=vtctld
                      pod-template-hash=84d978447
Annotations:          kubernetes.io/psp: eks.privileged
Status:               Running
IP:                   10.11.75.251
IPs:
  IP:           10.11.75.251
Controlled By:  ReplicaSet/test-eucentral1c-vtctld-362d5fd2-84d978447
Containers:
  vtctld:
    Container ID:  docker://fa07d78513bcdae0f95af657190486c141ea46b80d3da110925f3304926c333e
    Image:         vitess/lite:latest
    Image ID:      docker-pullable://vitess/lite@sha256:6d7e67bbef4493516d87542def9a08d77a36c7e59904585c04b94fc30b55905c
    Ports:         15000/TCP, 15999/TCP
    Host Ports:    0/TCP, 0/TCP
    Command:
      /vt/bin/vtctld


Operating System and Environment details

EKS, "version": "1.23"

Log Fragments

No response

@Areso added the Needs Triage and Type: Bug labels Mar 21, 2023

Areso commented Mar 21, 2023

image

@mattlord self-assigned this Mar 21, 2023
@mattlord added the Component: Operator label and removed the Needs Triage label Mar 21, 2023

mattlord commented Mar 21, 2023

Hi @Areso !

Thank you for reporting this issue! We ran into the same thing ourselves earlier today and started working on the fix here: planetscale/vitess-operator#398

This was caused by a recent change in v17.0.0-SNAPSHOT: #12206

The database instance init code in the deployment yaml now needs to temporarily disable super_read_only in order to create the users in the mysqld instance, before re-enabling super_read_only. You can see an example of where that's being done in the updated deployment yaml here: https://github.com/planetscale/vitess-operator/pull/398/files#diff-a83de14404466819742421b2aa49ce2575b2f8b4304b3078c8b28cd49582aa14
Because it did not, we failed to create the users and encountered the issue described here: https://lefred.be/content/mysql-whos-filling-my-error-log/
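
As a rough illustration of that pattern (a sketch only, not the contents of the linked PR), the user-creation part of an init script could be wrapped as follows. The user names are taken from the failed-login output in the report; the grants are placeholder assumptions, so use the privileges from your own deployment yaml.

```sql
-- Sketch: allow writes while the init script creates users.
-- With super_read_only enabled, CREATE USER and GRANT are rejected.
SET GLOBAL super_read_only = 'OFF';

-- User names come from the failed-login output in this issue.
-- The grants below are illustrative assumptions, not the real init file.
CREATE USER IF NOT EXISTS 'vt_dba'@'localhost';
GRANT ALL ON *.* TO 'vt_dba'@'localhost';

CREATE USER IF NOT EXISTS 'vt_allprivs'@'localhost';
GRANT ALL ON *.* TO 'vt_allprivs'@'localhost';

-- Restore the read-only protection once the users exist.
SET GLOBAL super_read_only = 'ON';
```

After a run like this, `select user from mysql.user;` should list the `vt_*` users alongside the built-in accounts.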

This is a breaking change in v17 that we have documented in the release notes and will have more details on. We'll need to be sure that we clearly communicate the breaking change and how to upgrade without issue.

I'm confused as to why you said you were using the v16.0.0 image, then upgraded to latest (which is v17.0.0-SNAPSHOT), yet noted that you're using v15.0.2?
