Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 etcd client terseness #2486

Conversation

sethp-nr
Copy link
Contributor

What this PR does / why we need it:

These changes surface the connection error that occurs inside the etcd
client when a misconfiguration occurs (such as invalid TLS certificates,
or an address that does not match one of the valid SANs).

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2454

Two things I don't feel great about are:

  • I'm not sure how to write a reasonably sized test for the behavior; the "easy" way to reproduce the issue I saw is to delete the etcd certs from a CAPI management cluster and let them be regenerated. Another approach that worked well in testing was to change this line in controlplane/kubeadm/internal/cluster.go:
	etcdclient, err := etcd.NewEtcdClient("127.0.0.1", dialer.DialContextWithAddr, tlsConfig)

to

	etcdclient, err := etcd.NewEtcdClient("invalid name that doesn't show up in the server's SAN list", dialer.DialContextWithAddr, tlsConfig)

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 28, 2020
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 28, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sethp-nr
To complete the pull request process, please assign davidewatson
You can assign the PR to them by writing /assign @davidewatson in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sethp-nr sethp-nr force-pushed the fix/etcd-client-terseness branch from 10252ff to 5f1af73 Compare February 28, 2020 22:31
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 28, 2020
@sethp-nr sethp-nr force-pushed the fix/etcd-client-terseness branch from 5f1af73 to c143f0a Compare February 28, 2020 22:32
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 28, 2020
These changes surface the connection error that occurs inside the etcd
client when a misconfiguration occurs (such as invalid TLS certificates,
or an `address` that does not match one of the valid SANs).
@sethp-nr sethp-nr force-pushed the fix/etcd-client-terseness branch from c143f0a to 5eec3b0 Compare February 28, 2020 23:03
@randomvariable
Copy link
Member

nice digging.
we may need to wait for your PR to grpc-go to merge though
/assign @vincepri for thoughts

@vincepri
Copy link
Member

vincepri commented Mar 2, 2020

/hold
/milestone v0.3.x

Waiting for the upstream PR to GRPC to be merged before merging this

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 2, 2020
@k8s-ci-robot k8s-ci-robot added this to the v0.3.x milestone Mar 2, 2020
@detiber
Copy link
Member

detiber commented Mar 2, 2020

New upstream PR: grpc/grpc-go#3412

@sethp-nr
Copy link
Contributor Author

sethp-nr commented Mar 6, 2020

I got redirected to grpc/grpc-go#2031 to continue the discussion.

@randomvariable
Copy link
Member

sigh We also have the issue that etcd is not yet compatible with go-rpc 1.27

@sethp-nr
Copy link
Contributor Author

sethp-nr commented Mar 6, 2020

Yeah, that's why this PR is pointing to a backport of my fix from the 1.27 branch to the 1.23 branch.

I'm not sure what grpc-go's backport policy is yet, my hope is that we can get whatever fix we land on into official 1.23!

@randomvariable
Copy link
Member

somebody made a start at updating etcd etcd-io/etcd#11663

@k8s-ci-robot
Copy link
Contributor

@sethp-nr: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 6, 2020
@vincepri vincepri marked this pull request as draft April 20, 2020 21:17
@vincepri
Copy link
Member

Going to close this for now, given that it has been open for a while without progress, feel free to reopen when needed.

/close

@k8s-ci-robot
Copy link
Contributor

@vincepri: Closed this PR.

In response to this:

Going to close this for now, given that it has been open for a while without progress, feel free to reopen when needed.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[kubeadm control plane]: etcd communication errors are being swallowed
5 participants