HA access to kube-apiserver using a host-local proxy (#2106)
Conversation
Hello nicolast, my role is to assist you with the merge of this pull request. Status report is not available.
Conflict

A conflict has been raised during the creation of the integration branch w/2.5/improvement/GH-2103-apiserver-ha. I have not created the integration branch.

Here are the steps to resolve this conflict:

$ git fetch
$ git checkout -B w/2.5/improvement/GH-2103-apiserver-ha origin/development/2.5
$ git merge origin/improvement/GH-2103-apiserver-ha
$ # <intense conflict resolution>
$ git commit
$ git push -u origin w/2.5/improvement/GH-2103-apiserver-ha
Force-pushed from 7c93b7c to 3c4fd71
There's a leftover reference to the … Need input @gdemonet.
CI failures because in the upgrade/downgrade scenarios, …
Force-pushed from b270c28 to 49449fc
Force-pushed from 49449fc to c93a3f4
Waiting for approval

The following approvals are needed before I can proceed with the merge:

Peer approvals must include at least 1 approval from the following list:
Waiting for approval

The following approvals are needed before I can proceed with the merge:

Peer approvals must include at least 1 approval from the following list:

The following reviewers are expecting changes from the author, or must review again:
Force-pushed from 29be094 to acab9d7
History mismatch

Merge commit 2d6af2647b352daf7acee3aa6ee6f23d2771ceca on the integration branch does not match the expected history. It is likely due to a rebase of the source branch. Please use the `/reset` command to reset the integration branches.
Force-pushed from acab9d7 to 706d07e
This seems to somehow break downgrade, because …
Force-pushed from 3d4a373 to b6acef7
Previous MetalK8s states used during `iso-manager.sh -a` stage expect the path to the `KubeConfig` to be used by `salt-master` to be available in the Pillar (`metalk8s.api_server.kubeconfig`). This is removed in a subsequent commit, replaced by a hard-coded value. As such, we need to override this value (using the same hard-coded value) in the script, otherwise importing e.g. a 2.3 ISO on a 2.4.x platform (containing these changes) will fail. Keeping this as a separate commit, mainly for documentation purposes. See: #2106
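A hedged sketch of the kind of override described above; the state name, the exact invocation, and the hard-coded kubeconfig path are assumptions for illustration, not taken from the actual script:

```sh
# Hypothetical sketch: re-inject the kubeconfig path into the Pillar so that
# older (2.3-era) states that still read metalk8s.api_server.kubeconfig keep
# working; the state name and the path are illustrative only.
salt-call state.sls metalk8s.iso-manager \
    pillar='{"metalk8s": {"api_server": {"kubeconfig": "/etc/kubernetes/admin.conf"}}}'
```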
Force-pushed from b6acef7 to e4f5c6c
Previous MetalK8s states used during `iso-manager.sh -a` stage expect the path to the `KubeConfig` to be used by `salt-master` to be available in the Pillar (`metalk8s.api_server.kubeconfig`). This is removed in a subsequent commit, replaced by a hard-coded value. As such, we need to override this value (using the same hard-coded value) in the script, otherwise importing e.g. a 2.3 ISO on a 2.4.x platform (containing these changes) will fail. Keeping this as a separate commit, mainly for documentation purposes. See: #2106
Force-pushed from 5a57463 to f203600
Looks good, just a few comments.
@@ -4,7 +4,7 @@
 include:
   - metalk8s.internal.m2crypto

-{%- set apiserver = 'https://' ~ pillar.metalk8s.api_server.host ~ ':6443' %}
+{%- set apiserver = 'https://' ~ grains['metalk8s']['control_plane_ip'] ~ ':6443' %}
Why not use the "proxy" here? That means if, for whatever reason, the local apiserver does not work, the `admin.conf` will not work either, and since we use this kubeconfig everywhere, that's not really good IMO.
Because this conf is perfectly valid to be used from outside the node, e.g. an admin copying it to his machine (aren't we even doing that for tests?)

Note how most `KubeConfig`s are rewritten to use the local proxy, so the services using those shouldn't be impacted. AFAIK only `salt-master` is using `admin.conf` (which is a bug by itself...).

In practice, not much changes, because in our current deployments the `apiServer.host` value is set to the hostname or IP of the bootstrap node anyway...
Note that any script which uses `admin.conf`, and is certain it's running on a host on which the proxy is deployed, could pass `-s`/`--server` to the `kubectl` invocation to use the proxy anyway.
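For illustration, a minimal sketch of that pattern; the kubeconfig path and the proxy's listen address and port are assumptions, not taken from this PR:

```sh
# Point kubectl at the host-local proxy instead of the server recorded in the
# kubeconfig; the proxy address/port below are assumed for this example.
kubectl --kubeconfig /etc/kubernetes/admin.conf \
        --server https://127.0.0.1:7443 \
        get nodes
```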
OK, but then I think we need to change something in our current restore script, because if I remember correctly we need to contact the apiserver before configuring the local one, and we basically call the `kubeconfig` Salt state against that apiserver and then use this new `admin.conf`.
That wouldn't work today in the first place then, unless the APIserver address was HA...
No I mean, that's not a reasonable requirement in the first place.
Instead, the recover script should take, as an argument, the address of an API-server, and use that address when connecting to do whatever it needs to do.
With this PR, yes indeed, but when the restore script got written, the goal was for our bootstrap config to hold an IP able to reach the apiserver, not one bound only to the bootstrap node, whether through keepalived or any other system, because configuring all servers with the bootstrap IP as the apiserver endpoint is really wrong.

But yes, now we need to change this.
My point is that a VIP or LB for API-server has always been optional, so the restore script relying on that is wrong in the first place.
The bug fixed in this PR is the fact that sometimes we relied on this given address to be HA, given the context of #2108
The restore script should also be updated to support this properly, and I'm happy to open a bug for this, but I'd rather not move this into this PR, since the context is completely different IMHO.
Finally, the lack of CI for the restore operation (#1687) basically allows me to merge this without remorse ;-P
Ugh, I keep putting comments, but the topic keeps resonating.
What I was thinking, primarily: there should be no `admin.conf` in the backup, because it's not at all required to restore a cluster. An `admin.conf` can be generated from the CA data that does need to be backed up.
The whole restore process should be constructive, starting from the minimal set of required data as kept in a backup.
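A minimal sketch of that idea, assuming the backup contains the cluster CA certificate and key; all paths, names, and the server address below are illustrative assumptions:

```sh
# Issue an admin client certificate signed by the backed-up CA.
openssl genrsa -out admin.key 2048
openssl req -new -key admin.key \
    -subj "/CN=kubernetes-admin/O=system:masters" -out admin.csr
openssl x509 -req -in admin.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
    -days 365 -out admin.crt

# Assemble a kubeconfig around it; the server address is an assumption.
kubectl config set-cluster kubernetes --server=https://127.0.0.1:7443 \
    --certificate-authority=ca.crt --embed-certs=true --kubeconfig=admin.conf
kubectl config set-credentials kubernetes-admin --client-certificate=admin.crt \
    --client-key=admin.key --embed-certs=true --kubeconfig=admin.conf
kubectl config set-context admin@kubernetes --cluster=kubernetes \
    --user=kubernetes-admin --kubeconfig=admin.conf
kubectl config use-context admin@kubernetes --kubeconfig=admin.conf
```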
So, I went to see, and apparently there's not even an `admin.conf` in the backup archive (which is good!). However, what is in there has a really strange structure... We really need to spend some time on that, it seems.
For some reason `/etc/profile` isn't exported when `doit` is invoked. As such, the build is not using the caching HTTP proxy we have in place in our CI environment, resulting in a lot of unnecessary public traffic or hitting upstream services. Fixing by an explicit `source` of the environment file when invoking `doit.sh`.
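A hedged sketch of that fix; the exact wrapper invocation is an assumption based on the commit message:

```sh
# Explicitly load the environment (e.g. the caching proxy variables) before
# running the build wrapper, since /etc/profile is not exported otherwise.
source /etc/profile
./doit.sh "$@"
```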
Install of this version is broken (missing `requirements/base.txt` file used by `setup.py` in the package). See: kragniz/python-etcd3#952
This SLS really doesn't make much sense.
Using two different permissions for the same file doesn't really help trying to make states idempotent.
Since we'll start using `nginx` as a 'normal' container, we need to distribute it as both a tar archive (to inject in the `containerd` cache on the bootstrap node), as well as in the 'registry' for clients to pull. See: #2103
We replace the `save_as_tar` boolean by a list of formats to use to save the image. This also allows cleaning up the code of `RemoteImage` by moving format-specific code into dedicated classes instead of branching hither and yon in `RemoteImage`. Refs: #2103 Signed-off-by: Sylvain Laperche <[email protected]> (cherry picked from commit ac15e32) Signed-off-by: Nicolas Trangez <[email protected]>
When using a local(host) `nginx` to proxy to the `kube-apiserver` instances using a simple *stream* transport (i.e., TCP socket, no TLS handling by `nginx`), we need to ensure the TLS server exposes a certificate the client will accept, i.e., one that has `IP:127.0.0.1` in its `subjectAltName` field. See: #2103
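One way to verify that requirement on a node, as a sketch (the certificate path is an assumption):

```sh
# Check that the kube-apiserver certificate lists IP Address:127.0.0.1 among
# its subject alternative names.
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text \
    | grep -A1 'Subject Alternative Name'
```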
This is similar to the poor-mans-HA functionality as found in the Kubespray project. See: #2103
Instead of using `BootstrapConfiguration` `apiServer.host`, which will be removed in a subsequent commit since no longer required, use the control-plane IP of the host on which the `KubeConfig` file is generated. See: #2103
…rver` Instead of relying on the `BootstrapConfiguration` `apiServer.host` value, we can instead use the proxy to `kube-apiserver` running on every host in the cluster. See: #2103
Previous MetalK8s states used during `iso-manager.sh -a` stage expect the path to the `KubeConfig` to be used by `salt-master` to be available in the Pillar (`metalk8s.api_server.kubeconfig`). This is removed in a subsequent commit, replaced by a hard-coded value. As such, we need to override this value (using the same hard-coded value) in the script, otherwise importing e.g. a 2.3 ISO on a 2.4.x platform (containing these changes) will fail. Keeping this as a separate commit, mainly for documentation purposes. See: #2106
We no longer need this since we provide in-cluster HA for `kube-apiserver` access. If this is desired for out-of-cluster access, we can provide this using a `LoadBalancer` `Service` once we have the infrastructure to support this in place. This also removed the optional deployment of `keepalived`. See: #2103 See: #1788
History mismatch

Merge commit edb50cc539c1bad95a84d59ec3a8fd77e294bbcf on the integration branch does not match the expected history. It is likely due to a rebase of the source branch. Please use the `/reset` command to reset the integration branches.

The following options are set: approve
Force-pushed from f203600 to 9d70c9d
/reset
Reset complete

I have successfully deleted this pull request's integration branches.

The following options are set: approve
Conflict

A conflict has been raised during the creation of the integration branch w/2.5/improvement/GH-2103-apiserver-ha. I have not created the integration branch.

Here are the steps to resolve this conflict:

$ git fetch
$ git checkout -B w/2.5/improvement/GH-2103-apiserver-ha origin/development/2.5
$ git merge origin/improvement/GH-2103-apiserver-ha
$ # <intense conflict resolution>
$ git commit
$ git push -u origin w/2.5/improvement/GH-2103-apiserver-ha

The following options are set: approve
Waiting for approval

The following approvals are needed before I can proceed with the merge:

Peer approvals must include at least 1 approval from the following list:

The following options are set: approve
LGTM (except the restore script, which is broken now but will be tackled in another PR).
In the queue

The changeset has received all authorizations and has been added to the queue. The changeset will be merged in:

The following branches will NOT be impacted:

There is no action required on your side. You will be notified here once the merge is complete.

IMPORTANT: Please do not attempt to modify this pull request. If you need this pull request to be removed from the queue, please contact an administrator.

The following options are set: approve
Works on my machine, and code looks good to me, except for the `apiVersion` of the `BootstrapConfiguration`, which should change (as you documented it, it's a breaking change).
  name: apiserver-proxy
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
Oh, I was surprised by this "addon manager", looks like it is a "Bash Operator" :) https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/addon-manager/kube-addons.sh
{%- if apiserver == grains['metalk8s']['control_plane_ip'] %}
{%- set weight = 100 %}
{%- else %}
{%- set weight = 1 %}
{%- endif %}
FWIW, you can use inline conditional expressions with Jinja:
{%- set weight = 100 if apiserver == grains['metalk8s']['control_plane_ip'] else 1 %}
@@ -2,6 +2,15 @@

## Release 2.4.2 (in development)

### Breaking changes
Would probably make sense to bump the version of BootstrapConfiguration
No. What you can put in the 'object' doesn't really change: there are no backwards-incompatible changes or anything. We just now ignore information we used before (and as such, no longer require it to be set). It's a required value that now becomes kinda 'optional' (and, in fact, unused).

Basically, any existing `v1alpha2` `BootstrapConfiguration` still works.
I have successfully merged the changeset of this pull request.

The following branches have NOT changed:

Please check the status of the associated issue GH-2103. Goodbye nicolast.
Implement the 'local `nginx`' approach as described in #2103.

Fixes: #2103