Automatic certificates renewal #2914

Merged
merged 23 commits into from
Dec 16, 2020

Conversation

alexandre-allard
Contributor

@alexandre-allard alexandre-allard commented Nov 4, 2020

Component: salt

Context:
In MetalK8s, we deploy a lot of certificates for interaction between components.
These certificates have an expiration date and must be renewed before they expire; otherwise, the services can no longer communicate.

Summary:

  • Add a beacon configuration through role pillars for each deployed certificate to trigger an event when a certificate is expired or will expire.
  • Add a reactor listening for the certificate expiration event, then based on information in role pillars, run specific sls to renew the expired certificates.

Acceptance criteria:
Tests verify that every certificate is renewed correctly, that nothing breaks, and that every service that must be restarted is restarted.


Closes: #1887

@alexandre-allard alexandre-allard force-pushed the feature/1887-automatic-certs-renewal branch from 455c330 to 43dfb73 Compare November 4, 2020 08:23
@alexandre-allard alexandre-allard force-pushed the feature/1887-automatic-certs-renewal branch from 43dfb73 to 5891105 Compare November 4, 2020 13:28
@alexandre-allard alexandre-allard marked this pull request as ready for review November 4, 2020 13:29
@alexandre-allard alexandre-allard requested a review from a team as a code owner November 4, 2020 13:29
@NicolasT
Contributor

NicolasT commented Nov 4, 2020

This is not sufficient, since from what I can see it only checks 'plain file' certs. However, we have certs embedded in KubeConfig-style files, e.g. in /etc/kubernetes/calico.conf or /etc/kubernetes/admin.conf which should be checked and updated as well.

Also, I think there should be some form of documentation/notes somewhere explaining how this feature 'works' and how the various pieces work together.
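
For readers following along, the kubeconfig-embedded certificates mentioned above can be illustrated with a minimal sketch. It assumes the kubeconfig has already been loaded into a dict (for example with a YAML parser) and relies on the standard `users[].user.client-certificate-data` field, which holds a base64-encoded PEM certificate. The function name `embedded_client_certs` is hypothetical and is not taken from this PR; parsing the PEM to read its expiry date is left out.

```python
import base64


def embedded_client_certs(kubeconfig):
    """Return {user name: PEM text} for users with an embedded client cert.

    `kubeconfig` is an already-parsed kubeconfig (a dict). Users that
    authenticate differently (token, external file, ...) are skipped.
    """
    certs = {}
    for entry in kubeconfig.get("users") or []:
        data = (entry.get("user") or {}).get("client-certificate-data")
        if data:
            # The field is base64-encoded PEM, so decoding yields text.
            certs[entry.get("name")] = base64.b64decode(data).decode("ascii")
    return certs


# Usage with a placeholder payload (not a real certificate).
placeholder_pem = "-----BEGIN CERTIFICATE-----\nPLACEHOLDER\n-----END CERTIFICATE-----\n"
example = {
    "users": [
        {
            "name": "admin",
            "user": {
                "client-certificate-data": base64.b64encode(
                    placeholder_pem.encode("ascii")
                ).decode("ascii")
            },
        }
    ]
}
print(embedded_client_certs(example)["admin"] == placeholder_pem)
```

A plain-file certificate check never sees this data, because the certificate only exists inside the kubeconfig document.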

Contributor

@gdemonet gdemonet left a comment

Some comments on the kubeconfig-related parts.

salt/_beacons/metalk8s_kubeconfig_info.py (5 review threads, outdated and resolved)
@alexandre-allard alexandre-allard force-pushed the feature/1887-automatic-certs-renewal branch from 87604df to d910eca Compare November 5, 2020 11:10
@alexandre-allard alexandre-allard force-pushed the feature/1887-automatic-certs-renewal branch from d910eca to 3a414ae Compare November 5, 2020 11:12
@bert-e
Contributor

bert-e commented Nov 5, 2020

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

@bert-e
Contributor

bert-e commented Nov 5, 2020

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

@alexandre-allard alexandre-allard changed the title Feature/1887 automatic certs renewal Automatic certificates renewal Nov 5, 2020
@alexandre-allard alexandre-allard force-pushed the feature/1887-automatic-certs-renewal branch 2 times, most recently from 351e4e9 to d6e3363 Compare November 9, 2020 13:57
This makes it possible to change how long a
kubeconfig is valid (by changing the validity
period of the embedded certificates).

Refs: #1887
This beacon takes a list of kubeconfig files as
input, checks whether they need to be renewed,
and fires an event on the Salt bus when needed.

To configure this beacon, a section must be
added, either in minion configuration or through
the pillar, as follows:

beacons:
  metalk8s_kubeconfig_info:
    - files:
        - /etc/kubernetes/calico.conf
        - /etc/kubernetes/admin.conf:
            notify_days: 30
    - interval: 86400
    - notify_days: 15

Default notify_days, if not provided, is 45.
It can be overridden for a specific kubeconfig as
shown above.

Refs: #1887
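
The `notify_days` resolution described above can be sketched as follows. This is only an illustration of the documented behaviour (default of 45, a beacon-wide value, and a per-file override), not the beacon's actual code. The function names `notify_days_per_file` and `needs_renewal` are hypothetical, and the input is the beacon configuration as Salt would hand it over after YAML parsing (a list of single-key dicts).

```python
from datetime import datetime, timedelta, timezone

DEFAULT_NOTIFY_DAYS = 45  # documented default when nothing is configured


def notify_days_per_file(config):
    """Map each watched kubeconfig path to its effective notify_days."""
    merged = {}
    for item in config:  # flatten the list of single-key dicts
        merged.update(item)
    default = merged.get("notify_days", DEFAULT_NOTIFY_DAYS)

    result = {}
    for entry in merged.get("files", []):
        if isinstance(entry, dict):
            # "path: {notify_days: N}" form: per-file override
            for path, opts in entry.items():
                result[path] = (opts or {}).get("notify_days", default)
        else:
            result[entry] = default
    return result


def needs_renewal(not_after, notify_days, now=None):
    """True when the certificate expires within notify_days (or already has)."""
    now = now or datetime.now(timezone.utc)
    return not_after - now <= timedelta(days=notify_days)


config = [
    {"files": [
        "/etc/kubernetes/calico.conf",
        {"/etc/kubernetes/admin.conf": {"notify_days": 30}},
    ]},
    {"interval": 86400},
    {"notify_days": 15},
]
print(notify_days_per_file(config))
# {'/etc/kubernetes/calico.conf': 15, '/etc/kubernetes/admin.conf': 30}
```
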
This file is generated by pytest when using
--cov* options for salt unit tests, so let's
ignore it to avoid polluting git command output
or committing it because of a careless mistake.
These defaults will be merged with the pillar
and can be overridden. They'll be used by both
the certificate & kubeconfig expiry beacons,
the related reactor (certs renewal) and the
`x509.certificate_managed` state.

Refs: #1887
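
As a rough illustration of "defaults merged with the pillar", here is a minimal recursive merge where pillar values win over defaults. The helper name `merge_defaults` and the keys in the example (`certificates`, `beacon`, `notify_days`, `interval`) are assumptions made for the sketch; the real layout of `defaults.yaml` and the merge mechanism used by the formulas are not shown in this PR conversation.

```python
import copy


def merge_defaults(defaults, overrides):
    """Return a new dict: defaults updated recursively with overrides."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key.
            merged[key] = merge_defaults(merged[key], value)
        else:
            # Scalars, lists, or mismatched types: the override replaces.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical keys, for illustration only.
defaults = {"certificates": {"beacon": {"notify_days": 45, "interval": 86400}}}
pillar = {"certificates": {"beacon": {"notify_days": 15}}}
print(merge_defaults(defaults, pillar))
# {'certificates': {'beacon': {'notify_days': 15, 'interval': 86400}}}
```
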
This sls sets up the two beacons used to watch
certificate and kubeconfig expiry.
We also need to install the pyOpenSSL package for
the cert_info beacon to work.

Refs: #1887
Since this is needed on almost every node, let's
deploy the beacons on all nodes. If there is no
certificate to watch, they will do nothing anyway.

Refs: #1887
This pillar entry will be consumed by the Salt
beacon configuration formula.
This beacon watches certificate expirations.

Refs: #1887
This pillar entry will be consumed by the Salt
formulas configuring bootstrap role nodes,
the beacon and the reactor listening for
certificate expiration events.

If the path of an expired certificate matches one
in this list, the sls under `regen_sls` will be run.

Refs: #1887
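
The path-matching rule above can be sketched in a few lines. This assumes, purely for illustration, that the pillar entry is a list of mappings with a `path` and a `regen_sls` list; the real pillar layout is not visible in this conversation, and the function name `sls_to_run` and the sls names in the example are placeholders.

```python
def sls_to_run(expired_paths, certs_renewal):
    """Collect the regen_sls of every expired certificate found in the pillar.

    `certs_renewal` is a list of {"path": ..., "regen_sls": [...]} entries
    (hypothetical layout). Order is preserved and duplicates are dropped,
    so an sls shared by several certificates only runs once.
    """
    by_path = {entry["path"]: entry["regen_sls"] for entry in certs_renewal}
    selected = []
    for path in expired_paths:
        for sls in by_path.get(path, []):  # unknown paths are ignored
            if sls not in selected:
                selected.append(sls)
    return selected


# Placeholder paths and sls names, for illustration only.
certs_renewal = [
    {"path": "/etc/example/server.crt", "regen_sls": ["example.server.certs"]},
    {"path": "/etc/example/peer.crt", "regen_sls": ["example.server.certs", "example.peer"]},
]
print(sls_to_run(["/etc/example/peer.crt", "/etc/unknown.crt"], certs_renewal))
# ['example.server.certs', 'example.peer']
```
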
This pillar entry will be consumed by the Salt
formulas configuring etcd, the beacon and the
reactor listening for certificate expiration
events.

If the path of an expired certificate matches one
in this list, the sls under `regen_sls` will be run.

Refs: #1887
This pillar entry will be consumed by Salt formulas
configuring master nodes, beacon and reactor
listening for certificate expiration events.

If the path of an expired certificate matches one
in this list, the sls under `regen_sls` will be run.

Refs: #1887
Replace hardcoded path for calico kubeconfig
in the related formulas, using the new
entries under certificates key in the
`defaults.yaml` file.

Refs: #1887
Replace hardcoded path for kubelet kubeconfig
in the related formulas, using the new
entries under certificates key in the
`defaults.yaml` file.

Refs: #1887
This orchestrate will be called by the reactor
when it receives an event for expired
certificates.
It will run the `sls` defined under the `certs_renewal`
pillar entry for each expired certificate.

Refs: #1887
This reactor will be called when an expired
certificate event is received.
It will then launch the orchestrate
`orchestrate.certs.renew`, passing the list
of expired certificates, to renew them.

Refs: #1887
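
The reactor step described above boils down to turning an expiry event into an orchestrate call. The sketch below shows that translation as plain Python data. The event payload shape (a `certificates` list whose items carry `cert_path`) and the pillar key used to pass the list are assumptions for this example; only the orchestrate name `orchestrate.certs.renew` comes from the commit message. The function name `orchestrate_request` is hypothetical.

```python
def orchestrate_request(event_data):
    """Build the arguments for the renewal orchestrate from a beacon event.

    Returns None when the event lists no certificate, so that nothing
    is launched for an empty event.
    """
    paths = [
        cert["cert_path"]
        for cert in event_data.get("certificates", [])
        if "cert_path" in cert
    ]
    if not paths:
        return None
    return {
        "mods": "orchestrate.certs.renew",
        # Assumed pillar key used to hand the list over to the orchestrate.
        "pillar": {"orchestrate": {"certificates": paths}},
    }


event = {"certificates": [{"cert_path": "/etc/example/server.crt"}]}
print(orchestrate_request(event))
```
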
Set up the configuration in salt master cfg
for the certificate expiration reactor.

Refs: #1887
Set up the configuration in salt master cfg
for the kubeconfig expiration reactor.

Refs: #1887
This test reconfigures the beacons and overrides
the pillar configuration to force the renewal
of all the certificates and kubeconfigs.

The goal is to ensure that beacons work well
and that nothing is broken in the cluster even
when everything is triggered at the very same
time.

Refs: #1887
The timeout has been raised because of many
false positives in CI tests.
We now wait 40 * 5s (200s) for the log to
show up in Loki from the logger Pod.
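
The "40 * 5s" wait above corresponds to a simple polling loop. The following is a minimal sketch of that pattern, not the test suite's actual helper; the name `wait_for` and its parameters are hypothetical, and `sleep` is injectable only so the loop can be exercised without really waiting.

```python
import time


def wait_for(check, attempts=40, delay=5.0, sleep=time.sleep):
    """Poll check() up to `attempts` times, sleeping `delay` seconds
    after each failed attempt (40 * 5s = 200s with the defaults).

    Returns the 1-based attempt number that succeeded, or raises
    TimeoutError once every attempt has failed.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        sleep(delay)
    raise TimeoutError(
        "condition not met after {} attempts ({}s)".format(attempts, attempts * delay)
    )


# Usage: the condition becomes true on the third poll; no real sleeping here.
polls = iter([False, False, True])
print(wait_for(lambda: next(polls), sleep=lambda _seconds: None))  # 3
```
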
@alexandre-allard alexandre-allard force-pushed the feature/1887-automatic-certs-renewal branch from 6792d9f to e94e404 Compare December 15, 2020 08:32
@alexandre-allard
Contributor Author

/approve

@bert-e
Contributor

bert-e commented Dec 16, 2020

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

The following reviewers are expecting changes from the author, or must review again:

The following options are set: approve

@alexandre-allard alexandre-allard dismissed NicolasT’s stale review December 16, 2020 13:11

Changes done/Issue opened

@bert-e
Contributor

bert-e commented Dec 16, 2020

In the queue

The changeset has received all authorizations and has been added to the
relevant queue(s). The queue(s) will be merged in the target development
branch(es) as soon as builds have passed.

The changeset will be merged in:

  • ✔️ development/2.7

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5
  • development/2.6

There is no action required on your side. You will be notified here once
the changeset has been merged. In the unlikely event that the changeset
fails permanently on the queue, a member of the admin team will
contact you to help resolve the matter.

IMPORTANT

Please do not attempt to modify this pull request.

  • Any commit you add on the source branch will trigger a new cycle after the
    current queue is merged.
  • Any commit you add on one of the integration branches will be lost.

If you need this pull request to be removed from the queue, please contact a
member of the admin team now.

The following options are set: approve

@bert-e
Contributor

bert-e commented Dec 16, 2020

I have successfully merged the changeset of this pull request
into the targeted development branches:

  • ✔️ development/2.7

The following branches have NOT changed:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5
  • development/2.6

Please check the status of the associated issue None.

Goodbye alexandre-allard-scality.

@bert-e bert-e merged commit c577c75 into development/2.7 Dec 16, 2020
@bert-e bert-e deleted the feature/1887-automatic-certs-renewal branch December 16, 2020 18:38
Development

Successfully merging this pull request may close these issues.

Certificates rotation with Salt beacon & reactor
5 participants