docs: Add info about client side certificate rotation best practices. #1168

bwplotka · 2019-01-02T13:58:54Z

Hi and Happy New Year All!

Thanks for a great product. We have used it in production for a long time, but we want to improve automation and avoid manual intervention during certificate renewal for our services. How can we ensure that a Pod's server will actually reload the certificate? In particular:

1. The certificate is close to its expiry time.
2. Cert-manager renews it and updates the Kubernetes Secret.
3. Kubernetes refreshes the Secret in the desired pods, but they are still using the old certificate.

It's definitely not a cert-manager issue, but it would be nice for cert-manager to include potential solutions to this problem as best practices.

There are multiple options like:
A) Ensure the application can reload the certificate in a "hitless"/non-disruptive way. E.g. you can implement that for a Golang HTTP server, or hope that the service you use allows it (mostly they don't). For example, Envoy recently added that option: envoyproxy/envoy#1194
B) Some generic cert-rotate operator that performs a rolling restart of stateless deployments so they load the new certificates? Maybe logic like this in cert-manager makes sense?
C) Have your rollout tools handle that? (ensure pods are restarted frequently)

What is the common way of solving this problem? I guess A gives the least disruptive rotation possible, but what if it's a 3rd-party tool that does not support hot reload? I have searched GitHub issues, but haven't found a relevant response.

Do you agree that some docs on best practices for this would be suitable in the cert-manager documentation?

Environment details (if applicable):

Kubernetes version (e.g. v1.10.2): v1.9+
Cloud-provider/provisioner (e.g. GKE, kops AWS, etc): GKE

/kind feature

munnerz · 2019-01-10T22:46:17Z

Happy new year! 🎉

I think this sort of thing would be great to add to our documentation - or at least notes summarising what you've put above, so that users can understand what they need to do and what their options are 😄

/kind documentation

paultiplady · 2019-03-02T22:15:16Z

This is one of those subtle issues that isn't apparent from reading the intro docs, and will cause a full outage when it bites you. I think it's worth at least calling out as a "here be dragons" kind of message; whatever your chosen solution, if you haven't picked one, then you are probably going to have an outage at some point (usually coinciding with when your team is all on vacation, since that's when the code/deploy velocity will have dropped off).

(Not being overly-specific because this is exactly what happened to me or anything like that...) :)

bwplotka · 2019-03-04T09:53:09Z

@paultiplady, I don't get what the actual outcome of your comment is (: Are you just ranting about the fact that nothing works 100%? Sure, but can we focus on fixing this issue, to recommend or explain a solution that will be closer to 100% than others?

paultiplady · 2019-03-04T17:11:20Z

I'm adding a user use-case emphasizing that this is important to document, as it produces outages if it's not handled.

rmb938 · 2019-03-06T14:24:48Z

So I just found this https://github.com/pusher/wave. It will watch for changes on configmaps and secrets for deployments and perform a rolling deploy when they get updated. So to go off of the example from the initial issue the following would happen:

Create a Certificate
Create a deployment with the wave annotation and use the certificate's secret in the deployment
Cert-manager renews and updates Kubernetes secret
Wave sees that the secret was updated and performs a rolling deployment.
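
One common way to implement the last step is to hash the Secret's contents into an annotation on the pod template: any change to the pod template makes a Deployment roll its pods. The Go sketch below shows only that decision logic, under stated assumptions: the annotation key `example.com/secret-hash` and the functions `hashSecretData` and `needsRollout` are hypothetical, no Kubernetes client is involved, and it is not taken from Wave's source.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashAnnotation is a hypothetical annotation key used only for this sketch.
const hashAnnotation = "example.com/secret-hash"

// hashSecretData returns a deterministic digest of a Secret's data map.
// Keys are sorted because Go map iteration order is random, and every key
// and value is length-prefixed so different maps cannot collide by concatenation.
func hashSecretData(data map[string][]byte) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		fmt.Fprintf(h, "%d:%s%d:", len(k), k, len(data[k]))
		h.Write(data[k])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// needsRollout compares the hash stored in the pod template annotations with
// the hash of the current Secret data. If they differ, it returns a copy of
// the annotations carrying the new hash; writing that copy back to the
// Deployment's pod template would trigger a rolling update.
func needsRollout(annotations map[string]string, data map[string][]byte) (map[string]string, bool) {
	want := hashSecretData(data)
	if annotations[hashAnnotation] == want {
		return annotations, false
	}
	updated := make(map[string]string, len(annotations)+1)
	for k, v := range annotations {
		updated[k] = v
	}
	updated[hashAnnotation] = want
	return updated, true
}

func main() {
	secret := map[string][]byte{"tls.crt": []byte("old cert"), "tls.key": []byte("old key")}

	annotations, changed := needsRollout(nil, secret)
	fmt.Println("first reconcile, rollout needed:", changed)

	_, changed = needsRollout(annotations, secret)
	fmt.Println("unchanged secret, rollout needed:", changed)

	secret["tls.crt"] = []byte("renewed cert") // simulates cert-manager renewing the Secret
	_, changed = needsRollout(annotations, secret)
	fmt.Println("renewed secret, rollout needed:", changed)
}
```

Storing the hash on the pod template rather than on the Deployment's own metadata matters here, since only pod template changes cause pods to be replaced.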

bwplotka · 2019-03-06T15:34:08Z

Nice, if that is production ready then it looks really promising!

retest-bot · 2019-06-04T15:58:32Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle stale

retest-bot · 2019-07-04T16:37:36Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to jetstack.
/lifecycle rotten
/remove-lifecycle stale

retest-bot · 2019-08-03T16:43:12Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to jetstack.
/close

jetstack-bot · 2019-08-03T16:43:21Z

@retest-bot: Closing this issue.

PHameete · 2020-03-20T11:18:15Z

I currently have to implement a solution for this as well and saw the recommendation for Wave above. I also ran into https://github.com/stakater/Reloader, which does the same thing but has more stars and looks easier to install.

munnerz · 2020-03-20T11:45:02Z

/reopen
/remove-lifecycle rotten
/lifecycle frozen

jetstack-bot · 2020-03-20T11:45:04Z

@munnerz: Reopened this issue.

jetstack-bot added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 2, 2019

jetstack-bot added the kind/documentation Categorizes issue or PR as related to documentation. label Jan 10, 2019

This was referenced Mar 7, 2019

WIP: Restarting pods #1453

Closed

Restart pods whose certificates were refreshed #1440

Closed

bwplotka mentioned this issue Mar 27, 2019

Feature Request: Detect credentials changes without application restarts thanos-io/thanos#985

Closed

jetstack-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 4, 2019

jetstack-bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 4, 2019

jetstack-bot closed this as completed Aug 3, 2019

jetstack-bot reopened this Mar 20, 2020

jetstack-bot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Mar 20, 2020

munnerz added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Apr 23, 2020

megian mentioned this issue Jun 8, 2021

Encrypt cluster-internal traffic projectsyn/component-keycloak#14

Merged

3 tasks

nosvalds mentioned this issue Jan 11, 2022

auto certificate renewal with restartOnTLSSecretUpdate and cert-manager fails apache/solr-operator#390

Open

chap-dr mentioned this issue Jan 17, 2022

Automatically handle updated certificates Kong/kubernetes-ingress-controller#986

Closed

uhthomas mentioned this issue Aug 31, 2023

Refresh TLS certificates automatically gomods/athens#1878

Open
