nixos/acme: force-renewing certificates is unreasonably difficult #81634
Comments
(maybe |
Interface/implementation sketch, after discussing with @yegortimoshenko: This would require some reorganization in the acme module to avoid duplicating the service logic, so it would probably be a good idea to clean it up at the same time. I'll try and get around to doing it if nobody else does, but I don't want to block anyone who feels like taking it on themselves.
It would also be a good idea to check for revocation with OCSP on the timer and force a renewal if the certificate has been revoked; this would mitigate the impact of future mass revocations, since the certificate would then only be invalid for a day or so.
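As a rough illustration of that OCSP check, here is a minimal shell sketch. It is not part of the NixOS ACME module: the function names are hypothetical, the certificate and issuer paths are whatever you pass in, and it assumes the responder URL is embedded in the certificate and that `openssl ocsp` prints its usual `<file>: good` or `<file>: revoked` summary line.

```shell
# Sketch only: function names are hypothetical, not part of the NixOS ACME module.

# Classify the text printed by `openssl ocsp`; prints good, revoked or unknown.
ocsp_status_from_output() {
  if printf '%s\n' "$1" | grep -q ': revoked$'; then
    echo revoked
  elif printf '%s\n' "$1" | grep -q ': good$'; then
    echo good
  else
    echo unknown
  fi
}

# Query the OCSP responder named in the certificate itself.
# $1 = leaf certificate (PEM), $2 = issuer certificate (PEM).
check_revocation() {
  cert=$1
  issuer=$2
  # Read the responder URL from the certificate's Authority Information Access field.
  url=$(openssl x509 -in "$cert" -noout -ocsp_uri)
  output=$(openssl ocsp -issuer "$issuer" -cert "$cert" -url "$url" 2>/dev/null)
  ocsp_status_from_output "$output"
}
```

A timer-driven unit could call `check_revocation cert.pem chain.pem` and trigger a forced renewal only when the result is `revoked`; how that renewal is triggered is exactly the open interface question in this issue, so it is left out here.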
There's another issue related to this: if you add an |
Previously, the NixOS ACME module defaulted to using P-384 for TLS certificates. I believe that this is a mistake, and that we should use P-256 instead, despite it being theoretically cryptographically weaker.

The security margin of a 256-bit elliptic curve cipher is substantial; beyond a certain level, more bits in the key serve more to slow things down than add meaningful protection. It's much more likely that ECDSA will be broken entirely, or some fatal flaw will be found in the NIST curves that makes them all insecure, than that the security margin will be reduced enough to put P-256 at risk but not P-384. It's also inconsistent to target a curve with a 192-bit security margin when our recommended nginx TLS configuration allows 128-bit AES. [This Stack Exchange answer][pornin] by cryptographer Thomas Pornin conveys the general attitude among experts:

> Use P-256 to minimize trouble. If you feel that your manhood is threatened by using a 256-bit curve where a 384-bit curve is available, then use P-384: it will increases your computational and network costs (a factor of about 3 for CPU, a few extra dozen bytes on the network) but this is likely to be negligible in practice (in a SSL-powered Web server, the heavy cost is in "Web", not "SSL").

[pornin]: https://security.stackexchange.com/a/78624

While the NIST curves have many flaws (see [SafeCurves][safecurves]), P-256 and P-384 are no different in this respect; SafeCurves gives them the same rating. The only NIST curve Bernstein [thinks better of, P-521][bernstein] (see "Other standard primes"), isn't usable for Web PKI (it's [not supported by BoringSSL by default][boringssl] and hence [doesn't work in Chromium/Chrome][chromium], and Let's Encrypt [don't support it either][letsencrypt]).

[safecurves]: https://safecurves.cr.yp.to/
[bernstein]: https://blog.cr.yp.to/20140323-ecdsa.html
[boringssl]: https://boringssl.googlesource.com/boringssl/+/e9fc3e547e557492316932b62881c3386973ceb2
[chromium]: https://bugs.chromium.org/p/chromium/issues/detail?id=478225
[letsencrypt]: https://letsencrypt.org/docs/integration-guide/#supported-key-algorithms

So there's no real benefit to using P-384; what's the cost? In the Stack Exchange answer I linked, Pornin estimates a factor of 3× CPU usage, which wouldn't be so bad; unfortunately, this is wildly optimistic in practice, as P-256 is much more common and therefore much better optimized. [This GitHub comment][openssl] measures the performance differential for raw Diffie-Hellman operations with OpenSSL 1.1.1 at a whopping 14× (even P-521 fares better!); [Caddy disables P-384 by default][caddy] due to Go's [lack of accelerated assembly implementations][crypto/elliptic] for it, and the difference there seems even more extreme: [this golang-nuts post][golang-nuts] measures the key generation performance differential at 275×. It's unlikely to be the bottleneck for anyone, but I still feel kind of bad for anyone having lego generate hundreds of certificates and sign challenges with them with performance like that...

[openssl]: mozilla/server-side-tls#190 (comment)
[caddy]: https://github.com/caddyserver/caddy/blob/2cab475ba516fa725d012f53ca417c3e039607de/modules/caddytls/values.go#L113-L124
[crypto/elliptic]: https://github.com/golang/go/tree/2910c5b4a01a573ebc97744890a07c1a3122c67a/src/crypto/elliptic
[golang-nuts]: https://groups.google.com/forum/#!topic/golang-nuts/nlnJkBMMyzk

In conclusion, there's no real reason to use P-384 in general: if you don't care about Web PKI compatibility and want to use a nicer curve, then Ed25519 or P-521 are better options; if you're a NIST-fearing paranoiac, you should use good old RSA; but if you're a normal person running a web server, then you're best served by just using P-256.

Right now, NixOS makes an arbitrary decision between two equally-mediocre curves that just so happens to slow down ECDH key agreement for every TLS connection by over an order of magnitude; this commit fixes that.

Unfortunately, it seems like existing P-384 certificates won't get migrated automatically on renewal without manual intervention, but that's a more general problem with the existing ACME module (see NixOS#81634; I know @yegortimoshenko is working on this). To migrate your certificates manually, run:

    $ sudo find /var/lib/acme/.lego/certificates -type f -delete
    $ sudo find /var/lib/acme -name '*.pem' -delete
    $ sudo systemctl restart 'acme-*.service' nginx.service

(No warranty. If it breaks, you get to keep both pieces. But it worked for me.)
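Before deleting anything, it may help to see which certificates still use P-384. The following is a small sketch under stated assumptions: the helper names are hypothetical, and it relies on `openssl x509 -text` printing a `NIST CURVE:` line for EC keys (it prints nothing for RSA certificates).

```shell
# Sketch only: helper names are hypothetical.

# Extract the NIST curve name from `openssl x509 -text` output on stdin.
curve_from_text() {
  sed -n 's/^[[:space:]]*NIST CURVE:[[:space:]]*//p'
}

# Print the curve used by the first certificate in the PEM file $1,
# for example P-256 or P-384. Prints nothing for non-EC keys.
cert_curve() {
  openssl x509 -in "$1" -noout -text | curve_from_text
}
```

For example, looping over the `.pem` files found under `/var/lib/acme` and printing `cert_curve "$f"` for each certificate file would show which ones would change after the migration; which of those files hold certificates rather than private keys depends on your nixpkgs version.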
Changing the ACME server endpoint is another scenario that requires forcing a renewal, since acme doesn't notice the change.
(The commit message above was referenced again here, cherry picked from commit 62e34d1.)
I marked this as stale due to inactivity.
This issue is effectively solved now. With #91121, you can run |
Ran into this recently, luckily on a personal website that's being rebuilt right now.
Same here. No matching resources when trying to clean.
@pkern just for quickly unblocking, |
Hey folks! Sorry for not responding sooner. I read over the generated service config and the systemctl docs, and I believe you need to add the
I believe this isn't highlighted in the nixos acme docs so I'll fix that in my next PR. |
Right now, the only way I've found that works is to set `validMinDays = 999;` to force the renewal. This wouldn't matter so much (it's usually only necessary if you e.g. tweak certificate options like OCSP Must-Staple) if not for the fact that Let's Encrypt screwed up and now a bunch of people have to do it by tomorrow, which they won't. Oops.

I'm filing this as an issue rather than a PR in part because I'm not really sure what a good interface would be here; the best thing I can imagine is something like `nix run -f '<nixpkgs/nixos>' something -c force-renew-certs`, which seems weird. Does anyone know of precedent for interfaces here?

cc @aanderse @arianvp @m1cr0man @yegortimoshenko
If you just want to know how to force-renew your certificates this time

Add `security.acme.validMinDays = 999;` to your configuration and run a `nixos-rebuild switch`. This may or may not automatically renew the certificate depending on your nixpkgs version; to make sure, do `systemctl start 'acme-*.service'`. Make sure to remove the `validMinDays` option and run `nixos-rebuild switch` again afterwards, or you'll hammer the Let's Encrypt servers for a renewal every day!
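To see why 999 works as a forcing value, and why leaving it in place renews daily, here is a small sketch of the renewal rule. It assumes, as the option name suggests, that a certificate is renewed when fewer than `validMinDays` days of validity remain; `should_renew` and `days_remaining` are hypothetical helpers, not part of the module, and `days_remaining` assumes GNU `date`.

```shell
# Sketch only: helper names are hypothetical.

# Assumed rule: renew when fewer than valid_min_days days remain.
# $1 = days of validity left, $2 = the validMinDays setting.
should_renew() {
  remaining_days=$1
  valid_min_days=$2
  [ "$remaining_days" -lt "$valid_min_days" ]
}

# Whole days until the certificate in $1 expires (needs openssl and GNU date).
days_remaining() {
  end=$(openssl x509 -in "$1" -noout -enddate | cut -d= -f2)
  echo $(( ($(date -d "$end" +%s) - $(date +%s)) / 86400 ))
}
```

Let's Encrypt certificates are valid for 90 days, so a fresh one has about 89 days left: `should_renew 89 30` fails, while `should_renew 89 999` succeeds, and it keeps succeeding on every timer run for as long as the 999 setting stays in place.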