[20.09] nixos/acme regression: chown: Operation not permitted #115976
Comments
1f05492 seems to have removed a log message that I would have found helpful here:
That changes the error from a "this module is broken" to "this module is guiding me to do a manual data migration". That's a big deal. Luckily 1f05492 did add
I'm glad you found the old error message; that is indeed the issue here. Both .well-known and .well-known/acme-challenge need to be owned by acme and the configured group. I will add the echo back in my next PR. Thanks for pointing that out.
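For anyone who wants to check their own machine, here is a minimal sketch that prints the owner and group of the two directories mentioned above. The `report_owner` helper and the webroot argument are hypothetical names used for illustration, not part of the module. When no webroot is passed, the script creates a scratch directory so that it runs anywhere.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print "path owner group", or note that the path is missing.
report_owner() {
  if [ -d "$1" ]; then
    # ls -ld fields: mode, link count, owner, group, ...
    ls -ld "$1" | awk -v p="$1" '{ print p, $3, $4 }'
  else
    echo "$1 missing"
  fi
}

# Hypothetical webroot: pass the real one as the first argument.
# Without an argument, fall back to a scratch directory.
if [ "$#" -ge 1 ]; then
  webroot="$1"
else
  webroot="$(mktemp -d)"
  mkdir -p "$webroot/.well-known/acme-challenge"
fi

report_owner "$webroot/.well-known"
report_owner "$webroot/.well-known/acme-challenge"
```

The script only reads metadata, so it does not need root and does not change anything under an existing webroot.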
In a way it's weird that the code tries to do Thanks, adding back that message will help. Maybe even print the command the user should run (
Verifying the owner is correct, but setting the group is the main intention since it can be changed between rebuilds (and will work in normal circumstances). I guess
From an end-user perspective, I wonder why the data migration isn't happening automatically. Why can't NixOS run the needed command as root? Is there a risk of data loss?
I stumbled upon this today. I'm slightly confused: Why does the module need ownership of I had the ownership of the acme-challenge directory correct, but .well-known was owned by root, so it broke for me. But my understanding of the challenge is that it should only write/delete files in the acme-challenge subdirectory.
The reason .well-known is also chown'ed is that lego (the ACME client that the module uses) tries to create acme-challenge every time, and will throw a permissions error before checking whether it exists. I thought I had a copy of the error somewhere in the #106857 comments, but I can't find it now, so I will test it once again to make sure. As for a more permanent fix, I think I will move these ownership fixes into acme-fixperms.service, which runs as root.
After some testing and some source code reading, I found that I don't actually need to fix .well-known any more. I'm almost certain this wasn't the case at one point, but removing it will resolve the issues here. I have stripped the problematic part of the renewal script down to its bare minimum. It now only runs the following commands:
mkdir -p '${data.webroot}/.well-known/acme-challenge' && chgrp '${data.group}' ${data.webroot}/.well-known/acme-challenge
The chgrp is required as the configured group can be changed between runs. Arguably this could be removed too if you assume that acme will always own this folder, and that the changes I've made to solve #114751 (set UMask to 0022 for the service) make it world readable. In an abundance of caution, I am going to keep it around for now.
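To make the effect of those two commands concrete, here is a minimal sketch that runs the same steps against a scratch directory. The `webroot` and `group` variables are hypothetical stand-ins for the module's `data.webroot` and `data.group`. The group is set to the current user's primary group so the sketch runs without root, and the `umask` call mirrors the UMask of 0022 mentioned above.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-ins for data.webroot and data.group.
webroot="$(mktemp -d)"
group="$(id -g)"   # numeric primary group of the current user

# Directories created below get mode 755, so the web server can read them.
umask 0022

# The two steps from the stripped-down renewal script.
mkdir -p "$webroot/.well-known/acme-challenge"
chgrp "$group" "$webroot/.well-known/acme-challenge"

# Show the resulting mode, owner, and group.
ls -ld "$webroot/.well-known/acme-challenge"
```

Both steps are idempotent, so rerunning them on every renewal is harmless. Only the chgrp changes anything when the configured group differs from the previous run.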
With the UMask set to 0023, the mkdir -p command which creates the webroot could end up unreadable if the web server changes, as surfaced by the test suite in NixOS#114751. On top of this, the following commands to chown the webroot + subdirectories were mostly unnecessary. I stripped it back to only fix the deepest part of the directory, resolving NixOS#115976, and reintroduced a human readable error message.
I completely wiped my
This is on 20.09. Is this the same issue in a different flavor? I don't have any acme related state on the machine and this is the entire config:
Let's leave this open until the fix is backported to release-20.09.
With the UMask set to 0023, the mkdir -p command which creates the webroot could end up unreadable if the web server changes, as surfaced by the test suite in NixOS#114751. On top of this, the following commands to chown the webroot + subdirectories were mostly unnecessary. I stripped it back to only fix the deepest part of the directory, resolving NixOS#115976, and reintroduced a human readable error message. (cherry picked from commit 920a3f5)
Sorry it took me so long... busy week. PR is open. If someone like @arianvp could give it a test and see if it fixes their issues, that would be great.
Backports merged, closing issue.
Describe the bug
A recent change on release-20.09 broke acme on my system:
It's trying to chown a path that's currently owned by root:
I track nixpkgs versions in my setup, so the working state is 2394284 and the broken state is 8d82c86.
To Reproduce
(Not sure if this is easily reproducible, but...)
Have a config like
and do an upgrade from NixOS 20.09 @ git commit 2394284 to 8d82c86.
Expected behavior
Services that used to work should not start failing with permission errors on the stable branch.
Additional context
I rolled back to the known good version, which now failed too(!). I rolled forward again and manually fixed up the permissions. Now I'm getting this error:
And later: I manually restarted the acme service. Now everything seems fine.
Notify maintainers
CC @NixOS/acme
CC @m1cr0man due to their acme commits on release-20.09.
Metadata
Maintainer information: