Provide NSS modules globally, make nscd unnecessary (v2) #155655
…stem.nssModules.path This makes this point to a single folder, not a colon-separated list of directories, which makes it much easier to symlink to it (which the next commit does). This makes overriding *already existing* NSS modules harder, as we can't just prepend to the list, but it's probably a good idea to handle this explicitly instead of silently shadowing. Plus, I'm not aware of anything in nixpkgs actually overwriting existing NSS modules.
NSS modules are now globally provided by a symlink in `/run`. See the description in `add-extra-module-load-path.patch` for further details. Fixes: NixOS#55276 Fixes: NixOS#135888 Fixes: NixOS#105353 Cc: NixOS#52411 (comment) Co-authored-by: Erik Arvstedt <[email protected]>
@dasJ this PR mostly provides the plumbing (glibc patch) to make some of these things possible, which is the heavy rebuild (and why this targets staging) IMHO, we can do the discussions on what nss modules should be available by default (old glibc versions, 32bit or not) in the PR switching the nscd default to false. As long as |
@edolstra since you initially 👎 this on #138178 (comment), do you have an opinion on this iteration? As written in the PR description, this should address the binary incompatibilities and 32bit concerns mentioned in case of a disabled nscd, and still keeps nscd enabled by default, using a similar approach as the one you suggested. |
systemd.tmpfiles.rules = let
  glibcPlatform = "${if pkgs.stdenv.hostPlatform.is64bit then "64" else "32"}-${pkgs.glibc.version}";
in [
  "L+ /run/nss-modules-${glibcPlatform} - - - - ${config.system.nssModules.path}"
So this will "leak" symlinks over time as glibc versions change? Couldn't this be part of /run/current-system?
This will create symlinks to Nix store paths there (until the next reboot). Nix might eventually garbage-collect their targets; I'm not sure if /run is checked. In that case, NSS lookups using that NSS module will fail, which is probably still better than segfaulting ;-)
Having these NSS modules inside /run/current-system would mean one would need to keep multiple glibc versions around as part of the system closure.
As this is only meant to prevent breakage of still-running old binaries compiled against an old glibc (and even those will use nscd by default), I don't consider this much of a problem.
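For illustration, here is a minimal sketch of how the versioned directory name discussed here is put together. The function name `format_nss_module_dir` and its signature are hypothetical; this is not code from the PR's glibc patch, only the `/run/nss-modules-<word size>-<glibc version>/lib` layout is taken from the PR.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (an assumption for illustration, not part of the PR):
   writes "/run/nss-modules-<word_bits>-<glibc_version>/lib" into buf.
   Returns 0 on success, -1 if the result does not fit into len bytes. */
int format_nss_module_dir(char *buf, size_t len, int word_bits,
                          const char *glibc_version)
{
    int n = snprintf(buf, len, "/run/nss-modules-%d-%s/lib",
                     word_bits, glibc_version);
    /* snprintf returns the length it wanted to write; a value >= len
       means the output was truncated. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with `64` and `"2.34"`, this produces the example path from the PR description. A binary built against a different glibc release would compute a different directory, so it would never load modules built for another release.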
I'm not in favor of this. While My experience with OpenGL ( I think it may be worthwhile to allow the NSS modules location to be overridden via environment variables. That allows people to build pure closures containing the NSS modules they need in particular environments, rather than requiring them to create |
It's also not clear to me how this is going to work in practical terms, especially on non-NixOS systems. For instance, suppose I have previously installed some package using On non-NixOS systems this is even worse, since there is no obvious way to create/manage |
Really what I'm trying to say here is that |
FYI: The Guix folks poked upstream about the Nscd removal: https://sourceware.org/pipermail/libc-alpha/2022-February/136741.html Relevant part:
Full thread: https://sourceware.org/pipermail/libc-alpha/2022-March/thread.html#136745 |
I think we're already using |
Where do these files go today? Say they are just in an output of libc, and programs reference that output. What goes wrong? |
I'm not sure I understand the question. Without nscd, glibc looks up NSS modules defined in This is all fine for glibc-provided NSS modules, but doesn't allow discovery of NSS modules provided by other packages (like the systemd ones, which are quite critical for dynamic user support, or the avahi or ldap ones, …). Right now we do this by piggy-backing on This PR proposes adding another path to the glibc lookup path, before asking nscd ( @edolstra you mentioned this is worse on non-NixOS systems. However, non-NixOS systems already need nscd running so they can do lookups with non-glibc NSS modules. This has been a problem all along (we just ignored it so far) and doesn't change with this PR - it just takes nscd (theoretically) out of the critical path on NixOS systems. We could probably make a compromise - use the logic from the patch here, look in With this approach:
|
Thanks @flokli for explaining. I think I will email the list to second the Guix thing regardless of what we do here. |
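As a concrete reference for the module discovery described above: on Linux, glibc resolves an NSS service name such as `systemd` to a shared object named `libnss_systemd.so.2`. The sketch below only builds that file name inside a given directory. The helper `nss_module_path` is a hypothetical name for illustration and is not taken from glibc or from this PR.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (an assumption for illustration): writes
   "<dir>/libnss_<service>.so.2" into buf. The "libnss_" prefix and the
   ".so.2" suffix follow the naming glibc uses for NSS modules on Linux.
   Returns 0 on success, -1 if the result does not fit into len bytes. */
int nss_module_path(char *buf, size_t len, const char *dir,
                    const char *service)
{
    int n = snprintf(buf, len, "%s/libnss_%s.so.2", dir, service);
    /* A return value >= len means snprintf had to truncate. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A client doing lookups without nscd has to be able to open a file at such a path itself, which is why the directory it searches matters for modules shipped outside glibc (systemd, avahi, ldap, …).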
My preference is nscd as a "lingua franca" is still a good idea, unless we want to just statically link the |
This is not only a philosophical issue: the /run/opengl-driver impurity is already making debugging OpenGL errors in Nixpkgs a nightmare. Just look at this issue: #80936 (comment). There's also the fact that I dread the thought of this sort of situation extending to all programs through glibc. I understand that using a daemon just to discover a bunch of shared objects is a waste of resources and that the nscd DNS caching is a problem, but introducing another impurity is a strictly worse one. |
The conversation in the guix ML also seems to lean towards keeping nscd around. I'm still curious about how we can fix the issues we have with it right now:
|
Thanks for linking #124019. It is good to un-hardcode the choice of nscd like that. I hope we can reopen and merge that PR. |
unscd had its last release in 2014, so it is pretty safe to assume that development is dead. It also sounds like a suboptimal solution. |
As I said on that PR, I am less enamored with the idea of The best solution does seem to be to team up with Guix to ensure that the upstream NSCD becomes a lot simpler. |
I also think that this idea seems most practical. Also, there is a bit of urgency on this topic in general. The latent caching bugs (like faulty cached NXDOMAIN for lookups that once lost a UDP packet) keep creeping up every now and then and there's little we can do (except restart nscd if it happens and disable nscd if the affected node doesn't use any of the modules). However, I wish @edolstra's statement about the caching was true. We do set it to 0 but there are significant caching bugs in nscd that still trigger even when turned off ... 😢
|
Just from skimming the glibc source code, it may be enough to
/* Add a new entry to the cache.  The return value is zero if the function
   call was successful.
   This function must be called with the read-lock held.
   We modify the table but we nevertheless only acquire a read-lock.
   This is ok since we use operations which would be safe even without
   locking, given that the `prune_cache' function never runs.  Using
   the readlock reduces the chance of conflicts.  */
int
cache_add (int type, const void *key, size_t len, struct datahead *packet,
           bool first, struct database_dyn *table,
           uid_t owner, bool prune_wakeup)
|
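This could work but is untested:
#include <stddef.h>
#include <stdbool.h>
#include <unistd.h>
/* Forward declarations so the struct types are visible outside the prototype. */
struct datahead;
struct database_dyn;
/* Stub for nscd's cache_add: report success without adding anything to the cache. */
int cache_add (int type, const void *key, size_t len, struct datahead *packet, bool first, struct database_dyn *table, uid_t owner, bool prune_wakeup) {
  return 0;
}
Build with:
Use with:
{
  systemd.services.nscd.environment.LD_PRELOAD = /path/to/test.so;
}
|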
This can be closed, since we have #196917, right? |
Hmmh, this specific PR has probably bitrotten. I think NixOS should probably be using nsncd, but on some non-NixOS distributions it might still make sense to have an environment variable that you can set to steer nix-built binaries to a set of NSS modules, without polluting all of LD_LIBRARY_PATH, in case you can't run nscd or nsncd on that system. This would be the -- NixOS with nsncd also still has the problem of DNS requests/host lookups leaking out of network namespaces. This could maybe be solved in nsncd, if we can detect the network namespace of the nss client in a race-free fashion. |
I summarized the different approaches in https://flokli.de/posts/2022-11-18-nsncd/. I think this can be closed. |
This is a follow-up to #138178 (diff) which fixes the binary incompatibilities of the original PR.
This PR allows glibc client binaries to access NSS modules configured via `system.nssModules` without nscd.
nscd has significant caching bugs and causes friction in general (#95107, #154928). For details about nscd bugs, see issue "DNS responses are cached" and the Fedora nscd deprecation notes.
Some services set `LD_LIBRARY_PATH` to allow running them without nscd. These workarounds are now obsolete and are removed by this PR.
Implementation
Provide global NSS modules at `/run/nss-modules-${word_size}-${glibc_version}/lib` (e.g. `/run/nss-modules-64-2.34/lib`) and patch glibc to use this path.
The versioning suffix ensures that only binary compatible glibc client binaries will use this path. Repo erikarvstedt/check-glibc-compatibilities shows that different NSS modules and glibc clients are compatible with each other, as long as they share the same minor glibc release (e.g. `2.34`).
Because the patched code region is never inlined, the patch affects all binaries that dynamically link glibc. This includes binaries prebuilt with a non-nixpkgs libc that are processed with `patchelf` (like `slack`).
Todo
In light of its defects and lack of maintenance, it might be sensible to disable nscd by default.
Note: `unscd` is no replacement for nscd because it doesn't implement all nsswitch functions (src thread).
`opengl.driSupport32Bit` and `opengl.extraPackages32`. As a minimum, `systemd` NSS modules should be provided. This can be addressed in another PR.
Appendix
Fixes: #135888
Fixes: #105353
Fixes: #55276
Cc: #52411 (comment)
Closes: #111194
This PR is co-authored by @flokli.