Actual static linking / Not linking to glibc #5643
Comments
Sorry for the late reply. This is a complex issue that defies a quick and risk-free fix. I'll try to respond to each of your points:
While Go is renowned for producing binaries that can run on a wide variety of systems, it's far from a perfect solution. The Go toolchain distributed by golang.org does not run on Alpine. Alpine ships their own customized build of the Go toolchain and applies patches for compatibility with musl. Nomad requires CGO (for libcontainer among other things) which complicates portability.
Our Nvidia device driver implementation requires glibc due to differences in glibc's lazy binding.
If we shipped all of Nomad's plugins as external binaries, the agent itself could probably run on musl (perhaps even drop cgo entirely). This would require significant changes to our source layout, build system, distributed artifacts, documentation, and test infrastructure. We have no plans to produce "lite" portable agent binaries at this time.
We would love more assistance in improving Nomad's portability and Alpine/musl support. I hope we have demonstrated in #5537 that we do respond to specific issues and put effort toward improving our portability story. As far as I know, making the prebuilt Nomad binaries work on Alpine/musl does not have a straightforward solution, but I'd love to be wrong or have assistance in working toward portability.
@schmichael no worries at all on the slow reply; honestly, I didn't expect to ever get a reply on this ticket.
As a Void maintainer, I'm well aware of the hoops needed to get a full-blown toolchain up and running. However, I'm not aware of any other projects with such deep dependencies (well, maybe glibc, but that's another point entirely) that they care about the toolchain they're built with. I live in hope that the libcontainer dependency will move out into a plugin. As that system becomes more robust, I'd really like to get to a point where Nomad itself is running rootless.
I'm well acquainted with glibc's generally broken lazy binding system and the sad state of GPU driver options on Linux. I'm very happy that there was a quick solution in adding the build tag to shut it off, though it would have been nice to get that into an actual release rather than delaying it, since in the official binaries it was a no-op change.
I'd actually really like to see Nomad shipping more plugins as external binaries that could be setgid or setuid as necessary. Running the whole Nomad binary as root is a really interesting security story that I'm still trying to grapple with. I can see why this is nowhere on the roadmap, as it's a complete overhaul of a lot of the core parts of Nomad; as a means of reducing the amount of code running with elevated privileges, I really hope this approach can be considered.
#5537 was a fantastic start that was unfortunately marred by the fix not making it into the release. What it demonstrated from my perspective was that getting the fixes into the codebase was something that HashiCorp was willing to do, but putting them into another release, and unbreaking a class of users in the process, was another.
The solution to running Nomad on Alpine is to compile from source. The nvidia integration has to be switched off to make it work, but as GPUs tend to be a specialized accelerator that's not on the vast majority of the fleet, this is a level of complexity I'm willing to pay.
Useful takeaway from this ticket: the single biggest thing that could help support Nomad on non-glibc systems right now would be a smoke-test build before every release. This is making sure that the build works at all, as there have now been 2 releases where
(not tested myself) Can the Nomad binaries be compiled and linked statically today with the current source tree and layout?
Yes, you can. It's very hard to build something that can't be statically linked. Here are some commands that will do it for you in Alpine:
That's yanked from some obsolete files from when I was linking statically; now I maintain an internal Alpine repo which has the binaries dynamically linked to musl.
The |
Not quite. cgo is pulled in by the DNS resolver. This is a fairly well understood gotcha at this point. The nvidia issue is that the bindings for that driver perform unchecked operations with no error handling, so if you don't have the shared objects available then you're SOL, since musl doesn't provide the same lazy bindings that glibc does (which is why this works on glibc if you don't have the soname, but not on musl).
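For readers unfamiliar with the resolver gotcha mentioned above, here is a minimal sketch, not taken from Nomad's code, of asking Go's `net` package for its built-in DNS client instead of the libc resolver. The helper name `newPureGoResolver` is an assumption made for illustration. The same effect can be applied to a whole binary by building with `CGO_ENABLED=0` or the `netgo` build tag. Note that this only touches name resolution; it does nothing for other cgo users such as libcontainer or the nvidia bindings.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newPureGoResolver is a hypothetical helper (not part of Nomad). It returns
// a resolver that prefers Go's built-in DNS client over the cgo/libc resolver
// on platforms where both are available.
func newPureGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	// Bound the lookup so the example cannot hang on a broken network.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	addrs, err := newPureGoResolver().LookupHost(ctx, "localhost")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("resolved addresses:", addrs)
}
```

A per-resolver setting like this is useful for experiments; for a static build, the build-time switches are the more common route because they remove the libc dependency from the resolver path entirely.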
After thinking long and hard about this issue, I'm not sure progress can be made on it in its present state. Too many of Nomad's drivers need to link system-level code. I think the only way to make progress on this would be to factor out all the drivers from Nomad into go-plugin executables. As a practical upshot, and something that might be easier to sell to the product owners, this would mean the main, network-exposed nomad binary doesn't need to be running with euid 0. I'm not sure what the overall desire within HashiCorp for this change would be, but it would be a pretty big win for security since progressively less code would need to run in a privileged mode. If the podman driver that's been talked about on Gitter ever gets off the ground, it would be trivial to run completely rootless Nomad, and that would make my security team very happy indeed.
👍 I'd love to ship a minimal Nomad binary (all plugins external) alongside our monolithic binary, but it's probably not going to be prioritized soon (not 0.11 and probably not 0.12) since it's nontrivial effort to support this in our build system. There are plans underway to ship proper Linux packages alongside our zip files which may give us a route to drop our monolithic binary while still providing a "batteries included" package. That could make this effort substantially easier.
That's a really cool idea. I use go-plugin in some of my own projects, but haven't yet noticed any machinery that would let me selectively bake in plugins to a monolith. Can you share any pointers to documentation where I can find how that works?
Unfortunately The PluginLoader discovers external plugins and merges them into the plugin catalog. The PluginLoader's Dispense method is kind of the bridge between internal and external plugins as it executes the binary for external plugins. Just brainstorming but an approach to migrating to shipping monolithic and minimal builds would be to create hidden Nomad subcommands for executing builtin plugins: eg |
Hmm, I like the idea of the hidden subcommands. It seems really clean from an interfaces perspective, if not an implementation one.
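To make the hidden-subcommand idea concrete, here is a rough sketch of how one binary could act either as the regular agent or as a single built-in plugin, depending on its arguments. Everything in it is an assumption made for illustration: the subcommand name `internal-plugin`, the plugin names, and the `builtinPlugins` table are hypothetical and do not reflect Nomad's actual catalog or the go-plugin handshake, which a real implementation would need to perform.

```go
package main

import (
	"fmt"
	"os"
)

// hiddenSubcommand is a hypothetical name for an undocumented subcommand.
const hiddenSubcommand = "internal-plugin"

// builtinPlugins maps a plugin name to the function that would serve it.
// The entries are stand-ins; a real build would start a plugin server here.
var builtinPlugins = map[string]func() error{
	"raw_exec": func() error { fmt.Println("serving raw_exec plugin"); return nil },
	"exec":     func() error { fmt.Println("serving exec plugin"); return nil },
}

// pluginFromArgs reports which built-in plugin the arguments ask for.
// It returns ok == false when args do not invoke the hidden subcommand
// or when they name a plugin that is not compiled in.
func pluginFromArgs(args []string) (name string, ok bool) {
	if len(args) != 3 || args[1] != hiddenSubcommand {
		return "", false
	}
	if _, found := builtinPlugins[args[2]]; !found {
		return "", false
	}
	return args[2], true
}

func main() {
	if name, ok := pluginFromArgs(os.Args); ok {
		// Plugin mode: run only the requested plugin, then exit.
		if err := builtinPlugins[name](); err != nil {
			fmt.Fprintln(os.Stderr, "plugin failed:", err)
			os.Exit(1)
		}
		return
	}
	// Agent mode: a loader could re-execute os.Args[0] with the hidden
	// subcommand to launch each built-in plugin as a separate process.
	fmt.Println("running as the regular agent")
}
```

With this shape, the monolithic build keeps every plugin in the table, while a minimal build could ship an empty table and rely on external plugin binaries; the agent side would treat both kinds of plugin as separate processes.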
I use a weird distribution of Linux that doesn't have the interpreter, bash, or libs in the standard locations.
Interesting use of patchelf; I was unaware it was safe to use on Go-compiled binaries. My use case has switched to running Nomad in a chroot that has the loader and libraries that Nomad expects to see, but I do still compile out the nvidia extensions as they are just more trouble than they're worth.
While I don't think it entirely resolves this issue because we still have |
Per the suggestion in #5537 and encouragement from @angrycub, I'm opening this issue for the purpose of collecting thumbs-ups for building Nomad against non-glibc C libraries. As the binaries currently provided on https://nomadproject.io/ are not statically linked, they only work if the machine they're running on has the correct version of glibc available.
Given that the Go way is to provide binaries that will run anywhere the CPU arch matches, I personally find this surprising. I find it even more surprising given that the effort to build in a container that contains an alternate C library is minimal.
Justifying why one would run Nomad on a non-glibc platform is beyond the scope of this issue, but here are some quick reasons that jump to mind from my own environments where musl is the C library of choice:
If builds without an artificial dependency on glibc are important to you, please thumbs-up this issue (or even just the support to create these builds on your own, as you should be able to do already but can't with the current release). Don't reply with a +1 as that just clutters the thread and makes your support difficult to track.