v0.9.0 appears fundamentally broken on musl/Alpine #5535
Comments
Thanks!
Thanks @the-maldridge for raising this. I have added a It's a bit interesting though - I believe the nvidia library is a dynamically shared library, so its presence wasn't required on glibc-based OSes. Do you know what might make musl compilation behave differently?
I believe this error occurs precisely because it's a shared library. Compilation succeeds, but the resulting binary contains dynamically loaded non-relocatable symbols. I would need to do more research to figure out why this is the case, but at the moment I'm still trying to determine whether the frontend can be built with a modern yarn/nodejs. If I had to take a shot in the dark at why this is happening, I'd guess that nvml isn't correctly checking whether there are build-time constraints that prevent it from working, so it builds "blind" and then happens to work on glibc systems. The proper fix is probably for them to return a nil implementation if the library can't be loaded.
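To make the "nil implementation" idea concrete, here is a minimal sketch in Go. It assumes the library would be opened at runtime by some loader function rather than bound when the binary starts; the names GPUClient, noopClient, newGPUClient and errLibraryUnavailable are all hypothetical and are not taken from the nvml bindings or from Nomad.

```go
package main

import (
	"errors"
	"fmt"
)

// GPUClient is a hypothetical interface standing in for whatever the
// caller needs from the NVIDIA management library.
type GPUClient interface {
	DeviceCount() (int, error)
}

// noopClient is the fallback used when the shared library is unavailable.
// It reports zero devices and never returns an error.
type noopClient struct{}

func (noopClient) DeviceCount() (int, error) { return 0, nil }

// errLibraryUnavailable is a hypothetical sentinel error for this sketch.
var errLibraryUnavailable = errors.New("nvml shared library not available")

// newGPUClient calls the supplied loader and falls back to the no-op
// client if loading fails or yields nothing.
func newGPUClient(load func() (GPUClient, error)) GPUClient {
	c, err := load()
	if err != nil || c == nil {
		return noopClient{}
	}
	return c
}

func main() {
	// Simulate a host where the library cannot be loaded.
	failing := func() (GPUClient, error) { return nil, errLibraryUnavailable }

	client := newGPUClient(failing)
	n, err := client.DeviceCount()
	fmt.Println("devices:", n, "err:", err)
}
```

A fallback like this only helps if the failure surfaces as an error the program can handle. It would not help if the dynamic linker rejects the binary before main runs.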
I see - thanks for the background.
This might be handy for you: #5427. You can skip building the frontend and use the latest released frontend assets by setting the
@the-maldridge @notnoop we had this problem even on
Apologies, I didn't see that #5643 was already open. This is the bug I was looking for :).
We had a similar issue on our RHEL7 distribution: I worked around this by building a temporary nvml.so that stubs out the functions, as follows:
Then run nomad with LD_PRELOAD:
In fact, I discovered that the issue was caused by having the environment variable LD_BIND_NOW set. Doing
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version
Can't get this far; the resulting binary is not usable. It was built from release tag v0.9.0, though.
Operating system and Environment details
Alpine Linux 64-bit as available in the golang:1.12-alpine docker image.
Issue
Setting aside the FTBFS when using the make dev-ui target (which fails because the expected nodejs is very old compared to the current LTS and current stable releases), the resulting binary is not usable because of problems in the nvidia support, which does not appear to be possible to disable.
Reproduction steps
Attempting to run the built binary produces the following errors:
Given that these all seem related to GPU scheduling which my organization doesn't need, I assumed I could just pass a build tag and shut this off. To my great disappointment this "feature" seems to be impossible to disable!
Is my best bet to stay on 0.8.7 until these bugs can be resolved?