-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set the default PATH to include the RAPIDS conda env #449
Conversation
Signed-off-by: Jacob Tomlinson <[email protected]>
Signed-off-by: Jacob Tomlinson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Since this puts the RAPIDS environment at the front of PATH, I believe this is technically a breaking change in some situation due to PATH ordering possibly resulting in different executables being run. Are breaking changes in containers handled similarly to libraries? |
The |
Makes sense, thanks for the additional context 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Editing the PATH
variable is never ideal, but from some offline discussions with @jacobtomlinson, it seems that this is necessary to unblock some folks at Google. So we can go ahead and merge this workaround until we're able to get a proper fix in place.
Thanks @ajschmidt8. Really appreciate the compromise here. |
Thanks @ajschmidt8 |
* Set the default PATH to include the RAPIDS conda env Signed-off-by: Jacob Tomlinson <[email protected]> * Generate dockerfiles Signed-off-by: Jacob Tomlinson <[email protected]>
* Set the default PATH to include the RAPIDS conda env Signed-off-by: Jacob Tomlinson <[email protected]> * Generate dockerfiles Signed-off-by: Jacob Tomlinson <[email protected]> Co-authored-by: Jacob Tomlinson <[email protected]>
@@ -58,6 +58,9 @@ WORKDIR ${RAPIDS_DIR} | |||
|
|||
COPY NVIDIA_Deep_Learning_Container_License.pdf . | |||
COPY entrypoint.sh /opt/docker/bin/entrypoint | |||
|
|||
ENV "PATH=/opt/conda/envs/rapids/bin:/opt/conda/condabin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do something very similar in conda-forge. Though we do this in the entrypoint script
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the challenge that we are trying to deal with is that many container platforms do not call the entrypoint script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right had a similar discussion with @mmccarty offline
IIUC some call docker run
, but may docker exec
into the container as well. With docker exec
, it is possible to prefix any command with the entrypoint script as part of the command, which would handle this case
Manipulating the PATH
will largely work in most cases. Where it falls short is conda activate
injects things into the current shell, which is lost without activation. Also any packages that have activation scripts won't be run. Usually this isn't a big deal, but sometimes it can cause issues
While I agree adding PATH
was the right thing to do in the short-term, in the medium term we may want to improve on this further to ensure activation is used in these other cases. Solutions could range from having folks run the entrypoint and including that in docs to other fixes in the container that aid with activation. Not entirely sure all of the constraints, but maybe we can discuss those offline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I totally agree. I guess the issue is that the entrypoint is specifically a Docker thing, and there are many other container runtimes and platforms that folks are trying to use our images on (contained, cri-o, kubernetes, kubeflow, sagemaker, vertex, azureml, etc). Most do not call the entrypoint at all if a custom command is set, and many set custom commands that cannot be overridden. So we should try not to rely on the entrypoint as a way to correctly set up our environment.
I think this would mean we need to set things up in the base environment, or manually set the environment up to be in a "post conda activate` state when the container is started. Setting environment variables should be straight forwards, but I hadn't considered activation scripts.
Let's chat in the team meeting today and we can feedback here or in #438/#439.
As a workaround for issues raised in #438 and #439 this PR sets the
PATH
explicitly to include the RAPIDS conda environment.Many container platforms including Kubernetes override the default entrypoint, so this change ensures binaries like
jupyter
are still on the path even without the entrypoint being run. Also when callingdocker exec
on a container the environment is not copied from the main process and without the entrypoint the path is also wrong.