Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set the default PATH to include the RAPIDS conda env #449

Merged
merged 2 commits into from
Apr 6, 2022

Conversation

jacobtomlinson
Copy link
Member

As a workaround for issues raised in #438 and #439 this PR sets the PATH explicitly to include the RAPIDS conda environment.

Many container platforms including Kubernetes override the default entrypoint, so this change ensures binaries like jupyter are still on the path even without the entrypoint being run. Also when calling docker exec on a container the environment is not copied from the main process and without the entrypoint the path is also wrong.

@jacobtomlinson jacobtomlinson requested a review from a team as a code owner April 6, 2022 13:05
Signed-off-by: Jacob Tomlinson <[email protected]>
Copy link

@mmccarty mmccarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@beckernick
Copy link
Member

Since this puts the RAPIDS environment at the front of PATH, I believe this is technically a breaking change in some situation due to PATH ordering possibly resulting in different executables being run.

Are breaking changes in containers handled similarly to libraries?

@jacobtomlinson
Copy link
Member Author

jacobtomlinson commented Apr 6, 2022

The PATH here is exactly what gets set the entry point (I just copy/pasted it). And if the entry point is called then this will be overwritten anyway. So I don't think this is breaking. It just sets it for situations where the entry point isn't called.

@beckernick
Copy link
Member

Makes sense, thanks for the additional context 👍

Copy link
Member

@ajschmidt8 ajschmidt8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Editing the PATH variable is never ideal, but from some offline discussions with @jacobtomlinson, it seems that this is necessary to unblock some folks at Google. So we can go ahead and merge this workaround until we're able to get a proper fix in place.

@ajschmidt8 ajschmidt8 merged commit 814b719 into rapidsai:branch-22.04 Apr 6, 2022
@jacobtomlinson jacobtomlinson deleted the env-path branch April 6, 2022 15:02
@jacobtomlinson
Copy link
Member Author

Thanks @ajschmidt8. Really appreciate the compromise here.

@mmccarty
Copy link

mmccarty commented Apr 6, 2022

Thanks @ajschmidt8

ajschmidt8 pushed a commit to ajschmidt8/docker that referenced this pull request Apr 6, 2022
* Set the default PATH to include the RAPIDS conda env

Signed-off-by: Jacob Tomlinson <[email protected]>

* Generate dockerfiles

Signed-off-by: Jacob Tomlinson <[email protected]>
ajschmidt8 added a commit that referenced this pull request Apr 6, 2022
* Set the default PATH to include the RAPIDS conda env

Signed-off-by: Jacob Tomlinson <[email protected]>

* Generate dockerfiles

Signed-off-by: Jacob Tomlinson <[email protected]>

Co-authored-by: Jacob Tomlinson <[email protected]>
@@ -58,6 +58,9 @@ WORKDIR ${RAPIDS_DIR}

COPY NVIDIA_Deep_Learning_Container_License.pdf .
COPY entrypoint.sh /opt/docker/bin/entrypoint

ENV "PATH=/opt/conda/envs/rapids/bin:/opt/conda/condabin:/opt/conda/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do something very similar in conda-forge. Though we do this in the entrypoint script

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the challenge that we are trying to deal with is that many container platforms do not call the entrypoint script.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right had a similar discussion with @mmccarty offline

IIUC some call docker run, but may docker exec into the container as well. With docker exec, it is possible to prefix any command with the entrypoint script as part of the command, which would handle this case

Manipulating the PATH will largely work in most cases. Where it falls short is conda activate injects things into the current shell, which is lost without activation. Also any packages that have activation scripts won't be run. Usually this isn't a big deal, but sometimes it can cause issues

While I agree adding PATH was the right thing to do in the short-term, in the medium term we may want to improve on this further to ensure activation is used in these other cases. Solutions could range from having folks run the entrypoint and including that in docs to other fixes in the container that aid with activation. Not entirely sure all of the constraints, but maybe we can discuss those offline

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I totally agree. I guess the issue is that the entrypoint is specifically a Docker thing, and there are many other container runtimes and platforms that folks are trying to use our images on (contained, cri-o, kubernetes, kubeflow, sagemaker, vertex, azureml, etc). Most do not call the entrypoint at all if a custom command is set, and many set custom commands that cannot be overridden. So we should try not to rely on the entrypoint as a way to correctly set up our environment.

I think this would mean we need to set things up in the base environment, or manually set the environment up to be in a "post conda activate` state when the container is started. Setting environment variables should be straight forwards, but I hadn't considered activation scripts.

Let's chat in the team meeting today and we can feedback here or in #438/#439.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants