Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Install conda envs in home directory by default #517

Closed

Conversation

consideRatio
Copy link
Contributor

@consideRatio consideRatio commented Jul 14, 2021

Related to 2i2c-org/docs#81

This is what we do in JMTE to help users create environments that persists and have the be detectable by nb_conda_kernels to show up as kernels to start jupyter notebooks in as well. Is nb_conda_kernels installed in this image btw? If not, it's worth installing.

@damianavila
Copy link
Contributor

I do not think we have nb_conda_kernels installed, I do agree it is a nice thing to have (in fact, I worked in the first iteration of that library years ago 😉)... regarding the conda config stuff, I have my concerns about that as I explained in 2i2c-org/docs#81 (comment).

@consideRatio
Copy link
Contributor Author

IMHO, that should be something users should decide upon.
There are several reproducibility/replicability workflows where starting from a fresh environment that is "codified" in some image/dockerfile somehow helps that others can do the same as you did...
In fact, I would be surprised to see some environments persisted by default after restarting my pod wink

I think the base environment should be fresh on startup, but I would consider it reasonable that whatever environment I've created myself is persisted. This is what this PR will accomplish. I added it to the JMTE deployment as it was requested and the expected behavior of a user or two if I recall correctly. It is what I would find most helpful as well.

I'm happy to close this PR if its considered unwanted, I just opened it instead of opening an issue to then open a PR separately as it was such a simple change.

@damianavila
Copy link
Contributor

I would consider it reasonable that whatever environment I've created myself is persisted.

Je... I would expect the opposite from a kube-based system, but I understand others might find it useful.
How do you feel about my response here: 2i2c-org/docs#81 (comment)?

@consideRatio
Copy link
Contributor Author

consideRatio commented Jul 15, 2021

I would expect the opposite from a kube-based system

Oh absolutely, I would also do that - but I'm a jupyterhub administrator that have deployed things on Kubernetes, and I know what Kubernetes is - the users though?

How do you feel about my response here: 2i2c-org/docs#81 (comment)?

I think using conda-store is preferred for everyone involved long term. At the same time I think letting users' explicitly created conda environments be persistent and accessible is also a reasonable step along the way. I'm not confident how conda-store environments are configured to be accessible, but I think they would be configurable to be accessible alongside user's personal environments and that both strategies could co-exist well. Due to that, I think it makes sense to aim for both.

To me, the biggest argument against this PRs change is that it would increase the NFS storage needs on average.

@damianavila
Copy link
Contributor

I know what Kubernetes is - the users though?

We tell them we use Kube: https://pilot.2i2c.org/en/latest/about/infrastructure.html#
But I acknowledge there could be a lot of users not knowing about that nor how Kube pods usually behave...
So you have a point there!

I think using conda-store is preferred for everyone involved long term.

I do not have experience with conda-store, eager to know more about it.

At the same time I think letting users' explicitly created conda environments be persistent and accessible is also a reasonable step along the way

I am not sure about that although I see value in specific cases... this is why I pushed in the issue for admin customization of the base image instead of shipping it from here 😉

Wondering what other's in the @2i2c-org/tech-team think about this concept.

To me, the biggest argument against this PRs change is that it would increase the NFS storage needs on average.

+1 on that additional argument 😜

@choldgraf
Copy link
Member

Just a quick thought here - I believe that @yuvipanda's reasoning here initially came from our experience with the Berkeley data hub. One of the most common problems we ran into were users installing their own environments in their filesystems, and this causing problems when the hub's base environment was updated, or caused students to write code that wouldn't work with other people on the same hub, etc.

All that is to say, I definitely agree this is useful but there is some UX to think about, particularly for less-experienced users that may not have a strong understanding of environments and anto-patterns there.

@damianavila
Copy link
Contributor

Just a quick thought here - I believe that @yuvipanda's reasoning here initially came from our experience with the Berkeley data hub. One of the most common problems we ran into were users installing their own environments in their filesystems, and this causing problems when the hub's base environment was updated, or caused students to write code that wouldn't work with other people on the same hub, etc.

@choldgraf, I have not seen any @yuvipanda's feedback on this PR nor the issue, but glad to know he agrees with me 😜
Btw, I have seen the same patterns (problems) in the past and this is why I am against this PR.

I think this decision should be made at the Hub customization level... I mean, if there is some hub/community that actually wants this behavior, let's document this piece of code for them... but I would not agree with shipping this "behavior" as is as the default experience.

@choldgraf
Copy link
Member

Hey all - I don't think that we have consensus yet on the right way to allow hubs to persist environments between user sessions.

I propose that we do the following:

If nobody objects then I'll close this PR tomorrow!

@consideRatio
Copy link
Contributor Author

Thanks @choldgraf that sounds good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants