Install conda envs in home directory by default #517

consideRatio · 2021-07-14T16:42:53Z

This is what we do in JMTE to help users create environments that persist and are detectable by nb_conda_kernels, so they also show up as kernels to start Jupyter notebooks in. Is nb_conda_kernels installed in this image btw? If not, it's worth installing.

damianavila · 2021-07-14T20:31:20Z

I do not think we have nb_conda_kernels installed, but I do agree it is a nice thing to have (in fact, I worked on the first iteration of that library years ago 😉)... regarding the conda config stuff, I have my concerns about that, as I explained in 2i2c-org/docs#81 (comment).

consideRatio · 2021-07-15T15:15:02Z

IMHO, that should be something users should decide upon.
There are several reproducibility/replicability workflows where starting from a fresh environment that is "codified" in some image/dockerfile somehow helps that others can do the same as you did...
In fact, I would be surprised to see some environments persisted by default after restarting my pod 😉

I think the base environment should be fresh on startup, but I would consider it reasonable that whatever environment I've created myself is persisted. This is what this PR will accomplish. I added it to the JMTE deployment as it was requested and the expected behavior of a user or two if I recall correctly. It is what I would find most helpful as well.

I'm happy to close this PR if it's considered unwanted. I just opened it, instead of opening an issue and then a separate PR, because it was such a simple change.
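For readers who want a concrete picture of the behavior described above, here is a minimal sketch, not the PR's actual diff. It assumes conda's documented `CONDA_ENVS_PATH` environment variable (the counterpart of the `envs_dirs` setting) is used to point newly created environments at a directory under the user's home, while the base environment stays in the image. The function name `conda_env_overrides`, the `/home/jovyan` path, and the `.conda/envs` subdirectory are illustrative assumptions.

```python
from pathlib import PurePosixPath


def conda_env_overrides(home, subdir=".conda/envs"):
    """Return environment variables asking conda to create named envs under `home`.

    Hypothetical helper for illustration only: `home` and `subdir` are
    assumptions, not values taken from the PR.
    """
    envs_dir = PurePosixPath(home) / subdir
    # CONDA_ENVS_PATH is the environment-variable form of conda's `envs_dirs`.
    return {"CONDA_ENVS_PATH": str(envs_dir)}


# Example: variables one might pass to a user's server process.
overrides = conda_env_overrides("/home/jovyan")
```

With something like this in place, `conda create -n myenv` would be expected to place `myenv` on the persistent home volume, which is also what makes the storage and UX trade-offs discussed in this thread relevant.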

damianavila · 2021-07-15T16:10:15Z

I would consider it reasonable that whatever environment I've created myself is persisted.

Heh... I would expect the opposite from a kube-based system, but I understand others might find it useful.
How do you feel about my response here: 2i2c-org/docs#81 (comment)?

consideRatio · 2021-07-15T16:24:30Z

I would expect the opposite from a kube-based system

Oh absolutely, I would also do that - but I'm a jupyterhub administrator who has deployed things on Kubernetes, and I know what Kubernetes is - the users though?

How do you feel about my response here: 2i2c-org/docs#81 (comment)?

I think using conda-store is preferred for everyone involved long term. At the same time I think letting users' explicitly created conda environments be persistent and accessible is also a reasonable step along the way. I'm not sure how conda-store environments are made accessible, but I think they could be configured to be accessible alongside users' personal environments, and that both strategies could co-exist well. Because of that, I think it makes sense to aim for both.

To me, the biggest argument against this PR's change is that it would increase the NFS storage needs on average.
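One way to put rough numbers on the storage concern is to measure how much space a user's environments directory takes. The sketch below is only an illustration under assumptions: the helper name `directory_size_bytes` is hypothetical, and it counts every file path it finds, so hard-linked files are counted once per link and the result should be read as an upper bound.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of regular files under `root`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link's target is not counted through the link.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

Running it over each user's environments directory (for example a hypothetical `/home/<user>/.conda/envs`) would give a per-user estimate of the extra NFS usage.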

damianavila · 2021-07-16T21:49:16Z

I know what Kubernetes is - the users though?

We tell them we use Kube: https://pilot.2i2c.org/en/latest/about/infrastructure.html#
But I acknowledge there could be a lot of users who do not know about that, or about how Kube pods usually behave...
So you have a point there!

I think using conda-store is preferred for everyone involved long term.

I do not have experience with conda-store, eager to know more about it.

At the same time I think letting users' explicitly created conda environments be persistent and accessible is also a reasonable step along the way

I am not sure about that although I see value in specific cases... this is why I pushed in the issue for admin customization of the base image instead of shipping it from here 😉

Wondering what others in the @2i2c-org/tech-team think about this concept.

To me, the biggest argument against this PR's change is that it would increase the NFS storage needs on average.

+1 on that additional argument 😜

choldgraf · 2021-07-20T05:55:46Z

Just a quick thought here - I believe that @yuvipanda's reasoning here initially came from our experience with the Berkeley data hub. One of the most common problems we ran into was users installing their own environments in their filesystems, which caused problems when the hub's base environment was updated, or led students to write code that wouldn't work for other people on the same hub, etc.

All that is to say, I definitely agree this is useful but there is some UX to think about, particularly for less-experienced users that may not have a strong understanding of environments and anti-patterns there.

damianavila · 2021-07-20T21:29:28Z

Just a quick thought here - I believe that @yuvipanda's reasoning here initially came from our experience with the Berkeley data hub. One of the most common problems we ran into was users installing their own environments in their filesystems, which caused problems when the hub's base environment was updated, or led students to write code that wouldn't work for other people on the same hub, etc.

@choldgraf, I have not seen any feedback from @yuvipanda on this PR or the issue, but glad to know he agrees with me 😜
Btw, I have seen the same problems in the past, and this is why I am against this PR.

I think this decision should be made at the Hub customization level... I mean, if there is some hub/community that actually wants this behavior, let's document this piece of code for them... but I would not agree with shipping this "behavior" as is as the default experience.

choldgraf · 2021-08-26T21:26:03Z

Hey all - I don't think that we have consensus yet on the right way to allow hubs to persist environments between user sessions.

I propose that we do the following:

1. Close this PR (we have a reference to it in https://github.com/2i2c-org/pilot-hubs/issues/620 so we can refer to it for inspiration)
2. Make a decision about what this should look like in https://github.com/2i2c-org/pilot-hubs/issues/620
3. Open a new PR to implement (or re-open this PR if we think it is the right approach after discussion)

If nobody objects then I'll close this PR tomorrow!

consideRatio · 2021-08-26T22:25:06Z

Thanks @choldgraf that sounds good!

Install conda envs in home directory by default

dec8a04

damianavila mentioned this pull request Jul 20, 2021

Team Sync - Jul 19, 2021 2i2c-org/team-compass#156

Closed

choldgraf mentioned this pull request Aug 22, 2021

Persist a user's custom environment between sessions 2i2c-org/features#6

Open

2 tasks

consideRatio closed this Aug 26, 2021
