[Request deployment] New Hub: CosmicDS/Harvard #2128

Closed
8 tasks
colliand opened this issue Feb 1, 2023 · 77 comments · Fixed by #2443 or #2809

@colliand
Contributor

colliand commented Feb 1, 2023

Important dates

  • Target start date: 2023-03-01
  • Required start date: 2023-02-08
  • Any important dates for usage: No

Hub Authentication Type

Other (may not be possible, please specify in comments)

First Hub Administrators

@nmearl, Nicholas Earl
@patudom, Patricia Udomprasert

[GitHub Auth only] How would you like to manage your users?

None

[GitHub Teams Auth only] Profile restriction based on team membership

No response

Hub logo image URL

pending...

Hub logo website URL

pending...

Hub user image GitHub repository

pending...

Hub user image tag and name

pending...

Extra features you'd like to enable

  • Specific cloud provider or datacenter (otherwise GCP)
  • Dedicated Kubernetes cluster
  • Scalable Dask Cluster

(Optional) Preferred cloud provider

None

(Optional) Billing and Cloud account

None

Other relevant information to the features above

  1. access: CosmicDS is deployed into classes at various institutions. As a result, the access/auth layer is more complicated than it would be for a single university. CosmicDS has built and is improving upon a web app that users log into, and this web app will be used as the SSO source of truth. Jim advised that CosmicDS may want to investigate CILogon, since this integration may then allow learners at various colleges and universities to access CosmicDS using their institution's credentials instead of a separate login. I invite @nmearl to provide additional guidance in comments below.
  2. will it work? CosmicDS is deployed using a docker image that specifies the software environment inside JupyterHub. Nick built this and has experimented with it atop k8s in AWS. A delineation that is not clear at this point is whether CosmicDS may require deeper access to the JupyterHub deployment than is typically in the “software environment” scope. Jim and Nick discussed the idea to pilot a deployment using an education hub on a (2i2c owned) shared cluster with the understanding that we may need to change course and deploy instead on a dedicated cluster depending on the result of the pilot. I invite @nmearl to provide additional guidance in comments below.

Navigating the complexities associated with auth and software environment are the responsibility of the CosmicDS team (with some input and suggestions from 2i2c's Engineering Team). If significant work and software development is required from 2i2c, we will need to revisit the business terms of this pilot.

Tasks to deploy the hub

  • 1. Deploy information filled in above
  • 2. Engineer who will deploy the hub is assigned
  • 3. If using GitHub Orgs/Teams Auth, Engineer is given Owner rights to the org to set this up.
  • 4. Initial Hub deployment PR
  • 5. Administrators able to log on -> Hub now in steady-state
@colliand
Contributor Author

colliand commented Feb 2, 2023

Hi @nmearl! Following a suggestion from @choldgraf in a meeting earlier today, we will get a good signal to the will it work? item above if CosmicDS can be launched on binder. Do you know if that works? Or can you give it a try? @jmunroe pointed out that the Glue in Jupyter documentation includes examples that launch in binder so that's encouraging.

@colliand
Contributor Author

colliand commented Feb 2, 2023

One other update for @nmearl @patudom: 2i2c Engineering is reviewing the deployment timeframe I suggested in the anchor issue. 2i2c may need to push this into the next sprint so end of February is more likely.

@patudom

patudom commented Feb 6, 2023

@colliand, thank you so much for getting this process started for CosmicDS! I'm confirming that we are ok with the end of February timeframe for deployment.

@nmearl will reply soon about the binder-related technical questions.

@nmearl

nmearl commented Feb 10, 2023

Apologies for the late follow up. Thanks again for getting this rolling, @colliand. Yes, the app works on Binder modulo some issues with pywwt being blocked from pulling in image data (I'll be working out this kink in the coming days).

That said, I'd caution against taking a Binder deployment as representative of the setup we've been working under with regard to Kubernetes. In general, we use admin-defined dashboards set up through ContainDS to circumvent the normal Jupyter environments. But having this on Binder does show, at least, that the Voila rendering functions as intended.

@GeorgianaElena
Member

Hey @colliand, @nmearl and @patudom! I will be the one deploying the CosmicDS hub early next week. Can you help me get the info needed for this? (The info required is listed in the top comment.)

Jim and Nick discussed the idea to pilot a deployment using an education hub on a (2i2c owned) shared cluster with the understanding that we may need to change course and deploy instead on a dedicated cluster depending on the result of the pilot.

Also, after reading through the discussion in this thread, especially the quote above (though I might be missing additional context), I was thinking of starting with:

  • a hub, on shared 2i2c infrastructure (we default to the shared 2i2c GCP cluster, but there's availability on AWS too if there are any requirements)
  • CILogon for authentication
  • try out the existing CosmicDS docker image on this hub given that it works on Binder. Btw do you have a link to the repo hosting it?

What do you think? Thank you!

@nmearl

nmearl commented Mar 7, 2023

Hi @GeorgianaElena, thanks for taking this on! I'd like to start with some clarifications:

  • User management will be done preliminarily through JupyterHub, with local accounts created as needed. Because the users will mainly not be associated with defined institutions, we do not have plans at the moment to leverage an access management platform or OAuth system.
  • Our test deployment on Binder does not use a pre-defined docker image; it uses a repository set up to take advantage of Binder generating its own image for the deployment (so it may not be quite as indicative of behavior on a cluster as intended).
  • The docker image we've been testing with is available here. However, I should note that we weren't very successful in getting this set up with a test Kubernetes cluster on our own, so we might need to iterate a bit on the implementation.
  • The docker image above was used for our single user image with the ideonate/cdsdashboards-jupyter-k8s-hub docker image for the hub, akin to the setup here. The hub image is necessary to get ContainDS installed, which we use to manage and generate the app dashboard.

Responses to the initial post:

  • [GitHub Auth only] How would you like to manage your users?
    See above. User management will be done through JupyterHub at the moment.

  • [GitHub Teams Auth only] Profile restriction based on team membership
    See above.

  • Hub logo image URL
    I cannot point you to a "certified" image, but the image here can be used. (Perhaps @patudom can provide a better one.)

  • Hub logo website URL
    https://www.cosmicds.cfa.harvard.edu/

  • Hub user image GitHub repository
    https://github.com/nmearl/cds-constainds-docker

  • Hub user image tag and name
    cds-containds:latest (see here)

  • Extra features you'd like to enable

    • Specific cloud provider or datacenter (otherwise GCP)
    • Dedicated Kubernetes cluster
    • Scalable Dask Cluster

Thanks again!

@patudom

patudom commented Mar 7, 2023

For the logo image, here are two options, depending on the background color:

@colliand
Contributor Author

colliand commented Mar 9, 2023

Can @nmearl describe the anticipated users during the pilot? Can we expect them to have community college accounts? Google accounts? GitHub? The auth options 2i2c offers now are CILogon and those described here: https://docs.2i2c.org/en/latest/admin/howto/manage-users.html#authentication

@patudom

patudom commented Mar 9, 2023

@colliand, I understand that you and @nmearl have spoken about user accounts. I will check with the teachers in our pilot about what types of accounts they use to gauge how many will work with CILogon. (At least one of the schools was on the InCommon list of institutions).

@colliand
Contributor Author

colliand commented Mar 9, 2023

Thanks @patudom. My conversation with @nmearl helped me understand that the scenarios for the CosmicDS pilot (classes in high schools and community colleges) do not have a natural authentication service endpoint. From earlier conversations, I understood that CosmicDS plans to build a web app that will manage logins. In the meantime, I am hopeful we can find an effective workaround. Possible workarounds include using membership in a GitHub organization, or an allow-list via CILogon. There is a lot of flexibility through CILogon! 2i2c does not recommend managing passwords inside the hub due to security concerns.

@GeorgianaElena
Member

User management will be done preliminarily through JupyterHub, with local accounts created as needed. Because the users will mainly not be associated with defined institutions, we do not have plans at the moment to leverage an access management platform or OAuth system.

Given 2i2c's current infrastructure model and the upstream local JupyterHub authenticator options, managing hub authentication through a local authenticator would require additional engineering work and possibly some compromises.

But, as @colliand said, there is a lot of flexibility through CILogon. I think this might be worth exploring for this pilot hub, before pushing forward the more complex effort that comes with a local authenticator.

CILogon flexibility

When using CILogon with a 2i2c hub, we can have users authenticate using any identity provider that CILogon supports. This includes campus identity providers, but also GitHub, Google, Microsoft, and ORCID. One can enable all, some, or just one of them in a hub, but additional care should be taken not to have multiple hub usernames associated with the same person.

Users' hub access can then be managed from the administrator panel: https://docs.2i2c.org/en/latest/admin/howto/manage-users.html#manage-users-from-the-administrator-panel.

The above would be the most straightforward setup. But, more complex things can also be achieved.
For example:

  • present a shorter list of identity providers to a user as login options. For example, instead of showing all CILogon authentication options, a user can just see the option to log in with GitHub, Google and Microsoft.
  • only allow certain identity providers to be used for login
  • modify the user information that we get from each allowed identity provider, to use as the hub username. For example, things like stripping domains from emails, adding prefixes to the usernames and specifying an identity-provider-specific username claim can be leveraged
  • restrict the access of the users into the hub based on an allowed_domains list

For example, https://demo.2i2c.cloud only shows two identity providers as login options, Google and Texas Uni, and only these two are permitted. If logging in through Google, only Google email addresses that end in 2i2c.org will be allowed to log in, whereas all Texas Uni accounts will be granted access to the hub (this can be further restricted to a list of usernames, managed through the hub admin panel).
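
The username-derivation and domain-restriction options described above can be pictured with a small stand-alone Python sketch. This is not the CILogon authenticator's code: the `IDP_RULES` table, its keys, the provider names, and the `derive_username` helper are all hypothetical and only model the behavior described here (pick a per-provider claim, check an allowed-domains list, then strip the domain or add a prefix).

```python
# Illustrative model only: not the real authenticator implementation.
# All names here (IDP_RULES, derive_username, provider keys) are hypothetical.

IDP_RULES = {
    "google": {
        "username_claim": "email",
        "allowed_domains": ["2i2c.org"],  # only these email domains may log in
        "action": "strip_domain",
    },
    "github": {
        "username_claim": "username",
        "allowed_domains": None,  # no domain restriction
        "action": "prefix",
        "prefix": "gh",
    },
}


def derive_username(idp, claims):
    """Return the hub username for a login, or None if it is not allowed."""
    rule = IDP_RULES.get(idp)
    if rule is None:
        return None  # identity provider is not on the allowed list

    value = claims.get(rule["username_claim"])
    if not value:
        return None  # the provider did not return the claim we need

    if rule["allowed_domains"] is not None:
        # Reject values with no domain part or with a domain not on the list.
        domain = value.rpartition("@")[2]
        if "@" not in value or domain not in rule["allowed_domains"]:
            return None

    if rule["action"] == "strip_domain":
        return value.partition("@")[0]
    if rule["action"] == "prefix":
        return f'{rule["prefix"]}:{value}'
    return value


print(derive_username("google", {"email": "ada@2i2c.org"}))     # ada
print(derive_username("google", {"email": "ada@example.com"}))  # None
print(derive_username("github", {"username": "octocat"}))       # gh:octocat
print(derive_username("orcid", {"sub": "0000-0000"}))           # None
```

Keeping the rules in one table per identity provider makes it easier to see which logins can produce which usernames, which helps avoid two usernames for the same person.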

@damianavila damianavila moved this from Todo 👍 to In Progress ⚡ in Sprint Board Mar 10, 2023
@damianavila
Contributor

damianavila commented Mar 30, 2023

I think the next steps here are:

  • Deploy a "standard" hub on shared 2i2c infrastructure (shared 2i2c GCP/AWS cluster)
  • CILogon for authentication
  • Default 2i2c image

And we go from there!

@colliand
Contributor Author

I agree. Let's proceed with an education hub on a shared cluster (GCP or AWS as chosen by 2i2c engineering). Use CILogon with a GitHub auth layer for now. We'll work with @nmearl and @patudom to adapt this later.

@github-project-automation github-project-automation bot moved this from In Progress ⚡ to Done 🎉 in Sprint Board Mar 30, 2023
@github-project-automation github-project-automation bot moved this from Needs Shaping / Refinement to Complete in DEPRECATED Engineering and Product Backlog Mar 30, 2023
@GeorgianaElena
Member

Hub is now running at https://cosmicds.2i2c.cloud 🎉 Check it out.

@jmunroe
Contributor

jmunroe commented Mar 30, 2023

I used the configurator to change the default experience to JupyterLab from the classic notebook.

But I'm expecting that the CosmicDS/Harvard team will soon be trying out their own docker image.

Since we are deprecating 2i2c-hubs-image for non-data8 hubs, should we be changing the hub image to cds-containds:latest instead?

@nmearl -- you can change this using the "configurator" tool in the JupyterHub control panel but we can also hardcode the docker image to use. Please chime in here if it is not clear how to choose a different image for your hub.

@patudom

patudom commented Mar 30, 2023

Thanks so much! I logged in using my github credentials but was not able to access the server. I tried twice and got two slightly different sets of error messages. I'll post them here in case it's useful:

Attempt 1:

Spawn failed: pod cosmicds/jupyter-patudom did not start in 600 seconds!

Event log
Server requested
2023-03-30T15:32:57Z [Normal] Successfully assigned cosmicds/jupyter-patudom to gke-pilot-hubs-cluster-nb-user-05192705-npnd
2023-03-30T15:32:58Z [Normal] Pulling image "busybox"
2023-03-30T15:32:58Z [Normal] Successfully pulled image "busybox" in 94.440927ms
2023-03-30T15:32:58Z [Normal] Created container volume-mount-ownership-fix
2023-03-30T15:32:58Z [Normal] Started container volume-mount-ownership-fix
2023-03-30T15:32:58Z [Normal] Container image "jupyterhub/k8s-network-tools:2.0.1-0.dev.git.5866.h7de20b77" already present on machine
2023-03-30T15:32:58Z [Normal] Created container block-cloud-metadata
2023-03-30T15:32:59Z [Normal] Started container block-cloud-metadata
2023-03-30T15:32:59Z [Normal] Pulling image "nmearl/cds-containds:latest"
2023-03-30T15:34:35Z [Normal] Successfully pulled image "nmearl/cds-containds:latest" in 1m35.245028196s
2023-03-30T15:34:37Z [Normal] Created container notebook
2023-03-30T15:34:37Z [Normal] Started container notebook
2023-03-30T15:34:38Z [Normal] Container image "nmearl/cds-containds:latest" already present on machine
2023-03-30T15:34:39Z [Warning] Back-off restarting failed container
Spawn failed: pod cosmicds/jupyter-patudom did not start in 600 seconds!

Attempt 2:

Spawn failed: Timeout

Event log
Server requested
2023-03-30T16:13:30Z [Normal] Successfully assigned cosmicds/jupyter-patudom to gke-pilot-hubs-cluster-nb-user-05192705-npnd
2023-03-30T16:13:31Z [Normal] Pulling image "busybox"
2023-03-30T16:13:31Z [Normal] Successfully pulled image "busybox" in 117.089344ms
2023-03-30T16:13:31Z [Normal] Created container volume-mount-ownership-fix
2023-03-30T16:13:31Z [Normal] Started container volume-mount-ownership-fix
2023-03-30T16:13:32Z [Normal] Container image "jupyterhub/k8s-network-tools:2.0.1-0.dev.git.5866.h7de20b77" already present on machine
2023-03-30T16:13:32Z [Normal] Created container block-cloud-metadata
2023-03-30T16:13:33Z [Normal] Started container block-cloud-metadata
2023-03-30T16:13:33Z [Normal] Container image "nmearl/cds-containds:latest" already present on machine
2023-03-30T16:13:34Z [Normal] Container image "nmearl/cds-containds:latest" already present on machine
2023-03-30T16:13:34Z [Normal] Started container notebook
2023-03-30T16:13:35Z [Warning] Back-off restarting failed container
Spawn failed: Timeout

@nmearl

nmearl commented Mar 31, 2023

@jmunroe what's the image used for the hub setup? Is it the default jupyterhub/k8s-hub image, or something else?

@GeorgianaElena
Member

@nmearl, the hub image is defined here.

@patudom

patudom commented Jul 12, 2023

That would be great - thank you, @yuvipanda. Enabling the google/microsoft accounts as soon as the numerical ID is available (preferably, no later than Friday 7/14) would be great.

The google/microsoft account logon can be disabled any time after 5pm ET on Friday 7/21.

@yuvipanda
Member

@patudom @nmearl alright, I did a bunch of research and spoke to some privacy / cryptography folks, and have a solution that would provide usernames like hwvicmxcpcurbifh3hfuwyhuyoymgio24tje5zhiv64vfqa4lm2a and would not store any PII. However, in our experience, this makes support difficult, because you must ask the end user 'what is your username?' and they have to log in to the hub and find out. Is that acceptable?

I'll work on getting a PR up soon.
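
One common way to obtain opaque usernames of this shape is a keyed hash of the identity provider's stable user identifier. The sketch below is an assumption for illustration, not the implementation in the PR: the `anonymize_username` helper, the choice of BLAKE2b, the `pepper` secret, and the sample identifiers are all hypothetical. A 32-byte digest encoded as unpadded lowercase base32 yields 52 characters, the same length as the example username above.

```python
import base64
import hashlib


def anonymize_username(idp, subject, pepper):
    """Derive an opaque, stable username from an identity without storing it.

    Hypothetical helper: `idp` and `subject` stand for the identity provider
    and the provider's stable user identifier; `pepper` is a server-side secret.
    """
    # Keyed BLAKE2b: without the pepper, the hash cannot be recomputed
    # from a guessed identifier.
    digest = hashlib.blake2b(
        f"{idp}:{subject}".encode("utf-8"),
        key=pepper,
        digest_size=32,
    ).digest()
    # Base32 uses only letters and the digits 2-7; drop the "=" padding.
    return base64.b32encode(digest).decode("ascii").lower().rstrip("=")


pepper = b"example-secret-do-not-reuse"
name = anonymize_username("http://google.com/accounts/o8/id", "1234567890", pepper)
print(len(name))  # 52
```

The same identity always maps to the same username, so the hub can recognize a returning user, while the username itself reveals nothing about the email address or name behind it.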

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Jul 12, 2023
- Implement the feature
- Write documentation on how to enable it, and why not to
- Enable it for cosmicds

TODO:

- Describe *how* this works
- Validate that authorization can work with this at all

Ref 2i2c-org#2128 (comment)
@damianavila damianavila moved this from Done 🎉 to Waiting 🕛 in Sprint Board Jul 13, 2023
@nmearl

nmearl commented Jul 13, 2023

@yuvipanda this sounds good, but I'm curious why not simply use one of the openid or CILogon-specific scope's unique identifier claims for this purpose (e.g. sub or oidc)? Seems it'd be a bit more tractable (re: shorter) without including identifiable information.

We do not mind having the user provide their username for support purposes.

@yuvipanda
Member

@nmearl @patudom I've written up a short document here on why this is needed to not store PII at https://github.com/2i2c-org/infrastructure/pull/2809/files#diff-ac02bcb032968f861b77821bb371824dfa6bfad4da743117e853e463cb902682. Take a look and let me know if that helps answer your questions?

@patudom

patudom commented Jul 13, 2023

Thanks, @yuvipanda - we appreciate your research on this and the detailed writeup on pros, cons, and limitations of this method. None of the downsides you listed will be an issue with our current setup, so please go ahead and enable the anonymization. Thank you.

@yuvipanda
Member

Great, @patudom. I've updated #2809 to match, and tested it to work. The auth should be enabled once that gets merged, which will hopefully be tomorrow.

We'll need to come up with a revert plan for July 21 soon though. We'll have to keep an eye on this to look for cryptomining as well.

@patudom

patudom commented Jul 13, 2023

Thank you, @yuvipanda.

@yuvipanda
Member

yw, @patudom. I'll update the issue once it gets merged.

Note that all existing users' names will change, so the contents of their home directory will go missing, as that's tied to their name. So for any existing users, please ask them to backup / download their home directories so as to not lose anything.

@nmearl

nmearl commented Jul 13, 2023

Just as an aside: no user should have anything in their home directories. Users should never know there's a JupyterHub running. As discussed in our chats, we hope to disable the Lab/Notebook environment and simply use the proxy to serve the content the users should interact with.

@yuvipanda
Member

@nmearl can we just turn off the persistent home directory completely? So nothing is preserved between server restarts.

@nmearl

nmearl commented Jul 14, 2023

@yuvipanda Yes, that would be fine.

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Jul 14, 2023
@yuvipanda
Member

@nmearl @patudom try it out now, this has been deployed

@patudom

patudom commented Jul 14, 2023

Thank you, @yuvipanda! I've been able to log on with Google credentials and have my username appear as the randomly constructed string. We'll work on our end on tying those usernames to our database, where we will store the students' activity within the Data Story.

I did notice a couple additional issues. If there's the possibility of addressing this first one by Wednesday, that would be great, but we understand if that's not possible.

  • When I click on the Hubble story link within the jupyter lab, it always hangs on a blank white page (with no progress bar or any other information). If I refresh the page, it will eventually load, but it is still pretty slow. Is there any way to shorten the launch time? (@nmearl told me that you will be meeting with him next week, so maybe you and Nick could look at the server logs to troubleshoot this and improve the performance).

This 2nd issue can be addressed after next week's implementation:

  • While we were waiting for the 2i2c hub setup, we've been using an AWS (non-Kubernetes) hub to run the Hubble story in classrooms this spring. Nick was able to set that up so we could give the user a URL that would take them straight to the app after logging on (instead of having to pass through the JupyterLab page and manually click on the Hubble button to launch the app). Would it be possible for our 2i2c hub to be set up that way (with direct links to specific data stories)?

@yuvipanda
Member

@patudom for (2), you can use the user-redirect feature here. So if you want to send users to /hubble after login, the URL would be https://cosmicds.2i2c.cloud/hub/user-redirect/hubble/ - clicking that will take the user directly to hubble/.
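
A small helper can build such links for several data stories. This is a sketch: the `user_redirect_url` function is hypothetical, and only the `/hub/user-redirect/` prefix and the `hubble` path come from the comment above.

```python
def user_redirect_url(hub_url, path):
    """Build a JupyterHub user-redirect link to `path` on the user's server."""
    base = hub_url.rstrip("/")   # tolerate a trailing slash on the hub URL
    target = path.strip("/")     # tolerate leading/trailing slashes on the path
    return f"{base}/hub/user-redirect/{target}/"


print(user_redirect_url("https://cosmicds.2i2c.cloud", "/hubble"))
# https://cosmicds.2i2c.cloud/hub/user-redirect/hubble/
```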

As for (1), unfortunately I don't think that's something we can really help with, as I don't really know how the dashboards work or what is taking time. I can help answer specific questions if you have any though.

@GeorgianaElena GeorgianaElena removed their assignment Jul 21, 2023
@yuvipanda
Member

@patudom @nmearl did this event go well? Can I remove the generic auth now?

@nmearl

nmearl commented Aug 1, 2023

Hey @yuvipanda. Yes, you can go ahead and remove the generic auth now. The test did not go as well as hoped, and we were forced to revert back to our own deployment. There are a few facets to this that I had hoped we could talk about over our Zoom call, but many revolved around subpar performance when a handful of active users were on the hub, as well as a database communication issue which only seems to occur on the 2i2c hub and that we're hoping we can get access to the hub server logs to diagnose.

@yuvipanda
Member

@nmearl ah sorry to hear that! Let's catch up and see what we can do

@damianavila
Contributor

I will close this one now since a hub was deployed and other additional features deployed.
If further work/debugging is needed, it should be captured in another issue.
I have created a continuation/placeholder for it: #2986

Thanks!

@github-project-automation github-project-automation bot moved this from Waiting 🕛 to Done 🎉 in Sprint Board Aug 15, 2023
@nmearl

nmearl commented Aug 23, 2023

[Report from email] @yuvipanda We’ve successfully transitioned to using the hub user image template you provided. I’m curious if you could point out how we might access the server logs using this approach?

For the integration with the front-end site and the CILogon authentication, in addition to 2i2c providing the client id and client secret to us, do we need to provide you a callback url?
