Authenticated and optionally private static websites served from a hub #1500

Open
1 of 2 tasks
choldgraf opened this issue Jul 5, 2022 · 15 comments · Fixed by #1502
@choldgraf
Member

choldgraf commented Jul 5, 2022

Context

Our hosted documentation service describes how people can point the hub to a GitHub repository branch containing HTML files, which the hub then serves as a service.

However, a few things are unclear from the documentation:

  1. Will the hosted documentation be hidden from users until after log-in?
  2. How often will the hosted documentation update as changes are made?
  3. Can the documentation repository be private?

Proposal

We should answer each of the questions above, and potentially do some development work to make these things possible if they aren't already.

Updates and actions

References

@yuvipanda
Member

The 'hosted documentation' is not restricted to logged-in users; anyone can see it. Restricting it would require an OAuth-based proxy, which we don't currently have set up.

My suggestion is to use an nbgitpuller setup for private repository access: users get a link they can click that takes them to a Markdown or notebook file with the information.
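For illustration, here is a minimal sketch of how such a link could be built. It assumes nbgitpuller's usual `git-pull` link format; the hub URL, repository, and file path are hypothetical placeholders, and a private repository would additionally need git credentials configured on the hub.

```python
from urllib.parse import urlencode


def nbgitpuller_link(hub_url, repo_url, branch, urlpath):
    """Build a link that syncs repo_url into the user's home and then opens urlpath."""
    query = urlencode({"repo": repo_url, "branch": branch, "urlpath": urlpath})
    # /hub/user-redirect/ sends each user to their own running server.
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{query}"


link = nbgitpuller_link(
    "https://hub.example.org",                    # hypothetical hub
    "https://github.com/example-org/event-docs",  # hypothetical repository
    "main",
    "lab/tree/event-docs/README.md",              # file to open after the pull
)
print(link)
```

Every click on the link pulls the latest content, but users still have to click it, which is the extra step discussed in the next comment.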

@choldgraf
Member Author

Is it possible for us to add a start script, similar to what we do in Binder? Then we could make a command run every time somebody's session starts, and this could pull down the latest version of content via nbgitpuller and display it or something.

The challenge I see with this approach is that the content will be attached to an event, and the organizers will want it to update quickly and somewhat often as they update the source content in the event. I worry that an nbgitpuller approach will require users to take extra "click" actions that in practice will be clunky.

@yuvipanda
Member

The 'right' way to do this is to use https://oauth2-proxy.github.io/oauth2-proxy/ and have that authenticate against the JupyterHub as a service. We need to do that properly once and then it can be reused elsewhere!

More specifically, we'll need to:

This should allow us to host arbitrary static sites protected by JupyterHub login.

@yuvipanda
Member

ok so I got nerdsniped into this. Here's a bunch of findings:

  1. Despite the name, oauth2-proxy doesn't actually support straight-up OAuth2! It only supports OIDC, which is built on top of OAuth2.
  2. JupyterHub doesn't support OIDC, only OAuth2.
  3. However, OIDC is only a small layer on top of OAuth2, so if we can find some glue that can take in OAuth2 on one end and spit out OIDC on the other end...
  4. That 'something' is https://dexidp.io/.
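To make the "small layer" concrete, here is a conceptual sketch of what the glue adds: it takes the user model that the hub's user-info endpoint returns and produces the standard ID token claims (`iss`, `sub`, `aud`, `iat`, `exp`) that an OIDC client expects. This is not Dex's actual implementation; the function name and the one-hour lifetime are assumptions, and a real provider also signs the token and may encode `sub` differently.

```python
import time


def hub_user_to_id_claims(hub_user, issuer, client_id,
                          user_id_key="name", lifetime=3600, now=None):
    """Map a JupyterHub user model to minimal OIDC-style ID token claims."""
    now = int(time.time()) if now is None else int(now)
    return {
        "iss": issuer,                        # who issued the token
        "sub": str(hub_user[user_id_key]),    # stable identifier for the user
        "aud": client_id,                     # the OIDC client the token is for
        "iat": now,                           # issued-at time
        "exp": now + lifetime,                # expiry time
    }


claims = hub_user_to_id_claims(
    {"name": "alice", "admin": False},        # shape of a hub user model (abridged)
    issuer="http://127.0.0.1:8000/services/dex",
    client_id="oauth2-proxy",
    now=1000,
)
print(claims)
```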

I have this working locally!

A dex.yaml:

issuer: http://127.0.0.1:8000/services/dex
storage:
  type: sqlite3
  config:
    file: dex.sqlite
web:
  http: 0.0.0.0:5556

oauth2:
  skipApprovalScreen: true

connectors:
- type: oauth
  # ID of OAuth 2.0 provider
  id: hub
  # Name of OAuth 2.0 provider
  name: hub
  config:
    # Connector config values starting with a "$" will read from the environment.
    clientID: service-dex
    clientSecret: wateriswet
    redirectURI: http://127.0.0.1:8000/services/dex/callback
    userIDKey: name

    tokenURL: http://127.0.0.1:8000/hub/api/oauth2/token
    authorizationURL: http://127.0.0.1:8000/hub/api/oauth2/authorize
    userInfoURL: http://127.0.0.1:8000/hub/api/user

staticClients:
- id: oauth2-proxy
  redirectURIs:
  - 'http://127.0.0.1:9000/oauth2/callback'
  name: 'oauth2-proxy'
  secret: proxy

A jupyterhub_config.py file:

c.JupyterHub.authenticator_class = 'dummy'
c.JupyterHub.spawner_class = 'simple'

c.JupyterHub.load_roles = [
    {
        'name': 'user',
        'scopes': [
            # Allow all users access to 'services', which include the hubs that
            # use the main datahub for auth
            'access:services',
            'self',
        ],
    }
]
c.JupyterHub.services = [
    {
        'name': 'dex',
        'url': 'http://0.0.0.0:5556',
        'api_token': 'wateriswet',
        'oauth_redirect_uri': 'http://127.0.0.1:8000/services/dex/callback'
    },
]
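The two files have to agree with each other. As a sketch of that relationship, the helper below derives the Dex connector values from the hub service entry. It assumes that JupyterHub defaults a service's OAuth client ID to `service-<name>` and uses the service's `api_token` as the client secret, which matches the values above; the helper name is hypothetical.

```python
def dex_connector_config(service, hub_base_url):
    """Derive the Dex 'oauth' connector config that matches a hub service entry."""
    base = hub_base_url.rstrip("/")
    return {
        # Assumed JupyterHub default: "service-" + the service name.
        "clientID": service.get("oauth_client_id", f"service-{service['name']}"),
        # The service's API token doubles as the OAuth client secret.
        "clientSecret": service["api_token"],
        "redirectURI": service["oauth_redirect_uri"],
        "userIDKey": "name",
        "tokenURL": f"{base}/hub/api/oauth2/token",
        "authorizationURL": f"{base}/hub/api/oauth2/authorize",
        "userInfoURL": f"{base}/hub/api/user",
    }


service = {
    "name": "dex",
    "url": "http://0.0.0.0:5556",
    "api_token": "wateriswet",
    "oauth_redirect_uri": "http://127.0.0.1:8000/services/dex/callback",
}
connector = dex_connector_config(service, "http://127.0.0.1:8000")
print(connector)
```

If any of these values drift apart, for example a different redirect URI on one side, the login round trip is likely to fail at the hub's authorize step.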

Then I can start the oauth2-proxy with:

oauth2-proxy \
  --provider=oidc \
  --cookie-secret="${COOKIE_SECRET}" \
  --client-id=oauth2-proxy \
  --client-secret=proxy \
  --redirect-url=http://127.0.0.1:9000/oauth2/callback \
  --oidc-issuer-url=http://127.0.0.1:8000/services/dex \
  --email-domain='*' \
  --http-address=http://127.0.0.1:9000 \
  --upstream="file://$(pwd)#/" \
  --insecure-oidc-allow-unverified-email \
  --oidc-email-claim=sub

And now if I go to http://127.0.0.1:9000, I'll get redirected to the hub, asked to log in and then I'll see my static files! \o/
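The command relies on a `COOKIE_SECRET` environment variable and on the shell expanding the current directory. As a rough sketch, the same invocation can be assembled from Python, which generates a 32-byte URL-safe base64 secret (one of the sizes oauth2-proxy accepts) and resolves the static directory explicitly. The helper names are hypothetical; the flags are the ones from the command above.

```python
import base64
import os
import shlex
from pathlib import Path


def make_cookie_secret():
    """Return 32 random bytes, URL-safe base64 encoded."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode()


def proxy_argv(static_dir, cookie_secret):
    """Build the oauth2-proxy argument list for serving static_dir."""
    upstream = f"file://{Path(static_dir).resolve()}#/"
    return [
        "oauth2-proxy",
        "--provider=oidc",
        f"--cookie-secret={cookie_secret}",
        "--client-id=oauth2-proxy",
        "--client-secret=proxy",
        "--redirect-url=http://127.0.0.1:9000/oauth2/callback",
        "--oidc-issuer-url=http://127.0.0.1:8000/services/dex",
        "--email-domain=*",
        "--http-address=http://127.0.0.1:9000",
        f"--upstream={upstream}",
        "--insecure-oidc-allow-unverified-email",
        "--oidc-email-claim=sub",
    ]


argv = proxy_argv(".", make_cookie_secret())
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

The list could also be handed to `subprocess.run(argv)` directly, which sidesteps shell quoting altogether.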

I'll have to port this to this repo, but that should be doable.

@yuvipanda
Member

The one thing it's missing is JupyterHub's oauth_no_confirm, which was implemented a while ago and removes an additional consent screen from this flow. That requires the latest JupyterHub, and we can turn it on once that's deployed.
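As a sketch, turning that on would presumably mean adding one key to the service entry from the config above. The snippet builds the list as plain data so it runs on its own; in `jupyterhub_config.py` it would be assigned to `c.JupyterHub.services`.

```python
dex_service = {
    "name": "dex",
    "url": "http://0.0.0.0:5556",
    "api_token": "wateriswet",
    "oauth_redirect_uri": "http://127.0.0.1:8000/services/dex/callback",
    # Skip the hub's "authorize this application" page for this service.
    # Assumes a JupyterHub version that supports the option.
    "oauth_no_confirm": True,
}

services = [dex_service]
# In jupyterhub_config.py: c.JupyterHub.services = services
print(services)
```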

@yuvipanda
Member

auth-oauth-proxy.mp4

here's a demo of the auth flow!

@yuvipanda yuvipanda self-assigned this Jul 6, 2022
@GeorgianaElena
Member

> here's a demo of the auth flow!

🎉 🎉 waaa, this is great @yuvipanda !!!

@yuvipanda
Member

Note that this now enables two new use cases:

  1. Integrating as auth with any multi-user software (WordPress, Discourse, Drupal, etc.) that supports OIDC as an auth layer
  2. Putting any web app (it doesn't need to have authentication itself) behind auth here. This is basically a different tack on the problem jupyter-server-proxy solved.

@damianavila damianavila moved this from Ready to work to Needs Shaping / Refinement in DEPRECATED Engineering and Product Backlog Jul 6, 2022
@damianavila
Contributor

This is great, @yuvipanda!!

@choldgraf choldgraf changed the title Make it possible to host *private* documentation via a hub Authenticated and optionally private static websites served from a hub Jul 8, 2022
Repository owner moved this from Needs Shaping / Refinement to Complete in DEPRECATED Engineering and Product Backlog Jul 11, 2022
@choldgraf
Member Author

choldgraf commented Oct 18, 2022

Hey all - I think that this one was accidentally closed early. We implemented the feature here:

but, we still don't have user-facing documentation for this functionality. In my opinion, if it is not documented for a particular user archetype, then the feature doesn't exist from their perspective. In this case, we have documentation for our engineers, but not for any user or external community member. Given that this is a user-facing feature, we shouldn't consider this issue resolved until it is documented for users.

This is the issue that needs resolution:

@choldgraf choldgraf reopened this Oct 18, 2022
@arokem
Contributor

arokem commented Oct 18, 2022

Lemme know if you want to hear more about how we used this in NeuroHackademy.

@yuvipanda
Member

@arokem YESSSSS TELL ME!

@arokem
Contributor

arokem commented Oct 18, 2022

Here's our use case: Because we had participants both in person and on Zoom, and we sometimes had multiple rooms where activities were going on at the same time, we wanted one page that would have all the Zoom links, mapped to physical locations. So we used this website for that. The (rather unadorned) page had a list that looked roughly like:

[Alder auditorium](https://link.to.zoom)
[Alder 103](https://link.to.another.zoom)
[Alder 102](https://link.to.yet.another.zoom)

where Alder auditorium, Alder 103, etc. are physical spaces at UW's Alder Hall. That way, when we announced the schedule for the day, we could say something like "Ariel's lecture on data visualization will take place at Alder 103 at 10am" and that would mean both a physical space and a Zoom room.
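Since the hub serves plain HTML files, a page like this could be generated with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, the links are the placeholders from the list above, and it is not how the NeuroHackademy page was actually built.

```python
from html import escape


def render_room_page(rooms):
    """Render a mapping of room name -> Zoom URL as a minimal HTML list."""
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(name)}</a></li>'
        for name, url in rooms.items()
    )
    return (
        "<!DOCTYPE html>\n<html>\n<body>\n"
        f"  <ul>\n{items}\n  </ul>\n"
        "</body>\n</html>\n"
    )


page = render_room_page({
    "Alder auditorium": "https://link.to.zoom",
    "Alder 103": "https://link.to.another.zoom",
})
print(page)  # write this out as index.html in the served branch
```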

The reason it mattered that this was behind authentication is that we didn't want anyone Zoom-bombing us.

One use case -- I imagine there are others.

@damianavila
Contributor

@jmunroe, to make a decision on how to communicate this to the communities.

@damianavila
Contributor

damianavila commented Aug 15, 2023

Update: This feature is currently broken, #2206.


6 participants