[New Hub] Alabama Water Institute CIROH hub #1444

Closed
9 tasks done
colliand opened this issue Jun 21, 2022 · 49 comments

@colliand
Contributor

colliand commented Jun 21, 2022

Hub Description

The Alabama Water Institute (AWI) is convening a consortium of 28 university partners to improve water management for the USA. The announcement of the award to support the collaboration called CIROH is available here.

2i2c has been engaged to provide interactive computing service supporting this collaboration.

The service will initially use GitHub authentication with an allow list based on membership in the AWI GitHub organization. As the service evolves, I anticipate we may move over to CILogon.

Community Representative(s)

@jameshalgren

Important dates

Notes: dates are updated according to new information and prioritization.

  • Target start date: 2022-08-01
  • Required start date: 2022-07-08

Hub Authentication Type

GitHub Authentication (e.g., @MyGitHubHandle)

Hub logo information

  • URL to Hub Image:
    ciroh

  • URL for Image Link: {{ URL HERE }}

Hub user image

  • Repository for user image: { REPO LINK IF IT EXISTS }
  • User image registry: { REGISTRY IF ONE ALREADY EXISTS }
  • User image tag and name: { NAME AND TAG IF IT EXISTS }

Extra features you'd like to enable

  • Specific cloud provider or datacenter: GCP us-central1
  • Dedicated Kubernetes cluster
  • Scalable Dask Cluster

Other relevant information

Let's get started with a Pangeo-style Daskhub. The capacity of the team at AWI is increasing and a customized software environment will likely be ready later in the year.

I suggest this hub offer the VNC/Linux desktop feature.

This hub should be hosted on GCP in a data center that hosts the National Water Model Data.

Hub URL

ciroh.awi.2i2c.cloud

Hub Type

daskhub

Tasks to deploy the hub

  • Engineer who will deploy the hub is assigned
  • Deploy information filled in above
  • Initial Hub deployment PR: Alabama Water Institute CIROH deployment #1553
  • Administrators able to log on
  • Community Representative satisfied with hub environment
  • Hub now in steady-state
@colliand
Contributor Author

Based on today's call with @jameshalgren, I suggest the following onboarding process. CIROH and AWI have ambitious plans so it's important we get the initial conditions right.

  1. @colliand will work with @jameshalgren, Stefanie O'Neil and colleagues from 2i2c and CS&S to establish the business relationship. The path suggested by AWI is that 2i2c/CS&S provide a contract with an attached statement of work document. The statement of work will be phased.
  2. The hub will be launched sometime in July and @jameshalgren (with inputs from the 2i2c team) will seed a shared directory with some sample notebooks.
  3. After the hub is up and running, I suggest that @fperez virtually meet with the AWI/CIROH team and provide a ~1h demo of how to use the platform for open science. Others on the 2i2c team (e.g. @colliand and @jmunroe) should assist with Fernando's demo and learn his tricks so that we can provide similar demos in the future.

@damianavila
Contributor

Suggested plan LGTM, @colliand.
I added the request to the backlog board and we will find the eng resources so we can push forward on step 2 in a timely manner.

@jameshalgren, we will ping you soon with some questions about the specifics of the hub deployment.

@jameshalgren

Thanks @colliand, @damianavila. Processing, will respond soon.

@jameshalgren

jameshalgren commented Jun 23, 2022

A few questions, possibly specialized, probably going beyond the scope of this issue. Tagging @colliand to ask for redirection or moderation if necessary.

  • Lots of evolution in this terminology; this is just one link from my modest attempt to survey the panoply.

@jameshalgren

Tagging @whitelightning450 @karnesh for situational awareness.

@colliand
Contributor Author

colliand commented Jun 23, 2022

Hi James! Yes 2i2c has experience with the real-time-collaboration features in upstream Jupyter. Experiments have shown that feature is not ready for production deployments. There is ongoing work there and 2i2c will support RTC when we can do so securely and robustly.

Yes, our team is contributing to the "tantalizing future" you referenced. The pioneering work of the Pangeo community is an inspiration for the founding of 2i2c. We are in the process of on-boarding a new team member @jmunroe who has technical and community experience with big data geosciences. I spent some time briefing him on CIROH/AWI today and expect he will be an excellent resource for our collaboration.

@fperez
Contributor

fperez commented Jul 1, 2022

@jameshalgren I haven't seen (which doesn't mean they don't exist, obviously) examples of hubs tightly integrated with ODCs. But from a quick look at the ODC setup, I see a key element of this is having an accessible Postgres server to manage the actual data catalogs and serving.

Coincidentally, as part of the Jupyter Meets the Earth effort, with @consideRatio and @yuvipanda we're looking right now at how to most cleanly set up a persistent, robust and cost-effective Postgres server that can be accessed by all the users of a Hub. We happen to need that for one of our research projects, and our current solution (via sqlite) is sub-optimal.

We'll be happy to share any progress we make on that front back with the rest of the team - just today I was discussing with @consideRatio how this was very likely to be a use case that many others would be likely to encounter. So I'm delighted to see that intuition confirmed by your needs, and it means it's all the more timely that we make progress on it :)

@jmunroe
Contributor

jmunroe commented Jul 7, 2022

Assuming the National Water Model will be a key dataset used by this hub, I'll note a few other links

This is in addition to the NWM data store on GCP linked above.

I am interested in identifying other key datasets that the community anticipates using on this hub, to ensure it is set up so that accessing that data is straightforward for users.

@colliand
Contributor Author

colliand commented Jul 8, 2022

Thanks @jmunroe! I'll add @jameshalgren here in case he can share any other input on important data sets for the emerging CIROH community.

@jameshalgren

jameshalgren commented Jul 8, 2022

Thanks @colliand and @jmunroe. I've jotted down a few thoughts/responses to launch the weekend:

Assuming the National Water Model will be a key dataset used by this hub, I'll note a few other links

It will be the key dataset used in this hub, together with observation data initially from USGS, but from any valid source.

  • About the National Water Model from the Office of Water Prediction

    • Includes links to HTTP and FTP sites of the last two days output of the NWM.

I think it is http only at this point. There are ftp-versions using the LDM protocol for direct sharing of data between NWS offices, but that's probably not relevant here for the moment.
FWIW, the NOMADS servers also host all of the NWS weather model output -- though the storage formats are far from optimal for cloud access, just like the NWM data.

There is a 1.2 GCP bucket of the same data (they use the label 'reanalysis' which is technically incorrect...). The AWS version of that data is more complete, with the 1.2, 2.0, and 2.1 versions of the retrospective data, along with experimental (?) versions with subsets of the data in zarr formats.

The GCP bucket mentioned is a superset of the S3 resource, with the analysis, short (on s3), medium, and long-range output. In fact, only a handful of specific derived products appear to be missing from the GCP bucket relative to what is available on the direct download from NOMADS.

Hopefully, some of what we make here can allow for Dr. Maidment's work to be more easily contributed back to the broader NWM community. He and his team were critical influencers in the initiation of the project and continue to generate great work!

This is in addition to the NWM data store on GCP linked above.

I am interested in identifying other key datasets that the community anticipates using on this hub, to ensure it is set up so that accessing that data is straightforward for users.

I mentioned USGS data. There is a useful toolset for accessing USGS data and we may use that or replicate a portion into storage on the cloud backend. I am aware of a similar script by @groutr.

Those observed streamflow (which are really observed stream-stage data converted to estimated flow -- but the convention is to call them streamflow...) data will be the key initial dataset because they are the key output from the model. As we continue, additional variables will be examined and we will have to identify or create repositories of validation data to use for exploration.

@damianavila
Contributor

A few questions for all of you 😉

Let's get started with a Pangeo-style Daskhub. The capacity of the team at AWI is increasing and a customized software environment will likely be ready later in the year.

OK, so starting with the pangeo-notebook image is enough to start with, I presume. Can you confirm?

I suggest this hub offer the VNC/Linux desktop feature.

IIRC, @yuvipanda set up this feature for the Jack Eddy symposium.

This hub should be hosted on GCP in a data center that hosts the National Water Model Data.

Are we talking about a dedicated cluster here? Or are you OK with the hub being deployed in a shared cluster?
(@colliand do you have any more information about this aspect from the lead process? Thanks!)

@sgibson91
Member

I suggest this hub offer the VNC/Linux desktop feature.

IIRC, @yuvipanda set up this feature for the Jack Eddy symposium.

@yuvipanda it would be great if we could take this opportunity to document how to set up this feature in the hub features docs

@consideRatio
Contributor

consideRatio commented Jul 18, 2022

I suggest this hub offer the VNC/Linux desktop feature.

IIRC, @yuvipanda set up this feature for the Jack Eddy symposium.

@yuvipanda it would be great if we could take this opportunity to document how to set up this feature in the hub features docs

For reference, I think this is solely something to set up in the user image. This is what JMTE has done to support this functionality.

  1. Install TurboVNC
  2. Install jupyterhub/jupyter-remote-desktop-proxy
  3. Install dependency: websockify

It is then represented as the "Desktop" icon in the JupyterLab launcher.
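
For anyone checking an image against the three steps above, here is a minimal Python sketch that reports which pieces can be found inside a running user server. The binary and module names are assumptions, not taken from this thread: TurboVNC is assumed to expose `vncserver` on `PATH`, and the proxy package is assumed to import as either `jupyter_remote_desktop_proxy` or the older `jupyter_desktop`.

```python
import importlib.util
import shutil


def desktop_prerequisites(
    vnc_binaries=("vncserver",),  # assumed executable name for the VNC server
    proxy_modules=("jupyter_remote_desktop_proxy", "jupyter_desktop"),  # assumed import names
    websockify_binary="websockify",  # assumed executable name
):
    """Report which pieces of the remote-desktop setup can be found."""
    return {
        # True if any candidate VNC server executable is on PATH
        "vnc_server": any(shutil.which(name) is not None for name in vnc_binaries),
        # True if any candidate proxy module is importable
        "desktop_proxy": any(
            importlib.util.find_spec(name) is not None for name in proxy_modules
        ),
        # True if the websockify executable is on PATH
        "websockify": shutil.which(websockify_binary) is not None,
    }


if __name__ == "__main__":
    for component, found in desktop_prerequisites().items():
        print(f"{component}: {'found' if found else 'missing'}")
```

A `False` here only means the name was not found where this sketch looks; an image that installs the VNC server outside `PATH` would need the candidate names adjusted.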


@colliand
Contributor Author

Yes, I suggest that the AWI/CIROH hub be set up on a dedicated GKE cluster on the data center where the NWM data is hosted. I suggest that 2i2c manage the billing account for the cluster with the monthly cloud usage costs passed through to AWI. AWI/CIROH may choose to take over the billing account as the service and their devops capacity expands.

I like the advice shared by @consideRatio that we set this hub to resemble the JMTE hub. The suite of integrated tools in that hub is tuned to support collaborations like those envisioned by CIROH.

@sgibson91
Member

sgibson91 commented Jul 19, 2022

I suggest that 2i2c manage the billing account for the cluster with the monthly cloud usage costs passed through to AWI. AWI/CIROH may choose to take over the billing account as the service and their devops capacity expands.

This sounds like we should create a new billing account and not just use the two-eye-two-see one, no?

P.S. It also looks like I don't manage the two-eye-two-see billing account, so I can't create a project attached to that one in the interim

[Screenshot, 2022-07-19 at 10:26:20]

@jameshalgren

I like the advice shared by @consideRatio that we set this hub to resemble the JMTE hub. The suite of integrated tools in that hub is tuned to support collaborations like those envisioned by CIROH.

Link to that hub for reference?

@jameshalgren

@whitelightning450, @hellkite500, @aaraney, @karnesh, @mgdenno -- have been meaning to loop you in here so you can follow the development here.

@quebbs -- hello! -- tagging you ahead of upcoming discussion. This may be a tool to put to use.

@sgibson91
Member

sgibson91 commented Jul 20, 2022

Ok, I have created a new GCP account to deploy this into. I have connected the 2i2c billing account for now, and we can decide to change that later if needed.

(Big gold star ⭐ to Chris for figuring that out!)

@sgibson91
Member

on the data center where the NWM data is hosted

Can we be a bit more specific about this please? The NWM data is multi-regional in the US: so is us-central1-b ok? Do we envision this hub wanting to use GPUs in the future (then we should go with us-central1-c)?

@damianavila
Contributor

Do we envision this hub wanting to use GPUs in the future (then we should go with us-central1-c)?

@colliand was that piece part of the conversation? @jameshalgren, any input about this one?

@jameshalgren

jameshalgren commented Jul 20, 2022

I think we want to avoid f-35 syndrome. Let me check with a couple of others but I think we can do plenty with GPCPUs for now.

Having the option in the future might be useful. What are the trade-offs for going to the data center where GPUs are available?

@damianavila damianavila removed their assignment Jul 20, 2022
@sgibson91
Member

I am struggling to install TurboVNC with the provided code snippet and receiving the following error:

  E: Invalid archive signature
  E: Internal error, could not locate member control.tar{.zst,.lz4,.gz,.xz,.bz2,.lzma,}
  E: Could not read meta data from /home/jovyan/turbovnc.deb
  E: The package lists or status file could not be parsed or opened.

PR: CIROH-UA/awi-ciroh-image#1
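
One thing that may be worth ruling out with this kind of error is a download that did not produce a real package file. A `.deb` is an `ar` archive, which starts with the eight bytes `!<arch>\n`, so a look at the first bytes can tell a package apart from, say, an HTML error page. The sketch below is only a diagnostic aid under that assumption, and the function name is hypothetical; it does not claim to explain the failure reported here.

```python
AR_MAGIC = b"!<arch>\n"  # global header that opens every ar archive


def looks_like_deb(path):
    """Return True if the file at `path` starts with the ar archive magic."""
    with open(path, "rb") as handle:
        return handle.read(len(AR_MAGIC)) == AR_MAGIC


if __name__ == "__main__":
    import sys

    # Usage: python check_deb.py /home/jovyan/turbovnc.deb
    for candidate in sys.argv[1:]:
        verdict = "ar archive" if looks_like_deb(candidate) else "NOT an ar archive"
        print(f"{candidate}: {verdict}")
```

A passing check only confirms the header; it says nothing about whether the rest of the package is intact.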

@consideRatio
Contributor

@sgibson91 seems like you have the exact same code snippet and a similar base image as in https://github.com/pangeo-data/jupyter-earth/blob/master/hub.jupytearth.org-image/Dockerfile. So, maybe the apt install step crashes because of something missing, such as build-essential?

Hmmm, googling on the errors, I see notes about apt clean etc. Also, I note that you have a step before using apt update that didn't end with a cleanup step. Maybe that could help? This is a wild guess without motivation.

@sgibson91
Member

sgibson91 commented Jul 25, 2022

Thanks @consideRatio. I added the clean-up step to the earlier apt update invocation, and that produced a new error related to "held broken packages". So I added an apt update and the clean-up step to the TurboVNC step and now it builds successfully 🤷🏻

Final commit looks like this: 2i2c-org/awi-ciroh-image@6d4f05c (#1)

@jameshalgren

@jameshalgren Can you please provide the list of GitHub Teams you would like to have access to the hub?

@sgibson91 -- alabamawaterinstitute, please, and NOAA-OWP

Thanks!

@jameshalgren

jameshalgren commented Jul 25, 2022

@colliand

This deck

... that is a link to a NASA ICESat-2 Hackweek quote.

@sgibson91
Member

sgibson91 commented Jul 25, 2022

alabamawaterinstitute, please, and NOAA-OWP

These are organizations, I was under the impression you wanted specific teams to have access? E.g. the tech-team that is a member of the 2i2c org -> https://github.com/orgs/2i2c-org/teams/tech-team
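
To make the organization/team distinction concrete: GitHub allow lists for JupyterHub are commonly written as `org` entries for a whole organization or `org:team` entries for a single team. That format is an assumption about the authenticator configuration, not something stated in this thread, and the helper below is a hypothetical illustration rather than part of any deployment code.

```python
def split_allow_list(entries):
    """Split 'org' and 'org:team' entries into organizations and (org, team) pairs."""
    organizations = set()
    teams = set()
    for entry in entries:
        org, separator, team = entry.partition(":")
        if separator:
            # 'org:team' grants access to one team inside the organization
            teams.add((org, team))
        else:
            # a bare 'org' grants access to every member of the organization
            organizations.add(org)
    return organizations, teams


if __name__ == "__main__":
    orgs, teams = split_allow_list(
        ["alabamawaterinstitute", "NOAA-OWP", "2i2c-org:tech-team"]
    )
    print("organizations:", sorted(orgs))
    print("teams:", sorted(teams))
```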

@sgibson91
Member

sgibson91 commented Jul 25, 2022

Ah pardon me, I think I'm misremembering another hub setup issue where a question was raised about subteams

@jameshalgren

Ah pardon me, I think I'm misremembering another hub setup issue where a question was raised about subteams
@sgibson91
10-4 -- we may refine later, but I'm assuming that is a simple process.

@sgibson91
Member

I'm assuming that is a simple process

Absolutely.

The hubs are available here:

Please note these docs about authorising the GitHub app for the first time: https://infrastructure.2i2c.org/en/latest/howto/configure/auth-management.html#follow-up-github-organization-administrators-must-grant-access

@sgibson91
Member

sgibson91 commented Jul 25, 2022

@consideRatio are there any other setup steps regarding the VNC/Linux desktop? I would've expected a button on the Lab Launcher saying "Desktop", but it's not there. Also changing /lab to /desktop in the URL returns a 404 😕 Maybe @yuvipanda can help too?

Image repo: https://github.com/2i2c-org/awi-ciroh-image

@consideRatio
Contributor

This is what is done in the JMTE image, which is based on a pangeo-notebook base image: #1444 (comment). I don't think anything else is needed!

@sgibson91
Member

🤔 Hmmm ok, maybe Yuvi can help me debug when he's online then

@consideRatio
Contributor

consideRatio commented Jul 25, 2022

@sgibson91 I would suspect CIROH-UA/awi-ciroh-image@7b080be#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R32-R33 could be to blame. I don't understand how jupyter-server-proxy registers things to show up in jupyterlab and start up properly, but jupyterlab presents icons for notebook / kernels etc, and maybe there is a common mechanism in play related to removing nb_conda_kernels.

Hmmm, thinking about it, if you don't succeed in accessing /user/some-name/desktop, it makes me think that jupyter-server-proxy has failed to start. That I know from experience can happen if some other jupyter-server-proxy package fails to load properly. So, something else registering itself with jupyter-server-proxy may be to blame.
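
One way to test the suspicion above is to try loading every package that registers itself with jupyter-server-proxy and see whether any of them raises. The sketch below does that from inside the user environment. It assumes the registrations live in the `jupyter_serverproxy_servers` entry-point group, which should be verified against the jupyter-server-proxy documentation; the function name is hypothetical.

```python
from importlib.metadata import entry_points

# Entry-point group assumed to be the one jupyter-server-proxy reads.
DEFAULT_GROUP = "jupyter_serverproxy_servers"


def check_server_proxy_entry_points(group=DEFAULT_GROUP):
    """Try to load every entry point in `group` and report the outcome per name."""
    all_entry_points = entry_points()
    if hasattr(all_entry_points, "select"):
        # Python 3.10 and later expose a select() method
        selected = all_entry_points.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name
        selected = all_entry_points.get(group, [])

    report = {}
    for entry_point in selected:
        try:
            entry_point.load()
            report[entry_point.name] = "ok"
        except Exception as exc:  # report any import-time failure instead of stopping
            report[entry_point.name] = f"failed to load: {exc!r}"
    return report


if __name__ == "__main__":
    for name, status in sorted(check_server_proxy_entry_points().items()):
        print(f"{name}: {status}")
```

An empty report would suggest nothing is registered in that group at all, which is a different problem from one registration failing to import.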

@sgibson91
Member

sgibson91 commented Jul 25, 2022

Yeah, tbh, I'm just guessing and used https://github.com/2i2c-org/coessing-image/blob/main/Dockerfile as a starting point (before the Julia addition :D)

@jameshalgren

The hubs are available here:

Awesome! Does this mean we can get in and start trying things out (I assume this will begin to incur cloud costs...)?

@sgibson91
Member

@jameshalgren yes and yes :) I'm still trying to figure out the VNC/Linux desktop feature though

@sgibson91
Member

sgibson91 commented Jul 25, 2022

I made some progress in PR CIROH-UA/awi-ciroh-image#3. I now have the Desktop icon on JupyterLab's launcher (I'm testing this on the staging hub).

However when I click on it, I see "Something went wrong, connection is closed"

Logs from my user server (k logs jupyter-sgibson91) show:

[I 2022-07-25 16:05:12.314 SingleUserNotebookApp handlers:432] Trying to establish websocket connection to ws://localhost:5901/websockify
2022-07-25 16:05:12,316 - SingleUserNotebookApp - ERROR - Uncaught exception GET /user/sgibson91/desktop/websockify (10.128.0.3)
HTTPServerRequest(protocol='https', host='staging.ciroh.awi.2i2c.cloud', method='GET', uri='/user/sgibson91/desktop/websockify', version='HTTP/1.1', remote_ip='10.128.0.3')
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/tornado/tcpclient.py", line 138, in on_connect_done
    stream = future.result()
tornado.iostream.StreamClosedError: Stream is closed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/tornado/websocket.py", line 956, in _accept_connection
    await open_result
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 672, in open
    return await super().open(self.port, path)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 494, in open
    return await self.proxy_open('localhost', port, proxied_path)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 444, in proxy_open
    await start_websocket_connection()
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/jupyter_server_proxy/handlers.py", line 435, in start_websocket_connection
    self.ws = await pingable_ws_connect(request=request,
  File "/srv/conda/envs/notebook/lib/python3.9/asyncio/tasks.py", line 328, in __wakeup
    future.result()
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/tornado/iostream.py", line 1205, in connect
    self.socket.connect(address)
OSError: [Errno 99] Cannot assign requested address
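
The log above shows the proxy trying to reach `ws://localhost:5901/websockify` and failing to connect. A quick way to see whether anything is listening on that port from inside the user server is a plain TCP connection attempt. This is a minimal sketch with a hypothetical function name; the default host `127.0.0.1` is an assumption chosen so the check does not depend on how `localhost` resolves in the container.

```python
import socket


def port_is_open(host="127.0.0.1", port=5901, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and address errors alike
        return False


if __name__ == "__main__":
    state = "listening" if port_is_open() else "not reachable"
    print(f"127.0.0.1:5901 is {state}")
```

If the port is not reachable, the VNC server or websockify process most likely never started, which points back at the image contents rather than at the proxy itself.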

@sgibson91
Member

@GeorgianaElena suggested some missing packages in CIROH-UA/awi-ciroh-image#3 (review) and now the desktop feature is available!

@colliand
Contributor Author

Thanks @jameshalgren. I fixed the link to point to the intended slide deck created by Fernando.

@colliand
Contributor Author

Now that the production and staging hubs are available, I suggest to @jameshalgren that we organize a kickoff event for CIROH personnel who will manage the hub with @jmunroe @fperez (and perhaps others on the 2i2c team). Perhaps we can link up for a phone call to discuss some launch planning?

@jameshalgren

jameshalgren commented Aug 1, 2022

@colliand -- targeting 23 August for a technically focused demo.

@damianavila
Contributor

I think we can close this issue (new hub set up) by now (since I believe it is completed) and continue the conversation on new issues.

Repository owner moved this from In progress to Complete in DEPRECATED Engineering and Product Backlog Aug 3, 2022
@jameshalgren

jameshalgren commented Aug 3, 2022

Thanks @damianavila -- new issues (as needed) are still posted under this repository, correct?
(and, for that matter, thanks @colliand, @sgibson91, @consideRatio, @fperez, and @jmunroe and all the rest -- we're excited!)

@damianavila
Contributor

@jameshalgren, for follow-up questions/requests I would suggest using our support email channel.
Over there, we will be able to provide useful feedback and, in some cases, open issues in specific repositories according to the topic you are raising in that conversation.
