New Hub: OceanHackWeek 2021 #549

Closed
6 of 7 tasks
abkfenris opened this issue Jul 22, 2021 · 56 comments · Fixed by #554
@abkfenris
Contributor

abkfenris commented Jul 22, 2021

Hub Description

OceanHackWeek (OHW) is a 4-day collaborative learning experience aimed at exploring, creating and promoting effective computation and analysis workflows for large and complex oceanographic data. It includes tutorials, data exploration, software development, collaborative projects and community networking.

We will be using the hub to teach tutorials and develop projects with both in-person (EST) and worldwide participants.

Community Representative

@ocefpaf

Important dates

  • 2021-07-29 - Some participants will complete async pre-week training (basics of Git, GitHub, Scientific Python); tutorial presenters can record tutorials.
  • 2021-08-03 - OHW starts in earnest with async and synchronous tutorials, and project development.
  • 2021-08-06 - Last day of tutorials.

Target start date

2021-07-28

Preferred Cloud Provider

No preference (default)

Do you have your own billing account?

  • Yes, I have my own billing account.

Hub Authentication Type

GitHub Authentication (e.g., @MyGitHubHandle)

Hub logo

No response

Hub logo URL

No response

Hub image service

hub.docker.com

Hub image

uwhackweeks/oceanhackweek:28d1c7b

Extra features you'd like to enable

Hub Engineer information

The Hub Engineer should fill in the metadata below when it is available. The Community Representative shouldn't worry about this section, but may be asked to provide help answering some questions.

Deployment information

Hub ID: ohw

Hub Cluster: pilot

Hub URL: ohw.pilot.2i2c.cloud

Hub Template: daskhub

Actions to deploy

  • Deploy information filled in above
  • Initial Hub deployment PR: Add OHW hub #554
  • Administrators able to log on
  • Community Representative satisfied with hub environment
  • Hub now in steady-state
@ocefpaf
Contributor

ocefpaf commented Jul 26, 2021

@choldgraf please let us know when/how we can test it. (Folks are getting anxious to pre-test their tutorials on the hub.)

@choldgraf
Member

Sounds good - will try and deploy the hub tomorrow. (We are all on a European time zone currently)

@choldgraf
Member

(also just to clarify, the target start date listed was the 28th, do you need the hub earlier than this?)

@ocefpaf
Contributor

ocefpaf commented Jul 26, 2021

(also just to clarify, the target start date listed was the 28th, do you need the hub earlier than this?)

If we can get it on the 27th, tomorrow, it would be nice so we can make the instructors test their notebooks against it. The 28th would be tight but it works too.

@choldgraf
Member

choldgraf commented Jul 27, 2021

Not quite ready to close this yet! We need confirmation from @ocefpaf that all seems well :-)

@ocefpaf see the hub URL above (https://ohw.pilot.2i2c.cloud/) and confirm you can log in etc!

@ocefpaf
Contributor

ocefpaf commented Jul 27, 2021

Awesome! I was able to log in (super fast) and I'll play with it ASAP. I'll probably return with tons of questions. I'll try to read the docs first ;-p

@ocefpaf
Contributor

ocefpaf commented Jul 27, 2021

@choldgraf first question, and a simple one, how can I add/auth people to login?


Edit: Sorry, read the docs and doing it now.

@choldgraf
Member

choldgraf commented Jul 28, 2021

I hope the lack of extra questions means you figured out how to do stuff as an admin, and not that things have gone down in flaming glory 😬🔥

also I added @GeorgianaElena on this one to track who is working on this hub deploy!

@ocefpaf
Contributor

ocefpaf commented Jul 28, 2021

I hope the lack of extra questions means you figured out how to do stuff as an admin, and not that things have gone down in flaming glory

Yes. Surprisingly easy to manage so far. Great work! I'll experiment with adding packages today.

also I added @GeorgianaElena on this one to track who is working on this hub deploy!

Good to know. Thanks @GeorgianaElena!

@choldgraf
Member

@ocefpaf is this hub now ready to go from your end? we'd like to close out this issue if all looks OK

@ocefpaf
Contributor

ocefpaf commented Aug 1, 2021

Yes. Please close it. I cannot because @abkfenris created it. Any comments/feedback @abkfenris ?

@abkfenris
Contributor Author

We have some late breaking issues with Dask, but that may be a package we need in our image.

@yuvipanda
Member

You can experimentally change the image deployed to your hub at https://ohw.pilot.2i2c.cloud/services/configurator/. After building and pushing your image, try the new image tag there? Some preliminary docs at https://pilot.2i2c.org/en/latest/admin/howto/configurator.html

@abkfenris
Contributor Author

Ya, we've been playing with adjusting the image in configurator as we get requests for new packages. I think we were missing dask-gateway and distributed, which I'm building an image for now. Anything else that we may be missing from our environment?

It would be sweet if there was a webhook endpoint for the configurator we could use to adjust the image, or if we could do gitops-ish things against https://github.com/2i2c-org/pilot-hubs/blob/cc71cbd47bf79c90e96a86d2983bfaed51ba3703/config/hubs/2i2c.cluster.yaml#L108-L110

@abkfenris
Contributor Author

After

from dask_gateway import GatewayCluster
cluster = GatewayCluster()
cluster.scale(4)

it can take about 5 min to scale up since we basically conda install * in our image.
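
To put a number on that scale-up delay, a small timing helper along these lines might be useful. This is only a sketch: `scale_and_time` is a hypothetical name, and it assumes the cluster object exposes `scale()` and `get_client()` (as `GatewayCluster` does) and that the returned client has `wait_for_workers()`.

```python
import time


def scale_and_time(cluster, n_workers, timeout=600):
    """Scale `cluster` to `n_workers` and return the seconds until all have joined.

    Hypothetical helper: `cluster` is assumed to behave like a
    dask_gateway.GatewayCluster (it needs scale() and get_client()).
    """
    client = cluster.get_client()
    start = time.monotonic()
    cluster.scale(n_workers)
    # Blocks until n_workers have registered with the scheduler,
    # or raises once `timeout` seconds have passed.
    client.wait_for_workers(n_workers, timeout=timeout)
    return time.monotonic() - start
```

On the hub this would be called as `scale_and_time(GatewayCluster(), 4)`. Running it before and after an image change gives a rough comparison of pull and startup time.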

I've done some work trying to slim down the image (it's 5.5 GB now), but it's mainly the variety of conda packages that our tutorials or dask users may need.

The other way to speed things up would be to have images closer to the hub. From poking around the repo, that looks like it is zone us-central1-b, right?

Does 2i2c have a Google Artifact/Container Registry that we could push images to? I'm also asking whether we have access to a Google Cloud project where we could run one ourselves.

@ocefpaf
Contributor

ocefpaf commented Aug 2, 2021

@GeorgianaElena I'm getting a dead kernel when I try to load the dataset in the last line of this notebook:

https://nbviewer.jupyter.org/gist/ocefpaf/d9253a4dcd74ee651bf55598044d9cf1

Everything works OK in a fresh pull of our image locally.

@yuvipanda
Member

@ocefpaf I'm guessing that's because you don't have enough RAM. Do you have a sense of how much RAM your notebooks might need? I think the default is pretty small (1G) and that might be it?

I'm bumping it up to 4G for the duration of the workshop - turn your server on / off and give it a shot?
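
One rough way to answer the "how much RAM" question is to run the notebook somewhere with plenty of memory and check the process's peak resident set size afterwards. A minimal sketch using only the standard library (the function name `peak_memory_mib` is made up for illustration, and it only sees the current process, so it would miss memory used by separate Dask workers):

```python
import resource
import sys


def peak_memory_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kibibytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor
```

Calling `peak_memory_mib()` in the last cell of a notebook gives a lower bound for the memory limit the kernel needs.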

@ocefpaf
Contributor

ocefpaf commented Aug 2, 2021

I'm bumping it up to 4G for the duration of the workshop - turn your server on / off and give it a shot?

4G sounds reasonable. I'm testing it and I'll get back to you.

yuvipanda added a commit to yuvipanda/pilot-hubs that referenced this issue Aug 3, 2021
We want spinups of dask and notebook nodes to be much
faster.

Ref 2i2c-org#549
@yuvipanda
Member

Unfortunately I won't be able to set up the node placeholders until later today. The quotas and stuff are set up tho, and I tested that we can scale up to at least 50 nodes

@yuvipanda yuvipanda self-assigned this Aug 3, 2021
@abkfenris
Contributor Author

A slow startup on the first day will help drive home the point that folks should log in early.

I think getting crazy with Dask doesn't happen until the visualization session tomorrow, but we haven't structured our schedule around which exact packages/resources are getting used by what tutorial.

@ocefpaf
Contributor

ocefpaf commented Aug 4, 2021

Folks, we are hitting an odd issue. There is a data source, very common for oceanographic data, named OPeNDAP. It works locally on the same docker image, exactly the same packages but it fails in the jupyterhub. The steps to reproduce are:

from netCDF4 import Dataset
url = "http://goosbrasil.org:8080/pirata/B19s34w.nc"  # any OPeNDAP URL will fail with an odd curl error.
nc = Dataset(url)

Any advice on how we can even debug this?

@abkfenris
Contributor Author

Hmm, if I try to use r = requests.get("http://goosbrasil.org:8080/pirata/B19s34w.nc") I get Max retries exceeded & TimeoutErrors.

@abkfenris
Contributor Author

If I try from our OPeNDAP server (which I have never actually used in anger before), that works for me:

import xarray as xr
from netCDF4 import Dataset

ds = xr.open_dataset("http://www.neracoos.org/opendap/A0143/A0143.met.realtime.nc")
ds
nc = Dataset("http://www.neracoos.org/opendap/A0143/A0143.met.realtime.nc")
nc

@yuvipanda
Member

@abkfenris it's possible that port 8080 outbound is turned off, let me investigate

@yuvipanda
Member

@abkfenris @ocefpaf there was an outbound port restriction. I opened port 8080 and 22 (#576), and this seems to work now.
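
For anyone debugging a similar outbound port restriction, a plain TCP connection test can help separate network problems from library problems, since it skips netCDF4, curl and requests entirely. A minimal sketch (`can_connect` is a hypothetical helper name, not something from the hub or any library):

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

Running `can_connect("goosbrasil.org", 8080)` both on the hub and on a local machine and comparing the results would hint at whether the block is on the hub side or the server side.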

@yuvipanda
Member

ok so I've set up node placeholders (PR coming soon) to have 2 spare notebook nodes and 3 spare dask worker nodes, with the images pre-pulled. Can you test out dask spinup time now?

@ocefpaf
Contributor

ocefpaf commented Aug 4, 2021

Thanks so much Yuvi!

@yuvipanda
Member

@ocefpaf yw! How was the dask-gateway spinup time?

@ocefpaf
Contributor

ocefpaf commented Aug 4, 2021

How was the dask-gateway spinup time?

I did not test it myself but the projects will start today and folks will report how it goes. I'll be sure to get back to you as soon as we know.

PS: Quick question. What is the best practice to allow folks to create conda environments in the hub? Giving them permission to write at /srv/conda does not sound like a good idea :-/

@yuvipanda
Member

Giving them permission to write at /srv/conda does not sound like a good idea :-/

This is actually my preferred method - repo2docker does this too. Putting it in $HOME is probably just going to be super slow thanks to NFS. If their container goes wonky, they can simply restart the server. It won't persist past restarts though :(

@ocefpaf
Contributor

ocefpaf commented Aug 4, 2021

Good to know that not all my ideas are bad 😄

(I tried and it worked. Thanks!)

BTW, we have two OPeNDAP servers in our demo that use port 8080. One worked; the other one ("http://goosbrasil.org:8080/pirata/B19s34w.nc") still times out. Not sure if that is a problem with the server or the hub. It does work locally for me. However, this is not a pressing issue, so do not worry too much about it unless it is an easy fix.

@yuvipanda
Member

@ocefpaf I can't access http://goosbrasil.org:8080/pirata/B19s34w.nc from my local computer either - it just hangs and times out. Maybe it's restricted to specific networks if it works for you?

@yuvipanda
Member

If we can skip the port 8080 one, I'd like to leave that be until after the workshop is over. That sound ok?

@ocefpaf
Contributor

ocefpaf commented Aug 4, 2021

Don't worry about it. It is a problematic server anyway. I'll try to re-write the example. (Although the whole point of that example is to show bad data/metadata out there. And guess what? Now I have another point to make with it 😄)

@abkfenris
Contributor Author

When a user creates an environment, I believe it just touches some metadata in /srv/conda; the environments themselves are in /home/jovyan/my-conda-envs/${ENV_NAME}/

@ocefpaf
Contributor

ocefpaf commented Aug 4, 2021

Indeed. I believe it reads and updates the url.txt file in there. There is probably a way to make conda read that from another directory :-/

@yuvipanda
Member

so how did the hackathon go?

@yuvipanda yuvipanda mentioned this issue Aug 9, 2021
4 tasks
@yuvipanda
Member

I'm actually going to close this, as the hub itself is set up! I opened #595 to debrief and learn about how the hub went.


6 participants