[EVENT] NeuroHackademy 2022 #1300

Closed
4 of 12 tasks
choldgraf opened this issue May 11, 2022 · 44 comments

@choldgraf
Member

choldgraf commented May 11, 2022

Summary

NeuroHackademy is a distributed collaboration / training event for the neuroscience community, led by @arokem and others.

@consideRatio ran a hub for this event last year, and this year we are going to support it via 2i2c.

Description of hub needs

We'll need to deploy a new hub for this event. Below is a description of needs that @consideRatio shared from last year:

Event Info

  • Community Representative: @arokem
  • Event begin: July 25th
  • Event end: August 5th
  • Active times: US/Pacific
  • Number of attendees: @arokem can you provide guidance?
  • Hub Events Calendar

Hub info

  • Hub URL: neurohackademy.2i2c.cloud
  • Hub decommissioned after event?: yes

A few other questions for @arokem (or @consideRatio if he can remember from last year):

  • Do you have cloud credits?
  • Does the hub need to run in a specific cloud provider?
  • Does this hub need to be on a dedicated cluster or can it be on a shared cluster?
  • Do you need a dask gateway cluster?

Task List

Before the event

  • Confirm the deliverables and turn this into a short statement of work
  • Create a price quote for @arokem
  • Send to Ann @ UW to incorporate into our contract w/ them.
  • Dates confirmed with the community representative and added to Hub Events Calendar.
  • Quotas from the cloud provider are high-enough to handle expected usage.
  • One week before the event, the hub is running.
  • Confirm with Community Representative that their workflows function as expected.
    • 👉Template message to send to community representative
      Hey {{ COMMUNITY REPRESENTATIVE }}, the date of your event is getting close!
      
      Could you please confirm that your hub environment is ready-to-go, and matches your hub's infrastructure setup, by ensuring the following things:
      - [ ] Confirm that the "Event Info" above is correct
      - [ ] On your hub: log-in and authentication works as-expected
      - [ ] `nbgitpuller` links you intend to use resolve properly
      - [ ] Your notebooks and content run as-expected
      
  • 1 day before event, either a separate nodegroup is provisioned for the event or the cluster is scaled up.

During and after event

  • Confirm event is finished.
  • Nodegroup created for the hub is decommissioned / cluster is scaled down.
  • Hub decommissioned (if needed).
  • Debrief with community representative.
    • 👉Template debrief to send to community representative
      Hey {{ COMMUNITY REPRESENTATIVE }}, your event appears to be over 🎉
      
      We hope that your hub worked out well for you! We are trying to understand where we can improve our hub infrastructure and setup around events, and would love any feedback that you're willing to give. Would you mind answering the following questions? If not, just let us know and that is no problem!
      
      - Did the infrastructure behave as expected?
      - Anything that was confusing or could be improved?
      - Any extra functionality you wish you would have had?
      - Could you share a story about how you used the hub?
      
      - Any other feedback that you'd like to share?
      
      
@arokem
Contributor

arokem commented May 16, 2022

Hello!

A few comments and answers to questions brought up above:

Auth via GitHub Teams membership.

We'd actually rather not do that and use individual GitHub user-names instead, if that's OK.

User profile options (small, medium, large).

We don't need that.

by default hub admins have write access to a folder, while users only have read access. Is that sufficient?

Who is defined as a hub admin? Is that us or you? Can we delegate this among multiple users ("instructors")?

we recommend communities build off of the repo2docker user image template - would that work for you?

We'd love to have something like this: https://github.com/neurohackademy/nh2021-jupyterhub/blob/main/deployments/hub-neurohackademy-org/image/Dockerfile, if that's not too hard.

Number of attendees:

We expect no more than 125 users overall.

Do you have cloud credits?

No.

Does the hub need to run in a specific cloud provider?

No. Although GCP or AWS is preferable.

Does this hub need to be on a dedicated cluster or can it be on a shared cluster?

No problem for this to be on a shared cluster.

Do you need a dask gateway cluster?

No.

@damianavila
Contributor

Thanks for providing the additional information, @arokem. Super useful.

Who is defined as a hub admin? Is that us or you? Can we delegate this among multiple users ("instructors")?

We can define multiple admin users for you.

@choldgraf, let's sync about this one in our next encounter so I can understand better the context and we can plan the new hub specification accordingly.

@yuvipanda
Member

Just wanted to check - what's the next step here?

@damianavila
Contributor

I think we have enough information to set up the hub to serve this event but I also think it is still a little bit early.
I would suggest creating a new hub issue (and migrating the hub requirements there) and planning the timeline accordingly.

From the budget perspective, @choldgraf and @colliand might have more information. IIRC, this is a new hub in a series of hubs we are going to deploy for the UW.

@arokem
Contributor

arokem commented Jun 3, 2022

Since there's some activity on the issue: could I please raise something that we brought up in the document we sent @choldgraf as a draft statement of work? I was wondering whether it would require some discussion here as well.

We would like to have a landing page in the hub that would contain some text and links that we would be able to customize. Is that something that is easy to do? Or, if it requires some work, is that work you'd be interested in and able to undertake as part of this deployment? Thanks!

@choldgraf
Member Author

@arokem I've added a point about the landing page to the top comment as well.

The basic landing page for the hubs looks like this: https://snowex.uwhackweeks.2i2c.cloud/hub/login?next=%2Fhub%2F

Is that basic structure OK, and you just want to customize some of the content under the launch button etc?

@yuvipanda
Member

@arokem you should be able to fork https://github.com/2i2c-org/default-hub-homepage and make changes to the template as you want, without needing to add any 2i2c engineers to the loop there!

@damianavila
Contributor

without needing to add any 2i2c engineers to the loop there!

[Internal] IIRC (and I might be misremembering), we are "loading" the template from specific branches in that repository, so maybe we should do something on our side to "publish" the proposed changes... but it should be pretty straightforward.

@arokem
Contributor

arokem commented Jun 3, 2022

@arokem you should be able to fork https://github.com/2i2c-org/default-hub-homepage and make changes to the template as you want, without needing to add any 2i2c engineers to the loop there!

Fantastic. I will take a look. IIUC, the procedure would be to fork that repo and make the changes and then make a PR against a designated branch on the repo? At some point, it would be good for us to understand how we can "refresh" the landing page template on our hub, as we might need to make changes while the event is ongoing. Is it as simple as pushing to this designated branch?

@yuvipanda
Member

Is it as simple as pushing to this designated branch?

yes! I think it auto pulls master every 5 min

@arokem
Contributor

arokem commented Jun 10, 2022

OK - one further wrinkle about this is that one of the things that we'd like to add to the landing page are links to zoom calls, but we don't want to post these to an open repo on GitHub. Is there some way to make this private?

@choldgraf
Member Author

Can somebody on the @2i2c-org/tech-team please answer @arokem's question above? cc @damianavila in case we need to track this one in our board.

@yuvipanda
Member

@arokem we can figure out a way to do that, but if it is on the hub home page, wouldn't it be publicly visible anyway?

@arokem
Contributor

arokem commented Jul 5, 2022

Thanks for re-raising @choldgraf! And just to add that by "landing page" we don't necessarily mean that this has to be the page that users see when they first navigate to the website. This can also be a webpage that is displayed in some easy-to-find location within the jupyterlab interface. That is, it can be pulled from a repo in the same manner that we will want to pull down the curriculum materials (which will come from https://github.com/neurohackademy/nh2022-curriculum, by the way), except that this repo has to be private, because it will include zoom call information.

@arokem
Contributor

arokem commented Jul 5, 2022

So, it's a page that users of the hub can see only after they've authenticated.

@yuvipanda
Member

Alternatively we can put it in the spawner options page that users see after they log in. The link can be sops encrypted in this repo. So you would need a 2i2c engineer to change it but that should be ok?

@choldgraf
Member Author

Since this needs to be accessible after a log-in, I wonder if we could use our "hosted documentation service" functionality for this?

https://docs.2i2c.org/en/latest/admin/howto/content.html#serve-static-web-content-with-your-hub

@arokem
Contributor

arokem commented Jul 5, 2022

Yes. That looks right! Does the documentation in this case come from a repo that can be updated frequently into the static web page? Can that repo be private, and can this webpage be behind the authentication? If the answers to these are all yes, then this would work well.

@choldgraf
Member Author

choldgraf commented Jul 5, 2022

I'm not sure - but perhaps that's something worth exploring. I believe that @GeorgianaElena was the one that set up this functionality so maybe she could provide some guidance?

I've created an issue to track this in case it needs dedicated discussion / follow-up. Also added it to our project backlog (cc @damianavila )

@yuvipanda
Member

@choldgraf the hosted documentation functionality isn't behind auth.

@choldgraf
Member Author

Ah hmm, so then that wouldn't solve our problem either. Would it be possible to do this?

Alternatively: what is the best way to let a community representative provide private, authenticated, one-way, highly visible, regularly updatable content to their users?

@consideRatio
Contributor

Note that there is a JupyterHub "announcement" feature available as well, which could be relevant. I think that is post-login, but I'm not sure.

@damianavila self-assigned this Jul 5, 2022
@yuvipanda
Member

yuvipanda commented Jul 5, 2022

@arokem @choldgraf nbgitpuller can support pulling from private repos! So we can do that? So users would click an nbgitpuller link to fetch the private repo, and it'll put them down in a notebook.
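
A minimal sketch of how such an nbgitpuller link could be built, for illustration only. The hub URL comes from the issue description; the repo name `nh2022-links`, the `main` branch, and the `README.md` file are hypothetical placeholders. For a private repo, the sketch assumes git credentials are already configured on the hub by some other means.

```python
from urllib.parse import urlencode


def nbgitpuller_link(hub_url, repo_url, branch="main", urlpath=None):
    """Build a link that asks nbgitpuller to pull repo_url into the user's home."""
    params = {"repo": repo_url, "branch": branch}
    if urlpath is not None:
        # Where to send the user after the pull finishes.
        params["urlpath"] = urlpath
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{urlencode(params)}"


link = nbgitpuller_link(
    "https://neurohackademy.2i2c.cloud",
    # Hypothetical private repo holding a markdown file with the Zoom links.
    "https://github.com/neurohackademy/nh2022-links",
    branch="main",  # assumed default branch
    urlpath="lab/tree/nh2022-links/README.md",  # open the file in JupyterLab
)
print(link)
```

Because the link goes through `/hub/user-redirect/`, users who are not logged in are sent to the hub login first, so the content is only fetched after authentication.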

@arokem
Contributor

arokem commented Jul 5, 2022

@yuvipanda : yes, I think that an nbgitpuller setup from a repo that just has a markdown file with the relevant links could work.

@consideRatio : can you point out documentation about the "announcement" feature? Could be useful for many other things, even if we end up with an nbgitpuller solution for the zoom link page.

@arokem
Contributor

arokem commented Jul 24, 2022

Hello! Is this the place to post questions and/or raise issues about the hub? Or should I send an email to "support"? I just realized that we never discussed resources per user, and I also just realized that we have the hub configured to ~1GB RAM per user. Can we increase that? Users of the hub are likely going to be working with large image datasets that can take up a lot more than that. I'd like to start by increasing this to at least 4GB/user for the first week (if only because a tutorial that I am teaching on Thursday is already not working as planned. Working on this tutorial is how I realized where the limit is currently set 😬 ). And it would be nice if I could dial that up as needed later on as well. Is that something that could be configured in the configurator as well? Thanks!

@yuvipanda
Member

@arokem we have #1554 for figuring out resources! It gives users a 4GB guarantee and an 8GB limit. I'll get that merged now. Unfortunately this can't be set in the configurator yet...

@yuvipanda
Member

@arokem it's set up to have an 8GB limit and a 4GB guarantee now!
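
As a rough illustration of what a 4GB guarantee means for capacity, here is a back-of-the-envelope sketch. The 125-user figure and the 4GB/8GB values come from this thread; the 48GB of allocatable memory per node is a hypothetical number chosen for the example, not the hub's actual node size.

```python
import math


def nodes_needed(users, guarantee_gb, node_allocatable_gb):
    """Estimate how many nodes are needed if every user is active at once."""
    # The scheduler places pods by their memory guarantee (request), not the limit.
    users_per_node = math.floor(node_allocatable_gb / guarantee_gb)
    if users_per_node < 1:
        raise ValueError("a single user's guarantee does not fit on one node")
    return math.ceil(users / users_per_node)


USERS = 125                # upper bound given by the community representative
GUARANTEE_GB = 4           # memory guarantee per user
LIMIT_GB = 8               # memory limit per user
NODE_ALLOCATABLE_GB = 48   # hypothetical allocatable memory per node

total_guaranteed_gb = USERS * GUARANTEE_GB
worst_case_gb = USERS * LIMIT_GB
nodes = nodes_needed(USERS, GUARANTEE_GB, NODE_ALLOCATABLE_GB)

print(f"guaranteed: {total_guaranteed_gb} GB, worst case: {worst_case_gb} GB, nodes: {nodes}")
```

Since the limit is twice the guarantee, a node packed by guarantee can be oversubscribed if many users approach 8GB at the same time.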

@arokem
Contributor

arokem commented Jul 25, 2022

Thanks @yuvipanda! Works great.

@damianavila
Contributor

damianavila commented Jul 25, 2022

Or should I send an email to "support"?

@arokem, yes. That is the canonical channel for reporting problems to us.
Please report any new hub issues (or questions) via the support email.
If you ask here, we will eventually reply, but the response time (and prioritization) will definitely be better via the canonical channel. Thanks!!

@damianavila moved this from Waiting to In progress in DEPRECATED Engineering and Product Backlog Jul 27, 2022
@arokem
Contributor

arokem commented Aug 8, 2022

Hey folks! The event is over... Thanks for all the work on this! It went remarkably smoothly.

I am wondering (and some participants are as well): how much longer will the hub be up and running?

@damianavila
Contributor

I am wondering (and some participants are as well): how much longer will the hub be up and running?

@arokem, my recollection is that this hub is a transient, event-focused one. If that is the case, we would need to decommission it ASAP. But do not worry, we will agree on a date that works for everyone involved.

cc @colliand and @choldgraf who might have more details about the existing agreements for this hub.

Btw, in the meantime, would you mind answering the following questions? If not, just let us know and that is no problem!

  • Did the infrastructure behave as expected?
  • Anything that was confusing or could be improved?
  • Any extra functionality you wish you would have had?
  • Could you share a story about how you used the hub?
  • Any other feedback that you'd like to share?

Thanks!

@damianavila moved this from In progress to Waiting in DEPRECATED Engineering and Product Backlog Aug 8, 2022
@arokem
Contributor

arokem commented Aug 8, 2022

Thanks @damianavila! You are correct that the hub is transient and event-focused, and the event is fortunately over...

That said, if we could keep it up and running through August 10th, that would give participants enough time to get their stuff off of the hub, if they haven't done so yet.

@arokem
Contributor

arokem commented Aug 9, 2022

Did the infrastructure behave as expected?

Mostly yes. Here are the issues that we encountered:

  1. On Thursday of week 1 of the event. According to @yuvipanda, "This corelates to a new pod spinning up to support onflux of users.". We did not experience that on other days.
  2. On Sunday, user servers were not starting up. According to @yuvipanda, this is related to "Move pilot-hubs cluster to a regional k8s cluster for better availability" (#1102).

Anything that was confusing or could be improved?

  1. Overall, the onboarding process was pleasant and (very!) helpful, but I have to admit that I wonder how a community representative who doesn't have much experience setting up their own hubs would fare.
    1. The configurator UI is confusing because pushing the button applies changes to the configuration, but there is no visible indicator of whether the button was pressed.
  2. We can't set the image tag to always point to "latest", which results in the need for a manual update of the tag every time a change to the image container is pushed to the hub image repo. If there was some way to update the tag in the hub configuration automatically, that would be nice.

Any extra functionality you wish you would have had?

The ability to configure user RAM allocation and shared disk space size myself.

Could you share a story about how you used the hub?

  1. The hub was used to facilitate "master classes" where several leading researchers in data science and neuroscience walked participants through a notebook where they had implemented analysis methods to accompany their research talks. Because hub configuration and curriculum could be updated rapidly, some of these master classes came together at the very last minute, with new notebooks and changes to hub configuration being pushed even 10 minutes before the lecture started.
  2. Participants in the event used the shared data storage to work with more than 1.3 TB of openly available neuroimaging data that were updated by the event organizers even while the event was taking place (up to the last day!).
  3. The hub was used to teach a range of lectures in Python, git, machine learning and statistics.

Any other feedback that you'd like to share?

You folks are amazing! 🧠 ❤️ 📓

@damianavila
Contributor

That said, if we could keep it up and running through August 10th, that would give participants enough time to get their stuff off of the hub, if they haven't done so yet.

Sure, then I will request decommissioning for next week, so they have a few more days 😉.

@damianavila
Contributor

Btw, thank you for all that feedback, @arokem. It is super appreciated!!
About the improvements you have requested, we are already working in those directions, and hopefully you will get a better experience the next time you run an event with us!

@arokem
Contributor

arokem commented Aug 9, 2022

Thank you! BTW: others and I are currently experiencing some server startup issues. Have you already started pumping the brakes on it?

@yuvipanda
Member

@arokem we brought the minimum number of running nodes down to 0, which means you'd have to wait for a node to spin up the first time a server starts. What server startup issues are you having?

@arokem
Contributor

arokem commented Aug 9, 2022

We're seeing these kinds of time-outs on server start:
[Screenshot of the time-out on server start, taken 2022-08-09 at 3:48 PM]

@yuvipanda
Member

@arokem all of the 1.5T was used up! I just added another 500G, and it should be back up now?

@arokem
Contributor

arokem commented Aug 9, 2022

Oh weird. Thanks for doing that! Yes - we're back up.

I wonder how that got filled up today. I don't think we added any new data since Friday. But it's also possible that we filled it up on Friday and no one noticed until now. As always, thanks for jumping on it so quickly!
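
One way to notice a full volume earlier would be a periodic usage check. The following is a minimal sketch using only the Python standard library; the 90% threshold is an arbitrary assumption, and `"."` stands in for wherever the shared data directory is actually mounted.

```python
import shutil


def usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def check_volume(path, warn_at=0.9):
    """Return a one-line status message for the filesystem containing path."""
    fraction = usage_fraction(path)
    status = "WARNING" if fraction >= warn_at else "ok"
    return f"{status}: {path} is {fraction:.0%} full"


# "." is a placeholder; point this at the shared data mount on the hub.
print(check_volume("."))
```

Run from a scheduled job or a notebook cell, a check like this would flag the volume before it reaches the point where user servers fail to start.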

@damianavila
Contributor

OK, I have created the decommissioning issue: #1620, with a target date of Aug 15th. Let's continue that specific conversation over there.

Closing this one now!
