Setup shared cluster on AWS and deploy 'researchdelight' hub #1967
(Updated!) tf plan output:
|
@2i2c-org/tech-team I am requesting early review of the eksctl/terraform files before I fully deploy and begin on the hubs.
eksctl/shared-hubs-cluster.jsonnet
// Warning: version 1.23 introduces some breaking changes
// Checkout the docs before upgrading
// ref: https://docs.aws.amazon.com/eks/latest/userguide/ebs-csi-migration-faq.html
version: '1.22'
Explicitly stating that this is ok for now to get this out the door, and we should work on upgrading in the new year.
Yes! Scale to zero node groups!
LGTM @sgibson91. The only suggestion I have is that we name this to match the other shared projects we have, and call it 2i2c-aws or 2i2c-aws-us maybe? I think shared-cluster is a bit too broad.
Yeah, I was trying to look to the two-eye-two-see GCP project for inspiration but that's still called |
shared-hubs-cluster was too broad, 2i2c-aws-us is more specific
Latest commit renames the cluster to |
6083e3d to d163ed0
I just deployed a staging hub to this cluster, more or less copying the staging hub from the 2i2c/GCP shared cluster folder. It deployed fine, but I am not an admin when I log in, even though I have provided the correct config to make me an admin.
@sgibson91 what do the hub logs say?
Nothing interesting at all. The config to make us admins runs without error.
There's this?
|
Initial start up logs:
|
From when I logged in
|
My suggestion is to put some print statements here https://github.com/2i2c-org/infrastructure/blob/master/helm-charts/basehub/values.yaml#L492 and see what shows up in the hub logs.
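To make the suggestion concrete, here is a minimal sketch of the kind of print debugging I mean. It is not the code at that line of values.yaml: the function name, its parameters, and the `[debug]` prefix are all hypothetical stand-ins for whatever step adds staff to the hub's admin list. The idea is just to print the inputs and the result at each stage so they can be compared in the hub logs.

```python
# Hypothetical stand-in for the step that adds staff to the hub's admin list.
# None of these names come from the real values.yaml; they are illustrative only.
def add_staff_to_admins(admin_users, staff_ids, enabled):
    """Return the admin list after optionally appending staff IDs, printing each step."""
    # flush=True so each line is written out immediately instead of sitting in a buffer
    print(f"[debug] enabled={enabled!r}", flush=True)
    print(f"[debug] staff_ids={staff_ids!r}", flush=True)
    print(f"[debug] admin_users before={admin_users!r}", flush=True)

    merged = list(admin_users)
    if enabled:
        # Append only the IDs that are not already present, preserving order
        merged.extend(u for u in staff_ids if u not in merged)

    print(f"[debug] admin_users after={merged!r}", flush=True)
    return merged
```

If the "after" line never shows the expected usernames, the problem is upstream of the authenticator (the flag or the staff list is not being read); if it does, the problem is more likely in how the login username is matched against that list.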
hyphen is not in the hub image name James has created, so let's be consistent
updates: - [github.com/pycqa/isort: 5.11.0 → v5.11.3](PyCQA/isort@5.11.0...v5.11.3)
We would like to be able to select the ML specific images not just for the GPU specific server, but also for the medium server. This option is available for more members and will in particular be used for an upcoming workshop. I am not sure if this simple change does the trick or if anything elsewhere needs to be specified.
updates: - [github.com/pycqa/isort: v5.11.3 → 5.11.4](PyCQA/isort@v5.11.3...5.11.4)
Bumps [rich](https://github.com/Textualize/rich) from 12.6.0 to 13.0.0.
- [Release notes](https://github.com/Textualize/rich/releases)
- [Changelog](https://github.com/Textualize/rich/blob/master/CHANGELOG.md)
- [Commits](Textualize/rich@v12.6.0...v13.0.0)
---
updated-dependencies:
- dependency-name: rich
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <[email protected]>
There are new changes here that will help a lot with memory usage! See https://gateway.dask.org/changelog.html#id1
Brings in fix to allow opening .json files that have hard tabs
This restores admin access to 2i2c staff members
for more information, see https://pre-commit.ci
Omg, I did |
Ok, the dask-staging hub is now timing out "waiting for the condition" from helm and I don't know why, because all the pods are up and running, so I'm just going to purge that one with a view to getting this damn PR merged. The staging hub completes fine.
Ugh, now the PR has changed 73 files, what the hell has happened 😭
Closing in favour of #2022, which isn't so much of a mess.
This file was missed in 2i2c-org#2022, and recovered from 2i2c-org#1967
ref #1949
Since this is a new shared cluster, I will deploy a staging and dask-staging hub alongside the research delight hub requested, mimicking the GCP shared cluster setup.
Hubs added: