Deploying NASA Cryo cluster and hub #1768
Updated to add: This issue was resolved by #1770.
I'm at the point in the documentation where I need to use terraform to generate a CI/CD key: https://infrastructure.2i2c.org/en/latest/howto/operate/new-cluster/aws.html#create-account-with-finely-scoped-permissions-for-automatic-deployment However, the AWS terraform now seems to do far more than just create a CI/CD key, and I don't know where that is documented. (PR #1767 is my attempt to improve the documentation on this whole process, but I could use help filling in the parts I don't know.) The .tfvars file generated by infrastructure/terraform/aws/variables.tf Lines 66 to 72 in 566c2a7
But my terraform plan is failing trying to set a root password for a mysql database - as far as I know, I don't want this feature enabled and it shouldn't be running!
|
- Don't set up mysql provider when db is not enabled
- Make sure all db related resources are conditional on db being enabled
- Switch to a maintained fork of the mysql provider

Unblocks 2i2c-org#1768
Output of
|
@yuvipanda @damianavila The next bug is that the deployer credentials created by terraform don't seem to work
I'm trying to deploy the support chart but can't. I can successfully authenticate against the cluster myself using my environment variables generated from following these docs
But whatever gets exported from |
@sgibson91, can you check if the rows in the json that stores the creds are tabs instead of spaces? Update: |
This is all correct. But also, the deployer opens json files with the json library (instead of yaml) precisely because of the hard tabs. I will dig out the PR where Yuvi reintroduced this. ETA: This commit from this PR. |
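To illustrate the point about hard tabs: JSON treats a tab as ordinary whitespace between tokens, whereas YAML does not allow tabs for indentation, so reading the file with the `json` library sidesteps the problem. Below is a minimal sketch, assuming a made-up credentials document; the key names and values are hypothetical and are not taken from the real terraform output.

```python
import json

# Hypothetical tab-indented credentials document (key names are illustrative only).
raw = (
    "{\n"
    '\t"AccessKey": {\n'
    '\t\t"AccessKeyId": "EXAMPLE",\n'
    '\t\t"SecretAccessKey": "EXAMPLE"\n'
    "\t}\n"
    "}\n"
)

# The document really is indented with hard tabs.
assert "\t" in raw

# json.loads accepts tabs as whitespace between tokens, so this parses cleanly.
creds = json.loads(raw)
print(creds["AccessKey"]["AccessKeyId"])
```

If the credentials parse this way, the tabs themselves are unlikely to be the cause of the authentication failure.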
AFAIR, the terraform pieces create the deployer IAM user but you still need to give it access manually: https://infrastructure.2i2c.org/en/latest/howto/operate/new-cluster/aws.html#grant-access-to-other-users. Maybe this is the underlying issue? |
+1 to what @damianavila linked to. Might be the cause. Unfortunately only you can do this, @sgibson91 because of https://infrastructure.2i2c.org/en/latest/howto/operate/new-cluster/aws.html?highlight=access#grant-access-to-other-users. |
I did that though? Edited to add: Just checked my command history and apparently I granted it access already |
@sgibson91 oh damn, that sucks. Did you also do https://infrastructure.2i2c.org/en/latest/howto/operate/new-cluster/aws.html?highlight=access#grant-access-to-other-users? I can't seem to get access. |
Yes, I looped through everyone's user names that I found here: https://us-east-1.console.aws.amazon.com/iamv2/home?region=us-east-1#/groups/details/tech?section=users |
@sgibson91 ok that's super strange! I'll investigate. THANK YOU! |
@sgibson91 ah, my credentials had just expired i think. I can now reproduce your error, getting:
I'll look at that now. |
@sgibson91 which terraform workspace is this in? I don't see a nasa-cryo one in:
|
Oh no, I fucked up and didn't create a new one! 🤦🏻♀️ It could be in default since I didn't run any workspace commands after initialising? |
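One way to confirm which workspace the state ended up in is to look at the output of `terraform workspace list`, which prints one workspace per line and marks the selected one with `*`. The sketch below only parses such a listing; the function name and the sample listing are hypothetical, not copied from the real repository.

```python
def parse_workspaces(listing: str):
    """Return (all_workspace_names, current_workspace) from `terraform workspace list` output."""
    names = []
    current = None
    for line in listing.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("*"):
            # The selected workspace is prefixed with an asterisk.
            line = line[1:].strip()
            current = line
        names.append(line)
    return names, current


# Made-up listing for illustration: no dedicated workspace was ever created.
sample = "* default\n  openscapes\n"
names, current = parse_workspaces(sample)

if "nasa-cryo" not in names:
    print(f"No nasa-cryo workspace found; state is probably in '{current}'")
```

If the listing shows only `default` as selected, the resources were most likely created there.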
Two hubs are now up and running, so I am marking this PR ready for review |
For some reason,
Applying the "usual" hack to fix it (below) didn't work:
infrastructure/deployer/tests/test_hub_health.py Lines 44 to 48 in bf6ae45
|
RE: #1768 (comment) This isn't health check related, I can't spawn a server either. Something about the cloud-user-sa in these logs:
ETA: I removed |
I think this was copy-pasta from openscapes and was causing user servers to not spawn
Now my spawn is just hanging
I think it might be trying to auto-scale but can't? Not seeing the kind of scaling failed messages I'd expect though.
|
Pinging @damianavila in case he might be able to help with ☝🏼 |
@sgibson91 @GeorgianaElena on EKS, the cluster autoscaler needs to be explicitly enabled by us in our support chart. I did so in fedcfb2 and it's on now! The We should definitely have templates here for sure 100%, including for support charts. |
I opened #1800 to get rid of cloud-user-sa ( |
As for quotas, we need mostly EC2 quotas (https://us-east-1.console.aws.amazon.com/servicequotas/home/services/ec2/quotas) for new nodes to come up. In particular:
|
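As a rough way to reason about why new nodes may not come up: on-demand EC2 quotas are expressed in vCPUs rather than instance counts, so the headroom for the autoscaler is the unused vCPU quota divided by the vCPUs of the node type. This is only a back-of-the-envelope sketch; the function name and all numbers are hypothetical and do not describe this cluster.

```python
def nodes_that_fit(quota_vcpus: int, used_vcpus: int, vcpus_per_node: int) -> int:
    """How many additional nodes of a given size fit under a vCPU quota."""
    free = max(quota_vcpus - used_vcpus, 0)
    return free // vcpus_per_node


# Hypothetical numbers: a 32 vCPU quota, 24 vCPUs in use, 4 vCPU nodes.
headroom = nodes_that_fit(quota_vcpus=32, used_vcpus=24, vcpus_per_node=4)
print(f"The autoscaler can add at most {headroom} more node(s)")
```

When the headroom is zero, user pods stay pending until a quota increase is granted.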
Thanks @yuvipanda - I updated our docs around cluster-autoscaler and quotas as well. It's probably fine for now, and I'll give the flow a rethink when I begin tackling #1757 |
Yay awesome!
Will hold off merging this until Tasha has set the CNAME they would like to use :) |
I updated the teams to match the capitalisation in the slugs not the display names: #1702 (comment) I also updated the domains so the hubs are available at the desired CNAMEs of the community. Merging now! |
🎉🎉🎉🎉 Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/3296485343 |
Deployment-related changes in this PR
addresses #1702
Other changes in this PR