v0.4.0 testing tracker #976

Closed
80 of 92 tasks
iameskild opened this issue Dec 17, 2021 · 16 comments
Labels
area: testing ✅ Testing type: release 🏷 Items related to Nebari releases

Comments

@iameskild
Member

iameskild commented Dec 17, 2021

Checklist:

Validate successful qhub deploy and qhub destroy for each provider:

  • AWS

    • password
    • github
    • auth0

    Validate the following services:

  • Azure

    • password
    • github
    • auth0

    Validate the following services:

    • Log into keycloak as root user and add user
    • Add user from command line
    • Launch JupyterLab session with new user
    • Launch dask-cluster and test auto-scaler
    • Launch dask-gateway dashboard
    • Validate conda-store environments are created and available
    • Launch basic CDS Dashboard
    • Launch Grafana (validate SSO)
    • Qhub destroy
  • DO

    • password
    • github
    • auth0

    Validate the following services:

    • Log into keycloak as root user and add user
    • Add user from command line
    • Launch JupyterLab session with new user
    • Launch dask-cluster and test auto-scaler
    • Launch dask-gateway dashboard
    • Validate conda-store environments are created and available
    • Launch basic CDS Dashboard
    • Launch Grafana (validate SSO)
    • Qhub destroy
  • GCP

    • password
    • github
    • auth0

    Validate the following services:

    • Log into keycloak as root user and add user
    • Add user from command line
    • Launch JupyterLab session with new user
    • Launch dask-cluster and test auto-scaler
    • Launch dask-gateway dashboard
    • Validate conda-store environments are created and available
    • Launch basic CDS Dashboard
    • Launch Grafana (validate SSO)
    • Qhub destroy
  • local/existing kubernetes cluster/minikube

    • password
    • github
    • auth0

    Validate the following services:

    • Log into keycloak as root user and add user
    • Add user from command line
    • Launch JupyterLab session with new user
    • Launch dask-cluster and test auto-scaler
    • Launch dask-gateway dashboard
    • Validate conda-store environments are created and available
    • Launch basic CDS Dashboard
    • Launch Grafana (validate SSO)
    • Qhub destroy

Validate qhub upgrade is successful for each provider:

  • AWS

    • Upgrade from v0.3.12/v0.3.13/v0.3.14 to v0.4.0
  • Azure

    • Upgrade from v0.3.12/v0.3.13/v0.3.14 to v0.4.0
  • DO

    • Upgrade from v0.3.12/v0.3.13/v0.3.14 to v0.4.0
  • GCP

    • Upgrade from v0.3.12/v0.3.13/v0.3.14 to v0.4.0
  • local/existing kubernetes deployment/minikube

    • Upgrade from v0.3.12/v0.3.13/v0.3.14 to v0.4.0

Validate qhub-ops.yaml workflow

  • github-actions
  • gitlab-ci

(outdated)

#1003 Testing

  • Minikube deployment
  • AWS
  • Azure
  • Digital Ocean
  • GCP

Keycloak

  • Use keycloak for user authentication, test:
    • AWS
      • password
      • github
      • auth0
    • Azure
      • password
      • github
      • auth0
    • Digital Ocean
      • password
      • github
      • auth0
    • GCP
      • password
      • github
      • auth0

Azure deployments fail; see #978 for more details.

qhub upgrade

AWS

  • upgrade from qhub v0.3.12 to main - password auth
  • upgrade from qhub v0.3.12 to main - auth0 auth

DO

  • upgrade from qhub v0.3.12 to main - password auth
  • upgrade from qhub v0.3.12 to main - auth0 auth
@iameskild iameskild added this to the Release v0.4.0 milestone Dec 17, 2021
@iameskild iameskild self-assigned this Dec 17, 2021
@trallard trallard added the type: release 🏷 Items related to Nebari releases label Dec 22, 2021
@trallard trallard moved this to Needs Triage 🔍 in QHub Project Mangement 🚀 Dec 22, 2021
@trallard trallard moved this from Needs Triage 🔍 to In Progress 🏃🏽‍♀️ in QHub Project Mangement 🚀 Dec 22, 2021
@trallard trallard pinned this issue Dec 22, 2021
@iameskild
Member Author

iameskild commented Jan 3, 2022

I've been having trouble upgrading from 0.3.12 on AWS (using Auth0) to the version of qhub on main (i.e., export QHUB_GH_BRANCH=main). During the deploy step, I keep running into the following error:

[terraform]: │ Error: Get "http://localhost/api/v1/namespaces/dev": dial tcp [::1]:80: connect: connection refused

I've seen errors like this in the past but I haven't been able to get around them. @danlester do you have any idea why this might be failing or if there are additional steps I need to take?

@danlester
Contributor

@iameskild Not too sure, but we can have a call if you want to look together.

@iameskild
Member Author

@danlester I've attempted another upgrade with the same results. I will try to perform an upgrade from 0.3.13 to main for another cloud provider and see if I can get it working. I'm free to jump on a call whenever is convenient for you, thanks for your help!

@danlester
Contributor

I don't think there will be much difference, but I would suggest also trying 0.3.12 to main for another cloud provider, so you're changing less for comparison.

It could also be worth trying with password instead of auth0 to see if that works - I have done most testing under password.

@iameskild
Member Author

@danlester I was able to upgrade from 0.3.12 to 0.4.0 (main) running on DO using password. I made the following adjustments:

  • did not change the underlying general node instance type
  • reinstalled qhub (bumped version to v0.4.0) into qhub-main conda env
  • manually updated the image tags to v0.3.14

Unfortunately the hub pod never came back up. This made it so I couldn't test importing existing users or verify that the user data is still intact.

hub pod logs:

Loading /usr/local/etc/jupyterhub/secret/values.yaml
No config at /usr/local/etc/jupyterhub/existing-secret/values.yaml
Loading extra config: jupyterhub_extra_config
[E 2022-01-11 04:12:52.994 JupyterHub app:2973]
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.7/site-packages/jupyterhub/app.py", line 2970, in launch_instance_async
        await self.initialize(argv)
      File "/opt/conda/lib/python3.7/site-packages/jupyterhub/app.py", line 2461, in initialize
        self.load_config_file(self.config_file)
      File "/opt/conda/lib/python3.7/site-packages/traitlets/config/application.py", line 88, in inner
        return method(app, *args, **kwargs)
      File "/opt/conda/lib/python3.7/site-packages/traitlets/config/application.py", line 777, in load_config_file
        raise_config_file_errors=self.raise_config_file_errors,
      File "/opt/conda/lib/python3.7/site-packages/traitlets/config/application.py", line 738, in _load_config_files
        config = loader.load_config()
      File "/opt/conda/lib/python3.7/site-packages/traitlets/config/loader.py", line 614, in load_config
        self._read_file_as_dict()
      File "/opt/conda/lib/python3.7/site-packages/traitlets/config/loader.py", line 646, in _read_file_as_dict
        exec(compile(f.read(), conf_filename, 'exec'), namespace, namespace)
      File "/usr/local/etc/jupyterhub/jupyterhub_config.py", line 446, in <module>
        exec(config_py)
      File "<string>", line 28, in <module>
    ImportError: cannot import name 'theme_extra_handlers' from 'qhub_jupyterhub_theme' (/opt/conda/lib/python3.7/site-packages/qhub_jupyterhub_theme/__init__.py)
    
stream closed
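
For anyone hitting the same traceback: the ImportError suggests that the config being executed expects a name that the package installed in the hub image does not export. A minimal sketch of a check that could be run inside the image, assuming nothing beyond the standard library (`module_exports` is a hypothetical helper name, not part of qhub):

```python
import importlib


def module_exports(module_name: str, attr: str) -> bool:
    """Return True if `module_name` can be imported and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package itself is missing from this environment.
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # Names taken from the traceback above; the result depends on the image.
    print(module_exports("qhub_jupyterhub_theme", "theme_extra_handlers"))
```

A `False` result here would be consistent with the image being older than the deployed config.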

@danlester
Contributor

@iameskild This is the same problem that Vini faced: #967 (comment)

I'm not too sure why you manually updated the image tags to v0.3.14. The qhub upgrade should have already set them to v0.3.14 - but only if they started off as v0.3.12 in the qhub-config.yaml file. Ultimately, when qhub (Python module) has its internal version number at v0.4.0, qhub upgrade should end up at v0.4.0 for the image tags instead.

But since the qhub repo doesn't yet have a v0.4.0 tag, no corresponding images exist in Docker Hub, so you would really need to (manually) use main as the image tag to get the versions based on our latest code.

If you still have the broken site running, try updating the image tags in qhub-config.yaml and redeploy - it will still be a helpful test I think.

Still happy to have a call to go through all of this together.
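
As a rough illustration of the retagging step described above, here is a sketch that rewrites image tags in an already-parsed qhub-config.yaml. The `default_images` layout and the `quansight/qhub-*` image names in the example are assumptions for illustration, and `retag_images` is a hypothetical helper, not something qhub provides:

```python
import re

# Assumed image naming scheme: "quansight/qhub-<component>:<tag>".
IMAGE_WITH_TAG = re.compile(r"^(?P<repo>quansight/qhub-[\w.-]+):(?P<tag>[\w.-]+)$")


def retag_images(config, new_tag):
    """Return a copy of `config` with every matching image string retagged."""
    if isinstance(config, dict):
        return {key: retag_images(value, new_tag) for key, value in config.items()}
    if isinstance(config, list):
        return [retag_images(value, new_tag) for value in config]
    if isinstance(config, str):
        match = IMAGE_WITH_TAG.match(config)
        if match:
            return f"{match.group('repo')}:{new_tag}"
    # Non-image values are returned unchanged.
    return config


if __name__ == "__main__":
    # Hypothetical fragment of a parsed qhub-config.yaml.
    example = {
        "project_name": "demo",
        "default_images": {
            "jupyterhub": "quansight/qhub-jupyterhub:v0.3.14",
            "jupyterlab": "quansight/qhub-jupyterlab:v0.3.14",
        },
    }
    print(retag_images(example, "main"))
```

The function returns a new structure rather than editing in place, so the original config can be kept for comparison before redeploying.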

@iameskild
Member Author

Redeploying with image tags set to main resolves this issue. After importing the users and logging in, the user data remains intact :)

I still want to go back and test upgrading a QHub instance that uses Auth0.

@iameskild
Member Author

Upgrading qhub (on AWS, using Auth0) from v0.3.12 to v0.4.0 failed during the deployment process. I tried the same upgrade and deploy on DO and, while it successfully deployed and I could import users, I couldn't log in due to the following:
[Screenshot: Screen Shot 2022-01-12 at 22 46 11]

I also noticed a few bizarre Terraform outputs:

[terraform]: Note: Objects have changed outside of Terraform
[terraform]: 
[terraform]: Terraform detected the following changes made outside of Terraform since the
[terraform]: last "terraform apply":
[terraform]: 
[terraform]:   # module.kubernetes-conda-store-mount.kubernetes_persistent_volume_claim.main previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.kubernetes-conda-store-mount.kubernetes_storage_class.main previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.kubernetes-conda-store-mount.kubernetes_persistent_volume.main previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.forwardauth.kubernetes_deployment.forwardauth-deployment previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.forwardauth.kubernetes_service.forwardauth-service previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_deployment.gateway previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_config_map.gateway previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_cluster_role.gateway previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_service.gateway previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_secret.gateway previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_config_map.controller previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_service_account.gateway previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_cluster_role_binding.gateway previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_cluster_role.controller previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_cluster_role_binding.controller previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_service_account.controller previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-dask-gateway.kubernetes_deployment.controller previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.kubernetes-conda-store-server.kubernetes_config_map.conda-environments previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.kubernetes-conda-store-server.kubernetes_deployment.main previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.kubernetes-conda-store-server.kubernetes_persistent_volume_claim.main previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.kubernetes-conda-store-server.kubernetes_service.main previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-jupyterhub-ssh.kubernetes_deployment.jupyterhub-sftp previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-jupyterhub-ssh.kubernetes_deployment.jupyterhub-ssh previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-jupyterhub-ssh.kubernetes_secret.jupyterhub-sftp previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-jupyterhub-ssh.kubernetes_secret.jupyterhub-ssh previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-jupyterhub-ssh.kubernetes_service.jupyterhub-sftp previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-jupyterhub-ssh.kubernetes_service.jupyterhub-ssh previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.qhub.module.kubernetes-jupyterhub-ssh.kubernetes_config_map.jupyterhub-ssh previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.kubernetes-nfs-server.kubernetes_deployment.main previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.kubernetes-nfs-server.kubernetes_persistent_volume_claim.main previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
[terraform]:   # module.kubernetes-nfs-server.kubernetes_service.main previous run state doesn't conform to current schema; this is a Terraform bug
[terraform]:   # unsupported attribute "self_link"
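
Since the output above repeats the same warning for many resources, a small script can make the pattern easier to see. This is only a sketch that counts the affected resource types from captured log lines; `summarize_schema_warnings` is a hypothetical helper and the regular expression assumes the exact wording shown above:

```python
import re
from collections import Counter

# Matches lines such as:
#   [terraform]:   # module.x.kubernetes_service.main previous run state doesn't conform ...
SCHEMA_WARNING = re.compile(
    r"#\s+(\S+)\s+previous run state doesn't conform to current schema"
)


def summarize_schema_warnings(lines):
    """Count schema warnings per Terraform resource type."""
    counts = Counter()
    for line in lines:
        match = SCHEMA_WARNING.search(line)
        if match:
            # An address ends in "<resource_type>.<resource_name>".
            resource_type = match.group(1).split(".")[-2]
            counts[resource_type] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[terraform]:   # module.kubernetes-nfs-server.kubernetes_service.main previous run state doesn't conform to current schema; this is a Terraform bug",
        '[terraform]:   # unsupported attribute "self_link"',
    ]
    print(summarize_schema_warnings(sample))
```

In the output pasted above, every affected address is a `kubernetes_*` resource, which points at the Kubernetes provider schema rather than at any one module.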

@danlester are you available to troubleshoot together tomorrow after the QHub sync?

@iameskild
Member Author

iameskild commented Jan 13, 2022

@danlester capturing the Terraform logs led me to:

Invalid provider configuration was supplied. Provider operations likely to fail: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable:

Googling this, I found an issue on the terraform-aws-eks repo. There, one of the top recommendations was terraform-aws-modules/terraform-aws-eks#1234 (comment):

export KUBE_CONFIG_PATH=/Users/eskild/.kube/config

With this trick, the deployment seemed to be working, but then it started deleting subnet resources and errored out, leaving the cluster in a half-deleted state.

Logs in this gist.
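
For reference, the same workaround can be applied from a wrapper script instead of the shell. This is a sketch only: `terraform_env` is a hypothetical helper, and the default of `~/.kube/config` is an assumption that matches the path used above:

```python
import os
from pathlib import Path


def terraform_env(base_env=None, kubeconfig=None):
    """Build an environment for Terraform with KUBE_CONFIG_PATH set.

    An existing KUBE_CONFIG_PATH is left untouched so that an explicit
    user choice always wins over the default.
    """
    env = dict(os.environ if base_env is None else base_env)
    if "KUBE_CONFIG_PATH" not in env:
        if kubeconfig is None:
            # Assumed default kubeconfig location.
            kubeconfig = Path.home() / ".kube" / "config"
        env["KUBE_CONFIG_PATH"] = str(kubeconfig)
    return env


if __name__ == "__main__":
    env = terraform_env()
    print(env["KUBE_CONFIG_PATH"])
    # The environment could then be passed on, e.g.:
    # subprocess.run(["terraform", "plan"], env=env, check=True)
```

Note that this only tells the Kubernetes provider where to find a kubeconfig; it does not address the subnet replacement described above.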

@danlester
Contributor

danlester commented Jan 14, 2022

@iameskild I believe I've solved this particular problem (Terraform trying to access localhost cluster) in the following issue which gives more details. It has a corresponding PR - please review:

Kubeconfig state unavailable, Terraform defaults to localhost

However, (in AWS) it leads me to the problem you were seeing about subnet resources being replaced. (Some outputs below). Once it wants to replace the node groups, the apply will never finish since the nodes can't be destroyed until the cluster has its contents removed safely.

By the way, I tried the upgrade on AWS and got the same localhost error using password auth (not Auth0) - I don't think the auth type has anything to do with it, and you were just lucky if you got password upgrade to work before - or maybe something has changed since!

As discussed, the login problem you saw with Auth0 above is because the callback URL needs to be changed, and we need to advise the user in qhub upgrade - issue #991 for you.

Terraform AWS subnet replacement logs
[terraform]:   # module.kubernetes.aws_eks_cluster.main must be replaced
[terraform]: -/+ resource "aws_eks_cluster" "main" {
[terraform]:       ~ arn                       = "arn:aws:eks:eu-west-2:892486800165:cluster/qhubawsdslc-dev" -> (known after apply)
[terraform]:       ~ certificate_authority     = [
[terraform]:           - {
[terraform]:               - data = "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1ekNDQWMrZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1ERXhOREE1TlRFd05Wb1hEVE15TURFeE1qQTVOVEV3TlZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTVFUCjFnclQ0UFhZb1FwUXpGTVY3WHZ5bGtQdjg3ZTA3Nyt2SzhTZnpndkNmRFRrcVFLV093bDk1UmQ2RmtOS2JvMWQKY2cwL1ZSSVBkcWlBS0liVlBLWnVlcDE1UGEyUHNjNEZaRG5EMXdKd1BlMllPQWlmS2p3M1Z0dkxHRHVJZnF3Zgo0eHA5cm5IZHl6MytMMGdQaGZXaTZ3R3NZeUxJbmt0VUg2YzdGYlQzaUplbUEwTVI4dGVRZGVMaklac3BoMk9zCkNPMzFiYi84bEVrRlBZS2paZDhMNE9kM3ZzcnM2cURhbVh6ZWhCWDJpZlY1bWJuOW5iWnlLWGNXTTFaQmdaN2sKQ1ZxdjJLbytZaWZCeWlzTm15RHNpZjBMQW1Obk5Fa1RNclJudlREUmEvWGxSdkVJT3hBT054ZGdSU1JHekFsMApkOTJGSVhTUlpLaG1oS2VHM21VQ0F3RUFBYU5DTUVBd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZMejlIUWtvSTEweTF0a0xYc0t4MUlWVW50aGdNQTBHQ1NxR1NJYjMKRFFFQkN3VUFBNElCQVFBYStjN2NkRCt0Qi9WblNlK1FJN2pyYnZ4WHNHbHVaRzZ4czRVdUluWXdTemFBRUU0egpOcVRybXh2WWJJYm43QUpGTU9jVW9pVTNFekVOMi85K1MrQ1VlWHp6WUdwOWFvUWR6NFR0Nlp5L2VxK083dFAzCnBTcUJnaTFRY1Z0MUdNcHNQYTBQS1dCVzl1TzVCZ3FmUE52UUVrWStab3dWZEJ4ZW9EL2Evb2hCSEgxSHFEbloKN2tGbTVXR2tOdHJabUU1ZUt0K1ExWjV6dWcvNGhpemc0UTdrMC9kR2FwcDlVaFN5ejQvaFpCOVp5VDNmanpFNgp6SnlWVHIyUHlRbzZma3BpSmJ2U2pqTlBDaFdVdXN6MzRZTVJFRDFFbjA3SVA5WDlaY1ZyUWpXajIyVmpBNnhoCkl6d1FmVmwzQnZXb3JsRFBHMVlZMDUzbzJWbXNkVVR5UFRXeQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg=="
[terraform]:             },
[terraform]:         ] -> (known after apply)
[terraform]:       ~ created_at                = "2022-01-14 09:45:08.526 +0000 UTC" -> (known after apply)
[terraform]:       - enabled_cluster_log_types = [] -> null
[terraform]:       ~ endpoint                  = "https://6E72EBCB3DA4098E9A8F2EC38D1146A0.gr7.eu-west-2.eks.amazonaws.com" -> (known after apply)
[terraform]:       ~ id                        = "qhubawsdslc-dev" -> (known after apply)
[terraform]:       ~ identity                  = [
[terraform]:           - {
[terraform]:               - oidc = [
[terraform]:                   - {
[terraform]:                       - issuer = "https://oidc.eks.eu-west-2.amazonaws.com/id/6E72EBCB3DA4098E9A8F2EC38D1146A0"
[terraform]:                     },
[terraform]:                 ]
[terraform]:             },
[terraform]:         ] -> (known after apply)
[terraform]:         name                      = "qhubawsdslc-dev"
[terraform]:       ~ platform_version          = "eks.4" -> (known after apply)
[terraform]:       ~ status                    = "ACTIVE" -> (known after apply)
[terraform]:         tags                      = {
[terraform]:             "Environment" = "dev"
[terraform]:             "Name"        = "qhubawsdslc-dev"
[terraform]:             "Owner"       = "terraform"
[terraform]:             "Project"     = "qhubawsdslc"
[terraform]:         }
[terraform]:       ~ version                   = "1.21" -> (known after apply)
[terraform]:         # (2 unchanged attributes hidden)
[terraform]:
[terraform]:       ~ kubernetes_network_config {
[terraform]:           ~ service_ipv4_cidr = "172.20.0.0/16" -> (known after apply)
[terraform]:         }
[terraform]:
[terraform]:       ~ vpc_config {
[terraform]:           ~ cluster_security_group_id = "sg-007603ed388248252" -> (known after apply)
[terraform]:           ~ public_access_cidrs       = [
[terraform]:               - "0.0.0.0/0",
[terraform]:             ] -> (known after apply)
[terraform]:           ~ subnet_ids                = [
[terraform]:               - "subnet-0063ecc4391bfbca7",
[terraform]:               - "subnet-0aede967b72f0907b",
[terraform]:             ] -> (known after apply) # forces replacement
[terraform]:           ~ vpc_id                    = "vpc-09af9d5ed405a83ed" -> (known after apply)
[terraform]:             # (3 unchanged attributes hidden)
[terraform]:         }
[terraform]:     }
[terraform]:
[terraform]:   # module.kubernetes.aws_eks_node_group.main[0] must be replaced
[terraform]: -/+ resource "aws_eks_node_group" "main" {
[terraform]:       ~ arn                    = "arn:aws:eks:eu-west-2:892486800165:nodegroup/qhubawsdslc-dev/general/14bf2c01-b6a2-2c2a-2b93-c9c96a608e41" -> (known after apply)
[terraform]:       ~ capacity_type          = "ON_DEMAND" -> (known after apply)
[terraform]:       ~ disk_size              = 20 -> 50 # forces replacement
[terraform]:       ~ id                     = "qhubawsdslc-dev:general" -> (known after apply)
[terraform]:       - labels                 = {} -> null
[terraform]:       + node_group_name_prefix = (known after apply)
[terraform]:       ~ release_version        = "1.21.5-20220112" -> (known after apply)
[terraform]:       ~ resources              = [
[terraform]:           - {
[terraform]:               - autoscaling_groups              = [
[terraform]:                   - {
[terraform]:                       - name = "eks-general-14bf2c01-b6a2-2c2a-2b93-c9c96a608e41"
[terraform]:                     },
[terraform]:                 ]
[terraform]:               - remote_access_security_group_id = ""
[terraform]:             },
[terraform]:         ] -> (known after apply)
[terraform]:       ~ status                 = "ACTIVE" -> (known after apply)
[terraform]:       ~ subnet_ids             = [
[terraform]:           - "subnet-0063ecc4391bfbca7",
[terraform]:           - "subnet-0aede967b72f0907b",
[terraform]:         ] -> (known after apply) # forces replacement
[terraform]:         tags                   = {
[terraform]:             "Environment"                           = "dev"
[terraform]:             "Owner"                                 = "terraform"
[terraform]:             "Project"                               = "qhubawsdslc"
[terraform]:             "kubernetes.io/cluster/qhubawsdslc-dev" = "shared"
[terraform]:         }
[terraform]:       ~ version                = "1.21" -> (known after apply)
[terraform]:         # (6 unchanged attributes hidden)
[terraform]:
[terraform]:
[terraform]:       ~ update_config {
[terraform]:           ~ max_unavailable            = 1 -> (known after apply)
[terraform]:           ~ max_unavailable_percentage = 0 -> (known after apply)
[terraform]:         }
[terraform]:         # (1 unchanged block hidden)
[terraform]:     }
[terraform]:
[terraform]:   # module.kubernetes.aws_eks_node_group.main[1] must be replaced
[terraform]: -/+ resource "aws_eks_node_group" "main" {
[terraform]:       ~ arn                    = "arn:aws:eks:eu-west-2:892486800165:nodegroup/qhubawsdslc-dev/user/06bf2c01-b6a2-7df4-2dd9-085b9d1c86fa" -> (known after apply)
[terraform]:       ~ capacity_type          = "ON_DEMAND" -> (known after apply)
[terraform]:       ~ disk_size              = 20 -> 50 # forces replacement
[terraform]:       ~ id                     = "qhubawsdslc-dev:user" -> (known after apply)
[terraform]:       - labels                 = {} -> null
[terraform]:       + node_group_name_prefix = (known after apply)
[terraform]:       ~ release_version        = "1.21.5-20220112" -> (known after apply)
[terraform]:       ~ resources              = [
[terraform]:           - {
[terraform]:               - autoscaling_groups              = [
[terraform]:                   - {
[terraform]:                       - name = "eks-user-06bf2c01-b6a2-7df4-2dd9-085b9d1c86fa"
[terraform]:                     },
[terraform]:                 ]
[terraform]:               - remote_access_security_group_id = ""
[terraform]:             },
[terraform]:         ] -> (known after apply)
[terraform]:       ~ status                 = "ACTIVE" -> (known after apply)
[terraform]:       ~ subnet_ids             = [
[terraform]:           - "subnet-0063ecc4391bfbca7",
[terraform]:           - "subnet-0aede967b72f0907b",
[terraform]:         ] -> (known after apply) # forces replacement
[terraform]:         tags                   = {
[terraform]:             "Environment"                           = "dev"
[terraform]:             "Owner"                                 = "terraform"
[terraform]:             "Project"                               = "qhubawsdslc"
[terraform]:             "kubernetes.io/cluster/qhubawsdslc-dev" = "shared"
[terraform]:         }
[terraform]:       ~ version                = "1.21" -> (known after apply)
[terraform]:         # (6 unchanged attributes hidden)
[terraform]:
[terraform]:
[terraform]:       ~ update_config {
[terraform]:           ~ max_unavailable            = 1 -> (known after apply)
[terraform]:           ~ max_unavailable_percentage = 0 -> (known after apply)
[terraform]:         }
[terraform]:         # (1 unchanged block hidden)
[terraform]:     }
[terraform]:
[terraform]:   # module.kubernetes.aws_eks_node_group.main[2] must be replaced
[terraform]: -/+ resource "aws_eks_node_group" "main" {
[terraform]:       ~ arn                    = "arn:aws:eks:eu-west-2:892486800165:nodegroup/qhubawsdslc-dev/worker/f2bf2c01-b695-d752-dff8-df37ec98f1ab" -> (known after apply)
[terraform]:       ~ capacity_type          = "ON_DEMAND" -> (known after apply)
[terraform]:       ~ disk_size              = 20 -> 50 # forces replacement
[terraform]:       ~ id                     = "qhubawsdslc-dev:worker" -> (known after apply)
[terraform]:       - labels                 = {} -> null
[terraform]:       + node_group_name_prefix = (known after apply)
[terraform]:       ~ release_version        = "1.21.5-20220112" -> (known after apply)
[terraform]:       ~ resources              = [
[terraform]:           - {
[terraform]:               - autoscaling_groups              = [
[terraform]:                   - {
[terraform]:                       - name = "eks-worker-f2bf2c01-b695-d752-dff8-df37ec98f1ab"
[terraform]:                     },
[terraform]:                 ]
[terraform]:               - remote_access_security_group_id = ""
[terraform]:             },
[terraform]:         ] -> (known after apply)
[terraform]:       ~ status                 = "ACTIVE" -> (known after apply)
[terraform]:       ~ subnet_ids             = [
[terraform]:           - "subnet-0063ecc4391bfbca7",
[terraform]:           - "subnet-0aede967b72f0907b",
[terraform]:         ] -> (known after apply) # forces replacement
[terraform]:         tags                   = {
[terraform]:             "Environment"                           = "dev"
[terraform]:             "Owner"                                 = "terraform"
[terraform]:             "Project"                               = "qhubawsdslc"
[terraform]:             "kubernetes.io/cluster/qhubawsdslc-dev" = "shared"
[terraform]:         }
[terraform]:       ~ version                = "1.21" -> (known after apply)
[terraform]:         # (6 unchanged attributes hidden)
[terraform]:
[terraform]:
[terraform]:       ~ update_config {
[terraform]:           ~ max_unavailable            = 1 -> (known after apply)
[terraform]:           ~ max_unavailable_percentage = 0 -> (known after apply)
[terraform]:         }
[terraform]:         # (1 unchanged block hidden)
[terraform]:     }
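
To pull the relevant lines out of a long plan like the one above, a rough parser such as the following may help. It is a sketch that assumes the `[terraform]:` prefix and the `# forces replacement` markers exactly as they appear in these logs; `forced_replacements` is a hypothetical helper name:

```python
import re
from collections import defaultdict

PREFIX = re.compile(r"^\[terraform\]:\s?")
RESOURCE = re.compile(r"#\s+(\S+)\s+must be replaced")
ATTRIBUTE = re.compile(r"^[~+-]\s+(\w+)\s+=")


def forced_replacements(lines):
    """Map each replaced resource to the attributes that force replacement."""
    result = defaultdict(list)
    current = None
    open_lists = []  # names of attributes whose "[" list is still open
    for raw in lines:
        line = PREFIX.sub("", raw).strip()
        resource = RESOURCE.search(line)
        if resource:
            current, open_lists = resource.group(1), []
            continue
        if current is None:
            continue
        attribute = ATTRIBUTE.match(line)
        name = attribute.group(1) if attribute else None
        if name and line.endswith("["):
            open_lists.append(name)
        if line.endswith("# forces replacement"):
            # A closing "]" line carries the marker for the list it closes.
            if name is None and open_lists:
                name = open_lists[-1]
            if name and name not in result[current]:
                result[current].append(name)
        if line.startswith("]") and open_lists:
            open_lists.pop()
    return dict(result)


if __name__ == "__main__":
    import sys

    # Usage: python forced.py < captured-terraform-output.log
    for address, attrs in forced_replacements(sys.stdin).items():
        print(address, "->", ", ".join(attrs))
```

For the plan above, the markers sit on `subnet_ids` for the cluster and on `disk_size` and `subnet_ids` for each node group.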

@danlester
Contributor

I think it's something to do with CIDR changes:

[terraform]:   # module.network.aws_subnet.main[0] must be replaced
[terraform]: -/+ resource "aws_subnet" "main" {
[terraform]:       ~ arn                             = "arn:aws:ec2:eu-west-2:892486800165:subnet/subnet-0aede967b72f0907b" -> (known after apply)
[terraform]:       ~ availability_zone_id            = "euw2-az2" -> (known after apply)
[terraform]:       ~ cidr_block                      = "10.10.0.0/20" -> "10.10.0.0/18" # forces replacement
[terraform]:       ~ id                              = "subnet-0aede967b72f0907b" -> (known after apply)
[terraform]:       + ipv6_cidr_block_association_id  = (known after apply)

I would take a look where these have been changed (e.g. vpc_cidr_newbits and vpc_cidr_block in the code), find out why, and see if they can at least be preserved for old installations.
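
To illustrate how a change in the number of new bits alone could produce the plan above, here is a sketch using Python's `ipaddress` module. The VPC block `10.10.0.0/16` and the values 4 and 2 are assumptions chosen to reproduce the `/20` and `/18` prefixes in the log; they are not taken from the qhub code:

```python
import ipaddress


def first_subnet(vpc_cidr_block: str, newbits: int) -> ipaddress.IPv4Network:
    """Return the first subnet obtained by extending the VPC prefix by `newbits`."""
    vpc = ipaddress.ip_network(vpc_cidr_block)
    return next(vpc.subnets(prefixlen_diff=newbits))


if __name__ == "__main__":
    old = first_subnet("10.10.0.0/16", 4)  # assumed old setting
    new = first_subnet("10.10.0.0/16", 2)  # assumed new setting
    print(old, "->", new)
    # The old range sits inside the new one, yet the CIDR string differs,
    # so Terraform treats the subnet as a different resource.
    print(old.subnet_of(new))
```

Because `cidr_block` cannot be changed in place on an existing subnet, any difference here forces replacement, which then cascades to the cluster and node groups that reference the subnet IDs.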

@viniciusdc
Contributor

@iameskild just to keep in mind during tests

@viniciusdc viniciusdc self-assigned this Feb 18, 2022
@iameskild
Member Author

CICD workflows have been tested and a PR for the relevant bug fixes/modifications has been opened:
#1086

@viniciusdc
Contributor

The Azure issues seen in integration tests do not affect fresh local deployments.

@viniciusdc
Contributor

@danlester @HarshCasper Have you tested the qhub upgrade command for the above version migrations? Just checking whether that still needs to be tested 😄

@iameskild
Member Author

v0.4.0 released. Closing issue 🙌

Repository owner moved this from In Progress 🏃🏽‍♀️ to Done 💪🏾 in QHub Project Mangement 🚀 Mar 22, 2022
@iameskild iameskild unpinned this issue Mar 22, 2022

4 participants