terraform destroy Invalid index issue #1161

Closed
armujahid opened this issue Nov 11, 2022 · 16 comments
@armujahid
Contributor

armujahid commented Nov 11, 2022

Description

Deploy karpenter example with released version using v4.15.0 (git checkout tags/v4.15.0)
Destroy cluster in multiple steps.
This last step (terraform destroy -auto-approve) throws this error

module.eks_blueprints.data.aws_iam_policy_document.eks_key: Reading...
module.eks_blueprints.data.aws_iam_policy_document.eks_key: Read complete after 0s [id=1575875048]
╷
│ Error: Invalid index
│ 
│   on .terraform/modules/eks_blueprints.aws_eks/main.tf line 125, in locals:
│  125:   cluster_security_group_id = local.create_cluster_sg ? aws_security_group.cluster[0].id : var.cluster_security_group_id
│     ├────────────────
│     │ aws_security_group.cluster is empty tuple
│ 
│ The given key does not identify an element in this collection value: the collection has no elements.
╵
╷
│ Error: Invalid index
│ 
│   on .terraform/modules/eks_blueprints.aws_eks/node_groups.tf line 59, in locals:
│   59:   node_security_group_id = local.create_node_sg ? aws_security_group.node[0].id : var.node_security_group_id
│     ├────────────────
│     │ aws_security_group.node is empty tuple
│ 
│ The given key does not identify an element in this collection value: the collection has no elements.

Versions

  • Module version [Required]: v4.15.0

  • Terraform version:

Terraform v1.3.4
on linux_amd64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.39.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.7.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.15.0
+ provider registry.terraform.io/hashicorp/local v2.2.3
+ provider registry.terraform.io/hashicorp/null v3.2.0
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1
  • Provider version(s): same as the provider list in the Terraform version output above

Reproduction Code [Required]

Steps to reproduce the behavior:

  1. git checkout tags/v4.15.0
  2. cd examples/karpenter
  3. terraform init
  4. terraform apply (1st apply will fail with this error but 2nd apply will work)
╷
│ Error: default failed to create kubernetes rest client for update of resource: Unauthorized
│ 
│   with kubectl_manifest.karpenter_provisioner["apiVersion: karpenter.sh/v1alpha5\nkind: Provisioner\nmetadata:\n  name: default\nspec:\n  requirements:\n    - key: \"topology.kubernetes.io/zone\"\n      operator: In\n      values: [us-west-2a,us-west-2b,us-west-2c]\n    - key: \"karpenter.sh/capacity-type\"\n      operator: In\n      values: [\"spot\", \"on-demand\"]\n  limits:\n    resources:\n      cpu: 1000\n  provider:\n    instanceProfile: karpenter-managed-ondemand\n    subnetSelector:\n      Name: \"karpenter-private*\"\n    securityGroupSelector:\n      karpenter.sh/discovery/karpenter: 'karpenter'\n  labels:\n    type: karpenter\n    provisioner: default\n  taints:\n    - key: default\n      value: 'true'\n      effect: NoSchedule\n  ttlSecondsAfterEmpty: 120"],
│   on main.tf line 203, in resource "kubectl_manifest" "karpenter_provisioner":
│  203: resource "kubectl_manifest" "karpenter_provisioner" {
│ 
╵
╷
│ Error: default-lt failed to create kubernetes rest client for update of resource: Unauthorized
│ 
│   with kubectl_manifest.karpenter_provisioner["apiVersion: karpenter.sh/v1alpha5\nkind: Provisioner\nmetadata:\n  name: default-lt\nspec:\n  requirements:\n    - key: \"topology.kubernetes.io/zone\"\n      operator: In\n      values: [us-west-2a,us-west-2b,us-west-2c]                               #Update the correct region and zones\n    - key: \"karpenter.sh/capacity-type\"\n      operator: In\n      values: [\"spot\", \"on-demand\"]\n    - key: \"node.kubernetes.io/instance-type\"              #If not included, all instance types are considered\n      operator: In\n      values: [\"m5.2xlarge\", \"m5.4xlarge\"]\n    - key: \"kubernetes.io/arch\"                            #If not included, all architectures are considered\n      operator: In\n      values: [\"arm64\", \"amd64\"]\n  limits:\n    resources:\n      cpu: 1000\n  provider:\n    launchTemplate: \"karpenter-karpenter\"     # Used by Karpenter Nodes\n    subnetSelector:\n      Name: \"karpenter-private*\"\n  labels:\n    type: karpenter\n    provisioner: default-lt\n  taints:\n    - key: default-lt\n      value: 'true'\n      effect: NoSchedule\n  ttlSecondsAfterEmpty: 120"],
│   on main.tf line 203, in resource "kubectl_manifest" "karpenter_provisioner":
│  203: resource "kubectl_manifest" "karpenter_provisioner" {
│ 
╵
╷
│ Error: Unauthorized
│ 
│   with kubernetes_secret_v1.datadog_api_key,
│   on main.tf line 214, in resource "kubernetes_secret_v1" "datadog_api_key":
│  214: resource "kubernetes_secret_v1" "datadog_api_key" {
│ 
╵

  5. terraform apply (2nd time)

At this point cluster will be available

  6. Now destroy it
terraform destroy -target="module.eks_blueprints_kubernetes_addons" -auto-approve
terraform destroy -target="module.eks_blueprints" -auto-approve
terraform destroy -target="module.vpc" -auto-approve
  7. Run this final step to get the above-mentioned error
terraform destroy -auto-approve

Expected behaviour

The cluster should be destroyed without any error, and terraform.tfstate shouldn't contain any residual resources.

Actual behaviour

The error mentioned above is thrown, and terraform.tfstate still has some residual resources even though the cluster has been removed along with the VPC.

Terminal Output Screenshot(s)

(screenshot not preserved)

Additional context

Note that I am migrating from v4.5.0 to 4.15.0 and I am getting similar Invalid index issues during cluster tear down. I have other helm releases in my project but this minimal, reproducible example is also exhibiting similar behavior. Examples of some errors that I am getting in my project are

╷
│ Error: Invalid index
│ 
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/aws-ebs-csi-driver/main.tf line 83, in module "irsa_addon":
│   83:   irsa_iam_policies                 = concat([aws_iam_policy.aws_ebs_csi_driver[0].arn], try(var.addon_config.additional_iam_policies, []))
│     ├────────────────
│     │ aws_iam_policy.aws_ebs_csi_driver is empty tuple
│ 
│ The given key does not identify an element in this collection value: the collection has no elements.
╵
╷
│ Error: Invalid index
│ 
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/aws-for-fluentbit/outputs.tf line 3, in output "cw_log_group_name":
│    3:   value       = var.create_cw_log_group ? aws_cloudwatch_log_group.aws_for_fluent_bit[0].name : local.log_group_name
│     ├────────────────
│     │ aws_cloudwatch_log_group.aws_for_fluent_bit is empty tuple
│ 
│ The given key does not identify an element in this collection value: the collection has no elements.
╵
╷
│ Error: Invalid index
│ 
│   on .terraform/modules/eks_blueprints_kubernetes_addons/modules/kubernetes-addons/aws-for-fluentbit/outputs.tf line 8, in output "cw_log_group_arn":
│    8:   value       = var.create_cw_log_group ? aws_cloudwatch_log_group.aws_for_fluent_bit[0].arn : null
│     ├────────────────
│     │ aws_cloudwatch_log_group.aws_for_fluent_bit is empty tuple
│ 
│ The given key does not identify an element in this collection value: the collection has no elements.

I am not sure how to recover from this situation.
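
A minimal sketch of one pattern that avoids this class of error, shown for the aws-ebs-csi-driver expression above. It is not taken from the module source: the create_policy variable, the resource arguments, and the policy body are hypothetical placeholders. The idea is to use a splat expression, which yields an empty list when the resource has no instances, instead of indexing [0]. It only illustrates the pattern; later comments in this thread trace the errors themselves to Terraform 1.3.4.

```hcl
# Hypothetical toggle; the real module may decide the count differently.
variable "create_policy" {
  type    = bool
  default = true
}

variable "addon_config" {
  type    = any
  default = {}
}

resource "aws_iam_policy" "aws_ebs_csi_driver" {
  count = var.create_policy ? 1 : 0

  # Placeholder name and policy body, for illustration only.
  name_prefix = "example-ebs-csi-"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["ec2:DescribeVolumes"]
      Resource = "*"
    }]
  })
}

locals {
  # [*] produces an empty list when no policy instance exists, so
  # concat() still succeeds where [0] would raise "Invalid index".
  irsa_iam_policies = concat(
    aws_iam_policy.aws_ebs_csi_driver[*].arn,
    try(var.addon_config.additional_iam_policies, [])
  )
}
```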

@armujahid armujahid changed the title terraform destroy issue terraform destroy Invalid index issue Nov 11, 2022
@ayeks
Contributor

ayeks commented Nov 15, 2022

I have the same problem: Error: Invalid index, reported for both "aws_security_group.cluster is empty tuple" and "aws_security_group.node is empty tuple".

❯ terraform -chdir=tf-aws-eks-blueprint destroy -var-file=../cluster-k8s-sbx/variables.tfvars -target="module.eks_blueprints"
module.eks_blueprints.data.aws_partition.current: Reading...
module.eks_blueprints.module.aws_eks.module.kms.data.aws_caller_identity.current: Reading...
module.eks_blueprints.module.aws_eks.module.kms.data.aws_partition.current: Reading...
data.aws_subnets.private: Reading...
module.eks_blueprints.data.aws_region.current: Reading...
module.eks_blueprints.module.aws_eks.data.aws_partition.current: Reading...
data.aws_vpc.selected: Reading...
module.eks_blueprints.module.aws_eks.data.aws_caller_identity.current: Reading...
module.eks_blueprints.data.aws_caller_identity.current: Reading...
module.eks_blueprints.data.aws_partition.current: Read complete after 0s [id=aws]
module.eks_blueprints.module.aws_eks.data.aws_partition.current: Read complete after 0s [id=aws]
module.eks_blueprints.data.aws_region.current: Read complete after 0s [id=eu-central-1]
module.eks_blueprints.module.aws_eks.module.kms.data.aws_partition.current: Read complete after 0s [id=aws]
module.eks_blueprints.module.aws_eks.data.aws_iam_policy_document.assume_role_policy[0]: Reading...
module.eks_blueprints.module.aws_eks.data.aws_iam_policy_document.assume_role_policy[0]: Read complete after 0s [id=XXXXXXXX]
data.aws_subnets.private: Read complete after 0s [id=eu-central-1]
data.aws_vpc.selected: Read complete after 0s [id=vpc-XXXXXXXX]
module.eks_blueprints.module.aws_eks.module.kms.data.aws_caller_identity.current: Read complete after 0s [id=XXXXXXXX]
module.eks_blueprints.data.aws_caller_identity.current: Read complete after 0s [id=XXXXXXXX]
module.eks_blueprints.module.aws_eks.data.aws_caller_identity.current: Read complete after 0s [id=XXXXXXXX]
module.eks_blueprints.data.aws_iam_session_context.current: Reading...
module.eks_blueprints.data.aws_iam_session_context.current: Read complete after 1s [id=arn:aws:sts::XXXXXXXX:assumed-role/XXXXXXXXXXXX]
module.eks_blueprints.data.aws_iam_policy_document.eks_key: Reading...
module.eks_blueprints.data.aws_iam_policy_document.eks_key: Read complete after 0s [id=XXXXXXXXXXXXX]
╷
│ Warning: Resource targeting is in effect
│ 
│ You are creating a plan with the -target option, which means that the result of this plan may not represent all of the changes requested by the current configuration.
│ 
│ The -target option is not for routine use, and is provided only for exceptional situations such as recovering from errors or mistakes, or when Terraform specifically suggests to use it as part of an error message.
╵
╷
│ Error: Invalid index
│ 
│   on .terraform/modules/eks_blueprints.aws_eks/main.tf line 125, in locals:
│  125:   cluster_security_group_id = local.create_cluster_sg ? aws_security_group.cluster[0].id : var.cluster_security_group_id
│     ├────────────────
│     │ aws_security_group.cluster is empty tuple
│ 
│ The given key does not identify an element in this collection value: the collection has no elements.
╵
╷
│ Error: Invalid index
│ 
│   on .terraform/modules/eks_blueprints.aws_eks/node_groups.tf line 59, in locals:
│   59:   node_security_group_id = local.create_node_sg ? aws_security_group.node[0].id : var.node_security_group_id
│     ├────────────────
│     │ aws_security_group.node is empty tuple
│ 
│ The given key does not identify an element in this collection value: the collection has no elements.
╵

After hitting the problem the first time, I destroyed TF resources manually until I reached this minimal set. The issue still persists:

❯ terraform -chdir=tf-aws-eks-blueprint state list
data.aws_ami.eks
data.aws_autoscaling_groups.eks_node_group
data.aws_availability_zones.available
data.aws_caller_identity.current
data.aws_eks_addon_version.default["kube-proxy"]
data.aws_eks_addon_version.latest["coredns"]
data.aws_eks_addon_version.latest["vpc-cni"]
data.aws_eks_cluster.cluster
data.aws_eks_cluster_auth.cluster
data.aws_iam_policy_document.argocd_reposerver_iam_policy_document
data.aws_iam_policy_document.aws_ebs_csi_driver_iam_policy_document
data.aws_iam_policy_document.aws_lb_iam_policy_document
data.aws_iam_policy_document.crossplane_iam_policy_document
data.aws_iam_policy_document.external_dns_iam_policy_document
data.aws_iam_policy_document.karpenter_controller_iam_policy_document
data.aws_partition.current
data.aws_region.current
data.aws_route53_zone.public-dns-zone
data.aws_security_groups.gitlab-forref
data.aws_subnets.private
data.aws_vpc.selected
data.kubectl_path_documents.crds_monitoring_coreos_com
aws_acm_certificate_validation.private-cluster-cert
aws_acm_certificate_validation.public-cluster-cert
aws_route53_record.private-cluster-cert["*.k8s.private.XXXXXXXXXXXX.com"]
aws_route53_record.public-cluster-cert["*.k8s.XXXXXXXXXXXX.com"]
module.eks_blueprints.data.aws_caller_identity.current
module.eks_blueprints.data.aws_partition.current
module.eks_blueprints.data.aws_region.current

Problem exists with module version: v4.12.2 and v4.16.0

TF provider plugin versions: latest

- Using previously-installed hashicorp/time v0.9.1
- Using previously-installed okta/okta v3.38.0
- Using previously-installed hashicorp/random v3.4.3
- Using previously-installed gavinbunney/kubectl v1.14.0
- Using previously-installed hashicorp/helm v2.7.1
- Using previously-installed hashicorp/cloudinit v2.2.0
- Using previously-installed terraform-aws-modules/http v2.4.1
- Using previously-installed hashicorp/kubernetes v2.15.0
- Using previously-installed hashicorp/aws v4.39.0
- Installing hashicorp/tls v4.0.4...
- Installed hashicorp/tls v4.0.4 (signed by HashiCorp)
- Using previously-installed hashicorp/local v2.2.3
- Using previously-installed hashicorp/null v3.2.0

Edit 2022-11-16:

After some more research it seems that this is rather a problem with the AWS EKS TF module: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/main.tf#L125

It looks related to this: terraform-aws-modules/terraform-aws-eks#568

The cluster is already deleted but still appears in the output of terraform state pull.

@timblaktu

timblaktu commented Nov 16, 2022

I started seeing this after a recent upgrade from 4.14.0 to 4.15.0 of terraform-aws-eks-blueprints.

The error I'm seeing is:

main  | 2022-11-16T06:37:24.404214100Z ╷
main  | 2022-11-16T06:37:24.404241500Z │ Error: Invalid index
main  | 2022-11-16T06:37:24.404244300Z │
main  | 2022-11-16T06:37:24.404245900Z │   on .terraform/modules/eks_blueprints.aws_eks/node_groups.tf line 59, in locals:
main  | 2022-11-16T06:37:24.404247800Z │   59:   node_security_group_id = local.create_node_sg ? aws_security_group.node[0].id : var.node_security_group_id
main  | 2022-11-16T06:37:24.404249700Z │     ├────────────────
main  | 2022-11-16T06:37:24.404251500Z │     │ aws_security_group.node is empty tuple
main  | 2022-11-16T06:37:24.404253500Z │
main  | 2022-11-16T06:37:24.404255100Z │ The given key does not identify an element in this collection value: the collection has no elements.
main  | 2022-11-16T06:37:24.404256800Z ╵
main  | 2022-11-16T06:37:24.805390800Z Releasing state lock. This may take a few moments...

EDIT: I see now that my mention of cluster SG is probably irrelevant here; the issue is related to the node security group.

In contrast with @armujahid's environment, I'm creating my own Security Group to pass to eks_blueprints.cluster_security_group_id, to work around the well-known "leaky ENI issue":

module "eks_blueprints" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints?ref=v4.15.0"
  cluster_name                            = local.name
  cluster_version                         = "1.23"
  vpc_id                                  = module.vpc.vpc_id
  private_subnet_ids                      = module.vpc.private_subnets
  create_cluster_security_group           = false
  cluster_security_group_id               = aws_security_group.eks.id
  .
  .
  .
}
# Cluster Security Group - manage ourselves to avoid:
#   - https://github.com/aws-ia/terraform-aws-eks-blueprints/issues/968
#   - https://github.com/terraform-aws-modules/terraform-aws-vpc/issues/283#issuecomment-914758128
#   - https://github.com/aws/amazon-vpc-cni-k8s/issues/1447
resource "aws_security_group" "eks" {
  name_prefix = local.name
  description = "EKS cluster security group."
  vpc_id = module.vpc.vpc_id
  tags = merge(
    local.tags,
    { "Name" = "${local.name}-eks_cluster_sg" },
  )
}

@timblaktu

timblaktu commented Nov 16, 2022

@ayeks @armujahid are you letting eks_blueprints create_cluster_security_group, or passing one in via cluster_security_group_id?

EDIT: I see now that cluster SG is probably irrelevant here; the issue is related to the node security group.

@ayeks what specifically are you trying to point out in the aws_eks module code and that closed issue?

@armujahid
Contributor Author

@timblaktu I am letting eks_blueprints create everything from scratch by passing nothing (since create_cluster_security_group is true by default). You can check the reproduction steps mentioned above.

@timblaktu

timblaktu commented Nov 16, 2022

@ayeks I see in the changeset that resolved aws-eks#568 (same issue, but for cluster sg, not node sg) they basically changed references to the cluster list to use cluster[*] instead of cluster[0] to be robust to the case where the list is empty (after a destroy).

So, it would seem the same needs to be done for aws_security_group.node in eks_blueprints.aws_eks/node_groups.tf.
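
The change described above can be sketched roughly as follows. This is an illustration under assumptions, not the terraform-aws-eks source: the variables and resource arguments are simplified placeholders named after the identifiers in the error output. try() and one() are Terraform built-in functions.

```hcl
# Placeholder inputs, loosely modelled on the names in the error message.
variable "create_node_security_group" {
  type    = bool
  default = true
}

variable "node_security_group_id" {
  type    = string
  default = ""
}

variable "vpc_id" {
  type = string
}

resource "aws_security_group" "node" {
  count = var.create_node_security_group ? 1 : 0

  name_prefix = "example-node-"
  vpc_id      = var.vpc_id
}

locals {
  # Form from the error output: raises "Invalid index" if the tuple is
  # empty while the flag is still true.
  # node_security_group_id = var.create_node_security_group ? aws_security_group.node[0].id : var.node_security_group_id

  # Tolerant form: try() returns the fallback "" when index 0 is missing.
  # one(aws_security_group.node[*].id) is an alternative that returns null.
  node_security_group_id = var.create_node_security_group ? try(aws_security_group.node[0].id, "") : var.node_security_group_id
}
```

Either form only hides the empty tuple; anything that consumes the local still has to accept an empty or null ID.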

However, this implies that this problem is long-standing (since that issue is a couple years old), yet I only started seeing the issue recently after upgrading from 4.14.0 to 4.15.0 of terraform-aws-eks-blueprints. I can see that the 4.14.0-->4.15.0 upgrade includes an upgrade of the aws_eks module from 18.26.6-->18.29.1.

Tracing through the terraform-aws-eks changelog, I see that there was a bugfix in the penultimate release 18.30.2, that addressed this issue, which seems relevant because the changeset touches node security groups and cluster security group rules.

Since I'm already using a fork of eks-blueprints to work around other issues, I'm going to upgrade to terraform-aws-eks 18.30.2 as a simple test.

timblaktu pushed a commit to timblaktu/terraform-aws-eks-blueprints that referenced this issue Nov 16, 2022
@timblaktu

Latest module version didn't fix it for me, so I created this fork/branch with the fix that I think is appropriate. Testing now.

@ayeks
Contributor

ayeks commented Nov 17, 2022

@timblaktu thanks for the detailed response and investigation! I let eks_blueprints create the security groups by using the default true for the variable create_cluster_security_group.

However, I pass the SG ID to module eks_blueprints_kubernetes_addons as defined here:

eks_worker_security_group_id = module.eks_blueprints.worker_node_security_group_id

module "eks_blueprints_kubernetes_addons" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons?ref=v4.16.0"
  eks_cluster_id = module.eks_blueprints.eks_cluster_id
  eks_worker_security_group_id = module.eks_blueprints.worker_node_security_group_id
  ..

I also store the SG ID in an AWS Secrets Manager secret, but this object is already deleted in AWS and no longer exists in the TF state, so I think it's not relevant.

resource "aws_secretsmanager_secret_version" "k8s-generic-cluster-secret" {
  secret_id = aws_secretsmanager_secret.k8s-generic-cluster-secret.id
  secret_string = jsonencode({
    "karpenter-endpoint": data.aws_eks_cluster.cluster.endpoint,
    "gitlab-runner-sg": "${data.aws_eks_cluster.cluster.vpc_config[0].cluster_security_group_id},${data.aws_security_groups.gitlab-forref.ids[0]}"
  })
}

I deleted everything and deployed the cluster again to try out your change: https://github.com/timblaktu/terraform-aws-eks/tree/568-redux

In addition to your changes I had to replace this line in main.tf locals to get rid of the aws_security_group.cluster is empty tuple error:

locals {
  cluster_sg_name   = coalesce(var.cluster_security_group_name, "${var.cluster_name}-cluster")
  create_cluster_sg = local.create && var.create_cluster_security_group

  # Original line, which fails when aws_security_group.cluster is an empty tuple:
  # cluster_security_group_id = local.create_cluster_sg ? aws_security_group.cluster[0].id : var.cluster_security_group_id
  # Replacement: the splat plus coalescelist yields "" instead of raising Invalid index.
  cluster_security_group_id = local.create_cluster_sg ? coalescelist(aws_security_group.cluster[*].id, [""])[0] : var.cluster_security_group_id
}

But now the eks_blueprints module complains because the outputs of aws_eks are empty. The cluster is already deleted, but somehow that is not recognized by eks_blueprints:

❯ terraform -chdir=tf-aws-eks-blueprint destroy -var-file=../cluster-k8s-sbx/variables.tfvars -target="module.eks_blueprints" -auto-approve
data.aws_subnets.private: Reading...
module.eks_blueprints.module.aws_eks.module.kms.data.aws_partition.current: Reading...
module.eks_blueprints.data.aws_region.current: Reading...
module.eks_blueprints.module.aws_eks.data.aws_partition.current: Reading...
data.aws_vpc.selected: Reading...
module.eks_blueprints.data.aws_partition.current: Reading...
module.eks_blueprints.module.aws_eks.module.kms.data.aws_caller_identity.current: Reading...
module.eks_blueprints.module.aws_eks.data.aws_caller_identity.current: Reading...
module.eks_blueprints.module.aws_eks.data.aws_partition.current: Read complete after 0s [id=aws]
module.eks_blueprints.data.aws_region.current: Read complete after 0s [id=eu-central-1]
module.eks_blueprints.module.aws_eks.module.kms.data.aws_partition.current: Read complete after 0s [id=aws]
module.eks_blueprints.data.aws_partition.current: Read complete after 0s [id=aws]
module.eks_blueprints.data.aws_caller_identity.current: Reading...
module.eks_blueprints.module.aws_eks.data.aws_iam_policy_document.assume_role_policy[0]: Reading...
module.eks_blueprints.module.aws_eks.data.aws_iam_policy_document.assume_role_policy[0]: Read complete after 0s [id=276XXXXXXXX]
data.aws_subnets.private: Read complete after 0s [id=eu-central-1]
data.aws_vpc.selected: Read complete after 0s [id=vpc-0adXXXXXXXXXXXXXX]
module.eks_blueprints.module.aws_eks.data.aws_caller_identity.current: Read complete after 0s [id=4842XXXXXXX]
module.eks_blueprints.data.aws_caller_identity.current: Read complete after 0s [id=4842XXXXXXXX]
module.eks_blueprints.module.aws_eks.module.kms.data.aws_caller_identity.current: Read complete after 0s [id=4842XXXXXXXX]
module.eks_blueprints.data.aws_iam_session_context.current: Reading...
module.eks_blueprints.data.aws_iam_session_context.current: Read complete after 1s [id=arn:aws:sts::48XXXXXXXXXX:assumed-role/XXXXXXXXXXXXXXXXXXXXXXXXXX]
module.eks_blueprints.data.aws_iam_policy_document.eks_key: Reading...
module.eks_blueprints.data.aws_iam_policy_document.eks_key: Read complete after 0s [id=1297XXXXXXXXXX]
module.eks_blueprints.module.kms[0].aws_kms_key.this: Refreshing state... [id=f4XXXXXXXXXXXXXXX]
╷
│ Warning: Resource targeting is in effect
│ 
│ You are creating a plan with the -target option, which means that the result of this plan may not represent all of the changes requested by the current configuration.
│ 
│ The -target option is not for routine use, and is provided only for exceptional situations such as recovering from errors or mistakes, or when Terraform specifically suggests to use it as part of an error message.
╵
╷
│ Error: "name" length must be between 1-100 characters: ""
│ 
│   with data.aws_eks_cluster.cluster,
│   on data.tf line 8, in data "aws_eks_cluster" "cluster":
│    8:   name = module.eks_blueprints.eks_cluster_id
│ 
╵
╷
│ Error: "name" doesn't comply with restrictions ("^[0-9A-Za-z][A-Za-z0-9\\-_]+$"): ""
│ 
│   with data.aws_eks_cluster.cluster,
│   on data.tf line 8, in data "aws_eks_cluster" "cluster":
│    8:   name = module.eks_blueprints.eks_cluster_id
│ 
╵
╷
│ Error: name must not be empty, got 
│ 
│   with data.aws_eks_cluster_auth.cluster,
│   on data.tf line 12, in data "aws_eks_cluster_auth" "cluster":
│   12:   name = module.eks_blueprints.eks_cluster_id
│ 
╵
╷
│ Error: Invalid URL
│ 
│   with module.eks_blueprints.module.aws_eks.data.tls_certificate.this[0],
│   on .terraform/modules/eks_blueprints.aws_eks/main.tf line 206, in data "tls_certificate" "this":
│  206:   url = coalescelist(aws_eks_cluster.this[*].identity[0].oidc[0].issuer, [""])[0]
│ 
│ URL "" contains no host
╵
╷
│ Error: "name" length must be between 1-100 characters: ""
│ 
│   with module.eks_blueprints.data.aws_eks_cluster.cluster[0],
│   on .terraform/modules/eks_blueprints/data.tf line 7, in data "aws_eks_cluster" "cluster":
│    7:   name  = module.aws_eks.cluster_id
│ 
╵
╷
│ Error: "name" doesn't comply with restrictions ("^[0-9A-Za-z][A-Za-z0-9\\-_]+$"): ""
│ 
│   with module.eks_blueprints.data.aws_eks_cluster.cluster[0],
│   on .terraform/modules/eks_blueprints/data.tf line 7, in data "aws_eks_cluster" "cluster":
│    7:   name  = module.aws_eks.cluster_id
│ 
╵

I am not sure if this is the right direction. I will leave the TF in this broken state, so that I can test additional things.


Edit:

I just rolled back all changes and deployed an older state to a new cluster, which works flawlessly - including destroy.

Here is the full list of versions:

Component                     Failing Destroy    Working Destroy
terraform-aws-eks-blueprints  v4.16.0 (v4.12.2)  v4.12.2
hashicorp/null                v3.2.0             v3.1.1
okta/okta                     v3.38.0
hashicorp/helm                v2.7.1             v2.7.0
hashicorp/local               v2.2.3             v2.2.3
hashicorp/cloudinit           v2.2.0             v2.2.0
hashicorp/kubernetes          v2.15.0            v2.14.0
hashicorp/time                v0.9.1             v0.9.0
hashicorp/tls                 v4.0.4             v3.4.0
terraform-aws-modules/http    v2.4.1             v2.4.1
gavinbunney/kubectl           v1.14.0            v1.14.0
hashicorp/aws                 v4.39.0            v4.34.0
hashicorp/random              v3.4.3

I am now introducing the changes step by step to identify the source. This takes quite some time.

@timblaktu

@ayeks thanks for testing and reporting! I can confirm I'm getting the same behavior as you.

Can you confirm whether you are using terraform 1.3.4? After playing whack-a-mole on this problem for a couple days and still finding it popping up even in modules I haven't changed, I have come to realize that much of this is a known issue in terraform 1.3.4. See my comment for details.

@ayeks
Contributor

ayeks commented Nov 17, 2022

@timblaktu Oh my god. I just switched to terraform 1.3.3 from 1.3.4 and ran the destroy command locally without any issues. If you ever come to Nuremberg, I will definitely buy you a beer for that hint. Thank you so much!

I will run our end to end testing pipeline with the pinned version, just to make sure that I have not missed anything. What a crazy bug. Dependency hell at its finest.

Edit: end2end test worked fine. I think this issue can be closed. Thanks again!
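
The workaround reported here (staying off Terraform 1.3.4) could be enforced in configuration with a version constraint. A minimal sketch, assuming the project otherwise targets the 1.3.x series; the exact range is an assumption and should be adjusted to the releases you actually test against:

```hcl
terraform {
  # Allow 1.3.x releases but refuse 1.3.4, the release this thread ties
  # to the destroy-time "Invalid index" errors.
  required_version = "~> 1.3.0, != 1.3.4"
}
```

With this in place, terraform init and later commands fail fast with a version error on 1.3.4 instead of failing midway through a destroy.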

@bryantbiggs
Contributor

it looks like fixes have been made to Terraform core now:

@armujahid
Contributor Author

Awesome. Let me test again using Terraform 1.3.5 (released with these 2 fixes) and terraform-aws-eks-blueprints v4.17.0

@armujahid
Contributor Author

Now I am getting different errors (Error: Unsupported attribute) on the final terraform destroy

Terraform v1.3.5
on linux_amd64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.40.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.7.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.16.0
+ provider registry.terraform.io/hashicorp/local v2.2.3
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1

terraform-aws-eks-blueprints v4.17.0
Example: same (examples/karpenter)

module.eks_blueprints_kubernetes_addons.module.aws_ebs_csi_driver[0].data.aws_eks_addon_version.this: Read complete after 2s [id=aws-ebs-csi-driver]
╷
│ Error: Unsupported attribute
│ 
│   on main.tf line 21, in provider "kubectl":
│   21:   host                   = module.eks_blueprints.eks_cluster_endpoint
│     ├────────────────
│     │ module.eks_blueprints is object with 12 attributes
│ 
│ This object does not have an attribute named "eks_cluster_endpoint".
╵
╷
│ Error: Unsupported attribute
│ 
│   on main.tf line 22, in provider "kubectl":
│   22:   cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
│     ├────────────────
│     │ module.eks_blueprints is object with 12 attributes
│ 
│ This object does not have an attribute named "eks_cluster_certificate_authority_data".

@timblaktu

timblaktu commented Dec 1, 2022

@armujahid I'm using eks_blueprints as well and saw those same errors on my first destroy sequence following the upgrade to 1.3.5. FWIW, those went away after I (in an unknown order):

  • upgraded to today's terraform 1.3.6
  • ran terraform destroy sequence again

Since then, I have run my end-to-end lifecycle pipeline (apply sequence, some testing, destroy sequence) a few times with no errors.

I should note here that before @jbardin merged the PR that closed the 1.3.4 destroy refresh issue I linked earlier, he had recommended adding -refresh=true to all terraform destroy calls to avoid the issue. It's unclear to me whether this workaround is still necessary, but for now things seem to be working better for me with 1.3.6 and the extra -refresh flags.

p.s. @ayeks thanks for the offer - may take you up on that sometime - my wife and son just got their German citizenship and we're considering moving to Bavaria in a couple years. :-)

@armujahid
Contributor Author

armujahid commented Dec 1, 2022

Tested again by creating and destroying a fresh cluster using Terraform v1.3.6

Terraform v1.3.6
on linux_amd64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v4.44.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/helm v2.7.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.16.0
+ provider registry.terraform.io/hashicorp/local v2.2.3
+ provider registry.terraform.io/hashicorp/null v3.2.1
+ provider registry.terraform.io/hashicorp/random v3.4.3
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1

terraform-aws-eks-blueprints v4.17.0
Example: same (examples/karpenter)
Got same error "Error: Unsupported attribute" on last terraform destroy -auto-approve -refresh

module.eks_blueprints_kubernetes_addons.module.aws_ebs_csi_driver[0].data.aws_eks_addon_version.this: Read complete after 1s [id=aws-ebs-csi-driver]
╷
│ Error: Unsupported attribute
│ 
│   on main.tf line 21, in provider "kubectl":
│   21:   host                   = module.eks_blueprints.eks_cluster_endpoint
│     ├────────────────
│     │ module.eks_blueprints is object with 12 attributes
│ 
│ This object does not have an attribute named "eks_cluster_endpoint".
╵
╷
│ Error: Unsupported attribute
│ 
│   on main.tf line 22, in provider "kubectl":
│   22:   cluster_ca_certificate = base64decode(module.eks_blueprints.eks_cluster_certificate_authority_data)
│     ├────────────────
│     │ module.eks_blueprints is object with 12 attributes
│ 
│ This object does not have an attribute named "eks_cluster_certificate_authority_data".
╵

Should I create a new issue? The Invalid index issue has not occurred since Terraform v1.3.5, and this seems like a different issue.

@timblaktu

@armujahid I agree that that seems like a separate issue from the original indexing issue.

@armujahid
Contributor Author

Created a separate issue for "Error: Unsupported attribute". Closing since original issue is no longer reproducible. Thanks everyone :)
