
Data source for current AWS partition breaks modules #1753

Closed
watsonjm opened this issue Jan 7, 2022 · 30 comments

@watsonjm

watsonjm commented Jan 7, 2022

Description

Please provide a clear and concise description of the issue you are encountering, your current setup, and what steps led up to the issue. If you can provide a reproduction, that will help tremendously.

⚠️ Note

Before you submit an issue, please perform the following first:

  1. Remove the local .terraform directory (ONLY if state is stored remotely, which is hopefully the best practice you are following!): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check if the issue still persists

Versions

Terraform v1.1.2
on linux_amd64

  • provider registry.terraform.io/hashicorp/aws v3.71.0
  • provider registry.terraform.io/hashicorp/cloudinit v2.2.0
  • provider registry.terraform.io/hashicorp/helm v2.2.0
  • provider registry.terraform.io/hashicorp/http v2.1.0
  • provider registry.terraform.io/hashicorp/local v2.1.0
  • provider registry.terraform.io/hashicorp/null v3.1.0
  • provider registry.terraform.io/hashicorp/random v3.1.0
  • provider registry.terraform.io/hashicorp/template v2.2.0
  • provider registry.terraform.io/hashicorp/tls v3.1.0
  • Module: 18.0.4

Reproduction

Steps to reproduce the behavior:

No workspaces are used.

I have cleared the local cache.

Steps to reproduce:
terraform init
terraform plan

Code Snippet to Reproduce

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "18.0.4"

  cluster_name                    = local.eks_cluster_name
  cluster_version                 = "1.21"
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true
  vpc_id                          = module.vpc.0.vpc_id
  subnet_ids                      = local.private_subnets_ids
  cluster_enabled_log_types       = ["api", "audit", "authenticator", "controllerManager", "scheduler"]
  cluster_security_group_name     = "${local.name_tag}-eks-cluster"
  cluster_service_ipv4_cidr       = "10.100.0.0/16"
  cluster_addons = {
    coredns = {
      resolve_conflicts = "OVERWRITE"
    }
    kube-proxy = {}
    vpc-cni = {
      resolve_conflicts = "OVERWRITE"
    }
  }

  # cluster_encryption_config = [{
  #   provider_key_arn = "ac01234b-00d9-40f6-ac95-e42345f78b00"
  #   resources        = ["secrets"]
  # }]

  eks_managed_node_groups = {
    blue = {}
    green = {
      min_size     = 1
      max_size     = 10
      desired_size = 1

      instance_types = ["t3.large"]
      capacity_type  = "SPOT"
      labels = {
        Environment = "test"
        GithubRepo  = "terraform-aws-eks"
        GithubOrg   = "terraform-aws-modules"
      }
      taints = {
        dedicated = {
          key    = "dedicated"
          value  = "gpuGroup"
          effect = "NO_SCHEDULE"
        }
      }
      tags = {
        ExtraTag = "example"
      }
    }
  }
}

Expected behavior

The terraform plan should be successful

Actual behavior

Terraform plan fails with 3 versions of this error (one for the cluster and one per node group):

│ Error: Invalid for_each argument
│
│   on .terraform/modules/eks/modules/eks-managed-node-group/main.tf line 427, in resource "aws_iam_role_policy_attachment" "this":
│  427:   for_each = var.create && var.create_iam_role ? toset(compact(distinct(concat([
│  428:     "${local.policy_arn_prefix}/AmazonEKSWorkerNodePolicy",
│  429:     "${local.policy_arn_prefix}/AmazonEC2ContainerRegistryReadOnly",
│  430:     "${local.policy_arn_prefix}/AmazonEKS_CNI_Policy",
│  431:   ], var.iam_role_additional_policies)))) : toset([])
│     ├────────────────
│     │ local.policy_arn_prefix is a string, known only after apply
│     │ var.create is true
│     │ var.create_iam_role is true
│     │ var.iam_role_additional_policies is empty list of string
│
│ The "for_each" value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how many instances will be created. To work around this, use the -target argument to  
│ first apply only the resources that the for_each depends on.

Terminal Output Screenshot(s)

Additional context

There is one error for the cluster and one for each node group. The error comes from this line of code:
policy_arn_prefix = "arn:${data.aws_partition.current.partition}:iam::aws:policy"
Changing it in .terraform/modules/eks/main.tf line 173 and .terraform/modules/eks/modules/eks-managed-node-group/main.tf line 393 to policy_arn_prefix = "arn:aws:iam::aws:policy" gives a successful terraform plan. I've run into this before and had to make the partition a variable, passing in the value from data "aws_partition" "current" {} when creating other modules. For whatever reason, the data source is not evaluated during a plan if it is inside a module.
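The variable-based workaround described above could look roughly like this inside a child module (a hypothetical sketch; the partition variable name is illustrative and not part of this module's actual interface):

```hcl
# Child module: accept the partition as a plain string input instead of
# looking it up with a data source inside the module.
variable "partition" {
  description = "AWS partition, e.g. aws, aws-cn, aws-us-gov"
  type        = string
  default     = "aws"
}

locals {
  # Known at plan time because var.partition is a plain string.
  policy_arn_prefix = "arn:${var.partition}:iam::aws:policy"
}
```

The root module would then pass partition = data.aws_partition.current.partition; since the data source lives in the root with no depends_on attached, its value is resolved during plan.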

@blima-medicallyhome

Same problem here

@bryantbiggs
Member

What region is being used? Just stating that the partition breaks doesn't help much - this is what this data source was intended for, and it is used in a similar fashion in other modules. If we could get more information, maybe we can help track down why it's not working for you all but works for others.

@watsonjm
Author

watsonjm commented Jan 10, 2022

Hello, I'm just using the default "aws" partition in us-east-1. The lookup itself doesn't work when it's in a module, it works fine in the rest of my code but doesn't work in any modules.

@bryantbiggs
Member

Hello, I'm just using the default "aws" partition. The lookup itself doesn't work when it's in a module, it works fine in the rest of my code but doesn't work in any modules.

Again, this doesn't tell us anything - the use of this data source is well known, and it is used throughout our modules as well as in testing the AWS provider.

@watsonjm
Author

Is there anything specific I can provide that hasn't been provided yet in order to help figure out the issue?

@bryantbiggs
Member

what region is being used?

but I can see you edited above now to add us-east-1

we will leave this issue open since it seems others are 👍🏽 - but I would suggest filing a bug ticket with the upstream AWS provider (note - they will want a way to reproduce the error)

@watsonjm
Author

what region is being used?

but I can see you edited above now to add us-east-1

we will leave this issue open since it seems others are 👍🏽 - but I would suggest filing a bug ticket with the upstream AWS provider (note - they will want a way to reproduce the error)

Apologies, I added in the region when I re-read it properly after posting.

I was trying to re-create my code with only relevant pieces to submit a bug ticket with the upstream provider, and the issue is caused by a depends_on that I have in the module. Taking the depends_on out fixes the issue! Thank you for your time and for taking a look at this.

@bryantbiggs
Member

Awesome - glad it's resolved! I've never seen that data source not work and it had me stumped 😅

@lure

lure commented Jan 10, 2022

@bryantbiggs would you kindly explain why this code

resource "aws_iam_policy" "worker_ingress_enabled_policy" {
  name        = "${local.cluster_name}-ingress-worker"
  description = "Worker policy for the ALB Ingress"

  policy = file("iam-ingress-worker.json")
  tags   = local.tags
}

module "eks" {
  # ...
  eks_managed_node_groups = {
    green = {
      iam_role_additional_policies = [aws_iam_policy.worker_ingress_enabled_policy.arn]
    }
  }
}

results in

Acquiring state lock. This may take a few moments...
╷
│ Error: Invalid for_each argument
│
│   on .terraform/modules/eks/modules/eks-managed-node-group/main.tf line 427, in resource "aws_iam_role_policy_attachment" "this":
│  427:   for_each = var.create && var.create_iam_role ? toset(compact(distinct(concat([
│  428:     "${local.policy_arn_prefix}/AmazonEKSWorkerNodePolicy",
│  429:     "${local.policy_arn_prefix}/AmazonEC2ContainerRegistryReadOnly",
│  430:     "${local.policy_arn_prefix}/AmazonEKS_CNI_Policy",
│  431:   ], var.iam_role_additional_policies)))) : toset([])
│     ├────────────────

Is there any workaround to use the policies created in the same module?
Thank you.

@bryantbiggs
Member

@lure I suspect it's due to hashicorp/terraform#4149 - could you try creating the policy first and then creating the cluster with the additional policy? I suspect this might pass without issue
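The create-the-policy-first approach can be sketched with a targeted apply (the resource address comes from the snippet above; adjust it to your configuration):

```shell
# Phase 1: create only the IAM policy so its ARN becomes known state.
terraform apply -target=aws_iam_policy.worker_ingress_enabled_policy

# Phase 2: full apply; the for_each keys can now be resolved at plan time.
terraform apply
```

Terraform will warn that targeted applies are incomplete; this is meant as a one-time workaround, not a routine workflow.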

@jeroen-s

@bryantbiggs I am of the opinion that this is still an issue as it means that it is now no longer possible to use depends_on with this module.

@bryantbiggs
Member

@bryantbiggs I am of the opinion that this is still an issue as it means that it is now no longer possible to use depends_on with this module.

I would suggest filing an issue with the upstream Terraform repository for this - as far as I am aware, there are no restrictions on using a data source with a module depends_on block and I think this is a bug somewhere in the Terraform evaluation logic

@jeroen-s

The problem is that the result of the data source is used in a for_each. The data source is not evaluated until the dependent resource is created, which causes the problem described in the issue.
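A minimal sketch of this failure mode (the null_resource is illustrative; any module-level depends_on triggers it): a depends_on on the module call defers every data source inside the module, including data.aws_partition.current, until apply, so for_each keys built from it are unknown at plan time.

```hcl
resource "null_resource" "wait_for_something" {}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "18.0.4"
  # ... required arguments elided ...

  # This single line defers ALL data sources inside the module to
  # apply time, producing the "Invalid for_each argument" error.
  depends_on = [null_resource.wait_for_something]
}
```

Removing the depends_on, or expressing the dependency through an input argument instead, lets the data sources resolve during plan.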

@paulbsch
Contributor

what region is being used?

but I can see you edited above now to add us-east-1
we will leave this issue open since it seems others are 👍🏽 - but I would suggest filing a bug ticket with the upstream AWS provider (note - they will want a way to reproduce the error)

Apologies, I added in the region when I re-read it properly after posting.

I was trying to re-create my code with only relevant pieces to submit a bug ticket with the upstream provider, and the issue is caused by a depends_on that I have in the module. Taking the depends_on out fixes the issue! Thank you for your time and for taking a look at this.

Removing depends_on worked for me initially, but on subsequent runs of 'terraform apply', the issue returned.

@jeroen-s

@paulbsch are you using iam_role_additional_policies by any chance? specifying aws_iam_policy.your_policy.arn there leads to the same type of problem

@paulbsch
Contributor

@jeroen-s I am using that, however, I just tried removing it and the issue still persists. Currently, I can't even run a 'terraform destroy' to remove my cluster as the issue shows up when doing that also.

@bryantbiggs
Member

so after digging into this, it is in fact due to hashicorp/terraform#4149 and not the data source

there are two ways that I found to work around this:

  1. terraform apply --target <your policy> and then terraform apply => create the dependent resources before the cluster
  2. Attach the additional policies externally, see below

(truncated for brevity)

module "eks" {
  eks_managed_node_groups = {
  ...
  }
}

resource "aws_iam_policy" "node_additional" {
  name        = "${local.name}-additional"
  description = "Example usage of node additional policy"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "ec2:Describe*",
        ]
        Effect   = "Allow"
        Resource = "*"
      },
    ]
  })

  tags = local.tags
}

resource "aws_iam_role_policy_attachment" "additional" {
  for_each = module.eks.eks_managed_node_groups

  policy_arn = aws_iam_policy.node_additional.arn
  role       = each.value.iam_role_arn
}

@bryantbiggs
Member

I've opened a PR to add a clear notification to the README regarding this issue https://github.com/bryantbiggs/terraform-aws-eks/tree/fix/remote-access-sgs#%E2%84%B9%EF%B8%8F-error-invalid-for_each-argument-

@jeroen-s

jeroen-s commented Jan 12, 2022

Just to be clear, there are two separate problems discussed in this issue:

  • for_each error regarding var.iam_role_additional_policies when using aws_iam_policy.your_policy.arn in iam_role_additional_policies
  • for_each error regarding local.policy_arn_prefix when using depends_on = [anything] in the module

Both are due to hashicorp/terraform#4149. The first one has nothing to do with the data source and can be handled as @bryantbiggs showed above. The second one is what I think the original issue is about, and what I was referring to with:

The problem is that the result of the data source is used in a for_each. The data source is not evaluated until the dependent resource is created, which causes the problem described in the issue.

@ulm0

ulm0 commented Feb 21, 2022

  • for_each error regarding local.policy_arn_prefix when using depends_on = [anything] in the module

Moreover, it happens wherever anything within a for_each loop references data.aws_partition.current.partition, such as policy_arn_prefix or iam_role_policy_prefix.

@FernandoMiguel
Contributor

FernandoMiguel commented Feb 24, 2022

disregard my comment.
kubernetes_config_map has a weird dependency

@CSimpiFoN

Why is this ticket closed? This is still an issue. If you add depends_on = [whatever], it still breaks the setup.
For example, I wanted depends_on = [module.launch_template.template] because the node groups need the launch template before deployment, and the module fails as described above.

@fletcher-malone

Bump

@mohnishbasha

Bump - I am running into this issue as well for the scenarios above:

  • for_each with depends_on
  • for_each due to data partitions use with local.policy_arn_prefix

@Reasonably

Reasonably commented May 16, 2022

I think this issue or #1891 should be reopened. This is still an issue. As @jeroen-s said, there are two issues. The first one can be avoided, and the guide is merged. But the second one is not resolved and can't be avoided.

@chasezheng

Can I suggest a solution?

The offending data source aws_partition simply returns a string: "aws" vs "aws-cn". It's not an error that this cannot be determined until apply. To work around it, we could simply take the partition from a variable.

@mmerickel

Just ran into this when creating a new cluster from Terraform config that worked on older versions of the EKS module... This is broken in the module itself, and should be considered a bug and fixed. In my example I am supplying no additional IAM policies - note var.iam_role_additional_policies is empty list of string in the error below:

│ Error: Invalid for_each argument
│
│   on .terraform/modules/eks.eks/main.tf line 257, in resource "aws_iam_role_policy_attachment" "this":
│  257:   for_each = local.create_iam_role ? toset(compact(distinct(concat([
│  258:     "${local.policy_arn_prefix}/AmazonEKSClusterPolicy",
│  259:     "${local.policy_arn_prefix}/AmazonEKSVPCResourceController",
│  260:   ], var.iam_role_additional_policies)))) : toset([])
│     ├────────────────
│     │ local.create_iam_role is true
│     │ local.policy_arn_prefix is a string, known only after apply
│     │ var.iam_role_additional_policies is empty list of string
│
│ The "for_each" value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how many instances will be created. To work around this, use the -target argument to first apply only the resources that
│ the for_each depends on

@archoversight

Can I suggest a solution?

The offending data source aws_partition simply returns a string: "aws" vs "aws-cn". It's not an error that this cannot be determined until apply. To work around it, we could simply take the partition from a variable.

It could also return aws-us-gov for GovCloud users, as well as the ISO partitions. It's not just aws and aws-cn.

@watsonjm
Author

watsonjm commented Jun 15, 2022

Correct, but the user could pass in their own partition = data.aws_partition.current.partition value as a variable instead of being forced to use the module's internal lookup, and it would achieve the same result without the current issue.
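A root-side sketch of that idea (the module path and variable name are hypothetical, not part of the EKS module's actual interface):

```hcl
# Root module: this data source has no depends_on, so its value is
# resolved during plan and can be passed down as a plain string.
data "aws_partition" "current" {}

module "example" {
  source    = "./modules/example"
  partition = data.aws_partition.current.partition
}
```

This way the value feeding the module's for_each expressions is a plain string that is already known at plan time.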

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 10, 2022