incorrect eks service principal for AWS China #1904

Closed
yuan-fei opened this issue Feb 28, 2022 · 10 comments · Fixed by #1905

Comments

@yuan-fei

Description

When creating the EKS cluster IAM role in an AWS China account, AWS complains that the service principal is incorrect.

Error: failed creating IAM Role (wood-cluster-20220228072223989200000001): MalformedPolicyDocument: Invalid principal in policy: "SERVICE":"eks.amazonaws.com.cn"
│       status code: 400, request id: 4f866481-fd28-4baf-b433-447c82ca4e09
│ 
│   with module.eks.aws_iam_role.this[0],
│   on ../../main.tf line 191, in resource "aws_iam_role" "this":
│  191: resource "aws_iam_role" "this" {
│

The EKS service principal should be eks.amazonaws.com for AWS China as well (see the AWS China EKS doc).

The EKS service principal is specified in the Terraform code here.
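
For context, a common pattern (and, as an assumption, roughly what the module appears to do rather than its exact code) is to derive the principal's DNS suffix from the current partition, which resolves to amazonaws.com.cn in the aws-cn partition. A minimal sketch of how the rejected principal can end up in the trust policy:

data "aws_partition" "current" {}

data "aws_iam_policy_document" "cluster_assume_role_policy" {
  statement {
    sid     = "EKSClusterAssumeRole"
    actions = ["sts:AssumeRole"]

    principals {
      type = "Service"
      # data.aws_partition.current.dns_suffix is "amazonaws.com.cn" in the
      # aws-cn partition, producing "eks.amazonaws.com.cn" - the principal
      # that IAM rejects in the error above.
      identifiers = ["eks.${data.aws_partition.current.dns_suffix}"]
    }
  }
}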

⚠️ Note

Before you submit an issue, please perform the following first:

  1. Remove the local .terraform directory (ONLY if state is stored remotely, which is the recommended best practice): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check if the issue still persists

Versions

  • Terraform: Terraform v1.0.11 on darwin_amd64
  • Provider(s):
provider registry.terraform.io/hashicorp/aws v4.2.0
provider registry.terraform.io/hashicorp/cloudinit v2.2.0
provider registry.terraform.io/hashicorp/tls v3.1.0

Reproduction

Steps to reproduce the behavior:

  1. Create the test code as below
  2. Run terraform apply

Code Snippet to Reproduce

provider "aws" {
  region                  = var.region
  shared_credentials_file = var.shared_credentials_file
}

locals {
  cluster_name                          = "wood"
  cluster_version                       = var.k8s_version
  cluster_additional_security_group_ids = var.cluster_additional_security_group_ids
  tags = {
    Example    = local.cluster_name
    GithubRepo = "terraform-aws-eks"
    GithubOrg  = "terraform-aws-modules"
  }

  vpc = {
    vpc_id          = var.vpc_id
    private_subnets = var.private_subnet_ids
    subnets         = var.subnet_ids
  }


  node = {
    instance_types         = var.instance_types
    node_role_arn          = var.node_role_arn
    vpc_security_group_ids = var.vpc_security_group_ids
    disk_size              = var.disk_size
    remote_access = {
      ec2_ssh_key               = var.node_ssh_key_name
      source_security_group_ids = var.remote_access_source_security_group_ids
    }
  }

}

data "aws_caller_identity" "current" {}

################################################################################
# EKS Module
################################################################################

module "eks" {
  source                          = "../.."
  create                          = true
  cluster_name                    = local.cluster_name
  cluster_version                 = local.cluster_version
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true
  create_cloudwatch_log_group     = false

  cluster_addons = {
    coredns = {
      resolve_conflicts = "OVERWRITE"
    }
    kube-proxy = {}
    vpc-cni = {
      resolve_conflicts = "OVERWRITE"
    }
  }

  vpc_id                                = local.vpc.vpc_id
  subnet_ids                            = local.vpc.private_subnets
  cluster_additional_security_group_ids = local.cluster_additional_security_group_ids
  create_cluster_security_group         = false
  create_node_security_group            = false

  eks_managed_node_group_defaults = {
    ami_type = "AL2_x86_64"
    # ami_id         = data.aws_ami.eks_default.image_id
    disk_size      = 50
    instance_types = ["c5.large", "c6i.large", "c6d.large"]

    # We are using the IRSA created below for permissions
    iam_role_attach_cni_policy = false
  }

  eks_managed_node_groups = {

    # Complete
    complete = {
      name            = "complete-eks-mng"
      use_name_prefix = true
      ami_type        = "AL2_x86_64"

      # We are using the IRSA created below for permissions
      iam_role_attach_cni_policy = false
      subnet_ids                 = local.vpc.private_subnets

      min_size     = 1
      max_size     = 2
      desired_size = 1


      enable_bootstrap_user_data = false
      pre_bootstrap_user_data    = <<-EOT
#!/bin/bash
# config dhcpconfig
service network restart
      EOT
      #   bootstrap_extra_args       = "--kubelet-extra-args '--max-pods=20'"

      #   user_data_template_path = "../change_resolv.tpl"

      capacity_type        = "ON_DEMAND"
      disk_size            = local.node.disk_size
      force_update_version = true
      labels = {
        GithubRepo = "terraform-aws-eks"
        GithubOrg  = "terraform-aws-modules"
      }


      update_config = {
        max_unavailable_percentage = 10 # or set `max_unavailable`
      }

      description = "EKS managed node group example launch template"

      ebs_optimized           = true
      vpc_security_group_ids  = local.node.vpc_security_group_ids
      disable_api_termination = false
      enable_monitoring       = true

      block_device_mappings = {
        xvda = {
          device_name = "/dev/xvda"
          ebs = {
            volume_size           = 75
            volume_type           = "gp3"
            iops                  = 3000
            throughput            = 150
            delete_on_termination = true
          }
        }
      }

      metadata_options = {
        http_endpoint               = "enabled"
        http_tokens                 = "required"
        http_put_response_hop_limit = 2
        instance_metadata_tags      = "disabled"
      }

      instance_types        = local.node.instance_types
      create_security_group = false

      tags = {
        ExtraTag = "EKS managed node group complete example"
      }
    }
  }

  tags = local.tags
}



data "aws_ami" "eks_default" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amazon-eks-node-${local.cluster_version}-v*"]
  }
}

Expected behavior

EKS Cluster IAM role created successfully.

Actual behavior

EKS Cluster IAM role creation failed.

Terminal Output Screenshot(s)

Additional context

@bryantbiggs
Member

bryantbiggs commented Feb 28, 2022

Ugh, it doesn't look like it's consistent across services. Thanks for filing the issue, will have a temp fix up shortly.

@bryantbiggs
Member

Actually, I take that back. This is patching an issue on AWS's side; could you please file a ticket with AWS regarding the endpoint for EKS? Per their docs and their API, it should be eks.amazonaws.com.cn.

@yuan-fei
Author

could you please share the AWS doc link?

@bryantbiggs
Member

could you please share the AWS doc link?

link for what exactly?

@bryantbiggs
Member

Not sure if this is what you are looking for but:

@yuan-fei
Author

Thanks a lot! It seems the principal is not identical to the endpoint. From the AWS CN service principal doc:

The identifier for a service principal includes the service name, and is usually in the following format:

service-name.amazonaws.com

However, some services might use the following format instead of or in addition to the usual format:

service-name.amazonaws.com.cn
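
In other words, a trust policy that IAM in the China partition accepts for EKS uses the usual format. A minimal sketch of such a role (the role name here is hypothetical, and this is only an illustration, not the module's actual resources):

data "aws_iam_policy_document" "eks_assume_role" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["eks.amazonaws.com"] # accepted in aws-cn as well
    }
  }
}

resource "aws_iam_role" "eks_cluster" {
  name               = "wood-cluster-role" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.eks_assume_role.json
}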

@bryantbiggs
Member

bryantbiggs commented Feb 28, 2022

@yuan-fei can you check if this resolves your issue: #1905 (not sure if other endpoints will complain as well)

You will have to set:

	cluster_iam_role_dns_suffix = "amazonaws.com"
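
For example, in the reproduction config above, the override would go in the module block (a sketch showing only the relevant argument; the other arguments stay as in the snippet above):

module "eks" {
  source = "../.."

  # ... existing arguments from the reproduction above ...

  # Override the DNS suffix used for the cluster IAM role's service
  # principal so it becomes "eks.amazonaws.com" in AWS China.
  cluster_iam_role_dns_suffix = "amazonaws.com"
}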

@yuan-fei
Author

yuan-fei commented Mar 1, 2022

@bryantbiggs sorry for the late response. I just took a look and it's working for AWS China. Thank you very much for the fix!

@antonbabenko
Member

This issue has been resolved in version 18.7.3 🎉

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 13, 2022