
feat: Update Terraform getting started guide to default to multi-cluster tagging scheme #1668

Merged: 4 commits merged into aws:main on Jul 11, 2022
Conversation

bryantbiggs
Member

@bryantbiggs bryantbiggs commented Apr 12, 2022

1. Issue, if available:
#1488

2. Description of changes:

3. How was this change tested?

  • Code provided in the Terraform getting started guide was used to provision a cluster and Karpenter components. The example deployment was used to validate that Karpenter launched instances as requested

4. Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: link to issue
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@bryantbiggs bryantbiggs requested a review from a team as a code owner April 12, 2022 19:43
@bryantbiggs bryantbiggs requested a review from suket22 April 12, 2022 19:43

netlify bot commented Apr 12, 2022

Deploy Preview for karpenter-docs-prod ready!

🔨 Latest commit: 1f9beb4
🔍 Latest deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/62964600365c84000868f0b6
😎 Deploy Preview: https://deploy-preview-1668--karpenter-docs-prod.netlify.app

@@ -361,11 +362,11 @@ resource "kubectl_manifest" "karpenter_provisioner" {
       cpu: 1000
     provider:
       subnetSelector:
-        karpenter.sh/discovery: ${local.cluster_name}
+        karpenter.sh/discovery: ${module.eks.cluster_id}
bryantbiggs (Member, Author):

local.cluster_name and module.eks.cluster_id are one and the same. It's better to use the module output, though, to ensure proper dependency ordering.
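
As a minimal sketch (the Provisioner spec is abbreviated from the guide's full manifest), interpolating module.eks.cluster_id gives Terraform an implicit dependency edge, so the manifest is only applied once the cluster exists:

resource "kubectl_manifest" "karpenter_provisioner" {
  yaml_body = <<-YAML
    apiVersion: karpenter.sh/v1alpha5
    kind: Provisioner
    metadata:
      name: default
    spec:
      provider:
        subnetSelector:
          # module output, not the local, so ordering is guaranteed
          karpenter.sh/discovery: ${module.eks.cluster_id}
  YAML

  depends_on = [module.eks] # already implied by the interpolation above; shown for emphasis
}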


dewjam commented Apr 14, 2022

Planning to test this procedure today. I'll let you know what I find.

@dewjam (Contributor) left a comment:

I ran through these changes and am getting permission denied errors. I'm fairly certain this is due to the tags which are set on the Subnets and Security groups when creating the EKS cluster.

Subnets and SecurityGroups have a tag of karpenter.sh/discovery: <cluster_name>, while the condition in the IRSA module is using karpenter.sh/discovery/<cluster_name>: <cluster_name>.

https://github.com/terraform-aws-modules/terraform-aws-iam/blob/master/modules/iam-role-for-service-accounts-eks/policies.tf#L548-L561
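
To make the mismatch concrete, a small sketch of the two keys side by side (the cluster name is a made-up example):

locals {
  cluster_name = "karpenter-demo" # hypothetical example name

  # key this guide revision applies to the subnets and security groups
  tag_applied = { "karpenter.sh/discovery" = local.cluster_name }

  # key the IRSA module's policy condition scopes on (per the linked policies.tf)
  tag_expected = { "karpenter.sh/discovery/${local.cluster_name}" = local.cluster_name }
}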

@bryantbiggs bryantbiggs marked this pull request as draft April 15, 2022 16:52
@bryantbiggs (Member, Author):

Let me take a closer look at this - I thought this would be a no-op change, but it looks like that's not the case.

private_subnet_tags = {
  "kubernetes.io/cluster/${local.cluster_name}" = "owned"
  # Tags subnets for Karpenter auto-discovery
  "karpenter.sh/discovery" = local.cluster_name
}
bryantbiggs (Member, Author):

This Karpenter-unique tag requirement is lifted with terraform-aws-modules/terraform-aws-iam#247 - users just need to specify a subnetSelector that matches the subnets they intend to use. VPCs and subnets are shared resources, and we shouldn't impose specific tagging on them.
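
As a sketch of what that looks like, the subnetSelector fragment inside the Provisioner's yaml_body can reuse a tag the subnets already carry; the ownership tag below is an assumed example, not taken from this PR:

      provider:
        subnetSelector:
          # any tag that uniquely matches the intended subnets works; nothing
          # Karpenter-specific needs to be imposed on shared subnets
          kubernetes.io/cluster/${local.cluster_name}: owned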

create_cluster_security_group = false
create_node_security_group = false
node_security_group_additional_rules = {
ingress_nodes_karpenter_port = {
bryantbiggs (Member, Author):

This has come up a bit, so I thought it would be better to show the minimum additional security group rules required to run Karpenter. Hopefully this highlights to users that, in addition to the minimum ports that Kubernetes/EKS requires, port 8443/TCP from the control plane to the node group is required.
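
For reference, a sketch of the complete rule inside the module block quoted above (the rule name and description are illustrative; source_cluster_security_group makes the control plane's security group the traffic source):

node_security_group_additional_rules = {
  ingress_nodes_karpenter_port = {
    description                   = "Cluster API to Node group for Karpenter webhook"
    protocol                      = "tcp"
    from_port                     = 8443
    to_port                       = 8443
    type                          = "ingress"
    # control plane -> nodes, using the cluster security group as the source
    source_cluster_security_group = true
  }
}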

Contributor:

nice.

      securityGroupSelector:
-        karpenter.sh/discovery: ${local.cluster_name}
+        karpenter.sh/discovery/${module.eks.cluster_id}: ${module.eks.cluster_id}
@bryantbiggs (Member, Author) commented May 10, 2022:

The cluster-unique key is directly related to #1488 - it allows users to have multiple clusters in a single account without tag conflicts, while still maintaining properly scoped permissions.
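
A quick sketch with two hypothetical clusters shows why the per-cluster key avoids conflicts: an IAM condition scoped to one cluster's key can never match the other cluster's resources, whereas the flat key would collide.

locals {
  # hypothetical cluster names, for illustration only
  discovery_tags_alpha = { "karpenter.sh/discovery/alpha" = "alpha" }
  discovery_tags_beta  = { "karpenter.sh/discovery/beta" = "beta" }

  # with the flat "karpenter.sh/discovery" key, both clusters would contend
  # for a single tag key on shared resources
}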

@bryantbiggs bryantbiggs marked this pull request as ready for review May 10, 2022 18:00
armujahid added a commit to armujahid/terraform-aws-eks-blueprints that referenced this pull request May 23, 2022

  cluster_name    = local.cluster_name
-  cluster_version = "1.21"
+  cluster_version = "1.22"
Contributor:

What's the ownership model of this file? How can we ensure this and L125 are up-to-date?

I can see cluster_version being updated when the default cluster versions need to be updated, but for the Terraform module version, do we need to specify one if the terraform-aws-modules on L123 are referenced as latest?

Contributor:

Same goes for L237

bryantbiggs (Member, Author):

If we can extract the HCL out of the README and put it into actual *.tf files, somewhat similar to what was done for CloudFormation where the CloudFormation is injected via the template tags, that will give us 2 things:

  1. It's easier to test and validate these changes, because we can cd to the directory that holds the Terraform files and simply terraform init && terraform apply
  2. Once the contents are back in Terraform files, we could enable Renovate, which will track module updates and submit PRs to update them (e.g. DrFaust92/terraform-kubernetes-ebs-csi-driver#78, "Update Terraform terraform-aws-modules/iam/aws to v5")

In general though, the values are set statically to ensure that the guidance for users has been validated first. In theory, we could leave off line 128 and the underlying EKS module/service would pull the latest Kubernetes version offered. However, this can be problematic if changes are required to support the latest version.
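
A minimal sketch of that static pinning (the version numbers are illustrative, not mandated by this PR, and the VPC references assume the guide's vpc module):

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 18.0" # pinned module version; bumped deliberately, e.g. via Renovate

  cluster_name    = local.cluster_name
  cluster_version = "1.22" # pinned K8s version; bumped only after the guide is re-validated

  vpc_id     = module.vpc.vpc_id          # assumes the guide's vpc module
  subnet_ids = module.vpc.private_subnets # assumes the guide's vpc module
}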

@bryantbiggs (Member, Author) commented May 31, 2022:

There's also #1574, which I created re: ownership. From an outsider's perspective, all I really need is one thing: the IAM policy required to run Karpenter (the most minimal, restrictive, and finely scoped IAM policy). With that, I can host that policy in the IRSA role for Karpenter and then find the appropriate location to host the example usage, instead of always trying to update the getting started guides here for the various ways one can provision an EKS cluster plus Karpenter.
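
For context, a sketch of what consuming that policy through the IRSA module looks like (input names follow the terraform-aws-iam module of that era; the node group key and version constraint are assumptions):

module "karpenter_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 4.21" # assumed version, for illustration

  role_name                          = "karpenter-controller-${local.cluster_name}"
  attach_karpenter_controller_policy = true # the scoped policy lives in the module

  karpenter_controller_cluster_id = module.eks.cluster_id
  karpenter_controller_node_iam_role_arns = [
    # "initial" is an assumed node group key from the guide
    module.eks.eks_managed_node_groups["initial"].iam_role_arn
  ]

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["karpenter:karpenter"]
    }
  }
}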

Contributor:

> In general though, the values are set statically to ensure that the guidance for users has been validated first.

I agree this is definitely a good system. It's not easy to make sure it's always up to date and working, which gives value to hosting this in another repo like you've done. Was mostly curious if you had a good solution :)

Comment on lines +147 to +152
node_security_group_tags = {
# NOTE - if creating multiple security groups with this module, only tag the
# security group that Karpenter should utilize with the following tag
# (i.e. - at most, only one security group should have this tag in your account)
"karpenter.sh/discovery/${local.cluster_name}" = local.cluster_name
}
Contributor:

Can you include what the consequences of not following this note are?

Suggested change
-node_security_group_tags = {
-  # NOTE - if creating multiple security groups with this module, only tag the
-  # security group that Karpenter should utilize with the following tag
-  # (i.e. - at most, only one security group should have this tag in your account)
-  "karpenter.sh/discovery/${local.cluster_name}" = local.cluster_name
-}
+node_security_group_tags = {
+  # NOTE - if creating multiple security groups with this module, only tag the
+  # security group that Karpenter should utilize with the following tag
+  # (i.e. at most, only one security group should have this tag in your account
+  # as <insert_reason_for_note>)
+  "karpenter.sh/discovery/${local.cluster_name}" = local.cluster_name
+}


dewjam commented Jun 15, 2022

I've tested this procedure and it works as expected. @bryantbiggs, I think there is one comment from @njtran which could be addressed; otherwise it looks good to me.

@bwagner5 (Contributor) left a comment:

lgtm

@bwagner5 bwagner5 merged commit 3643910 into aws:main Jul 11, 2022
@bryantbiggs bryantbiggs deleted the feat/custom-tag-key branch July 11, 2022 21:12