- Worker locals/defaults moved to workers submodule
- Create separate defaults for node groups
- Workers IAM management left outside of the module, as both node_group and worker_groups use them
Grzegorz Lisowski committed Aug 3, 2020
1 parent 3d2f7d2 commit 87edcca
Showing 33 changed files with 984 additions and 1,415 deletions.
23 changes: 5 additions & 18 deletions README.md
@@ -49,12 +49,12 @@ module "my-cluster" {
subnets = ["subnet-abcde012", "subnet-bcde012a", "subnet-fghi345a"]
vpc_id = "vpc-1234556abcdef"
worker_groups = [
{
worker_groups = {
group = {
instance_type = "m4.large"
asg_max_size = 5
}
]
}
}
```
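
With the map-based `worker_groups`, the map key takes the place of the old `name` attribute on each list entry. A minimal sketch of a two-group configuration under that assumption (group names and values are illustrative):

```hcl
worker_groups = {
  # The key ("on-demand") becomes the group identifier; no "name" attribute is needed.
  on-demand = {
    instance_type = "m4.large"
    asg_max_size  = 5
  }

  # A second group simply adds another key to the same map.
  spot = {
    instance_type      = "m4.large"
    spot_price         = "0.20"
    asg_max_size       = 10
    kubelet_extra_args = "--node-labels=node.kubernetes.io/lifecycle=spot"
  }
}
```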
## Conditional creation
@@ -150,8 +150,6 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| kubernetes | >= 1.11.1 |
| local | >= 1.4 |
| null | >= 2.1 |
| random | >= 2.1 |
| template | >= 2.1 |

## Inputs

@@ -205,8 +203,7 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| worker\_create\_cluster\_primary\_security\_group\_rules | Whether to create security group rules to allow communication between pods on workers and pods using the primary cluster security group. | `bool` | `false` | no |
| worker\_create\_initial\_lifecycle\_hooks | Whether to create initial lifecycle hooks provided in worker groups. | `bool` | `false` | no |
| worker\_create\_security\_group | Whether to create a security group for the workers or attach the workers to `worker_security_group_id`. | `bool` | `true` | no |
| worker\_groups | A list of maps defining worker group configurations to be defined using AWS Launch Configurations. See workers\_group\_defaults for valid keys. | `any` | `[]` | no |
| worker\_groups\_launch\_template | A list of maps defining worker group configurations to be defined using AWS Launch Templates. See workers\_group\_defaults for valid keys. | `any` | `[]` | no |
| worker\_groups | A map of maps defining worker group configurations to be defined using AWS Launch Templates. See workers\_group\_defaults for valid keys. | `any` | `{}` | no |
| worker\_security\_group\_id | If provided, all workers will be attached to this security group. If not given, a security group will be created with necessary ingress/egress to work with the EKS cluster. | `string` | `""` | no |
| worker\_sg\_ingress\_from\_port | Minimum port number from which pods will accept communication. Must be changed to a lower value if some pods in your cluster will expose a port lower than 1025 (e.g. 22, 80, or 443). | `number` | `1025` | no |
| workers\_additional\_policies | Additional policies to be added to workers | `list(string)` | `[]` | no |
@@ -235,17 +232,7 @@ MIT Licensed. See [LICENSE](https://github.com/terraform-aws-modules/terraform-a
| node\_groups | Outputs from EKS node groups. Map of maps, keyed by var.node\_groups keys |
| oidc\_provider\_arn | The ARN of the OIDC Provider if `enable_irsa = true`. |
| security\_group\_rule\_cluster\_https\_worker\_ingress | Security group rule responsible for allowing pods to communicate with the EKS cluster API. |
| worker\_iam\_instance\_profile\_arns | default IAM instance profile ARN for EKS worker groups |
| worker\_iam\_instance\_profile\_names | default IAM instance profile name for EKS worker groups |
| worker\_iam\_role\_arn | default IAM role ARN for EKS worker groups |
| worker\_iam\_role\_name | default IAM role name for EKS worker groups |
| worker\_groups | Outputs from EKS worker groups. Map of maps, keyed by var.worker\_groups keys |
| worker\_security\_group\_id | Security group ID attached to the EKS workers. |
| workers\_asg\_arns | IDs of the autoscaling groups containing workers. |
| workers\_asg\_names | Names of the autoscaling groups containing workers. |
| workers\_default\_ami\_id | ID of the default worker group AMI |
| workers\_launch\_template\_arns | ARNs of the worker launch templates. |
| workers\_launch\_template\_ids | IDs of the worker launch templates. |
| workers\_launch\_template\_latest\_versions | Latest versions of the worker launch templates. |
| workers\_user\_data | User data of worker groups |

<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
42 changes: 2 additions & 40 deletions aws_auth.tf
@@ -1,48 +1,10 @@
data "aws_caller_identity" "current" {
}
data "aws_caller_identity" "current" {}

locals {
auth_launch_template_worker_roles = [
for index in range(0, var.create_eks ? local.worker_group_launch_template_count : 0) : {
worker_role_arn = "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:role/${element(
coalescelist(
aws_iam_instance_profile.workers_launch_template.*.role,
data.aws_iam_instance_profile.custom_worker_group_launch_template_iam_instance_profile.*.role_name,
[""]
),
index
)}"
platform = lookup(
var.worker_groups_launch_template[index],
"platform",
local.workers_group_defaults["platform"]
)
}
]

auth_worker_roles = [
for index in range(0, var.create_eks ? local.worker_group_count : 0) : {
worker_role_arn = "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:role/${element(
coalescelist(
aws_iam_instance_profile.workers.*.role,
data.aws_iam_instance_profile.custom_worker_group_iam_instance_profile.*.role_name,
[""]
),
index,
)}"
platform = lookup(
var.worker_groups[index],
"platform",
local.workers_group_defaults["platform"]
)
}
]

# Convert to format needed by aws-auth ConfigMap
configmap_roles = [
for role in concat(
local.auth_launch_template_worker_roles,
local.auth_worker_roles,
module.worker_groups.aws_auth_roles,
module.node_groups.aws_auth_roles,
) :
{
2 changes: 1 addition & 1 deletion cluster.tf
@@ -105,7 +105,7 @@ resource "aws_security_group_rule" "cluster_https_worker_ingress" {
description = "Allow pods to communicate with the EKS cluster API."
protocol = "tcp"
security_group_id = local.cluster_security_group_id
source_security_group_id = local.worker_security_group_id
source_security_group_id = module.worker_groups.worker_security_group_id
from_port = 443
to_port = 443
type = "ingress"
137 changes: 0 additions & 137 deletions data.tf
@@ -13,33 +13,6 @@ data "aws_iam_policy_document" "workers_assume_role_policy" {
}
}

data "aws_ami" "eks_worker" {
filter {
name = "name"
values = [local.worker_ami_name_filter]
}

most_recent = true

owners = [var.worker_ami_owner_id]
}

data "aws_ami" "eks_worker_windows" {
filter {
name = "name"
values = [local.worker_ami_name_filter_windows]
}

filter {
name = "platform"
values = ["windows"]
}

most_recent = true

owners = [var.worker_ami_owner_id_windows]
}

data "aws_iam_policy_document" "cluster_assume_role_policy" {
statement {
sid = "EKSClusterAssumeRole"
@@ -55,119 +28,9 @@ data "aws_iam_policy_document" "cluster_assume_role_policy" {
}
}

data "template_file" "userdata" {
count = var.create_eks ? local.worker_group_count : 0
template = lookup(
var.worker_groups[count.index],
"userdata_template_file",
file(
lookup(var.worker_groups[count.index], "platform", local.workers_group_defaults["platform"]) == "windows"
? "${path.module}/templates/userdata_windows.tpl"
: "${path.module}/templates/userdata.sh.tpl"
)
)

vars = merge({
platform = lookup(var.worker_groups[count.index], "platform", local.workers_group_defaults["platform"])
cluster_name = coalescelist(aws_eks_cluster.this[*].name, [""])[0]
endpoint = coalescelist(aws_eks_cluster.this[*].endpoint, [""])[0]
cluster_auth_base64 = coalescelist(aws_eks_cluster.this[*].certificate_authority[0].data, [""])[0]
pre_userdata = lookup(
var.worker_groups[count.index],
"pre_userdata",
local.workers_group_defaults["pre_userdata"],
)
additional_userdata = lookup(
var.worker_groups[count.index],
"additional_userdata",
local.workers_group_defaults["additional_userdata"],
)
bootstrap_extra_args = lookup(
var.worker_groups[count.index],
"bootstrap_extra_args",
local.workers_group_defaults["bootstrap_extra_args"],
)
kubelet_extra_args = lookup(
var.worker_groups[count.index],
"kubelet_extra_args",
local.workers_group_defaults["kubelet_extra_args"],
)
},
lookup(
var.worker_groups[count.index],
"userdata_template_extra_args",
local.workers_group_defaults["userdata_template_extra_args"]
)
)
}

data "template_file" "launch_template_userdata" {
count = var.create_eks ? local.worker_group_launch_template_count : 0
template = lookup(
var.worker_groups_launch_template[count.index],
"userdata_template_file",
file(
lookup(var.worker_groups_launch_template[count.index], "platform", local.workers_group_defaults["platform"]) == "windows"
? "${path.module}/templates/userdata_windows.tpl"
: "${path.module}/templates/userdata.sh.tpl"
)
)

vars = merge({
platform = lookup(var.worker_groups_launch_template[count.index], "platform", local.workers_group_defaults["platform"])
cluster_name = coalescelist(aws_eks_cluster.this[*].name, [""])[0]
endpoint = coalescelist(aws_eks_cluster.this[*].endpoint, [""])[0]
cluster_auth_base64 = coalescelist(aws_eks_cluster.this[*].certificate_authority[0].data, [""])[0]
pre_userdata = lookup(
var.worker_groups_launch_template[count.index],
"pre_userdata",
local.workers_group_defaults["pre_userdata"],
)
additional_userdata = lookup(
var.worker_groups_launch_template[count.index],
"additional_userdata",
local.workers_group_defaults["additional_userdata"],
)
bootstrap_extra_args = lookup(
var.worker_groups_launch_template[count.index],
"bootstrap_extra_args",
local.workers_group_defaults["bootstrap_extra_args"],
)
kubelet_extra_args = lookup(
var.worker_groups_launch_template[count.index],
"kubelet_extra_args",
local.workers_group_defaults["kubelet_extra_args"],
)
},
lookup(
var.worker_groups_launch_template[count.index],
"userdata_template_extra_args",
local.workers_group_defaults["userdata_template_extra_args"]
)
)
}

data "aws_iam_role" "custom_cluster_iam_role" {
count = var.manage_cluster_iam_resources ? 0 : 1
name = var.cluster_iam_role_name
}

data "aws_iam_instance_profile" "custom_worker_group_iam_instance_profile" {
count = var.manage_worker_iam_resources ? 0 : local.worker_group_count
name = lookup(
var.worker_groups[count.index],
"iam_instance_profile_name",
local.workers_group_defaults["iam_instance_profile_name"],
)
}

data "aws_iam_instance_profile" "custom_worker_group_launch_template_iam_instance_profile" {
count = var.manage_worker_iam_resources ? 0 : local.worker_group_launch_template_count
name = lookup(
var.worker_groups_launch_template[count.index],
"iam_instance_profile_name",
local.workers_group_defaults["iam_instance_profile_name"],
)
}

data "aws_partition" "current" {}
12 changes: 3 additions & 9 deletions docs/faq.md
@@ -2,7 +2,7 @@

## How do I customize X on the worker group's settings?

All the options that can be customized for worker groups are listed in [local.tf](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/local.tf) under `workers_group_defaults_defaults`.
All the options that can be customized for worker groups are listed in [local.tf](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/modules/worker_groups/local.tf) under `workers_group_defaults_defaults`.

Please open Issues or PRs if you think something is missing.
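
If a particular default needs changing, it can typically be overridden per group, or module-wide through the `workers_group_defaults` input. A minimal sketch, assuming that input keeps its existing behaviour after this refactor (values are illustrative):

```hcl
module "eks" {
  source = "terraform-aws-modules/eks/aws"
  # ... cluster_name, subnets, vpc_id, etc.

  # Applied to every worker group unless the group overrides the key itself.
  workers_group_defaults = {
    root_volume_size = 100
  }

  worker_groups = {
    default = {
      instance_type    = "m4.large"
      root_volume_size = 150 # overrides the module-wide default for this group only
    }
  }
}
```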

@@ -61,12 +61,6 @@ You need to add the tags to the VPC and subnets yourself. See the [basic example

An alternative is to use the aws provider's [`ignore_tags` variable](https://www.terraform.io/docs/providers/aws/#ignore\_tags-configuration-block). However, this can also cause Terraform to display a perpetual difference.

## How do I safely remove old worker groups?

You've added new worker groups. Deleting worker groups from earlier in the list causes Terraform to want to recreate all worker groups. This is a limitation with how Terraform works and the module using `count` to create the ASGs and other resources.

The safest and easiest option is to set `asg_min_size` and `asg_max_size` to 0 on the worker groups to "remove".

## Why does changing the worker group's desired count not do anything?

The module is configured to ignore this value. Unfortunately Terraform does not support variables within the `lifecycle` block.
@@ -77,9 +71,9 @@ You can change the desired count via the CLI or console if you're not using the

If you are not using autoscaling and really want to control the number of nodes via Terraform, then set the `asg_min_size` and `asg_max_size` instead. AWS will remove a random instance when you scale down. You will have to weigh the risks here.
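
As a sketch of that approach (names and sizes illustrative), keeping `asg_min_size` and `asg_max_size` equal pins the group at a fixed node count:

```hcl
worker_groups = {
  fixed-size = {
    instance_type = "m4.large"
    # Min and max are equal, so the ASG holds three nodes regardless of desired-count drift.
    asg_min_size = 3
    asg_max_size = 3
  }
}
```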

## Why are nodes not recreated when the `launch_configuration`/`launch_template` is recreated?
## Why are nodes not recreated when the `launch_configuration` is recreated?

By default the ASG is not configured to be recreated when the launch configuration or template changes. Terraform spins up new instances and then deletes all the old instances in one go as the AWS provider team have refused to implement rolling updates of autoscaling groups. This is not good for kubernetes stability.
By default, the ASG is not configured to be recreated when the launch configuration changes. Terraform spins up new instances and then deletes all the old instances in one go, as the AWS provider team has refused to implement rolling updates of autoscaling groups. This is not good for Kubernetes stability.

You need to use a process to drain and cycle the workers.

48 changes: 5 additions & 43 deletions docs/spot-instances.md
@@ -22,65 +22,27 @@ Notes:
- There is an AWS blog article about this [here](https://aws.amazon.com/blogs/compute/run-your-kubernetes-workloads-on-amazon-ec2-spot-instances-with-amazon-eks/).
- Consider using [k8s-spot-rescheduler](https://github.com/pusher/k8s-spot-rescheduler) to move pods from on-demand to spot instances.

## Using Launch Configuration

Example worker group configuration that uses an ASG with launch configuration for each worker group:

```hcl
worker_groups = [
{
name = "on-demand-1"
instance_type = "m4.xlarge"
asg_max_size = 1
kubelet_extra_args = "--node-labels=node.kubernetes.io/lifecycle=normal"
suspended_processes = ["AZRebalance"]
},
{
name = "spot-1"
spot_price = "0.199"
instance_type = "c4.xlarge"
asg_max_size = 20
kubelet_extra_args = "--node-labels=node.kubernetes.io/lifecycle=spot"
suspended_processes = ["AZRebalance"]
},
{
name = "spot-2"
spot_price = "0.20"
instance_type = "m4.xlarge"
asg_max_size = 20
kubelet_extra_args = "--node-labels=node.kubernetes.io/lifecycle=spot"
suspended_processes = ["AZRebalance"]
}
]
```

## Using Launch Templates

Launch Template support is a recent addition to both AWS and this module. It might not be as tried and tested, but it is more suitable for spot instances as it allows multiple instance types in the same worker group:

```hcl
worker_groups = [
{
name = "on-demand-1"
worker_groups = {
on-demand-1 = {
instance_type = "m4.xlarge"
asg_max_size = 10
kubelet_extra_args = "--node-labels=spot=false"
suspended_processes = ["AZRebalance"]
}
]
worker_groups_launch_template = [
{
name = "spot-1"
},
spot-1 = {
override_instance_types = ["m5.large", "m5a.large", "m5d.large", "m5ad.large"]
spot_instance_pools = 4
asg_max_size = 5
asg_desired_capacity = 5
kubelet_extra_args = "--node-labels=node.kubernetes.io/lifecycle=spot"
public_ip = true
},
]
}
```

## Important Notes
10 changes: 4 additions & 6 deletions examples/basic/main.tf
@@ -135,22 +135,20 @@ module "eks" {

vpc_id = module.vpc.vpc_id

worker_groups = [
{
name = "worker-group-1"
worker_groups = {
worker-group-1 = {
instance_type = "t2.small"
additional_userdata = "echo foo bar"
asg_desired_capacity = 2
additional_security_group_ids = [aws_security_group.worker_group_mgmt_one.id]
},
{
name = "worker-group-2"
worker-group-2 = {
instance_type = "t2.medium"
additional_userdata = "echo foo bar"
additional_security_group_ids = [aws_security_group.worker_group_mgmt_two.id]
asg_desired_capacity = 1
},
]
}

worker_additional_security_group_ids = [aws_security_group.all_worker_mgmt.id]
map_roles = var.map_roles