Best way to remove asg without rebuilding all of them. #476

Closed
sethpollack opened this issue Aug 23, 2019 · 26 comments

@sethpollack
Contributor

When I remove group-01, it changes the indexes and forces a rebuild of group-02 and group-03. Is there a recommended way to avoid that?

  worker_groups = [
    {
      name                     = "group-01"
      instance_type            = "r5.2xlarge"
    },
    {
      name                     = "group-02"
      instance_type            = "m5.4xlarge"
    },
    {
      name                     = "group-03"
      instance_type            = "m5.8xlarge"
    }
  ]
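For context, a sketch of why the rebuild happens, assuming the module creates the groups with count over this list (the resource address matches the module.eks.aws_autoscaling_group.workers output shown later in this thread):

# With count = length(var.worker_groups), each group is addressed by position:
#   module.eks.aws_autoscaling_group.workers[0] -> "group-01"
#   module.eks.aws_autoscaling_group.workers[1] -> "group-02"
#   module.eks.aws_autoscaling_group.workers[2] -> "group-03"
# Removing "group-01" shifts the later entries down one index, so Terraform
# plans to replace workers[0] and workers[1] (now "group-02" and "group-03")
# and destroy workers[2], i.e. it rebuilds the two surviving groups.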
@barryib
Member

barryib commented Aug 23, 2019

I don't think you can do that at the moment. But since Terraform 0.12.6, we can use for_each in resources. With for_each the resource reference is no longer an index, which lets us remove a specific resource within a list or a map.
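For illustration, a minimal sketch of that for_each approach, assuming groups are keyed by name in a map; the worker_ami_id and subnets variables are hypothetical, not part of this module:

variable "worker_groups" {
  type = map(object({
    instance_type = string
  }))
}

variable "worker_ami_id" { type = string }       # hypothetical
variable "subnets" { type = list(string) }       # hypothetical

resource "aws_launch_configuration" "workers" {
  for_each = var.worker_groups

  name_prefix   = "${each.key}-"
  image_id      = var.worker_ami_id
  instance_type = each.value.instance_type
}

resource "aws_autoscaling_group" "workers" {
  # Each group is addressed as aws_autoscaling_group.workers["group-02"] etc.,
  # so removing the "group-01" key from the map destroys only that group.
  for_each = var.worker_groups

  name                 = each.key
  min_size             = 1
  max_size             = 3
  vpc_zone_identifier  = var.subnets
  launch_configuration = aws_launch_configuration.workers[each.key].name
}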

More info here:

@barryib
Member

barryib commented Aug 23, 2019

Maybe this will help you as a workaround for now: hashicorp/terraform#14275 (comment)

@sethpollack
Contributor Author

Cool, thanks! Are there any plans to move to for_each?

@aeugenio

aeugenio commented Sep 11, 2019

We have an EKS cluster with five worker groups (ASGs). I wanted to upgrade them from the 1.12.7 AMI to the 1.12.10 AMI. Based on https://docs.aws.amazon.com/eks/latest/userguide/migrate-stack.html, I tried these steps:

  1. helm delete --purge cluster-autoscaler
  2. In our eks/main.tf, which uses this module, duplicate the ASGs inside the worker_groups list and update the AMIs on the new worker groups (this means adding 5 new ASGs to the end of the list; see the sketch at the end of this comment).
  3. tf apply; confirm ten ASGs on 2 different launch configs.
  4. Run ./drain-asgs.sh, which is:
#!/bin/bash

set -e

# Drain every node still running the old kubelet version so its pods are rescheduled onto the new ASGs.
K8S_VERSION=1.12.7
nodes=$(kubectl get nodes -o jsonpath="{.items[?(@.status.nodeInfo.kubeletVersion==\"v$K8S_VERSION\")].metadata.name}")
for node in ${nodes[@]}
do
    echo "Draining $node"
    kubectl drain $node --ignore-daemonsets --delete-local-data
done
  5. Delete the original ASGs from the worker_groups list in eks/main.tf (this means the list is back to its original size of 5, all using the new 1.12.10 AMI).
  6. tf apply
    Plan: 5 to add, 5 to change, 20 to destroy.

Because the worker_groups logic is index-based, a bunch of chaos ensues and the new ASGs actually end up deleted.

It sounds like using for_each would let worker_groups management move away from index-based logic and would support removing groups from the list.
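To illustrate step 2, a hedged sketch of the duplicated worker_groups list, shortened to one group and using placeholder AMI IDs (the per-group ami_id key is assumed to work as in the launch-template example later in this thread):

worker_groups = [
  # The existing group stays in place so earlier indexes do not shift ...
  {
    name          = "group-01"
    instance_type = "r5.2xlarge"
    ami_id        = "ami-00000000000000000"   # placeholder: old 1.12.7 AMI
  },
  # ... and the replacement group is appended at the end of the list.
  {
    name          = "group-01-v2"
    instance_type = "r5.2xlarge"
    ami_id        = "ami-11111111111111111"   # placeholder: new 1.12.10 AMI
  },
]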

@barryib
Member

barryib commented Oct 9, 2019

I would like to work on this, but it sounds like it will force everyone to move their resources in the state.

I don't know how painful that would be for users.

Any thoughts? @max-rocket-internet @dpiddockcmp

@aeugenio

aeugenio commented Oct 9, 2019

I had the same concerns about existing Terraform state, but fortunately our team is in a spot where we can rebuild all of our environments. So I forked somewhere in between 6.0.1 and 6.0.2 and implemented worker_group_launch_templates as a map of maps instead of a list. Our team also had to split the EKS control plane from the worker groups, as flipping bits for the EKS API endpoint access vars was causing all the ASGs to roll (I believe due to naming and kubeconfig/auth-map.rendered dependencies).

Here's an example of how our eks-worker-groups module looks:

worker_groups_launch_template = {
   "${var.env}-${var.cluster_id}-infra-${data.aws_availability_zones.available.names[0]}-20191003" = {
     instance_type                 = "m5.large"
     asg_min_size                  = 1
     asg_max_size                  = 2
     key_name                      = var.ssh_key_name
     subnets                       = [ data.terraform_remote_state.vpc.outputs.private_subnets[0] ]
     kubelet_extra_args            = "--node-labels=k8s-env=${local.full-cluster-name},nodegroup=infra --register-with-taints=infra=true:NoSchedule"
     autoscaling_enabled           = true
     root_volume_size              = 1024
     termination_policies          = [ "OldestInstance" ]
     ami_id                        = "ami-05d586e6f773f6abf"
     tags = [{
       key = "worker-type"
       value = "infra"
       propagate_at_launch = true
     }]
   },
   # ... additional worker groups elided ...
}

I am planning on getting a public repo up that can at least provide a starting point if someone wants to see how I handled all the for_each's. We also made a couple of other adjustments that would make submitting a PR not really possible, but I really want to at least contribute the concept back.

@aeugenio

aeugenio commented Oct 9, 2019

Oh yeah, forgot to mention that upgrading the cluster via the steps I outlined above works really nicely.

@dpiddockcmp
Contributor

There's no way around it: moving to a map of maps and for_each will be painful for everyone upgrading.

I've done it to a few unrelated plans as part of our 0.12 upgrade efforts. You can't simply use terraform state mv resource[0] resource["key"] for the first item, as TF gets confused. You have to either drop and import, manually edit the state file, or let TF delete and recreate the resources. Not sure if TF > 0.12.7 has improved that; need to test what has changed in 0.12.10.

A lot of short-term pain for all, but it would make life much easier for people who change their worker_groups.

@max-rocket-internet
Contributor

You have to either drop and import, manually edit the state file or let TF delete and recreate the resources

Hmmm, sounds painful.

When I remove group-01 it changes the indexes and forces a rebuild of group-02 and group-03.

Is it not possible to use terraform state mv resource[1] resource[0] for this?

@galindro

galindro commented Jan 8, 2020

Unfortunately no, @max-rocket-internet. Take a look:

First, I tried to just move it:

$ terraform state mv module.eks.aws_autoscaling_group.workers[1] module.eks.aws_autoscaling_group.workers[0]
Acquiring state lock. This may take a few moments...

Error: Invalid target address

Cannot move to module.eks.aws_autoscaling_group.workers[0]:
there is already a resource instance at that address in the current state.

Then, I tried to remove the resource from index 0 and try to move again:

$ terraform state rm module.sc-eks.module.eks.aws_autoscaling_group.workers[0]
Removed module.sc-eks.module.eks.aws_autoscaling_group.workers[0]
Successfully removed 1 resource instance(s).

$ terraform state mv module.eks.aws_autoscaling_group.workers[1] module.eks.aws_autoscaling_group.workers[0]
Move "module.eks.aws_autoscaling_group.workers[1]" to "module.eks.aws_autoscaling_group.workers[0]"

Error: Invalid target address

Cannot move to module.eks.aws_autoscaling_group.workers[0]:
module.eks.aws_autoscaling_group.workers does not exist in the
current state.

@galindro

galindro commented Jan 9, 2020

I was able to work around it by moving index 0 to index 2 and then moving index 1 to 0. After that, I just ran the apply command again to destroy the old workers.
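For reference, a sketch of that shuffle using the addresses from the output above; any related per-group resources (launch configurations, for example) would likely need the same moves:

# Park the group being removed at an unused index, which frees index 0 ...
terraform state mv 'module.eks.aws_autoscaling_group.workers[0]' 'module.eks.aws_autoscaling_group.workers[2]'
# ... then move the surviving group into index 0.
terraform state mv 'module.eks.aws_autoscaling_group.workers[1]' 'module.eks.aws_autoscaling_group.workers[0]'
# The next apply then only destroys the parked workers[2] (the old group).
terraform apply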

@ArchiFleKs
Contributor

I wanted to try and work on this. I started with the worker_launch_template and will iterate from there.

If someone has the time, I'd like some feedback/help here.

I'm having issues with coalescelist and getting this error:

                                                                                                               
  on aws_auth.tf line 13, in data "template_file" "launch_template_worker_role_arns":
  13:         lookup(data.aws_iam_instance_profile.custom_worker_group_launch_template_iam_instance_profile[each.key], "role_name", ""
    |----------------
    | data.aws_iam_instance_profile.custom_worker_group_launch_template_iam_instance_profile is object with no attributes
    | each.key is "default-eu-west-3c"

The given key does not identify an element in this collection value.

Despite using a lookup here, it still tries to access the attribute on an empty map. I'm stuck here; I don't know how to get the same behavior as coalescelist but with two maps.

@ArchiFleKs
Contributor

Looking into what's been done for the node_groups, the best way would be to merge everything beforehand, from what I understand here: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/modules/node_groups/locals.tf
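For example, a hedged sketch of that merge-beforehand pattern; the variable and local names are illustrative, not the module's actual code:

variable "workers_group_defaults" {
  type    = any
  default = {}
}

variable "worker_groups_launch_template" {
  type    = any
  default = {}
}

locals {
  # merge() acts as the map analogue of coalescelist(): later arguments win,
  # so per-group settings override the module-wide defaults.
  worker_groups_launch_template = {
    for name, group in var.worker_groups_launch_template :
    name => merge(var.workers_group_defaults, group)
  }
}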

@jaimehrubiks
Contributor

What if we support both in parallel?
Add worker_groups_launch_template_map so that new users can start using the _map variable, while old users can remain on the old one and can try to transition in the future
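A rough sketch of how the two variables could coexist inside the module, purely illustrative (the variable names and normalization key are assumptions):

variable "worker_groups_launch_template" {
  type    = any
  default = []
}

variable "worker_groups_launch_template_map" {
  type    = any
  default = {}
}

locals {
  # Legacy list entries are keyed by their "name" attribute (index as a
  # fallback), then merged with the new map variable so both can coexist.
  worker_groups_launch_template = merge(
    { for i, g in var.worker_groups_launch_template : lookup(g, "name", tostring(i)) => g },
    var.worker_groups_launch_template_map
  )
}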

@stale

stale bot commented Jun 15, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 15, 2020
@stale

stale bot commented Jul 15, 2020

This issue has been automatically closed because it has not had recent activity since being marked as stale.

@lexsys27

lexsys27 commented Sep 3, 2020

Any updates to this issue?

I would like to have worker_groups as a map too. That way I can delete and add groups without rebuilding all the others whose indexes are higher. I could also address them by name in worker_asg_names rather than by index.

Several days ago I wanted to delete groups 0 and 1. After looking at the plan Terraform gave me, I just scaled them to 0 instead, so as not to rebuild all the other groups :)
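For reference, a sketch of that scale-to-zero workaround, which keeps the group's entry in the list so later indexes stay put; asg_desired_capacity is assumed to be supported alongside the asg_min_size/asg_max_size keys shown earlier in this thread:

worker_groups = [
  {
    name                 = "group-01"   # kept in the list to preserve the indexes of later groups ...
    instance_type        = "r5.2xlarge"
    asg_min_size         = 0            # ... but scaled down to zero instead of being removed
    asg_max_size         = 0
    asg_desired_capacity = 0
  },
  {
    name          = "group-02"
    instance_type = "m5.4xlarge"
  },
]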

@dpiddockcmp
Contributor

To treat the worker_groups as a map would need extensive changes to the module. It will also require every user of the module to recreate all worker groups when upgrading.

@stale

stale bot commented Dec 3, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 3, 2020
@stale

stale bot commented Jan 3, 2021

This issue has been automatically closed because it has not had recent activity since being marked as stale.

@stale stale bot closed this as completed Jan 3, 2021
@varkey

varkey commented Jul 2, 2021

This is still a problem; can we please keep this issue open?

@sherifabdlnaby

Stale Bump :)

@bryantbiggs bryantbiggs reopened this Sep 7, 2021
@stale stale bot removed the stale label Sep 7, 2021
@daroga0002
Contributor

daroga0002 commented Sep 7, 2021

#858 or #1366 will solve this issue, but this is potentially a breaking change for existing users and will require moving Terraform state resources.

@stale

stale bot commented Oct 9, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 9, 2021
@stale

stale bot commented Oct 16, 2021

This issue has been automatically closed because it has not had recent activity since being marked as stale.

@stale stale bot closed this as completed Oct 16, 2021
@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 17, 2022