
Add option to add taints to google_container_node_pool #680

Closed
hashibot opened this issue Nov 3, 2017 · 15 comments

@hashibot

hashibot commented Nov 3, 2017

This issue was originally opened by @eyalzek as hashicorp/terraform#16548. It was migrated here as a result of the provider split. The original body of the issue is below.


According to this doc:
https://cloud.google.com/container-engine/docs/node-taints#applying_taints_to_a_node_pool

it's possible to apply a taint to a node pool or cluster, but the taint has to be specified at creation time. This makes it impossible to manage the node pool with terraform unless you create the pool with gcloud and then import it into terraform. The problem is that many of the pool's configuration options cannot be altered after creation, so you have to convert almost all of the terraform configuration into command-line arguments when creating the pool; otherwise terraform will attempt to recreate the pool and, in the process, drop the taint configuration.

After doing all of that and running terraform plan, terraform still complains about two fields: initial_node_count (which cannot be set using gcloud) and id. The only way to make this manageable with terraform was to add a lifecycle block such as:

  lifecycle {
    ignore_changes = ["initial_node_count", "id"]
  }

All in all not ideal.
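
For reference, the create-then-import workaround described above looks roughly like this. A minimal sketch: cluster, pool, and taint values are placeholders, --node-taints required the beta track at the time, and the exact import ID format depends on the provider version:

# Create the pool with taints via gcloud (not possible in terraform yet).
# Note: --num-nodes was only added in a later SDK release (see the next comment).
gcloud beta container node-pools create tainted-pool \
    --cluster my-cluster \
    --num-nodes 3 \
    --node-taints dedicated=batch:NoSchedule

# Then adopt the pool into terraform state; check your provider version
# for the expected import ID format.
terraform import google_container_node_pool.pool my-cluster/tainted-pool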

@eyalzek

eyalzek commented Nov 8, 2017

Just to update: with the latest google-cloud-sdk (178.0.0-0) there's a --num-nodes option that corresponds to initial_node_count. Also, terraform no longer complains about the id attribute (c9e2ce7), so the lifecycle rules I set up are unnecessary.

@danawillow danawillow self-assigned this Nov 9, 2017
@danawillow
Contributor

Hey @eyalzek, it looks like there are a few different issues you've encountered. The first one is to add the option for node taints, which we can absolutely do. Let's leave this issue open for that.

It also looks like you encountered some issues when running terraform import and terraform plan for node pools/clusters. Mind filing separate issues for those so one of us can investigate? If you could also include a sample config and debug logs, that would help us figure out what's not working correctly.

@eyalzek

eyalzek commented Nov 10, 2017

@danawillow I cannot edit the original ticket, but as I mentioned in the previous comment, the second problem I had was due to my out-of-date packages and not an actual problem with terraform. So this ticket is about adding the missing attributes to google_container_node_pool.

@davidquarles
Contributor

Duplicates #621, FWIW.

@danawillow This requires the container/v1beta1 API. I'm super curious how versioned API support has worked out elsewhere, and whether that's something we want to do here, whether in the context of this feature or not. I know the team maintaining the Kubernetes provider has said they may support non-GA APIs once the dust settles over here and they've had a chance to confer. We can have the more general discussion elsewhere if you think it's worth exploring for the container API.

Side note: It is awesome that this team is pioneering versioned API support!

@davidquarles
Contributor

@eyalzek I know it's a bit hacky, but this is what I'm doing until this lands, if it's helpful to you:

node_pool/main.tf

resource "null_resource" "taint" {
  triggers {
    pool  = "${google_container_node_pool.pool.id}"
    taints = "${var.taints}"
  }

  provisioner "local-exec" {
    command = "${path.module}/bin/taint-node-pool ${var.cluster} ${var.name} '${var.taints}'"
  }
}

node_pool/bin/taint-node-pool

#!/usr/bin/env bash
set -euo pipefail

if [ $# -lt 3 ]; then
    echo "Usage: taint-node-pool CLUSTER NODE_POOL TAINTS"
    exit 1
fi

declare -r cluster=$1 \
    && shift
declare -r pool=$1 \
    && shift
declare -r taints="$*"

# Helper functions (wait-for-cluster, get-cluster-credentials) live in utils.sh.
source "$(cd "$(dirname "$BASH_SOURCE")" && pwd)/utils.sh"

wait-for-cluster
get-cluster-credentials

# Remove any existing taints on the pool's nodes (except CriticalAddonsOnly),
# then apply the requested taints to the same label selector.
selector="cloud.google.com/gke-nodepool=$pool"
jsonpath='{ .items[*].spec.taints[?(@.key != "CriticalAddonsOnly")].key }'
kubectl get nodes -l "$selector" -o jsonpath="$jsonpath" \
    | xargs -n1 \
    | sort -u \
    | xargs -n1 -I{} kubectl taint node -l "$selector" {}-

if [ -n "$taints" ]; then
    kubectl taint node -l "$selector" $taints
fi
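
The utils.sh sourced above was never posted in the thread (it's requested further down). A minimal sketch of what the two helpers might look like, assuming $cluster is in scope and gcloud has a default project/zone configured; the bodies here are guesses, not the original:

node_pool/bin/utils.sh

#!/usr/bin/env bash
# Hypothetical reconstruction; the original utils.sh was never shared.

# Poll until the cluster reports RUNNING, so kubectl calls don't race creation.
wait-for-cluster() {
    until [ "$(gcloud container clusters describe "$cluster" --format='value(status)')" = "RUNNING" ]; do
        sleep 10
    done
}

# Fetch kubeconfig credentials so kubectl can reach the cluster.
get-cluster-credentials() {
    gcloud container clusters get-credentials "$cluster"
}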

@eyalzek

eyalzek commented Nov 15, 2017

@davidquarles thanks, but this doesn't exactly cover the case of auto-scaling node pools. Unfortunately tainting the entire node pool has to be done while creating it.

@davidquarles
Contributor

@eyalzek Yeah, you're totally right. I guess you'd have to copy the instance template directly, append to the kube-env metadata, and copy it back, which has always seemed like a disgusting (though useful!) hack. It's a shame that it isn't exposed as a mutable field after node pool creation.

@eyalzek

eyalzek commented Nov 15, 2017

Can you link to an example of that hack applied somewhere? :)

@davidquarles
Contributor

This is probably near-blasphemy to detail here, but either way: I started scripting an example for you last night, and it's much uglier programmatically than it is in the UI/console. The basic workflow, given a cluster named foo and a node pool named workers, is this:

  • copy instance template gke-foo-workers-<uid> to gke-foo-workers-<uid>-1, adding NODE_TAINTS: <your comma-separated taints> to the kube-env metadata before saving
  • update the instance group gke-foo-workers-<uid>-grp to use the new template (meaningless for existing nodes)
  • delete instance template gke-foo-workers-<uid>
  • copy instance template gke-foo-workers-<uid>-1 back to the original name, gke-foo-workers-<uid>
  • update the instance group, again, to use the new gke-foo-workers-<uid> template
  • delete the temporary gke-foo-workers-<uid>-1 template
  • taint existing nodes via kubectl taint nodes -l "cloud.google.com/gke-nodepool=workers" <your comma-separated taints>

...at this point you have tainted nodes, new nodes will get the taints from your instance template, and AFAIK terraform isn't going to see this as a change or do anything destructive. This is all potentially achievable in terraform, though I'd imagine it's a bit complex.
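
For the curious, steps 2 and 7 above map onto commands roughly like these; names, zone, and taints are placeholders, and the template copy itself (steps 1 and 4) has no gcloud one-liner, so it's easiest through the console or the compute API:

# Point the managed instance group at the edited template (steps 2 and 5).
gcloud compute instance-groups managed set-instance-template \
    gke-foo-workers-<uid>-grp \
    --template gke-foo-workers-<uid>-1 \
    --zone us-central1-a

# Taint the nodes that already exist (step 7); future nodes pick the taint
# up from the kube-env metadata baked into the template.
kubectl taint nodes -l "cloud.google.com/gke-nodepool=workers" \
    dedicated=batch:NoSchedule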

@uschen

uschen commented Feb 21, 2018

Is this on the road map?

@a13x212

a13x212 commented Mar 16, 2018

@davidquarles could you post the utils.sh also?

@burdiyan

The GKE API now allows specifying node taints when creating a node pool. It would be super cool to have this feature in the provider!

@aswinkm-tc

The above change only adds node taints to a GKE cluster, not to node pools. Please consider adding taints to node pools as well.

@danawillow
Contributor

Hey @AswinKakarot, taints are part of the node_config block, which you can set on either the cluster or the node pool. You can see some examples of how to do this in the tests in the linked PR.
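
For anyone landing here later, a minimal sketch of a tainted node pool in the provider syntax of the era; names and taint values are placeholders, so check the provider docs for the exact schema in your version:

resource "google_container_node_pool" "pool" {
  name       = "tainted-pool"
  cluster    = "${google_container_cluster.primary.name}"
  node_count = 1

  node_config {
    # Matches Kubernetes taint semantics: key, value, and effect.
    taint {
      key    = "dedicated"
      value  = "batch"
      effect = "NO_SCHEDULE"
    }
  }
}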

@ghost

ghost commented Nov 17, 2018

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Nov 17, 2018