Add NTH Queue Processor Mode #10995

Merged
merged 13 commits into from
Apr 22, 2021

Conversation

haugenj
Contributor

@haugenj haugenj commented Mar 8, 2021

Issue:
#7119

Adds the queue-processor mode to the Node Termination Handler addon and provisions the requisite resources (SQS queue, EventBridge rules/targets, ASG lifecycle hooks, IAM policy updates).

Opening this up early to see if there are any big gaps in what I've done

Todo:

  • Tests
  • Checking for existing resources when updating
  • CloudFormation/Terraform support

Questions:

  • Am I missing anything big?
  • How to switch between existing IMDS NTH yaml template and new Queue processor one. I'm assuming I can use something like {{ with .NodeTerminationHandler.EnableSqsTerminationDraining }} at the top of the file? I'm not familiar with this logic... is it bootstrap?
  • Do we want to support a higher level of customization for the queue-processor through additional configuration options?
  • I based this off of the branch release-1.19 instead of master because I didn't see the existing NTH config in master. Is this alright?

@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Mar 8, 2021
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 8, 2021
@k8s-ci-robot
Contributor

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If you signed the CLA as a corporation, please sign in with your organization's credentials at https://identity.linuxfoundation.org/projects/cncf to be authorized.
  • If you have done the above and are still having issues with the CLA being reported as unsigned, please log a ticket with the Linux Foundation Helpdesk: https://support.linuxfoundation.org/
  • Should you encounter any issues with the Linux Foundation Helpdesk, send a message to the backup e-mail support address at: [email protected]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Mar 8, 2021
@k8s-ci-robot
Contributor

Welcome @haugenj!

It looks like this is your first PR to kubernetes/kops 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kops has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @haugenj. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 8, 2021
@k8s-ci-robot k8s-ci-robot requested review from hakman and rdrgmnzs March 8, 2021 21:32
@k8s-ci-robot k8s-ci-robot added area/addons area/api area/provider/aws Issues or PRs related to aws provider labels Mar 8, 2021
@hakman hakman requested review from olemarkus and removed request for rdrgmnzs March 9, 2021 05:42
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 9, 2021
@olemarkus
Member

Thanks for this PR. Looks like it is going in the right direction

Questions:

* Am I missing anything big?

* How to switch between existing IMDS NTH yaml template and new Queue processor one. I'm assuming I can use something like `{{ with .NodeTerminationHandler.EnableSqsTerminationDraining }}` at the top of the file? I'm not familiar with this logic... is it bootstrap?

If the templates are radically different and easiest to maintain separately, you can switch the location here:
https://github.com/kubernetes/kops/blob/master/upup/pkg/fi/cloudup/bootstrapchannelbuilder/bootstrapchannelbuilder.go#L546

* Do we want to support a higher level of customization for the queue-processor through additional configuration options?

I suspect so, yes. I am not sure what is configurable here.
What we usually do is to support the obvious configurations that users typically change, and then wait for feature requests for the rest.

* I based this off of the branch `release-1.19` instead of master because I didn't see the existing NTH config in master. Is this alright?

Not quite sure what you mean here. The configuration for NTH is in the same place in master as in the 1.19 branch:
https://github.com/kubernetes/kops/blob/master/pkg/apis/kops/componentconfig.go#L844

Member

@olemarkus olemarkus left a comment

Definitely on the right path here. Thanks for this and keep up the good work!

pkg/model/context.go (outdated; resolved)
pkg/model/iam/iam_builder.go (outdated; resolved)
---
# Source: aws-node-termination-handler/templates/psp.yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy

Member

Technically the PSP stuff isn't required. kube-system has a default allow-anything policy if the admission controller is enabled. But fine to leave it in if it makes copy/pasting from source easier.

Contributor Author

I'm leaning towards leaving it in. IMO it'd be simpler to keep it exactly in line with the default config released by NTH

Contributor Author

I ended up removing it and merging the two templates, using an if/else to switch between the IMDS DaemonSet and the Queue Processor Deployment; the rest of the templates were the same.

upup/pkg/fi/cloudup/nodeterminationhandlertasks/sqs.go (outdated; resolved)
@haugenj
Contributor Author

haugenj commented Mar 9, 2021

Not quite sure what you mean here. The configuration for NTH is in the same place in master as in the 1.19 branch:
https://github.com/kubernetes/kops/blob/master/pkg/apis/kops/componentconfig.go#L844

Not sure how I didn't see that tbh 😅. I'll move the changes to the master branch on the next revision

@haugenj
Contributor Author

haugenj commented Mar 16, 2021

@olemarkus can you give me some pointers for what kind of tests I should add to this?

@olemarkus
Member

Typically validation tests: https://kops.sigs.k8s.io/contributing/adding_a_feature/#tests
This helps ensure users cannot configure the cluster and components in an incompatible way.

In addition, it would be nice to have an integration test that pretty much does a full deployment and ends up in "golden" CloudFormation/Terraform files. See https://kops.sigs.k8s.io/contributing/testing/#adding-an-integration-test.

Ping me if you need any help
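
As a rough illustration of the validation-test idea above, here is a self-contained Go sketch. Everything in it is an assumption made for the example: the `nthConfig` type, the `validateNTH` function, and both rules are hypothetical, and it returns plain `error` values rather than whatever error types the kOps validation package actually uses. It only shows the shape of a table-driven check that rejects incompatible settings.

```go
package main

import (
	"errors"
	"fmt"
)

// nthConfig is a hypothetical, simplified config; the real kOps struct differs.
type nthConfig struct {
	Enabled                      bool
	EnableSqsTerminationDraining bool
}

// validateNTH returns one error per incompatible setting.
// Both rules below are illustrative assumptions, not rules taken from this PR.
func validateNTH(cloudProvider string, c nthConfig) []error {
	var errs []error
	if c.EnableSqsTerminationDraining && !c.Enabled {
		errs = append(errs, errors.New("enableSqsTerminationDraining requires the node termination handler to be enabled"))
	}
	if c.EnableSqsTerminationDraining && cloudProvider != "aws" {
		errs = append(errs, fmt.Errorf("enableSqsTerminationDraining is only supported on aws, got %q", cloudProvider))
	}
	return errs
}

func main() {
	// Table-driven cases: each names a config and how many errors it should produce.
	cases := []struct {
		name     string
		cloud    string
		cfg      nthConfig
		wantErrs int
	}{
		{"queue mode on aws", "aws", nthConfig{Enabled: true, EnableSqsTerminationDraining: true}, 0},
		{"queue mode without NTH", "aws", nthConfig{EnableSqsTerminationDraining: true}, 1},
		{"queue mode on another cloud", "gce", nthConfig{Enabled: true, EnableSqsTerminationDraining: true}, 1},
		{"IMDS mode on aws", "aws", nthConfig{Enabled: true}, 0},
	}
	for _, tc := range cases {
		got := validateNTH(tc.cloud, tc.cfg)
		status := "ok"
		if len(got) != tc.wantErrs {
			status = "MISMATCH"
		}
		fmt.Printf("%-28s errors=%d want=%d %s\n", tc.name, len(got), tc.wantErrs, status)
	}
}
```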

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 22, 2021
@haugenj haugenj force-pushed the release-1.19 branch 2 times, most recently from e665193 to afaa68d Compare March 23, 2021 23:03
@olemarkus
Member

Except for the comments made, I think this one is good to go.

@olemarkus
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 20, 2021
docs/addons.md (outdated; resolved)
pkg/apis/kops/componentconfig.go (outdated; resolved)
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 22, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 22, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rifelpet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 22, 2021
@k8s-ci-robot k8s-ci-robot merged commit 2649cbc into kubernetes:master Apr 22, 2021
@rifelpet
Member

Thanks for sticking with this @haugenj ! I think a lot of people will be eager to use this functionality

@haugenj haugenj deleted the release-1.19 branch April 22, 2021 19:52
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/addons area/api area/documentation area/provider/aws Issues or PRs related to aws provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
4 participants