fix: Allow Node Feature Discovery garbage collector to run on control-plane nodes #722

dlipovetsky
Contributor

What problem does this PR solve?:
With this change, all NFD components can run on a single-node cluster.
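
For readers unfamiliar with why a toleration is needed here, the following is a minimal sketch of how Kubernetes decides whether a pod's tolerations cover a node's taints. It is not code from this PR. The `Taint` and `Toleration` classes and the `tolerates` and `schedulable` helpers are hypothetical names for illustration, and the taint `node-role.kubernetes.io/control-plane:NoSchedule` is assumed to be the one on the control-plane node (the kubeadm default); the exact tolerations added by this PR are not shown in this conversation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key (with Exists) matches every taint key
    operator: str = "Equal"  # Kubernetes defaults the operator to Equal
    value: str = ""
    effect: str = ""         # empty effect matches every taint effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if a single toleration matches a single taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    # Equal operator: the values must match exactly.
    return toleration.value == taint.value


def schedulable(tolerations, taints) -> bool:
    """A pod can be scheduled only if every hard taint is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in taints
        if taint.effect in ("NoSchedule", "NoExecute")
    )


# Assumed taint on a control-plane node (kubeadm default).
control_plane = Taint("node-role.kubernetes.io/control-plane", "NoSchedule")
# Hypothetical toleration of the kind a component would need.
toleration = Toleration(
    key="node-role.kubernetes.io/control-plane",
    operator="Exists",
    effect="NoSchedule",
)
```

On a single-node cluster the only node carries the control-plane taint, so in this sketch `schedulable([], [control_plane])` is false while `schedulable([toleration], [control_plane])` is true.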

Which issue(s) this PR fixes:
Fixes https://jira.nutanix.com/browse/NCN-100706

How Has This Been Tested?:

Special notes for your reviewer:

…-plane nodes

With this change, all NFD components can run on a single-node cluster.
@github-actions github-actions bot added the fix label Jun 17, 2024
…control-plane nodes

Also define tolerations when NFD Addon is deployed using Helm
@jimmidyson
Member

jimmidyson commented Jun 18, 2024

From #723 (comment):

I wonder if NFD should also tolerate the CriticalAddonsOnly taint, as many other Addons do?

From what I can see, NFD is deployed with an empty priorityClassName, so this wouldn't have any effect right now, as the CriticalAddonsOnly taint is only applied temporarily when pre-empting lower-priority pods.

So I guess the next question is should we set the priorityClassName on the NFD addon deployments and then tolerate CriticalAddonsOnly? 😅 I don't see any harm in it, but is it really critical?

Reading kubernetes/autoscaler#4097 (comment) and onwards indicates that CriticalAddonsOnly taint is now only used by certain cloud providers (e.g. EKS, AKS) and is not a standard taint. With that in mind, I'm not sure that any of our addons require this toleration.

But we should also see if we need to add priorityClassName: system-cluster-critical to the NFD pods?
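
To make the option under discussion concrete, here is a minimal sketch of what setting the priority class and adding the toleration would mean for a pod spec. It only illustrates the standard `priorityClassName` and `tolerations` pod spec fields; `with_critical_priority` is a hypothetical helper, not part of this repository or of the NFD chart, and whether NFD should receive either setting is exactly the open question in this thread.

```python
import copy

# Toleration for the non-standard CriticalAddonsOnly taint discussed above.
CRITICAL_ADDONS_TOLERATION = {"key": "CriticalAddonsOnly", "operator": "Exists"}


def with_critical_priority(pod_spec: dict) -> dict:
    """Return a copy of pod_spec with the built-in system-cluster-critical
    priority class set and the CriticalAddonsOnly toleration present."""
    updated = copy.deepcopy(pod_spec)
    updated["priorityClassName"] = "system-cluster-critical"
    tolerations = updated.setdefault("tolerations", [])
    if CRITICAL_ADDONS_TOLERATION not in tolerations:
        tolerations.append(dict(CRITICAL_ADDONS_TOLERATION))
    return updated


# Hypothetical starting point: a pod spec with no priority class.
original = {"containers": [{"name": "example"}]}
patched = with_critical_priority(original)
```

The helper leaves `original` untouched and is idempotent, so applying it twice does not duplicate the toleration.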

@dlipovetsky
Contributor Author

> But we should also see if we need to add priorityClassName: system-cluster-critical to the NFD pods?

I created a separate JIRA to review this for all Addons.

@dlipovetsky dlipovetsky requested review from supershal and faiq June 18, 2024 23:50
@jimmidyson jimmidyson enabled auto-merge (squash) June 19, 2024 09:13
@jimmidyson jimmidyson merged commit 6365f5e into nutanix-cloud-native:main Jun 19, 2024
16 checks passed
@github-actions github-actions bot mentioned this pull request Jun 19, 2024
faiq pushed a commit that referenced this pull request Jun 24, 2024
🤖 I have created a release *beep* *boop*
---


## 0.10.0 (2024-06-24)

<!-- Release notes generated using configuration in .github/release.yaml
at main -->

## What's Changed
### Exciting New Features 🎉
* feat: Upgrade to Cilium v1.15.5 by @jimmidyson in #689
* feat: Upgrade to Calico v3.28.0 by @jimmidyson in #688
* feat: bumps caaph to v0.2.3 by @faiq in #691
* feat: Add local-path-provisioner CSI by @jimmidyson in #693
* feat: cluster-api v1.7.3 by @jimmidyson in #714
* feat: bumps caaph to 0.2.4 by @faiq in #718
* feat: Controller that copies ClusterClasses to namespaces by @dlipovetsky in #715
* feat: adds a mindthegap container and deployment by @faiq in #637
* feat: implements BeforeClusterUpgrade hook by @faiq in #682
### Fixes 🔧
* fix: use external Nutanix API types directly by @dkoshkin in #698
* fix: Post-process clusterconfig CRDs for supported CSI providers by @jimmidyson in #695
* fix: nutanix credentials Secrets owner refs by @dkoshkin in #711
* fix: credential provider response secret ownership by @dkoshkin in #709
* fix: static credentials Secret generation by @dkoshkin in #717
* fix: set ownerReference on imageRegistry and globalMirror Secrets by @dkoshkin in #720
* fix: Allow Nutanix CSI snapshot controller & webhook to run on CP nodes by @dlipovetsky in #723
* refactor: Use maps for CSI providers and storage classes by @jimmidyson in #696
* fix: CredentialProviderConfig matchImages to support registries with port by @dkoshkin in #724
* fix: Allow Node Feature Discovery garbage collector to run on control-plane nodes by @dlipovetsky in #722
* fix: RBAC role for namespace-sync controller to watch,list namespaces by @dkoshkin in #738
* fix: image registries not handling CA certificates by @dkoshkin in #729
* fix: adds a docker buildx step before release-snapshot by @faiq in #741
### Other Changes
* docs: Add released version to helm and clusterctl install by @jimmidyson in #683
* revert: Temporary lint config fix until next golangci-lint release (#629) by @jimmidyson in #686
* refactor: Delete unused code by @jimmidyson in #687
* refactor: Reduce log verbosity for skipped handlers by @jimmidyson in #692
* build: update Go to 1.22.4 by @dkoshkin in #700
* build(deps): Upgrade CAPX version to v1.4.0 by @thunderboltsid in #707
* build: Move CSI supported provider logic to script by @jimmidyson in #703
* build: Add testifylint linter by @jimmidyson in #706
* build: Update all tools by @jimmidyson in #704
* refactor: rename credential provider response secret by @dkoshkin in #710
* refactor: Simplify code by using slices.Clone by @jimmidyson in #712
* refactor: consistently use the same SetOwnerReference function by @dkoshkin in #713
* refactor: kube-vip commands by @dkoshkin in #699
* build: Fix an incorrect make variable passed to goreleaser by @dlipovetsky in #716
* build: Add 'chart-docs' make target by @dlipovetsky in #727
* build: Make CAREN mindthegap reg multiarch by @jimmidyson in #730
* Add helm values schema plugin by @dlipovetsky in #728
* test(e2e): Use mesosphere fork with CRSBinding fix by @jimmidyson in #736

## New Contributors
* @thunderboltsid made their first contribution in #707

**Full Changelog**: v0.9.0...v0.10.0

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>