Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Plugin Config Daemon node selector #484

Merged

Conversation

wizhaoredhat
Copy link
Contributor

@wizhaoredhat wizhaoredhat commented Aug 2, 2023

When node selectors are added via "configDaemonNodeSelector" via the sriovoperatorconfigs CRD then completely removed, the previous node selectors would remain for the plugin config daemons (sriov-device-plugin). This was not cleaned up, because the function "syncPluginDaemonSet" bails out when "configDaemonNodeSelector" is empty. For example:

Step 1) Create 3 labels in node selector:
configDaemonNodeSelector = {"labelA": "", "labelB": "", "labelC": ""}
device-plugin DS NodeSelector = {"labelA": "", "labelB": "", "labelC": ""} 

Step 2) Remove all labels in node selector:
configDaemonNodeSelector = {}
device-plugin DS NodeSelector = {"labelA": "", "labelB": "", "labelC": ""}

The expectation is that the device plugin will revert to the default node selectors (e.g. {"node-role.kubernetes.io/worker": "", "kubernetes.io/os": "linux"})

This change will make the node selector field in the sriov-device-plugin to take in a variable. This default node selector value is then shared with the code that updates the node selector.

@github-actions
Copy link

github-actions bot commented Aug 2, 2023

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@wizhaoredhat
Copy link
Contributor Author

/cc @SchSeba
/cc @zeeke
/cc @e0ne
/cc @adrianchiris
/cc @zshi-redhat

@github-actions github-actions bot requested a review from zshi-redhat August 2, 2023 00:31
@coveralls
Copy link

Pull Request Test Coverage Report for Build 5733255954

  • 1 of 44 (2.27%) changed or added relevant lines in 3 files are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.06%) to 24.39%

Changes Missing Coverage Covered Lines Changed/Added Lines %
controllers/sriovnetworknodepolicy_controller.go 0 1 0.0%
controllers/helper.go 0 7 0.0%
controllers/sriovoperatorconfig_controller.go 1 36 2.78%
Files with Coverage Reduction New Missed Lines %
controllers/sriovibnetwork_controller.go 2 64.15%
Totals Coverage Status
Change from base Build 5645891211: -0.06%
Covered Lines: 2080
Relevant Lines: 8528

💛 - Coveralls

@wizhaoredhat
Copy link
Contributor Author

@bn222

When node selectors are added via "configDaemonNodeSelector"
via the sriovoperatorconfigs CRD then completely removed, the previous
node selectors would remain for the plugin config daemons
(sriov-device-plugin). This was not cleaned up, because the function
"syncPluginDaemonSet" bails out when "configDaemonNodeSelector" is
empty. For example:

Step 1) Create 3 labels in node selector:
configDaemonNodeSelector = {"labelA": "", "labelB": "", "labelC": ""}
device-plugin DS NodeSelector = {"labelA": "", "labelB": "", "labelC": ""}

Step 2) Remove all labels in node selector:
configDaemonNodeSelector = {}
device-plugin DS NodeSelector = {"labelA": "", "labelB": "", "labelC": ""}

The expectation is that the device plugin will revert to the default
node selectors (e.g. {"node-role.kubernetes.io/worker": "",
"kubernetes.io/os": "linux"})

This change will make the node selector field in the sriov-device-plugin
to take in a variable. This default node selector value is then shared
with the code that updates the node selector.

Signed-off-by: William Zhao <[email protected]>
@wizhaoredhat wizhaoredhat force-pushed the fix_plugin_ds_node_selector branch from ad92936 to 48b2aeb Compare August 23, 2023 23:04
@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

Copy link
Member

@zeeke zeeke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wizhaoredhat
Copy link
Contributor Author

@adrianchiris @e0ne Please take a look when you have a chance. Thanks!

@adrianchiris
Copy link
Collaborator

Hey, A question:

What is the behaviour today when first deploying sriov-network-operator with sriovoperatorconfig having an empty ConfigDaemonNodeSelector?

as far as i see from the codebase. it will not deploy daemonsets. is my understanding correct ?
if thats the case, i think we would want to return to the same state. (that is delete the ds)

@SchSeba
Copy link
Collaborator

SchSeba commented Sep 11, 2023

Hi @adrianchiris,
by default the config-daemon have node selector on workers you can check in the bin/data folder

@wizhaoredhat
Copy link
Contributor Author

@adrianchiris @SchSeba Yes by default the node selector is on workers with an empty ConfigDaemonNodeSelector. Thanks for responding to this.

Is this good on your end. Could we merge this changes?

@e0ne e0ne merged commit 532df8a into k8snetworkplumbingwg:master Sep 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants