-
Notifications
You must be signed in to change notification settings - Fork 137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cnf-tests: dynamically created KubeletConfig #1583
cnf-tests: dynamically created KubeletConfig #1583
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: zeeke The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test ? |
@zeeke: The following commands are available to trigger required jobs:
The following commands are available to trigger optional jobs:
Use In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
0f296c3
to
bea3895
Compare
e616fda
to
6029de6
Compare
/hold Reconfiguring a PerformanceProfile causes multiple node reboots that can lead to a job timeout Need to find a different way to test these scenarios |
41cf0d9
to
287969f
Compare
/hold cancel Using KubeletConfig instead of PerformanceProfile makes it easier to reboot the node only once, leading to a suite setup time of ~8 minutes. This rehearsal job [1] shows NUMA test passing:
(see [2] for full build-log.txt. The last line says "Performance profile ..." because of a leftover, fixed in a recent commit) @gregkopels, @SchSeba Please take a look [1] https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/41815/rehearse-41815-periodic-ci-openshift-release-master-nightly-4.14-e2e-telco5g-cnftests/1688107838901063680 |
67c7507
to
6b40d67
Compare
/retest |
cnf-tests/testsuites/pkg/machineconfigpool/machineconfigpool.go
Outdated
Show resolved
Hide resolved
cnf-tests/testsuites/pkg/machineconfigpool/machineconfigpool.go
Outdated
Show resolved
Hide resolved
mcp, err := client.Client.MachineConfigPools().Get(context.Background(), name, metav1.GetOptions{}) | ||
if err != nil { | ||
klog.Warningf("Error while waiting for MachineConfigPool[%s] to be updated: %v", name, err) | ||
return false, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we expect here to always succeed to get the mcp?
Wondering why returning nil and not err.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be an API server error or some other transient error. Here we return nil because the wait.Poll(...)
function stops looping if the callback returns an error. It's different than the gomega Eventually
loop.
if label == "node-role.kubernetes.io/worker" { | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if statement that filters out "worker" nodes imposes a limitation to this function that is not trivial.
I think maybe making this function more generic by removing this statement and make the caller dealing with the returned value as he needs.
It's just an opinion. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, but node-role.kubernetes.io/worker
is a kind of special role. Indeed, a node can have only one role other than worker
, called "custom role". If you try putting two or more roles on a node, MachineConfigOperator stops managing that node and starts complaining about that. [1]
I would rename this function FindCustomRoleLabel(...)
and add a comment explaining that.
cnf-tests/testsuites/pkg/machineconfigpool/machineconfigpool.go
Outdated
Show resolved
Hide resolved
cnf-tests/testsuites/pkg/machineconfigpool/machineconfigpool.go
Outdated
Show resolved
Hide resolved
cnf-tests/testsuites/pkg/machineconfigpool/machineconfigpool.go
Outdated
Show resolved
Hide resolved
cnf-tests/testsuites/pkg/machineconfigpool/machineconfigpool.go
Outdated
Show resolved
Hide resolved
6b40d67
to
a871c19
Compare
a871c19
to
c359af3
Compare
cnf-tests/testsuites/pkg/machineconfigpool/machineconfigpool.go
Outdated
Show resolved
Hide resolved
Thank you for addressing the comments. just a small nit, but besides that it's lgtm |
c359af3
to
1ca3ced
Compare
mcv1 "github.com/openshift/machine-config-operator/pkg/apis/machineconfiguration.openshift.io/v1" | ||
kubeletconfigv1beta1 "k8s.io/kubelet/config/v1beta1" | ||
|
||
"github.com/openshift/cluster-node-tuning-operator/pkg/performanceprofile/controller/performanceprofile/components" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT - Maybe move up with other github.com/openshift-kni/ imports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure! import list looks nicer, thanks
/lgtm |
@gkopels: changing LGTM is restricted to collaborators In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Make NUMA/SR-IOV integration tests creating their own KubeletConfig that sets single-numa-node policy and reserve an enitre NUMA node to system. Add functions to manipulate `node-role.kubernetes.io/x` labels on node to apply the performacen profile to arbitrary nodes. Signed-off-by: Andrea Panattoni <[email protected]>
Signed-off-by: Andrea Panattoni <[email protected]>
1ca3ced
to
7853e40
Compare
/lgtm |
/retest |
Failing test
doesn't belong to this PR. /override ci/prow/e2e-gcp-ovn |
@zeeke: Overrode contexts on behalf of zeeke: ci/prow/e2e-gcp-ovn In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Make NUMA/SR-IOV integration tests creating their own PerformanceProfile that sets single-numa-node policy and reserve an enitre NUMA node as Isolated.
Add functions to manipulate
node-role.kubernetes.io/x
labels on node to apply the performacen profile to arbitrary nodes.cc @gregkopels