WIP: OCPBUGS-28647: consume deferred updates from performance profile #1118
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ffromani The full list of commands accepted by this bot can be found here. The pull request process is described here
e30ada3
to
1b0603e
Compare
@ffromani: This pull request references Jira Issue OCPBUGS-28647, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
/hold
fe45be9
to
218ed63
Compare
@@ -595,6 +600,17 @@ func (r *PerformanceProfileReconciler) isMixedCPUsEnabled(object client.Object)
	return profileutil.IsMixedCPUsEnabled(object.(*performancev2.PerformanceProfile))
}

func (r *PerformanceProfileReconciler) isTunedDeferredEnabled(object client.Object) bool {
	if ntoconfig.InHyperShift() {
		return false
I know it's still WIP, so I might be judging too early.
We recently aligned all NTO/PAO features to work on Hypershift. Let's keep to that spirit unless there's a really good reason not to.
And even if there is, let's add a comment explaining why we cannot support it on Hypershift.
yes, good point. Will fix/clarify by the time the WIP tag is lifted.
We need to come up with a mechanism to keep this practice then
@yanirq For new features we usually add e2e tests.
Since the same tests run on both OCP and HCP, they are likely to fail on HCP if no one took care to add support for the new feature there.
pkg/performanceprofile/controller/performanceprofile/components/tuned/tuned.go
218ed63
to
40ae191
Compare
40ae191
to
d5731dd
Compare
need to clarify if we want to update at all in hypershift. It won't harm, but would it be useful?
Since we intend to backport this PR, I am wondering if we should actually keep the separation and have a followup PR for hypershift support.
to backport the feature we will need code changes anyway (unfortunately). I think the hypershift support is not a concern in this case.
If the performance profile carries the very same annotation that enables tuned deferred updates, that annotation is propagated to the generated tuned objects, effectively enabling the tuned deferred update feature.
Deferred updates are not supported in hypershift yet.
Signed-off-by: Francesco Romani <[email protected]>
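The propagation described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper `propagateDeferred`, the `inHyperShift` flag, the annotation value `always`, and the annotation key itself are all assumptions made for the example.

```go
package main

import "fmt"

// deferredAnnotation is an ASSUMED annotation key, used for illustration only.
const deferredAnnotation = "tuned.openshift.io/deferred"

// propagateDeferred is a hypothetical helper. It copies the deferred-update
// annotation from the performance profile's annotations onto the annotations
// of a generated Tuned object. inHyperShift stands in for the platform check.
func propagateDeferred(profileAnn, tunedAnn map[string]string, inHyperShift bool) map[string]string {
	if inHyperShift {
		// Deferred updates are not supported in HyperShift yet,
		// so the annotation is intentionally not propagated.
		return tunedAnn
	}
	val, ok := profileAnn[deferredAnnotation]
	if !ok {
		// The profile does not opt in: leave the Tuned annotations untouched.
		return tunedAnn
	}
	if tunedAnn == nil {
		tunedAnn = map[string]string{}
	}
	tunedAnn[deferredAnnotation] = val
	return tunedAnn
}

func main() {
	// "always" is a placeholder value chosen for this example.
	profile := map[string]string{deferredAnnotation: "always"}
	fmt.Println(propagateDeferred(profile, nil, false))
	fmt.Println(propagateDeferred(profile, nil, true))
}
```

Keeping the platform check inside the helper puts the reason for skipping HyperShift next to the code that skips it, which is what the review comment on this PR asks for.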
d5731dd
to
31e133b
Compare
We'll need an e2e test for that, unless there's something as part of NTO tests
/hold cancel
I am a bit worried about this one. PerfProfile does more than just create Tuned. We also post MC and KubeletConfigs that do not support deferred updates. And they need to stay in sync. Moreover, any change to cpu sets or kernel args in PP causes an MCP reboot. So I wonder what can be changed in PP that goes to Tuned without rebooting the nodes.
each of these can be addressed individually. But it's allowed for a change in the perfprofile to fan out to multiple objects, and each of them can cause a reboot. And by design these objects are reconciled independently.
uhm, the linked bug mentions only tuned though. Fixing all the other reboot causes, assuming we can in the current architecture, is a much larger scope.
I somewhat addressed this part in my last commit, PTAL @MarSik @yanirq
I'm good with the setup that leaves the option to have multiple objects. Let's agree on the method presented here (a set of objects, using wildcards) so we are consistent in the documentation as well (u/s docs as well?)
The original intent of the deferred updates is to address tuned-triggered reboots. By design, a performanceprofile fans out into objects managed by different, independent controllers. Guaranteeing deferred updates for the performanceprofile is thus a much harder problem than for tuned alone. To meet these conflicting needs, we augment the lower-level tuned annotation to accept a comma-separated, lowercase list of components whose updates should be deferred. So far we will only support tuned.
xref: https://issues.redhat.com/browse/OCPBUGS-28647
Signed-off-by: Francesco Romani <[email protected]>
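As a rough illustration of the "comma-separated, lowercase list of components" described in the commit message above, the annotation value could be parsed along these lines. The function `parseDeferredComponents`, the `supportedComponents` table, and the choices to trim whitespace and reject unknown names are assumptions of this sketch, not behavior confirmed by the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedComponents is a hypothetical table of components whose updates
// can be deferred. Per the commit message, only "tuned" is supported so far.
var supportedComponents = map[string]bool{"tuned": true}

// parseDeferredComponents is a hypothetical helper. It splits a
// comma-separated annotation value into component names, rejecting entries
// that are not lowercase or not supported. Duplicates are collapsed.
func parseDeferredComponents(value string) ([]string, error) {
	var out []string
	seen := map[string]bool{}
	for _, item := range strings.Split(value, ",") {
		name := strings.TrimSpace(item) // tolerating whitespace is an assumption
		if name == "" {
			continue
		}
		if name != strings.ToLower(name) {
			return nil, fmt.Errorf("component %q must be lowercase", name)
		}
		if !supportedComponents[name] {
			return nil, fmt.Errorf("component %q does not support deferred updates", name)
		}
		if !seen[name] {
			seen[name] = true
			out = append(out, name)
		}
	}
	return out, nil
}

func main() {
	components, err := parseDeferredComponents("tuned")
	fmt.Println(components, err)

	// "kubeletconfig" is used here only as an example of an unsupported name.
	components, err = parseDeferredComponents("tuned,kubeletconfig")
	fmt.Println(components, err)
}
```

Rejecting unsupported names outright, rather than silently ignoring them, would surface the case raised in the discussion where a profile change fans out to objects that cannot be deferred.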
ddb8ad1
to
d50fa39
Compare
Talking about classic OCP (non-hypershift): we do have e2e tests for the tuned functionality, and we have controller tests which ensure that the label is propagated correctly. Would this be sufficient coverage?
@ffromani: all tests passed!
tuned-level changes are sufficient to close this bug. But we need one more followup, to be filed soon.
@ffromani: This pull request references Jira Issue OCPBUGS-28647. The bug has been updated to no longer refer to the pull request using the external bug tracker.
Consume the deferred updates support from the performance profile. The idea is that if the annotation (the same one NTO uses) is found on the performanceprofile, it is passed through to the generated tuned object.
Initially we don't support hypershift.