NO-JIRA: deferred updates: fix in-place update handling on reboot #1162
Signed-off-by: Francesco Romani <[email protected]>
@@ -1029,19 +1031,18 @@ func (c *Controller) changeSyncerTuneD(change Change) (synced bool, err error) {
	// Cache the value written to tunedRecommendFile.
	c.daemon.recommendedProfile = change.recommendedProfile
	klog.V(1).Infof("recommended TuneD profile updated from %q to %q [inplaceUpdate=%v nodeRestart=%v]", prevRecommended, change.recommendedProfile, inplaceUpdate, change.nodeRestart)
	changeRecommend = true

	if change.deferredMode == util.DeferUpdate && !inplaceUpdate && c.daemon.recoveredRecommendedProfile == change.recommendedProfile {
		klog.V(1).Infof("recommended TuneD profile changed; skip TuneD reload [deferred=%v recoveredRecommended=%v]", change.deferredMode, c.daemon.recoveredRecommendedProfile)
		// Reset because we need only once the first time we process the TuneD k8s object. Let's avoid stale data.
		c.daemon.recoveredRecommendedProfile = ""
	} else {
		klog.V(1).Infof("recommended TuneD profile changed; trigger TuneD reload [deferred=%v]", change.deferredMode)
The log still says tuned should be reloaded. It will confuse people working on this in the future, I think.
Well, the flow will ultimately trigger a tuned reload: see lines around 1065.
But still: how can we improve the logging?
Ah, right! So this is a fix that updates the fingerprint in addition to setting the reload flag. The reload part is not changing. Ok then.
We do want to leave the reload flag unset when very specific conditions are met, such as the one captured in the test mentioned in the commit message, which fails consistently without this fix.
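The condition discussed in this thread can be read as a small decision function. The following is a minimal sketch, not the operator's code: `reloadInput`, `decideReload`, and the local `DeferMode` constants are hypothetical stand-ins for `change.deferredMode`, `inplaceUpdate`, `c.daemon.recoveredRecommendedProfile`, and `change.recommendedProfile` from the hunk above. It assumes the `else` branch is where the reload is requested, and it returns a reason string to show one way the log wording could follow the decision.

```go
package main

// DeferMode is a hypothetical stand-in for the operator's deferred-update mode.
type DeferMode string

const (
	DeferNever  DeferMode = "never"
	DeferAlways DeferMode = "always"
	DeferUpdate DeferMode = "update"
)

// reloadInput gathers the values inspected by the condition in the hunk.
// All field names are illustrative, not taken from the operator.
type reloadInput struct {
	deferredMode         DeferMode
	inplaceUpdate        bool
	recoveredRecommended string // profile name recovered after a restart
	recommended          string // profile name currently recommended
}

// decideReload mirrors the if/else shown in the hunk:
//   - deferred mode is "update", the change is not an in-place update, and
//     the recovered profile equals the recommended one: do not request a
//     reload and clear the recovered value so it is consulted only once;
//   - otherwise: request a reload and leave the recovered value untouched.
//
// The returned reason is meant to be logged as-is, so the log line always
// matches the branch that was taken.
func decideReload(in reloadInput) (reload bool, recovered string, reason string) {
	if in.deferredMode == DeferUpdate &&
		!in.inplaceUpdate &&
		in.recoveredRecommended == in.recommended {
		return false, "", "recommended profile matches the recovered one; reload not requested"
	}
	return true, in.recoveredRecommended, "recommended profile changed; reload requested"
}
```

Called with `deferredMode: DeferUpdate`, `inplaceUpdate: false`, and equal recovered and recommended names, `decideReload` returns `reload == false` and an empty recovered value; any other combination requests a reload.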
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ffromani, MarSik
/hold
/retest
f9d84bc to f429cda
add stricter check for the node condition when validating OCPBUGS-38795
Signed-off-by: Francesco Romani <[email protected]>
There are conditions on which we should not set the reload flag. This avoid regression in the test "Profile deferred when applied should trigger changes when applied fist, then deferred when edited, if tuned restart should be kept deferred"
Signed-off-by: Francesco Romani <[email protected]>
f429cda to c506826
/hold cancel
Managed to deflake further (in my environment), but not completely.
Thank you for the changes, Francesco. They look good to me. I've added some comments which are not exactly fair, because they should have been added during previous reviews. Nevertheless, I believe that better logging and comments about what happens, especially in the
change.deferredMode == util.DeferUpdate && !inplaceUpdate && c.daemon.recoveredRecommendedProfile == change.recommendedProfile
section, would help future maintainers a lot.
/hold
I want to rerun the e2e test suite.
/hold cancel
Thank you for the changes/clarification, Francesco! I've found some typos and things that (IMO) need further clarification. I'll post a suggested diff later on and we can work from there.
I tried to clarify the recently added comments and remove the typos. Hopefully I didn't introduce any false information/typos. Feel free to adjust as you see fit; happy to discuss here or over Slack:
LGTM, applying
document better the logic about processing edits
Signed-off-by: Francesco Romani <[email protected]>
b9947d5 to c3bb0c4
/hold cancel
/lgtm
Hmm, I think
Overriding 2 tests because of a known change in TuneD. The tests already passed with the previous version of TuneD and need to be adjusted.
/override ci/prow/e2e-gcp-pao
@jmencak: Overrode contexts on behalf of jmencak: ci/prow/e2e-gcp-pao, ci/prow/e2e-hypershift-pao
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/jira refresh
@jmencak: No Jira issue is referenced in the title of this pull request.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/retitle NO-JIRA: deferred updates: fix in-place update handling on reboot
to be added manually to #1149
@ffromani: This pull request explicitly references no jira issue.
/override ci/prow/e2e-gcp-pao
@jmencak: Overrode contexts on behalf of jmencak: ci/prow/e2e-gcp-pao, ci/prow/e2e-hypershift-pao
@ffromani: all tests passed!
[ART PR BUILD NOTIFIER] Distgit: cluster-node-tuning-operator
…enshift#1162)

* log: make sure to have high verbosity in tests

Signed-off-by: Francesco Romani <[email protected]>

* e2e: deferred: stricter testing

add stricter check for the node condition when validating OCPBUGS-38795

Signed-off-by: Francesco Romani <[email protected]>

* deferred: tuned: fix reload trigger when inplace update

There are conditions on which we should not set the reload flag. This avoid regression in the test "Profile deferred when applied should trigger changes when applied fist, then deferred when edited, if tuned restart should be kept deferred"

Signed-off-by: Francesco Romani <[email protected]>

* deferred: tuned: clarify comments and logs

document better the logic about processing edits

Signed-off-by: Francesco Romani <[email protected]>

---------

Signed-off-by: Francesco Romani <[email protected]>
)

* OCPBUGS-28647: tuned: distinguish deferred updates (#1129)

* OCPBUGS-28647: tuned: distinguish deferred updates

To fully support the usecase described in OCPBUGS-28647 and fix the issue, we need to further distinguish between first-time profile change and in-place profile change. This is required to better support a GitOps flow.

The key distinction is if the recommended profile changes or not, and there's a desire to defer application of changes only when a profile is updated (e.g. sysctl modified), not the first time it is applied.

Thus:
- first-time profile change is a change which triggers a change of the recommended profile.
- in-place profile update is a change which does NOT cause a switch to a TuneD profile with a different name. This involves changes to only the contents of the currently used profile.

We change the way the annotation is used. We now require a value, which can be either
- always: every Tuned object annotated this way will have its application deferred.
- update: every Tuned object annotated this way will be processed as usual (and as it wasn't annotated) if it's a first-time profile change, but its in-place updates will be deferred.
- a new internal value "never" is also added to be used internally to mean the deferred feature is disabled entirely. User can use this value but it will explicitly disable the feature (which is disabled already by default), thus is redundant and not recommended.

Signed-off-by: Francesco Romani <[email protected]>

* e2e: drop tags, use labels

now that we have the more powerful ginkgo labels, we can stop using tags for newer tests.

Signed-off-by: Francesco Romani <[email protected]>

---------

Signed-off-by: Francesco Romani <[email protected]>

* OCPBUGS-38795: Fix defer status during recommended profile change (#1142)

* OCPBUGS-38795: Fix defer status during recommended profile change

Process recommended profile change even when the profile itself has not changed itself. The internal profile fingerprint tracking was not updated during recommended profile update with no internal changes.

* Add e2e test for defer=update

* NO-JIRA: deferred updates: fix in-place update handling on reboot (#1162)

* log: make sure to have high verbosity in tests

Signed-off-by: Francesco Romani <[email protected]>

* e2e: deferred: stricter testing

add stricter check for the node condition when validating OCPBUGS-38795

Signed-off-by: Francesco Romani <[email protected]>

* deferred: tuned: fix reload trigger when inplace update

There are conditions on which we should not set the reload flag. This avoid regression in the test "Profile deferred when applied should trigger changes when applied fist, then deferred when edited, if tuned restart should be kept deferred"

Signed-off-by: Francesco Romani <[email protected]>

* deferred: tuned: clarify comments and logs

document better the logic about processing edits

Signed-off-by: Francesco Romani <[email protected]>

---------

Signed-off-by: Francesco Romani <[email protected]>

* Omit unbackported logging support

---------

Signed-off-by: Francesco Romani <[email protected]>
Co-authored-by: Francesco Romani <[email protected]>
Co-authored-by: Martin Sivák <[email protected]>
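The annotation semantics described in the OCPBUGS-28647 commit message above (`always`, `update`, and the internal `never`) can be sketched as follows. This is an illustration under stated assumptions, not the operator's implementation: `ParseDeferMode`, `IsInplaceUpdate`, and `ShouldDefer` are hypothetical names, an in-place update is approximated as "the recommended profile name did not change", and any value outside the three listed ones is treated as an error.

```go
package main

import "fmt"

// DeferMode is a hypothetical type for the value carried by the annotation.
type DeferMode string

const (
	DeferNever  DeferMode = "never"  // feature disabled (the default)
	DeferAlways DeferMode = "always" // every change is deferred
	DeferUpdate DeferMode = "update" // only in-place updates are deferred
)

// ParseDeferMode validates an annotation value. Rejecting unknown values
// (including the empty string) is an assumption of this sketch, based on
// the statement that the annotation now requires a value.
func ParseDeferMode(value string) (DeferMode, error) {
	switch DeferMode(value) {
	case DeferAlways, DeferUpdate, DeferNever:
		return DeferMode(value), nil
	}
	return DeferNever, fmt.Errorf("unsupported deferred mode %q", value)
}

// IsInplaceUpdate approximates the distinction drawn in the commit message:
// a first-time change switches to a profile with a different name, while an
// in-place update keeps the name and changes only the profile contents.
func IsInplaceUpdate(prevRecommended, newRecommended string) bool {
	return prevRecommended == newRecommended
}

// ShouldDefer reports whether applying a change should be deferred.
func ShouldDefer(mode DeferMode, inplaceUpdate bool) bool {
	switch mode {
	case DeferAlways:
		return true
	case DeferUpdate:
		// First-time changes are processed as usual; edits are deferred.
		return inplaceUpdate
	default:
		return false
	}
}
```

With mode `update`, `ShouldDefer(DeferUpdate, IsInplaceUpdate(prev, next))` is false when the recommended profile name changes and true when only the contents of the current profile are edited.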