Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release-4.17] OCPBUGS-38721: tuned: distinguish deferred updates #1149

Merged
merged 4 commits into from
Oct 2, 2024

Conversation

yanirq
Copy link
Contributor

@yanirq yanirq commented Sep 1, 2024

This is a manual cherry pick of:
#1129 OCPBUGS-28647: tuned: distinguish deferred updates
#1142 OCPBUGS-38795: Fix defer status during recommended profile change
#1162 NO-JIRA: deferred updates: fix in-place update handling on reboot

ffromani and others added 2 commits September 1, 2024 21:47
* OCPBUGS-28647: tuned: distinguish deferred updates

To fully support the usecase described in OCPBUGS-28647 and fix
the issue, we need to further distinguish between first-time
profile change and in-place profile change. This is required to
better support a GitOps flow.

The key distinction is if the recommended profile changes or not,
and there's a desire to defer application of changes only when a
profile is updated (e.g. sysctl modified), not the first time it
is applied.

Thus:
- first-time profile change is a change which triggers a change
  of the recommended profile.
- in-place profile update is a change which does NOT cause a
  switch to a TuneD profile with a different name. This involves
  changes to only the contents of the currently used profile.

We change the way the annotation is used. We now require a value,
which can be either
- always: every Tuned object annotated this way will have its
  application deferred.
- update: every Tuned object annotated this way will be processed
  as usual (and as it wasn't annotated) if it's a first-time profile
  change, but its in-place updates will be deferred.
- a new internal value "never" is also added to be used internally
  to mean the deferred feature is disabled entirely.
  User can use this value but it will explicitly disable the feature
  (which is disabled already by default), thus is redundant and not
  recommended.

Signed-off-by: Francesco Romani <[email protected]>

* e2e: drop tags, use labels

now that we have the more powerful ginkgo labels,
we can stop using tags for newer tests.

Signed-off-by: Francesco Romani <[email protected]>

---------

Signed-off-by: Francesco Romani <[email protected]>
…enshift#1142)

* OCPBUGS-38795: Fix defer status during recommended profile change

Process recommended profile change even when the profile itself
has not changed itself. The internal profile fingerprint tracking
was not updated during recommended profile update with no internal
changes.

* Add e2e test for defer=update
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Sep 1, 2024
@openshift-ci-robot
Copy link
Contributor

@yanirq: This pull request references Jira Issue OCPBUGS-38721, which is valid.

7 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.17.0) matches configured target version for branch (4.17.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note text is set and does not match the template
  • dependent bug Jira Issue OCPBUGS-28647 is in the state ON_QA, which is one of the valid states (MODIFIED, ON_QA, VERIFIED)
  • dependent Jira Issue OCPBUGS-28647 targets the "4.18.0" version, which is one of the valid target versions: 4.18.0
  • bug has dependents

Requesting review from QA contact:
/cc @mrniranjan

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

This is a manual cherry pick of:
#1129 OCPBUGS-28647: tuned: distinguish deferred updates
#1142 OCPBUGS-38795: Fix defer status during recommended profile change

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Contributor

openshift-ci bot commented Sep 1, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yanirq

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 1, 2024
@yanirq
Copy link
Contributor Author

yanirq commented Sep 1, 2024

/cc @ffromani @MarSik @jmencak
/uncc @kpouget

…enshift#1162)

* log: make sure to have high verbosity in tests

Signed-off-by: Francesco Romani <[email protected]>

* e2e: deferred: stricter testing

add stricter check for the node condition when
validating OCPBUGS-38795

Signed-off-by: Francesco Romani <[email protected]>

* deferred: tuned: fix reload trigger when inplace update

There are conditions on which we should not set the
reload flag. This avoid regression in the test
"Profile deferred when applied should trigger changes when applied fist,
then deferred when edited, if tuned restart should be kept deferred"

Signed-off-by: Francesco Romani <[email protected]>

* deferred: tuned: clarify comments and logs

document better the logic about processing edits

Signed-off-by: Francesco Romani <[email protected]>

---------

Signed-off-by: Francesco Romani <[email protected]>
Copy link
Contributor

@ffromani ffromani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 20, 2024
@ffromani
Copy link
Contributor

/retest

Comment on lines 357 to 362
err = util.SetLogLevel(profile.Spec.Config.Verbosity)
if err != nil {
klog.Errorf("failed to set log level %d: %v", profile.Spec.Config.Verbosity, err)
} else {
klog.Infof("set log level %d", profile.Spec.Config.Verbosity)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this hunk depends on a unbackported PR: #1114 . We shold either also backport it (probably not the best approach) or remove this hunk in a separate commit in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I tend to lean towards the latter. Dropping this hunk.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 21, 2024
Copy link
Contributor

openshift-ci bot commented Sep 21, 2024

@yanirq: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jmencak
Copy link
Contributor

jmencak commented Sep 23, 2024

@ffromani considering 4.17.0 wrap up is close , I am wondering whether we should wait with backport-risk-assessed label.

What makes 4.17.0 special vs. 4.17.z? You need backport-risk-assessed for 4.17.z too and customers are more likely wait for 4.17.z than running with 4.17.0 immediately.

@yanirq
Copy link
Contributor Author

yanirq commented Sep 23, 2024

@ffromani considering 4.17.0 wrap up is close , I am wondering whether we should wait with backport-risk-assessed label.

What makes 4.17.0 special vs. 4.17.z? You need backport-risk-assessed for 4.17.z too and customers are more likely wait for 4.17.z than running with 4.17.0 immediately.

Yes you are right. scratch the version reference.

@ffromani
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 24, 2024
@jmencak
Copy link
Contributor

jmencak commented Sep 24, 2024

/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Sep 24, 2024
@ffromani
Copy link
Contributor

we need staff-eng-approved anyway, so I guess we will end up delivering in 4.17.z

@jmencak
Copy link
Contributor

jmencak commented Oct 2, 2024

@liqcui , can we please have cherry-pick-approved label for this? Thank you!

@liqcui
Copy link

liqcui commented Oct 2, 2024

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Oct 2, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit c77c557 into openshift:release-4.17 Oct 2, 2024
16 checks passed
@openshift-ci-robot
Copy link
Contributor

@yanirq: Jira Issue OCPBUGS-38721: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-38721 has been moved to the MODIFIED state.

In response to this:

This is a manual cherry pick of:
#1129 OCPBUGS-28647: tuned: distinguish deferred updates
#1142 OCPBUGS-38795: Fix defer status during recommended profile change
#1162 NO-JIRA: deferred updates: fix in-place update handling on reboot

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-bot
Copy link
Contributor

[ART PR BUILD NOTIFIER]

Distgit: cluster-node-tuning-operator
This PR has been included in build cluster-node-tuning-operator-container-v4.17.0-202410022234.p0.gc77c557.assembly.stream.el9.
All builds following this will include this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants