Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-36870: Remove tuned/rendered object #1110

Conversation

jmencak
Copy link
Contributor

@jmencak jmencak commented Jul 11, 2024

This is a manual backport of #1036 . Changes from #1036: using the current (deprecated) k8s.io/utils/pointer package and adjusted the test code to use it.

The NTO operand is controlled by the operator by updates to two resources. Its corresponding k8s Tuned Profile resource and tuned/rendered object, which contains a list of all TuneD (daemon) profiles.

While this setup works for most cases, there is a problem with this approach when a cluster administator changes both a current TuneD profile content and (at the same) time switches to a new TuneD profile completely. Then, depending on the k8s object update timing, we could see two TuneD daemon reloads instead of just one.

Remove the tuned/rendered object and carry TuneD (daemon) profiles directly in the Tuned Profile k8s objects.

Resolves: OCPBUGS-36870

@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Jul 11, 2024
@openshift-ci-robot
Copy link
Contributor

@jmencak: This pull request references Jira Issue OCPBUGS-36870, which is valid.

7 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.15.z) matches configured target version for branch (4.15.z)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
  • release note text is set and does not match the template
  • dependent bug Jira Issue OCPBUGS-32469 is in the state Closed (Done-Errata), which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA))
  • dependent Jira Issue OCPBUGS-32469 targets the "4.16.0" version, which is one of the valid target versions: 4.16.0, 4.16.z
  • bug has dependents

No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

This is a manual backport of #1036 . Changes from 1036:

go get k8s.io/utils/ptr
go mod vendor
go mod tidy

The NTO operand is controlled by the operator by updates to two resources. Its corresponding k8s Tuned Profile resource and tuned/rendered object, which contains a list of all TuneD (daemon) profiles.

While this setup works for most cases, there is a problem with this approach when a cluster administator changes both a current TuneD profile content and (at the same) time switches to a new TuneD profile completely. Then, depending on the k8s object update timing, we could see two TuneD daemon reloads instead of just one.

Remove the tuned/rendered object and carry TuneD (daemon) profiles directly in the Tuned Profile k8s objects.

Resolves: OCPBUGS-36870

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jmencak
Copy link
Contributor Author

jmencak commented Jul 11, 2024

Our epic voyage towards 4.14 backports begins
/cc @ffromani
as he already reviewed #1036

@openshift-ci openshift-ci bot requested review from ffromani, dagrayvid and Tal-or July 11, 2024 13:57
Copy link
Contributor

openshift-ci bot commented Jul 11, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jmencak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 11, 2024
@jmencak
Copy link
Contributor Author

jmencak commented Jul 11, 2024

Right, so the linter is failing on the deprecated k8s.io/utils/pointer package. I guess there are two options:

  • Pull in the changes from a70c07f09 wrt. dropping the deprecated k8s.io/utils/pointer package.
  • Keep using the deprecated k8s.io/utils/pointer package and adjust the current (test) code to use it.

Opinions, reviewers? Thank you.

@jmencak
Copy link
Contributor Author

jmencak commented Jul 12, 2024

Right, so the linter is failing on the deprecated k8s.io/utils/pointer package. I guess there are two options:

* Pull in the changes from [a70c07f09](https://github.com/openshift/cluster-node-tuning-operator/commit/a70c07f09) wrt. dropping the deprecated `k8s.io/utils/pointer` package.

* Keep using the deprecated `k8s.io/utils/pointer` package and adjust the current (test) code to use it.

Opinions, reviewers? Thank you.

@ffromani thoughts keeping other backporting efforts in mind? Both options should work, the first one will probably be more more invasive. (Slightly) leaning towards bullet point 2 myself, but happy with both.

@ffromani
Copy link
Contributor

Right, so the linter is failing on the deprecated k8s.io/utils/pointer package. I guess there are two options:

* Pull in the changes from [a70c07f09](https://github.com/openshift/cluster-node-tuning-operator/commit/a70c07f09) wrt. dropping the deprecated `k8s.io/utils/pointer` package.

* Keep using the deprecated `k8s.io/utils/pointer` package and adjust the current (test) code to use it.

Opinions, reviewers? Thank you.

@ffromani thoughts keeping other backporting efforts in mind? Both options should work, the first one will probably be more more invasive. (Slightly) leaning towards bullet point 2 myself, but happy with both.

same here, slightly leaning towards 2. I'd go for it unless other reviewers have stronger opinions

The NTO operand is controlled by the operator by updates to two
resources.  Its corresponding k8s Tuned Profile resource and
tuned/rendered object, which contains a list of all TuneD (daemon)
profiles.

While this setup works for most cases, there is a problem with this
approach when a cluster administator changes both a current TuneD
profile content and (at the same) time switches to a new TuneD profile
completely.  Then, depending on the k8s object update timing, we could
see two TuneD daemon reloads instead of just one.

Remove the tuned/rendered object and carry TuneD (daemon) profiles
directly in the Tuned Profile k8s objects.

Resolves: OCPBUGS-36870
@jmencak jmencak force-pushed the 4.15-OCPBUGS-36870-no-rendered branch from 14ea159 to 2287f21 Compare July 15, 2024 08:48
Copy link
Contributor

openshift-ci bot commented Jul 15, 2024

@jmencak: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jmencak
Copy link
Contributor Author

jmencak commented Jul 15, 2024

Right, so the linter is failing on the deprecated k8s.io/utils/pointer package. I guess there are two options:

* Pull in the changes from [a70c07f09](https://github.com/openshift/cluster-node-tuning-operator/commit/a70c07f09) wrt. dropping the deprecated `k8s.io/utils/pointer` package.

* Keep using the deprecated `k8s.io/utils/pointer` package and adjust the current (test) code to use it.

Opinions, reviewers? Thank you.

@ffromani thoughts keeping other backporting efforts in mind? Both options should work, the first one will probably be more more invasive. (Slightly) leaning towards bullet point 2 myself, but happy with both.

Thank you. Changed the PR to use option 2 and the tests pass. @Tal-or , opinions? Thank you.

@Tal-or
Copy link
Contributor

Tal-or commented Jul 15, 2024

Right, so the linter is failing on the deprecated k8s.io/utils/pointer package. I guess there are two options:

* Pull in the changes from [a70c07f09](https://github.com/openshift/cluster-node-tuning-operator/commit/a70c07f09) wrt. dropping the deprecated `k8s.io/utils/pointer` package.

* Keep using the deprecated `k8s.io/utils/pointer` package and adjust the current (test) code to use it.

Opinions, reviewers? Thank you.

@ffromani thoughts keeping other backporting efforts in mind? Both options should work, the first one will probably be more more invasive. (Slightly) leaning towards bullet point 2 myself, but happy with both.

Thank you. Changed the PR to use option 2 and the tests pass. @Tal-or , opinions? Thank you.

Works for me as well. We can change it later during some maintenance work in the future

@Tal-or
Copy link
Contributor

Tal-or commented Jul 22, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 22, 2024
@ffromani
Copy link
Contributor

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 22, 2024
@jmencak
Copy link
Contributor Author

jmencak commented Jul 23, 2024

/hold

Thank you for looking, Francesco. May I have the reasoning behind this? Thank you!

@ffromani
Copy link
Contributor

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 23, 2024
@ffromani
Copy link
Contributor

/hold

Thank you for looking, Francesco. May I have the reasoning behind this? Thank you!

Didn't want #1019 to be preempted too much by other PRs merging, causing #1019 to retrigger CI over and over and over again and going in merge starvation. Didn't notice this PR is not against master branch.

@jmencak
Copy link
Contributor Author

jmencak commented Jul 23, 2024

/hold

Thank you for looking, Francesco. May I have the reasoning behind this? Thank you!

Didn't want #1019 to be preempted too much by other PRs merging, causing #1019 to retrigger CI over and over and over again and going in merge starvation. Didn't notice this PR is not against master branch.

Makes sense, thank you for the reasoning.

@jmencak
Copy link
Contributor Author

jmencak commented Jul 23, 2024

/label backport-risk-assessed
@liqcui , can we please have the cherry-pick-approved label?

@openshift-ci openshift-ci bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Jul 23, 2024
@liqcui
Copy link

liqcui commented Jul 23, 2024

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Jul 23, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit 7eab9a3 into openshift:release-4.15 Jul 23, 2024
15 checks passed
@openshift-ci-robot
Copy link
Contributor

@jmencak: Jira Issue OCPBUGS-36870: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-36870 has been moved to the MODIFIED state.

In response to this:

This is a manual backport of #1036 . Changes from #1036: using the current (deprecated) k8s.io/utils/pointer package and adjusted the test code to use it.

The NTO operand is controlled by the operator by updates to two resources. Its corresponding k8s Tuned Profile resource and tuned/rendered object, which contains a list of all TuneD (daemon) profiles.

While this setup works for most cases, there is a problem with this approach when a cluster administator changes both a current TuneD profile content and (at the same) time switches to a new TuneD profile completely. Then, depending on the k8s object update timing, we could see two TuneD daemon reloads instead of just one.

Remove the tuned/rendered object and carry TuneD (daemon) profiles directly in the Tuned Profile k8s objects.

Resolves: OCPBUGS-36870

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jmencak jmencak deleted the 4.15-OCPBUGS-36870-no-rendered branch July 23, 2024 06:58
@openshift-bot
Copy link
Contributor

[ART PR BUILD NOTIFIER]

Distgit: cluster-node-tuning-operator
This PR has been included in build cluster-node-tuning-operator-container-v4.15.0-202407230808.p0.g7eab9a3.assembly.stream.el9.
All builds following this will include this PR.

@openshift-merge-robot
Copy link
Contributor

Fix included in accepted release 4.15.0-0.nightly-2024-07-23-123403

jmencak added a commit to jmencak/cluster-node-tuning-operator that referenced this pull request Aug 14, 2024
This is a backport openshift#1110 of which resolved OCPBUGS-36870 in 4.15.

The NTO operand is controlled by the operator by updates to two
resources.  Its corresponding k8s Tuned Profile resource and
tuned/rendered object, which contains a list of all TuneD (daemon)
profiles.

While this setup works for most cases, there is a problem with this
approach when a cluster administator changes both a current TuneD
profile content and (at the same) time switches to a new TuneD profile
completely.  Then, depending on the k8s object update timing, we could
see two TuneD daemon reloads instead of just one.

Remove the tuned/rendered object and carry TuneD (daemon) profiles
directly in the Tuned Profile k8s objects.

Resolves: OCPBUGS-37754
jmencak added a commit to jmencak/cluster-node-tuning-operator that referenced this pull request Aug 14, 2024
This is a backport openshift#1110 of which resolved OCPBUGS-36870 in 4.15.

The NTO operand is controlled by the operator by updates to two
resources.  Its corresponding k8s Tuned Profile resource and
tuned/rendered object, which contains a list of all TuneD (daemon)
profiles.

While this setup works for most cases, there is a problem with this
approach when a cluster administator changes both a current TuneD
profile content and (at the same) time switches to a new TuneD profile
completely.  Then, depending on the k8s object update timing, we could
see two TuneD daemon reloads instead of just one.

Remove the tuned/rendered object and carry TuneD (daemon) profiles
directly in the Tuned Profile k8s objects.

Resolves: OCPBUGS-37754
openshift-merge-bot bot pushed a commit that referenced this pull request Aug 19, 2024
This is a backport #1110 of which resolved OCPBUGS-36870 in 4.15.

The NTO operand is controlled by the operator by updates to two
resources.  Its corresponding k8s Tuned Profile resource and
tuned/rendered object, which contains a list of all TuneD (daemon)
profiles.

While this setup works for most cases, there is a problem with this
approach when a cluster administator changes both a current TuneD
profile content and (at the same) time switches to a new TuneD profile
completely.  Then, depending on the k8s object update timing, we could
see two TuneD daemon reloads instead of just one.

Remove the tuned/rendered object and carry TuneD (daemon) profiles
directly in the Tuned Profile k8s objects.

Resolves: OCPBUGS-37754

Co-authored-by: Jiri Mencak <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants