Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release-4.8] Bug 2092355: Rebase to 0.47.11 #278

Merged
merged 24 commits into from
Jun 15, 2022

Conversation

creydr
Copy link
Member

@creydr creydr commented Jun 7, 2022

Inspired by the commit description for #243

Identify carry commits:

git remote update
git log --oneline --no-merges v0.47.11..openshift/release-4.8

Note that the UPSTREAM: <carry>: prefix was added to any new commits to carry since the previous rebase. This will make new carry commits easy to identify alongside prior carry commits.

After identifying the carry commits, the next step is to create the new commit-tree that will be used for the rebase and then cherry pick the carry commits into the new branch.

The following commands cover these steps:

Process

$ git remote update # make sure we update our refs
$ git checkout v0.47.11 # upstream kubernetes-nmstate 0.47.11 commit tag
$ git checkout -b merge-tmp # create a branch to do our merge work from
$ git checkout openshift/release-4.8 # we want to be at the tip of the openshift master branch when we run the next command
$ echo 'merge nmstate/kubernetes-nmstate v0.47.11' | git commit-tree merge-tmp^{tree} -p HEAD -p merge-tmp -F -
deadbeef12345678 # id of new merge commit, output by the previous command
$ git branch merge-0.47.11 deadbeef12345678 # create a new branch for the cherry-pick work
$ git checkout merge-0.47.11 # make sure we are on the proper branch
$ git cherry-pick <carry commits>

Carried Commits:

  • UPSTREAM: <carry>: Add midstream approvers and reviewers (623a6ef)
  • UPSTREAM: <carry>: Add manifests for 4.7 (0cfa30e)
  • UPSTREAM: <carry>: Add manifests for 4.8 (16dc236)
  • UPSTREAM: <carry>: Change handler toleration to "operator: exists" (#755) (974af03)
  • UPSTREAM: <carry>: Update Dockerfile images to match ART (340e7fb)
  • UPSTREAM: <carry>: Set default bundle channel to 4.8 (faff316)
  • UPSTREAM: <carry>: Name replacement must exactly match string in CSV (f1e6664)
  • UPSTREAM: <carry>: Fix olm.skipRange substitution (62ae112)
  • UPSTREAM: <carry>: Remove serviceaccount manifest (66048b5)

Release note:

Rebase to kubernetes-nmstate upstream v0.47.11

qinqon and others added 24 commits June 15, 2021 12:10
At the beginning, check if NNCE exists for a policy and is in
progressing state.

If it is, then it means that handler was killed while applying the
desired state - do not increment unavailableNodeCount in such case.

There is still a window where killing handler would cause
inconsistency - between status is set to Failed or Success and the
unavailableNodeCount is decremented.

Signed-off-by: Radim Hrazdil <[email protected]>
Signed-off-by: Quique Llorente <[email protected]>

Co-authored-by: Radim Hrazdil <[email protected]>
Add system-node-critical to nmstate-handler,
since this pod should run on each node,

Add system-cluster-critical to nmstate-webhook
and to nmstate-cert-manager pods,
since those pods aren't bound to a specific node,
yet those are important control plane pods.

This will make the control plane less sensitive to preemption
than user workloads.

Signed-off-by: Or Shoval <[email protected]>
For each of the 3 nmstate pods,
supply the minimum cpu / memory requests.
Values are set according empiric tests.

See https://bugzilla.redhat.com/show_bug.cgi?id=1935218

Signed-off-by: Or Shoval <[email protected]>
Currently, the label app=kubernetes-nmstate is used to count number
of nmstate enabled nodes, but this label is also present on webhooks
and cert manager, which may run on control-plane.

Using label component=kubernetes-nmstate-handler ensures that only
nodes with handlers running on them are considered.

Signed-off-by: Radim Hrazdil <[email protected]>
The default value of maxUnavailable is 1. That serializes upgrades and
may drag for very long time. This change turns it into 10% to speed up
the upgrade process while keeping services reasonably available.

https://bugzilla.redhat.com/show_bug.cgi?id=1990065

Signed-off-by: Petr Horáček <[email protected]>

Co-authored-by: Petr Horáček <[email protected]>
Operator deletes new NMState CR if there is already one
existing. The problem is however, that current CRs are not
always listed in correct order, which means that the current
logic sometimes doesn't delete new CR when another already
exists, causing e2e tests to flake.

This commit sorts listed NMstate CRs when there are more than one
of them, and compare the name of the reconciled CR with the oldest.

Signed-off-by: Radim Hrazdil <[email protected]>
…iltering script (nmstate#793) (nmstate#820)

* Add test suite file to pkg/helpers

pkg/helper tests are not executed because of missing suite file.

Signed-off-by: Radim Hrazdil <[email protected]>

* Add sjson package for manipulating json strings

Signed-off-by: Radim Hrazdil <[email protected]>

* Set default vlan config on linux-bridge ports in api

Instead of applying vlan configuration after setting
desiredState using a script, add default configuration to desiredState
using nmstate API.

Set default VLAN configuration to policy enactment desiredState
in Reconcile, so that created NNCE desiredState
contains the applied default values.

Signed-off-by: Radim Hrazdil <[email protected]>

* Update desiredState defaults withing enactment init

In order to save one API call, update the enactment
desiredState defaults in initializeEnactment function.

Signed-off-by: Radim Hrazdil <[email protected]>

* Remove unused vlan-filtering script

Vlan configuration is done via nmstate API, so there is no
need to keep the vlan_filtering script.

Update test that relied on renaming the vlan_filtering
script to make the NNCP configuration fail to achieve rollback.
Instead, pass incorrect yaml.

Signed-off-by: Radim Hrazdil <[email protected]>

* Add example for linux-bridge with port vlan

Signed-off-by: Radim Hrazdil <[email protected]>
Dockerfile still attempty to copy build/bin directory,
but this folder has been removed.

Signed-off-by: Radim Hrazdil <[email protected]>
kubernetes requeues request with rate limiting when an error
is returned, regardless of what is in the returned Result.

As a consequence, in big clusters, nodes that wait for policy
need to wait for increasingly long time between each reconcile
loop.

Signed-off-by: Radim Hrazdil <[email protected]>
It limits the number of certificates it overlaps.

Signed-off-by: Quique Llorente <[email protected]>
When upgrading from an older knmstate version that uses
vlan-filtering script to a version relying on nmstate
API, nmstate fails on verification when reconciling
bridge with ports configured by the older knmstate
version.

This change applies the vlans using nmcli, so that
there is no discrepancy between kernel and NM configuration.

Signed-off-by: Radim Hrazdil <[email protected]>
Only apply vlan filtering via nmcli to bridges and ports
that are listed in desiredState and also exist in current
state.

Signed-off-by: Radim Hrazdil <[email protected]>
* Timeout if Pending takes too long

Add LastUnavailableNodeCount timestamp to avoid
getting stuck at Pending forever.

Signed-off-by: Radim Hrazdil <[email protected]>

* handler, container: Use centos stream 8

Signed-off-by: Radim Hrazdil <[email protected]>
The internal Networking team needs to be able to approve and review.

Signed-off-by: Brad P. Crochet <[email protected]>
(cherry picked from commit 647a569)
Signed-off-by: Brad P. Crochet <[email protected]>
(cherry picked from commit f5a3d5a)
…mstate#755)

Since nmstate is to be used for configuration of the network
infrastructure, we want the handler to run on all nodes regardless
of taint. If a network config should not run on a given node, the
NNCP nodeSelector field can be used to accomplish that.

For example, a node with a taint of:

[map[effect:NoSchedule key:node.ocs.openshift.io/storage value:true]]

will currently keep the nmstate handler pod from running there.
However, that is not desirable since it prevents the use of nmstate
to configure networking on that storage node.

This changes the handler toleration to "operator: exists", which
will allow the handler to run on all nodes. The webhook toleration
is left alone since there is no need for that to be running on
nodes with a NoSchedule taint.

Signed-off-by: Ben Nemec <[email protected]>
(cherry picked from commit b2bccf9)
(cherry picked from commit de92fbc)
This is no longer required[0] and may cause issues when newer versions
of the operator-sdk are used.

0: operator-framework/operator-sdk#5326
(cherry picked from commit 180fb54)
@openshift-ci
Copy link

openshift-ci bot commented Jun 7, 2022

@creydr: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

In response to this:

[release-4.8] [WIP] Rebase to 0.47.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested review from cybertron and phoracek June 7, 2022 09:44
@cybertron
Copy link
Member

/lgtm
/approve
/hold

We're going to try to run some more tests on this before it merges, but since we didn't have e2e CI on this repo it's going to have to be manual. Once we're satisfied that this has been tested as much as is reasonable we can remove the hold and merge it.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 7, 2022
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 7, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jun 7, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: creydr, cybertron

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 7, 2022
@creydr creydr changed the title [release-4.8] [WIP] Rebase to 0.47.11 [release-4.8] Rebase to 0.47.11 Jun 8, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jun 8, 2022

@creydr: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

In response to this:

[release-4.8] Rebase to 0.47.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@creydr
Copy link
Member Author

creydr commented Jun 13, 2022

/test all

1 similar comment
@creydr
Copy link
Member Author

creydr commented Jun 14, 2022

/test all

@creydr
Copy link
Member Author

creydr commented Jun 15, 2022

@cybertron @phoracek @qinqon the e2e tests passed after the rebase. Do we have a BZ for this?

@cybertron
Copy link
Member

@cybertron
Copy link
Member

/retitle [release-4.8] Bug 2092355: Rebase to 0.47.11

@openshift-ci openshift-ci bot changed the title [release-4.8] Rebase to 0.47.11 [release-4.8] Bug 2092355: Rebase to 0.47.11 Jun 15, 2022
@openshift-ci openshift-ci bot added bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Jun 15, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jun 15, 2022

@creydr: This pull request references Bugzilla bug 2092355, which is invalid:

  • expected the bug to target the "4.8.z" release, but it targets "---" instead
  • expected dependent Bugzilla bug 2093941 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is NEW instead
  • expected dependent Bugzilla bug 2093941 to target a release in 4.9.0, 4.9.z, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

[release-4.8] Bug 2092355: Rebase to 0.47.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cybertron
Copy link
Member

/bugzilla refresh

@openshift-ci
Copy link

openshift-ci bot commented Jun 15, 2022

@cybertron: This pull request references Bugzilla bug 2092355, which is invalid:

  • expected dependent Bugzilla bug 2093941 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is NEW instead
  • expected dependent Bugzilla bug 2093941 to target a release in 4.9.0, 4.9.z, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cybertron
Copy link
Member

/bugzilla refresh

@openshift-ci openshift-ci bot added bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Jun 15, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jun 15, 2022

@cybertron: This pull request references Bugzilla bug 2092355, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.z) matches configured target release for branch (4.8.z)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 2097396 is in the state CLOSED (CURRENTRELEASE), which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE))
  • dependent Bugzilla bug 2097396 targets the "4.9.z" release, which is one of the valid target releases: 4.9.0, 4.9.z
  • bug has dependents

Requesting review from QA contact:
/cc @amalykhi

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested a review from amalykhi June 15, 2022 15:33
@cybertron
Copy link
Member

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 15, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jun 15, 2022

@creydr: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci openshift-ci bot merged commit f324084 into openshift:release-4.8 Jun 15, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jun 15, 2022

@creydr: All pull requests linked via external trackers have merged:

Bugzilla bug 2092355 has been moved to the MODIFIED state.

In response to this:

[release-4.8] Bug 2092355: Rebase to 0.47.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-bot
Copy link

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-kubernetes-nmstate-operator-container-v4.8.0-202311261141.p0.gf324084.assembly.stream for distgit openshift-kubernetes-nmstate-operator.
All builds following this will include this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants