-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release-4.8] Bug 2092355: Rebase to 0.47.11 #278
Conversation
At the beginning, check if NNCE exists for a policy and is in progressing state. If it is, then it means that handler was killed while applying the desired state - do not increment unavailableNodeCount in such case. There is still a window where killing handler would cause inconsistency - between status is set to Failed or Success and the unavailableNodeCount is decremented. Signed-off-by: Radim Hrazdil <[email protected]> Signed-off-by: Quique Llorente <[email protected]> Co-authored-by: Radim Hrazdil <[email protected]>
Add system-node-critical to nmstate-handler, since this pod should run on each node, Add system-cluster-critical to nmstate-webhook and to nmstate-cert-manager pods, since those pods aren't bound to a specific node, yet those are important control plane pods. This will make the control plane less sensitive to preemption than user workloads. Signed-off-by: Or Shoval <[email protected]>
For each of the 3 nmstate pods, supply the minimum cpu / memory requests. Values are set according empiric tests. See https://bugzilla.redhat.com/show_bug.cgi?id=1935218 Signed-off-by: Or Shoval <[email protected]>
Currently, the label app=kubernetes-nmstate is used to count number of nmstate enabled nodes, but this label is also present on webhooks and cert manager, which may run on control-plane. Using label component=kubernetes-nmstate-handler ensures that only nodes with handlers running on them are considered. Signed-off-by: Radim Hrazdil <[email protected]>
The default value of maxUnavailable is 1. That serializes upgrades and may drag for very long time. This change turns it into 10% to speed up the upgrade process while keeping services reasonably available. https://bugzilla.redhat.com/show_bug.cgi?id=1990065 Signed-off-by: Petr Horáček <[email protected]> Co-authored-by: Petr Horáček <[email protected]>
Operator deletes new NMState CR if there is already one existing. The problem is however, that current CRs are not always listed in correct order, which means that the current logic sometimes doesn't delete new CR when another already exists, causing e2e tests to flake. This commit sorts listed NMstate CRs when there are more than one of them, and compare the name of the reconciled CR with the oldest. Signed-off-by: Radim Hrazdil <[email protected]>
…iltering script (nmstate#793) (nmstate#820) * Add test suite file to pkg/helpers pkg/helper tests are not executed because of missing suite file. Signed-off-by: Radim Hrazdil <[email protected]> * Add sjson package for manipulating json strings Signed-off-by: Radim Hrazdil <[email protected]> * Set default vlan config on linux-bridge ports in api Instead of applying vlan configuration after setting desiredState using a script, add default configuration to desiredState using nmstate API. Set default VLAN configuration to policy enactment desiredState in Reconcile, so that created NNCE desiredState contains the applied default values. Signed-off-by: Radim Hrazdil <[email protected]> * Update desiredState defaults withing enactment init In order to save one API call, update the enactment desiredState defaults in initializeEnactment function. Signed-off-by: Radim Hrazdil <[email protected]> * Remove unused vlan-filtering script Vlan configuration is done via nmstate API, so there is no need to keep the vlan_filtering script. Update test that relied on renaming the vlan_filtering script to make the NNCP configuration fail to achieve rollback. Instead, pass incorrect yaml. Signed-off-by: Radim Hrazdil <[email protected]> * Add example for linux-bridge with port vlan Signed-off-by: Radim Hrazdil <[email protected]>
Dockerfile still attempty to copy build/bin directory, but this folder has been removed. Signed-off-by: Radim Hrazdil <[email protected]>
kubernetes requeues request with rate limiting when an error is returned, regardless of what is in the returned Result. As a consequence, in big clusters, nodes that wait for policy need to wait for increasingly long time between each reconcile loop. Signed-off-by: Radim Hrazdil <[email protected]>
It limits the number of certificates it overlaps. Signed-off-by: Quique Llorente <[email protected]>
When upgrading from an older knmstate version that uses vlan-filtering script to a version relying on nmstate API, nmstate fails on verification when reconciling bridge with ports configured by the older knmstate version. This change applies the vlans using nmcli, so that there is no discrepancy between kernel and NM configuration. Signed-off-by: Radim Hrazdil <[email protected]>
Only apply vlan filtering via nmcli to bridges and ports that are listed in desiredState and also exist in current state. Signed-off-by: Radim Hrazdil <[email protected]>
* Timeout if Pending takes too long Add LastUnavailableNodeCount timestamp to avoid getting stuck at Pending forever. Signed-off-by: Radim Hrazdil <[email protected]> * handler, container: Use centos stream 8 Signed-off-by: Radim Hrazdil <[email protected]>
…mstate#1067) Signed-off-by: Radim Hrazdil <[email protected]>
The internal Networking team needs to be able to approve and review. Signed-off-by: Brad P. Crochet <[email protected]> (cherry picked from commit 647a569)
Signed-off-by: Brad P. Crochet <[email protected]> (cherry picked from commit f5a3d5a)
(cherry picked from commit 4d19528)
…mstate#755) Since nmstate is to be used for configuration of the network infrastructure, we want the handler to run on all nodes regardless of taint. If a network config should not run on a given node, the NNCP nodeSelector field can be used to accomplish that. For example, a node with a taint of: [map[effect:NoSchedule key:node.ocs.openshift.io/storage value:true]] will currently keep the nmstate handler pod from running there. However, that is not desirable since it prevents the use of nmstate to configure networking on that storage node. This changes the handler toleration to "operator: exists", which will allow the handler to run on all nodes. The webhook toleration is left alone since there is no need for that to be running on nodes with a NoSchedule taint. Signed-off-by: Ben Nemec <[email protected]> (cherry picked from commit b2bccf9) (cherry picked from commit de92fbc)
(cherry picked from commit 1d55cc8)
(cherry picked from commit d340b59)
(cherry picked from commit 41987bc)
(cherry picked from commit 07ae771)
This is no longer required[0] and may cause issues when newer versions of the operator-sdk are used. 0: operator-framework/operator-sdk#5326 (cherry picked from commit 180fb54)
@creydr: No Bugzilla bug is referenced in the title of this pull request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm We're going to try to run some more tests on this before it merges, but since we didn't have e2e CI on this repo it's going to have to be manual. Once we're satisfied that this has been tested as much as is reasonable we can remove the hold and merge it. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: creydr, cybertron The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@creydr: No Bugzilla bug is referenced in the title of this pull request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/test all |
1 similar comment
/test all |
@cybertron @phoracek @qinqon the e2e tests passed after the rebase. Do we have a BZ for this? |
/retitle [release-4.8] Bug 2092355: Rebase to 0.47.11 |
@creydr: This pull request references Bugzilla bug 2092355, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/bugzilla refresh |
@cybertron: This pull request references Bugzilla bug 2092355, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/bugzilla refresh |
@cybertron: This pull request references Bugzilla bug 2092355, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 6 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/hold cancel |
@creydr: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@creydr: All pull requests linked via external trackers have merged: Bugzilla bug 2092355 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[ART PR BUILD NOTIFIER] This PR has been included in build ose-kubernetes-nmstate-operator-container-v4.8.0-202311261141.p0.gf324084.assembly.stream for distgit openshift-kubernetes-nmstate-operator. |
Inspired by the commit description for #243
Identify carry commits:
Note that the
UPSTREAM: <carry>:
prefix was added to any new commits to carry since the previous rebase. This will make new carry commits easy to identify alongside prior carry commits.After identifying the carry commits, the next step is to create the new commit-tree that will be used for the rebase and then cherry pick the carry commits into the new branch.
The following commands cover these steps:
Process
merge nmstate/kubernetes-nmstate v0.47.0
af57ae1 (Bug 1954309: Rebase to upstream 0.47.0 #191).Carried Commits:
UPSTREAM: <carry>: Add midstream approvers and reviewers
(623a6ef)UPSTREAM: <carry>: Add manifests for 4.7
(0cfa30e)UPSTREAM: <carry>: Add manifests for 4.8
(16dc236)UPSTREAM: <carry>: Change handler toleration to "operator: exists" (#755)
(974af03)UPSTREAM: <carry>: Update Dockerfile images to match ART
(340e7fb)UPSTREAM: <carry>: Set default bundle channel to 4.8
(faff316)UPSTREAM: <carry>: Name replacement must exactly match string in CSV
(f1e6664)UPSTREAM: <carry>: Fix olm.skipRange substitution
(62ae112)UPSTREAM: <carry>: Remove serviceaccount manifest
(66048b5)Release note: