forked from nmstate/kubernetes-nmstate
-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rebase to upstream 0.44.0 #189
Merged
openshift-merge-robot
merged 56 commits into
openshift:master
from
cybertron:merge-0.44.0
Apr 12, 2021
Merged
Rebase to upstream 0.44.0 #189
openshift-merge-robot
merged 56 commits into
openshift:master
from
cybertron:merge-0.44.0
Apr 12, 2021
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The prow publish job [1] is on commiting since there is no git user email configured, this change do similar to kubevirtci [2] and adds the kubevirtbot name and email. [1] https://storage.googleapis.com/kubevirt-prow/logs/publish-kubernetes-nmstate/1331559224613801984/build-log.txt [2] https://github.com/kubevirt/kubevirtci/blob/master/publish.sh#L29-L30 Signed-off-by: Quique Llorente <[email protected]>
Operator-sdk and opm are necessary for dealing with bundle files, and testing those bundle files. Let's vendor them so we can pin the versions used. Signed-off-by: Brad P. Crochet <[email protected]>
* Roll out policy configuration sequentially Policies are by default applied sequentially, one node at a time. Introduced new API field for NNCP: Parallel, which can be set to true to roll out configuration in parallel. Signed-off-by: Radim Hrazdil <[email protected]> * Address review commits round1 Signed-off-by: Radim Hrazdil <[email protected]> * Set parallel od test-policy nncp based on env variable Signed-off-by: Radim Hrazdil <[email protected]> * Test enactments are progressing concurrently when nncp parallel is true Signed-off-by: Radim Hrazdil <[email protected]> * Add skip flag for parallel test when parallel is not set to true Signed-off-by: Radim Hrazdil <[email protected]>
Currectly kubernetes-nmstate prints the network state every time it does not find the default gw when the probe is running, clearly this is too verbose. To fix that this change in case of "-v=production" just print the error and the gjson path used to find the default gw, in case of "-v=debug" it also prints the network state. Signed-off-by: Quique Llorente <[email protected]>
* ci, Change ci Makefile commands use KUBECONFIG and OPERATOR_NAMESPACE Currently test-e2e-handler and test-e2e-operator use explicit kubeconfig of the knmstate repo. In order to run tier1 tests, we want to be able to use KUBECONFIG set by external cluster. We also add OPERATOR_NAMESPACE to ecplicitly mention that it is used in the tests. Added OPERATOR_NAMESPACE var to makefile targets commands Added KUBECONFIG var instead of explicit kubeconfig in ci makefile targets Signed-off-by: Ram Lavi <[email protected]> * ci, Remove usage of namespace in cluster scoped objects Currently we are using the namespace to find NodeNetworkState object This is wrong, since this object is cluster scoped. Removed the usage of the namespac when searching of NodeNetworkState objects Signed-off-by: Ram Lavi <[email protected]>
The kubernetes e2e test reporter was activated for operator tests but not for handler. Signed-off-by: Quique Llorente <[email protected]>
Current kubernetes-nmstate version does not Reconcile already populated NNCPs if Nodes are added to the cluster or node labels are modified. This PR force Reconcile of all policies if a node is create or updated. Since Reconcile functions already take care to checking node labels we choose to simplify the implementation here just checking for Node Update not checking if the update is for labels or if the labels are for this node, nmstate is declarative so if if same version is re-apply it should not be a noop. Signed-off-by: Quique Llorente <[email protected]>
* tests: Show NNCE if NNCP is not matching If a NNCP is not matching in the e2e tests the Reason and the Message is missing from the output and also the NNCEs are not presented. This change uses gomega gstruct matcher so we can have fuzzy match on Reason and Message and still it will show it them and also add the NNCEs in case of not matching. Signed-off-by: Quique Llorente <[email protected]> * tests: Initialize correctly testenv The OperatorNamespace variable was not initialized before kubernetes reporter is created so it couldn't find pods to get the logs from. Signed-off-by: Quique Llorente <[email protected]>
When rolling out nncp sequentially, if a node fails to apply configuration other nodes do not attempt to apply it and set their enactment condition to Aborted. Signed-off-by: Radim Hrazdil <[email protected]>
* Bump nmstate to 0.3 Looks like nmstate 0.2 is going to approach EOL so we bump to next minor it also depends on NM 1.26. We are not going to bump kubecirtci node NM since we have to check that this is retrocompatible. Signed-off-by: Quique Llorente <[email protected]> * Don't dump k8s pod logs at test output After fix the k8s reporter a part from dumping pod logs to a file it also dump it to GinkgoWriter, the GinkgoWriter show the content written with it when the test fail. Dumping this to logs makes the output too long and we already have it at the file, this change remove that and make the reporter just dump to a file. Signed-off-by: Quique Llorente <[email protected]> * Print generated bridge1 and bond1 names To reduce the collision between test in case tear down is broken we generate a different name for bridge1 and bond1 variable per test, for debugging purposes is good to know what this names are per test so you can see at what test they were generated. Signed-off-by: Quique Llorente <[email protected]> * Install tools only at go.mod changes After some changes at master the go install command to install the tools is run everytime a make target that need the tools is executed, this change leave as it was, creating a dependency between the binary and go.mod. Signed-off-by: Quique Llorente <[email protected]> * Adapt thest to nmstate-0.3 checks After bumping to nmstate-0.3 a lot o test were failing with good reason, they have bugs at tear-down code and some bridge/bonds where not propertly deleted, this make nmstate-0.3 fail when another bridge/bond is trying to use as slave a nic that already belongs to another bridge/bond. To fix all this now the tests are very strict too and the network configuration state check is done at all nodes not only at workers since some of our tests are run at master nodes too (examples, node label). Also there is a bug a trace that prints a big and ugly error log related of using WithValue instead of with name WithName at allPolicies function. Signed-off-by: Quique Llorente <[email protected]> * Add sleep after changing node labels at tests The NNCPs are reconciled if node labels are changed, but there is no webhook that blocks the operation until the NNCP status is reset, that's make some times fall into a wrong assumptions around status. This change add some sleep time after changing labels with a TODO about removing it when webhook is implemented. Signed-off-by: Quique Llorente <[email protected]>
Running 'make format' changes logo files, adding grep filter to ignore these. Signed-off-by: Radim Hrazdil <[email protected]>
k8s release-notes cmd has change the argument name from --format to --go-template, this change aligns the infra with that. Signed-off-by: Quique Llorente <[email protected]>
Future versions of nmstate/nm need to be tested to ensure that they don't introduce something really incompatible with kubernetes-nmstate to implement that a new "knob" is added to Makefile in the form of a NMSTATE_PIN variable that will use future versions if content is "future". Signed-off-by: Quique Llorente <[email protected]>
…e#656) The condition contains ! (negation) which should not be there and as a consequence incorrect error is returned Signed-off-by: Radim Hrazdil <[email protected]>
Signed-off-by: Tareq Alayan <[email protected]> Co-authored-by: Tareq Alayan <[email protected]>
Due to a BZ https://bugzilla.redhat.com/show_bug.cgi?id=1906307 temporarily removing second slave iface from the bond. When resloved, will add it back. Signed-off-by: Radim Hrazdil <[email protected]>
The node controller refresh NNSs every 5 seconds at every Reconcile it do three things, GET Node, call nmstatectl show and UPDATE NNS, this is too much of apiserver traffic and end up with big cpu usage. This change reduce the calls to GET and UPDATE by comparing current network state with last retrieved network state and if they differ then it do them. Signed-off-by: Quique Llorente <[email protected]>
…#668) * Print NNS test currentState as string Comparin directly the NNS Status struct print currentState as slice of bytes that does not help to find differences, this change tranform the obtained and expected to strings to make easier to find differences. Signed-off-by: Quique Llorente <[email protected]> * Remove gc-timer from e2e tests check When NNS current is compared the dynamic attributes has to be removed for it to match like is already dont at controller. Signed-off-by: Quique Llorente <[email protected]>
Test that try to create the test fail will fail if it's not deleted at all specs as teardown part. Signed-off-by: Quique Llorente <[email protected]>
Signed-off-by: Radim Hrazdil <[email protected]>
Listen 3.2.1 doesn't support ruby version >= 3.0 Signed-off-by: Radim Hrazdil <[email protected]>
Bumps [nokogiri](https://github.com/sparklemotion/nokogiri) from 1.10.10 to 1.11.1. - [Release notes](https://github.com/sparklemotion/nokogiri/releases) - [Changelog](https://github.com/sparklemotion/nokogiri/blob/master/CHANGELOG.md) - [Commits](sparklemotion/nokogiri@v1.10.10...v1.11.1) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Radim Hrazdil <[email protected]>
The NodeNeworkState is showing up calico interfaces but they are calico implementation tails and kubernets-nmstate should not show or modify them, also looks like it affects e2e tests on calico systems [1]. This change just filter them same as we do with veth. [1] https://prow.apps.ovirt.org/view/gcs/kubevirt-prow/pr-logs/pull/kubevirt_cluster-network-addons-operator/727/pull-e2e-cluster-network-addons-operator-nmstate-functests/1347188092745814016 Signed-off-by: Quique Llorente <[email protected]>
* Replace "slave" with "port" at e2e tests Future version of nmstate is replacing some degrading terms like slave with more neutral ones. This change replaace slave with ports to make tests compatible with new version of nmstate. Signed-off-by: Quique Llorente <[email protected]> * Expec miimon as integer value instead of string Signed-off-by: Quique Llorente <[email protected]> * Parameterize 'port' field name Signed-off-by: Quique Llorente <[email protected]> * Parameterize 'miimon' value Signed-off-by: Quique Llorente <[email protected]>
Added a validation webhook that denies updating a policy when this policy is still in progress. Signed-off-by: Radim Hrazdil <[email protected]>
Trying to refresh NNS every 5 seconds makes nmstatectl to call python everytime consuming CPU resources, decreasing the frequency will reduce CPU consumption. Also increasing the refresh period to 1 minute make CI fail on timeout, to overcome that we force NNS refresh if a NNCP is applied. And filter gc-timer and hello-timer hey are already not reflecting reality since we remove them before compare them so the value that is present at NNS is not accurate, let's just remove them from NNS so we can use FilterOut to compare states too, so we don't compare filtered stuff like veth or calico. Proper solution is to use varlink so python is already started up at different container, this is done at nmstate#663. Signed-off-by: Quique Llorente <[email protected]>
Refreshing NNS at the same time on all the nodes can hit apiserver hard, let's jitter it a little so refreshing is distributed over time. Signed-off-by: Quique Llorente <[email protected]>
We have found that at some envs the "bridge" section from linux-bridge is not present in case the bridge is down, this change ignore the filtering in this is the case to preveng a segmentation fault. Signed-off-by: Quique Llorente <[email protected]>
Signed-off-by: Avishay Traeger <[email protected]>
If the KUBEVIRT_PROVIDER is not set at the Makefile the secondary nics names are not known and test start to fail. This change add the default KUBEVIRT_PROVIDER at the Makefile and remove it from the automation so the Makefile is the one ruling about it. Signed-off-by: Quique Llorente <[email protected]>
nmstate 0.3 introduces a bug where all veths attached to nmstatectl configured bridge turn into "managed" by NetworkManager. This was not the case in 0.2. Due to this regression, NetworkManager deliberatelly detaches veth ifaces from the bridge and by doing that it disconnectecs Pods/VMs from the network. With this change, we explicitly set veth interfaces as unmanaged. Important: With this change, it is no longer possible to change configuration of bridges that have veths attached. Signed-off-by: Petr Horáček <[email protected]>
This reverts commit f717775. Signed-off-by: Petr Horáček <[email protected]>
* Bump kaw to v0.13.0 The certificate overlap mechanism is implemented like renewBefore from cert-manager. Signed-off-by: Quique Llorente <[email protected]> * Correct certificate rotation overlap values The kube-admission-webhook v0.13.0 use CertOverlapInterval as the moment when certificates has to start to rotate before the current certificate expiration time, this change reduce CertOverlapInterval and CARotateInterval to take that into account. Signed-off-by: Quique Llorente <[email protected]>
* state: Interpret basic state through an explicit structure In order to ease the interpretation of the state yaml data, use a type structure that represents the basic root items of state: - interfaces - routes: - config - running In order to keep the generic nature of these items content, they use as value a generic Go interface{}. The filter helper functions, with this change, have been also made immutable. They now return the filtered values. Signed-off-by: Edward Haas <[email protected]> * state, filter: Introduce unit tests for numeric iface names Interfaces names with numeric values are accepted by the kernel. Introduce tests that checks the behavior when using numeric names. Signed-off-by: Edward Haas <[email protected]> * state, filter: Do not panic on interpretation of ifaces names In scenarios where an interface name is composed of numbers in scientific notation without a dot, the application panics with a failure to extract a string type from a Go interface (which is actually a float64). In order to fix the problem, the parsing of the interface name is performed using a dedicated interface-state structure. Note: this change does not solve the problem of incorrectly representing valid scientific numeric values like `10e+02` as strings (they are now represented back as full integers: "1000"). This is due to the YAML-to-JSON conversion which does not obey to the defined member type. ref: yaml/pyyaml#173 Signed-off-by: Edward Haas <[email protected]> * state, filter: Fix interpretation of scientific numeric iface names This change solves the problem of incorrectly representing valid scientific numeric valuesi without a dot in them (e.g. `10e20`) as strings (e.g. represented back as `1000`). The problem originates from the YAML-to-JSON conversion which does not obey to the defined member type. Fixed by using the go-yaml package (which does not perform such a conversion from YAML to JSON). The go-yaml is used just to unmarshal the name from the original raw byte stream and update the state structure with the proper name string. Signed-off-by: Edward Haas <[email protected]>
* Bump kaw v0.14.0 and operator-sdk v1.4.2 The webhookconfiguration v1alpha1 is going to be deprecated at kubernetes 1.22. This change bumps operator-sdk and kube-admission-webhook to support webhookconfiguration v1 Signed-off-by: Quique Llorente <[email protected]> * Adapt code to new operator-sdk and kaw versions After bumping operator-sdk to 1.4.2 the Reconcile function expects a context.Context and the Meta at events is included in the client.Object passed to it. Signed-off-by: Quique Llorente <[email protected]> * Don't use .gitignore at automation rsync After bumping deps some of the .gitignore in the project is making rsync ignore "vendor/github.com/moby/term/.gitignore" and this breaks "make check-gen" Signed-off-by: Quique Llorente <[email protected]>
The network configuration at nodes is chaning sometimes without human intervention (at kubevirtci some ipv6 routes disappear sometimes or appear sometimes). This change do a best effort and checks nns update timestamp according to currentState changed or not. Signed-off-by: Quique Llorente <[email protected]>
Currently, the operator will only reconcile an nmstate resource that has a name 'nmstate'. This patch makes it possible to add the resource as any name, but subsequent resources added will be ignored. Signed-off-by: Brad P. Crochet <[email protected]>
* sync vendor Signed-off-by: Radim Hrazdil <[email protected]> * Change spec.Parallel field with spec.MaxUnavailable Change parallel field with maxUnavailable, which allows more granular configuration for concurrent policy configuration. MaxUnavailable can be either set with an int value or percentage in string: spec: maxUnavailable: "50%" or spec: maxUnavailable: 3 Updated used guide and added an example policy that specifies maxUnavailable. Signed-off-by: Radim Hrazdil <[email protected]> * Update tests that check conditions Update test that check nnce conditions to tolerate both failing and aborted conditions when invalid nncp is created Signed-off-by: Radim Hrazdil <[email protected]>
Instead of calculating number of maxUnavailable nodes from all nmstate enabled nodes, use matching nodes only. Signed-off-by: Radim Hrazdil <[email protected]>
Keep owners we latest review comments. Signed-off-by: Quique Llorente <[email protected]>
Currectly the cert-manager from kube-admission-webhook is running whithing the webhook server itself, this means that we need to create a placeholder secret and also a very specific readiness probe has to be used to ensure that secret is properly rotated at start up. This change introduces a new pod runing the cert-manager code so we don't need to create the placeholder secret (the webhook pod will not start until the secret is not created) and we don't need a readiness probess since secret will be correct form the beginning. Also start up is faster since webhook pods start at the very moment the secret is created by cert-manager pod. Signed-off-by: Quique Llorente <[email protected]>
* Consolidate parallel and sequential lanes Run the e2e handler test lane only once, configured to use default 50% of worker nodes to process NNCPs concurrently. For that reason, setting number of nodes to 5 so that 2 nodes run concurrently (4/2 == 2) Signed-off-by: Radim Hrazdil <[email protected]> * Fix test conditions Now when a failing NNCP is applied, we accept NNCEs in two states: aborted and failed. Signed-off-by: Radim Hrazdil <[email protected]>
Radim has being contributing a lot lately so is fair to make him maintainer. Signed-off-by: Quique Llorente <[email protected]>
Although the mounted secret is good from the beginning, webhook will be only ready when webhook server is started up. Signed-off-by: Quique Llorente <[email protected]>
test_logs directory is created without x permissions so it's not possible to cd into it. Signed-off-by: Radim Hrazdil <[email protected]>
Bumps [kramdown](https://github.com/gettalong/kramdown) from 2.3.0 to 2.3.1. - [Release notes](https://github.com/gettalong/kramdown/releases) - [Changelog](https://github.com/gettalong/kramdown/blob/master/doc/news.page) - [Commits](https://github.com/gettalong/kramdown/commits) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The internal Networking team needs to be able to approve and review. Signed-off-by: Brad P. Crochet <[email protected]>
Signed-off-by: Brad P. Crochet <[email protected]>
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bcrochet, cybertron The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
CoreDNS v1.8.1 rebase process
Inspired by the commit description for openshift/coredns#52
Identify carry commits:
git log --oneline --no-merges v0.44.0..openshift/master
Note that the
UPSTREAM: <carry>: openshift:
prefix was added to any new commits to carry since the previous rebase. This will make new carry commits easy to identify alongside prior carry commits.After identifying the carry commits, the next step is to create the new commit-tree that will be used for the rebase and then cherry pick the carry commits into the new branch.
The following commands cover these steps:
Process
My 0.44.0 merge commit from the above process is (9e19439).
With the merge-0.44.0 branch in place, I cherry picked the carry commits. Merge conflicts were resolved on an individual commit basis when necessary. Some carry commits were squashed to reduce the number of cherry-picks in future rebases. If there are any questions there please let me know.
Carried Commits (applied bottom to top / oldest first):
Pre-Squash Commits (all of these were squashed into one of the above commits):