clean linkstatus on node remove and fix vlanconfig validation error #79

Merged: 5 commits, Mar 24, 2023

Conversation

@yaocw2020 (Contributor) commented Mar 22, 2023

Problem:

  1. The link statuses of a deleted node are not cleared from the link monitors.
  2. The vlanconfig webhook (including the validator and mutator) runs even when the spec is unchanged, which causes unnecessary validation errors.

Solution:

  1. Add an OnRemove handler in the node controller to clear the deleted node's link statuses (see the sketch after this list).
  2. Use reflect to compare the spec of the old vlanconfig with that of the new one; if the spec is unchanged, skip the webhook (also sketched below).
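
The following is a minimal, self-contained Go sketch of the two changes. The type, field, and function names are simplified stand-ins rather than the repository's actual API; it only illustrates the shape of the logic.

package main

import (
	"fmt"
	"reflect"
)

// Hypothetical, simplified stand-ins for the LinkMonitor and VlanConfig CRDs.

type Link struct {
	Name  string
	State string
}

type LinkMonitor struct {
	Name   string
	Status struct {
		LinkStatus map[string][]Link // keyed by node name
	}
}

// clearLinkStatuses drops the removed node's entry from every link monitor;
// this is the kind of work an OnRemove handler on the node controller would
// do before persisting the updated statuses.
func clearLinkStatuses(nodeName string, lms []*LinkMonitor) {
	for _, lm := range lms {
		delete(lm.Status.LinkStatus, nodeName)
		// the real handler would follow this with an update of the
		// LinkMonitor status through its client
	}
}

type VlanConfigSpec struct {
	ClusterNetwork string
	NodeSelector   map[string]string
}

type VlanConfig struct {
	Spec VlanConfigSpec
}

// validateUpdate returns early when the spec is unchanged, so updates that do
// not touch the spec no longer trip the validator (the mutator can use the
// same check).
func validateUpdate(oldVC, newVC *VlanConfig) error {
	if reflect.DeepEqual(oldVC.Spec, newVC.Spec) {
		return nil
	}
	// ...the existing validation rules would run here...
	return nil
}

func main() {
	lm := &LinkMonitor{Name: "nic"}
	lm.Status.LinkStatus = map[string][]Link{
		"harv-11-1": {{Name: "ens33", State: "up"}},
		"harv-11-2": {{Name: "ens34", State: "up"}},
	}
	clearLinkStatuses("harv-11-2", []*LinkMonitor{lm})
	fmt.Println(lm.Status.LinkStatus) // only the harv-11-1 entry remains

	vc := &VlanConfig{Spec: VlanConfigSpec{ClusterNetwork: "test"}}
	fmt.Println(validateUpdate(vc, vc)) // <nil>: unchanged spec is skipped
}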

As a side change, golang.org/x/net is upgraded in a more elegant way.

Related Issues:
harvester/harvester#3685
harvester/harvester#3697

Test plan:

Case 1:

Case 2:

  • Set up a cluster with two nodes; the compute node has more than one NIC.
  • Add a new cluster network and a network config that selects the compute node.
  • Delete the compute node. The cluster network becomes unready, the value of the annotation network.harvesterhci.io/matched-nodes is "[]", and the related vlanstatus is deleted.

@yaocw2020 requested review from bk201 and guangbochen, March 22, 2023 09:24
@guangbochen requested a review from futuretea, March 22, 2023 09:26
Review threads (outdated, resolved):
  • pkg/controller/manager/node/controller.go
  • pkg/webhook/vlanconfig/validator.go
@guangbochen changed the title from "Solution two issues related to node change" to "clean linkstatus on node remove and fix vlanconfig validation error", Mar 23, 2023
@futuretea (Contributor) commented Mar 23, 2023

Maybe this PR should be divided into two; the validation of harvester/harvester#3685 passed in my environment.

@futuretea (Contributor) left a comment

LGTM, Thanks! Test PASS

❯ kubectl -n harvester-system get po -o custom-columns='NAME:metadata.name,IMAGES:spec.containers[*].image'
NAME                                                    IMAGES
harvester-7bc4cd4ff8-7hm8t                              rancher/harvester:v1.1-head
harvester-load-balancer-6f97f675f9-qtcrz                rancher/harvester-load-balancer:v0.1.4
harvester-network-controller-gwzz8                      harbor.futuretea.me/rancher/harvester-network-controller:c00dd1a-amd64
harvester-network-controller-manager-6746b59c75-4vlqc   harbor.futuretea.me/rancher/harvester-network-controller:c00dd1a-amd64
harvester-network-webhook-5b999f8649-gn7zf              harbor.futuretea.me/rancher/harvester-network-webhook:c00dd1a-amd64
harvester-node-disk-manager-njjkt                       rancher/harvester-node-disk-manager:v0.4.10
harvester-node-manager-4svxh                            rancher/harvester-node-manager:v0.1.3
harvester-webhook-79955cff6b-67cp2                      rancher/harvester-webhook:v1.1-head
kube-vip-4lv6x                                          ghcr.io/kube-vip/kube-vip:v0.5.10
kube-vip-cloud-provider-0                               kubevip/kube-vip-cloud-provider:v0.0.1
virt-api-7cfd78b999-rwmcn                               registry.suse.com/suse/sles/15.4/virt-api:0.54.0-150400.3.7.1
virt-controller-5d54b8b9bf-6m8fd                        registry.suse.com/suse/sles/15.4/virt-controller:0.54.0-150400.3.7.1
virt-controller-5d54b8b9bf-6qvxz                        registry.suse.com/suse/sles/15.4/virt-controller:0.54.0-150400.3.7.1
virt-handler-nj2zb                                      registry.suse.com/suse/sles/15.4/virt-handler:0.54.0-150400.3.7.1
virt-operator-7bdb8d7bff-hvkhk                          registry.suse.com/suse/sles/15.4/virt-operator:0.54.0-150400.3.7.1

❯ k get lms -oyaml
apiVersion: v1
items:
- apiVersion: network.harvesterhci.io/v1beta1
  kind: LinkMonitor
  metadata:
    creationTimestamp: "2023-03-23T08:49:59Z"
    finalizers:
    - wrangler.cattle.io/harvester-network-link-monitor-controller
    generation: 22
    name: mgmt
    resourceVersion: "147726"
    uid: be0c796c-ea18-4c9b-b682-2fce3b443c5e
  spec:
    targetLinkRule:
      nameRule: mgmt(-br|-bo)
  status:
    linkStatus:
      harv-11-1:
      - index: 4
        mac: "50:50:50:50:50:37"
        name: mgmt-br
        state: up
        type: bridge
      - index: 5
        mac: "50:50:50:50:50:37"
        masterIndex: 4
        name: mgmt-bo
        state: up
        type: bond
- apiVersion: network.harvesterhci.io/v1beta1
  kind: LinkMonitor
  metadata:
    creationTimestamp: "2023-03-23T08:49:58Z"
    finalizers:
    - wrangler.cattle.io/harvester-network-link-monitor-controller
    generation: 55
    name: nic
    resourceVersion: "147721"
    uid: 93eb506a-8681-4ca2-a2ee-9022570b7787
  spec:
    targetLinkRule:
      typeRule: device
  status:
    linkStatus:
      harv-11-1:
      - index: 2
        mac: "50:50:50:50:50:37"
        masterIndex: 5
        name: ens33
        state: up
        type: device
      - index: 3
        mac: 00:0c:29:38:4f:5e
        masterIndex: 77
        name: ens34
        state: up
        type: device
- apiVersion: network.harvesterhci.io/v1beta1
  kind: LinkMonitor
  metadata:
    creationTimestamp: "2023-03-23T08:56:54Z"
    finalizers:
    - wrangler.cattle.io/harvester-network-link-monitor-controller
    generation: 65
    name: test
    resourceVersion: "147724"
    uid: 87c3e38a-eb22-4945-a525-c7e4f5a96a62
  spec:
    targetLinkRule:
      nameRule: test(-br|-bo)
  status:
    linkStatus:
      harv-11-1:
      - index: 10
        mac: be:56:d7:70:a0:a0
        name: test-br
        promiscuous: true
        state: up
        type: bridge
      - index: 77
        mac: 00:0c:29:38:4f:5e
        masterIndex: 10
        name: test-bo
        state: up
        type: bond
kind: List
metadata:
  resourceVersion: ""

guangbochen previously approved these changes Mar 23, 2023

@guangbochen (Contributor) left a comment

LGTM, and verified that both the linkstatus update and the vlanconfig validation are working fine.
However, I spotted a new error, which could be related to the NAD webhook validation:

time="2023-03-23T14:09:03Z" level=error msg="could not update nad default/test because it's still used by VM(s) default/vm1 which must be stopped at first"
error: stream error: stream ID 3; INTERNAL_ERROR; received from peer

@guangbochen self-requested a review March 23, 2023 14:14
@guangbochen dismissed their stale review March 23, 2023 14:17: "spotted new error"

@@ -132,7 +132,7 @@ func (h Handler) deleteLinkMonitor(name string) error {
 func (h Handler) setNadReadyLabel(cn *networkv1.ClusterNetwork) error {
 	isReady := utils.ValueFalse
 	// Set all net-attach-defs under the cluster network to be deleted as unready
-	if cn.DeletionTimestamp != nil && networkv1.Ready.IsTrue(cn.Status) {
+	if cn.DeletionTimestamp == nil && networkv1.Ready.IsTrue(cn.Status) {
@yaocw2020 (Contributor, Author) replied:
The ready status was wrong before; however, it is not used anywhere, so we fix it here as well.
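
For illustration, a tiny standalone Go example of the corrected condition. The function and its boolean parameters are hypothetical stand-ins for cn.DeletionTimestamp == nil and networkv1.Ready.IsTrue(cn.Status); it is not the repository's code.

package main

import "fmt"

// nadReadyValue mirrors the corrected condition above: the net-attach-defs
// under a cluster network are labeled ready only while the cluster network is
// NOT being deleted and its Ready condition is true.
func nadReadyValue(notBeingDeleted, conditionReady bool) string {
	if notBeingDeleted && conditionReady {
		return "true"
	}
	return "false"
}

func main() {
	fmt.Println(nadReadyValue(true, true))  // "true": ready and not being deleted
	fmt.Println(nadReadyValue(false, true)) // "false": being deleted, so its NADs are marked unready
}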

@bk201 (Member) left a comment

lgtm, adding a node and deleting a node both work.

@guangbochen (Contributor) left a comment

LGTM, thanks.
