add anti-affinity for controller #174

Closed
curx wants to merge 2 commits

@curx curx commented Mar 11, 2021

Signed-off-by: Thorsten Schifferdecker [email protected]

What type of PR is this?
/kind feature

What this PR does / why we need it:

Add an anti-affinity spec to csi-nfs-controller.
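
For readers unfamiliar with the mechanism, here is a minimal sketch of what a pod anti-affinity rule of this kind could look like. The PR's actual diff is not shown in this thread, so the structure below is an assumption; only the `app=csi-nfs-controller` label is taken from the pod description further down.

```yaml
# Hypothetical fragment of the controller Deployment's pod template;
# not the PR's actual change.
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: do not place two pods carrying this label on the same node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - csi-nfs-controller  # label shown in the pod description in this thread
              topologyKey: kubernetes.io/hostname
```

A softer variant would use `preferredDuringSchedulingIgnoredDuringExecution`, which lets the scheduler co-locate replicas when no other node is available instead of leaving a pod Pending.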

Which issue(s) this PR fixes:
see #173

Does this PR introduce a user-facing change?:

fix: add anti-affinity for the controller

Signed-off-by: Thorsten Schifferdecker <[email protected]>
@k8s-ci-robot added the release-note-none, kind/feature, and cncf-cla: yes labels Mar 11, 2021
@k8s-ci-robot
Contributor

Welcome @curx!

It looks like this is your first PR to kubernetes-csi/csi-driver-nfs 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-csi/csi-driver-nfs has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: curx
To complete the pull request process, please assign saad-ali after the PR has been reviewed.
You can assign the PR to them by writing /assign @saad-ali in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

Hi @curx. Thanks for your PR.

I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-ok-to-test and size/S labels Mar 11, 2021
@andyzhangx
Member

/ok-to-test

@k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label Mar 11, 2021
Signed-off-by: Thorsten Schifferdecker <[email protected]>
@k8s-ci-robot added the release-note label and removed the release-note-none label Mar 11, 2021
@coveralls

Pull Request Test Coverage Report for Build 642750984

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 80.47%

Totals Coverage Status
Change from base Build 642057003: 0.0%
Covered Lines: 548
Relevant Lines: 681

💛 - Coveralls

@andyzhangx
Member

/hold

@k8s-ci-robot added the do-not-merge/hold label Mar 11, 2021
@andyzhangx
Member

This NFS driver is not actually using the liveness probe correctly. If it were, a second controller pod on the same node would be stuck in Pending state, because the pod uses hostNetwork. For reference, see:
https://github.com/kubernetes-csi/csi-driver-smb/blob/cd758b95117c24c51146fea0a536f04c91a7f5c3/deploy/csi-smb-controller.yaml#L70-L73

So there are two issues: 1) the controller pods should be scheduled on two different nodes, and 2) the liveness probe should be used correctly.
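
To make the second point concrete, here is a rough sketch of how the probe wiring could look once the driver container declares the health port. The port number, probe timings, and sidecar arguments mirror the pod description later in this thread; the manifest layout and the `hostNetwork: true` line are assumptions based on the comment above, not the merged change.

```yaml
# Sketch only: values are copied from the describe output in this thread,
# the surrounding structure is assumed.
spec:
  template:
    spec:
      hostNetwork: true              # assumed, per the comment above
      containers:
        - name: liveness-probe
          args:
            - --csi-address=/csi/csi.sock
            - --probe-timeout=3s
            - --health-port=29652    # sidecar serves /healthz on this port
            - --v=2
        - name: nfs
          ports:
            - name: healthz
              containerPort: 29652   # with hostNetwork this also reserves host port 29652
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /healthz
              port: healthz          # refers to the named port above
            initialDelaySeconds: 30
            timeoutSeconds: 10
            periodSeconds: 30
            failureThreshold: 5
```

Because the declared port is bound on the host, the scheduler treats it as a per-node resource. As far as I understand it, that is why a second replica cannot land on a node that already runs one, which would match the "didn't have free ports" scheduling failure.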

@andyzhangx
Member

PR #175 fixed the issue. With that fix, the second NFS controller pod stays in Pending status when there is only one agent node. Here are the second pod's details and events:

# k describe po csi-nfs-controller-5dfd6db785-slssf -n kube-system
Name:                 csi-nfs-controller-5dfd6db785-slssf
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 <none>
Labels:               app=csi-nfs-controller
                      pod-template-hash=5dfd6db785
Annotations:          <none>
Status:               Pending
IP:
IPs:                  <none>
Controlled By:        ReplicaSet/csi-nfs-controller-5dfd6db785
Containers:
  csi-provisioner:
    Image:      k8s.gcr.io/sig-storage/csi-provisioner:v2.1.0
    Port:       <none>
    Host Port:  <none>
    Args:
      -v=2
      --csi-address=$(ADDRESS)
      --leader-election
    Limits:
      cpu:     100m
      memory:  300Mi
    Requests:
      cpu:     10m
      memory:  20Mi
    Environment:
      ADDRESS:                       /csi/csi.sock
      KUBERNETES_PORT_443_TCP_ADDR:  andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io
      KUBERNETES_PORT:               tcp://andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io:443
      KUBERNETES_PORT_443_TCP:       tcp://andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io:443
      KUBERNETES_SERVICE_HOST:       andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io
    Mounts:
      /csi from socket-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-nfs-controller-sa-token-p59bw (ro)
  liveness-probe:
    Image:      k8s.gcr.io/sig-storage/livenessprobe:v2.1.0
    Port:       <none>
    Host Port:  <none>
    Args:
      --csi-address=/csi/csi.sock
      --probe-timeout=3s
      --health-port=29652
      --v=2
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:     10m
      memory:  20Mi
    Environment:
      KUBERNETES_PORT_443_TCP_ADDR:  andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io
      KUBERNETES_PORT:               tcp://andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io:443
      KUBERNETES_PORT_443_TCP:       tcp://andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io:443
      KUBERNETES_SERVICE_HOST:       andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io
    Mounts:
      /csi from socket-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-nfs-controller-sa-token-p59bw (ro)
  nfs:
    Image:      mcr.microsoft.com/k8s/csi/nfs-csi:latest
    Port:       29652/TCP
    Host Port:  29652/TCP
    Args:
      -v=5
      --nodeid=$(NODE_ID)
      --endpoint=$(CSI_ENDPOINT)
    Limits:
      cpu:     200m
      memory:  200Mi
    Requests:
      cpu:     10m
      memory:  20Mi
    Liveness:  http-get http://:healthz/healthz delay=30s timeout=10s period=30s #success=1 #failure=5
    Environment:
      NODE_ID:                        (v1:spec.nodeName)
      CSI_ENDPOINT:                  unix:///csi/csi.sock
      KUBERNETES_PORT_443_TCP_ADDR:  andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io
      KUBERNETES_PORT:               tcp://andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io:443
      KUBERNETES_PORT_443_TCP:       tcp://andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io:443
      KUBERNETES_SERVICE_HOST:       andy-aks11-andy-aks11962-b9d228-e3c955a0.hcp.eastus2euap.azmk8s.io
    Mounts:
      /csi from socket-dir (rw)
      /var/lib/kubelet/pods from pods-mount-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-nfs-controller-sa-token-p59bw (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  pods-mount-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/pods
    HostPathType:  Directory
  socket-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  csi-nfs-controller-sa-token-p59bw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  csi-nfs-controller-sa-token-p59bw
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node-role.kubernetes.io/master=true:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age        From  Message
  ----     ------            ----       ----  -------
  Warning  FailedScheduling  <unknown>        0/2 nodes are available: 1 node(s) didn't have free ports for the requested pod ports, 1 node(s) were unschedulable.

@k8s-ci-robot added the needs-rebase label Mar 14, 2021
@k8s-ci-robot
Contributor

@curx: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@andyzhangx andyzhangx closed this Apr 7, 2021
TerryHowe pushed a commit to TerryHowe/csi-driver-nfs that referenced this pull request Oct 17, 2024