WIP Implement resource constraints on nodes #896

Closed
wants to merge 6 commits into from


aojea
Contributor

@aojea aojea commented Oct 1, 2019

Allow setting CPU and Memory constraints on kind nodes.

Example:

  1. Create a cluster with a configuration that sets constraints on one node:
kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
nodes:
# the control plane node
- role: control-plane
- role: worker
  constraints:
    memory: "100m"
    cpu: "1"
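
A note on the `memory: "100m"` value: the `docker stats` output in this description reports a 100MiB limit for the worker, which suggests the string is interpreted the way Docker interprets sizes (`m` for mebibytes) rather than as a Kubernetes quantity, where the `m` suffix means milli-units. The following is a minimal sketch of that conversion, assuming binary multiples and whole numbers only; `docker_mem_to_bytes` is a hypothetical helper, not part of kind or Docker.

```shell
# Convert a Docker-style memory size ("100m", "2g", "512k", "4096") to bytes.
# Assumption: binary multiples (1k = 1024 bytes) and whole numbers only.
docker_mem_to_bytes() {
  size=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$size" in
    *b) mult=1;          number=${size%b} ;;
    *k) mult=1024;       number=${size%k} ;;
    *m) mult=1048576;    number=${size%m} ;;
    *g) mult=1073741824; number=${size%g} ;;
    *)  mult=1;          number=$size ;;
  esac
  # Reject anything that is not a plain non-negative integer.
  case "$number" in
    ''|*[!0-9]*) echo "unsupported size: $1" >&2; return 1 ;;
  esac
  echo $((number * mult))
}

docker_mem_to_bytes 100m   # prints 104857600, i.e. 100 MiB
```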

Create a pod that starts to consume more memory than the amount assigned to the node (adapted from https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/#specify-a-memory-request-and-a-memory-limit):

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "1500M", "--vm-hang", "1"]
EOF

If you are using the docker provider, you can verify with docker stats that the node memory is being limited:

CONTAINER ID        NAME                 CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
d36e98597798        kind-control-plane   6.67%               1.262GiB / 31.39GiB   4.02%               235kB / 1.16MB      844kB / 60.2MB      395
0d6032a26aad        kind-worker          33.56%              99.35MiB / 100MiB     99.35%              3.97MB / 247kB      269MB / 88.1MB      142
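
The same check can be scripted. Below is a rough sketch that lists containers at or above a memory threshold; it assumes input lines of the form `NAME PERCENT%`, which `docker stats --no-stream --format '{{.Name}} {{.MemPerc}}'` should produce. `near_memory_limit` is a hypothetical helper name and the default threshold of 90 is an arbitrary choice.

```shell
# Print the names of containers whose memory usage is at or above a threshold.
# Reads "NAME PERCENT%" lines on stdin; the threshold is a percentage (default 90).
near_memory_limit() {
  awk -v threshold="${1:-90}" '{
    pct = $2
    sub(/%$/, "", pct)                      # strip the trailing percent sign
    if (pct + 0 >= threshold + 0) print $1  # numeric comparison
  }'
}

# With a running cluster (requires the docker CLI):
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | near_memory_limit 90

# Offline check using the two rows from the output above:
printf '%s\n' 'kind-control-plane 4.02%' 'kind-worker 99.35%' | near_memory_limit 90
```

With the two rows shown above, only `kind-worker` is printed.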

and that the memory inside the node is not growing beyond the limit

Tasks:  19 total,   1 running,  18 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.9 us,  2.4 sy,  0.0 ni,  0.0 id, 96.6 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  32147.3 total,  16734.1 free,   1959.3 used,  13453.9 buff/cache
MiB Swap:   2055.0 total,   1658.1 free,    396.9 used.  29781.0 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                     
  908 root      20   0  130492   2444      0 S   4.0   0.0   0:08.87 kindnetd                                    
  734 root      20   0 2140992  67840  60452 S   2.7   0.2   0:15.46 kubelet                                     
   45 root      20   0 2284748  26936  22944 S   2.5   0.1   0:12.26 containerd                                  
  835 root      20   0   10744   4348   4180 S   0.7   0.0   0:00.26 containerd-shim                             
  787 root      20   0   10744   4744   4436 S   0.2   0.0   0:00.42 containerd-shim                             
  891 root      20   0    9400   3796   3796 S   0.2   0.0   0:00.37 containerd-shim                             
  931 root      20   0  140760   4484      0 S   0.1   0.0   0:19.99 kube-proxy                                  
    1 root      20   0   17532   7828   7744 S   0.0   0.0   0:00.35 systemd                                     
   30 root      19  -1   22556   6280   6160 D   0.0   0.0   0:00.66 systemd-journal                             
  804 root      20   0    1024      0      0 S   0.0   0.0   0:00.00 pause                                       
  853 root      20   0    1024      0      0 S   0.0   0.0   0:00.04 pause                                       
  913 root      20   0   10744   4308   4308 S   0.0   0.0   0:00.45 containerd-shim                             
 1606 root      20   0   10744   4160   4052 S   0.0   0.0   0:00.39 containerd-shim                             
 1623 root      20   0    1024      0      0 S   0.0   0.0   0:00.03 pause                                       
 1684 root      20   0    9336   3468   3468 S   0.0   0.0   0:00.37 containerd-shim                             
 1699 root      20   0     744      0      0 S   0.0   0.0   0:00.06 stress                                      
 1718 root      20   0 1536748  40412     36 D   0.0   0.1   0:12.56 stress                                      
 1758 root      20   0    4176   2884   2844 S   0.0   0.0   0:00.06 bash                                        
 1770 root      20   0    6024   3136   2700 R   0.0   0.0   0:00.04 top               

Fixes #877

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 1, 2019
@k8s-ci-robot k8s-ci-robot requested review from amwat and munnerz October 1, 2019 12:10
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 1, 2019
@aojea
Contributor Author

aojea commented Oct 1, 2019

/assign @BenTheElder

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aojea
To complete the pull request process, please assign bentheelder
You can assign the PR to them by writing /assign @bentheelder in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 1, 2019
@BenTheElder
Member

First of all: does this actually work? Did you try deploying a resource intensive pod?

Resolved review threads: pkg/apis/config/v1alpha3/types.go (3, all outdated); site/content/docs/user/quick-start.md (2, one outdated)
@aojea
Contributor Author

aojea commented Oct 1, 2019

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 1, 2019
@aojea aojea force-pushed the constraint branch 3 times, most recently from 6e56fd1 to 58b6211 Compare October 2, 2019 10:02
@aojea
Contributor Author

aojea commented Oct 2, 2019

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 2, 2019
@aojea
Contributor Author

aojea commented Oct 2, 2019

@k8s-ci-robot
pull-kind-e2e-kubernetes — Job failed.

🤔

@aojea
Contributor Author

aojea commented Oct 3, 2019

/test pull-kind-e2e-kubernetes

@BenTheElder BenTheElder added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Oct 11, 2019
@BenTheElder
Member

Let's take another look at this for v1alpha5.

We still need to see how this behaves on e.g. Docker for Mac, and dig into the details more.

@BenTheElder
Member

cc @amwat

@aojea aojea force-pushed the constraint branch 2 times, most recently from 547f356 to 8bf09c9 Compare December 10, 2019 17:06
@Gsantomaggio

Thank you @aojea
This feature seems to be very useful.

Is this going to be included in the next release?

@BenTheElder
Member

/lifecycle frozen
There are no plans to put this in the next release. I'm still not convinced that this does what you'd expect, and we need to think more about defaulting etc.

We originally only added multi-node as a way to test rolling updates and taints in Kubernetes. Multi-node clusters are not meant to be an effective means of isolation: all of the nodes share one physical host and kernel.

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Mar 13, 2020
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 30, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 1, 2020
@aojea aojea changed the title Implement resource constraints on nodes WIP Implement resource constraints on nodes May 1, 2020
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 1, 2020
@k8s-ci-robot
Contributor

@aojea: The following tests failed, say /retest to rerun all failed tests:

Test name                              Commit   Details  Rerun command
pull-kind-verify                       f57297f  link     /test pull-kind-verify
pull-kind-unit                         f57297f  link     /test pull-kind-unit
pull-kind-conformance-parallel-1-15    f57297f  link     /test pull-kind-conformance-parallel-1-15
pull-kind-e2e-kubernetes-1-18          f57297f  link     /test pull-kind-e2e-kubernetes-1-18
pull-kind-conformance-parallel-1-17    f57297f  link     /test pull-kind-conformance-parallel-1-17
pull-kind-conformance-parallel-ipv6    f57297f  link     /test pull-kind-conformance-parallel-ipv6
pull-kind-conformance-parallel-1-16    f57297f  link     /test pull-kind-conformance-parallel-1-16
pull-kind-e2e-kubernetes               f57297f  link     /test pull-kind-e2e-kubernetes

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@aojea
Contributor Author

aojea commented Jun 16, 2020

This feature seems like too much work for "faking" a VM. People can (??) use lxcfs https://linuxcontainers.org/lxcfs/introduction/ or just VMs, but adding this logic to kind does not make sense to me after spending some time with it. It would be better to integrate firecracker or a runtime that provides the isolation.

@aojea aojea closed this Jun 16, 2020
@aojea aojea deleted the constraint branch August 9, 2020 08:42
@zhanghe9702

zhanghe9702 commented Dec 16, 2021

kind uses docker to construct its virtual nodes, so why not use the docker command options to reserve resources? Wouldn't adding the same options to kind create cluster be enough? https://docs.docker.com/config/containers/resource_constraints/
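
For reference, the Docker options in question can also be applied to an already running node container with `docker update`. The sketch below only prints the command instead of running it; `node_limit_command` is a hypothetical helper, the container name and values are placeholders, and this is not how the PR itself sets the limits. Setting `--memory-swap` to the same value as `--memory` is an assumption made here so that the container cannot fall back on swap.

```shell
# Print (do not run) a docker update command that caps an existing node container.
# Arguments: container name, memory size, CPU count (all placeholder inputs).
node_limit_command() {
  printf 'docker update --memory %s --memory-swap %s --cpus %s %s\n' "$2" "$2" "$3" "$1"
}

# Example: cap the worker from the description at 100 MiB and one CPU.
node_limit_command kind-worker 100m 1
# To apply it, review the printed command and run it in a terminal.
```

Printing rather than executing keeps the sketch safe to try on a machine without a cluster.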

@aojea
Contributor Author

aojea commented Dec 16, 2021

That is what this does. However, the kubelet on the kind nodes sees ALL the host resources, not the container's resources; see #896 (comment)
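
One way to observe this mismatch is to compare the total memory reported by `/proc/meminfo` inside the node with the limit recorded in the cgroup filesystem. The `top` output in the description already hints at it: it shows 32147.3 MiB total inside a node limited to 100MiB. This is a rough sketch, assuming the usual cgroup v2 and v1 mount paths; the function names are hypothetical.

```shell
# Total memory in bytes as reported by a meminfo-style file (default /proc/meminfo).
meminfo_total_bytes() {
  kib=$(awk '$1 == "MemTotal:" { print $2 }' "${1:-/proc/meminfo}")
  [ -n "$kib" ] || return 1
  echo $((kib * 1024))
}

# Memory limit from the cgroup filesystem: cgroup v2 path first, then v1.
# Prints "max" (v2) or a very large number (v1) when no limit is set.
cgroup_memory_limit() {
  for f in /sys/fs/cgroup/memory.max /sys/fs/cgroup/memory/memory.limit_in_bytes; do
    if [ -r "$f" ]; then
      cat "$f"
      return 0
    fi
  done
  echo "unknown"
}

# Run inside a node, for example after: docker exec -it kind-worker bash
if [ -r /proc/meminfo ]; then
  echo "MemTotal:     $(meminfo_total_bytes) bytes"
  echo "cgroup limit: $(cgroup_memory_limit)"
fi
```

If the two numbers differ, anything that sizes itself from `/proc/meminfo` will report the host's memory rather than the container's limit.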

@zhanghe9702

That is what this does. However, the kubelet on the kind nodes sees ALL the host resources, not the container's resources; see #896 (comment)

👍, thx

@chevalsumo

Even though the kubelet sees all the host resources, this solution suits my use case, because I want to simulate nodes with limited capacities and I know in advance the resources requested by my pods. How could I use it?

Labels: cncf-cla: yes, do-not-merge/work-in-progress, lifecycle/frozen, priority/backlog, size/L
Linked issue: Configure capacity of the worker nodes
6 participants