
Add support for resource limits to play kube #7853

Merged: 1 commit merged into containers:master on Oct 12, 2020

Conversation

xordspar0
Contributor

@xordspar0 xordspar0 commented Sep 30, 2020

Notes:

  • MemoryReservation seems like a rough equivalent to requests.memory.
  • I don't know of any equivalent to requests.cpu; I'm not sure it makes sense in a single-node cluster anyway. (A rough sketch of the overall mapping is below.)
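
A minimal sketch of what that mapping could look like, assuming the k8s.io/api and opencontainers/runtime-spec Go modules. This is illustrative only, not the code in this PR; the helper name kubeResourcesToOCI and the fixed 100ms period are made up for the example.

package limits

import (
	spec "github.com/opencontainers/runtime-spec/specs-go"
	v1 "k8s.io/api/core/v1"
)

// defaultCPUPeriod is the standard 100ms CFS period, in microseconds.
const defaultCPUPeriod uint64 = 100000

// kubeResourcesToOCI translates a Kubernetes resources block into OCI
// LinuxResources: limits.memory -> Memory.Limit, requests.memory ->
// Memory.Reservation, limits.cpu -> CPU.Quota/Period. requests.cpu has no
// obvious single-node equivalent, as noted above.
func kubeResourcesToOCI(r v1.ResourceRequirements) *spec.LinuxResources {
	res := &spec.LinuxResources{Memory: &spec.LinuxMemory{}, CPU: &spec.LinuxCPU{}}

	if mem, ok := r.Limits[v1.ResourceMemory]; ok {
		limit := mem.Value() // bytes
		res.Memory.Limit = &limit
	}
	if mem, ok := r.Requests[v1.ResourceMemory]; ok {
		reservation := mem.Value() // bytes; rough equivalent of MemoryReservation
		res.Memory.Reservation = &reservation
	}
	if cpu, ok := r.Limits[v1.ResourceCPU]; ok {
		period := defaultCPUPeriod
		quota := cpu.MilliValue() * int64(period) / 1000 // millicores -> microseconds
		res.CPU.Period = &period
		res.CPU.Quota = &quota
	}
	return res
}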

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 30, 2020
@mheon
Member

mheon commented Sep 30, 2020

Last commit is missing a signoff.

Overall, LGTM. Does generate kube support making these limits? Otherwise we should get an issue written up for that.

@rhatdan
Member

rhatdan commented Sep 30, 2020

Could you squash your commits together and sign them?

@xordspar0
Contributor Author

> Overall, LGTM. Does generate kube support making these limits? Otherwise we should get an issue written up for that.

Good question. Not according to my quick test. I'll file an issue.

> Could you squash your commits together and sign them?

Sure, but I don't think the test is passing yet, at least not my local tests. I'll let the CI run it and see what happens.

@xordspar0 xordspar0 marked this pull request as ready for review September 30, 2020 19:40
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 30, 2020
@haircommander
Collaborator

LGTM thanks @xordspar0

@xordspar0
Contributor Author

xordspar0 commented Sep 30, 2020

Hm, it's failing in CI too:

         Running: podman [options] play kube /tmp/podman_test266484954/kube.yaml
         Trying to pull docker.io/library/alpine:latest...
         Getting image source signatures
         Copying blob sha256:df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c
         Copying config sha256:a24bb4013296f61e89ba57005a7b3e52274d8edd3ae2077d04395f806b63d83e
         Writing manifest to image destination
         Storing signatures
         Error: container_linux.go:370: starting container process caused: process_linux.go:338: getting the final child's pid from pipe caused: EOF: OCI runtime error

I'll keep working on it.

Review thread on test/e2e/play_kube_test.go (outdated, resolved)
@zhangguanzhang
Collaborator

The errors:

[+0241s] Podman generate kube 
           podman play kube allows setting resource limits
           /var/tmp/go/src/github.com/containers/podman/test/e2e/play_kube_test.go:1249
         
         [BeforeEach] Podman generate kube
           /var/tmp/go/src/github.com/containers/podman/test/e2e/play_kube_test.go:623
         [It] podman play kube allows setting resource limits
           /var/tmp/go/src/github.com/containers/podman/test/e2e/play_kube_test.go:1249
         Running: podman [options] play kube /tmp/podman_test033427213/kube.yaml
         Trying to pull docker.io/library/alpine:latest...
         Getting image source signatures
         Copying blob sha256:df20fa9351a15782c64e6dddb2d4a6f50bf6d3688060a34c4014b0d9a752eb4c
         Copying config sha256:a24bb4013296f61e89ba57005a7b3e52274d8edd3ae2077d04395f806b63d83e
         Writing manifest to image destination
         Storing signatures
         Error: writing file `cpu.max`: Invalid argument: OCI runtime error
         [AfterEach] Podman generate kube
           /var/tmp/go/src/github.com/containers/podman/test/e2e/play_kube_test.go:635
         Running: podman [options] stop -a --time 0
         262bfa189da0dd6845f38d8218fd0beea513e3b254939ae968bd7250462fa1cd
         b36f8ddb9e99572ee928e6ed87ce218ccabaf1c9eef07dc03ed665c98b5ce412
         Running: podman [options] pod stop -a -t 0
         fbae9e0acfbfc407bbacab811d038ba74cedbcb3f537942d61d8d1ad8f439e1c
         Running: podman [options] pod rm -fa
         fbae9e0acfbfc407bbacab811d038ba74cedbcb3f537942d61d8d1ad8f439e1c
         Running: podman [options] rm -fa
         
         

         • Failure [3.633 seconds]
         Podman generate kube
         /var/tmp/go/src/github.com/containers/podman/test/e2e/play_kube_test.go:615
           podman play kube allows setting resource limits [It]
           /var/tmp/go/src/github.com/containers/podman/test/e2e/play_kube_test.go:1249
         
           Expected
               <int>: 125
           to equal
               <int>: 0
         
           /var/tmp/go/src/github.com/containers/podman/test/e2e/play_kube_test.go:1271

@xordspar0
Contributor Author

Woo, I figured it out! I fixed the error with the help of this thread in the Kubernetes project. The minimum allowed CPU quota is 1 ms. The Kubernetes API gives us an int64 for CPU quota in millicores, but internally in Podman we represent CPU quotas in microseconds, which is what the kernel uses. If I just take 10 millicores and give it to the kernel, it will interpret it as 10 microseconds, which is too small. I added code to do the conversion from millicores to microseconds.
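
A minimal sketch of that conversion. The helper name echoes milliCPUToQuota from the test diff further down, but the signature and the clamp here are illustrative, assuming the standard 100ms CFS period.

package limits

const (
	defaultCPUPeriod = 100000 // cpu.cfs_period_us, in microseconds (100ms)
	minimumCPUQuota  = 1000   // kernel minimum quota: 1ms
)

// milliCPUToQuota converts a Kubernetes CPU limit in millicores into a CFS
// quota in microseconds for the default period: 200m -> 20000us per 100000us.
func milliCPUToQuota(milliCPU int64) int64 {
	quota := milliCPU * defaultCPUPeriod / 1000
	if quota < minimumCPUQuota {
		quota = minimumCPUQuota // anything below 10m would otherwise fall under the 1ms floor
	}
	return quota
}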

@xordspar0
Contributor Author

Is there a way to only run the test in CI environments that have cgroups v2?

@mheon
Member

mheon commented Oct 6, 2020

I think SkipIfNotFedora() will do what you're looking for


It("podman play kube allows setting resource limits", func() {
    SkipIfContainerized("Resource limits require a running systemd")
    SkipIfNotFedora() // Requires cgroups v2
Member

SkipIfRootlessCgroupsV1()
Will this test work in a cgroups v1 rootful environment?

Contributor Author

You're right. I have seen the error "invalid configuration, cannot specify resource limits without cgroups v2 and --cgroup-manager=systemd" before, but when I check the code, I see that it only applies in rootless mode.

podman/pkg/spec/spec.go

Lines 420 to 438 in 0b7b222

if rootless.IsRootless() {
    cgroup2, err := cgroups.IsCgroup2UnifiedMode()
    if err != nil {
        return nil, err
    }
    if !addedResources {
        configSpec.Linux.Resources = &spec.LinuxResources{}
    }
    canUseResources := cgroup2 && runtimeConfig != nil && (runtimeConfig.Engine.CgroupManager == cconfig.SystemdCgroupsManager)
    if addedResources && !canUseResources {
        return nil, errors.New("invalid configuration, cannot specify resource limits without cgroups v2 and --cgroup-manager=systemd")
    }
    if !canUseResources {
        // Force the resources block to be empty instead of having default values.
        configSpec.Linux.Resources = &spec.LinuxResources{}
    }
}

Contributor Author

Hm, I don't understand what happened. When I had only SkipIfContainerized there were two failures: the rootless cgroups v1 tests. Now, with SkipIfContainerized and SkipIfRootlessCgroupsV1, there are 10 CI builds that fail. 😕

Contributor Author

This doesn't seem to work with Cgroups v1, even with root. All the Cgroups v1 CI builds are hitting this error:

     Error: container_linux.go:370: starting container process caused: process_linux.go:338: getting the final child's pid from pipe caused: EOF: OCI runtime error

I'm not sure if that's normal for podman and resource limits.

Contributor Author

Limits seem to require root and Cgroups v2. When I test with podman run on my local machine I get this:

$ bin/podman run -it --rm --cpu-quota 10000 alpine sh
Error: OCI runtime error: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: process_linux.go:422: setting cgroup config for procHooks process caused: cannot set cpu limit: container could not join or create cgroup
$ sudo bin/podman run -it --rm --cpu-quota 10000 alpine sh
[sudo] password for jordan: 
/ #
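
For reference, a sketch of roughly the kind of check that sits behind cgroups.IsCgroup2UnifiedMode; this is an illustration using golang.org/x/sys/unix, not Podman's actual implementation. On a unified (v2) hierarchy, /sys/fs/cgroup itself is a cgroup2 filesystem.

package limits

import "golang.org/x/sys/unix"

// isCgroup2UnifiedMode reports whether the host runs a pure cgroups v2
// (unified) hierarchy by checking the filesystem type of /sys/fs/cgroup.
func isCgroup2UnifiedMode() (bool, error) {
	var st unix.Statfs_t
	if err := unix.Statfs("/sys/fs/cgroup", &st); err != nil {
		return false, err
	}
	return st.Type == unix.CGROUP2_SUPER_MAGIC, nil
}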

@mheon
Member

mheon commented Oct 8, 2020

LGTM once this goes green.
/approve

@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mheon, xordspar0

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 8, 2020
@TomSweeneyRedHat
Member

all kinds of test unhappiness @xordspar0

@xordspar0
Contributor Author

I'm making slow progress on this. The following diff makes the tests pass on my Fedora 33 VM. Both should be valid, so there's still a bug hiding in my code somewhere...

diff --git a/test/e2e/play_kube_test.go b/test/e2e/play_kube_test.go
index 01ea53da4..356c00c3e 100644
--- a/test/e2e/play_kube_test.go
+++ b/test/e2e/play_kube_test.go
@@ -1413,8 +1413,8 @@ spec:
                        numReplicas           int32  = 3
                        expectedCpuRequest    string = "100m"
                        expectedCpuLimit      string = "200m"
-                       expectedMemoryRequest string = "10000"
-                       expectedMemoryLimit   string = "20000"
+                       expectedMemoryRequest string = "10000Ki"
+                       expectedMemoryLimit   string = "20000Ki"
                )
 
                expectedCpuQuota := milliCPUToQuota(expectedCpuLimit)

@xordspar0
Contributor Author

Actually, this isn't a bug in my code. What I learned:

  1. If you don't specify units for memory in a Kubernetes spec, the value is interpreted as bytes, not kilobytes like I assumed (a quick illustration follows this list).
  2. 10 K is not enough memory to start a container.
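
A quick illustration of point 1, using the k8s.io/apimachinery resource parser. This is an assumption on my part, purely to show the units; it is not code from this PR.

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	noUnits := resource.MustParse("10000")     // a bare number is bytes
	kibibytes := resource.MustParse("10000Ki") // explicit binary-prefix unit

	fmt.Println(noUnits.Value())   // 10000 (about 10 kB: far too little to start a container)
	fmt.Println(kibibytes.Value()) // 10240000 (about 10 MB)
}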

Hence the timeouts in some tests. The only thing that was confusing for me was this error:

$ bin/podman play kube deployment-with-too-little-memory.yaml 
Trying to pull docker.io/library/alpine:latest...
Getting image source signatures
Copying blob df20fa9351a1 [--------------------------------------] 0.0b / 0.0b
Copying config a24bb40132 done  
Writing manifest to image destination
Storing signatures
Error: OCI runtime error: error encountered while bringing up pod test-pod-0: sync socket closed

Maybe crun could provide a better error message? Or maybe there's nothing to be done. I don't know enough about what went wrong, but here's the line where that error message was generated: https://github.com/containers/crun/blob/22c34e34ab5fcdc64b70ea940d1ee5b00b2227a6/src/libcrun/container.c#L360

@xordspar0
Contributor Author

Closer! Now there are only failures on rootless Fedora 31 and 32. I've been testing with rootless Fedora 33. I'll grab an older ISO and continue working there.

@xordspar0
Contributor Author

xordspar0 commented Oct 12, 2020

The only remaining test failure is a reported issue: #7959
I'm going to skip this test on Fedora 31 and 32 and wrap up this PR.

@TomSweeneyRedHat
Member

LGTM
assuming happy tests

@xordspar0
Contributor Author

Woo, the tests are all passing! This took a lot longer than expected, but it's worth it. Engineering is often like that: you peel back a layer of failures only to discover another layer underneath. Thanks for your patience.

@mheon
Member

mheon commented Oct 12, 2020

Thanks for putting in the effort to get this passing, @xordspar0 - merging away

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 12, 2020
@openshift-merge-robot openshift-merge-robot merged commit 609f230 into containers:master Oct 12, 2020
@xordspar0
Contributor Author

🎉

@xordspar0 xordspar0 deleted the play-kube-limits-#7742 branch October 12, 2020 17:23
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 24, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2023