Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util, server: fix memory detection with non-root cgroups #56840

Merged
merged 1 commit into from
Nov 20, 2020

Conversation

darinpp
Copy link
Contributor

@darinpp darinpp commented Nov 18, 2020

The cgroup information is constructed by looking at /proc/self/cgroup,
matching with the mount point in /proc/self/mountinfo
and then inpsecting the memory.stat file. The matching between the
cgroup and the mount point was working only in case that the mount and
the cgroup are exact match relative to cgroup NS. This is however
inadequate as while true for the docker case, it isn't true in the general
case. In this example determining the cgroup memory limit isn't possible:

cgcreate -t $USER:$USER -a $USER:$USER -g memory:crdb_test
echo 3999997952 > /sys/fs/cgroup/memory/crdb_test/memory.limit_in_bytes
cgexec -g memory:crdb_test ./cockroach start-single-nod

To address this, this patch instead constructs the expected
relative path of the cgroup's memory info by starting with the
mount path and then adding the cgroup.

Release note: None

@darinpp darinpp requested review from itsbilal, knz and ajwerner November 18, 2020 03:18
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @darinpp, @itsbilal, and @knz)


pkg/util/cgroups/cgroups.go, line 289 at r1 (raw file):

}

// Reads /proc/[pid]/mountinfo for cgoup or cgroup2 mount which defines the used version.

Is there a case in v1 cgroups where the memory and cpu cgroups can be mounted in totally different trees? Is this going to throw of the work bilal just did? the v1 cgroup APIs are too fiddly. I get this change but I wonder if we’d be better served by saying which cgroup controller we’re looking for the mount point for. In v1 iiuc it’s just convention that the different controllers get mounted at the same root directory.


pkg/util/cgroups/cgroups.go, line 323 at r1 (raw file):

			// cockroach.log -> server/config.go:433 ⋮ system total memory: ‹63 GiB›
			memNSRelativePath := string(fields[3])
			if !strings.Contains(memNSRelativePath, "..") {

When is this true? Can you add tests for it?


pkg/util/cgroups/cgroups.go, line 324 at r1 (raw file):

			memNSRelativePath := string(fields[3])
			if !strings.Contains(memNSRelativePath, "..") {
				if relPath, err := filepath.Rel(memNSRelativePath, cRoot); err == nil {

I imagine the linter isn’t happy about your throwing away this error.

Copy link
Contributor

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @darinpp, and @knz)


pkg/util/cgroups/cgroups.go, line 289 at r1 (raw file):

Previously, ajwerner wrote…

Is there a case in v1 cgroups where the memory and cpu cgroups can be mounted in totally different trees? Is this going to throw of the work bilal just did? the v1 cgroup APIs are too fiddly. I get this change but I wonder if we’d be better served by saying which cgroup controller we’re looking for the mount point for. In v1 iiuc it’s just convention that the different controllers get mounted at the same root directory.

I'm not sure if I understand what you're referring to - this does take a controller arg that it then tries to match. The mountinfopath will be the same in both cases, but the cgroup controller mount points could be completely different based on the controller specified, yeah.


pkg/util/cgroups/cgroups.go, line 313 at r1 (raw file):

				return mountPoint, ver, nil
			}
			// It is possible that the memory controller mount and the cgroup path are not the same (both are relative to the NS root).

s/memory controller/controller/ and similar changes below.


pkg/util/cgroups/cgroups_test.go, line 74 at r1 (raw file):

				"/proc/self/cgroup":                 v1CgroupWithMemoryControllerNS,
				"/proc/self/mountinfo":              v1MountsWithMemControllerNS,
				"/sys/fs/cgroup/memory/cgroup_test/memory.stat": v1MemoryStat,

Worthwhile adding a similar test for the cpu,cpuacct controllers.

@darinpp darinpp force-pushed the fix-cgroup-non-root branch from bfaca63 to 82ee257 Compare November 19, 2020 02:31
Copy link
Contributor Author

@darinpp darinpp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @itsbilal, and @knz)


pkg/util/cgroups/cgroups.go, line 289 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

I'm not sure if I understand what you're referring to - this does take a controller arg that it then tries to match. The mountinfopath will be the same in both cases, but the cgroup controller mount points could be completely different based on the controller specified, yeah.

I renamed the variable as this code now is controller agnostic and works for both cpu, memory etc. It doesn't matter which controller is mounted where. This simply looks at the controller that is passed in relation to the cgroup root.


pkg/util/cgroups/cgroups.go, line 313 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

s/memory controller/controller/ and similar changes below.

changed


pkg/util/cgroups/cgroups.go, line 323 at r1 (raw file):

Previously, ajwerner wrote…

When is this true? Can you add tests for it?

I added few tests. The mount point changes as the namespace change. So a mount done outside of the namespace will have ... In this case it isn't possible to construct the path but is possible to mount within the namespace as I showed in the added test.


pkg/util/cgroups/cgroups.go, line 324 at r1 (raw file):

Previously, ajwerner wrote…

I imagine the linter isn’t happy about your throwing away this error.

I didn't see it complaining. Will check and fix.


pkg/util/cgroups/cgroups_test.go, line 74 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Worthwhile adding a similar test for the cpu,cpuacct controllers.

Done.

@darinpp darinpp requested review from ajwerner and itsbilal November 19, 2020 02:42
Copy link
Contributor

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: good work!

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @knz)

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the one error handling thing and I can be convinced to not care.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @darinpp, and @knz)


pkg/util/cgroups/cgroups.go, line 289 at r1 (raw file):

Previously, darinpp wrote…

I renamed the variable as this code now is controller agnostic and works for both cpu, memory etc. It doesn't matter which controller is mounted where. This simply looks at the controller that is passed in relation to the cgroup root.

😓


pkg/util/cgroups/cgroups.go, line 324 at r1 (raw file):

Previously, darinpp wrote…

I didn't see it complaining. Will check and fix.

What's the deal if this fails? Is it possible we're going to find another entry for the desired controller or should we return an error?

The cgroup information is constructed by looking at `/proc/self/cgroup`,
matching with the mount point in `/proc/self/mountinfo`
and then inpsecting the `memory.stat` file. The matching between the
cgroup and the mount point was working only in case that the mount and
the cgroup are exact match relative to cgroup NS. This is however
inadequate as while true for the docker case, it isn't true in the general
case. In this example determining the cgroup memory limit isn't possible:
```
cgcreate -t $USER:$USER -a $USER:$USER -g memory:crdb_test
echo 3999997952 > /sys/fs/cgroup/memory/crdb_test/memory.limit_in_bytes
cgexec -g memory:crdb_test ./cockroach start-single-nod
```
To address this, this patch instead constructs the expected
relative path of the cgroup's memory info by starting with the
mount path and then adding the cgroup.

Release note: None
@darinpp darinpp force-pushed the fix-cgroup-non-root branch from 82ee257 to 4a2177f Compare November 19, 2020 23:51
@darinpp
Copy link
Contributor Author

darinpp commented Nov 20, 2020

bors r+

@craig
Copy link
Contributor

craig bot commented Nov 20, 2020

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants