BootTimeWithContext is being called for every pid causing high CPU consumption #1283

Open
1 of 5 tasks
dloucasfx opened this issue Apr 1, 2022 · 3 comments

Comments

@dloucasfx

Description

A user reported that our process causes high CPU consumption, up to 500%, on a Linux machine with 128 CPU cores.
Profiling the process (see the image below) shows that over 51% of the CPU time used by the process was spent in gopsutil, mainly executing the BootTimeWithContext func.

[Image: Screen Shot 2022-04-01 at 11 04 04 AM]

I was able to slightly optimize ReadLinesOffsetN (gopsutil/common.go at master · shirou/gopsutil), but that alone does not fix the problem.

From this:

~/Work/source/scratch  go test -bench=. -count 10
goos: darwin
goarch: amd64
pkg: scratch
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkReadingProcStat-12        18934             57342 ns/op
BenchmarkReadingProcStat-12        20154             56708 ns/op
BenchmarkReadingProcStat-12        20383             56275 ns/op
BenchmarkReadingProcStat-12        21481             96737 ns/op
BenchmarkReadingProcStat-12        17793             61745 ns/op
BenchmarkReadingProcStat-12        20980             64534 ns/op
BenchmarkReadingProcStat-12        21688             60939 ns/op
BenchmarkReadingProcStat-12        19582             56650 ns/op
BenchmarkReadingProcStat-12        21484             55046 ns/op
BenchmarkReadingProcStat-12        21628             53929 ns/op
PASS
ok      scratch 20.430s

To this:

~/Work/source/scratch  go test -bench=. -count 10
goos: darwin
goarch: amd64
pkg: scratch
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkReadingProcStatOptimized-12               28608             41201 ns/op
BenchmarkReadingProcStatOptimized-12               25236             45126 ns/op
BenchmarkReadingProcStatOptimized-12               30998             37839 ns/op
BenchmarkReadingProcStatOptimized-12               30225             39085 ns/op
BenchmarkReadingProcStatOptimized-12               30805             38958 ns/op
BenchmarkReadingProcStatOptimized-12               31250             38267 ns/op
BenchmarkReadingProcStatOptimized-12               31510             43607 ns/op
BenchmarkReadingProcStatOptimized-12               30832             38834 ns/op
BenchmarkReadingProcStatOptimized-12               31084             39827 ns/op
BenchmarkReadingProcStatOptimized-12               30219             41070 ns/op
PASS
ok      scratch 16.807s

The main issue is that BootTimeWithContext (which ends up calling ReadLinesOffsetN) gets executed for every running process on the system (see gopsutil/process_linux.go at master · shirou/gopsutil), and we have hundreds of processes running on the system.
Moreover, because this is running on a host with 128 CPU cores, the /proc/stat file is larger than normal, which makes the cost of ReadLinesOffsetN significant.
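
To illustrate why the size of /proc/stat matters, here is a minimal sketch of a reader that scans for the `btime` line and stops as soon as it finds it, without collecting every line into a slice. This is not the optimization benchmarked above and not gopsutil's implementation; `bootTimeFromStat` and `bootTimeFromString` are hypothetical names used only for this example. On Linux the `btime` line comes after the per-CPU lines, so the scan still has to pass over those.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"io"
	"os"
	"strconv"
	"strings"
)

// bootTimeFromStat (hypothetical helper) scans /proc/stat-formatted input and
// returns the value of the "btime" line, i.e. the boot time in seconds since
// the Unix epoch. It returns as soon as the line is found.
func bootTimeFromStat(r io.Reader) (uint64, error) {
	sc := bufio.NewScanner(r)
	// Some lines (for example "intr") can be long on large hosts,
	// so allow lines of up to 1 MiB instead of the default limit.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		line := sc.Bytes()
		if !bytes.HasPrefix(line, []byte("btime ")) {
			continue
		}
		fields := bytes.Fields(line)
		if len(fields) != 2 {
			return 0, fmt.Errorf("malformed btime line: %q", line)
		}
		return strconv.ParseUint(string(fields[1]), 10, 64)
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	return 0, errors.New("btime line not found")
}

// bootTimeFromString is a small convenience wrapper for in-memory input.
func bootTimeFromString(s string) (uint64, error) {
	return bootTimeFromStat(strings.NewReader(s))
}

func main() {
	f, err := os.Open("/proc/stat")
	if err != nil {
		fmt.Println("cannot open /proc/stat (Linux only):", err)
		return
	}
	defer f.Close()

	bt, err := bootTimeFromStat(f)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("boot time (seconds since epoch):", bt)
}
```

Even with a reader like this, the file is still opened and scanned once per call, so the number of calls remains the dominant cost.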

The boot time does not change, so the code should NOT recalculate it by calling BootTimeWithContext for every process, especially since ReadLinesOffsetN has been shown to be expensive. Instead, we should cache it and reuse it.

To Reproduce

On a host with 128 CPUs and hundreds of running processes, run this 3-5 times in parallel: procs, err := process.Processes()

Expected behavior
CPU usage should not exceed 20-30%; instead, we are seeing CPU usage reach 500%.


@Lomanic
Collaborator

Lomanic commented Apr 1, 2022

Duplicate of #1070 (#842 is also linked)

See #1070 (comment) for why we can't cache BootTimeWithContext. Yes it can change.

@dloucasfx
Author

dloucasfx commented Apr 1, 2022

> Duplicate of #1070 (#842 is also linked)
>
> See #1070 (comment) for why we can't cache BootTimeWithContext. Yes it can change.

In this case, can't we just call it once per call to ProcessesWithContext instead of once per process? This would drastically improve performance. Besides, if the clock changes while ProcessesWithContext is running, then regardless of how it is called, part of the results will have the wrong create time.
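
A rough sketch of what "once per call" could look like, written against stand-in types and not gopsutil's actual code: `bootTimeFunc`, `createTimesMillis`, and the tick-to-millisecond conversion are all assumptions made for illustration. The boot time is looked up once, before the loop, and reused for every process in the batch.

```go
package main

import (
	"errors"
	"fmt"
)

// bootTimeFunc stands in for an expensive lookup such as BootTimeWithContext.
// The type is hypothetical and exists only for this sketch.
type bootTimeFunc func() (uint64, error)

// createTimesMillis (hypothetical) converts per-process start times, given in
// clock ticks since boot, into milliseconds since the Unix epoch. The boot
// time is fetched exactly once for the whole batch, not once per process.
func createTimesMillis(startTicks []uint64, ticksPerSec uint64, bootTime bootTimeFunc) ([]uint64, error) {
	if ticksPerSec == 0 {
		return nil, errors.New("ticksPerSec must be non-zero")
	}
	bt, err := bootTime() // single lookup, hoisted out of the loop
	if err != nil {
		return nil, err
	}
	out := make([]uint64, len(startTicks))
	for i, t := range startTicks {
		// Assumed conversion: boot time in seconds plus ticks scaled to ms.
		out[i] = bt*1000 + t*1000/ticksPerSec
	}
	return out, nil
}

func main() {
	calls := 0
	// Stub lookup that counts how often it is invoked.
	stub := func() (uint64, error) {
		calls++
		return 1648825444, nil
	}
	times, err := createTimesMillis([]uint64{0, 100, 250}, 100, stub)
	fmt.Println(times, err, "boot time lookups:", calls)
}
```

With this shape the cost of reading /proc/stat is paid once per listing, and every create time in one result set is computed from the same boot-time value.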

In addition, can we have another method that gets the process info but excludes the create time?

@Lomanic I saw that you suggested unix.Sysinfo; why was this not accepted?

@xhsky

xhsky commented Jul 25, 2022

@dloucasfx I also encountered this problem. Is there a solution?
