libct/system.Stat: fix panic, speed up, add benchmarks #2696

Merged
merged 3 commits into from
Sep 9, 2021

Conversation

kolyshkin
Contributor

@kolyshkin kolyshkin commented Nov 30, 2020

1. libct/system/proc_test: fix, improve, add benchmarks

  1. Add a test case that tests parentheses in command.

  2. Replace individual comparisons with reflect.DeepEqual.
    This also fixes the wrong format verbs used in Fatalf statements.

  3. Replace Fatalf with Errorf so we don't bail out on the first
    failure, and do not check result on error.

  4. Add two benchmarks.

2. libct/system.Stat: fix/improve/speedup

  1. Remove PID field. It is totally useless -- we already know the PID.

  2. Rewrite parseStat() to make it faster and more correct:

    • do not use fmt.Scanf as it is very slow;
    • avoid splitting data into 20+ fields, of which we only need 2 (should also help the garbage collector);
    • make sure to not panic on short lines and other bad input;
    • add some bad input tests (some fail with old code);
    • use LastIndexByte instead of LastIndex where appropriate.

Benchmarks:

before (from the previous commit message):

BenchmarkParseStat-4              116415             10804 ns/op
BenchmarkParseRealStat-4             240           4781769 ns/op

after:

BenchmarkParseStat-4           1164948              1068 ns/op
BenchmarkParseRealStat-4           331           3458315 ns/op

We are seeing 10x speedup in a synthetic benchmark, and about 1.4x
speedup in a real world benchmark. All this while being more scrupulous
about any possible errors. I also suspect far fewer allocations and thus
less garbage to collect.

While at it, do not ignore any possible errors, and properly wrap those.

  3. Add I and P process states (available since Linux 4.14).
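
For illustration, the parsing approach described in item 2 might be sketched as follows. This is a minimal sketch under stated assumptions, not the code merged in this PR: the `Stat` struct, `errBadStat`, and the error messages are hypothetical, and the field positions follow proc(5), where the state is field 3 and the start time is field 22.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// Stat holds the values this sketch extracts.
// Hypothetical type for illustration; not the struct from this PR.
type Stat struct {
	Name      string
	State     byte
	StartTime uint64
}

// errBadStat is a hypothetical sentinel error for malformed input.
var errBadStat = errors.New("malformed stat data")

// parseStat parses the contents of /proc/<pid>/stat without fmt.Sscanf or
// strings.Split, returning an error (never panicking) on short or bad input.
func parseStat(data string) (Stat, error) {
	// The command name sits between the first '(' and the last ')',
	// so names that themselves contain parentheses are handled.
	first := strings.IndexByte(data, '(')
	last := strings.LastIndexByte(data, ')')
	if first < 0 || last <= first {
		return Stat{}, fmt.Errorf("%w: no command name in %q", errBadStat, data)
	}
	s := Stat{Name: data[first+1 : last]}

	// After ")" we expect " <state> " followed by the remaining fields.
	rest := data[last+1:]
	if len(rest) < 3 || rest[0] != ' ' || rest[2] != ' ' {
		return Stat{}, fmt.Errorf("%w: no state in %q", errBadStat, data)
	}
	s.State = rest[1]
	rest = rest[3:]

	// Skip fields 4 through 21 (18 fields) to reach the start time.
	for i := 0; i < 18; i++ {
		sp := strings.IndexByte(rest, ' ')
		if sp < 0 {
			return Stat{}, fmt.Errorf("%w: too few fields in %q", errBadStat, data)
		}
		rest = rest[sp+1:]
	}

	// The start time must be followed by a space.
	sp := strings.IndexByte(rest, ' ')
	if sp < 0 {
		return Stat{}, fmt.Errorf("%w: no space after start time in %q", errBadStat, data)
	}
	st, err := strconv.ParseUint(rest[:sp], 10, 64)
	if err != nil {
		return Stat{}, fmt.Errorf("%w: bad start time: %v", errBadStat, err)
	}
	s.StartTime = st
	return s, nil
}

func main() {
	// Synthetic input: 18 filler fields after the state, then the start time.
	line := "1234 (my (cmd)) S 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 4242 99 "
	s, err := parseStat(line)
	fmt.Printf("%s %c %d %v\n", s.Name, s.State, s.StartTime, err)
}
```

Only two index searches and one integer conversion touch the data, and no intermediate slice of fields is built, which is the kind of saving the benchmark numbers above reflect.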

History

  • v2: use pkg/errors more, remove t.Logf from test
  • v3: rebased; drop pkg/errors; gofumpt'ed
  • v4: rebased; improved description
  • v5: rebased; mention bad input tests, added second benchmark results
  • v6: remove PID field, do not use strings.Split, further speedup.
  • v7: readability improvements, added new process states.
  • v8: improved parser to bail earlier on incorrect data, more doc, more tests.

@kolyshkin
Contributor Author

Rebased to benefit from #2690 (while Travis was not able to test this PR for 6+ hours 😢)

thaJeztah
thaJeztah previously approved these changes Dec 1, 2020
Member

@thaJeztah thaJeztah left a comment

LGTM

@kolyshkin
Contributor Author

@AkihiroSuda @mrunalp PTAL

@kolyshkin
Contributor Author

rebased (for new CI)

thaJeztah
thaJeztah previously approved these changes Jan 7, 2021
Member

@thaJeztah thaJeztah left a comment

LGTM, but left some thoughts/suggestions for consideration

@kolyshkin kolyshkin added this to the 1.0.0-rc94 milestone Feb 2, 2021
@cyphar
Member

cyphar commented Feb 4, 2021

Is this a bugfix or an improvement? If it's an improvement it belongs in post-1.0 as a milestone.

@kolyshkin
Contributor Author

> Is this a bugfix or an improvement? If it's an improvement it belongs in post-1.0 as a milestone.

Improvement (except for eliminating a linter warning, which might be classified as a bugfix), and it's a lot of code, so moving to post-1.0.

@kolyshkin kolyshkin modified the milestones: 1.0.0-rc94, post-1.0 Feb 4, 2021
@kolyshkin kolyshkin force-pushed the proc-stat branch 2 times, most recently from 1ca7e5c to 8d6a5d4 Compare February 4, 2021 20:18
@cyphar
Member

cyphar commented Feb 5, 2021

Yeah agreed, this change looks a little scary tbh.

@kolyshkin
Contributor Author

close/open to re-kick ci

@kolyshkin
Contributor Author

v3: rebased; drop using pkg/errors; gofumpt'ed

@kolyshkin kolyshkin requested review from cyphar and AkihiroSuda June 29, 2021 04:13
@kolyshkin kolyshkin changed the title libct/system.Stat: improve and speed up libct/system.Stat: fix panic, speed up, add benchmarks Jun 29, 2021
@kolyshkin
Contributor Author

CI failure is unrelated (#3050)

@kolyshkin
Contributor Author

Rebased to fix CI.

@kolyshkin kolyshkin force-pushed the proc-stat branch 2 times, most recently from f4716d3 to 89deda8 Compare August 4, 2021 01:02
@kolyshkin
Contributor Author

Some quite interesting observations:

  • removing strings.Split (which results in a ~20-field slice) and a couple of fmt.Scanf calls results in a 7x improvement!
  • removing strings.Split (for much shorter data, only the first two fields) results in a further 2x improvement!
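
To make the observation concrete, here is a small sketch contrasting the two styles. The function names `secondFieldSplit` and `secondFieldIndex` are hypothetical and exist only for this comparison; they are not from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// secondFieldSplit builds a slice holding every field just to read one of
// them, so its cost grows with the total number of fields in the line.
func secondFieldSplit(line string) string {
	parts := strings.Split(line, " ")
	if len(parts) < 2 {
		return ""
	}
	return parts[1]
}

// secondFieldIndex locates the field boundaries with IndexByte and returns
// a substring of the input, without allocating a slice of fields.
func secondFieldIndex(line string) string {
	i := strings.IndexByte(line, ' ')
	if i < 0 {
		return ""
	}
	rest := line[i+1:]
	if j := strings.IndexByte(rest, ' '); j >= 0 {
		return rest[:j]
	}
	return rest
}

func main() {
	line := "1234 (cmd) S 1 2 3 4 5 6 7 8 9 10"
	fmt.Println(secondFieldSplit(line), secondFieldIndex(line))
}
```

Both functions return the same field; wrapping each in a `testing.B` loop is one way to measure the difference on a given machine.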

@kolyshkin
Contributor Author

@thaJeztah PTAL

Member

@thaJeztah thaJeztah left a comment

overall looks good; I was looking for corner-cases, and possibly found some (may not be important to add, but thought I'd post them)

I'm fine with merging as-is though

1. Add a test case that tests parentheses in command.

2. Replace individual comparisons with reflect.DeepEqual.
   This also fixes wrong %-style types in Fatalf statements.

3. Replace Fatalf with Errorf so we don't bail out on the first
   failure, and do not check result on error.

4. Add two benchmarks. On my laptop, they show:

BenchmarkParseStat-4       	  116415	     10804 ns/op
BenchmarkParseRealStat-4   	     240	   4781769 ns/op

Signed-off-by: Kir Kolyshkin <[email protected]>
1. Remove PID field as it is useless.

2. Rewrite parseStat() to make it faster and more correct:

 - do not use fmt.Scanf as it is very slow;
 - avoid splitting data into 20+ fields, of which we only need 2;
 - make sure to not panic on short lines and other bad input;
 - add some bad input tests (some fail with old code);
 - use LastIndexByte instead of LastIndex.

Benchmarks:

before (from the previous commit message):

> BenchmarkParseStat-4              116415             10804 ns/op
> BenchmarkParseRealStat-4             240           4781769 ns/op

after:

> BenchmarkParseStat-4       	 1164948	      1068 ns/op
> BenchmarkParseRealStat-4   	     331	   3458315 ns/op

We are seeing 10x speedup in a synthetic benchmark, and about 1.4x
speedup in a real world benchmark.

While at it, do not ignore any possible errors, and properly wrap those.

[v2: use pkg/errors more, remove t.Logf from test]
[v3: rebased; drop pkg/errors; gofumpt'ed]
[v4: rebased; improved description]
[v5: rebased; mention bad input tests, added second benchmark results]
[v6: remove PID field, do not use strings.Split, further speedup]

Signed-off-by: Kir Kolyshkin <[email protected]>
Those states are available since Linux 4.14 (kernel commits
8ef9925b02c23e3838d5 and 06eb61844d841d003). Before this
patch, they were shown as unknown.

This is mostly cosmetic.

Note that I is described in /proc/pid/status as just "idle", although
elsewhere it says it's an idle kernel thread. Let's have it as "idle"
for brevity.

Signed-off-by: Kir Kolyshkin <[email protected]>
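
A sketch of how such a state-to-name mapping could look is given below. The `State` type, the constant names, and the "unknown" formatting are assumptions for illustration and may differ from the code in this PR; the state letters and their descriptions are the ones the kernel reports in /proc/pid/status.

```go
package main

import "fmt"

// State is a process state letter as it appears in /proc/<pid>/stat.
// Hypothetical type for illustration.
type State rune

const (
	Dead        State = 'X'
	DiskSleep   State = 'D'
	Running     State = 'R'
	Sleeping    State = 'S'
	Stopped     State = 'T'
	TracingStop State = 't'
	Zombie      State = 'Z'
	Parked      State = 'P' // available since Linux 4.14
	Idle        State = 'I' // available since Linux 4.14
)

// String returns a human-readable name; states not listed above are
// reported as unknown rather than causing an error.
func (s State) String() string {
	switch s {
	case Dead:
		return "dead"
	case DiskSleep:
		return "disk sleep"
	case Running:
		return "running"
	case Sleeping:
		return "sleeping"
	case Stopped:
		return "stopped"
	case TracingStop:
		return "tracing stop"
	case Zombie:
		return "zombie"
	case Parked:
		return "parked"
	case Idle:
		return "idle"
	default:
		return fmt.Sprintf("unknown (%c)", rune(s))
	}
}

func main() {
	for _, s := range []State{Idle, Parked, State('?')} {
		fmt.Println(s)
	}
}
```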
Member

@thaJeztah thaJeztah left a comment

LGTM

Comment on lines +88 to +89
// 1. Look for the first '(' and the last ')' first, what's in between is Name.
// We expect at least 20 fields and a space after the last one.
Member

@cyphar cyphar Sep 9, 2021

Here's hoping that they don't add a new field in a few kernel versions that contains ) 😬. Why on earth doesn't /proc/self/status contain StartTime? If it did we could just use that rather than this unparseable crap. /rant

Member

I heard they will switch to YAML, because it seems to be popular 🤪

Member

@cyphar cyphar left a comment

LGTM.

3 participants