Correctly calculate cpu_seconds for processtree #9943

Open: berland wants to merge 1 commit into main from berland:correct_cpu_seconds

Conversation

berland

@berland berland commented Feb 3, 2025

If a subprocess in the processtree terminates early, its cpu_seconds value is lost unless we keep the values per pid until the end.

Since pids can be reused (although this is unlikely), we need to detect reuse by looking for a decreasing cpu_seconds value for a pid that has already been observed in the processtree.

The test has been modified in order to trigger this scenario.
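
To illustrate the first point, here is a minimal sketch with made-up snapshots. The function names `naive_total` and `accumulated_total` are hypothetical and do not come from this PR. Summing only the latest snapshot drops the CPU time of a child that has already exited, while keeping the last value seen for each pid retains it.

```python
def naive_total(snapshots: list[dict[int, float]]) -> float:
    """Sum cpu_seconds over the pids present in the latest snapshot only."""
    return sum(snapshots[-1].values()) if snapshots else 0.0


def accumulated_total(snapshots: list[dict[int, float]]) -> float:
    """Keep the last observed cpu_seconds for every pid ever seen, then sum."""
    cpu_seconds_per_pid: dict[int, float] = {}
    for snapshot in snapshots:
        # Later snapshots overwrite earlier values for pids that are still alive;
        # pids that have disappeared keep their last recorded value.
        cpu_seconds_per_pid.update(snapshot)
    return sum(cpu_seconds_per_pid.values())


# Fake data: pid 101 exits after the second snapshot.
snapshots = [
    {100: 1.0, 101: 2.0},
    {100: 2.0, 101: 5.0},
    {100: 3.0},
]

print(naive_total(snapshots))        # only pid 100 is counted
print(accumulated_total(snapshots))  # pid 101's last value is retained
```

This sketch deliberately ignores pid reuse, which needs the extra check described above.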

Issue
Resolves #9941

Approach
🗒️ Do the book-keeping necessary.

  • PR title captures the intent of the changes, and is fitting for release notes.
  • Added appropriate release note label
  • Commit history is consistent and clean, in line with the contribution guidelines.
  • Make sure unit tests pass locally after every commit (git rebase -i main --exec 'just rapid-tests')

When applicable

  • When there are user facing changes: Updated documentation
  • New behavior or changes to existing untested code: Ensured that unit tests are added (See Ground Rules).
  • Large PR: Prepare changes in small commits for more convenient review
  • Bug fix: Add regression test for the bug
  • Bug fix: Create Backport PR to latest release


codspeed-hq bot commented Feb 3, 2025

CodSpeed Performance Report

Merging #9943 will not alter performance

Comparing berland:correct_cpu_seconds (14c9939) with main (7c27a9c)

Summary

✅ 25 untouched benchmarks

@berland berland force-pushed the correct_cpu_seconds branch from 4852ce2 to 14c9939 Compare February 3, 2025 12:38
@berland berland mentioned this pull request Feb 3, 2025
9 tasks
@berland berland added the release-notes:bug-fix Automatically categorise as bug fix in release notes label Feb 3, 2025
(memory_rss, cpu_seconds_snapshot, oom_score, pids) = _get_processtree_data(
    process
)
for pid, seconds in cpu_seconds_snapshot.items():

You could extract this part to make it easier to test:

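from dataclasses import dataclass, field

@dataclass
class CpuTimer:
  # Last observed cpu_seconds for every pid seen so far
  _cpu_seconds_pr_pid: dict[str, float] = field(default_factory=dict, init=False)

  def update(self, cpu_seconds_snapshot):
    # Merge the snapshot into _cpu_seconds_pr_pid (body left open here)
    ...

  def total_cpu_seconds(self):
    return sum(self._cpu_seconds_pr_pid.values())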

Then you can easily test the logic by giving it fake cpu_seconds_snapshots.
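
A minimal sketch of how such a test could look, assuming one possible `update` implementation. The handling of a decreasing value (banking the old value in a hypothetical `_retired_cpu_seconds` field) and the use of integer pids are assumptions made for illustration, not necessarily what this PR does.

```python
from dataclasses import dataclass, field


@dataclass
class CpuTimer:
    # Latest cpu_seconds observed for each pid currently being tracked.
    _cpu_seconds_pr_pid: dict[int, float] = field(default_factory=dict, init=False)
    # Assumption: CPU time banked from processes whose pid was later reused.
    _retired_cpu_seconds: float = field(default=0.0, init=False)

    def update(self, cpu_seconds_snapshot: dict[int, float]) -> None:
        for pid, seconds in cpu_seconds_snapshot.items():
            previous = self._cpu_seconds_pr_pid.get(pid)
            if previous is not None and seconds < previous:
                # A decreasing value is taken to mean the pid was reused by a
                # new process, so the old process's time is kept separately.
                self._retired_cpu_seconds += previous
            self._cpu_seconds_pr_pid[pid] = seconds

    def total_cpu_seconds(self) -> float:
        return self._retired_cpu_seconds + sum(self._cpu_seconds_pr_pid.values())


# Fake snapshots: pid 101 exits, then its pid is reused by a new process.
timer = CpuTimer()
timer.update({100: 1.0, 101: 4.0})
timer.update({100: 2.0})            # pid 101 has exited
timer.update({100: 3.0, 101: 0.5})  # pid 101 reappears with a lower value
print(timer.total_cpu_seconds())
```

Because `update` only takes a plain dict, the early-exit and pid-reuse scenarios can be exercised without spawning real processes.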


@eivindjahren eivindjahren left a comment


I had a small suggestion for improving the testability of the logic, but I see no need to block merging. The suggested refactoring and the accompanying tests can be done in a separate PR.

Labels
release-notes:bug-fix Automatically categorise as bug fix in release notes
Development

Successfully merging this pull request may close these issues.

Logging of cpu-seconds will miss subprocesses that exit before the main process
2 participants