Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v3.0.1-rhel] podman stats: calc CPU percentage correctly #18710

Conversation

TomSweeneyRedHat
Copy link
Member

@TomSweeneyRedHat TomSweeneyRedHat commented May 27, 2023

When you run podman stats, the first interval always shows the wrong cpu
usage. To calculate cpu percentage we get the cpu time from the cgroup
and compare this against the system time between two stats. Since the
first time we do not have a previous stats an empty struct is used
instead. Thus we do not use the actual running time of the container but
the current unix timestamp (time since Jan 1 1970).

To fix this we make sure that the previous stats time is set to the
container start time, when it is empty.

[NO NEW TESTS NEEDED] No idea how I could create a test which would have
a predictable cpu usage.

See the linked bugzilla for a reproducer.
Addresses: https://bugzilla.redhat.com/show_bug.cgi?id=2210139

Does this PR introduce a user-facing change?

None

@openshift-ci openshift-ci bot added the do-not-merge/release-note-label-needed Enforce release-note requirement, even if just None label May 27, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented May 27, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: TomSweeneyRedHat

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. release-note-none and removed do-not-merge/release-note-label-needed Enforce release-note requirement, even if just None labels May 27, 2023
@@ -168,7 +170,7 @@ func outputStats(reports []define.ContainerStats) error {
if report.IsJSON(statsOptions.Format) {
return outputJSON(stats)
}
format := "{{.ID}}\t{{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}\t{{.NetIO}}\t{{.BlockIO}}\t{{.PIDS}}\n"
format := "{{.ID}}\t{{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}\t{{.NetIO}}\t{{.BlockIO}}\t{{.PIDS}}\t{{.UpTime}}\t{{.AVGCPU}}\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why commit 1d36ad56bd2714ee482539dc3983ad1232d293f1 is backported? This is changing the default podman stats output which IMO should never be done for backports/minor releases.

Reading the bug the request is only to fix the broken CpuPerc calculation not adding new features. I would expect a backport of only afb72eaa849f577842e51de96a497406d3e87fea to be enough to fix the issue, of course you may need to rebase so it applies correctly to 3.0.

Copy link
Member Author

@TomSweeneyRedHat TomSweeneyRedHat Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I included 1d36ad5 is that I initially grabbed both commits from #13597, and that included 130bcc3, which includes calculations on the stats.AvgCPU field, which was defined by 1d36ad5. I wasn't thrilled about the change to the stats line either, given the versions. Let me go revisit the BZ and see if we can pull it off with just the one commit. Thanks for the review.

When you run podman stats, the first interval always shows the wrong cpu
usage. To calculate cpu percentage we get the cpu time from the cgroup
and compare this against the system time between two stats. Since the
first time we do not have a previous stats an empty struct is used
instead. Thus we do not use the actual running time of the container but
the current unix timestamp (time since Jan 1 1970).

To fix this we make sure that the previous stats time is set to the
container start time, when it is empty.

[NO NEW TESTS NEEDED] No idea how I could create a test which would have
a predictable cpu usage.

See the linked bugzilla for a reproducer.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2066145

Signed-off-by: Paul Holzinger <[email protected]>
Signed-off-by: Tom Sweeney <[email protected]>
@TomSweeneyRedHat TomSweeneyRedHat force-pushed the dev/tsweeney/pscpufix_v3.0.1-rhel branch from 958629b to ca3d490 Compare June 1, 2023 22:20
@TomSweeneyRedHat TomSweeneyRedHat changed the title [v3.0.1-rhel] podman stats: improve cpu average calc [v3.0.1-rhel] podman stats: calc CPU percentage correctly Jun 1, 2023
@TomSweeneyRedHat
Copy link
Member Author

TomSweeneyRedHat commented Jun 1, 2023

@Luap99 PTA(nother)L

Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rhatdan
Copy link
Member

rhatdan commented Jun 5, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 5, 2023
@openshift-merge-robot openshift-merge-robot merged commit 9581020 into containers:v3.0.1-rhel Jun 5, 2023
@TomSweeneyRedHat TomSweeneyRedHat deleted the dev/tsweeney/pscpufix_v3.0.1-rhel branch June 30, 2023 21:42
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 28, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 28, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note-none
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants