-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: add a interval parameter to cli and api stats streaming #11003
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM, thank you! Shall we do the same pod stats
?
I have no good idea how we could test that (Cc @edsantiago).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR.
Some comments which should be addressed. Also you have to add this new parameter to the pkg/bindings
here to make it work with podman-remote, make sure to run make generate-bindings
afterwards.
Your commit is also missing the sign-off-by line.
Possible test, e.g. #!/usr/bin/env bats -*- bats -*-
#
# tests for podman stats
#
load helpers
@test "podman stats --interval" {
# Simple boring process
run_podman run -d --name foo $IMAGE sleep 30
# Run stats for 15 seconds. This blocks.
PODMAN_TIMEOUT=15 run_podman 124 stats --interval 10
# Confirm that we see some headers, on first and third lines
is "${lines[0]}" "ID.*NAME.*CPU.*MEM USAGE.*CPU TIME" "stat headers seen"
is "${lines[2]}" "ID.*NAME.*CPU.*MEM USAGE.*CPU TIME" \
"stat headers seen (second iteration)"
# Last line should be a timeout
is "${lines[4]}" "timeout: sending signal TERM.*" "timeout received"
# Confirm that --interval worked (remember that we have a timeout line)
is "${#lines[*]}" "5" "number of stat lines in 15 seconds"
run_podman stop --time=0 foo
run_podman rm -f foo
}
# vim: filetype=sh I really hate this kind of test, because it adds wall-clock time to our already-much-too-long CI. I'm commenting simply to offer the option, but I hope that others decide it is not worth the CI bloat. |
I am fine with not testing this. |
Can we add a test that invalid times are rejected (e.g. Otherwise LGTM |
LGTM, after you add invalid --interval check. |
@mheon before i write the test: do you expect that the code actively rejects (fails) when called with zero value or negative interval? Currently it makes a graceful fallback to 1 second. Failing would make a cli and api test simpler, of course. |
Yes the code should fail, on invalid values being passed in. |
Sry I forgot you also have to add this for podman-remote as well diff --git a/pkg/domain/infra/tunnel/containers.go b/pkg/domain/infra/tunnel/containers.go
index c17d7b54f..1df79c373 100644
--- a/pkg/domain/infra/tunnel/containers.go
+++ b/pkg/domain/infra/tunnel/containers.go
@@ -871,7 +871,7 @@ func (ic *ContainerEngine) ContainerStats(ctx context.Context, namesOrIds []stri
if options.Latest {
return nil, errors.New("latest is not supported for the remote client")
}
- return containers.Stats(ic.ClientCtx, namesOrIds, new(containers.StatsOptions).WithStream(options.Stream))
+ return containers.Stats(ic.ClientCtx, namesOrIds, new(containers.StatsOptions).WithStream(options.Stream).WithInterval(options.Interval))
}
// ShouldRestart reports back whether the container will restart |
Concur with @rhatdan - we should actively error on bad values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any thoughts on adding this to the pods side of things too? https://github.com/containers/podman/blob/main/cmd/podman/pods/stats.go
@Luap99 : is addressed in latest commit |
I tried to get into your e2e tests in order to extend the existing stats_test.go.
After an hour or so i found the "// +build" line at the top of the file and removed it. It's the only test with exactly this annotation. Is it intentional behavior or a copy/paste bug? Also i would recommend to change the spec descriptions in pod_stats_test.go to differ from stats_test.go. Eg. "podman pod stats ..." vs "podman stats ...". Should i include it in this PR? |
That's definitely a real bug, please remove the
Yes |
Ok, seems like i found a rabbit hole here. Re-enabled the stats tests, found broken test, found that podman stats exits with 0 if a invalid format string is given. Find my tests for the interval flag for negative, zero and positive value. Hope everything is fine now. |
@towe75 the failing check is (correctly) noting that your commit message summary (the first line) is too long. |
podman stats polled by default in a 1 sec period. This can put quite some load on a machine if you run many containers. The default value is now 5 seconds. You can change this interval with a new, optional, --interval, -i cli flag. The api request got also a interval query parameter for the same purpose. Additionally a unused const was removed. Api and cli will fail the request if a 0 or negative value is passed in. Signed-off-by: Thomas Weber <[email protected]>
@edsantiago : sorry, fixed Something is still wrong with the tests. My added tests are fine when running "localintegration". I can see that your CI tests invoke tests using podman remote and errors/exit-code is not propagated automatically/correctly.
Did i miss some plumbing? |
@towe75 Sorry for the silence, I lost track of this PR. You did nothing wrong the stats binding ignored server errors. This diff should make it work: diff --git a/pkg/bindings/containers/containers.go b/pkg/bindings/containers/containers.go
index 86304f392..bc7b0c8c9 100644
--- a/pkg/bindings/containers/containers.go
+++ b/pkg/bindings/containers/containers.go
@@ -223,6 +223,9 @@ func Stats(ctx context.Context, containers []string, options *StatsOptions) (cha
if err != nil {
return nil, err
}
+ if !response.IsSuccess() {
+ return nil, response.Process(nil)
+ }
statsChan := make(chan entities.ContainerStatsReport) |
Renamed podman pod stats test specs to distinguish them from podman stats tests. podman stats tests where disabled by a +build flag. Fix podman stats format test, add negative test. Fix podman stats cli command, exit non-zero on invalid format string. Add tests for podman stats interval flag. Signed-off-by: Thomas Weber <[email protected]>
@Luap99 i think it's ok now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I just want to point out that is is changing the default interval from 1 to 5 seconds. I am fine with that. Let's hear what other Maintainers think. @containers/podman-maintainers PTAL
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Luap99, mheon, towe75 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
podman stats cli and api polls reports by default in a 1 sec period.
This can put some load on a machine if you run many containers.
A simple test is to run e.g. a range of pause containers like so:
seq 35 |xargs -rn1 podman run -d --rm gcr.io/google_containers/pause
Next, run "podman stats" (or the API request), and monitor e.g. "Top" to see the load.
I found that the compat api endpoint polls by default in a 5 sec period which is not so aggressive.
Making it configurable seemed to be a good solution, hence this PR. You can now easily align the poll period and e.g. your metrics collector interval.
There was also a unused 5 sec const in the libpod endpoint, i removed it.
You can now change this interval with a optional --interval cli flag.
The api request got a interval query parameter for the same purpose.
Default behavior is not changed.
I could not find a unit or e2e test for the stats streaming mode, so i was unsure how to provide/enhance a test.
CLI Manual and REST API doc should be covered in the PR.