Add support for podman metrics in docker module #41889

Merged (10 commits) on Dec 10, 2024

Conversation

MichaelKatsoulis
Contributor

Proposed commit message

  • WHAT: Enhance the docker module so that podman metrics are collected and calculated correctly
  • WHY: To support podman metrics in the docker module

Details

Podman offers a Docker-compatible API for metrics collection, so using the docker module to collect podman metrics should work out of the box.
In practice, the memory metrics were not published at all and the CPU usage percentage was calculated incorrectly.
The reason is that the podman API returns zero values for precpu_stats. These stats hold the previously read CPU statistics, which are needed for the CPU percentage calculation. Because these stats were zero, the memory metrics were also not populated, due to a sanity check in the code:

if containerStats.Stats.MemoryStats.Limit == 0 || containerStats.Stats.PreCPUStats.CPUUsage.TotalUsage == 0 {

For docker, the precpu_stats are returned every time.
The solution in podman's case is to stream the API response. By default the stream option was set to false, as there was no need for it with docker.
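
To illustrate why zero precpu_stats break the calculation, here is a minimal sketch of a delta-based CPU percentage in the style commonly used with the Docker stats API. The `cpuSample` type and `cpuPercent` function are hypothetical names for this example and are not taken from the Metricbeat code, whose exact formula may differ.

```go
package main

import "fmt"

// cpuSample is a hypothetical, trimmed-down stand-in for the cpu_stats and
// precpu_stats objects of a stats API response.
type cpuSample struct {
	TotalUsage  uint64 // cumulative container CPU time
	SystemUsage uint64 // cumulative host CPU time
	OnlineCPUs  uint32 // number of CPUs available to the container
}

// cpuPercent computes CPU usage from the delta between the previous (pre)
// and the current (cur) sample. It returns false when there is no usable
// baseline, for example when pre is all zeros.
func cpuPercent(pre, cur cpuSample) (float64, bool) {
	if pre.TotalUsage == 0 ||
		cur.TotalUsage < pre.TotalUsage ||
		cur.SystemUsage <= pre.SystemUsage {
		return 0, false
	}
	cpuDelta := float64(cur.TotalUsage - pre.TotalUsage)
	sysDelta := float64(cur.SystemUsage - pre.SystemUsage)
	return cpuDelta / sysDelta * float64(cur.OnlineCPUs) * 100, true
}

func main() {
	cur := cpuSample{TotalUsage: 650, SystemUsage: 3000, OnlineCPUs: 2}

	// With a real previous sample, the delta is meaningful.
	pre := cpuSample{TotalUsage: 400, SystemUsage: 2000, OnlineCPUs: 2}
	if pct, ok := cpuPercent(pre, cur); ok {
		fmt.Printf("cpu: %.1f%%\n", pct)
	}

	// With zeroed precpu_stats there is no baseline to diff against.
	if _, ok := cpuPercent(cpuSample{}, cur); !ok {
		fmt.Println("cpu: no baseline, skipping")
	}
}
```

Without the baseline check, a zeroed previous sample would make the delta equal to the cumulative totals since container start, which is not the usage over the last sampling interval.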

This PR introduces a new configuration parameter named podman (false by default). If it is set to true, indicating that podman is in use, the stream parameter is set to true, but only for the collection of cpu and memory stats, because those are the stats that need precpu_stats.
From the streamed response we take the second response, because testing showed that the precpu_stats of the first response were incorrect and the CPU usage was miscalculated.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

There is no disruptive user impact.

How to test this PR locally

  1. Start a podman machine (on macOS, use Podman Desktop)
  2. Start an nginx test container: podman run -d -p 8080:80 --name mynginx nginx
  3. Exec into the mynginx container and start a CPU-intensive task:
    a. podman exec -ti mynginx bash
    b. while true; do :; done &
  4. Run metricbeat locally with the docker module enabled and the podman parameter set to true
  5. Watch the docker.* metrics being published to ES with correct values. Compare them with the output of the podman stats command.

Related issues

Use cases

Screenshots

Podman Stats command (check for mynginx container)
podman stats

docker cpu percentage

docker memory percentage

@MichaelKatsoulis MichaelKatsoulis requested a review from a team as a code owner December 4, 2024 13:30
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Dec 4, 2024
Contributor

mergify bot commented Dec 4, 2024

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @MichaelKatsoulis? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit

Contributor

mergify bot commented Dec 4, 2024

backport-8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Dec 4, 2024
@gizas
Contributor

gizas commented Dec 4, 2024

The relevant docker integration should also be updated with the new variable podman, right?

@bturquet bturquet added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Dec 5, 2024
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Dec 5, 2024
@MichaelKatsoulis
Contributor Author

@elastic/elastic-agent-data-plane could I get a review here as you are the codeowners?

Member

@rdner rdner left a comment

Only one suggestion but can be merged in the current state too.

Comment on lines 150 to 154
    if !stream {
        if err := decoder.Decode(&event.Stats); err != nil {
            return event
        }
    } else {
Member

It looks like in either case (error or not) we return the content of the event variable here. Does it even make sense to check for the error?

Contributor Author

You are right, I was just receiving a linter error. I decided to add a debug log message.

@MichaelKatsoulis MichaelKatsoulis added backport-8.16 Automated backport with mergify backport-8.17 Automated backport with mergify labels Dec 10, 2024
@MichaelKatsoulis MichaelKatsoulis enabled auto-merge (squash) December 10, 2024 08:43
@MichaelKatsoulis MichaelKatsoulis merged commit 1fefdbb into elastic:main Dec 10, 2024
32 checks passed
mergify bot pushed a commit that referenced this pull request Dec 10, 2024
* Add support for podman metrics

(cherry picked from commit 1fefdbb)
MichaelKatsoulis added a commit that referenced this pull request Dec 10, 2024
* Add support for podman metrics

(cherry picked from commit 1fefdbb)

Co-authored-by: Michael Katsoulis <[email protected]>
MichaelKatsoulis added a commit that referenced this pull request Dec 11, 2024
#41967)

* Add support for podman metrics in docker module (#41889)

---------

Co-authored-by: Michael Katsoulis <[email protected]>
michalpristas pushed a commit to michalpristas/beats that referenced this pull request Dec 13, 2024
Labels
backport-8.x Automated backport to the 8.x branch with mergify backport-8.16 Automated backport with mergify backport-8.17 Automated backport with mergify enhancement Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team