Add support for podman metrics in docker module #41889

Merged on Dec 10, 2024 (10 commits)
9 changes: 9 additions & 0 deletions metricbeat/docs/modules/docker.asciidoc
Original file line number Diff line number Diff line change
@@ -22,6 +22,9 @@ The Docker module is currently tested on Linux and Mac with the community
edition engine, versions 1.11 and 17.09.0-ce. It is not tested on Windows,
but it should also work there.

The Docker module supports collection of metrics from Podman's Docker-compatible API.
It has been tested on Linux and Mac with Podman REST API v2.0.0 and above.

[float]
=== Module-specific configuration notes

@@ -30,6 +33,9 @@ It is strongly recommended that you run Docker metricsets with a
Docker API already takes up to 2 seconds. Specifying less than 3 seconds will
result in requests that time out, and no data will be reported for those
requests.
In the case of Podman, the configuration parameter `podman` should be set to `true`.
This enables streaming of container stats output, which allows for more accurate
CPU percentage calculations when using Podman.


:edit_url:
@@ -62,6 +68,9 @@ metricbeat.modules:
# If set to true, replace dots in labels with `_`.
#labels.dedot: false
# The Docker module can collect metrics from Podman's Docker-compatible API. Set this to true when using Podman.
# podman: false
# Skip metrics for certain device major numbers in docker/diskio.
# Necessary on systems with software RAID, device mappers,
# or other configurations where virtual disks will sum metrics from other disks.
3 changes: 3 additions & 0 deletions metricbeat/metricbeat.reference.yml
Original file line number Diff line number Diff line change
@@ -268,6 +268,9 @@ metricbeat.modules:
# If set to true, replace dots in labels with `_`.
#labels.dedot: false

# The Docker module can collect metrics from Podman's Docker-compatible API. Set this to true when using Podman.
# podman: false

# Skip metrics for certain device major numbers in docker/diskio.
# Necessary on systems with software RAID, device mappers,
# or other configurations where virtual disks will sum metrics from other disks.
3 changes: 3 additions & 0 deletions metricbeat/module/docker/_meta/config.reference.yml
Original file line number Diff line number Diff line change
@@ -17,6 +17,9 @@
# If set to true, replace dots in labels with `_`.
#labels.dedot: false

# The Docker module can collect metrics from Podman's Docker-compatible API. Set this to true when using Podman.
# podman: false

# Skip metrics for certain device major numbers in docker/diskio.
# Necessary on systems with software RAID, device mappers,
# or other configurations where virtual disks will sum metrics from other disks.
3 changes: 3 additions & 0 deletions metricbeat/module/docker/_meta/config.yml
Original file line number Diff line number Diff line change
@@ -15,6 +15,9 @@
# If set to true, replace dots in labels with `_`.
#labels.dedot: false

# The Docker module can collect metrics from Podman's Docker-compatible API. Set this to true when using Podman.
# podman: false

# Skip metrics for certain device major numbers in docker/diskio.
# Necessary on systems with software RAID, device mappers,
# or other configurations where virtual disks will sum metrics from other disks.
6 changes: 6 additions & 0 deletions metricbeat/module/docker/_meta/docs.asciidoc
Original file line number Diff line number Diff line change
@@ -11,6 +11,9 @@ The Docker module is currently tested on Linux and Mac with the community
edition engine, versions 1.11 and 17.09.0-ce. It is not tested on Windows,
but it should also work there.

The Docker module supports collection of metrics from Podman's Docker-compatible API.
It has been tested on Linux and Mac with Podman REST API v2.0.0 and above.

[float]
=== Module-specific configuration notes

@@ -19,3 +22,6 @@ It is strongly recommended that you run Docker metricsets with a
Docker API already takes up to 2 seconds. Specifying less than 3 seconds will
result in requests that time out, and no data will be reported for those
requests.
In the case of Podman, the configuration parameter `podman` should be set to `true`.
This enables streaming of container stats output, which allows for more accurate
CPU percentage calculations when using Podman.
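
To illustrate why a previous reading matters for CPU percentages, here is a minimal Go sketch of the delta-based formula commonly used for container CPU usage. The `cpuSample` type, the `cpuPercent` function, and the sample numbers are all hypothetical, and the metricset's own calculation may differ in detail.

```go
package main

import "fmt"

// cpuSample is a hypothetical pair of cumulative counters from one stats reading.
type cpuSample struct {
	ContainerUsage uint64 // cumulative CPU time used by the container
	SystemUsage    uint64 // cumulative CPU time used by the whole host
}

// cpuPercent returns the container's share of host CPU time between two
// readings, scaled by the number of online CPUs. It returns 0 when the
// counters did not advance, which avoids a division by zero.
func cpuPercent(prev, cur cpuSample, onlineCPUs int) float64 {
	if cur.SystemUsage <= prev.SystemUsage || cur.ContainerUsage < prev.ContainerUsage {
		return 0
	}
	cpuDelta := float64(cur.ContainerUsage - prev.ContainerUsage)
	sysDelta := float64(cur.SystemUsage - prev.SystemUsage)
	return cpuDelta / sysDelta * float64(onlineCPUs) * 100
}

func main() {
	prev := cpuSample{ContainerUsage: 1000, SystemUsage: 10000}
	cur := cpuSample{ContainerUsage: 1500, SystemUsage: 12000}

	// With a real previous reading, the result reflects the last interval.
	fmt.Printf("with previous reading:    %.1f%%\n", cpuPercent(prev, cur, 4))

	// With an empty previous reading, the result is an average since start,
	// not the usage over the most recent interval.
	fmt.Printf("without previous reading: %.1f%%\n", cpuPercent(cpuSample{}, cur, 4))
}
```

When the previous reading is empty, the deltas span the container's whole lifetime, so the value no longer describes current usage. That is the situation the `podman` setting is meant to avoid.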
8 changes: 5 additions & 3 deletions metricbeat/module/docker/config.go
Original file line number Diff line number Diff line change
@@ -19,14 +19,16 @@ package docker

// Config contains the config needed for the docker module
type Config struct {
TLS *TLSConfig `config:"ssl"`
DeDot bool `config:"labels.dedot"`
TLS *TLSConfig `config:"ssl"`
DeDot bool `config:"labels.dedot"`
Podman bool `config:"podman"`
}

// DefaultConfig returns default module config
func DefaultConfig() Config {
return Config{
DeDot: true,
DeDot: true,
Podman: false,
}
}

4 changes: 3 additions & 1 deletion metricbeat/module/docker/cpu/cpu.go
Original file line number Diff line number Diff line change
@@ -40,6 +40,7 @@ type MetricSet struct {
cpuService *CPUService
dockerClient *client.Client
dedot bool
podman bool
}

// New creates a new instance of the docker cpu MetricSet.
@@ -68,12 +69,13 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
dockerClient: client,
cpuService: &CPUService{Cores: cpuConfig.Cores},
dedot: config.DeDot,
podman: config.Podman,
}, nil
}

// Fetch returns a list of docker CPU stats.
func (m *MetricSet) Fetch(r mb.ReporterV2) error {
stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, m.podman)
if err != nil {
return fmt.Errorf("failed to get docker stats: %w", err)
}
2 changes: 1 addition & 1 deletion metricbeat/module/docker/diskio/diskio.go
Original file line number Diff line number Diff line change
@@ -89,7 +89,7 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {

// Fetch creates list of events with diskio stats for all containers.
func (m *MetricSet) Fetch(r mb.ReporterV2) error {
stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, false)
if err != nil {
return fmt.Errorf("failed to get docker stats: %w", err)
}
36 changes: 28 additions & 8 deletions metricbeat/module/docker/docker.go
Original file line number Diff line number Diff line change
@@ -91,7 +91,7 @@ func NewDockerClient(endpoint string, config Config) (*client.Client, error) {
}

// FetchStats returns a list of running containers with all related stats inside
func FetchStats(client *client.Client, timeout time.Duration) ([]Stat, error) {
func FetchStats(client *client.Client, timeout time.Duration, stream bool) ([]Stat, error) {
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
containers, err := client.ContainerList(ctx, container.ListOptions{})
@@ -108,7 +108,7 @@ func FetchStats(client *client.Client, timeout time.Duration) ([]Stat, error) {
for _, container := range containers {
go func(container types.Container) {
defer wg.Done()
statsQueue <- exportContainerStats(ctx, client, &container)
statsQueue <- exportContainerStats(ctx, client, &container, stream)
}(container)
}

@@ -133,18 +133,38 @@ func FetchStats(client *client.Client, timeout time.Duration) ([]Stat, error) {
// This is currently very inefficient because docker calculates the average for each request,
// which means each request will take at least 2s: https://github.com/docker/docker/blob/master/cli/command/container/stats_helpers.go#L148
// Getting all stats at once is implemented here: https://github.com/docker/docker/pull/25361
func exportContainerStats(ctx context.Context, client *client.Client, container *types.Container) Stat {
// When stream is true, we request a stream of stats results for the container and keep the second result.
// This is needed for Podman: with stream set to false, it returns no precpu stats, and the precpu stats
// are required for the CPU percentage calculation. We keep the second result because the stats in the first one are not correct.
func exportContainerStats(ctx context.Context, client *client.Client, container *types.Container, stream bool) Stat {
var event Stat
event.Container = container

containerStats, err := client.ContainerStats(ctx, container.ID, false)
containerStats, err := client.ContainerStats(ctx, container.ID, stream)
if err != nil {
return event
}

defer containerStats.Body.Close()
decoder := json.NewDecoder(containerStats.Body)
decoder.Decode(&event.Stats)

// JSON decoder
decoder := json.NewDecoder(containerStats.Body)
if !stream {
if err := decoder.Decode(&event.Stats); err != nil {
return event
}
} else {
Review comment (Member): Looks like in either case (error or not) we return the content of the event variable here. Does it even make sense to check for the error in this case?

Reply (Contributor Author): You are right, I was just receiving a linter error. I decided to add a debug log message.

// handle stream. Take the second result.
count := 0
for decoder.More() {
if err := decoder.Decode(&event.Stats); err != nil {
return event
}

count++
// Exit after the second result
if count == 2 {
break
}
}
}
return event
}
4 changes: 3 additions & 1 deletion metricbeat/module/docker/memory/memory.go
Original file line number Diff line number Diff line change
@@ -43,6 +43,7 @@ type MetricSet struct {
memoryService *MemoryService
dockerClient *client.Client
dedot bool
podman bool
logger *logp.Logger
}

@@ -64,13 +65,14 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
memoryService: &MemoryService{},
dockerClient: dockerClient,
dedot: config.DeDot,
podman: config.Podman,
logger: logger,
}, nil
}

// Fetch creates a list of memory events for each container.
func (m *MetricSet) Fetch(r mb.ReporterV2) error {
stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, m.podman)
if err != nil {
return fmt.Errorf("failed to get docker stats: %w", err)
}
2 changes: 1 addition & 1 deletion metricbeat/module/docker/network/network.go
Original file line number Diff line number Diff line change
@@ -66,7 +66,7 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {

// Fetch method creates a list of network events for each container.
func (m *MetricSet) Fetch(r mb.ReporterV2) error {
stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, false)
if err != nil {
return fmt.Errorf("failed to get docker stats: %w", err)
}
Original file line number Diff line number Diff line change
@@ -84,7 +84,7 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
// of an error set the Error field of mb.Event or simply call report.Error().
func (m *MetricSet) Fetch(ctx context.Context, report mb.ReporterV2) error {

stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, false)
if err != nil {
return fmt.Errorf("failed to get docker stats: %w", err)
}
3 changes: 3 additions & 0 deletions metricbeat/modules.d/docker.yml.disabled
Original file line number Diff line number Diff line change
@@ -18,6 +18,9 @@
# If set to true, replace dots in labels with `_`.
#labels.dedot: false

# The Docker module can collect metrics from Podman's Docker-compatible API. Set this to true when using Podman.
# podman: false

# Skip metrics for certain device major numbers in docker/diskio.
# Necessary on systems with software RAID, device mappers,
# or other configurations where virtual disks will sum metrics from other disks.
3 changes: 3 additions & 0 deletions x-pack/metricbeat/metricbeat.reference.yml
Original file line number Diff line number Diff line change
@@ -521,6 +521,9 @@ metricbeat.modules:
# If set to true, replace dots in labels with `_`.
#labels.dedot: false

# The Docker module can collect metrics from Podman's Docker-compatible API. Set this to true when using Podman.
# podman: false

# Skip metrics for certain device major numbers in docker/diskio.
# Necessary on systems with software RAID, device mappers,
# or other configurations where virtual disks will sum metrics from other disks.