Skip to content

plazonic/cgroup_exporter

Repository files navigation

cgroup Prometheus exporter

Build Status GitHub release GitHub All Releases codecov

cgroup Prometheus exporter

The cgroup_exporter produces metrics from cgroups.

This is a modified version of https://github.com/treydock/cgroup_exporter tuned to work better for slurm cgroups v1. For cgroups v2 please check the cgroupv2 branch.

This exporter by default listens on port 9306 and all metrics are exposed via the /metrics endpoint.

Usage

The --config.paths flag is required and must point to paths of cgroups to monitor. If there is /sys/fs/cgroup/cpuacct/user.slice then the value for --config.paths would be /user.slice.

Install

Clone the repo

Build from source

To produce the cgroup_exporter binaries:

go build

Process metrics

If you wish to collect process information for a cgroup pass the --collect.proc flag. If this exporter is not running as root then it's required to set capabilities to ensure the user running this exporter can read everything under procfs:

setcap cap_sys_ptrace=eip /usr/bin/cgroup_exporter

Metrics

Example of metrics exposed by this exporter when looking at /user.slice paths:

cgroup_cpu_system_seconds{cgroup="/user.slice/user-20821.slice"} 1.96
cgroup_cpu_total_seconds{cgroup="/user.slice/user-20821.slice"} 3.817500568
cgroup_cpu_user_seconds{cgroup="/user.slice/user-20821.slice"} 1.61
cgroup_cpus{cgroup="/user.slice/user-20821.slice"} 0
cgroup_cpu_info{cgroup="/user.slice/user-20821.slice",cpus=""} 1
cgroup_info{cgroup="/user.slice/user-20821.slice",uid="20821",username="tdockendorf",jobid=""} 1
cgroup_memory_cache_bytes{cgroup="/user.slice/user-20821.slice"} 2.322432e+06
cgroup_memory_fail_count{cgroup="/user.slice/user-20821.slice"} 0
cgroup_memory_rss_bytes{cgroup="/user.slice/user-20821.slice"} 5.378048e+06
cgroup_memory_total_bytes{cgroup="/user.slice/user-20821.slice"} 6.8719476736e+10
cgroup_memory_used_bytes{cgroup="/user.slice/user-20821.slice"} 6.90176e+06
cgroup_memsw_fail_count{cgroup="/user.slice/user-20821.slice"} 0
cgroup_memsw_total_bytes{cgroup="/user.slice/user-20821.slice"} 9.223371968135295e+18
cgroup_memsw_used_bytes{cgroup="/user.slice/user-20821.slice"} 0

Example of metrics exposed by this exporter when looking at /slurm paths:

cgroup_cpu_system_seconds{cgroup="/slurm/uid_20821/job_12"} 0
cgroup_cpu_total_seconds{cgroup="/slurm/uid_20821/job_12"} 0.007840451
cgroup_cpu_user_seconds{cgroup="/slurm/uid_20821/job_12"} 0
cgroup_cpus{cgroup="/slurm/uid_20821/job_12"} 2
cgroup_cpu_info{cgroup="/slurm/uid_20821/job_12",cpus="0,1"} 1
cgroup_info{cgroup="/slurm/uid_20821/job_12",jobid="12",uid="20821",username="tdockendorf"} 1
cgroup_memory_cache_bytes{cgroup="/slurm/uid_20821/job_12"} 4.096e+03
cgroup_memory_fail_count{cgroup="/slurm/uid_20821/job_12"} 0
cgroup_memory_rss_bytes{cgroup="/slurm/uid_20821/job_12"} 3.11296e+05
cgroup_memory_total_bytes{cgroup="/slurm/uid_20821/job_12"} 2.147483648e+09
cgroup_memory_used_bytes{cgroup="/slurm/uid_20821/job_12"} 315392
cgroup_memsw_fail_count{cgroup="/slurm/uid_20821/job_12"} 0
cgroup_memsw_total_bytes{cgroup="/slurm/uid_20821/job_12"} 2.147483648e+09
cgroup_memsw_used_bytes{cgroup="/slurm/uid_20821/job_12"} 315392

Example of metrics exposed by this exporter when looking at /torque paths:

cgroup_cpu_system_seconds{cgroup="/torque/1182958.batch.example.com"} 26.35
cgroup_cpu_total_seconds{cgroup="/torque/1182958.batch.example.com"} 939.568245515
cgroup_cpu_user_seconds{cgroup="/torque/1182958.batch.example.com"} 915.61
cgroup_cpus{cgroup="/torque/1182958.batch.example.com"} 8
cgroup_cpu_info{cgroup="/torque/1182958.batch.example.com",cpus="0,1,2,3,4,5,6,7,8"} 1
cgroup_info{cgroup="/torque/1182958.batch.example.com",jobid="1182958",uid="",username=""} 1
cgroup_memory_cache_bytes{cgroup="/torque/1182958.batch.example.com"} 1.09678592e+08
cgroup_memory_fail_count{cgroup="/torque/1182958.batch.example.com"} 0
cgroup_memory_rss_bytes{cgroup="/torque/1182958.batch.example.com"} 8.2444320768e+10
cgroup_memory_total_bytes{cgroup="/torque/1182958.batch.example.com"} 1.96755132416e+11
cgroup_memory_used_bytes{cgroup="/torque/1182958.batch.example.com"} 5.3434466304e+10
cgroup_memsw_fail_count{cgroup="/torque/1182958.batch.example.com"} 0
cgroup_memsw_total_bytes{cgroup="/torque/1182958.batch.example.com"} 1.96755132416e+11
cgroup_memsw_used_bytes{cgroup="/torque/1182958.batch.example.com"} 5.3434466304e+10

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published