added config read from pve #22

Merged
merged 6 commits into from
Apr 20, 2020
Changes from 3 commits
46 changes: 46 additions & 0 deletions src/pve_exporter/collector.py
@@ -244,6 +244,51 @@ def collect(self): # pylint: disable=missing-docstring

        return itertools.chain(metrics.values(), info_metrics.values())

class ClusterNodeConfigCollector(object):
    """
    Collects Proxmox VE VM information directly from config, i.e. boot, name, onboot, etc.
    For manual test: "pvesh get /nodes/<node>/<type>/<vmid>/config"

    # HELP pve_vm_config_onboot Proxmox vm config onboot value
    # TYPE pve_vm_config_onboot gauge
    pve_vm_config_onboot{id="qemu/113",node="XXXX",type="qemu"} 1.0
    """

    def __init__(self, pve):
        self._pve = pve

    def collect(self): # pylint: disable=missing-docstring
        metrics = {
            'onboot': GaugeMetricFamily(
                'pve_vm_config_onboot',
                'Proxmox vm config onboot value',
                labels=['id', 'node', 'type']),
@znerol (Member), Apr 18, 2020:

The Prometheus metric naming scheme is something like <prefix>_<name>_<unit>(_<aggregation>), e.g. node_cpu_seconds_total. So I guess my preferred name would be pve_onboot_status. The vm prefix is not necessary in my opinion, because that information is already given by the labels. Looking at other exporters for naming examples, I find that node exporter has node_timex_sync_status. Ceph has the metrics ceph_health_status, ceph_mgr_status and ceph_mon_quorum_status. So it looks like status might be a good unit for boolean flags.

Contributor Author:

I didn't know about this. I'll make the change as you suggest.

Member:

I only learned about that when they changed a whole bunch of metric names in node exporter. There is also a post by Brian Brazil and some docs on the Prometheus site about naming.
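To illustrate the naming suggestion in this thread, here is a minimal sketch of what the exposition output would look like under the proposed name pve_onboot_status. The render_gauge helper is hypothetical and written with the standard library only, so it runs without prometheus_client; it is not part of the exporter, and it skips label-value escaping for brevity.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge family in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs. Hypothetical helper:
    label values are not escaped, which a real exporter would have to do.
    """
    lines = [
        '# HELP %s %s' % (name, help_text),
        '# TYPE %s gauge' % name,
    ]
    for labels, value in samples:
        label_str = ','.join('%s="%s"' % (k, v) for k, v in labels.items())
        lines.append('%s{%s} %s' % (name, label_str, float(value)))
    return '\n'.join(lines) + '\n'


# <prefix>_<name>_<unit>: prefix "pve", name "onboot", unit "status".
# The guest type is carried by the "type" label, so no "vm" in the name.
text = render_gauge(
    'pve_onboot_status',
    'Proxmox vm config onboot value',
    [({'id': 'qemu/113', 'node': 'XXXX', 'type': 'qemu'}, 1)],
)
print(text)
```

The sample line keeps the same labels as the docstring example in the diff; only the metric name changes.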

            'memory': GaugeMetricFamily(
                'pve_vm_config_memory',
                'Proxmox vm config memory value',
                labels=['id', 'node', 'type']),
Member:

There is pve_memory_size_bytes already, so this metric seems to be a duplicate. Do you have a case where the new metric is better than the existing pve_memory_size_bytes?

Contributor Author:

I'll remove it, since it seems to be a duplicate. I wanted to test it with ballooning memory management, in case the cluster/resources API does not return the current ballooning value, but I don't have time for that at the moment. If I get around to testing that, I may send another PR for it in the future.

        }

        for node in self._pve.nodes.get():
            # Qemu
            vmtype = 'qemu'
            for vmdata in self._pve.nodes(node['node']).qemu.get():
                config = self._pve.nodes(node['node']).qemu(vmdata['vmid']).config.get().items()
                for key, metric_value in config:
                    label_values = ["%s/%s" % (vmtype, vmdata['vmid']), node['node'], vmtype]
                    if key in metrics:
                        metrics[key].add_metric(label_values, metric_value)
            # LXC
            vmtype = 'lxc'
            for vmdata in self._pve.nodes(node['node']).lxc.get():
                config = self._pve.nodes(node['node']).lxc(vmdata['vmid']).config.get().items()
                for key, metric_value in config:
                    label_values = ["%s/%s" % (vmtype, vmdata['vmid']), node['node'], vmtype]
                    if key in metrics:
                        metrics[key].add_metric(label_values, metric_value)

        return metrics.values()

def collect_pve(config, host):
    """Scrape a host and return prometheus text format for it"""

@@ -254,5 +299,6 @@ def collect_pve(config, host):
    registry.register(ClusterResourcesCollector(pve))
    registry.register(ClusterNodeCollector(pve))
    registry.register(ClusterInfoCollector(pve))
    registry.register(ClusterNodeConfigCollector(pve))
    registry.register(VersionCollector(pve))
    return generate_latest(registry)
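
The Qemu and LXC loops in ClusterNodeConfigCollector.collect differ only in the guest type, so they could be folded into one loop. The following is a rough sketch of that idea, not the code that was merged. FakeResource, GUEST_TYPES and iter_config_samples are hypothetical names; FakeResource only imitates the attribute-and-call path style of the pve client used in the diff, so that the sketch runs without a Proxmox host.

```python
class FakeResource:
    """Hypothetical stand-in for the pve API client; serves canned data by path."""

    def __init__(self, data, path=()):
        self._data = data
        self._path = path

    def __getattr__(self, name):
        # Attribute access appends a path segment, e.g. `.qemu` or `.config`.
        return FakeResource(self._data, self._path + (name,))

    def __call__(self, segment):
        # Calling appends an id segment, e.g. `nodes('pve1')` or `qemu(113)`.
        return FakeResource(self._data, self._path + (str(segment),))

    def get(self):
        return self._data['/'.join(self._path)]


GUEST_TYPES = ('qemu', 'lxc')


def iter_config_samples(pve, wanted_keys):
    """Yield (key, label_values, value) for each wanted config key of every guest."""
    for node in pve.nodes.get():
        node_name = node['node']
        for vmtype in GUEST_TYPES:
            guests = getattr(pve.nodes(node_name), vmtype)
            for vmdata in guests.get():
                config = guests(vmdata['vmid']).config.get()
                # Labels depend only on the guest, so build them once per guest.
                label_values = ['%s/%s' % (vmtype, vmdata['vmid']), node_name, vmtype]
                for key in wanted_keys:
                    if key in config:
                        yield key, label_values, config[key]


fake_pve = FakeResource({
    'nodes': [{'node': 'pve1'}],
    'nodes/pve1/qemu': [{'vmid': 113}],
    'nodes/pve1/qemu/113/config': {'onboot': 1, 'name': 'web'},
    'nodes/pve1/lxc': [{'vmid': 200}],
    'nodes/pve1/lxc/200/config': {'name': 'db'},
})
samples = list(iter_config_samples(fake_pve, ['onboot']))
print(samples)
```

In a collector, each yielded tuple would map onto the metrics[key].add_metric(label_values, metric_value) call from the diff. Guests whose config lacks the key, like the container in the canned data, simply produce no sample.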