Add a collector to fetch energy values from /sys/cray/pm_counters on Cray systems #239

Open
mahendrapaipuri opened this issue Dec 9, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@mahendrapaipuri
Owner

mahendrapaipuri commented Dec 9, 2024

https://cray-hpe.github.io/docs-csm/en-10/operations/power_management/user_access_to_compute_node_power_data/

@mahendrapaipuri mahendrapaipuri added the enhancement New feature or request label Dec 9, 2024
@jhansonhpe

Do you have an HPE system with these counters available to you? I can help facilitate examples/testing/... if you need it.

@mahendrapaipuri
Owner Author

Cheers @jhansonhpe for the interest and the help offered. Yes, I managed to get some example files from the Adastra machine.

If you work for HPE, I have a few questions about these counters that you may be able to answer:

  • What is the scope of these energy counters? What exactly do they measure? For instance, if I look at the power counters on a node without any accelerators, which parts of the blade do they measure? On Cray machines, will there be any difference between the power reported by pm_counters and by the BMC?
  • Does HPE generally configure in-band IPMI access to the BMC on Cray machines?

Thanks a lot!

@jhansonhpe

jhansonhpe commented Dec 20, 2024

Yes, I work for HPE. I manage the team that writes monitoring software for the systems.
The energy counters use a special kernel module to reach the node controller (basically a BMC) and pull specific values from it. There is no difference between what is available from pm_counters and from the node controller. However, pm_counters refreshes at 10 Hz, while Redfish on the node controller sends at 1 Hz.

The advantage of pm_counters is easy node-level access to the values, so that tools like Slurm (there is an included plugin from SchedMD) can read them without privileged access to Redfish or to the system-level data collectors.

On a randomly chosen node (same node type as Frontier at ORNL), the counters are:
accel0_energy accel1_energy accel2_energy accel3_energy cpu0_temp energy memory_energy power_cap version accel0_power accel1_power accel2_power accel3_power cpu_energy freshness memory_power raw_scan_hz accel0_power_cap accel1_power_cap accel2_power_cap accel3_power_cap cpu_power generation power startup
There is a 1:1 correlation between counter and Redfish sensors (which I can dig up but will take a while) for power and energy.
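As a rough illustration of how a collector might read the counter files listed above, here is a minimal Python sketch. It assumes each file holds a single line of the form "<value> <unit> ..." (for example a number followed by J or W and a timestamp); that layout, the helper names `parse_counter` and `read_pm_counters`, and the suffix-based filtering are assumptions for illustration, not something confirmed in this thread.

```python
from pathlib import Path


def parse_counter(text):
    """Split one counter file's content into (value, unit).

    Assumes a "<value> <unit> ..." layout, e.g. "1234567 J 1700000000000000 us".
    The unit is None when the file holds a bare value.
    """
    fields = text.split()
    if not fields:
        raise ValueError("empty counter file")
    value = float(fields[0]) if "." in fields[0] else int(fields[0])
    unit = fields[1] if len(fields) > 1 else None
    return value, unit


def read_pm_counters(root="/sys/cray/pm_counters", suffixes=("energy", "power")):
    """Return {counter_name: (value, unit)} for files whose name ends with a suffix.

    With the default suffixes this picks up energy, cpu_energy, accel0_power, ...
    and skips files such as power_cap, version, or freshness.
    """
    counters = {}
    for path in sorted(Path(root).iterdir()):
        if path.is_file() and path.name.endswith(suffixes):
            try:
                counters[path.name] = parse_counter(path.read_text())
            except (OSError, ValueError):
                # Skip unreadable or unexpectedly formatted files.
                continue
    return counters
```

On a node that exposes the counters, `read_pm_counters()` would return one entry per energy and power file; any other directory with the same layout can be passed as `root` for testing.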

There is no IPMI in CrayEx node controllers, so getting all the sensor data through direct access becomes a challenge for CEEMS. If more metrics are desired, system admins could certainly allow CEEMS to query the monitoring databases or to consume from Kafka. There will be a slight difference in timestamps, as there is a small in-flight delay in getting metrics to Kafka.

@jhansonhpe

bardpeak.TGZ

This is a mockup, made with Redfish Mockup Creator, of a node controller for the blade type above.

@mahendrapaipuri
Owner Author

Cheers @jhansonhpe for the very detailed responses. Appreciate it.

The energy counters use a special kernel module to reach to the node controller (basically a BMC) and pull off specific values. There is no difference between what is available from pm_counters and the node controller. However the frequency in pm_counters refresh is 10hz while the Redfish on the node controller sends at 1Hz.

This is awesome. It would be amazing if the community could standardize a kernel module (like the OpenIPMI driver), similar to the one you are using on Cray nodes, to give node-level access to power/energy counters from generic BMCs. I assume the node controller that the kernel module talks to now is specific to Cray nodes.

There is no ipmi in CrayEx node controllers so for ceems to get all the sensor data this becomes a challenge for direct access.

Actually, I have added a Redfish collector to get power metrics from the Redfish API server. The small inconvenience is that the BMC network is seldom reachable from the compute node. So, we need to route the requests to Redfish through a reverse proxy deployed on a management node where the BMC network is reachable. As you rightly pointed out, this will introduce slight differences in timestamps due to network latencies.

We also have an HPE machine in our center, an SGI system, where in-band IPMI access is configured. So, I was just curious to know whether HPE manages Cray nodes in the same way.
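To make the proxied Redfish query a bit more concrete, here is a minimal Python sketch. It assumes the standard Redfish `Power` resource (`/redfish/v1/Chassis/<id>/Power` with a `PowerControl` array carrying `PowerConsumedWatts`); the proxy base URL, the chassis ID, and the sample payload are made up for illustration, and the resources actually exposed by a given node controller, or queried by the collector, may differ.

```python
import json


def power_url(proxy_base, chassis_id):
    """Build the URL of a chassis Power resource behind a (hypothetical) reverse proxy."""
    return f"{proxy_base.rstrip('/')}/redfish/v1/Chassis/{chassis_id}/Power"


def total_power_watts(payload):
    """Sum PowerConsumedWatts over all PowerControl entries.

    Entries with a missing or null reading are skipped.
    """
    total = 0.0
    for control in payload.get("PowerControl", []):
        watts = control.get("PowerConsumedWatts")
        if watts is not None:
            total += watts
    return total


# Made-up payload in the shape of a Redfish Power resource (not from a real system).
sample = json.loads("""
{
  "PowerControl": [
    {"Name": "Node Power Control", "PowerConsumedWatts": 412},
    {"Name": "Accelerator0 Power Control", "PowerConsumedWatts": null}
  ]
}
""")

print(power_url("https://mgmt.example.org/proxy/", "Node0"))
print(total_power_watts(sample))
```

In a real deployment the payload would come from an authenticated HTTPS request to the proxy rather than from an inline string; that part is left out here so the sketch runs without network access.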

Thanks for the Redfish mockup responses. Very helpful!!

@jhansonhpe

Most hardware is (slowly) moving away from IPMI (an insecure, legacy standard) to Redfish. That does come with this exact challenge. I would expect most systems not to permit access to Redfish via direct query (network isolation, as you point out, plus the need to provide access credentials and the possibility of denial of service if the queries are too fast or heavy).

On HPE systems with CSM or HPCM, sensor data can be made available in Kafka for consumption.
