Export Prometheus Metrics to expose resource usage to users #3552

Open
HyeockJinKim opened this issue Jan 27, 2025 — with Lablup-Issue-Syncer · 0 comments
HyeockJinKim commented Jan 27, 2025

Motivation  

  • Currently, users cannot monitor their GPU/NPU utilization in the BAI console. Exporting resource usage as Prometheus metrics would let external tools such as Grafana display utilization data, giving users visibility into the resources they consume.

Required Features

  • Export Prometheus Metrics:
    • Enable resource usage metrics (e.g., GPU/NPU utilization) to be exported via Prometheus for external monitoring.
      • GPU/NPU real-time usage
      • GPU/NPU cumulative usage
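
A minimal sketch of how the two requested metrics could map onto Prometheus metric types: a gauge for real-time usage and a counter for cumulative usage, rendered in the Prometheus text exposition format. It uses only the Python standard library. The metric names, the `device` label, the `DeviceSample` shape, and the choice of "utilization-weighted seconds" as the cumulative unit are all assumptions for illustration; the issue does not specify them.

```python
from dataclasses import dataclass


@dataclass
class DeviceSample:
    """One utilization reading for an accelerator (hypothetical shape)."""
    device_id: str       # e.g. "gpu0"; the label scheme is an assumption
    utilization: float   # instantaneous utilization ratio, 0.0 to 1.0
    interval_sec: float  # seconds elapsed since the previous sample


class UsageTracker:
    """Tracks the latest value (gauge) and the running total (counter) per device."""

    def __init__(self) -> None:
        self.current: dict[str, float] = {}
        self.cumulative: dict[str, float] = {}

    def observe(self, sample: DeviceSample) -> None:
        # Real-time usage: overwrite with the most recent reading.
        self.current[sample.device_id] = sample.utilization
        # Cumulative usage: add utilization-weighted seconds (assumed unit).
        used = sample.utilization * sample.interval_sec
        self.cumulative[sample.device_id] = (
            self.cumulative.get(sample.device_id, 0.0) + used
        )

    def render(self) -> str:
        """Render both metrics in the Prometheus text exposition format.

        Label values are not escaped here; device ids are assumed to be
        plain identifiers without quotes, backslashes, or newlines.
        """
        lines = [
            "# HELP accelerator_utilization_ratio Current accelerator utilization (0-1).",
            "# TYPE accelerator_utilization_ratio gauge",
        ]
        for dev, value in sorted(self.current.items()):
            lines.append(f'accelerator_utilization_ratio{{device="{dev}"}} {value}')
        lines += [
            "# HELP accelerator_busy_seconds_total Utilization-weighted seconds since start.",
            "# TYPE accelerator_busy_seconds_total counter",
        ]
        for dev, value in sorted(self.cumulative.items()):
            lines.append(f'accelerator_busy_seconds_total{{device="{dev}"}} {value}')
        return "\n".join(lines) + "\n"
```

With a counter for cumulative usage, a dashboard can derive usage over any time window from the rate of increase, while the gauge answers "how busy is the device right now".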

Impact  

  • Prometheus Integration
    • Metrics export functionality needs to be implemented to expose GPU/NPU utilization data.
  • External Monitoring Tools
    • Enables tools like Grafana to visualize and monitor the metrics.
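
As a rough illustration of the export side, the sketch below serves metrics text on a `/metrics` path so that a Prometheus server could scrape it and Grafana could then query Prometheus. It uses only the Python standard library. The function names, the default host and port, and the `render` callback are hypothetical; the actual implementation may well use an existing client library or web framework instead.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Callable


def make_handler(render: Callable[[], str]) -> type[BaseHTTPRequestHandler]:
    """Build a request handler that serves render() output on /metrics."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = render().encode("utf-8")
            self.send_response(200)
            # Content type of the Prometheus text exposition format.
            self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format: str, *args: object) -> None:
            pass  # keep the sketch quiet; a real service would log requests

    return MetricsHandler


def start_metrics_server(
    render: Callable[[], str], host: str = "127.0.0.1", port: int = 0
) -> ThreadingHTTPServer:
    """Start the endpoint in a daemon thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), make_handler(render))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A caller would pass a function that returns the current metrics text, read the bound port from `server.server_address[1]`, and call `server.shutdown()` when done.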

Testing Scenarios  

  • Integration with Grafana:
    • Test that the exported Prometheus metrics can be visualized in Grafana.
@HyeockJinKim HyeockJinKim self-assigned this Jan 27, 2025