fixing module name
lewinkedrs committed Jan 19, 2024
1 parent 7daed72 commit 368745c
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/eks/gpu-monitoring.md
@@ -1,7 +1,7 @@
# Monitoring NVIDIA GPU Workloads

-GPUs play an integral part in data intensive workloads. The base infrastructure module of the Observability Accelerator provides the ability to deploy the NVIDIA DCGM Exporter Dashboard.
-The dashboard utilizes metrics scraped from the `/metrics` endpoint that are exposed when running the nvidia gpu operator and NVSMI binary.
+GPUs play an integral part in data intensive workloads. The eks-monitoring module of the Observability Accelerator provides the ability to deploy the NVIDIA DCGM Exporter Dashboard.
+The dashboard utilizes metrics scraped from the `/metrics` endpoint that are exposed when running the nvidia gpu operator with the [DCGM exporter](https://developer.nvidia.com/blog/monitoring-gpus-in-kubernetes-with-dcgm/) and NVSMI binary.

!!!note
In order to make use of this dashboard, you will need to have a GPU backed EKS cluster and deploy the [GPU operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/amazon-eks.html)
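
For readers unfamiliar with what "metrics scraped from the `/metrics` endpoint" looks like in practice, here is a minimal sketch, not part of this commit, that parses Prometheus-style text such as a DCGM exporter might return. The sample payload, hostname, and helper names (`parse_metrics`, `gpu_utilization`) are assumptions made for illustration. `DCGM_FI_DEV_GPU_UTIL` and `DCGM_FI_DEV_FB_USED` are DCGM field names, but the exact labels and values a given cluster exposes will differ, and the parser ignores optional timestamps and escaped label values.

```python
import re

# Hypothetical sample of the text a DCGM exporter /metrics endpoint might return.
SAMPLE = """\
# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",Hostname="node-a"} 37
DCGM_FI_DEV_GPU_UTIL{gpu="1",Hostname="node-a"} 82
DCGM_FI_DEV_FB_USED{gpu="0",Hostname="node-a"} 1024
"""

# Simplified line grammar: metric name, optional {labels}, then a numeric value.
LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if match is None:
            continue  # skip anything this simplified grammar cannot read
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def gpu_utilization(samples):
    """Map each GPU index label to its reported utilization percentage."""
    return {
        labels.get("gpu"): value
        for name, labels, value in samples
        if name == "DCGM_FI_DEV_GPU_UTIL"
    }


if __name__ == "__main__":
    print(gpu_utilization(parse_metrics(SAMPLE)))
```

In a real deployment the scraping and parsing are handled by the collector configured by the module, so a parser like this is only useful for spot-checking an endpoint by hand.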