Add support for MIG and vGPUs in exporter #193
Merged
The exporter estimates a coefficient from the relative number of SMs in each MIG profile; this coefficient can be used together with dcgm-exporter metrics to estimate the power consumption of a MIG instance.
Similarly, for vGPUs, the exporter tracks the number of active instances scheduled on either a physical GPU or a MIG instance and estimates a coefficient that can be used to estimate the power consumption of each vGPU.
Support defining the GPU ordering for the SLURM collector, as the ordering can be undefined when a mix of MIG and full GPUs is used on a compute node.
Split all GPU-related functions into a separate file and add more unit tests.
Modify mocked resources to cover different scenarios in unit and e2e tests.
Update the docs and add a new section on GPU power estimation when MIG and vGPUs are used on compute nodes.
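To make the coefficient idea in the description concrete, here is a minimal sketch in Python. It is not the exporter's actual code: the names `MigInstance`, `mig_coefficients`, `vgpu_coefficient` and `estimate_power` are hypothetical, the SM counts are illustrative, and two assumptions are made that the description does not confirm: a MIG instance's share is its SM count divided by the total SMs across all MIG instances on the GPU, and vGPUs on the same parent split the parent's share equally.

```python
from dataclasses import dataclass


@dataclass
class MigInstance:
    """Hypothetical record for a MIG instance on one physical GPU."""
    name: str
    sm_count: int  # number of SMs in the instance's MIG profile


def mig_coefficients(instances):
    """Return each instance's share of the GPU, weighted by SM count.

    Assumption: the share is relative to the total SMs across all
    MIG instances configured on the same physical GPU.
    """
    total = sum(i.sm_count for i in instances)
    if total == 0:
        return {i.name: 0.0 for i in instances}
    return {i.name: i.sm_count / total for i in instances}


def vgpu_coefficient(parent_coefficient, active_vgpus):
    """Return the share of one vGPU scheduled on a parent device.

    Assumption: active vGPUs split the parent's share equally. The
    parent is a physical GPU (coefficient 1.0) or a MIG instance.
    """
    if active_vgpus <= 0:
        return 0.0
    return parent_coefficient / active_vgpus


def estimate_power(gpu_power_watts, coefficient):
    """Scale the physical GPU's power reading by a coefficient."""
    return gpu_power_watts * coefficient


if __name__ == "__main__":
    # Illustrative SM counts only; real values depend on the GPU model.
    instances = [
        MigInstance("3g", 42),
        MigInstance("2g", 28),
        MigInstance("1g-a", 14),
        MigInstance("1g-b", 14),
    ]
    coeffs = mig_coefficients(instances)
    gpu_power = 200.0  # watts, e.g. a reading taken from dcgm-exporter
    for name, coeff in coeffs.items():
        print(name, round(estimate_power(gpu_power, coeff), 2))
```

In this sketch the power reading would come from dcgm-exporter for the physical GPU, and the multiplication would typically happen at query time rather than inside the exporter.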
Closes #187