Nvidia monitor fixes #239

Merged
merged 6 commits into from
May 3, 2024

graeme-a-stewart
Member

As noticed by @elmsheus, the output fields of nvidia-smi pmon have changed, so the parsing in prmon needs to be updated.

This PR builds on #238 and fixes the reference files used by prmon for "stand alone" testing.

To help with testing the GPU parsing code, a small burner script using pycuda, gpu-burner.py, is added.

Closes #238
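
To illustrate the kind of parsing change described above, here is a minimal Python sketch of a header-driven parser for pmon-style output. It is not prmon's actual parser (prmon is written in C++). The sample text, the column names in it, and the function name parse_pmon are assumptions made for illustration rather than output captured from a real nvidia-smi run. The idea is to read column names from the header line instead of hard-coding positions, and to treat a "-" field as 0.

```python
from typing import Dict, List, Union

# Illustrative sample only: the column layout is an assumption about what
# newer pmon output looks like, not text captured from a real run.
SAMPLE = """\
# gpu         pid   type     sm    mem    enc    dec    jpg    ofa    command
# Idx           #    C/G      %      %      %      %      %      %    name
    0       12345     C      45     12      -      -      -      -    python
    0       12346     G       -      -      -      -      -      -    Xorg
"""

Row = Dict[str, Union[int, str]]

# Columns kept as text; every other column is treated as an integer.
TEXT_COLUMNS = {"type", "command"}


def parse_pmon(text: str) -> List[Row]:
    """Parse pmon-style text into one dict per process line."""
    columns: List[str] = []
    rows: List[Row] = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            names = stripped.lstrip("#").split()
            # The first header line names the columns; the units line
            # ("# Idx ...") is skipped.
            if not columns and names and names[0] == "gpu":
                columns = names
            continue
        if not columns:
            continue  # no header seen yet, so the fields cannot be named
        # The command is the last column and may contain spaces.
        fields = stripped.split(None, len(columns) - 1)
        if len(fields) != len(columns):
            continue  # unexpected layout: skip rather than guess
        row: Row = {}
        for name, value in zip(columns, fields):
            if name in TEXT_COLUMNS:
                row[name] = value
            else:
                # A numeric field can be "-" instead of 0.
                row[name] = 0 if value == "-" else int(value)
        rows.append(row)
    return rows
```

Calling parse_pmon(SAMPLE) yields two rows. Because the column names come from the header, the same function also handles text whose header lacks the jpg and ofa columns.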

Johannes Elmsheuser and others added 5 commits April 29, 2024 11:22
Update to the new nvidia-smi pmon output fields
Add a pycuda GPU burner script for tests
This can be a "-" instead of 0
Ensure precooked values are fixed to what we want
@graeme-a-stewart graeme-a-stewart requested a review from amete May 3, 2024 10:59
With latest versions of black and flake8

There is one import in gpu-burner.py that is needed (pycuda.autoinit)
as it has side effects, so this is marked as exempt for flake8
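
As a rough illustration of the flake8 exemption described in this commit message, the sketch below shows a side-effect import marked with a noqa comment. This is not the gpu-burner.py from the PR. The function name burn, its parameters, and the workload are hypothetical, and it assumes numpy and pycuda are installed and a CUDA device is present when burn is called.

```python
import time


def burn(seconds: float = 10.0, n: int = 1 << 22) -> int:
    """Keep the GPU busy for roughly `seconds`; return the number of passes."""
    # Imports are deferred so this module still loads on a machine without
    # pycuda or a GPU (a choice made for this sketch only).
    import numpy as np

    # pycuda.autoinit is never referenced by name, but importing it creates
    # the CUDA context, so flake8's "imported but unused" warning (F401)
    # is suppressed for this line.
    import pycuda.autoinit  # noqa: F401
    import pycuda.driver as drv
    import pycuda.gpuarray as gpuarray

    a = gpuarray.to_gpu(np.random.rand(n).astype(np.float32))
    b = gpuarray.to_gpu(np.random.rand(n).astype(np.float32))
    passes = 0
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        a = (a * b + a) * 0.5  # elementwise arithmetic executed on the device
        drv.Context.synchronize()  # wait for the GPU to finish this pass
        passes += 1
    return passes
```

Running burn(30) in one terminal while watching nvidia-smi pmon in another should show the process with non-zero utilisation, which is the kind of input the GPU parsing code needs for testing.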
@graeme-a-stewart graeme-a-stewart merged commit 0696123 into main May 3, 2024
7 checks passed
@graeme-a-stewart graeme-a-stewart deleted the nvidia-mon-fixes branch May 3, 2024 13:12
@krasznaa

krasznaa commented May 6, 2024

Hi Graeme, Serhan,

For the longer term, I wonder if we could learn from what Syllo/nvtop is doing. 🤔 I have to admit that I also only learned about that project in the last few days. But it is super nice.

image

It seems to extract information from NVIDIA (and AMD, Intel) GPUs in C(++). And it's a lot more information than what nvidia-smi pmon is giving out. (Including, I think very interestingly, the GPU's estimate for how much power it is using at any given moment.)

This may be a very good project for a GSOC, or some other type of student to investigate. 😉

Cheers,
Attila
