Add a dashboard component that displays RMM memory #5740

Closed
wants to merge 15 commits into from


@shwina shwina commented Jan 31, 2022

  • Closes #xxxx
  • Tests added / passed
  • Passes pre-commit run --all-files

This PR adds a visualization that provides more information than the current GPU memory utilization plot. It also shows the amount of GPU memory managed by RMM and, when an RMM pool is in use, the amount of pool memory actually used:

[Screenshot: RMM memory dashboard component, 2022-01-31]

The purple region of the bar represents the total amount of memory managed by RMM. Dark purple shows the amount of memory actively being utilized (this is relevant when a memory pool is used). The green region represents GPU memory used outside of RMM.
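
As a rough illustration of the three regions described above, the sketch below splits a GPU memory reading into the dark purple, purple, and green segments. The function and field names are hypothetical and are not taken from this PR; it also assumes that the reported GPU usage includes the memory managed by RMM.

```python
from typing import NamedTuple


class MemoryBreakdown(NamedTuple):
    """Sizes (in bytes) of the three bar segments. Hypothetical names."""

    rmm_used: int   # dark purple: RMM memory actively in use
    rmm_free: int   # purple: managed by RMM but currently unused (pool slack)
    non_rmm: int    # green: GPU memory used outside of RMM


def split_gpu_memory(gpu_used: int, rmm_total: int, rmm_used: int) -> MemoryBreakdown:
    """Split total GPU memory usage into the three segments.

    Assumes gpu_used already includes the memory managed by RMM.
    """
    if not 0 <= rmm_used <= rmm_total <= gpu_used:
        raise ValueError("expected 0 <= rmm_used <= rmm_total <= gpu_used")
    return MemoryBreakdown(
        rmm_used=rmm_used,
        rmm_free=rmm_total - rmm_used,
        non_rmm=gpu_used - rmm_total,
    )


# Example: 10 GiB in use on the GPU, a 6 GiB RMM pool, 4 GiB of it allocated.
GiB = 2**30
breakdown = split_gpu_memory(10 * GiB, 6 * GiB, 4 * GiB)
```

Without a pool, `rmm_total` and `rmm_used` would presumably be equal, so the middle segment collapses to zero.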

@quasiben
Member

quasiben commented Feb 1, 2022

Thanks @shwina, this will be very useful when trying to understand memory consumption with RAPIDS workloads.

cc @charlesbluca & @rjzamora

@VibhuJawa

VibhuJawa commented Feb 1, 2022

CC: @ayushdg, this will be useful when deciding on the choice of 32 GB vs 40 GB vs 80 GB for gpu-bdb workflows.

@github-actions
Contributor

github-actions bot commented Feb 11, 2022

Unit Test Results

12 files (±0), 12 suites (±0), duration 5h 57m 7s (+2m 32s)
2 654 tests (+1): 2 568 passed (-1), 81 skipped (+1), 5 failed (+1)
13 031 runs (+10): 12 375 passed (-3), 650 skipped (+12), 6 failed (+1)

For more details on these failures, see this check.

Results for commit 34a3cc7. ± Comparison against base commit 1b4d13f.

♻️ This comment has been updated with latest results.

rapids-bot bot pushed a commit to rapidsai/dask-cuda that referenced this pull request Feb 25, 2022
Adds the `rmm_track_allocations` option that enables workers to query the amount of RMM memory allocated at any time via `mr.get_allocated_bytes()`.

This is used in dask/distributed#5740.

Authors:
  - Ashwin Srinath (https://github.com/shwina)
  - Charles Blackmon-Luca (https://github.com/charlesbluca)
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)

URL: #842
@shwina shwina marked this pull request as ready for review March 7, 2022 15:00
Member

@jacobtomlinson jacobtomlinson left a comment

This looks great!

I noticed that RMM doesn't get capitalised correctly in the UI. We special case CPU and GPU so we should update that list.
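
A minimal sketch of the kind of special-casing meant here, assuming a helper that turns an internal name into a display title. The `ACRONYMS` set and `display_title` function are hypothetical and do not reflect the actual dashboard code.

```python
# Hypothetical set of names that should be rendered fully upper-case.
ACRONYMS = {"cpu", "gpu", "rmm"}


def display_title(name: str) -> str:
    """Turn a name such as 'rmm_memory' into a title such as 'RMM Memory'."""
    words = name.replace("_", " ").replace("-", " ").split()
    return " ".join(
        word.upper() if word.lower() in ACRONYMS else word.capitalize()
        for word in words
    )


title = display_title("rmm_memory")
```

Keeping the acronyms in one set means a new one can be added in a single place.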

[Screenshot: dashboard UI showing the RMM label capitalised incorrectly]

I'm also not super confident on the colours. Purple for RMM makes sense, but it vibrates a bit with the green. I wonder if we could either reduce the opacity of the green or switch to a different purple. In the worker memory plot we use three opacities of the same blue. It would be good to try and stay consistent.
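
One way to derive several opacities from a single base colour is sketched below. The helper name, the base colour, and the alpha values are all hypothetical and are not the ones used in the worker memory plot.

```python
def with_opacity(hex_color: str, alpha: float) -> str:
    """Convert '#RRGGBB' plus an alpha in [0, 1] to a CSS rgba() string."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a colour of the form '#RRGGBB'")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return f"rgba({red}, {green}, {blue}, {alpha})"


# Hypothetical base colour and alpha steps, from most to least opaque.
BASE_COLOR = "#008000"
shades = [with_opacity(BASE_COLOR, alpha) for alpha in (1.0, 0.6, 0.3)]
```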

@jacobtomlinson
Member

Just wanted to bump this @shwina.

What do you think about my comment on the colours?

I'm also not super confident on the colours. Purple for RMM makes sense, but it vibrates a bit with the green. I wonder if we could either reduce the opacity of the green or switch to a different purple. In the worker memory plot we use three opacities of the same blue. It would be good to try and stay consistent.

@shwina
Author

shwina commented May 11, 2022

Hi @jacobtomlinson -- thanks for the ping and sorry to let this PR languish. I'll bring it back to speed and go with "varying opacities of green", if that sounds alright to you.

rapids-bot bot pushed a commit to rapidsai/dask-cuda that referenced this pull request Mar 29, 2023
Tracking RMM allocations will be useful together with dask/distributed#5740, and will help with the analysis of memory fragmentation when comparing the regular pool and the async memory allocator.

Authors:
  - Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
  - Benjamin Zaitlen (https://github.com/quasiben)

URL: #1145
@jacobtomlinson
Member

Closing as superseded by #7718. Thanks for getting this started @shwina.
