Add Topologies for 16-GPU gfx942 SuperNode #1417
base: develop
@BKitor can you elaborate on why "ranks" are preferred over "dev" as the GPU identifier?
The dev values aren't guaranteed to be consecutive from 0 -
Thanks! That's a good observation. Can you help apply the same dev2rank conversion to the other matching functions in rome_models.cc as well?
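To illustrate the dev2rank idea discussed above, here is a minimal sketch of mapping non-consecutive device IDs onto dense indices. The helper name `buildDev2Rank` and the choice to order ranks by ascending device ID are assumptions made for illustration; this is not the code in rome_models.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not RCCL's actual implementation): assign each visible
// device ID a dense index in [0, n), ordered by ascending device ID.
// Assumption: ordering by device ID is acceptable for the matching step.
std::unordered_map<int, int> buildDev2Rank(std::vector<int> devs) {
  // Sort and de-duplicate so that gaps in the numbering do not matter.
  std::sort(devs.begin(), devs.end());
  devs.erase(std::unique(devs.begin(), devs.end()), devs.end());

  std::unordered_map<int, int> dev2rank;
  for (int rank = 0; rank < static_cast<int>(devs.size()); ++rank) {
    dev2rank[devs[rank]] = rank;
  }
  return dev2rank;
}
```

With visible devices 7, 2 and 5, the sketch maps 2 to 0, 5 to 1 and 7 to 2, so code that indexes arrays by GPU no longer depends on the dev values starting at 0 or being contiguous.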
0cbf8fa to d7de8c1
@wenkaidu I've refactored the dev2rank mapping a bit and extended it to the other matching functions.
@BKitor The patch looks good. However, some model matches are failing, for example models 82 and 83. Can you take a look at the issue?
d7de8c1 to 3b69c50
- Add GigaIO topologies to tools/topo_expl for dev and testing
- Add GigaIO Columba 16 GPU romeModel and adjust topology matching algorithm in rome_models for 16 GPU system
- Fix bug which failed to match Rome Model when using subsets of system resources (i.e. ROCR_VISIBLE_DEVICES is set)
- Fixes for topo_expl
3b69c50 to 5a0766c
@BKitor I'm having trouble building topo_expl with your latest commit. I got an error:
Commit 3b69c50 was busted; it should be fixed with 5a0766c. I've been doing
Thanks! It works now.
Unit test fails on the Extended pipeline for the "rhel8 && 16gfx90a" platform.
@BKitor yes, I can confirm the issue can be reproduced with topo_expl -m 56
It was a one-line fix in Parse1H16P. The outputs match what 'develop' generates, and the fix shouldn't affect any of the other passing topologies.
Support for GigaIO's 16x MI300x SuperNode.
Adds a romeModel and updates the search algorithm to use hardware-efficient rings.
Some of the targeted topologies are provided as .xml files in topo_expl.
There are also adjustments to the a2a and 4p2h parsing methods, so that non-consecutively numbered subsets of the system can still find efficient topologies.
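As a rough illustration of why dense indices help with non-consecutive subsets, the sketch below rewrites a ring written in terms of indices 0..n-1 into the device IDs that are actually visible. The function name `ringRanksToDevs`, the space-separated ring format, and the `rank2dev` table are assumptions for this example only; they are not taken from the patch.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: translate a space-separated ring of dense indices
// into the device IDs of the visible subset. rank2dev[i] is the device ID
// assigned to index i. Returns an empty string if an index is out of range.
std::string ringRanksToDevs(const std::string& ring,
                            const std::vector<int>& rank2dev) {
  std::istringstream in(ring);
  std::ostringstream out;
  int rank = 0;
  bool first = true;
  while (in >> rank) {
    if (rank < 0 || rank >= static_cast<int>(rank2dev.size())) {
      return "";  // the ring refers to a GPU outside the visible subset
    }
    if (!first) out << ' ';
    out << rank2dev[rank];
    first = false;
  }
  return out.str();
}
```

For example, if only devices 2, 3, 6 and 7 are visible (say, through ROCR_VISIBLE_DEVICES), calling `ringRanksToDevs("0 1 3 2", {2, 3, 6, 7})` yields `"2 3 7 6"`: the ring order is defined once against dense indices and then applied to whichever devices are present.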