Show DEVICE_MEMORY in show-gpus for AWS & Lambda. #1825
LGTM. Left minor comments.
device_memory_str = (f'{item.device_memory:.0f}GB' if
                     not pd.isna(item.device_memory) else '-')
What is the definition of device memory? IIRC, it is the amount of memory in a single device and does not depend on the device count, right? Then what about TPUs?
Added some definition to `-h`. Wdyt? For TPU, we can defer adding documentation for it since we don't have that info in the catalog.
Thanks @WoosukKwon, PTAL.
LGTM! Thanks.
* ``DEVICE_MEM``: Memory of a single device; does not depend on the device
  count of the instance (VM).

* ``HOST_MEM``: Memory of the host instance (VM).
nit: Do we need `` here?
Yea, this doesn't look good on `-h` but is what's needed for rst/sphinx docs.
Partially address #1764: only AWS & Lambda catalogs have the relevant info for now. Azure / GCP's offerings will show a `-` under the `DEVICE_MEM` column.

Tested (run the relevant ones):
sky show-gpus -a
pytest tests/test_smoke.py --aws
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_comaptibility_tests.sh
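
For readers who want to try the `-` placeholder behavior described in this PR outside the codebase, here is a minimal standalone sketch. It is not the project's actual implementation: `CatalogRow`, `format_device_memory`, and the sample rows are hypothetical names and made-up values, and `math.isnan` is used as a stdlib stand-in for the `pd.isna` check in the reviewed snippet, assuming the value is always a float.

```python
import math
from typing import NamedTuple


class CatalogRow(NamedTuple):
    """Hypothetical stand-in for one catalog entry (not the project's type)."""
    accelerator_name: str
    device_memory: float  # GB for a single device; NaN when the catalog lacks it


def format_device_memory(item: CatalogRow) -> str:
    """Render per-device memory, or '-' when the value is missing."""
    # math.isnan is an assumed stdlib substitute for pd.isna in the PR snippet.
    if math.isnan(item.device_memory):
        return '-'
    return f'{item.device_memory:.0f}GB'


# Made-up rows for illustration only; not real catalog data.
rows = [
    CatalogRow('GPU_X', 16.0),
    CatalogRow('GPU_Y', float('nan')),
]
for row in rows:
    print(f'{row.accelerator_name:<8}{format_device_memory(row)}')
```

Rows whose catalog lacks the per-device memory (Azure / GCP for now, per the description above) would take the `'-'` branch.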