[k8s] Fix GPU labelling nomenclature #3274

Merged (1 commit) on Mar 5, 2024

romilbhardwaj
Collaborator

Currently, an RTX A6000 gets labelled as A6000, which is a different card. This causes further confusion when the user runs sky launch --gpus a6000, as prompted by the sky local up output:

...
Local Kubernetes cluster created successfully with 8 CPUs and 1 a6000 GPUs. `sky launch` can now run tasks locally.

$ sky launch --gpus a6000
ValueError: Accelerator name 'a6000' is ambiguous. Please choose one of ['A6000', 'RTXA6000'].

With this PR, the naming for RTX-series cards is aligned with the catalogs of runpod, fluidstack and other clouds.

  • RTX A6000 -> RTXA6000
  • NVIDIA GeForce RTX 4090 -> RTX4090
    etc.
  • Manual tests: ran sky local up (which also invokes labelling) on A6000 and A100 machines from fluidstack.
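
To illustrate why a6000 is rejected while the new name resolves cleanly, here is a minimal sketch of a case-insensitive substring lookup. This is a hypothetical illustration, not SkyPilot's actual resolution code: the function name resolve_accelerator and the CATALOG_NAMES list are assumptions, and only A6000 and RTXA6000 are taken from the error message above.

```python
# Hypothetical catalog entries; only 'A6000' and 'RTXA6000' come from the
# error message quoted in this PR description.
CATALOG_NAMES = ['A100', 'A6000', 'RTXA6000', 'RTX4090']


def resolve_accelerator(query, names=CATALOG_NAMES):
    """Resolve a user-supplied name via case-insensitive substring match."""
    q = query.lower()
    matches = [name for name in names if q in name.lower()]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f'Accelerator name {query!r} not found.')
    raise ValueError(
        f'Accelerator name {query!r} is ambiguous. '
        f'Please choose one of {matches}.')


# 'rtxa6000' is a substring of exactly one catalog name.
print(resolve_accelerator('rtxa6000'))

# 'a6000' is a substring of both 'A6000' and 'RTXA6000', so it is rejected.
try:
    resolve_accelerator('a6000')
except ValueError as err:
    print(err)
```

Under this assumed matching rule, a label of a6000 on an RTX A6000 node both names the wrong card and leads the user to a query that cannot be resolved, whereas rtxa6000 maps to a single catalog entry.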

Collaborator

Michaelvll left a comment

Thanks for fixing this and aligning the GPU names @romilbhardwaj! LGTM with a small nit

# 2. remove 'GeForce ' if present (e.g., 'NVIDIA GeForce RTX 3070' -> 'RTX 3070')
# 3. replace 'RTX ' with 'RTX' (without spaces) (e.g., 'RTX 6000' -> 'RTX6000')
# 4. replace any other spaces with dashes (e.g. 'RTX 2080 Ti' -> 'RTX2080-Ti')
gpu_name = gpu_name.lower().replace('nvidia ', '').replace('geforce ', '').replace('rtx ', 'rtx').replace(' ', '-')
Collaborator

Do we want to make the GPU name in the label uppercase to keep it aligned with the other parts?

Collaborator Author

Generally, Kubernetes labels and values are lowercase, so this keeps in line with convention. Also, since label matching is case-sensitive, changing it now would require deeper backward-compatibility changes.
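
For readers following the thread, the sketch below wraps the chained replace calls from the reviewed line in a function and shows why case matters for matching. The function names normalize_gpu_name and selector_matches, and the label key skypilot.co/accelerator, are assumptions made for illustration; the selector check is a simplified stand-in for Kubernetes equality-based label matching, not the real scheduler logic. Note that because of the .lower() call, the resulting values are lowercase even though the examples in the code comments are written in mixed case.

```python
def normalize_gpu_name(gpu_name: str) -> str:
    """Turn a GPU product name into a lowercase label value."""
    return (gpu_name.lower()
            .replace('nvidia ', '')    # drop the vendor prefix
            .replace('geforce ', '')   # drop the GeForce brand
            .replace('rtx ', 'rtx')    # join 'rtx' to the model number
            .replace(' ', '-'))        # remaining spaces become dashes


def selector_matches(selector: dict, node_labels: dict) -> bool:
    """Simplified equality-based match: values must be identical strings."""
    return all(node_labels.get(key) == value
               for key, value in selector.items())


LABEL_KEY = 'skypilot.co/accelerator'  # assumed label key, for illustration

node_labels = {LABEL_KEY: normalize_gpu_name('NVIDIA RTX A6000')}
print(node_labels)

# The lowercase value matches; the uppercase one does not, because the
# comparison is an exact, case-sensitive string equality.
print(selector_matches({LABEL_KEY: 'rtxa6000'}, node_labels))  # True
print(selector_matches({LABEL_KEY: 'RTXA6000'}, node_labels))  # False
```

Nodes already labelled with lowercase values would stop matching if the expected value switched to uppercase, which is the backward-compatibility concern raised above.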

romilbhardwaj merged commit 9460735 into master on Mar 5, 2024
19 checks passed
romilbhardwaj deleted the k8s_gpu_label_naming branch on March 5, 2024 20:21