[sky/feat/show-gpus] adding region based filtering for show-gpus command. #1187

@vivekkhimani vivekkhimani commented Oct 1, 2022

Acceptance Criteria

This PR aims to add the features proposed in #1170. Based on the feedback received from the discussion in the issue, the following changes are introduced:

  • Add support for a --region flag on the show-gpus command.
    • The flag is only valid if the cloud flag is provided.
    • Because cloud providers use different naming conventions for regions, the --region argument is rejected as invalid if it can't be found in the target provider.
  • Add/update the relevant test cases.
  • Update the show-gpus documentation to include the --region flag and the instructions to use it.
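
The flag rules above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function name validate_region_flag and the regions_by_cloud mapping are hypothetical stand-ins for whatever the CLI and service catalog actually use.

```python
from typing import Dict, List, Optional


def validate_region_flag(cloud: Optional[str], region: Optional[str],
                         regions_by_cloud: Dict[str, List[str]]) -> None:
    """Raise ValueError if --region is used without --cloud or is unknown.

    `regions_by_cloud` is a hypothetical mapping from a lowercase cloud name
    to the region names found in that cloud's catalog.
    """
    if region is None:
        return  # --region was not passed, nothing to validate
    if cloud is None:
        raise ValueError(
            'The --region flag is only valid when --cloud is provided.')
    known_regions = regions_by_cloud.get(cloud.lower(), [])
    if region not in known_regions:
        # Region names are provider-specific, so an unknown name is rejected.
        raise ValueError(f'Invalid region {region!r}')
```

Calling validate_region_flag('aws', 'us-east-1', {'aws': ['us-east-1']}) returns silently, while passing a region without a cloud raises ValueError.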

@vivekkhimani
Author

@romilbhardwaj can you help me approve the workflows blocked for the first-time contributors?

@romilbhardwaj
Collaborator

Done, should be running now!

@vivekkhimani
Author

vivekkhimani commented Oct 2, 2022

This should be ready for review, provided the workflows pass! @romilbhardwaj

@vivekkhimani vivekkhimani marked this pull request as ready for review October 2, 2022 07:41
@vivekkhimani vivekkhimani changed the title [Draft] [sky/feat/show-gpus] adding region based filtering for show-gpus command. [sky/feat/show-gpus] adding region based filtering for show-gpus command. Oct 2, 2022
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Great work @vivekkhimani! Left some comments and notes on unintended behavior I observed.

Review threads (outdated, resolved): sky/cli.py (2)
@romilbhardwaj
Collaborator

It would be great to have hints on regions supported by a cloud if the user enters incorrect regions.

For instance, for small typos, I noticed we have automatic suggestions which is great:

(base) romilb@romilbx1yoga:~$ sky show-gpus --region us-west1 --cloud aws
ValueError: Invalid region 'us-west1'
Did you mean one of these: 'us-west-1'?

However, for incorrectly specified regions (severe typos? :) ), the error message isn't very helpful:

(base) romilb@romilbx1yoga:~$ sky show-gpus --region uswest --cloud azure
ValueError: Invalid region 'uswest'

Can we have a line like List of supported azure regions: westus, <list of azure regions> here?
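
One way such a message could be assembled is sketched below. It is a guess at the shape of the behavior, not the catalog code itself: invalid_region_message is a hypothetical helper, and the use of difflib.get_close_matches with a 0.85 cutoff is an assumption chosen for illustration.

```python
import difflib
from typing import List


def invalid_region_message(cloud: str, region: str,
                           supported: List[str]) -> str:
    """Build an error message for an unknown region (hypothetical helper).

    Close matches are suggested first; if there are none, the full list of
    supported regions for the cloud is printed instead.
    """
    lines = [f'Invalid region {region!r}']
    # Assumed similarity threshold; the real implementation may differ.
    close = difflib.get_close_matches(region, supported, n=3, cutoff=0.85)
    if close:
        candidates = ', '.join(repr(c) for c in close)
        lines.append(f'Did you mean one of these: {candidates}?')
    else:
        region_list = ', '.join(sorted(supported))
        lines.append(f'List of supported {cloud} regions: {region_list}')
    return '\n'.join(lines)
```

With this sketch, a small typo such as us-west1 still gets a suggestion, and an input with no close match falls back to the full region list rather than a bare error.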

@romilbhardwaj
Collaborator

romilbhardwaj commented Oct 2, 2022

Problem

I noticed that adding the REGION column to sky show-gpus -a makes the output look as if SkyPilot supports only one region per cloud.

For instance, we know that V100 is available in aws us-east-1, aws us-east-2 and aws us-west-2. This is also recorded in our catalog (L4884-L4896 in aws.csv).

However, when I run sky show-gpus -a, the addition of the region column makes it seem SkyPilot supports only us-east-1 on aws:

GPU   QTY  CLOUD  INSTANCE_TYPE       vCPUs  HOST_MEMORY  HOURLY_PRICE  HOURLY_SPOT_PRICE  REGION       
V100  1    AWS    p3.2xlarge          8      61GB         $ 3.060       $ 0.918            us-east-1    
V100  4    AWS    p3.8xlarge          32     244GB        $ 12.240      $ 3.672            us-east-1    
V100  8    AWS    p3.16xlarge         64     488GB        $ 24.480      $ 7.344            us-east-1    
V100  1    Azure  Standard_NC6s_v3    6      112GB        $ 3.060       $ 0.846            centralus    
V100  2    Azure  Standard_NC12s_v3   12     224GB        $ 6.120       $ 1.693            centralus    
V100  4    Azure  Standard_NC24rs_v3  24     448GB        $ 13.460      $ 3.724            centralus    
V100  4    Azure  Standard_NC24s_v3   24     448GB        $ 12.240      $ 3.386            centralus    
V100  1    GCP    (attachable)        -      -            $ 2.480       $ 0.740            us-central1  
V100  2    GCP    (attachable)        -      -            $ 4.960       $ 1.480            us-central1  
V100  4    GCP    (attachable)        -      -            $ 9.920       $ 2.960            us-central1
V100  8    GCP    (attachable)        -      -            $ 19.840      $ 5.920            us-central1  

Of course, we don't want to show every region on its own line in sky show-gpus -a, since that would result in a really long output.

Proposed solution

  • Do not show regions when the user runs sky show-gpus -a.
  • Show the REGION column only when both a cloud and a gpu_name are specified, or when the --region flag is used, and print each region on its own line. E.g., when I run sky show-gpus V100 --cloud aws:
(base) romilb@romilbx1yoga:~$ sky show-gpus V100 --cloud aws
*NOTE*: for most GCP accelerators, INSTANCE_TYPE == (attachable) means the host VM's cost is not included.

GPU   QTY  CLOUD  INSTANCE_TYPE  vCPUs  HOST_MEMORY  HOURLY_PRICE  HOURLY_SPOT_PRICE  REGIONS     
V100  1    AWS    p3.2xlarge     8      61GB         $ 3.060       $ 0.918            us-east-1
V100  1    AWS    p3.2xlarge     8      61GB         $ 3.060       $ 0.918            us-west-2
V100  4    AWS    p3.8xlarge     32     244GB        $ 12.240      $ 3.672            us-east-1
V100  4    AWS    p3.8xlarge     32     244GB        $ 12.240      $ 3.672            us-west-2
V100  8    AWS    p3.16xlarge    64     488GB        $ 24.480      $ 7.344            us-east-1
V100  8    AWS    p3.16xlarge    64     488GB        $ 24.480      $ 7.344            us-west-2
<show more regions, currently we don't>

GPU        QTY  CLOUD  INSTANCE_TYPE  vCPUs  HOST_MEMORY  HOURLY_PRICE  HOURLY_SPOT_PRICE  REGION     
V100-32GB  8    AWS    p3dn.24xlarge  96     768GB        $ 31.212      $ 9.364            us-east-1  
<show more regions, currently we don't>

Only problem is this will likely be a long output with lots of similar lines, but it does solve the pain point in #1170. @Michaelvll please feel free to suggest any other way you'd like to see per region pricing information.
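
The display rule proposed above boils down to a small predicate. The sketch below is illustrative only; should_show_region_column and its parameter names are hypothetical and do not come from sky/cli.py.

```python
from typing import Optional


def should_show_region_column(show_all: bool, cloud: Optional[str],
                              gpu_name: Optional[str],
                              region: Optional[str]) -> bool:
    """Decide whether the REGION column is printed (hypothetical helper)."""
    if show_all:
        # With -a, per-region rows would make the output far too long.
        return False
    if region is not None:
        # An explicit --region filter always shows the column.
        return True
    # Otherwise require both a cloud and a specific GPU name.
    return cloud is not None and gpu_name is not None
```

For sky show-gpus V100 --cloud aws this returns True, and for sky show-gpus -a it returns False.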

@vivekkhimani
Author

@romilbhardwaj thanks for the review! All of these comments make sense, and I have a pretty good idea of how to fix them. I have been caught up with some school work, but I will get to this by the end of this week.

@romilbhardwaj
Collaborator

Hi @vivekkhimani - any updates on this?

@vivekkhimani
Author

Re: hints on the regions supported by a cloud when the user enters an invalid region (e.g. sky show-gpus --region uswest --cloud azure): added a fix for this, @romilbhardwaj.


Re: the REGION column in sky show-gpus -a making it look as if SkyPilot supports only one region per cloud: @romilbhardwaj very true! Fixed this by disabling the region column when the -a option is used.

@vivekkhimani
Author

Hi @vivekkhimani - any updates on this?

@romilbhardwaj extremely sorry for the huge delay! I got busy with school and my own project because of a conference deadline and then was traveling during the winter break. I fixed/resolved all the comments and most of the pylint/yapf issues. Can you give me a maintainer approval to run the pipelines again?

@vivekkhimani
Author

@romilbhardwaj supporting screenshots for the fix.

  • hints on supported regions:

Screen Shot 2022-12-28 at 4 40 58 PM

  • regions column disabled on the -a flag:

Screen Shot 2022-12-28 at 4 42 48 PM

@romilbhardwaj
Collaborator

Thanks @vivekkhimani! I'll get to this soon.

Collaborator

@romilbhardwaj romilbhardwaj left a comment

Thanks @vivekkhimani - I tested this out and it works nicely (including for gcp TPUs)! Left some minor nit comments - should be ready to ship once resolved!

Review threads (outdated, resolved): sky/clouds/service_catalog/common.py (1), sky/cli.py (3)
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Awesome work @vivekkhimani! This looks good to go!

@vivekkhimani
Author

Thanks for reviewing it all along, @romilbhardwaj! I don't think I can merge this since I don't have write access to the repository, so do you mind doing it for me?

@romilbhardwaj romilbhardwaj merged commit b42e01f into skypilot-org:master Jan 21, 2023