[core] Respect the block list when failover #1585

Merged
merged 8 commits into from
Jan 18, 2023

Collaborator

@Michaelvll Michaelvll commented Jan 12, 2023

Fixes #1584

Tested (run the relevant ones):

  • Any manual or new tests for this PR (please specify below)
    • sky launch --gpus A100:8 --cloud aws: fails over correctly across regions.
    • sky launch --gpus A100:8 --use-spot --cloud aws: fails over correctly across zones.
    • sky launch --instance-type m3-megamem-128: fails over correctly, skipping the rest of a region once it is known that the whole region has no quota.
    • sky launch --cloud azure: fails over correctly on Azure (tested by editing azure-ray.yml.j2 to make the image invalid, which triggers the failover).
  • All smoke tests: pytest tests/test_smoke.py (with [Tests] Add options for test to select clouds #1587 merged)
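
To illustrate the behavior these manual tests exercise, here is a minimal sketch of a failover loop that consults a block list before each attempt. It is not the code in sky/backends/cloud_vm_ray_backend.py: `Placement`, `is_blocked`, `failover`, and the `try_provision` callback are hypothetical names made up for this example, and the assumption that a failed attempt reports either a single zone or a whole region to block is inferred from the test descriptions above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Placement:
    """Hypothetical candidate location; zone=None means the whole region."""
    cloud: str
    region: str
    zone: Optional[str] = None


def is_blocked(candidate: Placement, blocked: Set[Placement]) -> bool:
    """Return True if the candidate matches any entry in the block list.

    An entry with zone=None blocks every zone of that region.
    """
    for entry in blocked:
        if (entry.cloud, entry.region) != (candidate.cloud, candidate.region):
            continue
        if entry.zone is None or entry.zone == candidate.zone:
            return True
    return False


def failover(
    candidates: Iterable[Placement],
    try_provision: Callable[[Placement], Tuple[bool, Optional[Placement]]],
    blocked: Set[Placement],
) -> Optional[Placement]:
    """Try candidates in order, skipping any that the block list covers.

    try_provision (hypothetical) returns (succeeded, placement_to_block).
    On failure, placement_to_block is added to the block list so that
    later candidates in the same zone or region are not retried.
    """
    for candidate in candidates:
        if is_blocked(candidate, blocked):
            continue  # respect the block list instead of retrying
        succeeded, to_block = try_provision(candidate)
        if succeeded:
            return candidate
        if to_block is not None:
            blocked.add(to_block)
    return None
```

With this shape, a quota error that blocks a whole region (an entry with `zone=None`) causes the remaining zones of that region to be skipped, while a zone-level capacity error only blocks that zone.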

@Michaelvll Michaelvll marked this pull request as ready for review January 13, 2023 04:30
@Michaelvll Michaelvll requested a review from WoosukKwon January 16, 2023 01:27
Collaborator

@WoosukKwon WoosukKwon left a comment

Thanks @Michaelvll for fixing my bug! The code looks very good to me. Left only one minor comment.

sky/backends/cloud_vm_ray_backend.py (resolved)
sky/backends/cloud_vm_ray_backend.py (outdated, resolved)
@Michaelvll Michaelvll merged commit 05727f8 into master Jan 18, 2023
@Michaelvll Michaelvll deleted the blocklist branch January 18, 2023 05:49
@Michaelvll Michaelvll mentioned this pull request Jan 24, 2023
1 task

Successfully merging this pull request may close these issues.

[core] Failover does not respect the blocked list
2 participants