Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[AWS] Bring-your-own-VPC that disables public IPs for all SkyPilot nodes. (#1512)

* Minor: sky logs hint
* Minor: add a FIXME in authentication.py.
* New module: sky_config
* Backend changes for SSH proxy command support.
* spot_launch(): sync up config; pop any proxy command.
* AutostopEvent: monkey patch SSHOptions.
* aws/config.py: support vpc_name and new use_internal_ips semantics.
* Make failover catch our 'ERROR' messages from AWS node provider.
* .j2 changes.
* Fix local launch hash for workers: must pop ssh_proxy_command.
* Fix pylint.
* typo
* smoke: make printf usage safe.
* Use SKYPILOT_ERROR as logging prefix.
* Fix Resources.__repr__().
* Avoid printing unnecessary termination errors for VPC-not-found.
* Fix a syntax error in codegen.
* Read from SKYPILOT_CONFIG env var to permit dynamic generation.
* Fix smoke test name.
* Fix another test name
* Revert "Read from SKYPILOT_CONFIG env var to permit dynamic generation." This reverts commit 0b982cd.
* Fix head_node_launch_requested log line.
* Optional: env var to read configs for spot, for better isolation.
* Make query_head_ip_with_retries() robust to extra output.
* aws/config.py: reword comments
* events.py: restart_only=True
* Fix Resources.__repr__ to handle None fields.
* Use SKYPILOT_ERROR_NO_NODES_LAUNCHED
* rstrip() for ssh config entries.
* authentication.py: reword comment
* pylint
* Fix logging
* Try using reties for handle.{internal,external}_ips().
* Address some easy comments
* Typo
* backend_utils: fix worker IPs fetch; fix >80-col lines.
* Fix test_minimal.
* test_smoke: printf -> echo
* Query IPs once.
* Drop ssh_proxy_command in launch hash when provisioning.
* Typo
* Typo
* Add comment
* sky/sky_config.py -> sky/skypilot_config.py
* Add: sky/backends/monkey_patches/
* Remove logging
* pylint
* MANIFEST should add monkey patch file
* tests/test_smoke.py: fix extra \n
* Fix monkey patching bug.
* Remove AutostopEvent monkey patching.
* _ray_launch_hash: pop ssh proxy command for head & workers
* Make another 'ray up' use patched launch hash fn.
* Fix smoke tests.
* Fix smoke: K80 VMs could be non-ssh-able (and are more costly).
1 parent 34e7fee, commit 8d6f6a9
Showing 17 changed files with 871 additions and 253 deletions.
@@ -9,6 +9,6 @@

 resources:
   cloud: aws
-  accelerators: K80
+  accelerators: T4

 num_nodes: 2
@@ -10,7 +10,7 @@
 name: job_multinode

 resources:
-  accelerators: K80:0.5
+  accelerators: T4:0.5

 num_nodes: 2
42 changes: 42 additions & 0 deletions
sky/backends/monkey_patches/ray_up_with_monkey_patched_hash_launch_conf.py
@@ -0,0 +1,42 @@
"""Runs `ray up` while not using ssh_proxy_command in launch hash.

This monkey patches the hash_launch_conf() function inside Ray autoscaler to
exclude any ssh_proxy_command in hash calculation.

Reasons:
- In the future, we want to support changing the ssh_proxy_command field for
  an existing cluster. If the launch hash included this field, then this would
  mean upon such a change a new cluster would've been launched, causing
  leakage.
- With our patch, ssh_proxy_command will be excluded from the launch hash when
  a cluster is first created. This then makes it possible for us to support
  changing the proxy command in the future.
"""
import hashlib
import json
import os

from ray.autoscaler import sdk


# Ref: https://github.com/ray-project/ray/blob/releases/2.2.0/python/ray/autoscaler/_private/util.py#L392-L404
def monkey_patch_hash_launch_conf(node_conf, auth):
    hasher = hashlib.sha1()
    # For hashing, we replace the path to the key with the key
    # itself. This is to make sure the hashes are the same even if keys
    # live at different locations on different machines.
    full_auth = auth.copy()
    full_auth.pop('ssh_proxy_command', None)  # NOTE: skypilot changes.
    for key_type in ['ssh_private_key', 'ssh_public_key']:
        if key_type in auth:
            with open(os.path.expanduser(auth[key_type])) as key:
                full_auth[key_type] = key.read()
    hasher.update(
        json.dumps([node_conf, full_auth], sort_keys=True).encode('utf-8'))
    return hasher.hexdigest()


# Since hash_launch_conf is imported this way, we must patch this imported
# version.
sdk.sdk.commands.hash_launch_conf = monkey_patch_hash_launch_conf
sdk.create_or_update_cluster({ray_yaml_path}, **{ray_up_kwargs})
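
Reviewer note: the effect described in the module docstring can be checked in isolation. The sketch below is a simplified, hypothetical stand-in for the patched function, not the code from this commit: `launch_hash_ignoring_proxy` is a name invented here, it skips reading SSH key files from disk so that it runs without Ray or any key material, and the instance types and proxy command are illustrative values only. It shows that two auth configs differing only in `ssh_proxy_command` hash identically, while a different node config still changes the hash.

```python
import hashlib
import json


def launch_hash_ignoring_proxy(node_conf, auth):
    """Hash a launch config, ignoring any ssh_proxy_command in `auth`.

    Hypothetical, simplified stand-in for the patched function above: it
    does not read key files from disk, so the sketch runs anywhere.
    """
    auth_for_hash = dict(auth)
    # Drop the proxy command so that changing it does not change the hash.
    auth_for_hash.pop('ssh_proxy_command', None)
    payload = json.dumps([node_conf, auth_for_hash], sort_keys=True)
    return hashlib.sha1(payload.encode('utf-8')).hexdigest()


# Illustrative values only; none of these come from the commit.
node_conf = {'InstanceType': 'g4dn.xlarge'}
auth_direct = {'ssh_user': 'ubuntu'}
auth_proxied = {
    'ssh_user': 'ubuntu',
    'ssh_proxy_command': 'ssh -W %h:%p user@jump-host',
}

hash_direct = launch_hash_ignoring_proxy(node_conf, auth_direct)
hash_proxied = launch_hash_ignoring_proxy(node_conf, auth_proxied)
hash_other_node = launch_hash_ignoring_proxy(
    {'InstanceType': 'p3.2xlarge'}, auth_direct)

print(hash_direct == hash_proxied)     # True
print(hash_direct == hash_other_node)  # False
```

Because the proxy command never reaches the hasher, adding or changing it later would not make the autoscaler treat existing nodes as having an outdated launch config, which is the leakage scenario the docstring warns about.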