Commit
Add docker support for SkyPilot (#1910)
* successfully launched on GCP
* update image with rsync installed && fix name error in conda command
* support docker image on task yaml && reformat code
* recoup sudo in setup commands
* reformat code
* support direct ssh to docker && fix mesg warning && reformat
* fix gcp port issue
* aws successfully launched
* fix port range
* successfully launched on Azure
* remove unused exception class
* reformat setup command to azure.py in plain text for readability
* fix job queue cannot cancel
* reformat code
* switch to port 22 for docker ssh
* move docker image to resources: image_id: docker:<image> and change to an optional function
* minor fix for ray yaml j2
* fix error when image_id is a dict
* support multi-node
* move docker user setup after handle is created
* support images without rsync
* add aws && azure support
* remove redundant pip3 install
* remove error merging
* move docker image to resources
* minor fixes
* format
* adjust extract docker image
* change back to port 22 for host and port 10022 for docker, passed stop-start recovery test
* add ulimit and gcp 10022 enable outside oslogin
* fix gcp authentication & move ssh authorized keys setup to run_init
* fix len(image_id) when image_id is None
* use docker stop & start to recover
* temporarily remove conflict
* add back docker user
* fix wrong username in add job
* use ssh jump server to access docker
* remove inbound rules of 10022
* update comment
* now ssh into docker
* ux: raise rather than assert
* move some setup commands to SkyDockerCommandRunner
* minor fix
* format
* fix multinode ssh config
* Update sky/backends/backend_utils.py
  Co-authored-by: Zhanghao Wu <[email protected]>
* Update sky/backends/backend_utils.py
  Co-authored-by: Zhanghao Wu <[email protected]>
* Update sky/backends/backend_utils.py
  Co-authored-by: Zhanghao Wu <[email protected]>
* Update sky/clouds/azure.py
  Co-authored-by: Zhanghao Wu <[email protected]>
* Update sky/skylet/providers/gcp/config.py
  Co-authored-by: Zhanghao Wu <[email protected]>
* minor fixes
* format
* move get_docker_user to backend_utils.py
* move two constants to skylet and move resources vars to make deployment vars
* move SkyDockerCommandRunner to skylet/providers
* format
* support proxy command with docker
* Update sky/backends/backend_utils.py
  Co-authored-by: Zhanghao Wu <[email protected]>
* Update sky/clouds/aws.py
  Co-authored-by: Zhanghao Wu <[email protected]>
* add comment
* quote docker ssh proxy command
* explicit checking for Optional object
* fix credentials
* remove -m in bash script
* fix job queue owner
* fix job owner and username
* add job queue smoke test
* add test_docker_preinstalled_package
* fix restart error
* nit: code style for proxy command
* update CloudVmRayResourceHandle version
* disable unattended-upgrade with cloud-init
* fix UnboundLocalError
* fix variable shadow
* move checking for targetTags to #2210
* rename
* restore some deprecated changes
* format
* disable tpu with docker
* fix acc inexisting problem
* add progress bar to docker image pulling
* Apply suggestions from code review
  Co-authored-by: Zhanghao Wu <[email protected]>
* Apply suggestions from code review
* Try cloud init without base64 encode
  Co-authored-by: Zhanghao Wu <[email protected]>
* Revert "Try cloud init without base64 encode"
  This reverts commit 418c912.
* add comment in cmd runner
* add failover for clouds that do not support docker yet
* temporarily remove check of docker image
* add docker to resources.get_required_cloud_features
* format
* install pip in run_init
* format
* disable ssh control when docker is used
* add docker user to resource handle's repr
* stash some changes for easier merge
* minor
* add back previously stashed function
* disable docker with proxy for now
* change constants.DEFAULT_DOCKER_PORT to int
* rename NATIVE_DOCKER_SUPPORT to DOCKER_IMAGE
* add todo for only support debian-based images
* add check for proxy command
* fix docker_config={}
* upd docker test
* upd docker test
* move proxy command check to cloud.check_features_are_supported

---------

Co-authored-by: Zhanghao Wu <[email protected]>
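
The commit message above moves the Docker image into `resources: image_id: docker:<image>` and separately mentions fixes for the cases where `image_id` is a dict or is None. The following is a minimal sketch of how such a prefix check might look. It is not the code from this commit: the function name `extract_docker_image`, the constant `DOCKER_PREFIX`, and the assumption that a dict-valued `image_id` maps a region (or None) to an image string are all hypothetical and chosen for illustration only.

```python
from typing import Dict, Optional, Union

# Prefix taken from the commit message ("image_id: docker:<image>").
DOCKER_PREFIX = "docker:"

# Assumption: image_id is None, a plain string, or a dict of region -> image.
ImageId = Union[None, str, Dict[Optional[str], str]]


def extract_docker_image(image_id: ImageId) -> Optional[str]:
    """Return the image name if image_id uses the docker: prefix, else None."""
    if image_id is None:
        # Guard first, so len() is never called on None.
        return None
    if isinstance(image_id, dict):
        # Assumption: only a single-entry dict can denote a Docker image.
        if len(image_id) != 1:
            return None
        image_id = next(iter(image_id.values()))
    if image_id.startswith(DOCKER_PREFIX):
        return image_id[len(DOCKER_PREFIX):]
    return None


if __name__ == "__main__":
    print(extract_docker_image("docker:ubuntu:20.04"))
    print(extract_docker_image({None: "docker:continuumio/miniconda3"}))
    print(extract_docker_image(None))
```

Checking for None before the dict and string branches keeps all three input shapes on one code path, which is one way to avoid the kind of `len(image_id)` failure the commit message refers to.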