DAG API Enhancements: Introducing Downstream Task Parsing and Explicit Flow Definition (#4067)

* provide an example, edited from pipeline.yml

* more focus on dependencies for user dag lib

* more powerful user interface

* load and dump new yaml format

* fix

* fix: reversed logic in add_edge

* [docs] Unroll k8s internal load balancer docs (#4083)

unroll load balancer docs

* rename

* refactor due to reviewer's comments

* generate task.name if not given

* [docs] `sky status --kubernetes` docs (#4064)

* observability docs

* comments

* [UX] Show log after failure and fix the color issue with narrow window (#4084)

* fix narrow window and show log path during exception

* format

* format

* [k8s] `sky status --k8s` refactor (#4079)

* refactor

* lint

* refactor, dataclass

* refactor, dataclass

* refactor

* lint

* add comments for add_edge

* add `print_exception_no_traceback` when raise

* make `Dag.tasks` a property

* print dependencies for `__repr__`

* move `get_unique_task_name` to common_utils

* [Performance] Use new GCP custom images (#4027)

* [Performance] Use new custom image to create GCP GPU VMs

* update image tags for both CPU and GPU

* always generate .sky/python_path

---------

Co-authored-by: Yika Luo <[email protected]>

* [GCP] Add H100 mega (#4099)

* Add H100 mega support on GCP

* fix for some other regions

* format

* fix resource type

* fix catalog fetching

* [GCP] Add gVNIC support (#4095)

* add gvnic support through config.yaml

* lint

* docs

* [Lambda] Lambda Cloud SkyPilot provisioner (#3865)

* feat: lambda cloud new provisioner

* feat: address cblmemo reviews and other reviews + make multi-node work again

* fix: quotes

* fix: address some reviews

* chore: rm unused option

* chore: update typedef

* feat: use lists directly

* fix: formatting

* chore: address reviews

* fix: formatting

* chore: rm query ports since default impl per review

* feat: add back query ports

* fix: formatting

* chore: add newline at eof

* feat: try removing query ports again

* [Docs] GKE Nvidia Driver installation instructions update (#4106)

* docs

* docs

* docs

* [Performance] Use new AWS custom images (#4091)

* rename methods to use downstream/edge terminology

* [Performance] Add Packer image generation scripts for GCP and AWS (#4068)

* [Performance] Add Packer image generation scripts for GCP and AWS

* Add docker install and tests

* solve nvidia container issue

* Install cuDNN

* [Performance] Scripts to copy/delete AWS images for all regions and add cloud deps (#4073)

* [Performance] Add AWS script to copy images for all regions

* script to delete all AWS images across regions

* Add cloud dependencies to image

---------

Co-authored-by: Yika Luo <[email protected]>

* Disable AWS images.csv refreshing (#4116)

* [Docs] .skyignore doc (#4114)

* [Docs] .skyignore doc

* Correct typos

Co-authored-by: Zongheng Yang <[email protected]>

---------

Co-authored-by: Zongheng Yang <[email protected]>

* [Core] Raise error for non-existent cluster when endpoint is called (#4117)

raise error for non-existent cluster

* Refresh local aws images.csv when image not found (#4127)

Refresh local aws images.csv by pulling from github catalog when image tag not found

* [Docs] News revamps. (#4126)

* News revamps.

updates

updates

updates

updates

updates

updates

updates

updates

* Apply suggestions from code review

Co-authored-by: Zhanghao Wu <[email protected]>

---------

Co-authored-by: Zhanghao Wu <[email protected]>

* [Serve] Support manually terminating a replica and with purge option (#4032)

* define replica id param in cli

* create endpoint on controller

* call controller endpoint to scale down replica

* add classmethod decorator

* add handler methods for readability in cli

* update docstr and error msg, and inline in cli

* update log and return err msg

* add docstr, catch and reraise err, add stopped and nonexistent message

* inline constant to avoid circular import

* fix error statement and return encoded str

* add purge feature

* add purge replica usage in docstr

* use .get to handle unexpected packages

* fix: diff terminate replica when failed/purging or not

* fix: stay up to date for `is_controller_accessible`

* revert

* up to date with current APIs

* error handling

* when purged remove record in the main loop

* refactor due to reviewer's suggestions

* combine functions

* fix: terminate the healthy replica even with purge option

* remove abbr

* Update sky/serve/core.py

Co-authored-by: Tian Xia <[email protected]>

* Update sky/serve/core.py

Co-authored-by: Tian Xia <[email protected]>

* Update sky/serve/controller.py

Co-authored-by: Tian Xia <[email protected]>

* Update sky/serve/controller.py

Co-authored-by: Tian Xia <[email protected]>

* Update sky/cli.py

Co-authored-by: Tian Xia <[email protected]>

* got services hint

* check if not yes in the outside if branch

* fix some output messages

* Update sky/serve/core.py

Co-authored-by: Tian Xia <[email protected]>

* set conflict status code for already scheduled termination

* combine purge and normal terminating down branch together

* bump version

* global exception handler to render a json response with error messages

* fix: use responses.JSONResponse for dict serialize

* error messages for old controller

* fix: check version mismatch in generated code

* revert mistakenly change update_service

* refine already in terminating message

* fix: branch code workaround in cls.build

* wording

Co-authored-by: Tian Xia <[email protected]>

* refactor due to reviewer's comments

* fix use ux_utils

Co-authored-by: Tian Xia <[email protected]>

* add changelog as comments

* fix messages

* edit the message for mismatch error

Co-authored-by: Tian Xia <[email protected]>

* no traceback when raising in `terminate_replica`

* messages decode

* Apply suggestions from code review

Co-authored-by: Tian Xia <[email protected]>

* format

* format

* Empty commit

---------

Co-authored-by: David Tran <[email protected]>
Co-authored-by: David Tran <[email protected]>
Co-authored-by: Tian Xia <[email protected]>
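
The bullets above describe the replica-termination flow added in this PR: a
replica-id parameter on `sky serve down` plus a `--purge` option, backed by a
`terminate_replica` handler on the controller. A hedged usage sketch (the flag
names follow the bullets above and may differ from the released CLI):

    # Hypothetical usage, assuming the flags described in the bullets above.
    # Terminate replica 3 of the service "my-service":
    $ sky serve down my-service --replica-id 3

    # Also clean up the record of a failed replica:
    $ sky serve down my-service --replica-id 3 --purge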

* [Provisioner] Support docker in Lambda Cloud and TPU (#4115)

* [Provisioner] Support docker in Lambda Cloud

* fix permission issue

* merge with check docker installed

* add tpu support & test

* patch lambda cloud

* add comment

* Apply suggestions from code review

Co-authored-by: Tian Xia <[email protected]>

* change wording all to up/downstream style

* Add unique suffix to task names, fallback to timestamp if unnamed

* Unify handling of single and multiple tasks without dependencies

* Refactor tasks initialization: use list comprehension and fail fast

* Fix remove task dependency description: upstream, not downstream

Co-authored-by: Tian Xia <[email protected]>

* Remove duplicated `self.edges`, use nx api instead

* [Serve] Add `ux_utils.print_exception_no_traceback()` for cleaner error output (#4111)

* add `ux_utils.print_exception_no_traceback()` for cleaner error output

* Empty commit

* remove unnecessary with block

* Partially revert: Remove unnecessary `ux_utils.print_exception_no_traceback()` wrappers (#4130)

fix unnecessary with block for returning

* Revert "Add unique suffix to task names, fallback to timestamp if unnamed"

Otherwise, users cannot refer to the task by name in the DAG.

This reverts commit 8486352.

* comment the checking used as upstream logic

* [examples] Deepspeed fixes + k8s support (#4124)

deepspeed kubernetes fixes

* Empty commit

* [OCI] Support more OS types in addition to ubuntu (#4080)

* Bug fix for sky config file path resolution.

* format

* [OCI] Bug fix for image_id in Task YAML

* [OCI]: Support more OS types (esp. oraclelinux) in addition to ubuntu.

* format

* Disable system firewall

* Bug fix for validation of the Marketplace images

* Update sky/clouds/oci.py

Co-authored-by: Zhanghao Wu <[email protected]>

* Update sky/clouds/oci.py

Co-authored-by: Zhanghao Wu <[email protected]>

* variable/function naming

* address review comments: do not change the service_catalog API; call oci_catalog directly to get the OS type for an image.

* Update sky/clouds/oci.py

Co-authored-by: Zhanghao Wu <[email protected]>

* Update sky/clouds/oci.py

Co-authored-by: Zhanghao Wu <[email protected]>

* Update sky/clouds/oci.py

Co-authored-by: Zhanghao Wu <[email protected]>

* address review comments

---------

Co-authored-by: Zhanghao Wu <[email protected]>

* Apply suggestions from code review

Co-authored-by: Tian Xia <[email protected]>

* fix: typing.cast

* add TODOs for future function migration

* remove dependencies wording to reduce ambiguity

* temporarily add github actions

---------

Co-authored-by: Romil Bhardwaj <[email protected]>
Co-authored-by: Zhanghao Wu <[email protected]>
Co-authored-by: yika-luo <[email protected]>
Co-authored-by: Yika Luo <[email protected]>
Co-authored-by: Kote Mushegiani <[email protected]>
Co-authored-by: Zongheng Yang <[email protected]>
Co-authored-by: David Tran <[email protected]>
Co-authored-by: David Tran <[email protected]>
Co-authored-by: Tian Xia <[email protected]>
Co-authored-by: Hysun He <[email protected]>
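
The DAG-related bullets above (explicit flow definition, `add_edge`,
downstream/edge terminology, a `tasks` property, generated task names, and
using the networkx API instead of a duplicated `self.edges`) outline the new
user interface. A minimal, hypothetical Python sketch of that idea follows;
the class and method names mirror the bullets, and the actual SkyPilot
implementation may differ:

    # Hypothetical sketch only; real SkyPilot classes/signatures may differ.
    import time

    import networkx as nx


    class Task:
        def __init__(self, name=None):
            # "generate task.name if not given": fall back to a timestamp.
            self.name = name or f'task-{int(time.time())}'


    class Dag:
        def __init__(self):
            # networkx is the single source of truth for nodes and edges
            # (no duplicated self.edges list).
            self._graph = nx.DiGraph()

        @property
        def tasks(self):
            return list(self._graph.nodes)

        def add(self, task):
            self._graph.add_node(task)

        def add_edge(self, upstream, downstream):
            # The upstream task must finish before the downstream task starts.
            self._graph.add_edge(upstream, downstream)

        def downstream(self, task):
            return list(self._graph.successors(task))

        def __repr__(self):
            edges = ', '.join(f'{u.name} -> {v.name}'
                              for u, v in self._graph.edges)
            return f'Dag(tasks={[t.name for t in self.tasks]}, edges=[{edges}])'


    train, evaluate = Task('train'), Task('eval')
    dag = Dag()
    dag.add(train)
    dag.add(evaluate)
    dag.add_edge(train, evaluate)  # 'eval' runs after 'train'.
    print(dag)  # Dag(tasks=['train', 'eval'], edges=[train -> eval])
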
11 people authored Oct 21, 2024
1 parent 340f384 commit 7d93b75
Showing 83 changed files with 2,327 additions and 860 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/format.yml
@@ -7,10 +7,12 @@ on:
branches:
- master
- 'releases/**'
- advanced-dag
pull_request:
branches:
- master
- 'releases/**'
- advanced-dag
merge_group:

jobs:
2 changes: 2 additions & 0 deletions .github/workflows/mypy-generic.yml
@@ -9,10 +9,12 @@ on:
branches:
- master
- 'releases/**'
- advanced-dag
pull_request:
branches:
- master
- 'releases/**'
- advanced-dag
merge_group:

jobs:
2 changes: 2 additions & 0 deletions .github/workflows/mypy.yml
@@ -7,10 +7,12 @@ on:
branches:
- master
- 'releases/**'
- advanced-dag
pull_request:
branches:
- master
- 'releases/**'
- advanced-dag
jobs:
mypy:
runs-on: ubuntu-latest
2 changes: 2 additions & 0 deletions .github/workflows/pylint.yml
@@ -7,10 +7,12 @@ on:
branches:
- master
- 'releases/**'
- advanced-dag
pull_request:
branches:
- master
- 'releases/**'
- advanced-dag
merge_group:

jobs:
2 changes: 2 additions & 0 deletions .github/workflows/pytest-generic.yml
@@ -8,10 +8,12 @@ on:
branches:
- master
- 'releases/**'
- advanced-dag
pull_request:
branches:
- master
- 'releases/**'
- advanced-dag
merge_group:

jobs:
2 changes: 2 additions & 0 deletions .github/workflows/pytest.yml
@@ -6,10 +6,12 @@ on:
branches:
- master
- 'releases/**'
- advanced-dag
pull_request:
branches:
- master
- 'releases/**'
- advanced-dag
merge_group:

jobs:
2 changes: 2 additions & 0 deletions .github/workflows/test-doc-build.yml
@@ -7,10 +7,12 @@ on:
branches:
- master
- 'releases/**'
- 'advanced-dag/**'
pull_request:
branches:
- master
- 'releases/**'
- 'advanced-dag/**'
merge_group:

jobs:
2 changes: 2 additions & 0 deletions .github/workflows/test-poetry-build.yml
@@ -6,10 +6,12 @@ on:
branches:
- master
- 'releases/**'
- 'advanced-dag/**'
pull_request:
branches:
- master
- 'releases/**'
- 'advanced-dag/**'
merge_group:

jobs:
42 changes: 22 additions & 20 deletions README.md
@@ -26,30 +26,32 @@

----
:fire: *News* :fire:
- [Sep, 2024] Point, Launch and Serve **Llama 3.2** on Kubernetes or Any Cloud: [**example**](./llm/llama-3_2/)
- [Sep, 2024] Run and deploy [**Pixtral**](./llm/pixtral), the first open-source multimodal model from Mistral AI.
- [Jul, 2024] [**Finetune**](./llm/llama-3_1-finetuning/) and [**serve**](./llm/llama-3_1/) **Llama 3.1** on your infra
- [Jun, 2024] Reproduce **GPT** with [llm.c](https://github.com/karpathy/llm.c/discussions/481) on any cloud: [**guide**](./llm/gpt-2/)
- [Apr, 2024] Serve **Qwen-110B** on your infra: [**example**](./llm/qwen/)
- [Apr, 2024] Using **Ollama** to deploy quantized LLMs on CPUs and GPUs: [**example**](./llm/ollama/)
- [Feb, 2024] Deploying and scaling **Gemma** with SkyServe: [**example**](./llm/gemma/)
- [Feb, 2024] Serving **Code Llama 70B** with vLLM and SkyServe: [**example**](./llm/codellama/)
- [Dec, 2023] **Mixtral 8x7B**, a high quality sparse mixture-of-experts model, was released by Mistral AI! Deploy via SkyPilot on any cloud: [**example**](./llm/mixtral/)
- [Nov, 2023] Using **Axolotl** to finetune Mistral 7B on the cloud (on-demand and spot): [**example**](./llm/axolotl/)
- [Oct 2024] :tada: **SkyPilot crossed 1M+ downloads** :tada:: Thank you to our community! [**Twitter/X**](https://x.com/skypilot_org/status/1844770841718067638)
- [Sep 2024] Point, Launch and Serve **Llama 3.2** on Kubernetes or Any Cloud: [**example**](./llm/llama-3_2/)
- [Sep 2024] Run and deploy [**Pixtral**](./llm/pixtral), the first open-source multimodal model from Mistral AI.
- [Jun 2024] Reproduce **GPT** with [llm.c](https://github.com/karpathy/llm.c/discussions/481) on any cloud: [**guide**](./llm/gpt-2/)
- [Apr 2024] Serve [**Qwen-110B**](https://qwenlm.github.io/blog/qwen1.5-110b/) on your infra: [**example**](./llm/qwen/)
- [Apr 2024] Using [**Ollama**](https://github.com/ollama/ollama) to deploy quantized LLMs on CPUs and GPUs: [**example**](./llm/ollama/)
- [Feb 2024] Deploying and scaling [**Gemma**](https://blog.google/technology/developers/gemma-open-models/) with SkyServe: [**example**](./llm/gemma/)
- [Feb 2024] Serving [**Code Llama 70B**](https://ai.meta.com/blog/code-llama-large-language-model-coding/) with vLLM and SkyServe: [**example**](./llm/codellama/)
- [Dec 2023] [**Mixtral 8x7B**](https://mistral.ai/news/mixtral-of-experts/), a high quality sparse mixture-of-experts model, was released by Mistral AI! Deploy via SkyPilot on any cloud: [**example**](./llm/mixtral/)
- [Nov 2023] Using [**Axolotl**](https://github.com/OpenAccess-AI-Collective/axolotl) to finetune Mistral 7B on the cloud (on-demand and spot): [**example**](./llm/axolotl/)

**LLM Finetuning Cookbooks**: Finetuning Llama 2 / Llama 3.1 in your own cloud environment, privately: Llama 2 [**example**](./llm/vicuna-llama-2/) and [**blog**](https://blog.skypilot.co/finetuning-llama2-operational-guide/); Llama 3.1 [**example**](./llm/llama-3_1-finetuning/) and [**blog**](https://blog.skypilot.co/finetune-llama-3_1-on-your-infra/)

<details>
<summary>Archived</summary>

- [Apr, 2024] Serve and finetune [**Llama 3**](https://skypilot.readthedocs.io/en/latest/gallery/llms/llama-3.html) on any cloud or Kubernetes: [**example**](./llm/llama-3/)
- [Mar, 2024] Serve and deploy [**Databricks DBRX**](https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm) on your infra: [**example**](./llm/dbrx/)
- [Feb, 2024] Speed up your LLM deployments with [**SGLang**](https://github.com/sgl-project/sglang) for 5x throughput on SkyServe: [**example**](./llm/sglang/)
- [Dec, 2023] Using [**LoRAX**](https://github.com/predibase/lorax) to serve 1000s of finetuned LLMs on a single instance in the cloud: [**example**](./llm/lorax/)
- [Sep, 2023] [**Mistral 7B**](https://mistral.ai/news/announcing-mistral-7b/), a high-quality open LLM, was released! Deploy via SkyPilot on any cloud: [**Mistral docs**](https://docs.mistral.ai/self-deployment/skypilot)
- [Sep, 2023] Case study: [**Covariant**](https://covariant.ai/) transformed AI development on the cloud using SkyPilot, delivering models 4x faster cost-effectively: [**read the case study**](https://blog.skypilot.co/covariant/)
- [Aug, 2023] **Finetuning Cookbook**: Finetuning Llama 2 in your own cloud environment, privately: [**example**](./llm/vicuna-llama-2/), [**blog post**](https://blog.skypilot.co/finetuning-llama2-operational-guide/)
- [July, 2023] Self-Hosted **Llama-2 Chatbot** on Any Cloud: [**example**](./llm/llama-2/)
- [June, 2023] Serving LLM 24x Faster On the Cloud [**with vLLM**](https://vllm.ai/) and SkyPilot: [**example**](./llm/vllm/), [**blog post**](https://blog.skypilot.co/serving-llm-24x-faster-on-the-cloud-with-vllm-and-skypilot/)
- [April, 2023] [SkyPilot YAMLs](./llm/vicuna/) for finetuning & serving the [Vicuna LLM](https://lmsys.org/blog/2023-03-30-vicuna/) with a single command!
- [Jul 2024] [**Finetune**](./llm/llama-3_1-finetuning/) and [**serve**](./llm/llama-3_1/) **Llama 3.1** on your infra
- [Apr 2024] Serve and finetune [**Llama 3**](https://skypilot.readthedocs.io/en/latest/gallery/llms/llama-3.html) on any cloud or Kubernetes: [**example**](./llm/llama-3/)
- [Mar 2024] Serve and deploy [**Databricks DBRX**](https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm) on your infra: [**example**](./llm/dbrx/)
- [Feb 2024] Speed up your LLM deployments with [**SGLang**](https://github.com/sgl-project/sglang) for 5x throughput on SkyServe: [**example**](./llm/sglang/)
- [Dec 2023] Using [**LoRAX**](https://github.com/predibase/lorax) to serve 1000s of finetuned LLMs on a single instance in the cloud: [**example**](./llm/lorax/)
- [Sep 2023] [**Mistral 7B**](https://mistral.ai/news/announcing-mistral-7b/), a high-quality open LLM, was released! Deploy via SkyPilot on any cloud: [**Mistral docs**](https://docs.mistral.ai/self-deployment/skypilot)
- [Sep 2023] Case study: [**Covariant**](https://covariant.ai/) transformed AI development on the cloud using SkyPilot, delivering models 4x faster cost-effectively: [**read the case study**](https://blog.skypilot.co/covariant/)
- [Jul 2023] Self-Hosted **Llama-2 Chatbot** on Any Cloud: [**example**](./llm/llama-2/)
- [Jun 2023] Serving LLM 24x Faster On the Cloud [**with vLLM**](https://vllm.ai/) and SkyPilot: [**example**](./llm/vllm/), [**blog post**](https://blog.skypilot.co/serving-llm-24x-faster-on-the-cloud-with-vllm-and-skypilot/)
- [Apr 2023] [SkyPilot YAMLs](./llm/vicuna/) for finetuning & serving the [Vicuna LLM](https://lmsys.org/blog/2023-03-30-vicuna/) with a single command!

</details>

53 changes: 28 additions & 25 deletions docs/source/examples/syncing-code-artifacts.rst
@@ -46,31 +46,7 @@ VMs. The task is invoked under that working directory (so that it can call
scripts, access checkpoints, etc.).

.. note::

**Exclude files from syncing**

For large, multi-gigabyte workdirs, uploading may be slow because they
are synced to the remote VM(s). To exclude large files in
your workdir from being uploaded, add them to a :code:`.skyignore` file
under your workdir. :code:`.skyignore` follows RSYNC filter rules.

Example :code:`.skyignore` file:

.. code-block::

# Files that match pattern under ONLY CURRENT directory
/hello.py
/*.txt
/dir
# Files that match pattern under ALL directories
*.txt
hello.py
# Files that match pattern under a directory ./dir/
/dir/*.txt

Do NOT use ``.`` to indicate local directory (e.g. ``./hello.py``).

To exclude large files from being uploaded, see :ref:`exclude-uploading-files`.

.. note::

@@ -140,6 +116,33 @@ file_mount may be slow because they are processed by ``rsync``. Use
:ref:`SkyPilot bucket mounting <sky-storage>` to efficiently handle
large files.

.. _exclude-uploading-files:

Exclude uploading files
--------------------------------------
By default, SkyPilot uses your existing :code:`.gitignore` and :code:`.git/info/exclude` to exclude files from syncing.

Alternatively, you can use :code:`.skyignore` if you want to separate SkyPilot's syncing behavior from Git's.
If you use a :code:`.skyignore` file, SkyPilot will only exclude files based on that file without using the default Git files.

Any :code:`.skyignore` file under either your workdir or source paths of file_mounts is respected.

:code:`.skyignore` follows RSYNC filter rules, e.g.

.. code-block::

# Files that match pattern under CURRENT directory
/file.txt
/dir
/*.jar
/dir/*.jar
# Files that match pattern under ALL directories
*.jar
file.txt

Do _not_ use ``.`` to indicate local directory (e.g., instead of ``./file``, write ``/file``).

.. _downloading-files-and-artifacts:

Downloading files and artifacts
9 changes: 9 additions & 0 deletions docs/source/reference/config.rst
@@ -419,6 +419,15 @@ Available fields and semantics:
# Default: 'LOCAL_CREDENTIALS'.
remote_identity: LOCAL_CREDENTIALS
# Enable gVNIC (optional).
#
# Set to true to use gVNIC on GCP instances. gVNIC offers higher performance
# for multi-node clusters, but costs more.
# Reference: https://cloud.google.com/compute/docs/networking/using-gvnic
#
# Default: false.
enable_gvnic: false
# Advanced Azure configurations (optional).
# Apply to all new instances but not existing ones.
azure:
9 changes: 5 additions & 4 deletions docs/source/reference/kubernetes/kubernetes-deployment.rst
@@ -114,9 +114,9 @@ Deploying on Google Cloud GKE
# Example:
# gcloud container clusters get-credentials testcluster --region us-central1-c

3. [If using GPUs] If your GKE nodes have GPUs, you may need to
`manually install <https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/>`_
nvidia drivers. You can do so by deploying the daemonset

3. [If using GPUs] For GKE versions newer than 1.30.1-gke.115600, NVIDIA drivers are pre-installed and no additional setup is required. If you are using an older GKE version, you may need to
`manually install <https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#installing_drivers>`_
NVIDIA drivers for GPU support. You can do so by deploying the daemonset
depending on the GPU and OS on your nodes:

.. code-block:: console
@@ -133,7 +133,8 @@ Deploying on Google Cloud GKE
# For Ubuntu based nodes with L4 GPUs:
$ kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/master/nvidia-driver-installer/ubuntu/daemonset-preloaded-R525.yaml

To verify if GPU drivers are set up, run ``kubectl describe nodes`` and verify that ``nvidia.com/gpu`` is listed under the ``Capacity`` section.

.. tip::

To verify if GPU drivers are set up, run ``kubectl describe nodes`` and verify that the ``nvidia.com/gpu`` resource is listed under the ``Capacity`` section.

4. Verify your kubernetes cluster is correctly set up for SkyPilot by running :code:`sky check`:

51 changes: 51 additions & 0 deletions docs/source/reference/kubernetes/kubernetes-getting-started.rst
@@ -119,6 +119,57 @@ Once your cluster administrator has :ref:`setup a Kubernetes cluster <kubernetes
$ kubectl config set-context --current --namespace=mynamespace

Viewing cluster status
----------------------

To view the status of all SkyPilot resources in the Kubernetes cluster, run :code:`sky status --k8s`.

Unlike :code:`sky status` which lists only the SkyPilot resources launched by the current user,
:code:`sky status --k8s` lists all SkyPilot resources in the Kubernetes cluster across all users.

.. code-block:: console
$ sky status --k8s
Kubernetes cluster state (context: mycluster)
SkyPilot clusters
USER NAME LAUNCHED RESOURCES STATUS
alice infer-svc-1 23 hrs ago 1x Kubernetes(cpus=1, mem=1, {'L4': 1}) UP
alice sky-jobs-controller-80b50983 2 days ago 1x Kubernetes(cpus=4, mem=4) UP
alice sky-serve-controller-80b50983 23 hrs ago 1x Kubernetes(cpus=4, mem=4) UP
bob dev 1 day ago 1x Kubernetes(cpus=2, mem=8, {'H100': 1}) UP
bob multinode-dev 1 day ago 2x Kubernetes(cpus=2, mem=2) UP
bob sky-jobs-controller-2ea485ea 2 days ago 1x Kubernetes(cpus=4, mem=4) UP
Managed jobs
In progress tasks: 1 STARTING
USER ID TASK NAME RESOURCES SUBMITTED TOT. DURATION JOB DURATION #RECOVERIES STATUS
alice 1 - eval 1x[CPU:1+] 2 days ago 49s 8s 0 SUCCEEDED
bob 4 - pretrain 1x[H100:4] 1 day ago 1h 1m 11s 1h 14s 0 SUCCEEDED
bob 3 - bigjob 1x[CPU:16] 1 day ago 1d 21h 11m 4s - 0 STARTING
bob 2 - failjob 1x[CPU:1+] 1 day ago 54s 9s 0 FAILED
bob 1 - shortjob 1x[CPU:1+] 2 days ago 1h 1m 19s 1h 16s 0 SUCCEEDED

You can also inspect the real-time GPU usage on the cluster with :code:`sky show-gpus --cloud kubernetes`.

.. code-block:: console
$ sky show-gpus --cloud kubernetes
Kubernetes GPUs
GPU QTY_PER_NODE TOTAL_GPUS TOTAL_FREE_GPUS
L4 1, 2, 4 12 12
H100 1, 2, 4, 8 16 16
Kubernetes per node GPU availability
NODE_NAME GPU_NAME TOTAL_GPUS FREE_GPUS
my-cluster-0 L4 4 4
my-cluster-1 L4 4 4
my-cluster-2 L4 2 2
my-cluster-3 L4 2 2
my-cluster-4 H100 8 8
my-cluster-5 H100 8 8

.. _kubernetes-custom-images:

Using Custom Images
44 changes: 11 additions & 33 deletions docs/source/reference/kubernetes/kubernetes-ports.rst
@@ -59,40 +59,18 @@ To restrict your services to be accessible only within the cluster, you can set

Depending on your cloud, set the appropriate annotation in the SkyPilot config file (``~/.sky/config.yaml``):

.. tab-set::

.. tab-item:: GCP
:sync: internal-lb-gke

.. code-block:: yaml
# ~/.sky/config.yaml
kubernetes:
custom_metadata:
annotations:
networking.gke.io/load-balancer-type: "Internal"
.. tab-item:: AWS
:sync: internal-lb-aws

.. code-block:: yaml
# ~/.sky/config.yaml
kubernetes:
custom_metadata:
annotations:
service.beta.kubernetes.io/aws-load-balancer-internal: "true"
.. tab-item:: Azure
:sync: internal-lb-azure

.. code-block:: yaml
.. code-block:: yaml
# ~/.sky/config.yaml
kubernetes:
custom_metadata:
annotations:
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
# ~/.sky/config.yaml
kubernetes:
custom_metadata:
annotations:
# For GCP/GKE
networking.gke.io/load-balancer-type: "Internal"
# For AWS/EKS
service.beta.kubernetes.io/aws-load-balancer-internal: "true"
# For Azure/AKS
service.beta.kubernetes.io/azure-load-balancer-internal: "true"

.. _kubernetes-ingress: