[Core] Admin policy enforcement plugin #3966

Merged
merged 67 commits into from
Sep 24, 2024
Changes from 15 commits
cb28b8d
support policy hook
Michaelvll Sep 19, 2024
b64efa0
test task labels
Michaelvll Sep 19, 2024
cf89929
Add test for policy that sets labels
Michaelvll Sep 20, 2024
54c93ea
Fix comment
Michaelvll Sep 20, 2024
1d1c500
format
Michaelvll Sep 20, 2024
a0bdb2c
use -e to make test related files visible
Michaelvll Sep 20, 2024
543e66a
Add config.rst
Michaelvll Sep 20, 2024
520a2a1
Fix test
Michaelvll Sep 20, 2024
b533351
fix config rst
Michaelvll Sep 20, 2024
466f7fe
Apply policy to service
Michaelvll Sep 20, 2024
050dc7a
add policy for serving
Michaelvll Sep 20, 2024
31e0174
Add docs
Michaelvll Sep 20, 2024
0c74f2a
fix
Michaelvll Sep 20, 2024
48a6cc9
format
Michaelvll Sep 20, 2024
1ca5a8a
Update interface
Michaelvll Sep 20, 2024
14b2346
fix
Michaelvll Sep 21, 2024
cb39c73
Fix
Michaelvll Sep 21, 2024
1e3ddef
fix
Michaelvll Sep 21, 2024
aa87df7
Fix test config
Michaelvll Sep 21, 2024
28487a4
Fix mutated config
Michaelvll Sep 21, 2024
d1f0480
fix
Michaelvll Sep 21, 2024
f42ace5
Add policy doc
Michaelvll Sep 21, 2024
c04f3dc
rename
Michaelvll Sep 21, 2024
58f413c
minor
Michaelvll Sep 21, 2024
52053bd
Add additional arguments for autostop
Michaelvll Sep 21, 2024
4a4f682
fix mypy
Michaelvll Sep 21, 2024
a8d1c44
format
Michaelvll Sep 22, 2024
6c73d81
rejected message
Michaelvll Sep 22, 2024
247c0b8
format
Michaelvll Sep 22, 2024
f8a5a64
Update sky/utils/policy_utils.py
Michaelvll Sep 22, 2024
73a4581
Update sky/utils/policy_utils.py
Michaelvll Sep 22, 2024
d78a822
Fix
Michaelvll Sep 22, 2024
8cc963c
Merge branch 'policy-hook' of github.com:skypilot-org/skypilot into p…
Michaelvll Sep 22, 2024
68275f6
Update examples/admin_policy/example_policy/example_policy/__init__.py
Michaelvll Sep 22, 2024
9644622
Update docs/source/reference/config.rst
Michaelvll Sep 22, 2024
17f8fa1
Address comments
Michaelvll Sep 22, 2024
07c4748
format
Michaelvll Sep 22, 2024
15f1062
Merge branch 'policy-hook' of github.com:skypilot-org/skypilot into p…
Michaelvll Sep 22, 2024
994272b
changes in examples
Michaelvll Sep 22, 2024
3597dae
Fix enforce autostop
Michaelvll Sep 22, 2024
43a6088
Fix autostop enforcement
Michaelvll Sep 22, 2024
8770d0b
fix test
Michaelvll Sep 22, 2024
7984beb
Update docs/source/cloud-setup/policy.rst
Michaelvll Sep 23, 2024
d155d60
Update sky/admin_policy.py
Michaelvll Sep 23, 2024
6ffa5ae
Update sky/admin_policy.py
Michaelvll Sep 23, 2024
a6dd900
wip
Michaelvll Sep 23, 2024
4274287
Update docs/source/cloud-setup/policy.rst
Michaelvll Sep 23, 2024
0609482
Update docs/source/cloud-setup/policy.rst
Michaelvll Sep 23, 2024
67552d7
Update docs/source/cloud-setup/policy.rst
Michaelvll Sep 23, 2024
7de757e
fix
Michaelvll Sep 23, 2024
8443ddc
Merge branch 'policy-hook' of github.com:skypilot-org/skypilot into p…
Michaelvll Sep 23, 2024
7fbc30d
fix
Michaelvll Sep 23, 2024
92b68fc
fix
Michaelvll Sep 23, 2024
7d8af9a
Use sky.status for autostop
Michaelvll Sep 23, 2024
5b37f47
update policy
Michaelvll Sep 23, 2024
c7af310
Update docs/source/cloud-setup/policy.rst
Michaelvll Sep 23, 2024
cb232a8
fix policy.rst
Michaelvll Sep 23, 2024
5e9f544
Merge branch 'policy-hook' of github.com:skypilot-org/skypilot into p…
Michaelvll Sep 23, 2024
deb4c92
Add comment
Michaelvll Sep 23, 2024
cbff59d
Fix logging
Michaelvll Sep 23, 2024
1fe350a
fix CI
Michaelvll Sep 23, 2024
2e8e41c
Update docs/source/cloud-setup/policy.rst
Michaelvll Sep 23, 2024
aae42ce
Use sphnix inline code
Michaelvll Sep 23, 2024
73c8fb7
Merge branch 'policy-hook' of github.com:skypilot-org/skypilot into p…
Michaelvll Sep 23, 2024
11bbd5e
Add comment
Michaelvll Sep 23, 2024
3630535
fix skypilot config file mounts for jobs and serve
Michaelvll Sep 23, 2024
e020dea
Merge branch 'master' of github.com:skypilot-org/skypilot into policy…
Michaelvll Sep 23, 2024
169 changes: 169 additions & 0 deletions docs/source/cloud-setup/policy.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,169 @@
.. _advanced-policy-config:

Admin Policy Enforcement
========================


SkyPilot allows admins to enforce policies on users' SkyPilot usage by applying
custom validation and mutation logic to users' tasks and SkyPilot config.

In short, an admin provides a Python package containing a custom implementation of SkyPilot's
``AdminPolicy`` interface, and a user only needs to set the ``admin_policy`` field in
the SkyPilot config ``~/.sky/config.yaml`` to apply the policy to all of their
tasks.

Overview
--------



User-Side
~~~~~~~~~~

To apply the policy, a user needs to set the ``admin_policy`` field in the SkyPilot config
``~/.sky/config.yaml`` to the import path of the Python class that implements the policy.
For example:

.. code-block:: yaml

    admin_policy: mypackage.subpackage.MyPolicy


.. hint::

    SkyPilot loads the policy from the given package in the same Python environment.
    You can check that the policy is importable by running:

    .. code-block:: bash

        python -c "from mypackage.subpackage import MyPolicy"
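
As a rough illustration of what that import check exercises, a dotted path such as ``mypackage.subpackage.MyPolicy`` can be split into a module path and a class name and resolved with ``importlib``. The helper name ``load_policy_class`` is hypothetical and this is not SkyPilot's actual loader; the demo resolves a standard-library class so it runs without any policy package installed.

```python
import importlib


def load_policy_class(dotted_path: str) -> type:
    """Resolve 'pkg.module.ClassName' to the class object (hypothetical helper)."""
    module_path, _, class_name = dotted_path.rpartition('.')
    if not module_path:
        raise ValueError(f'Expected "module.ClassName", got {dotted_path!r}')
    # Raises ImportError if the package is missing from this environment.
    module = importlib.import_module(module_path)
    try:
        return getattr(module, class_name)
    except AttributeError as e:
        raise ImportError(
            f'{class_name!r} not found in module {module_path!r}') from e


if __name__ == '__main__':
    # Demo with a standard-library class instead of a real policy package.
    print(load_policy_class('collections.OrderedDict'))
```

Failing early with ``ImportError`` in both the missing-module and missing-class cases keeps the error close to the misconfigured ``admin_policy`` value.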


Admin-Side
~~~~~~~~~~

An admin can distribute a Python package containing a predefined policy to users. The
policy should implement the following interface:

.. code-block:: python

    import sky

    class MyPolicy(sky.AdminPolicy):
        @classmethod
        def validate_and_mutate(cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest:
            # Logic for validating and mutating user requests.
            ...
            return sky.MutatedUserRequest(user_request.task,
                                          user_request.skypilot_config)


``UserRequest`` and ``MutatedUserRequest`` are defined as follows:

.. code-block:: python

    class UserRequest:
        task: sky.Task
        skypilot_config: sky.NestedConfig
        operation_args: sky.OperationArgs

    class MutatedUserRequest:
        task: sky.Task
        skypilot_config: sky.NestedConfig

In other words, an ``AdminPolicy`` can mutate any field of a user request, including
the :ref:`task <yaml-spec>` and the :ref:`global skypilot config <config-yaml>`,
giving admins a lot of flexibility to control users' SkyPilot usage.

An ``AdminPolicy`` is responsible for both validating and mutating user requests. If
a request should be rejected, the policy should raise an exception.
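
The validate-then-mutate contract can be sketched without SkyPilot installed. In the sketch below, ``FakeRequest``, ``FakeMutatedRequest``, and ``RequireNamePolicy`` are hypothetical stand-ins (plain dicts replace ``sky.Task`` and ``sky.NestedConfig``); only the shape of ``validate_and_mutate`` mirrors the interface above.

```python
import copy
import dataclasses
from typing import Any, Dict


@dataclasses.dataclass
class FakeRequest:
    """Stand-in for sky.UserRequest; plain dicts replace Task and config."""
    task: Dict[str, Any]
    skypilot_config: Dict[str, Any]


@dataclasses.dataclass
class FakeMutatedRequest:
    """Stand-in for sky.MutatedUserRequest."""
    task: Dict[str, Any]
    skypilot_config: Dict[str, Any]


class RequireNamePolicy:
    """Hypothetical policy: reject unnamed tasks, then tag the config."""

    @classmethod
    def validate_and_mutate(cls, request: FakeRequest) -> FakeMutatedRequest:
        # Validation: raising an exception rejects the request.
        if not request.task.get('name'):
            raise RuntimeError('Every task must have a name.')
        # Mutation: work on a copy so the caller's config is left untouched.
        config = copy.deepcopy(request.skypilot_config)
        config['managed_by_policy'] = True
        return FakeMutatedRequest(task=request.task, skypilot_config=config)


if __name__ == '__main__':
    ok = FakeRequest(task={'name': 'train'}, skypilot_config={})
    print(RequireNamePolicy.validate_and_mutate(ok).skypilot_config)
```

Copying the config before mutating it is a design choice in this sketch: the original request stays intact, so a rejected or retried request never sees a half-applied mutation.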


Example Policies
----------------

Reject All
~~~~~~~~~~

.. code-block:: python

    class RejectAllPolicy(sky.AdminPolicy):
        @classmethod
        def validate_and_mutate(cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest:
            raise RuntimeError("This policy rejects all user requests.")

.. code-block:: yaml

    admin_policy: examples.admin_policy.reject_all.RejectAllPolicy


Add Kubernetes Labels for all Tasks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    class AddLabelsPolicy(sky.AdminPolicy):
        @classmethod
        def validate_and_mutate(cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest:
            config = user_request.skypilot_config
            labels = config.get_nested(('kubernetes', 'labels'), {})
            labels['app'] = 'skypilot'
            config.set_nested(('kubernetes', 'labels'), labels)
            return sky.MutatedUserRequest(user_request.task, config)

.. code-block:: yaml

    admin_policy: examples.admin_policy.add_labels.AddLabelsPolicy
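
The ``get_nested``/``set_nested`` calls above address a value by a tuple of keys. A minimal sketch of that addressing scheme on plain dictionaries follows; the two functions are hypothetical stand-ins written for illustration and are not ``sky.NestedConfig``'s implementation.

```python
from typing import Any, Dict, Tuple


def get_nested(config: Dict[str, Any], keys: Tuple[str, ...],
               default: Any = None) -> Any:
    """Return the value at the key path, or default if any key is missing."""
    current: Any = config
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def set_nested(config: Dict[str, Any], keys: Tuple[str, ...],
               value: Any) -> None:
    """Set the value at the key path, creating intermediate dicts as needed."""
    current = config
    for key in keys[:-1]:
        current = current.setdefault(key, {})
    current[keys[-1]] = value


if __name__ == '__main__':
    config: Dict[str, Any] = {'kubernetes': {'labels': {'team': 'ml'}}}
    labels = get_nested(config, ('kubernetes', 'labels'), {})
    labels['app'] = 'skypilot'
    set_nested(config, ('kubernetes', 'labels'), labels)
    print(config)
```

Passing ``{}`` as the default means a config without any ``kubernetes.labels`` entry still ends up with the enforced label after ``set_nested``.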


Always Disable Public IP for AWS Tasks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    class DisablePublicIPPolicy(sky.AdminPolicy):
        @classmethod
        def validate_and_mutate(cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest:
            config = user_request.skypilot_config
            config.set_nested(('aws', 'use_internal_ip'), True)
            if config.get_nested(('aws', 'vpc_name'), None) is None:
                # If no VPC name is specified, it is likely a mistake. We should
                # reject the request
                raise RuntimeError('VPC name should be set. Check organization '
                                   'wiki for more information.')
            return sky.MutatedUserRequest(user_request.task, config)

.. code-block:: yaml

    admin_policy: examples.admin_policy.disable_public_ip.DisablePublicIPPolicy


Enforce Autostop for all Tasks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    class EnforceAutostopPolicy(sky.AdminPolicy):
        @classmethod
        def validate_and_mutate(cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest:
            operation_args = user_request.operation_args
            # Operation args can be None for jobs and services, for which we
            # don't need to enforce autostop, as they are already managed.
            if operation_args is None:
                return sky.MutatedUserRequest(
                    task=user_request.task,
                    skypilot_config=user_request.skypilot_config)
            idle_minutes_to_autostop = operation_args.idle_minutes_to_autostop
            # Enforce autostop/down to be set for all tasks for new clusters.
            if not operation_args.cluster_exists and (
                    idle_minutes_to_autostop is None or
                    idle_minutes_to_autostop < 0):
                raise RuntimeError('Autostop/down must be set for all newly '
                                   'launched clusters.')
            return sky.MutatedUserRequest(
                task=user_request.task,
                skypilot_config=user_request.skypilot_config)

.. code-block:: yaml

    admin_policy: examples.admin_policy.enforce_autostop.EnforceAutostopPolicy
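
The autostop rule reduces to a small predicate over two fields. The sketch below isolates it using a hypothetical ``FakeOperationArgs`` dataclass in place of ``sky.OperationArgs`` (only the two fields read by the policy above are modeled), so the accepted and rejected cases can be checked without launching anything.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class FakeOperationArgs:
    """Stand-in for sky.OperationArgs with only the fields used here."""
    cluster_exists: bool
    idle_minutes_to_autostop: Optional[int]


def violates_autostop(args: Optional[FakeOperationArgs]) -> bool:
    """Return True if the request should be rejected by the autostop rule."""
    # No operation args (e.g. managed jobs and services): nothing to enforce.
    if args is None:
        return False
    # Existing clusters are not re-checked; only new launches are.
    if args.cluster_exists:
        return False
    idle = args.idle_minutes_to_autostop
    return idle is None or idle < 0


if __name__ == '__main__':
    print(violates_autostop(FakeOperationArgs(False, None)))
    print(violates_autostop(FakeOperationArgs(False, 10)))
```

Keeping the rule as a pure function like this makes it straightforward to unit test before wiring it into a policy class.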
3 changes: 2 additions & 1 deletion docs/source/docs/index.rst
@@ -201,7 +201,8 @@ Read the research:
../cloud-setup/cloud-permissions/index
../cloud-setup/cloud-auth
../cloud-setup/quota

../cloud-setup/policy

.. toctree::
:hidden:
:maxdepth: 1
6 changes: 3 additions & 3 deletions docs/source/reference/config.rst
@@ -89,11 +89,11 @@ Available fields and semantics:

 # Custom policy to be applied to all tasks.
 #
-# The policy function to be applied and mutate all tasks, which can be used to
+# The policy class to be applied and mutate all tasks, which can be used to
 # enforce certain policies on all tasks.
 #
-# See details in: <TODO: add link to policy docs>
-policy: my_package.skypilot_policy_fn_v1
+# The policy class should implement the sky.AdminPolicy interface.
+admin_policy: my_package.SkyPilotPolicyV1

 # Advanced AWS configurations (optional).
 # Apply to all new instances but not existing ones.
1 change: 1 addition & 0 deletions examples/admin_policy/config_label_config.yaml
@@ -0,0 +1 @@
admin_policy: example_policy.ConfigLabelPolicy
1 change: 1 addition & 0 deletions examples/admin_policy/enforce_autostop.yaml
@@ -0,0 +1 @@
admin_policy: example_policy.EnforceAutostopPolicy
@@ -0,0 +1,6 @@
"""Example module for SkyPilot admin policies."""

from example_policy.skypilot_policy import ConfigLabelPolicy
from example_policy.skypilot_policy import EnforceAutostopPolicy
from example_policy.skypilot_policy import RejectAllPolicy
from example_policy.skypilot_policy import TaskLabelPolicy
@@ -0,0 +1,74 @@
import copy
import getpass

import sky


class TaskLabelPolicy(sky.AdminPolicy):
    """Example policy: add label for task with the local user name."""

    @classmethod
    def validate_and_mutate(
            cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest:
        """Add label for task with the local user name."""
        local_user_name = getpass.getuser()

        # Add label for task with the local user name
        task = user_request.task
        for r in task.resources:
            r.labels['local_user'] = local_user_name

        return sky.MutatedUserRequest(
            task=task, skypilot_config=user_request.skypilot_config)


class ConfigLabelPolicy(sky.AdminPolicy):
    """Example policy: add label for skypilot_config with the local user name."""

    @classmethod
    def validate_and_mutate(
            cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest:
        """Add label for skypilot_config with the local user name."""
        local_user_name = getpass.getuser()

        # Add label for skypilot_config with the local user name
        skypilot_config = copy.deepcopy(user_request.skypilot_config)
        skypilot_config.set_nested(('gcp', 'labels', 'local_user'),
                                   local_user_name)
        return sky.MutatedUserRequest(task=user_request.task,
                                      skypilot_config=skypilot_config)


class RejectAllPolicy(sky.AdminPolicy):
    """Example policy: reject all user requests."""

    @classmethod
    def validate_and_mutate(
            cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest:
        """Reject all user requests."""
        del user_request
        raise RuntimeError('Reject all policy')


class EnforceAutostopPolicy(sky.AdminPolicy):
    """Example policy: enforce autostop for all tasks."""

    @classmethod
    def validate_and_mutate(
            cls, user_request: sky.UserRequest) -> sky.MutatedUserRequest:
        """Enforce autostop for all tasks."""
        operation_args = user_request.operation_args
        if operation_args is None:
            return sky.MutatedUserRequest(
                task=user_request.task,
                skypilot_config=user_request.skypilot_config)
Member

This is surprising. Does operation_args == None mean non sky launch? Need comments here and in the dataclass wrapping operation_args.

Collaborator Author

It is in the UserRequest's comment. Adding it to doc as well.

        idle_minutes_to_autostop = operation_args.idle_minutes_to_autostop
        # Enforce autostop/down to be set for all tasks for new clusters.
        if not operation_args.cluster_exists and (
                idle_minutes_to_autostop is None or
                idle_minutes_to_autostop < 0):
            raise RuntimeError('Autostop/down must be set for all newly '
                               'launched clusters.')
        return sky.MutatedUserRequest(
            task=user_request.task,
            skypilot_config=user_request.skypilot_config)
1 change: 1 addition & 0 deletions examples/admin_policy/reject_all_config.yaml
@@ -0,0 +1 @@
admin_policy: example_policy.RejectAllPolicy
File renamed without changes.
1 change: 1 addition & 0 deletions examples/admin_policy/task_label_config.yaml
@@ -0,0 +1 @@
admin_policy: example_policy.TaskLabelPolicy
1 change: 0 additions & 1 deletion examples/policy/config_label_config.yaml

This file was deleted.

2 changes: 0 additions & 2 deletions examples/policy/example_policy/example_policy/__init__.py

This file was deleted.

31 changes: 0 additions & 31 deletions examples/policy/example_policy/example_policy/skypilot_policy.py

This file was deleted.

1 change: 0 additions & 1 deletion examples/policy/task_label_config.yaml

This file was deleted.

16 changes: 12 additions & 4 deletions sky/__init__.py
@@ -110,10 +110,14 @@ def set_proxy_env_var(proxy_var: str, urllib_var: Optional[str]):
 from sky.jobs.core import spot_tail_logs
 from sky.optimizer import Optimizer
 from sky.optimizer import OptimizeTarget
-from sky.policy import MutatedUserTask
-from sky.policy import UserTask
+# Admin Policy interfaces
+from sky.policy import AdminPolicy
+from sky.policy import MutatedUserRequest
+from sky.policy import OperationArgs
+from sky.policy import UserRequest
 from sky.resources import Resources
 from sky.skylet.job_lib import JobStatus
+from sky.skypilot_config import NestedConfig
 from sky.status_lib import ClusterStatus
 from sky.task import Task

@@ -187,6 +191,10 @@ def set_proxy_env_var(proxy_var: str, urllib_var: Optional[str]):
     # core APIs Storage Management
     'storage_ls',
     'storage_delete',
-    'UserTask',
-    'MutatedUserTask',
+    # Admin Policy
+    'UserRequest',
+    'MutatedUserRequest',
+    'AdminPolicy',
+    'NestedConfig',
+    'OperationArgs',
 ]
5 changes: 5 additions & 0 deletions sky/exceptions.py
@@ -286,3 +286,8 @@ class ServeUserTerminatedError(Exception):

 class PortDoesNotExistError(Exception):
     """Raised when the port does not exist."""
+
+
+class UserRequestRejectedByPolicy(Exception):
+    """Raised when a user request is rejected by policy."""
+    pass
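
One way a caller could surface policy failures uniformly is to catch whatever the policy raises and re-raise it as ``UserRequestRejectedByPolicy``. The sketch below assumes this wrapping behavior; ``apply_policy`` and its signature are hypothetical, the exception class is redefined locally so the example is self-contained, and SkyPilot's actual call site is not shown in this diff.

```python
from typing import Any, Callable


class UserRequestRejectedByPolicy(Exception):
    """Local copy of the exception above, so the sketch runs standalone."""


def apply_policy(validate_and_mutate: Callable[[Any], Any],
                 user_request: Any) -> Any:
    """Run a policy callable; convert any failure into a rejection error."""
    try:
        return validate_and_mutate(user_request)
    except Exception as e:  # A policy may raise any exception type.
        raise UserRequestRejectedByPolicy(
            f'Request rejected by policy: {e}') from e


def reject_all(user_request: Any) -> Any:
    """Hypothetical policy callable that rejects every request."""
    del user_request  # Unused.
    raise RuntimeError('Reject all policy')


if __name__ == '__main__':
    try:
        apply_policy(reject_all, user_request={})
    except UserRequestRejectedByPolicy as e:
        print(e)
```

Chaining with ``from e`` preserves the policy's original exception as ``__cause__``, which keeps the admin-written message visible in tracebacks.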