Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Serve] Refactor replica record handling using enum #4031

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

andylizf
Copy link
Collaborator

@andylizf andylizf commented Oct 4, 2024

This PR introduces the following improvements:

  1. Replaces the unclear condition:
    if removal_reason is not None:
    with a more explicit enum-based approach.
  2. Adds a ReplicaRecordAction enum to clearly define possible actions for replica records.
  3. Refactors the logic for determining replica record actions into a separate function.
  4. Simplifies the main code flow by using the new enum and function.

These changes make the replica management process more explicit and easier to extend in the future.

Copy link
Collaborator

@cblmemo cblmemo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the refactoring @andylizf ! It is awesome. Left some discussions ;)

sky/serve/replica_managers.py Outdated Show resolved Hide resolved
sky/serve/replica_managers.py Outdated Show resolved Hide resolved
Comment on lines 824 to 828
class ReplicaRecordAction(Enum):
KEEP = auto()
REMOVE_SCALE_DOWN = auto()
REMOVE_PREEMPTION = auto()
REMOVE_VERSION_MISMATCH = auto()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class ReplicaRecordAction(Enum):
KEEP = auto()
REMOVE_SCALE_DOWN = auto()
REMOVE_PREEMPTION = auto()
REMOVE_VERSION_MISMATCH = auto()
class ReplicaRecordRemoveReason(Enum):
SCALE_DOWN = 'scale_down'
PREEMPTION = xxx
VERSION_MISMATCH = xxx

nit: how about we only keep the remove reason and return a None in the get_replica_record_action function to indicate keep it? Please also rename this function accordingly.
also, use a str for its value for more information.

@andylizf
Copy link
Collaborator Author

andylizf commented Oct 7, 2024

I will do this after the #4032 merge, and see how properly handle the relationship between the new classes ReplicaRecordAction and ReplicaStatusProperty.

Co-authored-by: Tian Xia <[email protected]>

fix nits

f
@andylizf andylizf force-pushed the enhance-replica-record-handling branch from c9651f8 to 6e6648f Compare October 11, 2024 07:44
Copy link
Contributor

github-actions bot commented Feb 9, 2025

This PR is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Feb 9, 2025
@andylizf andylizf removed the Stale label Feb 9, 2025
@cblmemo
Copy link
Collaborator

cblmemo commented Feb 10, 2025

is this still planned?

@andylizf
Copy link
Collaborator Author

is this still planned?

Yes, sure. Let me reschedule it.

@andylizf
Copy link
Collaborator Author

We definitely need a clearer ReplicaStatus and ReplicaStatusProperty. And a class representing the removal in serve_state in the proper place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants