-
Notifications
You must be signed in to change notification settings - Fork 539
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SkyServe] Support min_replicas = 0
#2938
Conversation
min_replicas = 0
min_replicas = 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this feature! Left some comments 🫡
@cblmemo Thanks for the reviews! PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick fix! Some more comments ;)
btw, could you update the tests you run? Please make sure the case mentioned in #2893 does not happen after this PR 🫡
@cblmemo Thanks! Added an example yaml. Checked that a curl will trigger immediate scale up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick fix! It is mostly ready for me, except for some nits.
# The endpoint will be printed in the console. | ||
# Querying the endpoint will trigger a scale up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add docstr for how to make the replicas to 0, i.e. don;t send any traffic for how many time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Related issue: #2893
Add a new status
NO_REPLICAS
.Lower autoscaler decision interval when
target_num_replicas = 0
Tested (run the relevant ones):
bash format.sh
sky serve up examples/serve/min_replicas_zero.yaml