Spot Instance interruption notice support #1899
Comments
I'm interested in the end product here, but curious about some of the effects of adding this polling into CAPA:
Could we perhaps do some kind of integration with https://github.com/aws/aws-node-termination-handler? This is a DaemonSet that polls the instance metadata API, which does not count against EC2 request rate limits. At a minimum, Node Termination Handler will cordon a Node and apply some labels to it. I wonder if there's some kind of configurable lifecycle mechanism we could use in CAPI to detect certain labels on Nodes. I do have mixed feelings about CAPI knowing about an AWS-specific Node label, though... 🤔 |
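For illustration, here is a rough sketch of what metadata-based polling could look like from on the instance. It assumes IMDSv2 and the `spot/instance-action` metadata path, which returns 404 until an interruption is scheduled and a small JSON document afterwards. The function names are made up for this example; this is not how Node Termination Handler or CAPA is implemented.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Optional, Tuple

# Link-local instance metadata endpoint; only reachable from the instance itself.
IMDS = "http://169.254.169.254/latest"


def parse_instance_action(body: str) -> Tuple[str, datetime]:
    """Parse the JSON served at spot/instance-action into (action, UTC time)."""
    doc = json.loads(body)
    when = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
    return doc["action"], when.replace(tzinfo=timezone.utc)


def seconds_until(when: datetime, now: datetime) -> float:
    """Time left before the scheduled action, in seconds."""
    return (when - now).total_seconds()


def fetch_instance_action(timeout: float = 2.0) -> Optional[str]:
    """Return the raw notice body, or None if no interruption is scheduled."""
    # IMDSv2: obtain a short-lived session token first.
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()

    req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no interruption notice yet
            return None
        raise
```

A caller would run `fetch_instance_action()` every few seconds and, once it returns a body, use `parse_instance_action` and `seconds_until` to decide how much time is left to cordon and drain the Node.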
I think integration with the node termination handler would be better, if only to cut down on the amount of polling of EC2 APIs. We also have #1871. |
Upstream PR kubernetes-sigs/cluster-api#3668 has discussion on how we can do this in CAPI such that nothing needs to be done here. |
Further discussions in kubernetes-sigs/cluster-api#3668 resulted in kubernetes-sigs/cluster-api#3504 and kubernetes-sigs/cluster-api#3817. The requirement for CAPA is to add AWSMachine.status.interruptible (bool), and set it to true when AWSMachine.spec.spotMarketOptions is non-nil. |
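To make the requirement concrete, here is a minimal model of the rule. CAPA itself is written in Go; these Python dataclasses are stand-ins for illustration only, and `max_price` and `reconcile_interruptible` are assumed names, not the real API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpotMarketOptions:
    # Illustrative field; an empty options object still requests a spot instance.
    max_price: Optional[str] = None


@dataclass
class AWSMachineSpec:
    spot_market_options: Optional[SpotMarketOptions] = None


@dataclass
class AWSMachineStatus:
    interruptible: bool = False


@dataclass
class AWSMachine:
    spec: AWSMachineSpec = field(default_factory=AWSMachineSpec)
    status: AWSMachineStatus = field(default_factory=AWSMachineStatus)


def reconcile_interruptible(machine: AWSMachine) -> AWSMachine:
    """Set status.interruptible to True exactly when spotMarketOptions is set."""
    machine.status.interruptible = machine.spec.spot_market_options is not None
    return machine
```

The check is on presence, not on the contents of the options: a machine with an empty `SpotMarketOptions()` is still marked interruptible, while an on-demand machine (no options at all) is not.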
/help |
@ncdc: Please ensure the request meets the requirements listed here. If this request no longer meets these requirements, the label can be removed. |
/assign |
/kind feature
Describe the solution you'd like
Spot instance support landed in #1868; however, when a spot instance is terminated, its workload is dropped without notice.
Augment the spot instance support with interruption notice polling. When an instance receives the standard two-minute interruption notice, attempt to drain it using the same lifecycle process that the provider uses when scaling down a pool.
Anything else you would like to add:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/EventTypes.html#spot-instance-event-types
Environment:
- Kubernetes version (use kubectl version): N/A
- OS (e.g. from /etc/os-release): N/A