
Warm Up Nodes Options (Hibernation) #3798

Open
abebars opened this issue Apr 24, 2023 · 14 comments
Labels
feature New feature or request

Comments

@abebars

abebars commented Apr 24, 2023

Tell us about your request

Allow Karpenter to provision extra nodes in a hibernated state, which would decrease provisioning time for new nodes during rapid scaling.

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Karpenter is excellent at optimizing cluster capacity; on the other hand, applications that require rapid scaling have to wait until new nodes are provisioned.
There is a proposal here to add headroom logic, but that still means running nodes with no workloads that we are being charged for.

Another option is to support hibernation (stopped instances) that are already bootstrapped and ready to join the cluster once needed. This capability is already supported out of the box as Warm Pools for Auto Scaling groups.
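
For reference, this is roughly how the equivalent is configured on a plain Auto Scaling group today (a CloudFormation sketch; the resource names are hypothetical):

# Warm pool of pre-initialized, hibernated instances attached to an ASG
BuildWarmPool:
  Type: AWS::AutoScaling::WarmPool
  Properties:
    AutoScalingGroupName: !Ref BuildAutoScalingGroup   # hypothetical ASG
    PoolState: Hibernated          # instances bootstrap once, then hibernate until needed
    MinSize: 2                     # always keep two warm instances on standby
    MaxGroupPreparedCapacity: 5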

Are you currently working around this issue?

Using low-priority pods, which could be less practical from a cost-saving perspective. Similar to #3240.
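
For context, that workaround looks roughly like this (a sketch of the common pause-pod overprovisioning pattern; names and sizes are illustrative):

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -1            # lower than the default (0), so these pods are preempted first
globalDefault: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning-placeholder
spec:
  replicas: 1
  selector:
    matchLabels:
      app: overprovisioning-placeholder
  template:
    metadata:
      labels:
        app: overprovisioning-placeholder
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          resources:
            requests:
              cpu: "3"        # sized to roughly one node's worth of capacity
              memory: 8Gi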

Additional Context

No response

Attachments

No response

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@abebars abebars added the feature (New feature or request) label on Apr 24, 2023
@jonathan-innis
Contributor

It seems like you may need some combination of kubernetes-sigs/karpenter#749 with an option to designate that manually provisioned capacity as a "warm pool"?

Do you know what the capacity is going to look like, and you want the warm pool to be right-sized? Or are you just looking to put some constraints on a manually provisioned warm pool, i.e. being able to manually launch Karpenter capacity as described in kubernetes-sigs/karpenter#749?

@abebars
Author

abebars commented Apr 26, 2023

It seems like you may need some combination of kubernetes-sigs/karpenter#749 with an option to designate that manually provisioned capacity as a "warm pool"?

Do you know what the capacity is going to look like, and you want the warm pool to be right-sized? Or are you just looking to put some constraints on a manually provisioned warm pool, i.e. being able to manually launch Karpenter capacity as described in kubernetes-sigs/karpenter#749?

@jonathan-innis I think having a manual node could be helpful to some extent, but it doesn't really align well with the provisioner idea unless it references the provisioner in some way.
So if we were doing a manual node, I would expect something like:

apiVersion: karpenter.sh/v1alpha5
kind: NodeGroup
metadata:
  name: default
spec:
  replicas: 2
  provisionerRef:
    name: my-provisioner

However, I am looking for something more like

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  ......
  # Resource limits constrain the total size of the cluster.
  # Limits prevent Karpenter from creating new instances once the limit is exceeded.
  limits:
    resources:
      cpu: "1000"
      memory: 1000Gi
  # Buffer is added on top of the total required capacity to ensure there is extra room for scaling.
  # It can be an absolute quantity or a percentage of the total provisioned capacity.
  buffer:
    resources:
      cpu: "10"      # or "10%"
      memory: 100Gi  # or "10%"
    warm: true # If true, buffer nodes are hibernated; otherwise they are running and in a ready state.

@jonathan-innis
Contributor

Yeah, I think this is being tracked over in #3240. Do you mind including your use-case over there? This issue looks like a duplicate of the discussion that's occurring there.

@jonathan-innis
Contributor

Closing this as a duplicate of #3240

@a7i

a7i commented Jun 14, 2023

@jonathan-innis Why was this closed as a duplicate? This issue is about an option for Karpenter similar to Warm Pools for ASGs.
The referenced duplicate is about overprovisioning.

@andrewleech

I agree that this is not a duplicate of #3240

That one is about keeping extra nodes active all the time, ready to pick up jobs.

This issue is about having some nodes (AWS instances) in a shut-down state rather than terminated, so that when a new node is needed the existing machine can be restarted rather than creating a new machine from scratch.

I use Karpenter for managing GitLab CI build machines: when a new build job comes in, it starts a new machine to run that job, then shuts the machine down again afterwards. For most of the day there are no machines running, just occasional ones started when a git commit is pushed.

Currently, I have a ~1.5 minute delay before a build job starts while the machine is created and provisioned, but at least I'm only paying while the job is running.

I'm in the process of getting started with the new Windows support for Windows build jobs; it's looking like up to 20 minutes to provision a Windows machine and pull a (rather large) Docker build image.

With #3240 I'd basically end up with at least one "warm" machine running 24/7, incurring significant cost.

With the proposal in this issue, I'd have one shut-down machine in AWS ready to restart when a job comes in, which should start up significantly faster but only incur a small storage fee while shut down.

@FernandoMiguel
Contributor

@andrewleech You can bake EBS snapshots containing the images you most frequently need and attach them to Karpenter nodes, avoiding having to download them on every new node.
That should improve your boot time considerably.
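
Roughly what that could look like on the node template (a sketch against the v1alpha5-era AWSNodeTemplate API; the snapshot ID and discovery tags are placeholders):

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: prebaked-images
spec:
  subnetSelector:
    karpenter.sh/discovery: my-cluster          # placeholder discovery tag
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        snapshotID: snap-0123456789abcdef0      # snapshot pre-baked with the common images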

@andrewleech

Thanks @FernandoMiguel, that's interesting; I didn't realise that was possible.

On Windows, I guess almost everything is based on one of two Windows base/core images, so it'd certainly be good to have those preloaded. On Linux we use a range of different things, so I'm not sure what I'd load there, but it's worth thinking about.

However, on any OS it would mean extra processes to create and maintain those snapshots (security updates, etc.).

It's definitely worth testing, at least to see how much time it saves versus the initial time to create the machine from scratch.

@andrewleech

I've tested building a custom Windows AMI (using EC2 Image Builder) for my Windows nodes, with a bunch of container images pre-pulled with crictl.

I was also able to enable EC2 Fast Launch on the image.

Using this image is faster with Karpenter, but there's still a ~6 minute start-up time.

The pod logs show the pre-pulled images are all being used, so that did help. I was really hoping for a lot faster, though.
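
For anyone trying the same thing, the pre-pull step can be expressed as an EC2 Image Builder component roughly like this (a sketch; the image names are placeholders, and it assumes containerd and crictl are already installed on the build instance):

name: prepull-build-images
description: Pre-pull large container images so nodes boot with a warm cache
schemaVersion: 1.0
phases:
  - name: build
    steps:
      - name: PullImages
        action: ExecutePowerShell
        inputs:
          commands:
            - crictl pull mcr.microsoft.com/windows/servercore:ltsc2022
            - crictl pull registry.example.com/windows-build-image:latest   # placeholder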

@jonathan-innis
Contributor

Apologies for missing the back-and-forth here and not re-opening this one earlier. You're correct that I misclassified it at first glance.

The pod logs show the pre-pulled images are all being used, so that did help. I was really hoping for a lot faster, though.

Would shut-down instances still help here, or are there other bottlenecks that you can see?

@Bryce-Soghigian
Contributor

Bryce-Soghigian commented Apr 3, 2024

Another data point: the managed cluster autoscaler on AKS has a "deallocate" scale-down mode, where rather than deleting VMs we put them in a deallocated state, which is essentially the same as hibernation. Then when you need to scale up, you wake one of the deallocated instances back up.

Jack is taking a stab at upstreaming the change here, for reference.

Some users who require 1s latency are OK paying for the OS disk, with the tradeoff that the VM will start immediately when they need it.

Would shut-down instances still help here, or are there other bottlenecks that you can see?

I am also curious about the full breakdown of the bottlenecks you are facing. If the bottleneck is image pull, hibernated instances may not save you as much time, and optimizing image pull may make more sense, like you tried, but you can probably go deeper.

Hibernated instances may save you 30-45s, but for some larger container images, such as sagemathinc/cocalc, image start time can be reduced from 405.3s to 2.9s using things like Artifact Streaming and overlaybd.
[Screenshot: container image start-time comparison with Artifact Streaming / overlaybd]

Source

Solving things at the node-bootstrapping layer addresses just one layer of potential latency. I haven't dived deep on the AWS side, but I imagine similar results are achievable by fully optimizing image pull.

@myloginid

Given the number of upvotes on this and linked issues, will this feature be made available soon?

@jtdoepke

jtdoepke commented Jun 3, 2024

Here's a blog post showing how using shut-down instances can decrease boot time: https://depot.dev/blog/faster-ec2-boot-time

I imagine something like that, combined with pre-loading images, could make adding new nodes very fast.
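
Worth noting if this goes the hibernation route rather than plain stop/start: EC2 hibernation has to be enabled when the instance launches and requires an encrypted root volume, so Karpenter would need to set it on the launch templates it creates. A CloudFormation sketch with hypothetical names:

WarmNodeLaunchTemplate:
  Type: AWS::EC2::LaunchTemplate
  Properties:
    LaunchTemplateData:
      HibernationOptions:
        Configured: true          # opt in to Stop (hibernate) at launch time
      BlockDeviceMappings:
        - DeviceName: /dev/xvda
          Ebs:
            Encrypted: true       # hibernation requires an encrypted root volume
            VolumeSize: 100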

@abebars
Author

abebars commented Jan 21, 2025

@jonathan-innis How can we make this a reality? In theory and in practice, I have seen that this works, and I have a good idea of how we may be able to execute it. What do you think the next steps should be?

It's the 2nd most-requested open feature: https://github.com/aws/karpenter-provider-aws/issues?q=is%3Aissue%20state%3Aopen%20sort%3Areactions-%2B1-desc

