Parallelizes capacity creation #518
Conversation
Force-pushed from 7e0262d to e060836
}
}
return true
Why would this function always be returning true? Wouldn't we want it to only return true if it does contain the string?
The Range function continues if you return true. We want to process the entire list, so we always return true. The check for adding the LT to the list is on line 103.
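For readers unfamiliar with this iteration pattern: the callback passed to Range keeps iterating only while it returns true, so returning true unconditionally simply means "visit every entry." A minimal sketch, assuming the list lives in a sync.Map; the key names and the "karpenter" substring check are illustrative, not the PR's actual code:

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

func main() {
	var templates sync.Map
	templates.Store("karpenter-lt-a", 1)
	templates.Store("other-lt", 2)

	matches := []string{}
	templates.Range(func(key, value interface{}) bool {
		// Only matching keys are collected...
		if strings.Contains(key.(string), "karpenter") {
			matches = append(matches, key.(string))
		}
		// ...but we always return true so Range keeps going and visits every entry.
		return true
	})
	fmt.Println(matches)
}
```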
return nil, fmt.Errorf("expected a single instance, got %d", len(describeInstancesOutput.Reservations[0].Instances))
}
instance := *describeInstancesOutput.Reservations[0].Instances[0]
zap.S().Infof("Launched instance: %s, type: %s, zone: %s, hostname: %s",
Is it possible to include this outside of the cloudprovider, so that the logging logic remains the same across cloud providers?
Note that this code was just moved, but unfortunately, it's a little difficult to wire this up with our current return type being a node, which doesn't hold this specific information until it actually comes online.
@@ -68,7 +68,7 @@ func (u *Utilization) markUnderutilized(ctx context.Context, provisioner *v1alph
if err := u.kubeClient.Patch(ctx, node, client.MergeFrom(persisted)); err != nil {
return fmt.Errorf("patching node %s, %w", node.Name, err)
}
- zap.S().Debugf("Added TTL and label to underutilized node %s", node.Name)
+ zap.S().Infof("Added TTL and label to underutilized node %s", node.Name)
I think the two log lines in this file should be Debug. I think that adding TTL and removing TTL could happen a lot, and don't think it's as important as creating and terminating an instance.
Happy to discuss offline. I'm not sure this should be happening a lot, and if it is, I think it makes sense to be visible.
packedNodes, err := c.CloudProvider.Create(ctx, provisioner, packings)
// 6. Create capacity
errs := make([]error, len(packings))
workqueue.ParallelizeUntil(ctx, len(packings), len(packings), func(index int) {
maybe we should limit the workers so that we don't get throttled immediately, 5 might be a good starting point based on https://docs.aws.amazon.com/AWSEC2/latest/APIReference/throttling.html#throttling-limits
I think this limiting should happen on the cloud provider side. I don't want to artificially limit things on the karp/core side now that we have the async interface. WDYT?
yeah definitely on CP side 👍
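For context, the diff above fans the packings out with client-go's workqueue.ParallelizeUntil and collects per-packing errors. Here is a minimal, self-contained sketch of that pattern; the createAll/create helpers and the packing names are illustrative, not the PR's exact code:

```go
package main

import (
	"context"
	"fmt"

	utilerrors "k8s.io/apimachinery/pkg/util/errors"
	"k8s.io/client-go/util/workqueue"
)

// createAll launches capacity for every packing in parallel and aggregates errors.
func createAll(ctx context.Context, packings []string) error {
	errs := make([]error, len(packings))
	// One worker per packing; rate limiting is left to the cloud provider.
	workqueue.ParallelizeUntil(ctx, len(packings), len(packings), func(index int) {
		errs[index] = create(ctx, packings[index]) // hypothetical per-packing create
	})
	return utilerrors.NewAggregate(errs)
}

// create stands in for the cloud provider's Create call in this sketch.
func create(ctx context.Context, packing string) error {
	fmt.Println("creating capacity for", packing)
	return nil
}

func main() {
	_ = createAll(context.Background(), []string{"packing-a", "packing-b"})
}
```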
pkg/cloudprovider/types.go (Outdated)
@@ -27,7 +27,7 @@ import (
// CloudProvider holds contains the methods necessary in a cloud provider
type CloudProvider interface {
// Create a set of nodes for each of the given constraints.
- Create(context.Context, *v1alpha3.Provisioner, []*Packing) ([]*PackedNode, error)
+ Create(context.Context, *v1alpha3.Provisioner, *Packing) (*v1.Node, error)
I'm concerned this limits the cloud providers' ability to do any optimization. We're kind of brute-forcing our way out of it at the controller level. At the expense of making the Create interface more complicated, would it be worth making it asynchronous with a richer response than just a v1.Node? The controller could track the actual provisioning status and drive the packings down until the desired state is reached.
One optimization that can probably happen pretty easily in the cloud provider currently is on the usual scale up case where Create will be launching a few or a bunch of the largest nodes possible. We can probably get significantly better scaling speeds for this case if we do one request for multiple instances. We should even be able to maintain a diverse capacity request since the pod density should be the same in this situation.
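As a rough illustration of that idea, a single CreateFleet request can ask for several instances at once while staying diversified across instance types via overrides. This sketch uses the aws-sdk-go EC2 API; the launch template name, capacity count, and instance types are placeholders, not values from this PR:

```go
package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	input := &ec2.CreateFleetInput{
		Type: aws.String(ec2.FleetTypeInstant),
		TargetCapacitySpecification: &ec2.TargetCapacitySpecificationRequest{
			DefaultTargetCapacityType: aws.String(ec2.DefaultTargetCapacityTypeOnDemand),
			TotalTargetCapacity:       aws.Int64(5), // launch 5 nodes in one call
		},
		LaunchTemplateConfigs: []*ec2.FleetLaunchTemplateConfigRequest{{
			LaunchTemplateSpecification: &ec2.FleetLaunchTemplateSpecificationRequest{
				LaunchTemplateName: aws.String("example-launch-template"), // placeholder
				Version:            aws.String("$Latest"),
			},
			// Multiple overrides keep the request diversified across instance types.
			Overrides: []*ec2.FleetLaunchTemplateOverridesRequest{
				{InstanceType: aws.String("m5.xlarge")},
				{InstanceType: aws.String("m5a.xlarge")},
			},
		}},
	}
	fmt.Println(input)
}
```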
I don't want us to limit that capability as we find more cases or more capabilities are introduced to improve the cloud provider.
I completely agree with you here and played with the idea of using a callback model.
Create(ctx, provisioner, packing, func(node v1.Node) error {})
In the callback, we'd execute all of the bind logic.
Can we punt on this until we're ready to make more than one request at a time? I can also play with it here.
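For illustration, a callback-style Create could look roughly like the sketch below. The type names and the fake provider are assumptions made to keep the example runnable, not the interface that actually landed:

```go
package main

import (
	"context"
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Packing stands in for the real karpenter packing type.
type Packing struct{ InstanceTypes []string }

// CloudProvider creates capacity for a packing and calls bind for each node as
// soon as it comes up, so the controller can bind pods without waiting for the
// whole batch to finish.
type CloudProvider interface {
	Create(ctx context.Context, packing *Packing, bind func(*v1.Node) error) error
}

type fakeProvider struct{}

func (f *fakeProvider) Create(ctx context.Context, packing *Packing, bind func(*v1.Node) error) error {
	node := &v1.Node{ObjectMeta: metav1.ObjectMeta{Name: "fake-node"}}
	// The bind logic runs inside the callback, per node, instead of after the
	// entire list of nodes has been returned.
	return bind(node)
}

func main() {
	provider := &fakeProvider{}
	_ = provider.Create(context.Background(), &Packing{}, func(node *v1.Node) error {
		fmt.Println("binding pods to", node.Name)
		return nil
	})
}
```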
Implemented this.
ha love it!
Force-pushed from 159bd71 to 7b18a4a
Very nice work on this!
"github.com/awslabs/karpenter/pkg/utils/project" | ||
"go.uber.org/zap" | ||
v1 "k8s.io/api/core/v1" | ||
"knative.dev/pkg/apis" | ||
) | ||
|
||
const ( | ||
// CreationQPS limits the number of requests per second to CreateFleet | ||
// https://docs.aws.amazon.com/AWSEC2/latest/APIReference/throttling.html#throttling-limits | ||
CreationQPS = 2 |
maybe configurable.. but we can wait and maybe figure out how to do that.. maybe
}

// NewWorkQueue instantiates a new WorkQueue
func NewWorkQueue(qps int, burst int) *WorkQueue {
nice!
lgtm
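For readers who want to see how a QPS/burst-limited queue like the one above could be built, here is a hedged sketch using golang.org/x/time/rate; the Add method and the field names are assumptions, not necessarily the PR's implementation:

```go
package main

import (
	"context"
	"fmt"

	"golang.org/x/time/rate"
)

// WorkQueue serializes work behind a token-bucket rate limiter.
type WorkQueue struct {
	limiter *rate.Limiter
}

// NewWorkQueue instantiates a new WorkQueue allowing qps requests per second
// with the given burst.
func NewWorkQueue(qps int, burst int) *WorkQueue {
	return &WorkQueue{limiter: rate.NewLimiter(rate.Limit(qps), burst)}
}

// Add blocks until the limiter grants a token, then runs the work item.
func (w *WorkQueue) Add(ctx context.Context, do func() error) error {
	if err := w.limiter.Wait(ctx); err != nil {
		return err
	}
	return do()
}

func main() {
	queue := NewWorkQueue(2, 2) // e.g. CreationQPS = 2 for CreateFleet
	for i := 0; i < 4; i++ {
		_ = queue.Add(context.Background(), func() error {
			fmt.Println("CreateFleet call", i)
			return nil
		})
	}
}
```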
Description of changes:
Previously, the capacity API took multiple instances so that cloud providers could batch create/get/delete API calls. However, this had a linearizing side effect: all instances had to be launched before Karpenter could bind pods. This increased the number of cases where the scheduler would schedule pods onto capacity (launched by the linear loop) before Karpenter could execute the binds. Parallelizing also has the obvious side effect of making things launch a lot faster (unmeasured).
There are some long-term considerations to this change:
The primary benefits are:
Additionally, I swapped the verbosity of a few log statements.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.