
Parallelizes capacity creation #518

Merged
merged 8 commits into from Jul 21, 2021

Conversation

@ellistarn ellistarn (Contributor) commented Jul 19, 2021

Description of changes:
Previously, the capacity API took multiple instances to allow cloud providers to batch create/get/delete API calls. However, this had a linearizing side effect: all instances had to be launched before Karpenter was able to bind pods. This increased the number of cases where the scheduler would schedule pods onto capacity (launched earlier in the linear loop) before Karpenter could execute the binds. Parallelizing also has the obvious side effect of making launches a lot faster (unmeasured).

There are some long term considerations to this change:

  1. Cloud providers will no longer (easily) be able to batch capacity API calls.
  2. Cloud providers will no longer be able to see multiple packings at once, which could allow for re-packing.

The primary benefits are:

  1. The Capacity API is now simpler and more natural.
  2. Launching capacity for constraint groups is now parallelized in vendor neutral code.
  3. A slow cloud provider implementation can no longer hang on linear capacity creation.
  4. More code (node object construction, parallelization) is moved out of the AWS Cloud Provider and into Karpenter's core.

Additionally, I swapped the verbosity of a few log statements.
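
For illustration, a minimal sketch of the parallel create-and-bind flow described above; the names (Packing, CloudProvider, createAndBind, bind) are illustrative stand-ins, not the actual Karpenter code.

package provisioning

import (
	"context"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/util/workqueue"
)

// Packing and CloudProvider stand in for Karpenter's real types (sketch only).
type Packing struct{ Pods []*v1.Pod }

type CloudProvider interface {
	Create(ctx context.Context, packing *Packing) (*v1.Node, error)
}

// createAndBind launches one node per packing in parallel and binds each
// packing's pods as soon as its node is created, so a slow launch no longer
// delays binds for the other packings.
func createAndBind(ctx context.Context, cloudProvider CloudProvider, packings []*Packing,
	bind func(context.Context, *v1.Node, []*v1.Pod) error) []error {
	errs := make([]error, len(packings))
	workqueue.ParallelizeUntil(ctx, len(packings), len(packings), func(i int) {
		node, err := cloudProvider.Create(ctx, packings[i])
		if err != nil {
			errs[i] = err
			return
		}
		errs[i] = bind(ctx, node, packings[i].Pods)
	})
	return errs
}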

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@ellistarn ellistarn changed the title Improve bind behavior at larger scales and simplified cloud provider API Parallelizes capacity creation to improve bind behavior at larger scales Jul 19, 2021
@ellistarn ellistarn force-pushed the tuning branch 2 times, most recently from 7e0262d to e060836 Compare July 19, 2021 01:14
@ellistarn ellistarn changed the title Parallelizes capacity creation to improve bind behavior at larger scales Parallelizes capacity creation Jul 19, 2021
}
}
return true
Contributor

Why would this function always be returning true? Wouldn't we want it to only return true if it does contain the string?

Contributor Author

The Range function continues if you return true. We want to process the entire list, so we always return true. The check for adding the LT to the list is on line 103.
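
For context, a minimal, generic illustration of the Range semantics described here, assuming a sync.Map; this is not the actual launch-template code.

package main

import (
	"fmt"
	"sync"
)

func main() {
	var cache sync.Map
	cache.Store("lt-1", "active")
	cache.Store("lt-2", "pending")

	matches := []string{}
	// Range stops iterating as soon as the callback returns false, so the
	// callback returns true unconditionally to visit every entry; any
	// filtering happens inside the callback body instead.
	cache.Range(func(key, value interface{}) bool {
		if value == "active" {
			matches = append(matches, key.(string))
		}
		return true
	})
	fmt.Println(matches) // [lt-1]
}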

return nil, fmt.Errorf("expected a single instance, got %d", len(describeInstancesOutput.Reservations[0].Instances))
}
instance := *describeInstancesOutput.Reservations[0].Instances[0]
zap.S().Infof("Launched instance: %s, type: %s, zone: %s, hostname: %s",
Contributor

Is it possible to include this outside of the cloudprovider, so that the logging logic remains the same across cloud providers?

Contributor Author

Note that this code was just moved, but unfortunately, it's a little difficult to wire this up with our current return type being a node, which doesn't hold this specific information until it actually comes online.

@@ -68,7 +68,7 @@ func (u *Utilization) markUnderutilized(ctx context.Context, provisioner *v1alph
if err := u.kubeClient.Patch(ctx, node, client.MergeFrom(persisted)); err != nil {
return fmt.Errorf("patching node %s, %w", node.Name, err)
}
zap.S().Debugf("Added TTL and label to underutilized node %s", node.Name)
zap.S().Infof("Added TTL and label to underutilized node %s", node.Name)
Contributor

I think the two log lines in this file should be Debug. I think that adding TTL and removing TTL could happen a lot, and don't think it's as important as creating and terminating an instance.

Contributor Author

Happy to discuss offline. I'm not sure this should be happening a lot, and if it is, I think it makes sense to be visible.

packedNodes, err := c.CloudProvider.Create(ctx, provisioner, packings)
// 6. Create capacity
errs := make([]error, len(packings))
workqueue.ParallelizeUntil(ctx, len(packings), len(packings), func(index int) {
Contributor

maybe we should limit the workers so that we don't get throttled immediately, 5 might be a good starting point based on https://docs.aws.amazon.com/AWSEC2/latest/APIReference/throttling.html#throttling-limits

Contributor Author

I think this limiting should happen on the cloud provider side. I don't want to artificially limit things on the karp/core side now that we have the async interface. WDYT?

Contributor

yeah definitely on CP side 👍
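
For reference, the worker cap suggested above could look something like the following on the core side; the cap of 5 comes from the reviewer's comment and is not what was ultimately merged (the throttling moved to the cloud provider instead).

// boundedWorkers caps the concurrency passed to ParallelizeUntil so that at
// most maxWorkers Create calls run at once (5 per the suggestion above).
func boundedWorkers(pieces int) int {
	const maxWorkers = 5
	if pieces < maxWorkers {
		return pieces
	}
	return maxWorkers
}

// Usage: workqueue.ParallelizeUntil(ctx, boundedWorkers(len(packings)), len(packings), ...)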

@@ -27,7 +27,7 @@ import (
// CloudProvider contains the methods necessary in a cloud provider
type CloudProvider interface {
// Create a set of nodes for each of the given constraints.
Create(context.Context, *v1alpha3.Provisioner, []*Packing) ([]*PackedNode, error)
Create(context.Context, *v1alpha3.Provisioner, *Packing) (*v1.Node, error)
Contributor

I'm concerned this limits the cloud providers' ability to do any optimization. We're kind of brute-forcing our way out of it at the controller level. At the expense of making the Create interface more complicated, would it be worth making it asynchronous with a richer response than just a v1.Node? The controller could track the actual provisioning status and drive the packings down until the desired state is reached.

Contributor

One optimization that can probably happen pretty easily in the cloud provider currently is on the usual scale up case where Create will be launching a few or a bunch of the largest nodes possible. We can probably get significantly better scaling speeds for this case if we do one request for multiple instances. We should even be able to maintain a diverse capacity request since the pod density should be the same in this situation.

Contributor

I don't want us to limit that capability as we find more cases or more capabilities are introduced to improve the cloud provider.

Contributor Author

I completely agree with you here and played with the idea of using a callback model.

Create(ctx, provisioner, packing, func(node v1.Node) error {})

In the callback, we'd execute all of the bind logic.

Can we punt on this until we're ready to make more than one request at a time? I can also play with it here.

Contributor Author

Implemented this.

Contributor

ha love it!
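
A minimal sketch of the callback-style Create interface discussed in this thread; the exact signature is an assumption based on the comment above, not necessarily what was merged.

package cloudprovider

import (
	"context"

	v1 "k8s.io/api/core/v1"
)

// Provisioner and Packing stand in for Karpenter's real types (sketch only).
type Provisioner struct{}
type Packing struct{ Pods []*v1.Pod }

// CloudProvider's Create accepts a bind callback instead of returning nodes,
// so the provider can invoke bind as each node comes up and is still free to
// batch the underlying API calls, invoking the callback once per node.
type CloudProvider interface {
	Create(ctx context.Context, provisioner *Provisioner, packing *Packing,
		bind func(*v1.Node) error) error
}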

@ellistarn ellistarn force-pushed the tuning branch 3 times, most recently from 159bd71 to 7b18a4a Compare July 21, 2021 00:29
bwagner5 previously approved these changes Jul 21, 2021
@bwagner5 bwagner5 (Contributor) left a comment

Very nice work on this!

"github.com/awslabs/karpenter/pkg/utils/project"
"go.uber.org/zap"
v1 "k8s.io/api/core/v1"
"knative.dev/pkg/apis"
)

const (
// CreationQPS limits the number of requests per second to CreateFleet
// https://docs.aws.amazon.com/AWSEC2/latest/APIReference/throttling.html#throttling-limits
CreationQPS = 2
Contributor

maybe configurable.. but we can wait and maybe figure out how to do that.. maybe

}

// NewWorkQueue instantiates a new WorkQueue
func NewWorkQueue(qps int, burst int) *WorkQueue {
Contributor

nice!
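
For illustration, one way a QPS/burst-limited work queue like this could be built on golang.org/x/time/rate; this is a sketch under that assumption, not necessarily the merged implementation.

package utils

import (
	"context"

	"golang.org/x/time/rate"
)

// WorkQueue gates work items behind a token-bucket rate limiter so that a
// burst of CreateFleet calls stays under the account's request-rate limits.
type WorkQueue struct {
	limiter *rate.Limiter
}

// NewWorkQueue instantiates a new WorkQueue allowing qps requests per second
// with the given burst size.
func NewWorkQueue(qps int, burst int) *WorkQueue {
	return &WorkQueue{limiter: rate.NewLimiter(rate.Limit(qps), burst)}
}

// Do blocks until the limiter grants a token, then runs the work item.
func (w *WorkQueue) Do(ctx context.Context, work func() error) error {
	if err := w.limiter.Wait(ctx); err != nil {
		return err
	}
	return work()
}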

@bwagner5 bwagner5 (Contributor) left a comment

lgtm

@ellistarn ellistarn merged commit cf2a869 into aws:main Jul 21, 2021
@ellistarn ellistarn deleted the tuning branch July 21, 2021 17:08