Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented pod batching in the allocation/provisioner controller #503

Merged
merged 18 commits into from
Jul 14, 2021

Conversation

bwagner5
Copy link
Contributor

@bwagner5 bwagner5 commented Jul 2, 2021

Issue, if available:
#467

Description of changes:

  • This PR removes the interval reconciliation in-favor for watching pod update events and using a batcher (included in this PR) to batch a reconciliation run per provisioner.
  • The batcher uses a MaxTimeout and an IdleTimeout. I think this will create a good trade-off on reconciling small pod batches more quickly (for example, if someone scales up a deployment by 30, that will most like be seen quickly, within a second or two, the idle time will timeout the window and reconciliation will kick off in 4 seconds rather than waiting the max 10 seconds that is set for the MaxTimeOut in this PR).
  • Allocation controller now runs 4 reconciliation workers (but only once can run per provisioner at a time)
  • Removed Generic controller and converted the other controllers and tests to the new Register() method that was already done for the Expiration controller.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@ellistarn
Copy link
Contributor

ellistarn commented Jul 3, 2021

This PR also adds a new method on the GenericController to configure controllerruntime's concurrent reconciliations per controller.

FYI, I'm planning on deprecating GenericController, since our controllers have drifted significantly.

Checkout the alternative here: #498

pkg/controllers/types.go Outdated Show resolved Hide resolved
pkg/controllers/allocation/batch.go Outdated Show resolved Hide resolved
pkg/controllers/allocation/batch.go Outdated Show resolved Hide resolved
pkg/controllers/allocation/batch.go Outdated Show resolved Hide resolved
@bwagner5 bwagner5 changed the title [WIP] implement pod batching in the allocation/provisioner controller [WIP] Implemented pod batching in the allocation/provisioner controller Jul 8, 2021
@bwagner5 bwagner5 changed the title [WIP] Implemented pod batching in the allocation/provisioner controller Implemented pod batching in the allocation/provisioner controller Jul 12, 2021
cmd/controller/main.go Outdated Show resolved Hide resolved
@@ -12,17 +12,19 @@ See the License for the specific language governing permissions and
limitations under the License.
*/

package allocation
package allocation_test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change up the package here? I know it's a testing style, but it's outside of the convention of the project thus far. Should we be moving to adopt this pattern? I find it a bit of a surprising convention of golang and I haven't yet seen the value over just using the same package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a protection against using non-public fields. For example, when I changed the AllocationController to have public fields so we can configure the batcher in the tests, it caught it in it's own suite test which was more noticeable since I was working in that package. I actually thought about changing it back without the "_test", but the cloudprovider suite tests also needs to create a controller, so it really should behave the same.

In general, I think we should absolutely stick with this convention.

Copy link
Contributor

@ellistarn ellistarn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done on many fronts. I'm a bit wary of the complexity of the batcher, but I think it's necessary complexity. I'd love to see a comprehensive suite of tests exercising the batcher, but feel free to implement in a followon PR.

Otherwise just a couple naming nits here and there. Most of the batcher naming nits are just to help me wrap my mind around it.


// retrieveProvisionerFrom fetches the provisioner and returns a raw provisioner that was persisted in the api server
// and a provisioner w/ default runtime values added that should not be persisted
func (c *Controller) retrieveProvisionerFrom(ctx context.Context, req reconcile.Request) (*v1alpha2.Provisioner, *v1alpha2.Provisioner, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a fan of the convention provisionerFor(ctx, req)

ellistarn
ellistarn previously approved these changes Jul 14, 2021
@bwagner5 bwagner5 merged commit 4c17e4a into aws:main Jul 14, 2021
gfcroft pushed a commit to gfcroft/karpenter-provider-aws that referenced this pull request Nov 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Set aside pods that fail to pack to avoid continually retrying and failing to pack them.
4 participants