Enable setting the resource request/limits via annotations for queue-proxy side-car container #4151

Merged

Conversation

raushan2016
Contributor

…proxy side-car container

Fixes #4134

Proposed Changes

  • Allow setting the resource requests and limits for the queue-proxy sidecar container via annotations

Release Note


@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label May 23, 2019
@knative-prow-robot knative-prow-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 23, 2019
Contributor

@knative-prow-robot knative-prow-robot left a comment

@raushan2016: 0 warnings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot
Contributor

Hi @raushan2016. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


@knative-prow-robot knative-prow-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. area/API API objects and controllers labels May 23, 2019
@raushan2016
Contributor Author

raushan2016 commented May 23, 2019

@vagababov, following your comments on my last PR, could you help with these questions?

  1. Where can I add an integration test?
  2. How can we fail the Configuration CRD deployment if someone posts a CRD with an invalid resource quantity in an annotation? #Resolved

@vagababov
Contributor

Hi,
Tests are in ./test/e2e or ./test/conformance. Not sure where this would go.
As for validation, we can validate in the webhook and reject invalid values, though I don't think we do that for annotations right now.

@knative-prow-robot knative-prow-robot added the area/test-and-release It flags unit/e2e/conformance/perf test issues for product features label May 24, 2019
@raushan2016
Contributor Author

raushan2016 commented May 24, 2019

Added webhook validation.
Sample error:
Error from server (InternalError): error when creating "knativeapp.yaml": Internal error occurred: admission webhook "webhook.serving.knative.dev" denied the request: mutation failed: queue.sidecar.serving.knative.dev/limitCPU=50m is less than queue.sidecar.serving.knative.dev/requestCPU=100m: spec.template.queue.sidecar.serving.knative.dev/limitCPU, spec.template.queue.sidecar.serving.knative.dev/requestCPU

Added the integration test as well
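
The check behind the sample error above can be sketched roughly as follows. This is only an illustration built on the Go standard library: `parseMilliCPU` and `validateCPU` are hypothetical names, not functions from this PR, and the parser understands only plain core counts and the `m` suffix, whereas the actual webhook presumably relies on Kubernetes' own quantity parsing.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseMilliCPU converts a CPU quantity such as "100m" or "0.5" into
// millicores. It is a simplified, hypothetical stand-in for Kubernetes
// quantity parsing and ignores every other suffix.
func parseMilliCPU(s string) (int64, error) {
	if strings.HasSuffix(s, "m") {
		return strconv.ParseInt(strings.TrimSuffix(s, "m"), 10, 64)
	}
	cores, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, err
	}
	return int64(math.Round(cores * 1000)), nil
}

// validateCPU rejects values that do not parse and a limit below the request.
func validateCPU(request, limit string) error {
	req, err := parseMilliCPU(request)
	if err != nil {
		return fmt.Errorf("invalid requestCPU %q: %w", request, err)
	}
	lim, err := parseMilliCPU(limit)
	if err != nil {
		return fmt.Errorf("invalid limitCPU %q: %w", limit, err)
	}
	if lim < req {
		return fmt.Errorf("limitCPU=%s is less than requestCPU=%s", limit, request)
	}
	return nil
}

func main() {
	fmt.Println(validateCPU("100m", "50m"))  // rejected: limit below request
	fmt.Println(validateCPU("100m", "500m")) // accepted: prints <nil>
}
```

Comparing in a single integer unit (millicores here) avoids string comparison pitfalls such as "50m" versus "0.1".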

@raushan2016
Contributor Author

raushan2016 commented May 24, 2019

/cc @mattmoor #Resolved

@raushan2016
Contributor Author

raushan2016 commented May 24, 2019

/cc @mattmoor

As vagababov is on vacation #Resolved

@mattmoor mattmoor self-assigned this May 26, 2019
@mattmoor
Member

/hold

As discussed in slack, I have serious reservations about adding 4 annotations for this.

I feel like a more appropriate near-/medium-term solution to this would be to make the queue-proxy's allocation a simple function (e.g. fraction w/ minimum value?) of the user-specified resources. Given documented cases where this simple function is inadequate, I would consider a single annotation to control the fraction on a per-Revision basis, but leaving the gate with 4 annotations is a non-starter.

I feel like (armed with today's knowledge) the most appropriate long-term solution to this is VPA in the autoscaler.

@knative-prow-robot knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 26, 2019
@raushan2016
Contributor Author

raushan2016 commented May 28, 2019

Thanks for the feedback. I have been looking at how to fit in the fraction function. We have a multi-tenant scenario for running machine learning models, where some models are CPU-heavy and others are memory-heavy. Do you have suggestions for how to define the function so that the proxy container is not given excess resources? #Resolved

@raushan2016
Contributor Author

raushan2016 commented May 28, 2019

As discussed on Slack: https://knative.slack.com/archives/C93E33SN8/p1559064154040200?thread_ts=1558381918.374600&cid=C93E33SN8

Have a single annotation for the fraction (e.g. 0.03) of the user container's resources, with lower and upper bounds as a safeguard:
request.cpu = 0.03 of the user container, bounded to [25m, 100m]
limit.cpu = 0.03 of the user container, bounded to [40m, 500m]
request.memory = 0.03 of the user container, bounded to [50Mi, 200Mi]
limit.memory = 0.03 of the user container, bounded to [200Mi, 500Mi] #Resolved

@knative-prow-robot knative-prow-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 29, 2019
@knative-metrics-robot

The following is the coverage report on pkg/.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/apis/serving/v1alpha1/revision_validation.go | 89.8% | 88.9% | -0.9 |

@knative-metrics-robot

The following is the coverage report on pkg/.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/apis/serving/v1alpha1/revision_validation.go | 89.8% | 88.9% | -0.9 |
| pkg/reconciler/revision/resources/queue.go | 100.0% | 96.4% | -3.6 |
| pkg/reconciler/revision/resources/resourceboundary.go | Do not exist | 100.0% | |

@vagababov
Contributor

/ok-to-test

Member

@mattmoor mattmoor left a comment

/lgtm
/approve

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 4, 2019
@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mattmoor, raushan2016, vagababov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 4, 2019
@knative-prow-robot knative-prow-robot removed the lgtm Indicates that a PR is ready to be merged. label Jun 4, 2019
Member

@mattmoor mattmoor left a comment

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 4, 2019
@knative-metrics-robot

The following is the coverage report on pkg/.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/apis/serving/v1alpha1/revision_validation.go | 89.8% | 88.9% | -0.9 |
| pkg/reconciler/revision/resources/resourceboundary.go | Do not exist | 100.0% | |

@mattmoor
Member

mattmoor commented Jun 4, 2019

/retest

@knative-prow-robot knative-prow-robot merged commit 0d106c5 into knative:master Jun 4, 2019
vagababov added a commit to vagababov/serving that referenced this pull request Jun 4, 2019
Move to corev1.PodSpec now that vN-1 supports the containers field. (knative#4221)

Previously we defined our own partial PodSpec because the corev1 version
lacks `omitempty` and appears as `containers: null` in requests from generated
clients, even if unspecified, which would have broken webhook validation.

Now that the field has been out for a release, we can switch to the common
PodSpec.

Scaling Roadmap 2019 (knative#3040)

* Scaling 2019 roadmap stub.

* Descriptions for all 2019 goals.

* Goals, POCs and Github projects for each.

* Remove recap (will do later).

* Remove indent.

* Add Pluggability and HPA line item.

* Yanwei as POC for layering.

* Update docs/roadmap/scaling-2019.md

Co-Authored-By: josephburnett <[email protected]>

* Update docs/roadmap/scaling-2019.md

Co-Authored-By: josephburnett <[email protected]>

* Clarify overload handling for 0 and non-0 cases.

* Refactor cold-start goal.

* Remove POC.

* Autoscaler scalability.

* More edits.

* HPA Interation.

* Minor edits.

* Propose section on migration K8s Deployments

* Reworked parts of the Scaling roadmap.

- Unified some wording (capitalization mostly).
- Removed prescriptive key steps. These should be captured by the respective projects, which will be more dynamically changeable than this document.

Enable setting the resource request/limits via annotations for queue-proxy side-car container (knative#4151)

* Enable setting the resource request/limits via annotations for queue-proxy side-car container

* Last PR comments

* more

* added integration tests

* more

* testfix

* integrationtest

* comments

* integration test fix

* PR comments

* more

* final

* more pr comments

* added error ErrInvalidValue

* code coverage of queue.go

Remove unused constants. (knative#4238)

Update DEVELOPMENT.md (knative#4230)

Auto TLS landed in v0.6, so this documentation is out of date

golang format tools (knative#4241)

Produced via:
  `gofmt -s -w $(find -path './vendor' -prune -o -type f -name '*.go' -print)`
  `goimports -w $(find -name '*.go' | grep -v vendor)`

Move Metric interfaces into the general autoscaling package. (knative#4236)

* Move Metric interfaces into the general autoscaling package.

This used to be KPA specific but will soon be needed to be used by HPA resources as well to trigger metric collection. Decider interfaces and types stay KPA specific.

* Move the Metrics resource interface next to the metric implementation.

* Move Deciders interface for consistency.

Apply various fixes pointed out by staticcheck. (knative#4242)

* Transform string(buf.Bytes()) to buf.String().

* Remove a bunch of unused code.

* Fix error capitalization.

* Fix issue with error overlapping.

* Fix deprecated usage of Apps without version.

* Fix file permission resolution.

* Fix comparison to boolean.

* Fix issue with variable never being used.

* Remove unused conditionsets.

* Fix error checks after fixing capitalization.

* Remove unused values in performance tests.

* Remove some more unused code.

steadier state

Format markdown (knative#4240)

Produced via: `prettier --write --prose-wrap=always $(find -name '*.md' | grep -v vendor | grep -v .github)`

Drop DeprecatedName from service_test.go (knative#4243)

some junk

things work
joshrider pushed a commit to joshrider/serving that referenced this pull request Jun 10, 2019
…proxy side-car container (knative#4151)

@Iamlovingit

/retest

If I change the boundaries in pkg/reconciler/revision/resources/resourceboundary.go, which image do I need to rebuild: the queue image or the controller image?

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/test-and-release It flags unit/e2e/conformance/perf test issues for product features cla: yes Indicates the PR's author has signed the CLA. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
9 participants