Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix missing resource version when updating the scale subresource of custom resource #80572

Merged
merged 2 commits into from
Nov 14, 2019

Conversation

knight42
Copy link
Member

@knight42 knight42 commented Jul 25, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:
Not to clear the resource version of custom resource when saving it to etcd.

Which issue(s) this PR fixes:
Fixes #80515

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Scale custom resource unconditionally if resourceVersion is not provided

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 25, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @knight42. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knight42 knight42 changed the title fix: be able to scale CustomResource fix: be able to scale CustomResource using kubectl Jul 25, 2019
@k8s-ci-robot k8s-ci-robot added area/kubectl sig/cli Categorizes an issue or PR as relevant to SIG CLI. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 25, 2019
@knight42
Copy link
Member Author

/assign @apelisse

@apelisse
Copy link
Member

I don't think that we should be doing a GET before the request. Scale is an imperative command and unless you specify the resource-version, it should just be executed as specified.

This comment from #80515 in particular lets me think that there is something wrong:

On that note, I also noticed that if you want to update a CR by crafting the request manually and using curl, you also need to specify the resource version or you will get the same error. I'm not sure if this is expected and related, but I'm writing it down as it might be related. As far as I know, this is not required for some native resources at all, but I'm not sure is this the case for CRDs.

I'd like to understand why that is the case, I suspect it may be a bug/behavior we don't understand in the CRD handler.

@knight42
Copy link
Member Author

@apelisse It makes sense, I' ll try to deep dive into it.

@knight42 knight42 changed the title fix: be able to scale CustomResource using kubectl [WIP] fix: be able to scale CustomResource using kubectl Jul 28, 2019
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 28, 2019
@knight42
Copy link
Member Author

@apelisse I thought I have found out the root cause.

When updating an object, apiserver will check if the object could be updated unconditionally:

https://github.com/kubernetes/apiserver/blob/781c3cd1b3dc5b6f79c68ab0d16fe544600421ef/pkg/registry/generic/registry/store.go#L481-L489

If not, the validation will fail:

https://github.com/kubernetes/apiserver/blob/781c3cd1b3dc5b6f79c68ab0d16fe544600421ef/pkg/registry/generic/registry/store.go#L531-L537

And custom resource could not be updated unconditionally now:

https://github.com/kubernetes/apiextensions-apiserver/blob/102230e288fd77afcf1a6e7258eac008a891d885/pkg/registry/customresource/strategy.go#L158-L161

@apelisse Do you have a clue why custom resource could not do an unconditional update?

@apelisse
Copy link
Member

@knight42
Copy link
Member Author

knight42 commented Jul 29, 2019

@apelisse Yeah it is, but the problem is still related to https://github.com/kubernetes/apiserver/blob/781c3cd1b3dc5b6f79c68ab0d16fe544600421ef/pkg/registry/generic/registry/store.go#L531-L537.

Here is how it goes:
When the handler is handling the update request of scale subresource, the scale variable defined here is actually the request body sent from kubectl, which contains no resource version. Later the handler executes cr.SetResourceVersion(scale.ResourceVersion), the resource version of CR is still not set. As a result, r.store.Update, which invokes https://github.com/kubernetes/apiserver/blob/781c3cd1b3dc5b6f79c68ab0d16fe544600421ef/pkg/registry/generic/registry/store.go#L453, returns an error because of missing resource version

So I guess the fix may be simply removing this line:

since the CR got in the handler contains resource version. What do you think?

@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jul 31, 2019
@knight42
Copy link
Member Author

@apelisse PTAL

@knight42 knight42 changed the title [WIP] fix: be able to scale CustomResource using kubectl fix: be able to scale CustomResource using kubectl Jul 31, 2019
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jul 31, 2019
@knight42
Copy link
Member Author

/assign @lavalamp

@liggitt
Copy link
Member

liggitt commented Nov 10, 2019

one comment update, and one additional test, then this LGTM

@knight42
Copy link
Member Author

@liggitt I found that the "retry on conflicts" mechanism may not be optimal. As shown in https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/80572/pull-kubernetes-bazel-test/1193803978668773382, in extreme cases, such as frequent concurrent patches, we may keep retrying until timeout then return the last error. So I decided to switch to patching the replicas filed on the server side, how is that sound to you?

@liggitt
Copy link
Member

liggitt commented Nov 13, 2019

two comments on the test, then lgtm

@liggitt
Copy link
Member

liggitt commented Nov 14, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knight42, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 14, 2019
@liggitt
Copy link
Member

liggitt commented Nov 14, 2019

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 14, 2019
@k8s-ci-robot k8s-ci-robot merged commit a6f51da into kubernetes:master Nov 14, 2019
@knight42 knight42 deleted the fix/scale-cr branch November 14, 2019 09:40
@aermakov-zalando
Copy link

Any chance for this to be backported to 1.15/1.16? @liggitt

@liggitt
Copy link
Member

liggitt commented Jan 24, 2020

This is a more invasive change than is typically backported. Note that #81342 made it into 1.16 and modifies kubectl to use patch when scaling, which works with custom resources in 1.15/1.16 servers.

@aermakov-zalando
Copy link

@liggitt I see, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubectl cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cli Categorizes an issue or PR as relevant to SIG CLI. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kubectl scale fails to scale a CustomResource if resourceVersion is not provided
9 participants