Do not fail instantly on CRD resolution error when handling a manifest #1506

Closed
wants to merge 1 commit

Conversation


@booti386 booti386 commented Nov 23, 2021

Description

CRD resolution failure can happen, e.g. when the service account does not have list access to the CRD API.
However, this should not prevent the provider from trying to resolve the resource type using the cluster's OpenAPI document.
This is fixed by moving the error check to after the OpenAPI attempt.

Acceptance tests

  • Have you added an acceptance test for the functionality being added?
  • Have you run the acceptance tests on this branch?

Output from acceptance testing:

$ make testacc TESTARGS='-run=TestAccXXX'

...

Release Note

Release note for CHANGELOG:

...

References

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

@hashicorp-cla

hashicorp-cla commented Nov 23, 2021

CLA assistant check
All committers have signed the CLA.

@booti386 booti386 changed the title from "Do not fail on CRD resolution error when handling a manifest" to "Do not fail instantly on CRD resolution error when handling a manifest" Nov 23, 2021
@alexsomesan
Member

alexsomesan commented Nov 23, 2021

Hi,

Thanks for raising this issue. I can see how permissions to list CRDs should not be required as long as one doesn't actually want to manage CRDs.
However, the changes proposed in this PR would have unexpected side-effects when used on CRDs with a service account that doesn't have list permissions. CRDs are also included in the OpenAPI document, but their schema is imprecisely translated to OpenAPI v2.
In that case, if I understand this correctly, the lookup for the CRD will fail, but the resource would then be found in the OpenAPI document, and the provider would end up using the incomplete schema expressed in OpenAPI v2. This would cause the same resource to have a different structure depending on the permissions of the calling identity, which is something we obviously cannot let happen.

Unfortunately, even if we were to distinguish on the error type returned by the CRD lookup (to handle the missing-permissions error specifically), we would still be missing the information about whether the resource is a CRD or not. We cannot just continue and extract the OpenAPI entry for it if we don't know for certain that it's not a CRD (for the reasons mentioned above).

At this time, I don't really see a way forward without those permissions being available. Let me know if you have some other options in mind.

@booti386
Author

Then the only thing I can think of is to add a flag to explicitly define whether the manifest refers to a CRD or not.
It would also have the advantage of making the manifest schema more consistent. (I was a bit confused until I understood that it can refer to both CRD and non-CRD resources.)

@alexsomesan
Member

The documentation for the kubernetes_manifest resource doesn't mention it being specific to CRDs. On the contrary, it includes an example of using it to create a ConfigMap as well as a CRD. We'll try to emphasize that more, to avoid this kind of confusion.
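
For reference, a minimal configuration along the lines of the documented ConfigMap example looks like this (the names here are placeholders):

resource "kubernetes_manifest" "example_configmap" {
  manifest = {
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      name      = "example-config"
      namespace = "default"
    }
    data = {
      foo = "bar"
    }
  }
}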

With regards to the flag you mentioned, I'm afraid that would degrade the user experience more than it would help. For example, we expect users to use automation tools like https://github.com/jrhouston/tfk8s to convert batches of vanilla Kubernetes manifests to Terraform and adding such a flag would complicate this process unnecessarily.

While I'm thinking of alternatives, would you mind explaining in a bit more detail why you're unable to set permissions to list the CRDs in the cluster? They're schema definitions and usually don't carry sensitive information. I'm trying to understand the concerns behind not allowing (read-only) permissions to list them.

@booti386
Author

booti386 commented Nov 26, 2021

The documentation for the kubernetes_manifest resource doesn't mention it being specific to CRDs. On the contrary, it includes an example of using it to create a ConfigMap as well as a CRD. We'll try to emphasize that more, to avoid this kind of confusion.

After proofreading the doc, I think it is actually OK; it was more the unexpected error message that caused confusion (it puzzled me for a while that I got an error message about the CRD endpoint while I was not using CRDs at all, until I read the source code).

With regards to the flag you mentioned, I'm afraid that would degrade the user experience more than it would help. For example, we expect users to use automation tools like https://github.com/jrhouston/tfk8s to convert batches of vanilla Kubernetes manifests to Terraform and adding such a flag would complicate this process unnecessarily.

What I wanted to suggest was to add a tristate attribute, e.g. use_crd, with the following possible values:

  • unset/null, we get the current behavior
  • true, we always try to retrieve a CRD
  • false, we never try to retrieve a CRD and resolve the type by other means

This way it shouldn't break any existing code.
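
To make the suggestion concrete, here is a sketch of how that could look; the use_crd attribute and the custom kind below are hypothetical and do not exist in the provider today:

resource "kubernetes_manifest" "custom" {
  # Hypothetical tristate: unset/null = current behavior,
  # true = always resolve the type via the CRD API,
  # false = never consult the CRD API and resolve the type another way.
  use_crd = true

  manifest = {
    apiVersion = "example.com/v1"
    kind       = "Widget"
    metadata = {
      name      = "example"
      namespace = "default"
    }
    spec = {
      size = 3
    }
  }
}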

While I'm thinking of alternatives, would you mind explaining in a bit more detail why you're unable to set permissions to list the CRDs in the cluster? They're schema definitions and usually don't carry sensitive information. I'm trying to understand the concerns behind not allowing (read-only) permissions to list them.

Some Kubernetes clusters have very strict permissions set up, and don't always expose CRDs for various reasons (mostly because the API endpoint is not exposed by the default RBAC configuration, IIRC).

@mike-sol

So, what are most people doing here? I gave in and gave the service account read-only access to the CRD API, with a ClusterRole and a ClusterRoleBinding.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: crd-readonly
rules:
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: whatever-crd-readonly
subjects:
  - kind: ServiceAccount
    name: whatever
    namespace: whatever
roleRef:
  kind: ClusterRole
  name: crd-readonly
  apiGroup: rbac.authorization.k8s.io

It does seem like it shouldn't be required (in this case, my manifest is for a PersistentVolumeClaim), but it also doesn't seem to be a huge problem unless you have some serious custom resources going on, on a cluster where this user cannot be trusted to read those definitions. Someone facing that would have to design around the restriction with separate clusters for their concerns.
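
For context, the manifest in question is an ordinary built-in resource along these lines (names are placeholders), which, per this thread, still requires the provider to be able to read CRDs:

resource "kubernetes_manifest" "example_pvc" {
  manifest = {
    apiVersion = "v1"
    kind       = "PersistentVolumeClaim"
    metadata = {
      name      = "example-claim"
      namespace = "whatever"
    }
    spec = {
      accessModes = ["ReadWriteOnce"]
      resources = {
        requests = {
          storage = "1Gi"
        }
      }
    }
  }
}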

@github-actions github-actions bot added the stale label Sep 28, 2024
@github-actions github-actions bot closed this Oct 28, 2024