Do not fail instantly on CRD resolution error when handling a manifest #1506

Closed
wants to merge 1 commit

Conversation


@booti386 booti386 commented Nov 23, 2021

Description

CRD resolution failure can happen, e.g. when the service account does not have list access to the CRD API.
However, this should not prevent the provider from trying to resolve the resource type using the cluster's OpenAPI document.
This is fixed by moving the error check to after the OpenAPI attempt.

Acceptance tests

  • Have you added an acceptance test for the functionality being added?
  • Have you run the acceptance tests on this branch?

Output from acceptance testing:

$ make testacc TESTARGS='-run=TestAccXXX'

...

Release Note

Release note for CHANGELOG:

...

References

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

@hashicorp-cla

hashicorp-cla commented Nov 23, 2021

CLA assistant check
All committers have signed the CLA.

@booti386 booti386 changed the title from "Do not fail on CRD resolution error when handling a manifest" to "Do not fail instantly on CRD resolution error when handling a manifest" Nov 23, 2021
@alexsomesan
Member

alexsomesan commented Nov 23, 2021

Hi,

Thanks for raising this issue. I can see how permissions to list CRDs should not be required as long as one doesn't actually want to manage CRDs.
However, the changes proposed in this PR would have unexpected side-effects when used on CRDs with a service account that doesn't have list permissions. CRDs are also included in the OpenAPI document, but their schema is imprecisely translated to OpenAPI v2.
In that case, if I understand this correctly, the lookup for the CRD will fail, but the resource would then be found in the OpenAPI document, and the provider would end up using the incomplete schema expressed in OpenAPI v2. This would cause the same resource to have a different structure depending on the permissions of the calling identity, which is something we obviously cannot let happen.

Unfortunately, even if we were to distinguish on the error type returned by the CRD lookup (to handle the missing-permissions error specifically), we would still be missing the information about whether the resource is a CRD or not. We cannot just continue and extract the OpenAPI entry for it if we don't know for certain that it's not a CRD (for the reasons mentioned above).

At this time, I don't really see a way forward without those permissions being available. Let me know if you have some other options in mind.

@booti386
Author

Then the only thing I can think of is to add a flag to explicitly define whether the manifest refers to a CRD or not.
It would also have the advantage of making the manifest schema more consistent. (I was a bit confused until I understood that it can refer to both CRD and non-CRD resources.)

@alexsomesan
Member

The documentation for the kubernetes_manifest resource doesn't mention it being specific to CRDs. On the contrary, it includes an example of using it to create a ConfigMap as well as a CRD. We'll try to emphasize that more, to avoid this kind of confusion.
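
For reference, a minimal configuration along the lines of the documented ConfigMap example looks like this (the names here are placeholders):

resource "kubernetes_manifest" "example_configmap" {
  manifest = {
    apiVersion = "v1"
    kind       = "ConfigMap"
    metadata = {
      name      = "example-config"
      namespace = "default"
    }
    data = {
      foo = "bar"
    }
  }
}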

With regards to the flag you mentioned, I'm afraid that would degrade the user experience more than it would help. For example, we expect users to use automation tools like https://github.com/jrhouston/tfk8s to convert batches of vanilla Kubernetes manifests to Terraform and adding such a flag would complicate this process unnecessarily.

While I'm thinking of alternatives, would you mind explaining in a bit more detail why you're unable to set permissions to list the CRDs in the cluster? They're schema definitions and usually don't carry sensitive information. I'm trying to understand the concerns behind not allowing (read-only) permissions to list them.

@booti386
Author

booti386 commented Nov 26, 2021

The documentation for the kubernetes_manifest resource doesn't mention it being specific to CRDs. On the contrary, it includes an example of using it to create a ConfigMap as well as a CRD. We'll try to emphasize that more, to avoid this kind of confusion.

After proofreading the doc, I think it is actually OK; it was more the unexpected error message that caused confusion (it puzzled me for a while that I got an error message about the CRD endpoint while I was not using CRDs at all, until I read the source code).

With regards to the flag you mentioned, I'm afraid that would degrade the user experience more than it would help. For example, we expect users to use automation tools like https://github.com/jrhouston/tfk8s to convert batches of vanilla Kubernetes manifests to Terraform and adding such a flag would complicate this process unnecessarily.

What I wanted to suggest was to add a tristate attribute, e.g. use_crd, with the following possible values:

  • unset/null, we get the current behavior
  • true, we always try to retrieve a CRD
  • false, we never try to retrieve a CRD and resolve the type by other means

This way it shouldn't break any existing code.
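
To make the suggestion concrete, here is a sketch of how that could look; the use_crd attribute and the custom kind below are hypothetical and do not exist in the provider today:

resource "kubernetes_manifest" "custom" {
  # Hypothetical tristate: unset/null = current behavior,
  # true = always resolve the type via the CRD API,
  # false = never consult the CRD API and resolve the type another way.
  use_crd = true

  manifest = {
    apiVersion = "example.com/v1"
    kind       = "Widget"
    metadata = {
      name      = "example"
      namespace = "default"
    }
    spec = {
      size = 3
    }
  }
}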

While I'm thinking of alternatives, would you mind explaining in a bit more detail why you're unable to set permissions to list the CRDs in the cluster? They're schema definitions and usually don't carry sensitive information. I'm trying to understand the concerns behind not allowing (read-only) permissions to list them.

Some Kubernetes clusters have very strict permissions set up, and don't always expose CRDs for various reasons (mostly because the API endpoint is not exposed by the default RBAC configuration, IIRC).

@mike-sol

So, what are most people doing here? I gave in and gave the service account read-only access to the CRD API, with a ClusterRole and a ClusterRoleBinding.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: crd-readonly
rules:
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: whatever-crd-readonly
subjects:
  - kind: ServiceAccount
    name: whatever
    namespace: whatever
roleRef:
  kind: ClusterRole
  name: crd-readonly
  apiGroup: rbac.authorization.k8s.io

It does seem like it shouldn't be required (in this case, my manifest is for a PersistentVolumeClaim), but it also doesn't seem to be a huge problem unless you have some serious custom resources going on, on a cluster where this user cannot be trusted to read those definitions. Someone facing that would have to design around the restriction with separate clusters for their concerns.
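
For context, the manifest in question is an ordinary built-in resource along these lines (names are placeholders), which, per this thread, still requires the provider to be able to read CRDs:

resource "kubernetes_manifest" "example_pvc" {
  manifest = {
    apiVersion = "v1"
    kind       = "PersistentVolumeClaim"
    metadata = {
      name      = "example-claim"
      namespace = "whatever"
    }
    spec = {
      accessModes = ["ReadWriteOnce"]
      resources = {
        requests = {
          storage = "1Gi"
        }
      }
    }
  }
}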

@github-actions github-actions bot added the stale label Sep 28, 2024
@github-actions github-actions bot closed this Oct 28, 2024