Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support cluster resource Includes/Excludes #5120

Closed
27149chen opened this issue Jul 14, 2022 · 21 comments · Fixed by #5333 or #5838
Closed

Support cluster resource Includes/Excludes #5120

27149chen opened this issue Jul 14, 2022 · 21 comments · Fixed by #5333 or #5838
Assignees
Labels
Milestone

Comments

@27149chen
Copy link
Contributor

27149chen commented Jul 14, 2022

Describe the problem/challenge you have

The current filter (IncludedResources/ExcludedResources + IncludeClusterResources flag) is not enough for some special cases, such as:

  1. all namespaced resources + some kind of cluster resource
  2. all namespaced resources + cluster resource excludes
  3. no namespaced resources + all cluster resources

Describe the solution you'd like

Add new field IncludedClusterResources and ExcludedClusterResources, examples:
(wildcard rules: includeResources: empty for all, excludeResources/includeClusterResources: "*" for all, excludeResources/includeClusterResources/excludeClusterResources: empty for nothing)

  1. no namespaced resources + no cluster resources:
excludeResources: ["*"]
  1. no namespaced resources + some cluster resources:
excludeResources: ["*"]
includeClusterResources: ["persistentvolumes"]
excludeResources: ["*"]
excludeClusterResources: ["persistentvolumes"]
  1. no namespaced resources + all cluster resources
excludeResources: ["*"]
includeClusterResources: ["*"]
  1. some namespaced resources + no cluster resources:
includeResources: ["configmaps"]
excludeResources: ["configmaps"]
  1. some namespaced resources + some cluster resources:
includeResources: ["configmaps"]
includeClusterResources: ["persistentvolumes"]
excludeResources: ["configmaps"]
includeClusterResources: ["persistentvolumes"]
includeResources: ["configmaps"]
excludeClusterResources: ["persistentvolumes"]
excludeResources: ["configmaps"]
excludeClusterResources: ["persistentvolumes"]
  1. some namespaced resources + all cluster resources:
includeResources: ["configmaps"]
includeClusterResources: ["*"]
excludeResources: ["configmaps"]
includeClusterResources: ["*"]
  1. all namespaced resources + no cluster resources
includeResources:
  1. all namespaced resources + some cluster resources:
includeClusterResources: ["persistentvolumes"]
excludeClusterResources: ["persistentvolumes"]
  1. all namespaced resources + all cluster resources:
includeClusterResources: ["*"]

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Velero version (use velero version):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "The project would be better with this feature added"
  • 👎 for "This feature will not enhance the project in a meaningful way"
@reasonerjt
Copy link
Contributor

Agreed that current combination of include/exclude filters of namespace resources and cluster resources causes confusion, IMO, ideally we should clarify that current include/exclude resource filters will only apply to namespace resources and we'll introduce new include/exclude cluster-resource filters for filtering cluster resources

@sseago
Copy link
Collaborator

sseago commented Jul 18, 2022

Actually, I think the current excludes/excludes do apply to cluster-scoped resources. If you include PersistentVolumes in excludedResources, then I don't think any will be backed up. If you have an includedResources list with both cluster-scoped and namespaced resources, then both sets will be included (but nothing else). The issue here is that once you have an includes list, then nothing outside the list is included. So you can't say "include all namespaced resources but only include X, Y, and Z cluster-scoped resources."

It's worth noting that if lncludeClusterResources is nil (rather than true or false) then Velero will pull in certain resources as-needed. In particular, if your main use case for cluster-scoped resources is persistentVolumes, then you probably don't want to include all PVs, just the ones associated with PVCs you're backing up, so what you want is to set includeClusterResources to nil (or leave the field out of spec entirely, which is equivalent).

@27149chen
Copy link
Contributor Author

27149chen commented Jul 19, 2022

so what you want is to set includeClusterResources to nil (or leave the field out of spec entirely, which is equivalent).

@sseago The problem is that I do want to set includeClusterResources to true since I need some cluster resources. And the current filter (IncludedResources/ExcludedResources + IncludeClusterResources flag) can not meet most of the cases I mentioned above

@blackpiglet
Copy link
Contributor

@27149chen
I agree with @sseago. --include-resources and --exclude-resources are not limited to namespace resource.

Usually, Velero backup will not separate resources into namespace and cluster scope. Taking backup resources in namespace test as an example, velero backup create backup01 --include-namespaces=test this command will backup all namespace resources in namespace test, and related cluster scope resources are included too.

What is the background of the special use cases you mentioned?

  • all namespaced resources + some kind of cluster resource
  • all namespaced resources + cluster resource excludes
  • no namespaced resources + all cluster resources

@27149chen
Copy link
Contributor Author

related cluster scope resources are included too

it does not meet our requirement, we want to use includeClusterResources flag (according to you words, this is a "bad" flag which I should never use it, otherwise, some unexpected behavior will occur. If so, what's the point of its existence?)

@27149chen
Copy link
Contributor Author

What is the background of the special use cases you mentioned?

most of them are valid requirements from our customer

@blackpiglet
Copy link
Contributor

blackpiglet commented Jul 19, 2022

@27149chen
IMO, there are ways to do the similar effect for some of the cases.

  • all namespaced resources + some kind of cluster resource: may not have an easy way to achieve this by current parameters. Need to use --excluded-resources parameter to list all not included resources.
  • all namespaced resources + cluster resource excludes: backup all things, except the specified cluster resources. --excluded-resources=<exclude-cluster-resources-list>.
  • no namespaced resources + all cluster resources: create an empty namespace, and use the two parameters in backup command --include-namespaces=<empty-namespace-name> --include-cluster-resources=true.

By my understanding, the missing piece is a parameter to specify resources to include, and it will not exclude all the other things as the current parameter --included-resources.

@sseago
Copy link
Collaborator

sseago commented Jul 19, 2022

So setting includeClusterResources to nil does include some cluster resources, but it's limited to those resources that are directly relevant to other resources being backed up (this handles the usual needs for PVs, CRDs, etc.) If your use case is "I want all of cluster resources X, Y, and Z but none of A, B, and C" then you can handle that right now by just setting includeClusterResources to true and adding A, B, and C to excludedResources.

As @blackpiglet said, the only thing you can't do right now directly is "only include X, Y, and Z cluster resources" unless you're willing/able to list everything not A, B, and C in excludeResources. Will the "list all cluster resources you don't want in the exclude list" meet your needs?

@sseago
Copy link
Collaborator

sseago commented Jul 19, 2022

As for "no namespaced resources but all cluster resources", I wonder whether we can handle this by just excluding all namespaces while seting includeClusterResources to true. I know that wildcards don't currently work on exclude, and the issue is an empty include list is equivalent to "*" -- what's less clear to me is whether it would work to have includedNamespaces=["foo"] and also have excludedNamespaces=["foo"] -- this might mean that the full list of namespaces is ["foo"] - ["foo"] == [], but I haven't tested that scenario myself.

@27149chen
Copy link
Contributor Author

27149chen commented Jul 20, 2022

@sseago @blackpiglet , are we talking about workarounds or a formal solution?

@27149chen
Copy link
Contributor Author

So setting includeClusterResources to nil does include some cluster resources, but it's limited to those resources that are directly relevant to other resources being backed up (this handles the usual needs for PVs, CRDs, etc.) If your use case is "I want all of cluster resources X, Y, and Z but none of A, B, and C" then you can handle that right now by just setting includeClusterResources to true and adding A, B, and C to excludedResources.

As @blackpiglet said, the only thing you can't do right now directly is "only include X, Y, and Z cluster resources" unless you're willing/able to list everything not A, B, and C in excludeResources. Will the "list all cluster resources you don't want in the exclude list" meet your needs?

@sseago sorry but I think you do not understand my problem, if includeClusterResources is set to true, all pvs will be backed up, not just the relevant ones, see details in #5119

@27149chen
Copy link
Contributor Author

namespaces is ["foo"] - ["foo"] == [], but I haven't tested that scenario myself.

@sseago please test it first before adding this comment, AFAIK, it is a invalid config if an item is both in includes and excludes, and an error will be raised

@blackpiglet
Copy link
Contributor

blackpiglet commented Jul 20, 2022

@27149chen
We all see your cases' value, and try to find out a way reasonable for all of us.
Just my opinion, IncludedResources/ExcludedResources don't separate resource by two groups depending on whether they are limited in namespace, so there is no need to add two new parameters focusing on cluster-scope things.
I also see there is gap for your need.
all namespaced resources + some kind of cluster resource: may not have an easy way to achieve this by current parameters. Need to use --excluded-resources parameter to list all not included resources.
If either adding a new flag that can specify including a list of resources without excluding all the others or change the behavior of IncludedResources parameter, but don't add two parameters as you proposed, does that work for you?

@27149chen
Copy link
Contributor Author

27149chen commented Jul 20, 2022

@blackpiglet I'm happy if you can find a way to resolve this problem without adding two parameters for cluster resources.

I have listed 16 cases in my issue description, and provided a way that I think is simple and effective. If you can find another way to do it clean and simple, please give your examples.

Thanks.

@sseago
Copy link
Collaborator

sseago commented Jul 20, 2022

So one thing to consider -- if we were to implement your proposal, we'd need a different name. Your proposal doesn't account for the nil value of includeClusterResources -- i.e. only include relevant resources, vs. none vs. all. I think we would still need that even if we added some notion of additional filtering/including. For example, your proposal loses the abiliy to "only include PVs that are needed" So the include list would need to have a name other than the existing bool pointer "includeClusterResources" which has 3 valid values now -- true, false, and nil.

Also, I've seen other requests to allow for wildcards on the exclude list. If implemented, this would allow for "no namespaced resources" by simply specifying "*" for excludedNamespaces. That would handle cases 1-3.

Case 4 is already handled by setting includeClusterResources to false

For case 5, examples 1 and 4 already work today by setting includeClusterResources to true and putting both namespaced and cluster resources in the include or exclude list. 2 and 3 are trickier:

excludeResources: ["configmaps"]
includeClusterResources: ["persistentvolumes"]
includeResources: ["configmaps"]
excludeClusterResources: ["persistentvolumes"]

The only way to do the above right now would be to reformat this into a list of all excludes or all includes. But how central is this use case? I imagine someone else might say "I want per-namespace include/exclude lists" -- but once we get to that level of granularity of control, the recommendation would be to create two backups. I wonder whether the same would work here -- a cluster-scoped backup and a namespaced backup.

6 and 8 are similar. The second example can be done today by just using the existing include/exclude lists. The first would need 2 backups.

7 and 9 are supported by the existing options.

So out of the 16 total examples, it looks like 8 are supported today. 4 more would be supported by allowing for wildcards in excludeNamespaces, and 4 could be done today by using a partitioned backup scheme, but to do in a single backup would require two additional config params (but without removing/changing the existing true/false/nil includeClusterResources parameter).

The concern here is that the cluster/namespaced partition in the config doesn't really feel natural to me. If we were going to redesign the selection mechanism to be more flexible than it is today, I'm not certain that this is the only use case we'd care about adding. I'd want a more flexible config that might be less confusing than some combination of include/excluderesources, includeClusterResources true/false/nil, and two new cluster-scoped include/exclude parameters.

@27149chen
Copy link
Contributor Author

27149chen commented Jul 23, 2022

@sseago please give you resource filter spec for every case as I did in the description so that we can discuss one by one and find the gap

@reasonerjt reasonerjt added this to the 1.10 milestone Jul 27, 2022
@reasonerjt reasonerjt removed the 1.10-candidate The label used for 1.10 planning discussion. label Jul 27, 2022
@waldner
Copy link

waldner commented Aug 2, 2022

I'd like to add the case "all resources in a namespace" + "some selected cluster resources", which seems to not be possible ATM. Well one can run two separate backups, ok, but ideally it should be possible in the same backup.

@waldner
Copy link

waldner commented Aug 3, 2022

Also, if I can suggest, a filter based on resource names (not labels) would be most welcome, at least for my use cases.

@27149chen
Copy link
Contributor Author

Also, if I can suggest, a filter based on resource names (not labels) would be most welcome, at least for my use cases.

@waldner you are talking about #5118 , which is also important to me

@waldner
Copy link

waldner commented Aug 3, 2022

Thanks! I subscribed to that one.

@blackpiglet
Copy link
Contributor

After discussion, I think it's reasonable to create a new filtering parameter to focus on namespace resources, because currently there are three ways to filtering resources:

  • Label based filter: OrLabelSelectors and LabelSelector.
  • Resource type based filter: IncludedResources, ExcludedResources and IncludeClusterResources.
  • Namespace based filter: IncludedNamespaces and ExcludedNamespaces.
    Label based filter currently only applied to namespaced resources, so it can be regards as namespace based filter.

There are senarios that resource type based filter and namespace based filter interfere mutually. Seperate --include-resources and --exclude-resources into cluster scope and namespace scope parameters can resolve these issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment