Filter by labels/annotations #45

Closed
kopf-archiver bot opened this issue Aug 18, 2020 · 0 comments
Labels: archive, enhancement

kopf-archiver bot commented Aug 18, 2020

An issue by nolar at 2019-04-26 09:52:24+00:00
Original URL: zalando-incubator/kopf#45
 

Currently, all objects are watched (either globally or by namespace, see #32), and all objects are handled. This is the normal case for an operator that "owns" the handled objects.

For the cases when an operator spies on objects that it does not "own", such as pods (#30) or the log (#46), it should be able to filter out all objects that are definitely not in its scope of interest.

The easiest way is filtering by labels, which is supported by the Kubernetes API and can be put into the query to be performed server-side. Filtering by annotations is probably also possible, via field selectors.
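
For instance, a label selector is just a query parameter on the list/watch API call (an illustrative request; the namespace and label are placeholders):

GET /api/v1/namespaces/default/pods?labelSelector=model-id%3D123abc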

Example usage:

import kopf

@kopf.on.event('', 'v1', 'pods', labels={'model': None, 'model-id': '123abc'})
def model_pod_changed(**kwargs):
    pass

The same applies to create/update/delete/field handlers and, when implemented, to the event & log handlers; for illustration, a create handler with the same argument is sketched below.
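
A sketch in the same proposed syntax (the group/version/plural here are placeholders, not a real resource of this project):

import kopf

@kopf.on.create('example.com', 'v1', 'experiments', labels={'experiment': 'expr1'})
def experiment_created(**kwargs):
    pass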


Additionally, the label filtering should be accepted on the command line (same semantics as kubectl):

kopf run handlers.py -l experiment=expr1

That can be useful for development and debugging, when it is not desired to put the labels into the code permanently.


Commented by dlmiddlecote at 2019-06-19 18:39:58+00:00
 

Hey nolar,

I’d like to take this one on if possible.

I have a few questions about this:

  • You say that labels "can be put into a query to be performed server-side"; is that true? What happens if there are different filters for the same resource on different handlers? Would we have to make 2 queries to the Kubernetes API? I'm thinking that this should be handled in the code itself; do you agree? (Similarly for annotations.)
  • What are the semantics of the {'model': None} label shown above: is it "the model key exists but with any value", or "the model key exists with the value null"?
  • What should happen if labels are specified both on the command line and in the handler: is it a join of the two, does only the one in the handler apply, or does the command line win?

Thanks!


Commented by nolar at 2019-06-20 13:56:07+00:00
 

dlmiddlecote

Hm. Probably, that little note was written when only the command-line filtering was in mind, i.e. global filtering, not per-handler. This makes no sense with per-handler filtering, of course. The label-matching logic must then be inside Kopf, not in the API queries on the server side.


{'model': None} meant that the model label should be present, no matter with which value: an equivalent of kubectl get pod -l mylabel vs. kubectl get pod -l mylabel=myvalue. I think it is impossible to have a label with the value null (never tried, though).

Keep in mind: these syntax snippets are just suggestions. They can be varied as needed if some problematic details pop up. E.g., a special object kopf.ANY can be introduced instead of None for label values, the same way as in the mock library (when matching call() arguments); a possible sketch follows.
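
A minimal sketch of such a sentinel, assuming it were added to Kopf (kopf.ANY is hypothetical here, modelled on mock.ANY):

class _Any:
    # A sentinel that compares equal to any label value.
    def __eq__(self, other):
        return True
    def __repr__(self):
        return 'kopf.ANY'

ANY = _Any()

# Hypothetical usage in the proposed handler syntax:
# @kopf.on.event('', 'v1', 'pods', labels={'model': ANY})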


It is an interesting question. Maybe the command-line label filtering should be removed entirely, leaving only the per-handler filtering.

Initially, I would say that, from the user's point of view, this must be an intersection of the labels (not a union), i.e. AND, not OR: an object must match both, or be ignored.

The command-line filtering can be used to restrict an operator's scope at deployment time (e.g. -l experiment=expr1), while the per-handler labels can be used to express the object relations (e.g. {"parent-run-id": ANY/None} for pods).

However, if started with both, this causes problems and confusion: according to this logic, it should be a pod with parent-run-id present AND restricted to experiment=expr1. Yet there is no place where this experiment label is put on the created pod, unless the developer explicitly implements that.

And so, the internal logic of the operator code (the handlers' code) would be interacting with the outer logic of deployment (the CLI filters).

If we go that way, Kopf must also implicitly add the command-line-specified labels to all created child objects (e.g. in kopf.adopt(), on the assumption that they all go through it). That is not well-thought-through territory, so I would recommend avoiding it for now.

Per-handler filtering alone is enough. If the developers want it, they can define experiment as an env var and add these labels themselves to the handler declarations as a global filter, as sketched below.
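
A minimal sketch of that approach, following the proposed syntax (the env-var name and handler are illustrative):

import os
import kopf

EXPERIMENT = os.environ.get('EXPERIMENT', 'expr1')  # set at deployment time

@kopf.on.event('', 'v1', 'pods', labels={'parent-run-id': None, 'experiment': EXPERIMENT})
def model_pod_changed(**kwargs):
    pass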


Commented by nolar at 2019-06-20 13:59:50+00:00
 

(reserved for dlmiddlecote)


Commented by nolar at 2019-07-24 09:34:52+00:00
 

Released as kopf==0.20

Docs: https://kopf.readthedocs.io/en/latest/handlers/#filtering
Announcement: https://twitter.com/nolar/status/1153971321560129543
