Filter by parent/owner type #58

kopf-archiver · 2020-08-18T19:43:12Z

An issue by nolar at 2019-05-10 14:12:07+00:00
Original URL: zalando-incubator/kopf#58

For cross-object orchestration, the parent object's operator needs to watch and react to events on the child objects it creates, which are produced by the child objects' main operator, so that the parent operator can update the status of its own parent objects accordingly.

The check can be performed by the metadata.ownerReferences, which generally defines the child-parent relationship. There are no reasons to introduce any other ways of marking the hierarchical relations (e.g. special labels/annotations, but see #45 ).

It should NOT react to any other objects that it did not create, e.g. if those were created by other operators/controllers or manually, i.e. if there is NO ownerReference of the specified kind.

The individual objects (uids) should not be taken into account on the DSL level, and can be filtered in the handler code. Only the relationships between resource types are important.
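
As a rough illustration of the check described above, a filter could look into metadata.ownerReferences for a reference of the declared type and ignore the individual uids. This is a minimal sketch only: the function name has_owner_of_type and its parameters are hypothetical and not part of Kopf, and it matches on apiVersion/kind (the fields stored in owner references) rather than on the plural names used in the proposed decorator syntax.

```python
from typing import Any, Mapping


def has_owner_of_type(body: Mapping[str, Any], api_version: str, kind: str) -> bool:
    """Return True if the object has at least one owner reference of the given type.

    Hypothetical helper: only the owner's type is compared, never its uid or name.
    """
    owners = body.get('metadata', {}).get('ownerReferences') or []
    return any(
        ref.get('apiVersion') == api_version and ref.get('kind') == kind
        for ref in owners
    )


# A pod created by the parent operator (owner reference set) and one created manually.
owned_pod = {'metadata': {'name': 'pod-1', 'ownerReferences': [
    {'apiVersion': 'zalando.org/v1', 'kind': 'KopfExample', 'name': 'example-1', 'uid': 'abc'},
]}}
manual_pod = {'metadata': {'name': 'pod-2'}}

print(has_owner_of_type(owned_pod, 'zalando.org/v1', 'KopfExample'))
print(has_owner_of_type(manual_pod, 'zalando.org/v1', 'KopfExample'))
```

Filtering for a specific parent object (by uid or name) would then stay in the handler code, as stated above.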

The parent information should be used to separate the handler progress storage instead of the default status.kopf field. Otherwise, the main operator of that resource will collide with the side-handlers of the parent operator.

Example syntax:

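import kopf
import kubernetes

# Proposed syntax: the `parent=` option and the `parent` kwarg do not exist yet.
@kopf.on.delete('', 'v1', 'pods',
                parent=('zalando.org', 'v1', 'kopfexamples'))
def child_deleted(body, parent, **_):
    child_name = body['metadata']['name']
    parent_name = parent['metadata']['name']
    api = kubernetes.client.CustomObjectsApi()
    # Record the child's completion in the parent's status.
    api.patch_namespaced_custom_resource(
        ...,
        name=parent_name,
        body={'status': {'children': {child_name: 'DONE'}}},
    )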

Or we can introduce a convention to assume the same group/version for the related resources (e.g. here, it would be zalando.org/v1/parents):

@kopf.on.delete('zalando.org', 'v1', 'children', parent='parents')
def child_deleted(body, parent, **_):
    ...

Silent handlers (spies) #30 for silent handlers (no status/progress storage).
Support for built-in resources (pods, jobs, etc) #84 for pods support (but can be tested with another custom resource).
Implicitly map kinds<=>plurals<=>singulars #57 for short notations of the resources (instead of plurals).
Filter by labels/annotations #45 for filtering by labels/annotations.
Filter by arbitrary callback function #98 for filtering by arbitrary callbacks.

Checklist:

Acceptance criteria:
- Event handler registering with the parent resource declaration.
- Events are filtered by the ownerReferences of the declared kinds.
- parent kwarg injected for such handlers (can be None for regular handlers).
- Relative resource group/version references are understood.
- Parent information is used to separate the handler progress fields instead of status.kopf.
Tests.
Documentation.

Commented by dlmiddlecote at 2019-07-31 16:56:33+00:00

Hi nolar!

I was going to look into this issue if that's okay.

I have a few questions to begin with, hope you can help.

- Where do you expect the parent kwarg to be populated from? Given we'll only know a little bit of information about the parent when we get notified of the child event, do we need to fetch the parent resource at that point?
- I don't quite understand what the statement "Parent information is used to separate the handler progress fields instead of status.kopf." (from the acceptance criteria) means. Would you be able to elaborate?

Thanks,
Dan.

Commented by nolar at 2019-08-02 01:10:04+00:00

dlmiddlecote I suggest that we postpone this issue for a while. I drafted a few ideas on the cross-object relations & communication (gist; better read from the bottom up). This topic is quite complicated, and "parents" (owners) are not the only possible type of relation.

I'm still trying to crystallise the vision on the cross-object communication: what? why? how?

Regarding "Parent information is used to separate the handler progress fields instead of status.kopf" — it meant that status.kopf must be configurable, not fixed — see also #23 for user-defined configuration.

This is only important if you have 2+ operators working on the same resource: one is the main operator, another one is kind of "customer": creates the resources for the main operator, links them to its own resources as "parents", and monitors their status in its own handlers — but only for the resources with its own owner references set.

Example:

Operator X serves resources Xn.
Operator Z serves resources Zn.
A human user creates a resource X1.
Operator X creates resources Z1a, Z1b as "tasks" for operator Z, and links them to its own resource X1, for which they were created (e.g. X1<-Z1a,Z1b).
Operator X monitors the statuses of the Z resources, but only of those it has created, which it identifies via owner references.
A human user creates a resource Z2 directly. It is served by operator Z the same way. But it is ignored by the operator X (i.e. not monitored), as it was not originated from X (no owner references). The operator X sees only resources Z1a, Z1b, but not Z2.

The schema is more or less the same as e.g. Deployments<-ReplicaSets<-Pods: the ReplicaSet's controller ignores all Pods which are not made for ReplicaSets, but were created manually or for e.g. Jobs.

The problem was that if you have 2+ operators on the same resource, they both will store their handler progress into the same status.kopf, and it will lead to problems. They must be separated to status.kopf-X and status.kopf-Z or something like that.
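
To make the separation concrete, here is a minimal sketch of per-operator progress storage. Everything in it is an assumption for illustration: the kopf-<id> naming simply follows the status.kopf-X / status.kopf-Z idea above, and the helpers progress_field and store_progress are hypothetical, not Kopf functions. A real operator would patch the object via the API instead of mutating a local dict.

```python
from typing import Any, Dict


def progress_field(operator_id: str) -> str:
    """Hypothetical naming scheme: one status sub-field per operator."""
    return f'kopf-{operator_id}'


def store_progress(body: Dict[str, Any], operator_id: str, handler: str, state: str) -> None:
    """Record a handler's state under the operator's own status sub-field."""
    status = body.setdefault('status', {})
    status.setdefault(progress_field(operator_id), {})[handler] = state


# Resource Z1a is served by its main operator Z and also watched by operator X.
z1a: Dict[str, Any] = {'metadata': {'name': 'z1a'}}
store_progress(z1a, 'Z', 'create_fn', 'done')
store_progress(z1a, 'X', 'child_updated', 'in-progress')

print(z1a['status'])
```

With separate sub-fields, neither operator overwrites the other's handler progress on the same resource.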

Commented by nolar at 2019-08-02 08:23:46+00:00

dlmiddlecote Or, alternatively, we can reduce this issue to filtering only, with parent kwarg and dynamic status storage excluded from the criteria.

The filtering by existence of ownerReferences to a predefined resource kind will be needed anyway, in any approach — as described in the previous comment. And it enriches what was already done in other filtering issues (by labels/annotations).

Commented by cliffburdick at 2020-01-30 00:31:03+00:00

nolar I think this is related to #301 I opened. The way I'm imagining it is you create a custom scheduler using kopf that simply looks for a CRD to be created, then spawns pod(s) from that. Once created, a separate kopf, which is the real controller, is watching for child events of the pods below it, and taking appropriate action on deletion/crash/etc. Does that sound doable?

Commented by elemental-lf at 2020-01-30 21:29:15+00:00

Stumbled (again) over this issue after I got alerted about the new comment and I wanted to say something about this paragraph:

> The check can be performed by the metadata.ownerReferences, which generally defines the child-parent relationship. There are no reasons to introduce any other ways of marking the hierarchical relations (e.g. special labels/annotations, but see #45 ).

I can think of one reason why another way could be needed: cross-namespace references. ownerReferences don't support this and this is by design. From the Kubernetes documentation:

> Cross-namespace owner references are disallowed by design. This means: 1) Namespace-scoped dependents can only specify owners in the same namespace, and owners that are cluster-scoped. 2) Cluster-scoped dependents can only specify cluster-scoped owners, but not namespace-scoped owners.

I'm currently using labels for this use case and the filtering by label feature of Kopf helps a lot. Such a mechanism could also be implemented in Kopf to assist the user in defining such cross-namespace relationships.
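
The rule quoted from the Kubernetes documentation can be sketched as a small predicate. This is only an illustration under one assumption: a namespace of None stands for a cluster-scoped object. The function name owner_reference_allowed is hypothetical.

```python
from typing import Optional


def owner_reference_allowed(
    dependent_namespace: Optional[str],
    owner_namespace: Optional[str],
) -> bool:
    """Check whether a dependent may reference an owner; None means cluster-scoped."""
    if owner_namespace is None:
        # Cluster-scoped owners are allowed for any dependent.
        return True
    # A namespaced owner is only allowed for a dependent in the same namespace;
    # a cluster-scoped dependent (None) never matches a namespaced owner.
    return dependent_namespace == owner_namespace


print(owner_reference_allowed('team-a', 'team-a'))
print(owner_reference_allowed('team-a', 'team-b'))
print(owner_reference_allowed(None, 'team-a'))
```

Any case where this predicate returns False is one where a label-based relation, as described above, would be needed instead.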

Commented by nolar at 2020-01-31 09:42:12+00:00

elemental-lf That field is fuzzy at the moment. I have a few ideas drafted in https://gist.github.com/nolar/8c6233778a30a32fafd7f8d3a55a2cb4#file-example-relations-py-L274-L276 on how the cross-resource interactions can be expressed via decorators: e.g. by implicitly remembering from which handler/object a child was created (if Kopf's adoption was used), and using that as a filter. There are a few more ideas on this topic, but nothing is implemented yet.

I agree here that the ownerReferences are not the best tool to use — we also have a convention of per-app namespaces, and then a top-level cluster-scoped operator to orchestrate them all — so that the k8s native ownership becomes impossible. And we also use labels for that.

Commented by nolar at 2020-01-31 09:58:22+00:00

cliffburdick Are you talking about CRDs or CRs?

If it is about CRs (i.e. objects), then it is already doable now: the parent's handler puts labels on the children, and the children's handlers use label filtering with those labels. This is how we have it implemented in our apps, even cross-namespace.
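
A minimal sketch of that labelling approach, using plain dicts instead of real API objects. The label keys and the helpers label_child and is_child_of are hypothetical, chosen for illustration only. In an actual operator the matching would be declared through Kopf's label filtering on the handler rather than checked by hand.

```python
from typing import Any, Dict, Mapping

# Hypothetical label keys; any keys agreed upon by the operators would do.
PARENT_NAME_LABEL = 'example.com/parent-name'
PARENT_NAMESPACE_LABEL = 'example.com/parent-namespace'


def label_child(child: Dict[str, Any], parent: Mapping[str, Any]) -> None:
    """Stamp the child manifest with labels pointing back to its parent."""
    labels = child.setdefault('metadata', {}).setdefault('labels', {})
    labels[PARENT_NAME_LABEL] = parent['metadata']['name']
    labels[PARENT_NAMESPACE_LABEL] = parent['metadata'].get('namespace', '')


def is_child_of(body: Mapping[str, Any], parent: Mapping[str, Any]) -> bool:
    """Return True if the object's labels point to the given parent."""
    labels = body.get('metadata', {}).get('labels') or {}
    return (
        labels.get(PARENT_NAME_LABEL) == parent['metadata']['name']
        and labels.get(PARENT_NAMESPACE_LABEL) == parent['metadata'].get('namespace', '')
    )


# The parent and its child live in different namespaces, which labels permit.
parent = {'metadata': {'name': 'x1', 'namespace': 'team-a'}}
child = {'metadata': {'name': 'z1a', 'namespace': 'team-b'}}
stranger = {'metadata': {'name': 'z2', 'namespace': 'team-b'}}
label_child(child, parent)

print(is_child_of(child, parent))
print(is_child_of(stranger, parent))
```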

Commented by elemental-lf at 2020-01-31 10:49:41+00:00

nolar thanks for the pointer, this looks promising. I haven't thought about this in depth but I have a feeling that using a handler reference as origin might be confusing to the user. I assume that the advantage would be that the parent custom resource is implied by the decorator on the parent handler. On the other hand it could convey the notion that this one handler and not the CR is the owner of the child resource. And this is confusing as there could be multiple handlers that are triggered by the same CR event and the child handler would be called for children created by any of them and not only for children originating from this one handler. Or am I missing something?

Commented by nolar at 2020-01-31 11:32:49+00:00

elemental-lf It is not yet defined. These cross-resource relations & handlers are quite a big topic, so I didn't even start looking into it deeper than that draft.

As far as I know, there are no implemented equivalents of such a high-level approach in other languages (I can be wrong here); in K8s itself, the relations are implemented via the labels (as in job-name for jobs<->pods) and label-selectors (as in deployments, services & co).

So, it is a completely new field to explore.

Nevertheless, I prefer to build new features based on naturally evolved approaches rather than on artificial hypothetical assumptions. I am currently watching how we do this in our in-house apps and looking at how other people solve the same problem in theirs, to see what would be the best solution here. It seems the labelling approach wins.

Commented by cliffburdick at 2020-01-31 15:51:27+00:00

> cliffburdick Are you talking about CRDs or CRs?
>
> If it is about CRs (i.e. objects), then it is already doable now when the parent's handler puts the labels on the children, and the children handlers use label filtering with those labels. This is how we have it implemented in our apps even cross-namespace.

Sorry, I meant CRs (an instance of the CRD). I wrote a much more detailed explanation in #301 .

kopf-archiver bot added the archive label Aug 18, 2020

kopf-archiver bot closed this as completed Aug 18, 2020

kopf-archiver bot changed the title ~~[archival placeholder]~~ Filter by parent/owner type Aug 19, 2020

kopf-archiver bot added the enhancement New feature or request label Aug 19, 2020

kopf-archiver bot reopened this Aug 19, 2020
