Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Registry Design #1015

Merged
merged 27 commits into from
May 6, 2019
Merged

Conversation

nachocano
Copy link
Contributor

@nachocano nachocano commented Apr 1, 2019

Initial design for the Registry feature request: #929

Proposed Changes

  • Initial design for Registry.

Release Note

NONE

@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Apr 1, 2019
@knative-prow-robot knative-prow-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 1, 2019
@nachocano
Copy link
Contributor Author

nachocano commented Apr 1, 2019

@grantr @akashrv @Harwayne @evankanderson @n3wscott @Abd4llA @duglin

Moving the registry design conversation to this PR. Please refer to the other PR for previous comments. And let's keep on commenting on this one. Feel free to include whoever might be interested.


- `type` is authoritative. This refers to the Cloud Event type as it enters into the eventing mesh.

- `source`: an identifier of where we receive the event from. This might not necessarily be the Cloud Event source
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, would it be possible to just consume types? Regardless what submits them ?

Let;s assume I am an app that is interest all "storage provisioned" events, the app is not directly interested in keeping track of all the possible products/services that could emit those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in that case you can just create a trigger filtering on the type field, and forget about source.
Does that answer the question?


**1. Event Source CR installation**

Upon installation of an Event Source CR, the source will register its EventTypes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is just the case for the GH source, to sorta "filter" the events to "provide"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH is one source that potentially has many types, and we know those types in advance.
But for the other sources, where we know what event types will be sending, IMO we should go ahead and add them to the registry.
For example, AWSSQS always sends one eventType (aws.sqs.message), cronjob as well (dev.knative.cronjob.event). Also GCPPubSub (google.pubsub.topic.publish), and kafka (dev.knative.kafka.event).
When we instantiate a particular CR (say kafka with topic1 and topic2), we will populate the registry with two EventTypes: 1) type: dev.knative.kafka.event, source: topic1, and 2) type: dev.knative.kafka.event, source: topic2.

Does it make sense?

broker: dev
```

This would register the EventType named `repofork` with type `repo:fork`, source `my-other-user/my-other-repo`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see my-other-user/my-other-repo inside the GH source, but not as name .. am I missing something ?

Copy link
Contributor Author

@nachocano nachocano Apr 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not as source you mean?

As the cloud event source field is somewhat useless. For example, the current GH adapter for a pull request sends as cloud event source https://github.com/<user>/<repo>/pull/<pull_id>, there is no way to do exact matching on that on a trigger.
I propose to augment the cloud event in adapters with some custom extension that will allow us to do better filtering. In the case of GH, have that user/repo, in the case of kafka, maybe just the consumerGroup/topicName, and so on.
I know this is not in this README as of now, I will add these details for further clarification. Thanks for pointing that out.
In case you want to take a look at some initial work that I did, you can take a look at knative/eventing-contrib#300

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nachocano I don't really get this statement about the source field being useless. Why is exact matching needed?
By the way, in the CE group there are currently discussions going on. They are also discussing github examples.
@duglin What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deissnerk I meant that the Ce-Source is too specific and can contain information only known at runtime. For example, that pull_id, or some commit_id, or whatever.
If a downstream consumer wants to consume events from a particular Ce-Source, then in our current model the consumer needs to create a Trigger filtering for events matching that Ce-Source. As of now, we only support exact matching on Ce-Source, which IMHO is somewhat useless. With more advanced filtering mechanisms we should be OK.
But even then, if we rely on the cloud event source to uniquely identify EventTypes, e.g., if we use (type, source, schema) as "primary key" for EventTypes, then the registry will blow up in the sense that we will be adding a bunch of identical EventTypes... For example (omitting irrelevant fields):

type: dev.knative.source.github.pull_request
source: https://github.com/nachocano/eventing/pull/1

type: dev.knative.source.github.pull_request
source: https://github.com/nachocano/eventing/pull/2

...

type: dev.knative.source.github.pull_request
source: https://github.com/nachocano/eventing/pull/100

Does it make sense?

Yes, @n3wscott mentioned this new subject field they are proposing in the spec. I think it will be much more useful than source, but not sure when it will be available.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nachocano The subject field was approved yesterday. So it should be there in the upcoming CE version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deissnerk thanks for the heads up!

Copy link
Member

@daisy-ycguo daisy-ycguo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for moving the design document to a separate PR. I moved my questions in the previous PR here, for easily discussion.

metadata:
name: repopush
spec:
type: repo:push
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An EventType may support multiple types in this proposal. I think it's a little bit confusing. I spend a few minutes to understand the relationship between EventType and type. Then I figure out EventType is something like a type of event sources, and type is something like different kinds of events (e.g. create/update/delete are different events of databases.) that this type of event sources issue. If my understanding is correct, I suggest to change these two terms for easy understanding. My suggestions are: change EventType to EventSourceType and change type to eventTypes or events for simplicity.

Copy link
Member

@daisy-ycguo daisy-ycguo Apr 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading the proposal several times, I think I finally understand EventType. EventType means a specific event in this proposal, is that correct? I had thought it was Event Source Type before. It's really a confusing word.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment Daisy, and sorry for the confusion.

An EventType object is just that, a type of Event that can flow through the system.
Although you will expect that the name field of the EventType object to be sufficient to identify it, we need the actual cloud event type to perform filtering on Triggers later on (and also the source). As that cloud event type might not meet k8s naming conventions, we cannot use it in EventType.name here. Thus we just have it in EventType.spec.type, and we autogenerate EventType.name based on it (by stripping invalid chars, e.g., :).

Does it make more sense?

- `source`: an identifier of where we receive the event from. This might not necessarily be the Cloud Event source
attribute. If we receive the event from our receive adaptors, the info might come in a Cloud Event custom extension (e.g., from).

- `schema` is a URI with the EventType schema. It may be a JSON schema, a protobuf schema, etc. It is optional.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgive me if my understanding is wrong. I think here source and schema describes the necessary parameters of an EventType. EventType to Event is as Class to Object.

I think parameters to a specific kind of event source may be divided into below types:

  • authentication
  • uri to identify this event source, e.g. ip:address to a database server or a message broker, url for a GitHub repo, and etc
  • others, e.g. topic in a Kafka event, namespace in a K8s event, and etc.

Where do you plan to put authentication values then ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source is mandatory, yes.
schema is optional.
Those parameters are used to identify the EventType that can flow through the system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think parameters to a specific kind of event source

Each source has its way of specifying the authentication information. GitHubSource uses secrets, and so on. But that is for sources, not for EventTypes.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have a danger of confusing "event source" in the Cloud Events sense and "code that produces events", where the producer code requires authentication info — the "source" field is just a description of where the event came from.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right.
The "source" field refers to the CloudEvents source.
Event Source is the "code that produces events", e.g., GitHubSource.


```

By applying this file, two EventTypes will be registered, with types `push` and `pull_request`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused here. I may get EventType definition wrong. My understanding here is 1 EventType is created: GitHubSource in Registry and 2 Events are created: push and pull_request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is that the single Event source CR, github-source-sample, creates an EventType for each type of event it will create. In this case, it will create two, one for push and another for pull_request. So something like:

apiVersion: eventing.knative.dev/v1alpha1
kind: EventType
metadata:
  name: gh-push
  namespace: default
  owner: # Owned by github-source-sample?
spec:
  type: repo:push
  source: my-other-user/my-other-repo
  broker: default
---
apiVersion: eventing.knative.dev/v1alpha1
kind: EventType
metadata:
  name: gh-pull-request
  namespace: default
  owner: # Owned by github-source-sample?
spec:
  type: repo:pull_request
  source: my-other-user/my-other-repo
  broker: default

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHubSource is a Source object, which gets created as you are applying it.

And then two EventTypes are also created in the Registry, one with EventType.spec.type = push and EventType.spec.source = my-other-user/my-other-repo, and the other with EventType.spec.type = pull_request and EventType.spec.source = my-other-user/my-other-repo....

Does it make more sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding the auto-registered EventType's YAML to the doc will make it clearer what is going on.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks nachocano. I got it.

Our design revolves around the following core requirements:

1. We should have a Registry per namespace to enforce isolation.
2. The Registry should contain the event types that can be consumed from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the user really want to share event types (i.e. events he/she defined with his/her authentication info) in Registry ? After applying this new feature, every event a user created has to be shared with others who has access to the namespace. Will it cause any security concern ?

Copy link
Contributor Author

@nachocano nachocano Apr 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good question, and the answer is that I'm not sure :)
You are right, as of now, EventTypes are namespaced-scope and will potentially be consumed from anyone within the namespace that is consuming from the particular broker that has that EventType.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

events he/she defined with his/her authentication info

What does this mean? My reading of the proposal is that the EventTypes themselves have no authentication info inside them. They do imply that some other piece has the authentication info (e.g. something must have authentication info to connect to GitHub's my-other-user/my-other-repo). But visibility of the registry itself does not reveal the authentication info, because it doesn't contain any. It may act as a kind of oracle to show what other auth info must exist.


In general I think there are two possible security concerns (please correct me if there are others):

  1. A user that didn't know about super-secret-event looks in the registry and sees that it exists.
  2. The user now uses that knowledge to create a Trigger to process super-secret-events.

For 2, I don't think this is a problem. Because if the user can create Triggers, they always can create a Trigger that accepts all events. So knowing super-secret-event exists may make things easier, but does not actually expose more information.

For 1, this seems like a fairly narrow case. It involves users that can read EventTypes, but not create Triggers. And for the most part, can't read Triggers (as then they could see a Trigger with a filter for super-secret-event). I haven't thought it all the way through if this leaks any useful information.


1. We should have a Registry per namespace to enforce isolation.
2. The Registry should contain the event types that can be consumed from
the eventing mesh. If an event type is not ready for consumption, we should explicitly indicate so.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have tended to use the nomenclature that each Broker is its own 'eventing mesh'. As the registry is per namespace, not per Broker, does this mean that it is marked as ready for consumption if any Broker in the namespace has it available?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is explained below by having each EventType target exactly one Broker.


- `schema` is a URI with the EventType schema. It may be a JSON schema, a protobuf schema, etc. It is optional.

- `broker` refers to the Broker that can provide the EventType.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a list? Or would different Brokers in the same namespace use distinct EventTypes? E.g. if both Brokers a and b have an EventType repopush, do I expect/require there to be two EventType CRs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current proposal, yes. You will expect to have two EventType CRs. But it might not be correct though.
I think having it as a string it makes it easier for checking in the Broker ingress whether the EventType is registered or not using a label selector on the broker name.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

brokerName?

Should there be a convention of naming the EventType with the name of the broker if the broker isn't default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used broker as the TriggerSpec uses broker instead of brokerName, but I'm ok with both.

I'm not entirely sure if we should add the name of the broker as part of the EventType name. Maybe in the future we want to support cross-broker EventTypes, and so on, and the names might be confusing? Though not sure.

I was thinking that by adding additional columns (e.g.., broker) to the output of kubectl get eventtypes, we could easily identify the EventTypes of each broker.


- `type` is authoritative. This refers to the Cloud Event type as it enters into the eventing mesh.

- `source`: an identifier of where we receive the event from. This might not necessarily be the Cloud Event source
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of this source not lining up with Trigger's spec.filter.sourceAndType.source. I think because the fields are named the same, they should have the same value. Otherwise there is cognitive overload when creating a Trigger (I want to only get events that match this one EventType from the registry, how I do I?). Therefore, this by definition must be the CloudEvent source attribute. However, the receive adapter MAY change that attribute before sending it to the Broker.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added more explanations to the source field. Let me see if it makes more sense.
I'm not changing the cloud event source, but rather adding a new custom extension.
Both the broker ingress and the triggers will see if that extension is present, and if so, use that. If not, then resort to the cloud event source. This was my way of making 'source' useful.

Might not need this when start supporting advanced filtering on cloud event sources.
But even then, if we rely on the cloud event source to uniquely identify EventTypes, e.g., (spec.type, spec.source, spec.schema), then the Registry will blow up in the sense that we will be adding a bunch of identical EventTypes...

For example (omitting irrelevant fields)

type: dev.knative.source.github.pull_request
source: https://github.com/nachocano/eventing/pull/1

type: dev.knative.source.github.pull_request
source: https://github.com/nachocano/eventing/pull/2

...

type: dev.knative.source.github.pull_request
source: https://github.com/nachocano/eventing/pull/100

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its going to be confusing if the CE.source != the event producer, especially if we then add a new field ("from") to hold that info. This means that the sink/app code will need to have specialized CE processing logic to know to look for the true event producer in a non-CE-standard location.

```

By applying this file, two EventTypes will be registered, with types `push` and `pull_request`,
source `my-other-user/my-other-repo`, for the `default` Broker in the `default` namespace.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source my-other-user/my-other-repo

I thought GitHub's source was longer, including info such as the exact pull request this event is about (e.g. my-other-user/my-other-repo/pull/1015). Does this imply that the GH receive adapter is now sending this shortened form as its CloudEvent source attribute?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm sending it as a new custom extension 'from'. Will add details here, as I missed it.
Some of the proposed changes are here: knative/eventing-contrib#300

docs/registry/README.md Outdated Show resolved Hide resolved

```

By applying this file, two EventTypes will be registered, with types `push` and `pull_request`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is that the single Event source CR, github-source-sample, creates an EventType for each type of event it will create. In this case, it will create two, one for push and another for pull_request. So something like:

apiVersion: eventing.knative.dev/v1alpha1
kind: EventType
metadata:
  name: gh-push
  namespace: default
  owner: # Owned by github-source-sample?
spec:
  type: repo:push
  source: my-other-user/my-other-repo
  broker: default
---
apiVersion: eventing.knative.dev/v1alpha1
kind: EventType
metadata:
  name: gh-pull-request
  namespace: default
  owner: # Owned by github-source-sample?
spec:
  type: repo:pull_request
  source: my-other-user/my-other-repo
  broker: default

docs/registry/README.md Outdated Show resolved Hide resolved
docs/registry/README.md Outdated Show resolved Hide resolved
docs/registry/README.md Outdated Show resolved Hide resolved

1. We should have a Registry per namespace to enforce isolation.
2. The Registry should contain the event types that can be consumed from
the eventing mesh. If an event type is not ready for consumption, we should explicitly indicate so.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would cause an EventType to "not [be] ready for consumption"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g., that the broker doesn't exist or the broker is not Ready

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So EventType's ready is just a proxy for its Broker's ready?

Copy link
Contributor Author

@nachocano nachocano Apr 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now, yes. Broker existence and broker readiness.
I think is hard to add conditions like whether the receiver adapter is ready if you manually create the EventType pointing to a GH repo, for example.

@devguyio
Copy link
Contributor

devguyio commented Apr 3, 2019

as already mentioned on slack, after re-reading the registry design proposal, I can’t help feeling if we introduce in the EventType CRD a notion to the eventing source (i.e. GitHub source) it came from, we will remove that tension between CE source sometimes being too specific and the need to understand from where that EventType was originally registered. So I can imagine an EventType that looks as follows

apiVersion: eventing.knative.dev/v1alpha1
kind: EventType
metadata:
  name: repopush
  namespace: default
spec:
  type: dev.knative.source.github.pull_request
  source: https://github.com/nachocano/eventing/pull/1
  sourceRef:
      name: "my-other-user/my-other-repo"
      namespace: "default"
  schema: /my-schema
  broker: default

In general this looks really good to me and the registry is a feature that IMHO will improve the user experience significantly .

@nachocano
Copy link
Contributor Author

https://github.com/nachocano/eventing/pull/1

@Abd4llA one thing that I missed to point out in slack is that this information is not available at the time you instantiate the EventType (unless it is auto-registered in the Broker ingress).
You don't have that <pull_id> in advance, that is why the Cloud Event source is somewhat useless.
Also, by adding this source, we will probably start having a bunch of duplicated EventTypes, in case we use the triplet (spec.type, spec.source, spec.schema) as "primary key" for the EventType.

That is mainly why I decided to just use the useful bits, i.e., my-other-user/my-other-repo. Although the field name source might not be the right name.

Thanks for reviewing the proposal!

broker: dev
```

This would register the EventType named `repofork` with type `repo:fork`, source `my-other-user/my-other-repo`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nachocano I don't really get this statement about the source field being useless. Why is exact matching needed?
By the way, in the CE group there are currently discussions going on. They are also discussing github examples.
@duglin What do you think?


### EventType CRD

We propose having a namespaced EventType CRD. Here is an example of how a CR would look like:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have my doubts that a CRD is the right approach for this kind of information. If I understand it correctly, the registry is supposed to be a way for the user to look up event types. The CRs do not describe any desired state on the cluster, and they replicate information the broker got from or through the sources.

Copy link
Contributor Author

@nachocano nachocano Apr 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it correctly, the registry is supposed to be a way for the user to look up event types.

Yes, so that users can then do kubectl get eventtypes -n some-namespace and get the EventTypes that were instantiated.
What do you think will be a better approach instead of a CRD? The reason I'm asking is because I'm kind of new in this k8s, knative world, so maybe I'm missing some important piece.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nachocano I forgot to share also that I've a concern about this being a CRD per EventType for the cases when there are a large number of EventTypes. It is typical for some of our "Event Sources/Enterprise Application" to have around 100+ event types. what are your thoughts around that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deissnerk @Abd4llA sorry I may have expressed this wrong.

There is just one CRD with kind EventType. And then EventType CRs will be created that will instantiate that CRD. There is no CRD per EventType, is just one.

Similar to Triggers, there is one CRD with kind Trigger, and then you can instantiate Trigger CRs with what you want to consume.

Does that make more sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

around 100+ event types

I guess we will have 100+ EventType CRs in the Registry. Do you think that can be a problem?
Same thing can happen with Triggers (if you create very specific ones to just do exact matching on type for example). Although probably the typical usage pattern will be to have # triggers << # eventTypes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nachocano
The challenge with our event types is that they are delivered using some messaging middleware. The source implementation will not know the event types. It might subscribe to a specific topic that is reserved to one event type and source, but it could also do a wildcard subscription.
Perhaps we have to distinguish between the events that are available to a Trigger and the events, for which you could configure a Source.
Of our 100+ events you probably would see a small subset, as most Source definitions wouldn't subscribe to all of them. Consequently creating the Source CRs may be a bigger challenge than the Triggers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deissnerk makes sense.
Yeah, for the cases where the Source do not actually know in advance the eventTypes that can be sent (e.g., upon instantiation of a containerSource), the current proposal is to resort to the broker auto-registration policy. But that is kind of a workaround and I'm not entirely convinced.
We definitively need further discussions on this. Thanks for bringing up that point.

@devguyio
Copy link
Contributor

devguyio commented Apr 3, 2019 via email

@nachocano
Copy link
Contributor Author

I wonder how do you find it from a UX when the user kubectl get eventtype and gets hundreds of objects.

Hmm, my first thought is trying to filter things out with jsonPaths or something like that. You list the EventTypes for a namespace, and also use jsonPaths to filter them, for example list just for a particular broker. It's out-of-the-box k8s stuff, I'm not much aware of a better mechanism.

@grantr or @Harwayne do you guys know if there is a better way of doing this?

## Objective

Design an **initial** version of the Registry that can support discoverability of
the different event types that can be consumed from the eventing mesh.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try but I doubt I'll be able to make the call today... but we can more added here to explain why this is needed and why going directly to the event producer for this info (like people do today) isn't sufficient? As a user of this system, unless the Registry contains all possible event types produced (and that I can see) then it doesn't provide me much value over going directly to the producer. For example, only have the list of events "seen" doesn't seem very interesting since it's almost random. If the events in the registry do not align (naming-wise) with that I see from the producer, then it's less interesting to me because I'm now locked-in.

So far this doc explains the design but not the rationale behind why we're doing it and what the benefit is to the user over what they do today.

Referring to the User Stories document, which seems that have been forgotten.
Mainly emphasizing that this is intended for a Registry MVP.
@nachocano nachocano mentioned this pull request Apr 8, 2019
@duglin
Copy link

duglin commented Apr 9, 2019

Based on yesterday's call I wanted to document some of the things that I think the design docs needs to cover to ensure we're all headed in the same direction and so that new folks can easily understand the goals of the Registry.

  • is the Registry meant to be used just for creating Triggers or for also setting up EventSources?
  • related, if I have a simple use case where I'm just setting up an EventSource and my KnService is its Sink, (no triggers involved), is there a need/use for the Registry?
  • is the Registry meant to be used in a single-user environment where the same person is setting up both the EventSource and the destination Sink?
  • does a user need to know which type of environment they're in before they should know if they should look at the Registry? ie. is a Registry always going to be there and if not under what conditions will it be?
  • once an EventSource is created, how is a new one created with different auth in an env where the user is really just meant to deal with Triggers? This may not be a Registry specific question (I'm not sure) but if one of the goals of the Registry to make it so that the user only deals with Triggers and the info in the Registry, I think this aspect comes into play
  • I've heard conflicting messages around whether the Registry is just a list of Event Types or it will also be a list of Event Sources so that the user doesn't need to query the CRDs to get the list. We need to be clear about this.

If a Registry is meant to provide some level of abstraction for people, so they don't need to go to each event producer's docs to know how to subscribe, it seems to me that the discovery side of the Registry needs to be available prior to an EventSource being instantiated - mainly, so that the event types can be use in that resource and not just in Triggers. Therefore, I can't help but think that the population of the Registry should be done prior to (and independent of) the creation of the EventSources. I view the creation of an EventSource similar to creating a Subscription, so I need that discovery info before I can create a Subscription. In order to do this, I wonder if the EventSource populating the Registry should happen when the EventSource is loaded into the system - meaning when the EventSource's CRD is defined (not when an instance of the CRD is created). This would the mean the event types in the Registry are not based on "what's been seen" and are available immediately to the user regardless of whether they're creating just an EventSource, or Triggers.

@nachocano
Copy link
Contributor Author

Thanks for the questions @duglin .
Just added a FAQ section at the bottom of the doc (plus some updates to the doc itself) with some of the questions and my current thoughts.

docs/registry/README.md Outdated Show resolved Hide resolved

It's mainly intended for helping the `Event Consumer` with the `discoverability` use case. Therefore,
it is meant for helping out creating Triggers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for MVP of the registry, the expectation is that in order for a user to create a Trigger, they will need to know what event types are available, and hence there should be a way to discover them.
For MVP, these are Event Types that have been configured to be sent to a given broker, hence a Trigger for them can be created.

I think Event Source CRD is a "Potential" Event. Without instantiating one, you have no events. Hence the "Potential". Once an Event Source CR has been created, then those events are available to create Triggers for.

Now, I could certainly see a world (not in MVP), where an Event Source CRD could be pointed to by a registry.
So, Event Source CRD could be added to a registry, roughly looking like this:
{Broker, Source, Type} -> Ref To the CRD.

This way if a user looks at the Registry, they would see the Event Types available. Now, in order to actually create the CR, they will probably need to specify additional information (Like Credentials, etc.), and this is where things get wonky because you don't want to necessarily put those into the Trigger spec.

So, I think that' goal is doable, I just think by providing the MVP experience that focuses on the discovery of event types available (not potentially available) would allow us to provide value to the user and then add more niceties on top of it later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, it's not only the matter of say providing credentials, there needs to be a way to describe the "subscription schema" that a user can inspect and understand things like, what format the credentials are (unless we come up with a standard, convention, or something like that), but in any case, I just think that providing a Registry as a concept makes users lives better, then in the next release we could start tackling these additional trickier bits.

- Does a user need to know which type of environment they're in before they should know if they should look at the Registry?
In other words, is a Registry always going to be there and if not under what conditions will it be?

If there are Sources pointing to Brokers, then there should be a Registry. `Event Consumers` will always be able to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since registry is made of CR entries, there's always a registry, but there just might not be any CR => empty registry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the right way of putting it. You are right, you can always kubectl get eventtypes, in case there is no CR, then the registry will be empty. But there will always be one.
Will update that. Thanks for the comment!


- Once an Event Source is created, how is a new one created with different auth in an env where the user is really just meant
to deal with Triggers? This may not be a Registry specific question but if one of the goals of the Registry is to make it
so that the user only deals with Triggers using the info in the Registry, I think this aspect comes into play.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so I think we should approach this in two steps. Provide the MVP where the Source needs to be created separately and say in .7 release tackle something like possibly being able to create a Trigger from a Source CRD. Namely the credentials will take some work, etc.

docs/registry/README.md Outdated Show resolved Hide resolved
docs/registry/README.md Outdated Show resolved Hide resolved
@nachocano
Copy link
Contributor Author

Thanks for your comments @sixolet.

I guess the main issue I see from your comments is that currently we have source as a single field instead of a list. And that it should probably be optional. It's a valid point and I think you might be right. My only worry with a list is how are we going to efficiently update it. In case Event Sources send CloudEvents with the source attribute containing dynamic info, it can become quite big for a particular EventType. I might be wrong though, and there is no need to update it?
Let's see what other people think.

docs/registry/README.md Outdated Show resolved Hide resolved
@nachocano nachocano changed the title Registry Design Proposal Registry Design Apr 26, 2019
@vaikas
Copy link
Contributor

vaikas commented May 3, 2019

/approve

@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nachocano, vaikas-google

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 3, 2019
@vaikas
Copy link
Contributor

vaikas commented May 6, 2019

/lgtm

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label May 6, 2019
@knative-prow-robot knative-prow-robot merged commit e4637d3 into knative:master May 6, 2019
syedriko pushed a commit to syedriko/eventing that referenced this pull request May 8, 2019
* Registry proposal

* Updates after comments

* adding broker ingress policies for reference

* More clarifications. Restructuring the document a bit.
Referring to the User Stories document, which seems that have been forgotten.
Mainly emphasizing that this is intended for a Registry MVP.

* started a FAQ list at the bottom, for further clarifications.

* updates after Ville's comments

* addressing some of Doug's comments.

* typos

* removing the custom extension from, as we will be able to use
Ce-source.

* typo

* some updates

* adding Grant's example

* cosmetics

* cosmetic

* updating types

* typo

* removing broker ingress policies

* changing CR to CO

* updating source values

* adding the protocol to match the examples in CloudEvents spec
chizhg pushed a commit to chizhg/eventing that referenced this pull request May 13, 2019
* Registry proposal

* Updates after comments

* adding broker ingress policies for reference

* More clarifications. Restructuring the document a bit.
Referring to the User Stories document, which seems that have been forgotten.
Mainly emphasizing that this is intended for a Registry MVP.

* started a FAQ list at the bottom, for further clarifications.

* updates after Ville's comments

* addressing some of Doug's comments.

* typos

* removing the custom extension from, as we will be able to use
Ce-source.

* typo

* some updates

* adding Grant's example

* cosmetics

* cosmetic

* updating types

* typo

* removing broker ingress policies

* changing CR to CO

* updating source values

* adding the protocol to match the examples in CloudEvents spec
matzew added a commit to matzew/eventing that referenced this pull request Dec 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cla: yes Indicates the PR's author has signed the CLA. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.