Add KEP for AddOnPlacementScoreGenerator #93

mvazquezc · 2023-04-13T14:05:13Z

As discussed at the community meeting on April 6th, this is the KEP for adding the AddOnPlacementScoreGenerator API.

haoqing0110 · 2023-04-23T03:20:33Z

enhancements/sig-architecture/32-extensiblescheduling/score-generator/README.md

+  addOnPlacementSelector:
+    name: latency
+    namespace: cluster1
+  latencies:


My question is, should this field latencies be a more general field? and how could a user define a URL here, what's the requirement for the URL and the output?

Hey @haoqing0110 yes, other than more generic to me this would be "any string" and even we could have more than one, something like this for example:

apiVersion: cluster.open-cluster-management.io/v1alpha1 kind: AddOnPlacementScoreGenerator metadata: name: cluster1-generator namespace: cluster1 spec: addOnPlacementSelector: name: latency namespace: cluster1 latencies: - name: redhat-com-avgLatency url: https://redhat.com runs: 2 waitBetweenRuns: 10s - name: google-com-avgLatency url: https://google.com - name: linux-com-avgLatency url: https://linux.com energy: - name: some-name someValue: some-value

Then, the controller interested in latencies will consume latencies , the controller interested in energy will consume energy.

Thanks for the explanation. This AddOnPlacementScoreGenerator sounds like a configuration for the Addon controller, and to generate different scores, developers still need to develop their own addon controller, correct? Sounds similar to the idea of https://github.com/open-cluster-management-io/enhancements/tree/main/enhancements/sig-architecture/58-addon-configuration, not sure if this can satisfy the motivation of this proposal, and this proposal is like to propose a new addon configuration called AddOnPlacementScoreGenerator.

And I'm wondering could the controller to consume latencies and energy could be a unified controller, it can consume different kind of data source, eg, an URL https://redhat.com or some host path /some/path, or some other things. And this API just defines the supported data source types.

cc @qiujian16 , want to know your suggestions.

@haoqing0110 the use cases defined by 58-addon-configuration would work for us, I have checked the examples and the example with ManagedProxyConfiguration is very similar to what we want.

And answering your second comment, yes, the same controller could use latencies and energy data.

I think this is more like a cluster scoped resource that applies to all clusters, it is less likely that we want different configs for different clusters. Also how could we parse the result of the http responses? It might be straightforward for latency. But if we try to fetch data from a metrics server or 3rd party services to generate the score, the returned output varies and we might need a way to define how to parse the result.

Hey @qiujian16,

I think that for latencies global object or per-cluster object would both work. In terms of data, I was thinking of an unstructured structure where people developing custom controllers with the addon framework will consume the data the way they need it. I don't think this should be a controller taking care of multiple data sources, instead a single controller taking care of a data source, so if anything were wrong with the data itself it wouldn't affect the other add-ons.

+1 on Mario's approach on a single controller taking care of a single data source type.

To my understanding, in this API we should define

how to get the raw data. It can not be only latency. It probably could be a field type and latency could be a certain type to get raw data.

how to convent raw data to the score

Signed-off-by: Mario Vazquez <[email protected]>

openshift-ci · 2023-07-03T08:48:54Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mvazquezc
Once this PR has been reviewed and has the lgtm label, please assign qiujian16 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

qiujian16

put some comment, but I haven't thought through the whole idea yet. Will take some time this week and add more comments.

qiujian16 · 2023-05-19T01:45:31Z

enhancements/sig-architecture/32-extensiblescheduling/score-generator/README.md

+  name: cluster1-generator
+  namespace: cluster1
+spec:
+  addOnPlacementSelector:


is this actually the name of the score?

not really, it's the name of the AddOnPlacementScore. This selector would be used by the controller generating the AddOnPlacementScore to get the AddOnPlacementScoreGenerator that applies to this specific AddOnPlacementScore.

qiujian16 · 2023-05-19T01:48:15Z

enhancements/sig-architecture/32-extensiblescheduling/score-generator/README.md

+
+~~~yaml
+apiVersion: cluster.open-cluster-management.io/v1alpha1
+kind: AddOnPlacementScoreGenerator


Maybe AddOnPlacementScoreDataSource?

To the comment below:

I don't think we want to create types... I was thinking of an unstructured data type where the controller consuming the API should take care of reading the fields it expects to be present in order to generate the required scores.

Controller should take care of that.

I would not suggest unstructured type, it will lose the schema validation and make the API unbounded.

qiujian16 · 2023-07-04T04:02:45Z

enhancements/sig-architecture/32-extensiblescheduling/score-generator/README.md

+  addOnPlacementSelector:
+    name: latency
+    namespace: cluster1
+  latencies:


To my understanding, in this API we should define

how to get the raw data. It can not be only latency. It probably could be a field type and latency could be a certain type to get raw data.

how to convent raw data to the score

qiujian16

I am still not convinced that this should be put in the API repo. The major uncertainty is how this API can be evolved and consumed by multiple score providers. It seems to me a new score provider has to update this API or build their own CRDs to config how score is generated. It does not seems change the fact that AddOnPlacementScore is the only integration point. But I am 100% ok if we create a separated repo and maintain this API and controller there.

qiujian16 · 2023-07-14T02:37:34Z

enhancements/sig-architecture/32-extensiblescheduling/score-generator/README.md

+
+~~~yaml
+apiVersion: cluster.open-cluster-management.io/v1alpha1
+kind: AddOnPlacementScoreGenerator


I would not suggest unstructured type, it will lose the schema validation and make the API unbounded.

mvazquezc · 2023-07-14T07:08:15Z

I am still not convinced that this should be put in the API repo. The major uncertainty is how this API can be evolved and consumed by multiple score providers. It seems to me a new score provider has to update this API or build their own CRDs to config how score is generated. It does not seems change the fact that AddOnPlacementScore is the only integration point. But I am 100% ok if we create a separated repo and maintain this API and controller there.

I'm not sure I follow you. The idea behind this CRD is to allow the customization of score generations by introducing a new API that can be consumed by the addon controllers, I don't think we want to have a dedicated controller for this API.

Multiple score providers will be required to read this API in case they want to get score-generation customizations. In case these score providers don't need customizations they're fine not reading this CRD.

To me, this API would be like having your config in a ConfigMap and reading it upon controller start.

qiujian16 · 2023-07-14T07:50:55Z

I think I understand the intention, but I do not see how it can be achieved when a new provider comes without adding additional field in this API. For example, if I have a new provider comes and allow user to configure it to score the cluster based on resource usage, I have to update the API spec or introduce a new API to config. I do not see how you can define a common api schema that can adapt to different providers.

mvazquezc · 2023-07-14T07:55:31Z

I think I understand the intention, but I do not see how it can be achieved when a new provider comes without adding additional field in this API. For example, if I have a new provider comes and allow user to configure it to score the cluster based on resource usage, I have to update the API spec or introduce a new API to config. I do not see how you can define a common api schema that can adapt to different providers.

That's why I think it should be unstructured, so anyone can put their customization data without having to change the API. The controllers are responsible for reading their data and making sure that in case of misconfiguration, they can gracefully handle it.

qiujian16 · 2023-07-14T08:24:41Z

That's why I think it should be unstructured, so anyone can put their customization data without having to change the API. The controllers are responsible for reading their data and making sure that in case of misconfiguration, they can gracefully handle it.

if it is unstructured, why we need to define an API schema instead of directly using a configmap :)? I think we need to at least define some general concept in this API rather than arbitrary type...This is why I am struggling with how this API can provide commonality in this scenario.

mvazquezc · 2023-07-17T09:06:33Z

if it is unstructured, why we need to define an API schema instead of directly using a configmap :)? I think we need to at least define some general concept in this API rather than arbitrary type...This is why I am struggling with how this API can provide commonality in this scenario.

It could be done via ConfigMaps, we won't be able to leverage things defined in the KEP like addOnPlacementSelector but I guess that's not that bad.

I cannot think of a common structured API that can be leveraged without having to update API types for every new provider. Requirements for different data providers may be very different and as such, hard to put in common.

openshift-ci bot requested review from deads2k and qiujian16 April 13, 2023 14:05

mvazquezc force-pushed the addon-placement-score-generator branch from f9cf88f to 8031826 Compare April 13, 2023 14:25

haoqing0110 reviewed Apr 23, 2023

View reviewed changes

Added scoregenerator API proposal

2bf05f9

Signed-off-by: Mario Vazquez <[email protected]>

mvazquezc force-pushed the addon-placement-score-generator branch from 8031826 to 2bf05f9 Compare July 3, 2023 08:48

qiujian16 reviewed Jul 5, 2023

View reviewed changes

qiujian16 reviewed Jul 14, 2023

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add KEP for AddOnPlacementScoreGenerator #93

Add KEP for AddOnPlacementScoreGenerator #93

mvazquezc commented Apr 13, 2023

haoqing0110 Apr 23, 2023

mvazquezc Apr 25, 2023 •

edited

Loading

haoqing0110 Apr 28, 2023 •

edited

Loading

haoqing0110 Apr 28, 2023

haoqing0110 Apr 28, 2023

mvazquezc Apr 28, 2023

qiujian16 May 8, 2023

mvazquezc May 15, 2023

fenar May 17, 2023

qiujian16 Jul 4, 2023

openshift-ci bot commented Jul 3, 2023

qiujian16 left a comment

qiujian16 May 19, 2023

mvazquezc Jul 5, 2023

qiujian16 May 19, 2023

mvazquezc Jul 5, 2023

mvazquezc Jul 5, 2023 •

edited

Loading

qiujian16 Jul 14, 2023

qiujian16 Jul 4, 2023

qiujian16 left a comment

qiujian16 Jul 14, 2023

mvazquezc commented Jul 14, 2023

qiujian16 commented Jul 14, 2023

mvazquezc commented Jul 14, 2023

qiujian16 commented Jul 14, 2023 •

edited

Loading

mvazquezc commented Jul 17, 2023

Add KEP for AddOnPlacementScoreGenerator #93

Are you sure you want to change the base?

Add KEP for AddOnPlacementScoreGenerator #93

Conversation

mvazquezc commented Apr 13, 2023

Choose a reason for hiding this comment

mvazquezc Apr 25, 2023 • edited Loading

Choose a reason for hiding this comment

haoqing0110 Apr 28, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

openshift-ci bot commented Jul 3, 2023

qiujian16 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mvazquezc Jul 5, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

qiujian16 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mvazquezc commented Jul 14, 2023

qiujian16 commented Jul 14, 2023

mvazquezc commented Jul 14, 2023

qiujian16 commented Jul 14, 2023 • edited Loading

mvazquezc commented Jul 17, 2023

mvazquezc Apr 25, 2023 •

edited

Loading

haoqing0110 Apr 28, 2023 •

edited

Loading

mvazquezc Jul 5, 2023 •

edited

Loading

qiujian16 commented Jul 14, 2023 •

edited

Loading