Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deploy NFD as topology-updater component #88

Merged
merged 7 commits into from
Mar 23, 2022
Merged

Conversation

Tal-or
Copy link
Contributor

@Tal-or Tal-or commented Mar 9, 2022

This change will allow deployer to deploy NFD as the topology-updater component.
NFD is the u/s version of RTE, thus it's is important to have the ability to use it.

The default topology-updater component is still RTE since it's more mature and tested for quite some time.

  • Refactor the code to allow easier integration for NFD component
  • Add NFD component
  • Add unit-tests
  • Add E2E tests

This PR provides NFD support only for K8S.
We'll add support for OCP in the future when NFD will catch up with RTE (from abilities perspective)

Copy link
Collaborator

@ffromani ffromani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the general direction seems good
in a future PR, we will need to clarify (or, even better, automate) how we pull the manifests for the external projects like NFD and scheduler-plugins and RTE.
Thinking about it, we should clarify indeed what's the relationship between the deployer and these manifests, because the users would benefit to know which source is truly authoritative.
But again, all of this should be addressed by a separate future PR.

@Tal-or
Copy link
Contributor Author

Tal-or commented Mar 10, 2022

the general direction seems good

The idea is not so good because it was relying on the fact that NFD and RTE have the same K8S resources (i.e. both have DaemonSet, ClusterRole, etc.)

This assumption was wrong because NFD also has separate manifests for nfd-master.
Once the group of manifests is different the whole design is broken.
I need to reevaluate and come up with a different solution.

@Tal-or Tal-or force-pushed the deploy_nfd_topologyupdater branch 4 times, most recently from 565a63b to a0a594e Compare March 14, 2022 09:38
Tal-or added 2 commits March 14, 2022 15:01
Add all the relevant yamls for the nfd-master and nfd-topology-updater components

Signed-off-by: Talor Itzhak <[email protected]>
- Add nfd constants
- Validate the new component
- Expand the DaemonSet function
- Expand Deployment function with namespace parameter

Signed-off-by: Talor Itzhak <[email protected]>
@Tal-or Tal-or force-pushed the deploy_nfd_topologyupdater branch from a0a594e to 8044411 Compare March 14, 2022 13:04
@Tal-or Tal-or changed the title WIP: Deploy NFD as topology-updater component Deploy NFD as topology-updater component Mar 14, 2022
@Tal-or Tal-or force-pushed the deploy_nfd_topologyupdater branch from 8044411 to 5ec198f Compare March 14, 2022 14:32
Copy link
Collaborator

@ffromani ffromani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initial review
in general there are some minor things. My biggest concern is trying to reduce the API surface as much as possible. I'm looking at the Deployable interface, let's see if we can get rid of it.

pkg/manifests/manifests.go Show resolved Hide resolved
pkg/manifests/manifests.go Outdated Show resolved Hide resolved
pkg/manifests/nfd/nfd.go Outdated Show resolved Hide resolved
pkg/manifests/nfd/nfd.go Outdated Show resolved Hide resolved
@@ -30,6 +30,12 @@ import (
"github.com/k8stopologyawareschedwg/deployer/pkg/tlog"
)

type Deployable interface {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next time, I suggest to consider adding the interface/new API alongside with its intended use case.
In general we should demostrate new APIs/interfaces as soon as possible, splitting them out only if the change is really huge.

pkg/deployer/updaters/rte/rte.go Outdated Show resolved Hide resolved
pkg/images/consts.go Show resolved Hide resolved
test/e2e/positive.go Outdated Show resolved Hide resolved
test/e2e/positive.go Outdated Show resolved Hide resolved
test/e2e/positive.go Outdated Show resolved Hide resolved
@ffromani
Copy link
Collaborator

@ffromani
Copy link
Collaborator

played a bit with the implementation. Overall looks good! Let's spend a bit more time trying to refining the API side, and if we get the chance let's do some polishing, but it's close to be ready.

@Tal-or
Copy link
Contributor Author

Tal-or commented Mar 21, 2022

played a bit with the implementation. Overall looks good! Let's spend a bit more time trying to refining the API side, and if we get the chance let's do some polishing, but it's close to be ready.

Thank you for your comments.
I reviewed your proposal, but I am still inclined to the Deployable interface and let me try to explain why:

  1. In your implementation, for every common function (i.e CreatableObjects, DeleteableObjects, ToObjects) you should check for the requested type, while when you're working via an interface you don't, which makes the code shorter and a bit more elegant.
  2. The Deployable interface can be expanded to the sched package in the future as well, since it implements the Deployable interface's function.

BTW we can change the interface name as well to something more elegant/catchy.

In general I would like to hear from you, what is so bad with that API that we're trying to avoid from it here.

Implement the business logic that creates the nfd manifests and render it

Signed-off-by: Talor Itzhak <[email protected]>
@Tal-or Tal-or force-pushed the deploy_nfd_topologyupdater branch 3 times, most recently from 24bd3c7 to f01854a Compare March 21, 2022 11:36
@ffromani
Copy link
Collaborator

Thank you for your comments. I reviewed your proposal, but I am still inclined to the Deployable interface and let me try to explain why:

1. In your implementation, for every common function (i.e `CreatableObjects`, `DeleteableObjects`, `ToObjects`) you should check for the requested type, while when you're working via an interface you don't, which makes the code shorter and a bit more elegant.

True. But in your code we need a form of dispatching (= a if chain) as well, only at module level: eed7c0d#diff-ead887f2895c525a257ae9593cdfd903139ab12da417dc111b47d059844a0a96R46 (and others)
Speaking of code size, the direction I suggest reduces the need of submodules and reduces the code footprint (and also, or at least that's the idea, the API surface).
Least but not least, I hear you about code generic being a quality, but for this case? we know exactly how many cases we need to deal with, and we can predict very accurately how many extensions would we need: zero :)
When deployer can handle NFD, it is actually "done" as it was planned and designed originally, no more need for extensions or future changes, besides bugs and internal improvements.
So genericity here brings value if comes for free and/or dramatically simplifies the code and/or improves the dev UX, and I'm not sure either of these really apply in this specific case.

 2. The `Deployable` interface can be expanded to the sched package in the future as well, since it implements the `Deployable` interface's function.

We can, but we should? I don't really see a benefit here. A consistent interface among different modules is good enough I believe.

BTW we can change the interface name as well to something more elegant/catchy.

Another problem is ToCreatable/ToDeletable and the Waitable interfaces aren't great engineering either (author speaking :) ) So I'd rather not go further along that path as well.
We can perhaps remove this in the future, but the very same considerations to deployer approaching the "done" stage also apply here.

In general I would like to hear from you, what is so bad with that API that we're trying to avoid from it here.

It is not so bad by any mean. But I think is not the best direction either.

@Tal-or Tal-or force-pushed the deploy_nfd_topologyupdater branch from f01854a to e834d6a Compare March 23, 2022 15:00
Tal-or and others added 4 commits March 23, 2022 17:00
Since rte and nfd deployment procedure is almost identical, let's have a single implmentation for them both, but yet keep the API changes as minimal as possible.

Signed-off-by: Talor Itzhak <[email protected]>
Co-authored-by: Francesco Romani <[email protected]>
Added coverage for the following packages:
- manifests/updates

Signed-off-by: Talor Itzhak <[email protected]>
Control the image selection via an environment variable or use the latest release as default: https://github.com/kubernetes-sigs/node-feature-discovery/releases

Signed-off-by: Talor Itzhak <[email protected]>
Signed-off-by: Talor Itzhak <[email protected]>
@Tal-or Tal-or force-pushed the deploy_nfd_topologyupdater branch from e834d6a to d763bb5 Compare March 23, 2022 15:01
@Tal-or
Copy link
Contributor Author

Tal-or commented Mar 23, 2022

I change the implementation as you suggested.
Please have another round when possible

Copy link
Collaborator

@ffromani ffromani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
thanks for the updates

@ffromani ffromani merged commit 7f56774 into main Mar 23, 2022
@ffromani ffromani deleted the deploy_nfd_topologyupdater branch March 23, 2022 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants