
Move pkg/proxy to k8s.io/kube-proxy #92369

Closed
weiqiangt opened this issue Jun 22, 2020 · 55 comments
Assignees
Labels
area/code-organization: Issues or PRs related to kubernetes code organization
area/kube-proxy
kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
sig/network: Categorizes an issue or PR as relevant to SIG Network.

Comments

@weiqiangt

What would you like to be added:

Could you please move pkg/proxy to k8s.io/kube-proxy as an individual module?

Why is this needed:

Some CNIs would like to implement their own kube-proxy-like component for better performance, and would therefore like to reuse the kube-proxy code, e.g. antrea-io/antrea#772.

@weiqiangt weiqiangt added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 22, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jun 22, 2020
@athenabot

/sig network

These SIGs are my best guesses for this issue. Please comment /remove-sig <name> if I am incorrect about one.

🤖 I am a bot run by vllry. 👩‍🔬

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 22, 2020
@athenabot

/triage unresolved

Comment /remove-triage unresolved when the issue is assessed and confirmed.

🤖 I am a bot run by vllry. 👩‍🔬

@k8s-ci-robot k8s-ci-robot added the triage/unresolved Indicates an issue that can not or will not be resolved. label Jun 22, 2020
@wangyira

/assign @thockin

@cmluciano

I'm also starting to work on the kube-proxy config changes, so I can probably help out here if this is something we want to push forward.

@thockin
Member

thockin commented Jun 25, 2020 via email

@thockin thockin added area/code-organization Issues or PRs related to kubernetes code organization kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed triage/unresolved Indicates an issue that can not or will not be resolved. labels Jun 25, 2020
@uablrek
Contributor

uablrek commented Jun 26, 2020

Since I think extending kube-proxy with more features or proxiers is a closed path, creating specialized proxiers must be the way forward.

I think there will be more custom-made stand-alone proxiers in the future, living side by side with kube-proxy and selected with the service.kubernetes.io/service-proxy-name label:

apiVersion: v1
kind: Service
metadata:
  name: special-need-service
  labels:
    service.kubernetes.io/service-proxy-name: custom-proxy

I think people now copy code from kube-proxy in a kind of "wild-west" manner. IMO the ideal option would be to make the API that the "meta-proxier" uses official, and to provide the watching of services, endpoint slices, etc. as a library.
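
To make the idea concrete, here is a minimal sketch of what such a stand-alone proxier has to do today with plain client-go informers, watching only the Services that carry the service-proxy-name label. The proxier name custom-proxy, the resync period, and the trivial event handler are illustrative assumptions, not a proposed API:

package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(cfg)

	// Watch only Services that opted in to this proxier via the
	// service.kubernetes.io/service-proxy-name label.
	factory := informers.NewSharedInformerFactoryWithOptions(clientset, 30*time.Second,
		informers.WithTweakListOptions(func(o *metav1.ListOptions) {
			o.LabelSelector = "service.kubernetes.io/service-proxy-name=custom-proxy"
		}))

	svcInformer := factory.Core().V1().Services().Informer()
	svcInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			svc := obj.(*corev1.Service)
			fmt.Println("program dataplane for", svc.Namespace+"/"+svc.Name)
		},
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)
	select {} // a real proxier would run its sync loop here
}

A real proxier would also watch EndpointSlices and feed both into its sync loop; the point of a shared library would be to hide exactly this plumbing.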

I think @sbezverk has valuable and fresh experience in this area from the work with https://github.com/sbezverk/nfproxy.

A missing piece I see immediately is initialization/configuration. Here I would like to allow custom-made stand-alone proxiers to use the same config as the regular kube-proxy, for instance with a section that the regular kube-proxy ignores, a bit like the existing ipvs and iptables sections:

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
...
iptables:
  masqueradeAll: false
ipvs:
  excludeCIDRs: null
  scheduler: ""
custom-configs:
  - sctp-proxier:
...

The reasons are that some params may also be needed by custom proxiers (or by the proxy library code), and repeating them is error-prone; it would also be an indication that custom proxiers are something accepted, like CRI or CNI plugins.
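
A minimal sketch of how a custom proxier might read only its own section from that shared file, assuming the custom section is a map keyed by proxier name. The sctpProxierConfig struct, its field names, and the config path are hypothetical; the exact layout would be part of the KEP discussion:

package main

import (
	"fmt"
	"os"

	"sigs.k8s.io/yaml"
)

// sctpProxierConfig is a hypothetical section that only the custom proxier
// understands; per the proposal above, the regular kube-proxy would ignore it.
type sctpProxierConfig struct {
	ExternalOnly bool `json:"externalOnly"`
}

// sharedConfig picks out just the custom section from the full
// KubeProxyConfiguration file; all other fields are left to kube-proxy.
type sharedConfig struct {
	CustomConfigs map[string]sctpProxierConfig `json:"custom-configs"`
}

func main() {
	data, err := os.ReadFile("/var/lib/kube-proxy/config.conf")
	if err != nil {
		panic(err)
	}
	var cfg sharedConfig
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		panic(err)
	}
	fmt.Printf("sctp-proxier settings: %+v\n", cfg.CustomConfigs["sctp-proxier"])
}

This only works if all proxiers agree on where the shared file lives, which is part of why reusing kube-proxy's own config is attractive.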

I have a possible example/use case: the current SCTP load-balancing in kube-proxy is insufficient for some applications, and I see a possibility to create a custom SCTP proxier that only handles external services (receiving external traffic).

@thockin
Member

thockin commented Jun 26, 2020

@mcluseau was also looking at some infrastructure to make it easier to write new proxiers

@mcluseau
Contributor

mcluseau commented Jun 26, 2020

@thockin thanks for the tag. The draft (https://github.com/mcluseau/kube-proxy2/blob/master/doc/proposal.md) has not changed much since January, and I still have the same numbers. I recently got some time to make progress. The actual minimum amount of code for a proxier is currently this:

package main

import (
	"fmt"
	"os"
	"time"

	"github.com/mcluseau/kube-proxy2/pkg/api/localnetv1"
	"github.com/mcluseau/kube-proxy2/pkg/client"
)

func main() {
	client.Run(printState)
}

func printState(items []*localnetv1.ServiceEndpoints) {
	fmt.Fprintln(os.Stdout, "#", time.Now())
	for _, item := range items {
		fmt.Fprintln(os.Stdout, item)
	}
}

Proxiers have a baseline of 13 MB (on-disk size).

Examples here: https://github.com/mcluseau/kube-proxy2/tree/master/examples

@sbezverk
Contributor

@mcluseau what does the framework you propose buy me as a developer of a flavour of proxy? Does it make it faster? Not sure. Does it make it more reliable? Not sure either. Does it help get the nftables module into Google's Linux distribution? I really doubt it. So could you explain what it DOES buy me?

@mcluseau
Contributor

@sbezverk my approach is to decouple the k8s business logic from its actual application to a given system. Things like resolving the topology requested by the user down to the concrete targets for the specific node the proxy is running on are done on the proxy side; the proxier only applies those computed targets. Thus, developing a proxier means writing only the code for the subsystem you're targeting, like nftables, but it also enables applications like a local DNS resolver that would gain knowledge of which pods should be targeted from the current node.

As the API is made for the local node only, it's also simpler and I expect it to be more stable than the core API that needs to represent cluster policies and resources. The consequence should be (much) less maintenance on proxiers.

The API is available over gRPC and can be accessed via a local unix socket (i.e. /run/kube-proxy.sock), allowing the proxy to run in a network namespace that has access to the API while the proxier drives another namespace. It also allows proxiers to run as containers, so they can bring in whatever tools they need.
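
For illustration, a minimal sketch of dialing such a node-local gRPC endpoint from Go, using the socket path mentioned above. The generated localnetv1 client is left out because its constructor is specific to the kube-proxy2 draft:

package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// Dial the node-local proxy API over its unix socket; no TLS is needed
	// because filesystem permissions on the socket are the access boundary.
	conn, err := grpc.Dial("unix:///run/kube-proxy.sock",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// A proxier would hand conn to the generated localnetv1 client here and
	// stream ServiceEndpoints updates from it.
	log.Println("connected to", conn.Target())
}

Because the connection is just a socket, a proxier packaged as a container only needs that socket mounted in.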

I'm not a googler and I don't use Google's distribution, so if the kernel modules are missing, I can't do anything on this side.

@uablrek
Contributor

uablrek commented Sep 17, 2020

Calico with eBPF replaces kube-proxy, but instead of rewriting/copying the code it imports it from k8s.io/kubernetes/pkg/proxy:

https://github.com/projectcalico/felix/blob/7a7ffc82e2a803433ca6639582b4f74e034f75d9/bpf/proxy/proxy.go#L36-L39
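
As background for anyone trying the same thing: the k8s.io/kubernetes module pins its staging repos at v0.0.0, so importing pkg/proxy directly means adding a replace directive for each of them, roughly like this (the module name and versions are placeholders, and the real list of staging modules is longer):

module example.com/custom-proxier

go 1.20

require k8s.io/kubernetes v1.26.1

// k8s.io/kubernetes requires its staging repos at v0.0.0, so each one has to
// be pinned to the matching release by hand.
replace (
	k8s.io/api => k8s.io/api v0.26.1
	k8s.io/apimachinery => k8s.io/apimachinery v0.26.1
	k8s.io/apiserver => k8s.io/apiserver v0.26.1
	k8s.io/client-go => k8s.io/client-go v0.26.1
	k8s.io/component-base => k8s.io/component-base v0.26.1
	// ...and so on for the remaining k8s.io/* staging modules
)

Every Kubernetes minor release means updating all of these pins in lockstep, which is part of the maintenance cost being discussed here.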

After a brief discussion on Slack I learned that the problem was not dependencies on other packages, as @thockin warned about in #92369 (comment). The main problem is that, since this is not an official API, it is unstable (naturally).

I am thinking of writing a KEP for a "proxier" library, since I see a future with many specialized proxiers selected with the service.kubernetes.io/service-proxy-name label living side by side with kube-proxy. Also, in my experience, code written as a library becomes better structured and more testable, and therefore more maintainable.

@thockin since writing a KEP would take a lot of time, would it be favorably received by sig/network? Maintaining another API is not something that should be taken lightly, I guess.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 16, 2020
@uablrek
Contributor

uablrek commented Dec 16, 2020

/remove-lifecycle stale

I still foresee a future with many specialized kube-proxy implementations selected with service.kubernetes.io/service-proxy-name.

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 16, 2020
@aojea
Member

aojea commented Jan 11, 2021

/assign @rikatz @andrewsykim
IIRC they've commented during a past sig-network meeting that they were going to write a KEP for this.

Seems the staging repo is already there.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 28, 2022
@thockin thockin removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 18, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 16, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 16, 2022
@uablrek
Contributor

uablrek commented Dec 17, 2022

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Dec 17, 2022
@aojea
Member

aojea commented Jan 31, 2023

@weiqiangt I can see that you already solved the problem by forking the code in your own repo antrea-io/antrea#772

Are you still interested in vendoring the code directly?

@weiqiangt
Author

@weiqiangt I can see that you already solved the problem by forking the code in your own repo antrea-io/antrea#772

Are you still interested in vendoring the code directly?

Thanks for mentioning this; forwarding to @hongliangl @vicky-liu @tnqn.

@aojea
Member

aojea commented Feb 1, 2023

There is also a follow-up question: if you are interested in vendoring kubernetes/proxy, we are interested in knowing more about your use cases and requirements.

@hongliangl

There is also a follow-up question: if you are interested in vendoring kubernetes/proxy, we are interested in knowing more about your use cases and requirements.

We still need the code now. If you could move pkg/proxy (not including pkg/proxy/iptables and pkg/proxy/ipvs) to k8s.io/kube-proxy, that would be very helpful; then we wouldn't need to manually sync the latest kube-proxy changes. For now, we have to fork the latest kube-proxy code into our project as third-party code in order to implement the latest Kubernetes features. BTW, is there anything we could help with? Thanks a lot. @aojea

@aojea
Member

aojea commented Feb 8, 2023

@hongliangl this feedback is super useful. There is a KEP to move the code to staging, and we are trying to understand everyone's needs.

My next question is: do you need the whole implementation to be exported, including the iptables logic too?

We are trying to understand whether people need it as a library and, if so, which parts they need (control-plane, data-plane, ...), or whether they just need the whole functionality, and in that case why they embed the code instead of running it as a standalone binary in a DaemonSet.

@hongliangl

hongliangl commented Feb 8, 2023

My next question is: do you need the whole implementation to be exported, including the iptables logic too?

We don't need iptables and ipvs, since we have our own data plane playing the role of iptables or ipvs.

For the earlier version of our implementation, we only forked service.go, types.go, endpoints.go and runner.go as a framework and added our own data-plane code; later, when we added EndpointSlice support, we also needed to fork endpointslicecache.go.

To catch up with features like TopologyAwareHints and ProxyTerminatingEndpoints in our CNI, we even need to learn the logic in topology.go and add it to our code to categorize Endpoints (like this function:

func CategorizeEndpoints(endpoints []Endpoint, svcInfo ServicePort, nodeLabels map[string]string) (clusterEndpoints, localEndpoints, allReachableEndpoints []Endpoint, hasAnyEndpoints bool)

). If topology.go and the corresponding files could be used as a library, we could adopt the latest features like TopologyAwareHints or ProxyTerminatingEndpoints simply by upgrading the library.
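
To illustrate the ask, here is a hedged sketch of what reusing that function from a custom proxier could look like. The import is today's in-tree path (exactly the unsupported arrangement under discussion), and the selectEndpoints helper plus its preference for local endpoints are made up for illustration:

package customproxier

import (
	"k8s.io/kubernetes/pkg/proxy" // in-tree path today; the ask is a stable k8s.io/kube-proxy equivalent
)

// selectEndpoints reuses the upstream categorization logic (TopologyAwareHints,
// ProxyTerminatingEndpoints, ...) instead of re-implementing it in the CNI.
func selectEndpoints(eps []proxy.Endpoint, svcInfo proxy.ServicePort, nodeLabels map[string]string) []proxy.Endpoint {
	clusterEndpoints, localEndpoints, _, hasAnyEndpoints := proxy.CategorizeEndpoints(eps, svcInfo, nodeLabels)
	if !hasAnyEndpoints {
		return nil
	}
	// Illustrative policy: prefer node-local endpoints when any exist.
	if len(localEndpoints) > 0 {
		return localEndpoints
	}
	return clusterEndpoints
}

With a stable library, only the library version would need bumping to pick up new endpoint-selection features.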

@hongliangl

Features like TopologyAwareHints and ProxyTerminatingEndpoints are used to categorize Endpoints; I think they are always needed by any CNI.

@aojea
Member

aojea commented Feb 8, 2023

That is great feedback, thanks.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 9, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 8, 2023
@uablrek
Contributor

uablrek commented Jun 27, 2023

This issue can be considered implemented by kpng, though as a separate out-of-tree library.

It is also considered a goal for the nftables-based kube-proxy:

Help with the clean-up and refactoring of the kube-proxy "library" code.

This is kind of the birth of kpng. I am closing this issue, as it has served its purpose, rather than letting it silently rot away.

/close

@k8s-ci-robot
Contributor

@uablrek: Closing this issue.

In response to this:

This issue can be considered implemented by kpng, though as a separate out-of-tree library.

It is also considered a goal for the nftables-based kube-proxy:

Help with the clean-up and refactoring of the kube-proxy "library" code.

This is kind of the birth of kpng. I am closing this issue, as it has served its purpose, rather than letting it silently rot away.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
