Support batched disconnection of long-lived connections on a single kubegateway instance #29

Open
zjzhangkui opened this issue Apr 24, 2023 · 1 comment

Comments

@zjzhangkui

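When a kubegateway instance fails or is recreated, all list/watch long-lived connections established on that instance are dropped at once and reconnect to a new instance. When the k8s cluster has a large number of list/watch connections, the CPU usage and load of kube-apiserver spike at the moment that kubegateway instance restarts. Could kubegateway support disconnecting the connections in batches when a pod instance is being recreated, and only restart after that?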

@xuqingyun
Collaborator

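  1. kubegateway's graceful shutdown relies on the graceful shutdown of the underlying HTTP/2 server. Previously, the apiserver version we depend on was missing some graceful shutdown bug-fix commits, so graceful shutdown never actually took effect. This PR has resolved the problem: fix: apiserver graceful shutdown #31

    The HTTP/2 server's graceful shutdown logic sends GOAWAY to a long-lived connection when the connection is idle, so the moment at which a long-lived connection is dropped is not deterministic. Sometimes a large number of idle long-lived connections are also dropped immediately when graceful shutdown begins.

  2. In production, we deploy a fairly large number of gateway instances so that the long-lived connections on any single instance affect only a small scope, and we use a rolling update strategy to guarantee an interval between the updates of successive instances.

With these two approaches, the impact of re-establishing long-lived connections when a kubegateway instance fails or is recreated can be kept within a controllable range.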


2 participants