Improving the traffic detection algorithm #477

Closed
Smityz opened this issue Feb 18, 2020 · 1 comment
Labels
type/enhancement Indicates new feature requests

Comments

@Smityz
Contributor

Smityz commented Feb 18, 2020

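The algorithm introduced in PR444 works as follows: find the minimum partition traffic in the current app, qps_min; the hotspot value of a partition is then partition_qps / max(1, qps_min), where partition_qps is that partition's current qps.
The experimental results are shown in the figure:
[Image: Lark20200218-105412]
As the figure shows, there are two abnormal spikes, one at the beginning and one at the end.
The spikes arise as follows: when reads and writes have just started, some partition has no traffic yet, so qps_min is 0 and the denominator max(1, qps_min) falls to 1, while the numerator is normal traffic. The resulting value is therefore far too large. Once every partition carries normal traffic, the denominator qps_min grows; the numerator of the hot partition is also larger, but the computed hotspot value is nowhere near as large as the abnormal early one. The same thing happens when traffic winds down, which is why the spikes appear at both the start and the end.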
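The effect described above can be reproduced with a few made-up numbers. The following is a minimal sketch of the ratio formula only, not the code from PR444; the function name `ratio_hotspot_values` and the qps samples are hypothetical and chosen purely for illustration.

```python
def ratio_hotspot_values(partition_qps):
    """Hotspot value per partition: partition_qps / max(1, qps_min).

    Illustrative only: this mirrors the formula quoted in the issue,
    not the actual implementation in PR444.
    """
    qps_min = min(partition_qps)
    denominator = max(1, qps_min)  # clamps to 1 when a partition is idle
    return [qps / denominator for qps in partition_qps]


# Warm-up: one partition has no traffic yet, so the denominator is 1.
warmup = ratio_hotspot_values([0, 200, 210, 600])

# Steady state: every partition has traffic, so the denominator is 200.
steady = ratio_hotspot_values([200, 205, 210, 600])

print(max(warmup))  # 600.0
print(max(steady))  # 3.0
```

The hot partition carries the same 600 qps in both cases, yet its hotspot value is 200 times larger during warm-up, which matches the spikes in the figure.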
To avoid this, I looked into a new algorithm: three standard deviations (the 3-sigma rule).
The algorithm works as follows:

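  1. Take the valid historical data currently recorded (the non-zero historical qps of all partitions)
  2. Compute the standard deviation and the mean
  3. The hotspot value of a partition is the difference between its current qps and the mean, divided by the standard deviation
  4. When the hotspot value is greater than or equal to 3, we can consider a hotspot to exist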
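The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the proposed implementation: the function name, the flat list used for the history, the choice of population standard deviation (`statistics.pstdev`), and the handling of a zero standard deviation are all assumptions that the issue does not specify.

```python
import statistics

HOTSPOT_THRESHOLD = 3  # step 4: a value >= 3 is treated as a hotspot


def three_sigma_hotspot_values(history_qps, current_qps):
    """Return (qps - mean) / stddev for each partition's current qps.

    history_qps: recorded qps samples from all partitions (hypothetical layout).
    current_qps: the current qps of each partition.
    """
    # Step 1: keep only the valid (non-zero) historical samples.
    valid = [q for q in history_qps if q != 0]
    if len(valid) < 2:
        return [0.0] * len(current_qps)  # assumption: too little data to judge

    # Step 2: mean and standard deviation (population form is an assumption).
    mean = statistics.mean(valid)
    stddev = statistics.pstdev(valid)
    if stddev == 0:
        return [0.0] * len(current_qps)  # assumption: avoid division by zero

    # Step 3: distance from the mean, measured in standard deviations.
    return [(q - mean) / stddev for q in current_qps]


# Made-up samples; the zeros stand for idle partitions and are filtered out.
history = [200, 210, 190, 205, 195, 0, 0, 200, 210, 190]
current = [0, 200, 215, 600]

values = three_sigma_hotspot_values(history, current)
hot = [i for i, v in enumerate(values) if v >= HOTSPOT_THRESHOLD]
print(hot)  # [3]
```

Because zero samples are dropped in step 1, an idle partition during warm-up no longer shrinks the denominator, and a partition with no traffic gets a negative value instead of inflating everyone else's.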
@Smityz Smityz added the type/enhancement Indicates new feature requests label Feb 18, 2020
@foreverneverer
Contributor

foreverneverer commented Feb 21, 2020

> The algorithm introduced in PR444 works as follows

GitHub references are automatically converted to hyperlinks, like this:
#444
