optimize global strategy #545
Logically I agree with you... After the New Year holiday, let's test it against a few more sample cases.
infoWithMinUsage := heap.Pop(infoHeap).(Info)
deployMap[infoWithMinUsage.Nodename]++
infoWithMinUsage.Usage += infoWithMinUsage.Rate
infoWithMinUsage.Capacity--
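Please confirm whether this modifies the values in the input infos. Strategy functions should all be free of side effects, and they may be called more than once, so they must not modify their input when they run.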
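This is free of side effects because of the copy above. (There is a sort.Slice, though, which changes the order of the Info entries.)
CommunismPlan does not call copy directly, but initializing the heap effectively makes a copy, so it is free of side effects as well.
I took a quick look at the history. Versions from before January of last year could produce side effects, but I don't know which version was running when the problem occurred...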
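To make the distinction above concrete, here is a minimal sketch of the two behaviours: mutating a private copy leaves the caller's values alone, while sorting the caller's slice still reorders it. The Info struct below contains only the fields mentioned in this thread, and planOnCopy and sortCallerSlice are hypothetical helper names used for illustration, not functions from the repository.

```go
package main

import "sort"

// Info mirrors only the fields discussed in this thread (an assumption);
// the real struct in the repository may carry more.
type Info struct {
	Nodename string
	Usage    float64
	Rate     float64
	Capacity int
}

// planOnCopy (hypothetical) mutates a private copy of the slice, so the
// caller's values are untouched. This is enough here because Info, as
// sketched, holds no pointers, slices, or maps.
func planOnCopy(infos []Info) []Info {
	local := make([]Info, len(infos))
	copy(local, infos)
	for i := range local {
		local[i].Usage += local[i].Rate
		local[i].Capacity--
	}
	return local
}

// sortCallerSlice (hypothetical) sorts the caller's slice in place. No field
// value changes, but the caller observes a different element order.
func sortCallerSlice(infos []Info) {
	sort.Slice(infos, func(i, j int) bool { return infos[i].Capacity > infos[j].Capacity })
}
```

Calling planOnCopy on a slice and then inspecting the original shows unchanged Usage and Capacity values, whereas calling sortCallerSlice moves the entry with the largest Capacity to index 0 of the caller's slice.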
Under what circumstances would it be called more than once?
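Historically there was a version that called it twice, but that was removed later.
This bug already showed up during an event in the first half of last year. I could not reproduce it at the time and had no leads at all from the production logs. Now it has finally been caught.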
strategy/global.go
Outdated
@@ -22,39 +53,27 @@ func GlobalPlan(ctx context.Context, infos []Info, need, total, _ int) (map[stri
strategyInfos := make([]Info, len(infos))
copy(strategyInfos, infos)
sort.Slice(infos, func(i, j int) bool { return infos[i].Capacity > infos[j].Capacity })
This sort seems to exist only for the sake of the unit test. I plan to remove it.
dfbcd37
to
9964b35
Compare
Approved in principle.
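The previous global strategy had some problems, so I optimized it. The basic idea is as follows.
info.Usage is the fraction of resources currently in use, info.Capacity is the maximum number of instances that can still be deployed, and info.Rate is the fraction of resources a single instance occupies. So info.Rate + info.Usage represents "the final resource usage fraction if one more instance is deployed on this node". The goal is that, once allocation is finished, the usage of all infos is roughly equal.
So info.Rate + info.Usage is used as the ordering key for a min-heap. Each round pops one info from the min-heap, deploys 1 instance on the corresponding node (that is, info.Usage += info.Rate and info.Capacity--), and then pushes the info back. On a first look, this should help keep the final info.Usage values even. By comparison, building the min-heap on info.Usage alone can lead to the following situation: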
node1: usage 0.5, capacity 1, rate 0.5
node2: usage 0.6, capacity 4, rate 0.1
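Suppose we want to deploy 1 instance. With a heap keyed on usage, it goes to node1 (0.5 < 0.6), and node1 ends up at 100% usage. With a heap keyed on rate + usage, it goes to node2 (0.7 < 1.0), so node1 stays at 50% and node2 reaches 70%, which seems more reasonable.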
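The comparison above can be reproduced with a small sketch built on container/heap. This is not the code from the PR: the Info struct only has the fields named in the description, and plan, infoHeap, byUsage and byUsageAfterDeploy are hypothetical names. Two details are assumptions of this sketch rather than something stated in the thread: nodes with no remaining capacity are left out of the heap, and an error is returned when the total capacity cannot cover need.

```go
package main

import (
	"container/heap"
	"errors"
)

// Info holds only the fields named in the description (an assumption).
type Info struct {
	Nodename string
	Usage    float64 // fraction of resources currently in use
	Rate     float64 // fraction of resources one instance occupies
	Capacity int     // instances that can still be deployed
}

// infoHeap is a min-heap over Info values ordered by a pluggable key.
type infoHeap struct {
	items []Info
	key   func(Info) float64
}

func (h *infoHeap) Len() int            { return len(h.items) }
func (h *infoHeap) Less(i, j int) bool  { return h.key(h.items[i]) < h.key(h.items[j]) }
func (h *infoHeap) Swap(i, j int)       { h.items[i], h.items[j] = h.items[j], h.items[i] }
func (h *infoHeap) Push(x interface{})  { h.items = append(h.items, x.(Info)) }
func (h *infoHeap) Pop() interface{} {
	n := len(h.items)
	last := h.items[n-1]
	h.items = h.items[:n-1]
	return last
}

// Two candidate ordering keys: current usage, and usage after one more deploy.
func byUsage(i Info) float64            { return i.Usage }
func byUsageAfterDeploy(i Info) float64 { return i.Usage + i.Rate }

// plan (hypothetical) places need instances one at a time on the node with
// the smallest key. Infos are copied by value into the heap, so the caller's
// slice is neither modified nor reordered.
func plan(infos []Info, need int, key func(Info) float64) (map[string]int, error) {
	h := &infoHeap{key: key}
	for _, info := range infos {
		if info.Capacity > 0 { // assumption: full nodes never enter the heap
			h.items = append(h.items, info)
		}
	}
	heap.Init(h)

	deployMap := make(map[string]int)
	for placed := 0; placed < need; placed++ {
		if h.Len() == 0 {
			return nil, errors.New("not enough capacity") // assumption of this sketch
		}
		info := heap.Pop(h).(Info)
		deployMap[info.Nodename]++
		info.Usage += info.Rate
		info.Capacity--
		if info.Capacity > 0 {
			heap.Push(h, info) // put the node back with its updated usage
		}
	}
	return deployMap, nil
}
```

With the two nodes from the example and need set to 1, plan with byUsage selects node1, while plan with byUsageAfterDeploy selects node2. Passing the key as a function is only a convenience for comparing the two orderings side by side.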
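This is only a rough implementation and still needs careful testing. With the New Year holiday coming up, that will have to wait until afterwards...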