Fix telemetry rpc getting stuck if all nodes have bandwidth set to 0 #3643
What are the steps to reproduce this issue?
I think this bug is caused by this if statement, which removes the zero bandwidth values.
If every bandwidth_cap is set to 0, all of the values are removed. Therefore, this should be reproducible by a unit test with 10 nodes having bandwidth set to 0. @fikumikudev are you willing to write such a unit test?
Also, I do not think this bug is fully fixed by the proposed change. The function ...
Yes, it looks like my fix is not complete.
I can think of two ways to fix this. The first is to rewrite ...
I think improving strip_outliers_and_sum to handle any number of elements is the right way to do it.
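To illustrate what "handle any number of elements" could look like, here is a minimal sketch. It is not the node's actual strip_outliers_and_sum; the name trimmed_sum, the trim parameter, and the choice to skip trimming when too few elements are present are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <numeric>
#include <set>

// Hypothetical helper (not the node's real code): remove the `trim` smallest
// and `trim` largest values, then sum what is left.
inline std::uint64_t trimmed_sum (std::multiset<std::uint64_t> values, std::size_t trim)
{
	// Trim only when at least one element would remain afterwards.
	// Without this check, std::next could step past end () on a small or
	// empty container, which is undefined behavior.
	if (values.size () > 2 * trim)
	{
		auto const n = static_cast<std::ptrdiff_t> (trim);
		values.erase (values.begin (), std::next (values.begin (), n));
		values.erase (std::prev (values.end (), n), values.end ());
	}
	// An empty container simply sums to 0.
	return std::accumulate (values.begin (), values.end (), std::uint64_t{ 0 });
}
```

With this shape, an empty input returns 0 and an input too small to trim is summed as-is. Whether that fallback is the right policy for telemetry consolidation is a separate design decision.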
bf67968 to af495ea
af495ea to 1aee398
@dsiganos I fixed the ...
Yep, I'll look at it now. I pushed a PR to your election scheduler RPC PR branch. Thank you for your contributions, they are great!
I encountered this bug while testing on a private network where all nodes have bandwidth set to 0. In that case the 'bandwidths' set is empty, and calling std::next() on it has undefined behavior (in practice it gets stuck in an infinite loop).
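The failure mode described above can be sketched in isolation. The helpers below are hypothetical (nonzero_caps and element_at_or do not exist in the node). They only mirror the two steps discussed in this PR: dropping zero values, then advancing an iterator with std::next, here guarded by a size check.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical: keep only the non-zero bandwidth caps, as the filtering
// step discussed in this PR does.
inline std::multiset<std::uint64_t> nonzero_caps (std::vector<std::uint64_t> const & caps)
{
	std::multiset<std::uint64_t> result;
	for (auto cap : caps)
	{
		if (cap != 0)
		{
			result.insert (cap);
		}
	}
	return result;
}

// Hypothetical: return the element at `offset` in sorted order, or `fallback`
// when the container is too small. The size check is what keeps std::next
// from advancing past end ().
inline std::uint64_t element_at_or (std::multiset<std::uint64_t> const & values, std::size_t offset, std::uint64_t fallback)
{
	if (offset >= values.size ())
	{
		return fallback;
	}
	return *std::next (values.begin (), static_cast<std::ptrdiff_t> (offset));
}
```

When every cap is 0, nonzero_caps returns an empty set and element_at_or yields the fallback instead of advancing an iterator past the end.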