-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rate calculator optimization #538
Conversation
I've also tried to optimize rate calcautor by using deque,see penguinol@009f78b, but the performence improvement is not as good as i excepted. |
@penguinol right, I've run the same comparison with -O3:
Anyway, |
Thanks for this. How to test this? and how to test that there are no downsides? are unit tests enough? @jmillan can you please take a look to this? |
I have used this env variable to trace the process: Looking at the source code, the |
Ok, let's merge next week (bank holidays here). |
We are currently using I propose using: Personally I prefer |
RateCalculator( | ||
size_t windowSize = DefaultWindowSize, | ||
float scale = DefaultBpsScale, | ||
uint16_t windowItems = DefaultWindowItems) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason/case for windowItems
value to be different than windowSize
value?
I see no setter for it, and it takes the same value as windowSize
in constructor, by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder the very same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The windowItems
value depends on both the rate evaluation window period (windowSize
) and the events frequency.
For instance, if the measured event frequency is 1 every 2ms, you can set windowSize=1000
and windowItems=500
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
windowSize
is only used to get the scale
factor and to retrieve the new oldestTimeIndex
whereas windowItems
is the range (from 0 to windowItems
) where the items are really stored in the buffer
.
Having a windowSize
bigger than the windowItems
does not provide accurate results I would say. Ie:
windowSize = 1000
|-----------------------------------------------------|
windowItems = 500
|------------------------|
We are calculating the rate for a time window of 1 second, and only considering the data of the last 1/2 second, which is not accurate.
Due to that rationale I think windowItems
should be removed. Any comment on this @vpalmisano?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is using different variables to represent the maximum window size expressed in seconds and in number of items, without mixing the two things.
The windowSize
is expressed in seconds, while the windowItems
is the maximum number of items that can be stored at the same time in the time window, so in your example you must take into account also the frequency of the data sample.
If you receive 1 sample per millisecond, you will fill the array with 500 items considering only 1/2 second of data.
Instead, if the data sample frequency is 1 every 2 milliseconds, a 500 items array will correspond to 1 second of data.
I can perform same tests evaluating if the window is filled or not when changing the windowItems
value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmillan right, the problem was not there before this PR. The drawback of the old approach is that it is using a number of items == window size in ms and in some cases this size could be equal to 2500 or 6000 packets.
Can we re-open this PR or should I open a new one with the latest fixes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PRs cannot be reopened AFAIK so create a new one, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I will test it soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vpalmisano Works good for me.
Agreed |
Or maybe we can use |
For the same reason it will be better to rename |
The thing is that |
Wrapping up with your proposal @vpalmisano:
|
Done with the latest commit. |
Are we ready to merge this? or is there something missing? |
I've replaced some variable names and added a warning log in the case the calculation buffer is full. I think there are no other changes to be done. @jmillan ? |
LGMT, thanks @vpalmisano! |
Co-authored-by: José Luis Millán <[email protected]>
Co-authored-by: José Luis Millán <[email protected]>
Merging! |
Just a thing:
|
fixing in master |
Too many buffer full warnings after this commit.
|
|
In the current implementation of
RateCalculator
class, theRemoveOldData
has an high CPU usage because it should iterate over an array where each item index represents a time moment with a resolution of 1ms. As a consequence, in most cases we have a sparse array. This PR changes the calculator algorithm, assigning each sample to an array position in a contiguous way, avoiding large iterations when removing the expired elements. The unit test has been modified accordingly.Test is needed!
Callgrind outputs:
v3:
with optimization: