Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix netstats rollover #181

Merged
merged 4 commits into from
Jan 22, 2021
Merged

Fix netstats rollover #181

merged 4 commits into from
Jan 22, 2021

Conversation

graeme-a-stewart
Copy link
Member

@graeme-a-stewart graeme-a-stewart commented Jan 20, 2021

Change the way that network statistics are calculated so that if any stat rolls over (64bit uint overflow or another issue with the network counters), such that during an iteration

current_value < previous_value

Then the change is ignored for this cycle and a warning is printed.

Fixed a bug in the network stats test scripts (target values not updated correctly if --blocks was specified).

Fixed a few minor flake8 issues in test scripts.

Closes #180

Change the way that network counters function
so that if any counters roll over
(current value < previous value) the
changes are ignored for that monitoring
cycle, with a warning printed.
If the --blocks parameter was specified the
test itself failed to take this into account
(default of 1000 was always assumed)
Flake8: Local variable name is assigned
to but never used (F841)
Copy link
Collaborator

@amete amete left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @graeme-a-stewart. The updates look good to me. Instead of doing now-start, we now do now-last and increment the counters accordingly, except when there is a drop.

The only bit I think can be improved, if we want to monitor how often the check fails, is to keep an internal counter, say network_error_counters, increment this when we bump into a decrease (in addition to the error message), and push this into the JSON at the end of the job. The error message itself will probably be fairly hard to fish out in production jobs.

Having said that, I don't really feel too strongly on that. Therefore, let me approve as is and let you decide 😄

@graeme-a-stewart
Copy link
Member Author

Thanks for taking a look @amete. It would be interesting to think about error counters, but I would prefer that we added it more generally to the Imonitor class and that all of the monitors could use it. Otherwise we get into a situation where we're handling a monitor in a special way, which compromises the design of the code.

So my inclination is to accept this PR and create a new issue if we want to handle errors in a more systematic way (also, there was my thought to introduce more formal logging in general, #106).

@amete amete merged commit 4db7101 into master Jan 22, 2021
@amete amete deleted the fix-netstats-rollover branch January 22, 2021 15:00
@amete amete mentioned this pull request Feb 5, 2021
@amete amete restored the fix-netstats-rollover branch January 27, 2022 15:51
@amete amete deleted the fix-netstats-rollover branch January 27, 2022 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Overflowing network metrics
2 participants