Interrupts+spurious #5519
|:---:|:---:|:---:|---|
| count | counter | events | number of times the interrupt has been handled (modulo 100,000) |
| total | counter | events | total number of times the interrupt has been handled |
| unhandled | counter | events | number of times an interrupt was not handled |
This looks nice but we need to stick to the previous format style to match https://raw.githubusercontent.com/influxdata/telegraf/master/plugins/inputs/EXAMPLE_README.md
will do, though I will assert your standard is ugly and takes up too much screen space :-)
@@ -13,65 +14,99 @@ The interrupts plugin gathers metrics about IRQs from `/proc/interrupts` and `/p
## deployments.
# cpu_as_tag = false

## spurious interrupt counters can be collected
# spurious = false
Let's just enable this by default, and rely on the metric filtering for those who don't want spurious counts per IRQ.
ok
- defer f.Close()
- irqs, err := parseInterrupts(f)
+ irqs, err := parseInterrupts(f, s.Spurious)
+ _ = f.Close()
The previous defer method is actually preferred, since it will be less likely to be broken in future code updates.
The problem with the previous method is that static code checkers complain that the return value of `f.Close()` is ignored, which it is. Two choices:
- wrap `defer f.Close()` in a proper func that explicitly ignores the return value
- add a comment telling the static checkers to ignore the discarded return value

I'm fine either way. Do you have a preference?
It's not our goal to keep static code checkers quiet, outside of the ones required by the integration tests (go vet in particular). So I'd rather just leave it as before; I don't find assigning to `_` to be helpful in explaining the code.
I think I'd prefer to call `f.Close()` in the loop. If defer is in a loop, we trigger possible leak detection because the defer's scope is the function, not the loop. In this particular location inside `Gather`, we loop twice, so it's not a real problem. In `parseInterrupts` we open one file per irq, which can easily grow into hundreds of open file descriptors, so it is better to close in the loop.
Good point, this is possibly an indication that we should extract a function.
yeah, I'm not particularly happy with the overall design, lemme try a more detailed refactor, including some improved testing
@richardelling any update on this? The code looks quite nice already and it would be a pity if it bit-rots like this.
@richardelling do you see a chance to work on this PR?
Anyone interested in taking over this PR?
@srebhan thanks for the ping, I'll not have time until next month at the earliest. I
@richardelling thank you for letting me know. Please drop me a note once you get to it (or decide to drop this PR). :-)
Adds an optional `spurious_interrupts` measurement for tracking spurious interrupts. Also removes metrics for phantom CPUs: those that may be reported in `/proc/softirqs` but might not actually exist and only include zeroes. See also #5451
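One possible reading of the phantom CPU handling, sketched under stated assumptions: per-CPU counts are already parsed into slices, and a CPU is treated as phantom only if it and every higher-numbered CPU report zero in all rows. `trimPhantomCPUs` is a hypothetical name, and the rule actually used in this PR may differ.

```go
package main

// trimPhantomCPUs is a hypothetical helper: rows maps an IRQ name to its
// per-CPU counts (index = CPU number). Trailing CPU columns that are zero
// in every row are dropped. Only trailing columns are removed, so the
// remaining indices still match CPU numbers.
func trimPhantomCPUs(rows map[string][]int64) map[string][]int64 {
	// Find the highest CPU index with a non-zero count in any row.
	last := -1
	for _, counts := range rows {
		for i, c := range counts {
			if c != 0 && i > last {
				last = i
			}
		}
	}

	out := make(map[string][]int64, len(rows))
	for name, counts := range rows {
		n := last + 1
		if n > len(counts) {
			n = len(counts)
		}
		out[name] = counts[:n]
	}
	return out
}
```

A real CPU that has handled no interrupts at all would also be trimmed under this rule, which is why it is only a sketch of the idea described above.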