fix thread safety in process code #43
Conversation
To unblock 8.3.3 as safely as possible we are going to downgrade the system metrics package in affected versions: elastic/beats#32467 (comment). We will merge this to beats main and 8.3 after the 8.3.3 release completes (really after the next build candidate is generated).
Is there a test we can write to prove this is thread safe? Maybe the test we need should be added in beats?
@cmacknz Yah, I was wondering about that. If anything, I think the test should go in the
Should we just document that this code isn't thread safe and put the locking in beats? I haven't looked at that code yet so I don't know how hard it is, but it seems a bit weird for the fix and the test to live in two different places.
Well, theoretically it should be thread safe now. The reporting/monitoring code that actually creates and spins up per-thread callbacks is in If we wanted to be extra paranoid, we could just have each monitoring callback function have its own instance of
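The "own instance per callback" idea in the comment above could look roughly like the following. This is a minimal sketch, not the code from this PR: `sampler`, `delta`, and `newCallback` are hypothetical names, and the map of tick counts only stands in for whatever per-process state the real struct keeps.

```go
package main

import "fmt"

// sampler is a hypothetical collector with unsynchronized internal state.
type sampler struct {
	previous map[int]uint64 // pid -> tick count from the last sample
}

func newSampler() *sampler {
	return &sampler{previous: make(map[int]uint64)}
}

// delta stores the new tick count for pid and returns the change since the
// previous sample (a missing entry counts as zero).
func (s *sampler) delta(pid int, ticks uint64) uint64 {
	last := s.previous[pid]
	s.previous[pid] = ticks
	if ticks < last {
		return 0
	}
	return ticks - last
}

// newCallback returns a callback that owns a private sampler, so no map is
// shared between two different callbacks.
func newCallback() func(pid int, ticks uint64) uint64 {
	s := newSampler()
	return s.delta
}

func main() {
	cpuCallback := newCallback()
	memCallback := newCallback()

	cpuCallback(1, 100)
	fmt.Println(cpuCallback(1, 130)) // second sample on the same callback
	fmt.Println(memCallback(1, 130)) // first sample on an independent callback
}
```

This only removes sharing between callbacks. If a single callback can itself be invoked from several goroutines at once, its private map would still need a lock.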
FWIW, a test on the monitoring code would almost be more of an integration test, since we're hitting both the monitoring subsystem and any metrics code attached to it.
Added a small test to the report library. System metrics like this are always a tad awkward to write tests for, since a lot of metrics will just be context/OS dependent.
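For reference, a concurrency check of this kind usually comes down to calling the shared code from many goroutines and running it under Go's race detector (`go test -race` or `go run -race`). The sketch below is not the test added in this PR; `counter`, `record`, and `runConcurrently` are hypothetical names chosen for illustration. A passing run does not prove thread safety, it only fails to find a race.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical stand-in for code with a mutex-guarded map.
type counter struct {
	mu   sync.Mutex
	seen map[string]int
}

// record increments the count for key while holding the lock.
func (c *counter) record(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.seen[key]++
}

// runConcurrently calls fn from `workers` goroutines, `iterations` times
// each, and waits for all of them to finish.
func runConcurrently(workers, iterations int, fn func()) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iterations; i++ {
				fn()
			}
		}()
	}
	wg.Wait()
}

func main() {
	c := &counter{seen: make(map[string]int)}
	runConcurrently(16, 1000, func() { c.record("self") })

	// Reading after wg.Wait() is safe: all writers have finished.
	fmt.Println(c.seen["self"])
}
```

Removing the lock from `record` would be expected to trip the race detector, and may crash outright with a concurrent map write error, which is what makes this shape useful as a regression check.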
Needs a CHANGELOG entry :)
This commit may have been made to the 8.3 branch, but it was made well after 8.3.0 was released so I think this should have made it into the 8.3.3 release and should be labeled with that.
What does this PR do?
This fixes elastic/beats#32467
This code originates from metricbeat, which never worried too much about thread safety, since a single metricset just lives in its own thread. However, the code has carried over to Beats' self-monitoring in report/. A previous PR fixed a bug where GetSelf() would not calculate percentages, as it never accessed its own internal map of processes that it uses to calculate percentages. However, setup.go was creating a single instance of the processStat struct to send to multiple function callbacks in the monitoring code, hence a concurrent map access.

Wanted to get this PR up fast, still testing on non-linux systems.
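The failure mode described above, and the usual remedy, can be sketched as follows. This is an illustration under stated assumptions rather than the actual change in this PR: `procStats`, `previous`, and `delta` are hypothetical names, and the map of tick counts stands in for the internal process map used to calculate percentages.

```go
package main

import (
	"fmt"
	"sync"
)

// procStats is a hypothetical collector that remembers the last sample per
// process so it can compute a difference on the next call.
type procStats struct {
	mu       sync.Mutex
	previous map[int]uint64 // pid -> tick count from the last sample
}

func newProcStats() *procStats {
	return &procStats{previous: make(map[int]uint64)}
}

// delta records the new tick count for pid and returns the change since the
// previous sample, or 0 for a first sample. The mutex protects the
// read-modify-write on the map when several callbacks share one instance.
func (p *procStats) delta(pid int, ticks uint64) uint64 {
	p.mu.Lock()
	defer p.mu.Unlock()

	last, seen := p.previous[pid]
	p.previous[pid] = ticks
	if !seen || ticks < last {
		return 0
	}
	return ticks - last
}

func main() {
	stats := newProcStats() // one instance shared by every goroutine below

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			stats.delta(n%2, uint64(n)*10)
		}(i)
	}
	wg.Wait()

	fmt.Println("tracked pids:", len(stats.previous))
}
```

Without the `mu.Lock()` call, the goroutines in `main` would read and write the same Go map concurrently, which the runtime treats as a fatal error rather than a recoverable panic.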
Why is it important?
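Potential concurrent map access issue. In Go, unsynchronized concurrent reads and writes on a map can crash the process with a fatal runtime error.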
Checklist
CHANGELOG.md