Rudy/psutilcpu #1042

Merged
merged 6 commits into from
Jul 22, 2014
Conversation

bunelr
Contributor

@bunelr bunelr commented Jul 18, 2014

Use psutil to fix #653.

Keep using the WMI connection to get the system.cpu.interrupted metric, which is not available via psutil. I don't know whether anyone actually uses that metric, given that it's not part of the standard dashboard, but I left it in to preserve existing behaviour.

Reasons for the >100% peak:

  • First, WMI doesn't return values that add up to 100, so there was always an offset.
  • Second, the if cpu_user test, meant to check that a value was present, also skipped metrics whose value was 0. This led to missing values on the graph and to aggregation artifacts that showed up as huge peaks.
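
To illustrate the second point, here is a minimal sketch of the truthiness pitfall. The function names and sample values are hypothetical and are not taken from the actual check; the metric names are only illustrative labels.

```python
def collect_buggy(samples):
    """Keep only truthy readings: a legitimate value of 0 is silently dropped."""
    return {name: value for name, value in samples.items() if value}


def collect_fixed(samples):
    """Drop only missing readings (None): a value of 0 is still reported."""
    return {name: value for name, value in samples.items() if value is not None}


# Hypothetical readings: user time is exactly 0 and one value is missing.
samples = {
    "system.cpu.user": 0.0,
    "system.cpu.system": 3.5,
    "system.cpu.idle": 96.5,
    "system.cpu.interrupted": None,
}

buggy = collect_buggy(samples)   # loses system.cpu.user
fixed = collect_fixed(samples)   # keeps system.cpu.user at 0.0
```

With the buggy version, a graph of system.cpu.user has holes whenever the value is 0, which is the kind of gap that can distort aggregation.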

@@ -1,4 +1,5 @@
from checks import Check
import psutil

Can you take the opportunity to reorganize the imports the same way I did in 737f396?

Thanks!

@remh

remh commented Jul 21, 2014

Thanks!
Should we use psutil.cpu_times instead of psutil.cpu_times_percent? Then we could compute a rate to get the average CPU consumption between two collections instead of the instantaneous consumption.

@bunelr
Contributor Author

bunelr commented Jul 21, 2014

Do you mean computing an average over the timespan between two runs of the check? That would indeed probably be better.

@remh

remh commented Jul 21, 2014

Actually, it won't work easily because the old check interface can't compute negative rates, so forget about it.

One other small remark:
With psutil you are using a sampling interval of 0.5 sec. Do you know what the equivalent interval was when using WMI?

@bunelr
Contributor Author

bunelr commented Jul 21, 2014

Why would there be a negative rate? The cpu_times values only ever grow, and we would need to do a computation to get percentages anyway in order to provide the same metric.

Regarding the interval: the data is sampled by Windows, but the interval is not exposed.
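
The percentage computation mentioned above could look roughly like the sketch below. It assumes two snapshots of cumulative CPU times shaped like the named tuple returned by psutil.cpu_times(); the CpuTimes stand-in, its three fields and the cpu_percentages helper are hypothetical, and the real set of fields varies by platform.

```python
from collections import namedtuple

# Stand-in for the named tuple returned by psutil.cpu_times().
# The field list is an assumption; the real fields vary by platform.
CpuTimes = namedtuple("CpuTimes", ["user", "system", "idle"])


def cpu_percentages(previous, current):
    """Return each field's share (in percent) of the CPU time spent
    between two cumulative snapshots, or None if no time elapsed."""
    deltas = [cur - prev for prev, cur in zip(previous, current)]
    total = sum(deltas)
    if total <= 0:
        return None
    return {
        field: 100.0 * delta / total
        for field, delta in zip(current._fields, deltas)
    }


# Hypothetical snapshots taken at two consecutive collections.
previous = CpuTimes(user=100.0, system=50.0, idle=850.0)
current = CpuTimes(user=103.0, system=51.0, idle=866.0)
shares = cpu_percentages(previous, current)
```

Because every share is divided by the same total, the values add up to 100 by construction, which addresses the offset described in the pull request description.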

@remh

remh commented Jul 21, 2014

Ah yeah, you're right.
So you can define these metrics as counters (https://github.com/DataDog/dd-agent/blob/master/checks/system/win32.py#L88-L91) instead of gauges and then use psutil.cpu_times.
The rate will be computed automatically.
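
For readers unfamiliar with the gauge/counter distinction, the following is a rough sketch of what turning a counter into a rate means. The Counter class here is hypothetical and is not the agent's actual implementation; the choice to skip negative deltas is an assumption made for the illustration.

```python
class Counter(object):
    """Hypothetical counter: remembers the last sample and reports the
    per-second rate of change between consecutive samples."""

    def __init__(self):
        self._last = None  # (value, timestamp) of the previous sample

    def sample(self, value, timestamp):
        previous = self._last
        self._last = (value, timestamp)
        if previous is None:
            return None  # a rate needs two samples
        prev_value, prev_timestamp = previous
        elapsed = timestamp - prev_timestamp
        delta = value - prev_value
        if elapsed <= 0 or delta < 0:
            return None  # assumption: skip resets rather than emit a negative rate
        return delta / float(elapsed)


counter = Counter()
first = counter.sample(100.0, 0.0)    # no rate yet
second = counter.sample(103.0, 15.0)  # 3 CPU-seconds over 15 seconds
third = counter.sample(50.0, 30.0)    # value went down: skipped
```

A gauge, by contrast, would report each sampled value as-is, which is why it reflects instantaneous consumption rather than an average over the collection interval.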

@remh remh merged commit 3f6c2cc into master Jul 22, 2014
@bunelr bunelr deleted the rudy/psutilcpu branch July 22, 2014 17:57
Development

Successfully merging this pull request may close these issues.

Windows CPU times can add up to > 100%
2 participants