bug: host-metrics reporting wrong system and process CPU utilization #1718

Closed
david-luna opened this issue Oct 5, 2023 · 3 comments · Fixed by #1785
Labels
bug Something isn't working pkg:host-metrics priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect

Comments

@david-luna
Contributor

What version of OpenTelemetry are you using?

"dependencies": {
"@opentelemetry/api": "^1.4.1",
"@opentelemetry/auto-instrumentations-node": "^0.39.2",
"@opentelemetry/host-metrics": "^0.33.1",
"@opentelemetry/sdk-metrics": "^1.15.2",
"@opentelemetry/sdk-node": "^0.41.2"
}

What version of Node are you using?

v18.16.0

What did you do?

Instrument an application with the console metrics exporter and watch the CPU utilization values. A repo with a sample app is available here:

https://github.com/david-luna/otel-metrics-cpu-utilization

What did you expect to see?

According to the semantic conventions, I expect to see CPU utilization values between 0 and 1.

What did you see instead?

Values are way above 1. With the sample provided in the repo, I get metric readings like this one:

{
  descriptor: {
    name: 'system.cpu.utilization',
    type: 'OBSERVABLE_GAUGE',
    description: 'Cpu usage time 0-1',
    unit: '',
    valueType: 1
  },
  dataPointType: 2,
  dataPoints: [
    {
      attributes: [Object],
      startTime: [Array],
      endTime: [Array],
      value: 157.82086149302881
    },
    {
      attributes: [Object],
      startTime: [Array],
      endTime: [Array],
      value: 143.03498303049165
    },
    {
      attributes: [Object],
      startTime: [Array],
      endTime: [Array],
      value: 1083.5302883959548
    },
    {
      attributes: [Object],
      startTime: [Array],
      endTime: [Array],
      value: 1362.709911550165
    },
    {
      attributes: [Object],
      startTime: [Array],
      endTime: [Array],
      value: 0
    }
    // ...
  ]
}

Additional context

N/A

@david-luna
Contributor Author

david-luna commented Oct 9, 2023

@legendecas since you're the module owner, could you please have a look? I'd be happy to help by creating a PR, but first I want your ACK on this to rule out a misunderstanding on my side.

Thanks :)

@legendecas
Member

Right, and the semantic convention the package uses is outdated as well. We should update the semantic convention altogether.

@legendecas legendecas added priority:p2 Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect pkg:host-metrics labels Oct 10, 2023
@david-luna
Contributor Author

@legendecas thanks for the confirmation :)

I'm planning to create a PR which:

  • will refer to the updated semantic convention
  • will change the calculation to use the difference since the last measurement, as the semantic convention specifies
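
To make the second point concrete, here is a minimal sketch of a delta-based calculation. It is not the package's implementation. `snapshot` and `utilizationBetween` are hypothetical helper names, and the sketch assumes the per-state times come from Node's `os.cpus()`, which reports them in milliseconds.

```javascript
const os = require('node:os');

// Capture the cumulative per-state CPU times (ms) for every logical CPU.
function snapshot() {
  return {
    takenAt: Date.now(),
    cpus: os.cpus().map((cpu) => ({ ...cpu.times })),
  };
}

// Hypothetical helper: per-CPU, per-state utilization between two snapshots.
// Each value is (state time delta) / (elapsed wall-clock time).
function utilizationBetween(prev, curr) {
  const elapsedMs = curr.takenAt - prev.takenAt;
  return curr.cpus.map((times, index) => {
    const before = prev.cpus[index];
    const result = {};
    for (const state of Object.keys(times)) {
      // No previous reading for this CPU, or no elapsed time: report 0.
      const deltaMs = before ? times[state] - before[state] : 0;
      result[state] = elapsedMs > 0 ? deltaMs / elapsedMs : 0;
    }
    return result;
  });
}
```

In a real observable callback the previous snapshot would be kept between collections, so each reading covers only the interval since the last measurement rather than the time since boot.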

One question. The semantic conventions define system.cpu.utilization as

Difference in system.cpu.time since the last measurement, divided by the elapsed time and number of logical CPUs

Since we are using the state attribute, does it make sense to divide by the number of CPUs?
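
A small arithmetic illustration of the question, using made-up numbers. It assumes data points are also broken down per logical CPU, which the thread does not state, and it does not reflect what the fix eventually did.

```javascript
// Hypothetical numbers: "user" state CPU time deltas (seconds) for
// 4 logical CPUs over a 10-second interval.
const elapsedSeconds = 10;
const userDeltaSeconds = [2, 4, 6, 8];

// Reported per CPU: each value already falls within 0-1.
const perCpu = userDeltaSeconds.map((delta) => delta / elapsedSeconds);

// Summed across CPUs without dividing by the CPU count: can exceed 1.
const summed = userDeltaSeconds.reduce((a, b) => a + b, 0) / elapsedSeconds;

// Summed across CPUs and divided by the CPU count: back within 0-1.
const systemWide = summed / userDeltaSeconds.length;

console.log({ perCpu, summed, systemWide });
```

Under these assumptions, dividing by the number of logical CPUs only matters when a data point aggregates time from several CPUs. A per-CPU data point is already bounded by the elapsed time.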
