Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System metrics semantic conventions #937

Merged
merged 32 commits into from
Oct 15, 2020

Conversation

aabmass
Copy link
Member

@aabmass aabmass commented Sep 9, 2020

Fixes #818

Changes

Adds system metric conventions/instruments from open-telemetry/oteps#119 to the spec. This PR does not include process level metrics (just a placeholder and TODO), which I can do in a separate PR. In addition to what is in open-telemetry/oteps#119:

Related issues #818

Related oteps open-telemetry/oteps#119

@aabmass aabmass force-pushed the system-metrics-818 branch 3 times, most recently from 99e7891 to 87bc60c Compare September 9, 2020 20:14
@aabmass aabmass marked this pull request as ready for review September 9, 2020 20:36
@aabmass aabmass requested review from a team September 9, 2020 20:36
@jmacd jmacd added the spec:metrics Related to the specification/metrics directory label Sep 21, 2020
@kenfinnigan
Copy link
Member

Is there a definition of "system" and "process" anywhere, and how it relates to the different Resources that could be present?

For instance, the system metrics being output will be very different depending on what that system might be: physical, virtual, Kube Pod, Kube Container, etc.

I couldn't see a way that the system metrics can be tied back into the "type" of a system it came from, but I may have missed it.

Coming from a JVM background, would anything related to JVM metrics fall under the "process" metrics section? Will the process metrics be coming in a later PR?

@aabmass
Copy link
Member Author

aabmass commented Sep 28, 2020

I couldn't see a way that the system metrics can be tied back into the "type" of a system it came from, but I may have missed it.

The "type" info should be in the Resource attached to these metrics, which should follow these resource semantic conventions (e.g. attributes for k8s and containers).

Coming from a JVM background, would anything related to JVM metrics fall under the "process" metrics section? Will the process metrics be coming in a later PR?

I believe all of the JVM metrics would be under the runtime. prefix. Process metrics I'm planning for a separate PR

Copy link
Member

@tigrannajaryan tigrannajaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very useful document, thank you.

specification/metrics/semantic_conventions/README.md Outdated Show resolved Hide resolved
specification/metrics/semantic_conventions/README.md Outdated Show resolved Hide resolved
Comment on lines 87 to 89
**time** instruments are a special case of **usage** metrics, where the
**limit** can usually be calculated as the sum of **time** over all label
values. **utilization** can also be calculated and useful, for example
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand what this tries to say.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example, the sum over all state labels of system.cpu.time (idle, user, etc.) gives system.cpu.limit Does that make sense? Happy to remove this too if it's not very useful.

specification/metrics/semantic_conventions/README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@MrAlias MrAlias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

Copy link
Member

@justinfoote justinfoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! I especially found the general semantic conventions addition to be userful.

Copy link
Member

@andrewhsu andrewhsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

TIL of UCUM

Copy link
Member

@james-bebbington james-bebbington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM just a few nits

@jmacd
Copy link
Contributor

jmacd commented Oct 15, 2020

🎉

@jmacd jmacd merged commit 60250bf into open-telemetry:master Oct 15, 2020
@aabmass aabmass deleted the system-metrics-818 branch October 15, 2020 20:22
carlosalberto pushed a commit to carlosalberto/opentelemetry-specification that referenced this pull request Oct 31, 2024
* System metrics semantic conventions

Conventions from [OTEP
119](open-telemetry/oteps#119)

* change process count to UpDownSumObserver

* fix system.cpu.utilization, use better example

* first several comments

* add description columns, update units to UCUM

* markdown-toc

* clarify OS process level metrics

* clarify load average exapmle

* move general conventions + OTEP 108 into README.md

* renamed swap -> paging

* add addition fs labels

* fix links

* fix link

* Update specification/metrics/semantic_conventions/README.md

Co-authored-by: Tigran Najaryan <[email protected]>

* Update specification/metrics/semantic_conventions/README.md

Co-authored-by: Tigran Najaryan <[email protected]>

* Apply suggestions from code review

Co-authored-by: Tigran Najaryan <[email protected]>

* fix tigran comments

* add disk io_time and operation_time

* add descriptions/footnotes for dropped packets and net errors

* lint, more info for net dropped packets/errors

* "dropped_packets" -> "dropped"

* Apply suggestions from James' code review

Co-authored-by: James Bebbington <[email protected]>

* comments from James' code review

* clarify windows perf counter

* Update specification/metrics/semantic_conventions/README.md

Co-authored-by: Joshua MacDonald <[email protected]>

* reflow text

Co-authored-by: Tigran Najaryan <[email protected]>
Co-authored-by: James Bebbington <[email protected]>
Co-authored-by: Joshua MacDonald <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Writing system metrics conventions into the specification