Add namespace for process metric attributes #3431

trask · 2023-04-21T22:54:14Z

Part of #3131

Note: I'm not sure what the best approach is here, so took a stab at it to see what this would look like.

(also note that my ulterior motive for pushing this forward is because we have similar questions in JVM runtime metrics, and so want to get community alignment on this issue)

Changes

Adds namespace for process metric attributes:

process.cpu.time and process.cpu.utilization metrics
- state renamed to cpu.state
process.disk.io metric
- direction renamed to disk.direction
process.network.io metric
- direction renamed to network.direction
process.context_switches
- type renamed to context_switch.type
process.paging.faults (should we consider renaming to process.memory.page_faults?)
- type renamed to memory.page_fault.type

~~(I don't think schema transformations support this, since e.g. we don't want to rename the metric attribute state across all metrics, only the one for the process.cpu.time metric)~~

Summary of benefits

Metrics can be derived from logs.

raw measurements can be emitted via logs.

CHANGELOG.md

reyang · 2023-04-21T23:12:52Z

CHANGELOG.md

+  ([#3390](https://github.com/open-telemetry/opentelemetry-specification/pull/3390))
+- Rename `process.cpu.time` metric attribute `state` to `cpu.state`,
+  rename `process.cpu.utilization` metric attribute `state` to `cpu.state`,
+  rename `process.disk.io` metric attribute `direction` to `disk.direction`,


Not targeting this PR, I'll restate that disk direction is very weird, most operating systems model disks with read/write/control.

I understand wanting the consistency. I, personally, view disk as "yet another type of network operation", thanks to years of SAN and DFS.

I think we SHOULD get together a semconv group to really look at these at some point, but unfortunately we have a ton of users right now.

Simple renames and namespacing, I think we can afford to do, especially given what this could provide us w/ Log -> Metric or Trace -> Metric generation.

Anything more semantically different, we need to be cautious about.

reyang

I'm supportive of this direction and I think the spirit is also well aligned with https://github.com/open-telemetry/oteps/blob/6b62348b280b82036e897781d49ac4f2c3a9d95e/text/0199-support-elastic-common-schema-in-opentelemetry.md.

Not a blocker for now - I think many of these attributes will be useful for other areas (e.g. for a different metrics stream, for logs and traces) so eventually they should be extracted to a common place.

reyang · 2023-04-21T23:45:49Z

@AlexanderWert I want to get your input regarding metrics attributes (dimensions).

tigrannajaryan · 2023-04-24T16:56:48Z

This breaks hostmetrics receiver in the Collector, which is declared Beta and is likely one of the most used receivers in the Collector. FYI, @open-telemetry/collector-contrib-approvers @open-telemetry/collector-approvers

What is the justification for this change?

(I don't think schema transformations support this, since e.g. we don't want to rename the metric attribute state across all metrics, only the one for the process.cpu.time metric)

It does. You can use apply_to_metrics setting to limit attribute renaming to specific metrics.

trask · 2023-04-24T18:03:54Z

What is the justification for this change?

being able to share the same attributes across traces, metrics and logs

see @jsuereth's #3342 (review):

We want to share attributes between metrics + other signals. This is to aide in telemetry correlation, one of OTEL's core missions. Why the disparity?

tigrannajaryan · 2023-04-24T22:01:12Z

What is the justification for this change?

being able to share the same attributes across traces, metrics and logs

see @jsuereth's #3342 (review):

We want to share attributes between metrics + other signals. This is to aide in telemetry correlation, one of OTEL's core missions. Why the disparity?

Good point. I think we need to put some more thought into this this.

The purpose of namespacing is to avoid name conflicts.

Attributes that we define for only a particular metric are already in their own namespace, the name of the metric (e.g. state attribute of process.cpu.time metric). These attributes don't need any other help to ensure there are no conflicts. These attributes are also only used for metrics and never for other signals.

Generic attributes (e.g. http.status_code) that can be recorded on a span, on a log or on a metric need such namespacing.

I think we need to make it a rule that metric-only attributes defined for specific metrics don't need to be in a namespace. So, I would make this change only for attributes which we plan to also record for traces and logs or which are defined as general-purpose attributes which can be recorded on any metrics (and thus have higher probability of clashes with other such attributes). At a glance cpu.state does not make sense to me as a span or log attribute, but perhaps there is a use case for that?

@jsuereth thoughts?

jack-berg · 2023-04-24T22:17:22Z

I think we need to make it a rule that metric-only attributes defined for specific metrics don't need to be in a namespace. I would make this change only for attributes which we plan to also record for traces and logs or which are defined as general-purpose attributes which can be recorded on any metrics (and thus have higher probability of clashes with other such attributes)

This seems like an appealing way to think about it. I wonder how this would work with code generation. Right now code generation produces one big class containing all the attributes for traces, metrics, and logs. If an attribute like cpu state were to be scoped to specific metrics process.cpu.time and process.cpu.utilization, we'd either need multiple copies of state, or some sort of documentation describing the different contexts in which it is used.

Something like:

public static final AttributeKey<String> PROCESS_CPU_TIME_STATE = stringKey("state");

public static final AttributeKey<String> PROCESS_CPU_UTILIZATION_STATE = stringKey("state");

Or:

/**
 * Used in multiple contexts:
 * <ul>
 *   <li>process.cpu.time: <description of meaning when used with process.cpu.time>
 *   <li>process.cpu.utilization:  <description of meaning when used with process.cpu.utilization>
 */
public static final AttributeKey<String> STATE = stringKey("state");

trask · 2023-04-25T00:33:14Z

I agree that lots of metric attributes don't have a use on spans, but it seems like they could potentially be useful on logs, e.g.

metrics could be derived from logs
raw measurements could be emitted via logs

trask · 2023-04-25T01:03:10Z

(I don't think schema transformations support this, since e.g. we don't want to rename the metric attribute state across all metrics, only the one for the process.cpu.time metric)

It does. You can use apply_to_metrics setting to limit attribute renaming to specific metrics.

nice, I added this, thx

AlexanderWert · 2023-04-25T14:00:16Z

From the metrics attributes mentioned above, the only one I see a valuable correlation between metrics and logs is direction for process.network.io (-> network.direction). So, it would make sense to add a namespace for that attribute.

However, if we want to align this with ECS' network.direction we need to make sure the recommended / potential values are the same.

In OTel direction SHOULD be one of: receive, transmit.

In ECS the valid values for network.direction are:

ingress (corresponds to OTel's receive)
egress (corresponds to OTel's transmit)
inbound
outbound
internal
external
unknown

With the following description:

Direction of the network traffic.

When mapping events from a host-based monitoring context, populate this field from the host’s point of view, using the values "ingress" or "egress".

When mapping events from a network or perimeter-based monitoring context, populate this field from the point of view of the network perimeter, using the values "inbound", "outbound", "internal" or "external".

Note that "internal" is not crossing perimeter boundaries, and is meant to describe communication between two hosts within the perimeter. Note also that "external" is meant to describe traffic between two hosts that are external to the perimeter. This could for example be useful for ISPs or VPN service providers.

Alternatively, we could think about adding receive/transmit to the list of ECS' values as values to be used in a process-based context

jsuereth · 2023-04-25T17:50:17Z

CHANGELOG.md

+  ([#3390](https://github.com/open-telemetry/opentelemetry-specification/pull/3390))
+- Rename `process.cpu.time` metric attribute `state` to `cpu.state`,
+  rename `process.cpu.utilization` metric attribute `state` to `cpu.state`,
+  rename `process.disk.io` metric attribute `direction` to `disk.direction`,


I understand wanting the consistency. I, personally, view disk as "yet another type of network operation", thanks to years of SAN and DFS.

I think we SHOULD get together a semconv group to really look at these at some point, but unfortunately we have a ton of users right now.

Simple renames and namespacing, I think we can afford to do, especially given what this could provide us w/ Log -> Metric or Trace -> Metric generation.

Anything more semantically different, we need to be cautious about.

bertysentry · 2023-05-01T22:33:44Z

TBH I don't like these changes.

As @tigrannajaryan mentioned, there is no possible conflict between attributes using the same name but for different metrics. Specifying cpu.state instead of just state for a process.cpu.usage metric doesn't add any information. Same goes for disk.direction vs direction for process.disk.io: we already know it's about the disk.

Also, metrics don't have the same hierarchical nature as logs and traces with inherited attributes and context. This justifies having different rules for metrics attributes.

Now, I hear the argument about correlation between spans, logs, and metrics. The goal is noble but adding too much constraints on attributes naming between traces and metrics is going to create a lot of discussion and tension between spec writers. There are cases where a span attribute will require a very long namespace, and a small rare use case where we may want to correlate that attribute with a metric, and then the metric should use the long namespace for the attribute name, causing migration issues, etc.

While consistency is absolutely necessary, correlation between metrics and traces cannot be ensured by semantic conventions. Correlation engines will always require some mapping mechanisms, and in the age of AIOps, enforcing identical attribute names everywhere doesn't sound necessary.

Sorry for the long rant, I just mean: can't we keep it as simple as possible, while ensuring consistency in terms, and rely on implementation for correlation and other niceties?

specification/metrics/semantic_conventions/process-metrics.md

reyang · 2023-05-09T04:37:47Z

@trask heads up - most likely this PR will be closed, and we'll ask you to resubmit the PR in a new repo, please refer to #3474 (comment).

jack-berg

Approving to indicate my support, even though I understand this PR will need to be re-opened against the new repo.

My reasons:

Its very convenient to have a global set of attribute keys which mean the same thing in all contexts. This isn't strictly necessary, since we can say metric attributes are scoped to the metric name, but its a nicety I'm reluctant to give up.
It will be hard to anticipate when an attribute will only be used in a metric (and not need a namespace) vs be used across signals (and need a namespace). Telemetry schemas would allow for renames if an attribute is initially limited to metrics and later is exposed in logs or traces, but IMO its better to not have to use telemetry schemas even if they could be used.
Common rules for attributes in all signals is simpler than having an exception just for metrics. I don't share the view that using non-namespaced attributes for metrics is simpler. Consistency is simplicity.

PeterF778 · 2023-05-12T15:25:29Z

Since there are good arguments for namespacing metric attributes as well as against, how about adding this requirement:

For each metric, its attribute names must be unique even after stripping the namespacing prefix.

This will allow some flexibility with handling metric attributes by backends, and specifically UIs.

jsuereth

Please move to https://github.com/open-telemetry/semantic-conventions

github-actions · 2023-05-20T03:16:46Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

trask · 2023-05-24T03:07:53Z

hey all, the https://github.com/orgs/open-telemetry/teams/technical-committee is reviewing this issue, and asked to create a summary of pros/cons. I have created open-telemetry/semantic-conventions#51 based on the comments here, please comment further over there if I missed anything, thx!

trask added 2 commits April 21, 2023 15:46

Add namespace for process metric attributes

4bfeeb3

Change log entry

7d71cbd

trask marked this pull request as ready for review April 21, 2023 22:57

trask requested review from a team April 21, 2023 22:57

github-actions bot assigned jmacd Apr 21, 2023

trask mentioned this pull request Apr 21, 2023

Attribute names in semantic guidelines should be hierarchical #3131

Closed

reyang reviewed Apr 21, 2023

View reviewed changes

CHANGELOG.md Outdated Show resolved Hide resolved

reyang reviewed Apr 21, 2023

View reviewed changes

trask mentioned this pull request Apr 21, 2023

JVM metrics use attributes which aren't namespaced #3417

Closed

trask added 2 commits April 21, 2023 16:35

cpu.context_switch -> os.context_switch

62088b0

os.context_switch -> context_switch

6aa922c

reyang approved these changes Apr 21, 2023

View reviewed changes

Yun-Ting approved these changes Apr 21, 2023

View reviewed changes

trask added 2 commits April 24, 2023 17:33

Merge branch 'main' into process-metric-attrs

975400c

Add schema transformation

f2a4769

jsuereth approved these changes Apr 25, 2023

View reviewed changes

sebastien-rosset mentioned this pull request Apr 28, 2023

Attributes namespacing clarification #3342

Closed

jack-berg mentioned this pull request Apr 28, 2023

Add new JVM runtime environment metrics #3352

Closed

Merge branch 'main' into process-metric-attrs

4d04e08

zeitlinger mentioned this pull request May 2, 2023

add namespace to jvm metric attributes #3464

Closed

PeterF778 reviewed May 2, 2023

View reviewed changes

specification/metrics/semantic_conventions/process-metrics.md Outdated Show resolved Hide resolved

PeterF778 reviewed May 2, 2023

View reviewed changes

specification/metrics/semantic_conventions/process-metrics.md Outdated Show resolved Hide resolved

trask added 2 commits May 8, 2023 14:42

Merge remote-tracking branch 'upstream/main' into process-metric-attrs

4312df7

Fix

1c195d6

reyang added the area:semantic-conventions Related to semantic conventions label May 9, 2023

arminru added the spec:metrics Related to the specification/metrics directory label May 9, 2023

jack-berg approved these changes May 11, 2023

View reviewed changes

jsuereth requested changes May 12, 2023

View reviewed changes

github-actions bot added the Stale label May 20, 2023

trask mentioned this pull request May 24, 2023

Should metric attributes be required to be namespaced? open-telemetry/semantic-conventions#51

Closed

trask closed this May 24, 2023

trask deleted the process-metric-attrs branch May 24, 2023 03:07

AlexanderWert mentioned this pull request May 25, 2023

REQUEST: New membership for @AlexanderWert open-telemetry/community#1507

Closed

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add namespace for process metric attributes #3431

Add namespace for process metric attributes #3431

trask commented Apr 21, 2023 •

edited

Loading

reyang Apr 21, 2023

jsuereth Apr 25, 2023

reyang left a comment

reyang commented Apr 21, 2023

tigrannajaryan commented Apr 24, 2023

trask commented Apr 24, 2023

tigrannajaryan commented Apr 24, 2023

jack-berg commented Apr 24, 2023

trask commented Apr 25, 2023

trask commented Apr 25, 2023

AlexanderWert commented Apr 25, 2023

jsuereth Apr 25, 2023

bertysentry commented May 1, 2023

reyang commented May 9, 2023

jack-berg left a comment

PeterF778 commented May 12, 2023

jsuereth left a comment

github-actions bot commented May 20, 2023

trask commented May 24, 2023

Add namespace for process metric attributes #3431

Add namespace for process metric attributes #3431

Conversation

trask commented Apr 21, 2023 • edited Loading

Changes

Summary of benefits

reyang Apr 21, 2023

Choose a reason for hiding this comment

jsuereth Apr 25, 2023

Choose a reason for hiding this comment

reyang left a comment

Choose a reason for hiding this comment

reyang commented Apr 21, 2023

tigrannajaryan commented Apr 24, 2023

trask commented Apr 24, 2023

tigrannajaryan commented Apr 24, 2023

jack-berg commented Apr 24, 2023

trask commented Apr 25, 2023

trask commented Apr 25, 2023

AlexanderWert commented Apr 25, 2023

jsuereth Apr 25, 2023

Choose a reason for hiding this comment

bertysentry commented May 1, 2023

reyang commented May 9, 2023

jack-berg left a comment

Choose a reason for hiding this comment

PeterF778 commented May 12, 2023

jsuereth left a comment

Choose a reason for hiding this comment

github-actions bot commented May 20, 2023

trask commented May 24, 2023

trask commented Apr 21, 2023 •

edited

Loading