Distinguish Tail Sampling Policies in Metrics by Prefixing Processor Name #34099

Closed
EOjeah opened this issue Jul 16, 2024 · 3 comments · Fixed by #34192
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed processor/tailsampling Tail sampling processor

Comments

@EOjeah
Contributor

EOjeah commented Jul 16, 2024

Component(s)

processor/tailsampling

Is your feature request related to a problem? Please describe.

Currently, the otelcol_processor_tail_sampling_sampling_decision_latency_bucket metric does not distinguish between tail sampling policies that share the same name. This makes it difficult to tell apart metrics emitted by different policies, especially when those policies are defined in separate tail sampling processors. For example, in the configuration below, both the tail_sampling/match and tail_sampling/string processors have a policy named default-policy, leading to identical metric labels and confusing metric analysis.

```yaml
  tail_sampling/match:
    policies: [
      {
        name: default-policy,
        type: ottl_condition,
        ottl_condition: {
          span: [
            'IsMatch(attributes["app.force_sample"], "true")',
          ]
        }
      }
    ]

  tail_sampling/string:
    policies: [
      {
        name: default-policy,
        type: ottl_condition,
        ottl_condition: {
          span: [
            'IsString(attributes["app.force_sample"])',
          ]
        }
      }
    ]
```
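
For illustration (the values below are hypothetical), without any prefixing both processors report their per-policy metrics under the same label set, so their observations cannot be told apart:

```bash
otelcol_processor_tail_sampling_sampling_decision_latency_bucket{policy="default-policy",service_instance_id="X",service_name="otelcontribcol",le="5000"} 2
```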

Describe the solution you'd like

To resolve this issue, I propose prefixing the policy label in the metric with the name of the tail sampling processor. This way, the metrics will be uniquely identifiable based on both the processor and the policy name. For instance:

The metric for the default-policy in the tail_sampling/match processor would be labeled as tail_sampling/match-default-policy (or, if the tail_sampling prefix is too obvious, just the part after the /, i.e. match-default-policy).

The metric for the default-policy in the tail_sampling/string processor would be labeled as tail_sampling/string-default-policy.

Describe alternatives you've considered

Unique Names for Each Policy: One alternative is to ensure that each policy has a unique name across all processors. However, this approach is not scalable, especially when policies are defined across multiple files. It would require more coordination and management to avoid name collisions.

Additional context

I only raised this issue for otelcol_processor_tail_sampling_sampling_decision_latency_bucket, but the same likely applies to the other per-policy metrics emitted by the tail sampling processor.

@EOjeah EOjeah added enhancement New feature or request needs triage New item requiring triage labels Jul 16, 2024
@github-actions github-actions bot added the processor/tailsampling Tail sampling processor label Jul 16, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@jpkrohling
Member

Seems like an easy one to tackle. Do you want to give it a try? You can get the "componentID" from processor.Settings that the Factory receives. You'd need to propagate it down to the place we initialize the telemetry.

@codeboten, perhaps the component ID should be a parameter of the telemetry builder? I believe the same problem would occur to every other component.
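
As a rough sketch of that suggestion (only `processor.Settings` and `component.ID` are real collector types; the struct and constructor below are hypothetical stand-ins):

```go
package tailsamplingprocessor

import (
	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/processor"
)

// tailSamplingSpanProcessor is a simplified stand-in for the real processor struct.
type tailSamplingSpanProcessor struct {
	componentID component.ID
}

// newTracesProcessor sketches how the component ID carried in processor.Settings
// could be captured in the factory and later handed to telemetry initialization.
func newTracesProcessor(set processor.Settings) *tailSamplingSpanProcessor {
	return &tailSamplingSpanProcessor{
		// set.ID is the full component ID, e.g. "tail_sampling/custom_name".
		componentID: set.ID,
	}
}
```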

@jpkrohling jpkrohling added help wanted Extra attention is needed good first issue Good for newcomers and removed needs triage New item requiring triage labels Jul 16, 2024
@EOjeah
Contributor Author

EOjeah commented Jul 16, 2024

Sure, I'll give it a go; you can assign the issue to me.

EOjeah added a commit to EOjeah/opentelemetry-collector-contrib that referenced this issue Jul 22, 2024
This change includes the componentID as a dot prefix to the metrics
'policy' dimension when generating metrics for the processor. The change
ensures that similarly named policies in the tail sampling processor that
belong to different components also have a unique value.

Resolves: open-telemetry#34099
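
A rough sketch of the prefixing described in this commit message, assuming the `policy` value is attached as a metric attribute (the helper functions below are hypothetical; the actual wiring in the processor may differ):

```go
package tailsamplingprocessor

import (
	"context"

	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// policyAttribute derives the `policy` attribute value by prefixing the policy
// name with the processor instance name (the part after "tail_sampling/"),
// joined with a dot. An unnamed processor keeps the plain policy name.
func policyAttribute(id component.ID, policyName string) attribute.KeyValue {
	if name := id.Name(); name != "" {
		// e.g. "custom_name.test-policy-1" for tail_sampling/custom_name
		return attribute.String("policy", name+"."+policyName)
	}
	return attribute.String("policy", policyName)
}

// recordDecisionLatency shows how that attribute would be attached when
// recording the sampling decision latency histogram.
func recordDecisionLatency(ctx context.Context, hist metric.Float64Histogram, latency float64, attr attribute.KeyValue) {
	hist.Record(ctx, latency, metric.WithAttributes(attr))
}
```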
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
…licy' (open-telemetry#34192)

**Description:**
Fixing a bug - This change includes the componentID as a dot prefix to
the metrics `policy` dimension when generating metrics for the
processor. The change ensures that similarly named policies in the tail
sampling processor that belong to different components also have a unique
value in the `policy` field for the metrics.

Also includes a minor refactor to rename `telemetry` to
`telemetryBuilder` where applicable (i.e. where the value is returned by
`NewTelemetryBuilder`).

Resolves: open-telemetry#34099

**Link to tracking Issue:** <Issue number if applicable>

**Testing:** 
Ran the collector locally with `make run` using the configuration below,
which uses the tail sampling processor and exposes metrics in Prometheus
format, then sent sample Zipkin spans to the receiver.
```yaml
receivers:
  zipkin:

processors:
  tail_sampling:
    policies:
      [
          {
            name: test-policy-1,
            type: always_sample
          }
      ]

  tail_sampling/custom_name:
    policies:
      [
          {
            name: test-policy-1,
            type: always_sample
          }
      ]

exporters:
  debug:

service:
  telemetry:
    logs:
    metrics:
  pipelines:
    traces:
      receivers: [zipkin]
      processors: [tail_sampling, tail_sampling/custom_name]
      exporters: [debug]
```

Curling the metrics endpoint shows the policy name is unique for both
tail sampling processors
```bash
otelcol_processor_tail_sampling_sampling_decision_latency_bucket{policy="custom_name.test-policy-1",service_instance_id="X",service_name="otelcontribcol",service_version="0.105.0-dev",le="5000"} 1

otelcol_processor_tail_sampling_sampling_decision_latency_bucket{policy="test-policy-1",service_instance_id="X",service_name="otelcontribcol",service_version="0.105.0-dev",le="5000"} 1
```

Tasks

- [ ] Confirm prefix separator as `.` 
- [ ] Update change log entry