Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/transform] Add new Context-specific configuration options #15381

Merged
merged 32 commits into from
Nov 10, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
b858f59
Add new Context-specific configuration options
TylerHelmuth Oct 20, 2022
07457a8
Add ContextStatements config
TylerHelmuth Oct 21, 2022
cd058e7
Add high-level context
TylerHelmuth Oct 25, 2022
5f8471c
Add high-level context
TylerHelmuth Oct 25, 2022
2da5d8a
Merge branch 'open-telemetry:main' into tp-enhanced-contexts-2
TylerHelmuth Oct 26, 2022
ffc1fee
respond to feedback
TylerHelmuth Oct 27, 2022
38b7d9c
Fix merge
TylerHelmuth Oct 27, 2022
9133edf
Merge branch 'open-telemetry:main' into tp-enhanced-contexts-2
TylerHelmuth Oct 27, 2022
cab0096
Merge branch 'open-telemetry:main' into tp-enhanced-contexts-2
TylerHelmuth Oct 28, 2022
edfe1a4
Adjust ParserCollection
TylerHelmuth Oct 31, 2022
cdf7c7e
Add example usage temporarily
TylerHelmuth Oct 31, 2022
a19bd17
apply feedback
TylerHelmuth Oct 31, 2022
14d60b1
Split into individual files
TylerHelmuth Nov 1, 2022
7e044a4
merge in upstream/main
TylerHelmuth Nov 2, 2022
8950da0
Add spanevent and metric contexts
TylerHelmuth Nov 2, 2022
438ce15
revert logs processor example
TylerHelmuth Nov 2, 2022
b725cfb
Merge in updates
TylerHelmuth Nov 2, 2022
1371a6e
merge in upstream/main
TylerHelmuth Nov 8, 2022
85045f5
Fix lint
TylerHelmuth Nov 8, 2022
1a83c26
run make gotidy
TylerHelmuth Nov 8, 2022
335d124
fix impi
TylerHelmuth Nov 8, 2022
4c1dd6b
Update readme
TylerHelmuth Nov 8, 2022
8ce73a0
merge in upstream/main
TylerHelmuth Nov 8, 2022
7cc39ed
add changelog entry
TylerHelmuth Nov 8, 2022
caf0889
Merge remote-tracking branch 'upstream/main' into tp-enhanced-contexts
TylerHelmuth Nov 8, 2022
342d086
merge in upstream/main
TylerHelmuth Nov 9, 2022
c3f163b
Update processor/transformprocessor/README.md
TylerHelmuth Nov 9, 2022
ad39509
Add test cases that reuse context
TylerHelmuth Nov 9, 2022
3d7e974
Merge branch 'tp-enhanced-contexts' of https://github.com/TylerHelmut…
TylerHelmuth Nov 9, 2022
6e7426a
Update processor/transformprocessor/README.md
TylerHelmuth Nov 9, 2022
a939062
Apply feedback
TylerHelmuth Nov 9, 2022
1ad3e04
Update processor/transformprocessor/README.md
TylerHelmuth Nov 10, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions .chloggen/tp-enhanced-contexts.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: processor/transform

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Adds new configuration options that allow specifying the OTTL context to use when executing statements. See [Transform Processor README](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor#config) for more details.

# One or more tracking issues related to the change
issues: [15381]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
The existing configuration options will be deprecated in a future release.
151 changes: 116 additions & 35 deletions processor/transformprocessor/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,66 +8,147 @@
| Warnings | [Unsound Transformations, Identity Conflict, Orphaned Telemetry, Other](#warnings) |

The transform processor modifies telemetry based on configuration using the [OpenTelemetry Transformation Language](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl).
The processor takes a list of statements for each signal type and executes the statements against the incoming telemetry in the order specified in the config. Each statement can access and transform telemetry using functions and allow the use of a condition to help decide whether the function should be executed.

For each signal type, the processor takes a list of statements associated to a [Context type](#contexts) and executes the statements against the incoming telemetry in the order specified in the config.
Each statement can access and transform telemetry using functions and allow the use of a condition to help decide whether the function should be executed.

## Config

The transform processor allows configuring statements for traces, metrics, and logs. Each signal specifies a list of string statements that get passed to the OTTL for interpretation.
The transform processor allows configuring multiple context statements for traces, metrics, and logs.
The value of `context` specifies which [OTTL Context](#contexts) to use when interpreting the associated statements.
The statement strings, which must be OTTL compatible, will be passed to the OTTL and interpreted using the associated context.
Each context will be processed in the order specified and each statement for a context will be executed in the order specified.

```yaml
transform:
<traces|metrics|logs>:
statements:
- string
- string
- string
<trace|metric|log>_statements:
- context: string
statements:
- string
- string
- string
- context: string
statements:
- string
- string
- string
```

Proper use of contexts will provide increased performance and capabilities. See [Contexts](#contexts) for more details.

Valid values for `context` are:

| Signal | Context Values |
|-------------------|------------------------------------------------|
| trace_statements | `resource`, `scope`, `trace`, and `spanevent` |
| metric_statements | `resource`, `scope`, `metric`, and `datapoint` |
| log_statements | `resource`, `scope`, and `log` |

## Example

The example takes advantage of context efficiency by grouping transformations with the context which it intends to transform.
See [Contexts](#contexts) for more details.

Example configuration:
```yaml
transform:
traces:
statements:
- set(status.code, 1) where attributes["http.path"] == "/health"
- keep_keys(resource.attributes, ["service.name", "service.namespace", "cloud.region", "process.command_line"])
- set(name, attributes["http.route"])
- replace_match(attributes["http.target"], "/user/*/list/*", "/user/{userId}/list/{listId}")
- replace_pattern(resource.attributes["process.command_line"], "password\\=[^\\s]*(\\s?)", "password=***")
- limit(attributes, 100, [])
- limit(resource.attributes, 100, [])
trace_statements:
- context: resource
djaglowski marked this conversation as resolved.
Show resolved Hide resolved
statements:
- keep_keys(attributes, ["service.name", "service.namespace", "cloud.region", "process.command_line"])
- replace_pattern(attributes["process.command_line"], "password\\=[^\\s]*(\\s?)", "password=***")
- limit(attributes, 100, [])
- truncate_all(attributes, 4096)
- context: trace
statements:
- set(status.code, 1) where attributes["http.path"] == "/health"
- set(name, attributes["http.route"])
- replace_match(attributes["http.target"], "/user/*/list/*", "/user/{userId}/list/{listId}")
- limit(attributes, 100, [])
- truncate_all(attributes, 4096)

metric_statements:
- context: resource
statements:
- keep_keys(attributes, ["host.name"])
- truncate_all(attributes, 4096)
- truncate_all(resource.attributes, 4096)
metrics:
statements:
- set(metric.description, "Sum") where metric.type == "Sum"
- keep_keys(resource.attributes, ["host.name"])
- limit(attributes, 100, ["host.name"])
- truncate_all(attributes, 4096)
- truncate_all(resource.attributes, 4096)
- convert_sum_to_gauge() where metric.name == "system.processes.count"
- convert_gauge_to_sum("cumulative", false) where metric.name == "prometheus_metric"
logs:
statements:
- set(severity_text, "FAIL") where body == "request failed"
- replace_all_matches(attributes, "/user/*/list/*", "/user/{userId}/list/{listId}")
- replace_all_patterns(attributes, "/account/\\d{4}", "/account/{accountId}")
- set(body, attributes["http.route"])
- keep_keys(resource.attributes, ["service.name", "service.namespace", "cloud.region"])
- context: metric
statements:
- set(description, "Sum") where type == "Sum"
- context: datapoint
statements:
- limit(attributes, 100, ["host.name"])
- truncate_all(attributes, 4096)
- convert_sum_to_gauge() where metric.name == "system.processes.count"
- convert_gauge_to_sum("cumulative", false) where metric.name == "prometheus_metric"

log_statements:
- context: resource
statements:
- keep_keys(resource.attributes, ["service.name", "service.namespace", "cloud.region"])
- context: log
statements:
- set(severity_text, "FAIL") where body == "request failed"
- replace_all_matches(attributes, "/user/*/list/*", "/user/{userId}/list/{listId}")
- replace_all_patterns(attributes, "/account/\\d{4}", "/account/{accountId}")
- set(body, attributes["http.route"])
```

## Grammar

You can learn more in-depth details on the capabilities and limitations of the OpenTelemetry Transformation Language used by the transform processor by reading about its [grammar](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl#grammar).

## Contexts

The transform processor utilizes the OTTL's standard contexts for Traces, Metrics and Logs. The contexts allow the OTTL to interact with the underlying telemetry data in its pdata form.
The transform processor utilizes the OTTL's contexts to transform Resource, Scope, Trace, SpanEvent, Metric, DataPoint, and Log telemetry.
The contexts allow the OTTL to interact with the underlying telemetry data in its pdata form.

- [Resource Context](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/contexts/ottlresource)
- [Scope Context](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/contexts/ottlscope)
- [Traces Context](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/contexts/ottltraces)
- [Metrics Context](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/contexts/ottldatapoints)
- [SpanEvent Context](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/contexts/ottlspanevent)
- [Metric Context](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/contexts/ottlmetric)
- [DataPoint Context](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/contexts/ottldatapoints)
- [Logs Context](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/ottl/contexts/ottllogs)

Each context allows transformation of its type of telemetry.
For example, statements associated to a `resource` context will be able to transform the resource's `attributes` and `dropped_attributes_count`.

Contexts __NEVER__ supply access to individual items "lower" in the protobuf definition.
- This means statements associated to a `resource` __WILL NOT__ be able to access the underlying instrumentation scopes.
- This means statements associated to a `scope` __WILL NOT__ be able to access the underlying telemetry slices (spans, metrics, or logs).
- Similarly, statements associated to a `metric` __WILL NOT__ be able to access individual datapoints, but can access the entire datapoints slice.
- Similarly, statements associated to a `trace` __WILL NOT__ be able to access individual SpanEvents, but can access the entire SpanEvents slice.

For practical purposes, this means that a context cannot make decisions on its telemetry based on telemetry "lower" in the structure.
For example, __the following context statement is not possible__ because it attempts to use individual datapoint attributes in the condition of a statements that is associated to a `metric`

```yaml
metric_statements:
- context: metric
statements:
- set(description, "test passed") where datapoints.attributes["test"] == "pass"
```

Context __ALWAYS__ supply access to the items "higher" in the protobuf definition that are associated to the telemetry being transformed.
- This means that statements associated to a `datapoint` have access to a datapoint's metric, instrumentation scope, and resource.
- This means that statements associated to a `spanevent` have access to a spanevent's span, instrumentation scope, and resource.
- This means that statements associated to a `trace`/`metric`/`log` have access to the telemetry's instrumentation scope, and resource.
- This means that statements associated to a `scope` have access to the scope's resource.

For example, __the following context statement is possible__ because `datapoint` statements can access the datapoint's metric.

```yaml
metric_statements:
- context: datapoint
statements:
- set(metric.description, "test passed") where attributes["test"] == "pass"
```

Whenever possible, associate your statements to the context that the statement intend to transform.
Although you can modify resource attributes associated to a span using the `trace` context, it is more efficient to use the `resource` context.
TylerHelmuth marked this conversation as resolved.
Show resolved Hide resolved
This is because contexts are nested: the efficiency comes because higher-level contexts can avoid iterating through any of the contexts at a lower level.

## Supported functions:

Since the transform processor utilizes the OTTL's contexts for Traces, Metrics, and Logs, it is able to utilize functions that expect pdata in addition to any common functions. These common functions can be used for any signal.
Expand Down
86 changes: 71 additions & 15 deletions processor/transformprocessor/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,14 +15,16 @@
package transformprocessor // import "github.com/open-telemetry/opentelemetry-collector-contrib/processor/transformprocessor"

import (
"fmt"

"go.opentelemetry.io/collector/component"
"go.opentelemetry.io/collector/config"
"go.uber.org/multierr"
"go.uber.org/zap"

"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl/contexts/ottldatapoints"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl/contexts/ottllogs"
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/ottl/contexts/ottltraces"
"github.com/open-telemetry/opentelemetry-collector-contrib/processor/transformprocessor/internal/common"
"github.com/open-telemetry/opentelemetry-collector-contrib/processor/transformprocessor/internal/logs"
"github.com/open-telemetry/opentelemetry-collector-contrib/processor/transformprocessor/internal/metrics"
"github.com/open-telemetry/opentelemetry-collector-contrib/processor/transformprocessor/internal/traces"
Expand All @@ -31,6 +33,11 @@ import (
type Config struct {
config.ProcessorSettings `mapstructure:",squash"`

TraceStatements []common.ContextStatements `mapstructure:"trace_statements"`
MetricStatements []common.ContextStatements `mapstructure:"metric_statements"`
LogStatements []common.ContextStatements `mapstructure:"log_statements"`

// Deprecated. Use TraceStatements, MetricStatements, and LogStatements instead
OTTLConfig `mapstructure:",squash"`
}

Expand All @@ -47,24 +54,73 @@ type SignalConfig struct {
var _ component.ProcessorConfig = (*Config)(nil)

func (c *Config) Validate() error {
var errors error
if (len(c.Traces.Statements) > 0 || len(c.Metrics.Statements) > 0 || len(c.Logs.Statements) > 0) &&
(len(c.TraceStatements) > 0 || len(c.MetricStatements) > 0 || len(c.LogStatements) > 0) {
return fmt.Errorf("cannot use Traces, Metrics and/or Logs with TraceStatements, MetricStatements and/or LogStatements")
}

if len(c.Traces.Statements) > 0 {
ottltracesp := ottltraces.NewParser(traces.Functions(), component.TelemetrySettings{Logger: zap.NewNop()})
_, err := ottltracesp.ParseStatements(c.Traces.Statements)
if err != nil {
return err
}
}

ottltracesp := ottltraces.NewParser(traces.Functions(), component.TelemetrySettings{Logger: zap.NewNop()})
_, err := ottltracesp.ParseStatements(c.Traces.Statements)
if err != nil {
errors = multierr.Append(errors, err)
if len(c.TraceStatements) > 0 {
pc, err := common.NewTraceParserCollection(traces.Functions(), component.TelemetrySettings{Logger: zap.NewNop()})
if err != nil {
return err
}
for _, cs := range c.TraceStatements {
_, err = pc.ParseContextStatements(cs)
if err != nil {
return err
}
}
}

ottlmetricsp := ottldatapoints.NewParser(metrics.Functions(), component.TelemetrySettings{Logger: zap.NewNop()})
_, err = ottlmetricsp.ParseStatements(c.Metrics.Statements)
if err != nil {
errors = multierr.Append(errors, err)
if len(c.Metrics.Statements) > 0 {
ottlmetricsp := ottldatapoints.NewParser(metrics.Functions(), component.TelemetrySettings{Logger: zap.NewNop()})
_, err := ottlmetricsp.ParseStatements(c.Metrics.Statements)
if err != nil {
return err
}
}

ottllogsp := ottllogs.NewParser(logs.Functions(), component.TelemetrySettings{Logger: zap.NewNop()})
_, err = ottllogsp.ParseStatements(c.Logs.Statements)
if err != nil {
errors = multierr.Append(errors, err)
if len(c.MetricStatements) > 0 {
pc, err := common.NewMetricParserCollection(metrics.Functions(), component.TelemetrySettings{Logger: zap.NewNop()})
if err != nil {
return err
}
for _, cs := range c.MetricStatements {
_, err = pc.ParseContextStatements(cs)
if err != nil {
return err
}
}
}
return errors

if len(c.Logs.Statements) > 0 {
ottllogsp := ottllogs.NewParser(logs.Functions(), component.TelemetrySettings{Logger: zap.NewNop()})
_, err := ottllogsp.ParseStatements(c.Logs.Statements)
if err != nil {
return err
}
}

if len(c.LogStatements) > 0 {
pc, err := common.NewLogParserCollection(logs.Functions(), component.TelemetrySettings{Logger: zap.NewNop()})
if err != nil {
return err
}
for _, cs := range c.LogStatements {
_, err = pc.ParseContextStatements(cs)
if err != nil {
return err
}
}
}

return nil
}
Loading