Skip to content

Commit

Permalink
Enable Target Allocator Rewrite by default (open-telemetry#2231)
Browse files Browse the repository at this point in the history
* Enable Target Allocator Rewrite by default

* Add test for target allocator rewrite override behaviour
  • Loading branch information
swiatekm authored Oct 18, 2023
1 parent 27f4ba8 commit 82a25a4
Show file tree
Hide file tree
Showing 8 changed files with 169 additions and 57 deletions.
18 changes: 18 additions & 0 deletions .chloggen/ta-reload-beta.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: breaking

# The name of the component, or a single word describing the area of concern, (e.g. operator, target allocator, github action)
component: operator

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Enable Target Allocator Rewrite by default

# One or more tracking issues related to the change
issues: [2208]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
See [the documentation](/README.md#target-allocator) for details.
Use the `--feature-gates=-operator.collector.rewritetargetallocator` command line option to switch back to the old behaviour.
55 changes: 11 additions & 44 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -541,7 +541,7 @@ For more information about multi-instrumentation feature capabilities please see

### Target Allocator

The OpenTelemetry Operator comes with an optional component, the [Target Allocator](/cmd/otel-allocator/README.md) (TA). When creating an OpenTelemetryCollector Custom Resource (CR) and setting the TA as enabled, the Operator will create a new deployment and service to serve specific `http_sd_config` directives for each Collector pod as part of that CR. It will also change the Prometheus receiver configuration in the CR, so that it uses the [http_sd_config](https://prometheus.io/docs/prometheus/latest/http_sd/) from the TA. The following example shows how to get started with the Target Allocator:
The OpenTelemetry Operator comes with an optional component, the [Target Allocator](/cmd/otel-allocator/README.md) (TA). When creating an OpenTelemetryCollector Custom Resource (CR) and setting the TA as enabled, the Operator will create a new deployment and service to serve specific `http_sd_config` directives for each Collector pod as part of that CR. It will also rewrite the Prometheus receiver configuration in the CR, so that it uses the deployed target allocator. The following example shows how to get started with the Target Allocator:

```yaml
apiVersion: opentelemetry.io/v1alpha1
Expand Down Expand Up @@ -579,6 +579,7 @@ spec:
processors: []
exporters: [debug]
```

The usage of `$$` in the replacement keys in the example above is based on the information provided in the Prometheus receiver [README](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/prometheusreceiver/README.md) documentation, which states:
`Note: Since the collector configuration supports env variable substitution $ characters in your prometheus configuration are interpreted as environment variables. If you want to use $ characters in your prometheus configuration, you must escape them using $$.`

Expand All @@ -587,19 +588,10 @@ Behind the scenes, the OpenTelemetry Operator will convert the Collector’s con
```yaml
receivers:
prometheus:
config:
scrape_configs:
- job_name: otel-collector
scrape_interval: 10s
http_sd_configs:
- url: http://collector-with-ta-targetallocator:80/jobs/otel-collector/targets?collector_id=$POD_NAME
metric_relabel_configs:
- action: labeldrop
regex: (id|name)
replacement: $$1
- action: labelmap
regex: label_(.+)
replacement: $$1
target_allocator:
endpoint: http://collector-with-ta-targetallocator:80
interval: 30s
collector_id: $POD_NAME
exporters:
debug:
Expand All @@ -612,8 +604,6 @@ Behind the scenes, the OpenTelemetry Operator will convert the Collector’s con
exporters: [debug]
```

Note how the Operator removes any existing service discovery configurations (e.g., `static_configs`, `file_sd_configs`, etc.) from the `scrape_configs` section and adds an `http_sd_configs` configuration pointing to a Target Allocator instance it provisioned.

The OpenTelemetry Operator will also convert the Target Allocator's Prometheus configuration after the reconciliation into the following:

```yaml
Expand All @@ -631,40 +621,17 @@ The OpenTelemetry Operator will also convert the Target Allocator's Prometheus c
regex: label_(.+)
replacement: $1
```

Note that in this case, the Operator replaces "$$" with a single "$" in the replacement keys. This is because the collector supports environment variable substitution, whereas the TA (Target Allocator) does not. Therefore, to ensure compatibility, the TA configuration should only contain a single "$" symbol.

More info on the TargetAllocator can be found [here](cmd/otel-allocator/README.md).

#### Target Allocator config rewriting
#### Using Prometheus Custom Resources for service discovery

Prometheus receiver now has explicit support for acquiring scrape targets from the target allocator. As such, it is now possible to have the
Operator add the necessary target allocator configuration automatically. This feature currently requires the `operator.collector.rewritetargetallocator` feature flag to be enabled. With the flag enabled, the configuration from the previous section would be rendered as:

```yaml
receivers:
prometheus:
config:
global:
scrape_interval: 1m
scrape_timeout: 10s
evaluation_interval: 1m
target_allocator:
endpoint: http://collector-with-ta-targetallocator:80
interval: 30s
collector_id: $POD_NAME
exporters:
debug:
service:
pipelines:
metrics:
receivers: [prometheus]
processors: []
exporters: [debug]
```
The target allocator can use Custom Resources from the prometheus-operator ecosystem, like ServiceMonitors and PodMonitors, for service discovery, performing
a function analogous to that of prometheus-operator itself. This is enabled via the `prometheusCR` section in the Collector CR.

This also allows for a more straightforward collector configuration for target discovery using prometheus-operator CRDs. See below for a minimal example:
See below for a minimal example:

```yaml
apiVersion: opentelemetry.io/v1alpha1
Expand Down
58 changes: 51 additions & 7 deletions internal/manifests/collector/config_replace_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ import (

"github.com/prometheus/prometheus/discovery/http"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
colfeaturegate "go.opentelemetry.io/collector/featuregate"
"gopkg.in/yaml.v2"

Expand All @@ -32,6 +33,11 @@ func TestPrometheusParser(t *testing.T) {
assert.NoError(t, err)

t.Run("should update config with http_sd_config", func(t *testing.T) {
err := colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), false)
require.NoError(t, err)
t.Cleanup(func() {
_ = colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), true)
})
actualConfig, err := ReplaceConfig(param.OtelCol)
assert.NoError(t, err)

Expand Down Expand Up @@ -63,11 +69,7 @@ func TestPrometheusParser(t *testing.T) {
assert.True(t, cfg.TargetAllocConfig == nil)
})

t.Run("should update config with targetAllocator block", func(t *testing.T) {
err := colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), true)
param.OtelCol.Spec.TargetAllocator.Enabled = true
assert.NoError(t, err)

t.Run("should update config with targetAllocator block if block not present", func(t *testing.T) {
// Set up the test scenario
param.OtelCol.Spec.TargetAllocator.Enabled = true
actualConfig, err := ReplaceConfig(param.OtelCol)
Expand All @@ -87,9 +89,32 @@ func TestPrometheusParser(t *testing.T) {
"collector_id": "${POD_NAME}",
}
assert.Equal(t, expectedTAConfig, promCfgMap["target_allocator"])
assert.NoError(t, err)
})

t.Run("should update config with targetAllocator block if block already present", func(t *testing.T) {
// Set up the test scenario
paramTa, err := newParams("test/test-img", "testdata/http_sd_config_ta_test.yaml")
require.NoError(t, err)
paramTa.OtelCol.Spec.TargetAllocator.Enabled = true

actualConfig, err := ReplaceConfig(paramTa.OtelCol)
assert.NoError(t, err)

// Disable the feature flag
err = colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), false)
// Verify the expected changes in the config
promCfgMap, err := ta.ConfigToPromConfig(actualConfig)
assert.NoError(t, err)

prometheusConfig := promCfgMap["config"].(map[interface{}]interface{})

assert.NotContains(t, prometheusConfig, "scrape_configs")

expectedTAConfig := map[interface{}]interface{}{
"endpoint": "http://test-targetallocator:80",
"interval": "30s",
"collector_id": "${POD_NAME}",
}
assert.Equal(t, expectedTAConfig, promCfgMap["target_allocator"])
assert.NoError(t, err)
})

Expand Down Expand Up @@ -145,6 +170,12 @@ func TestReplaceConfig(t *testing.T) {
})

t.Run("should rewrite scrape configs with SD config when TargetAllocator is enabled and feature flag is not set", func(t *testing.T) {
err := colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), false)
require.NoError(t, err)
t.Cleanup(func() {
_ = colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), true)
})

param.OtelCol.Spec.TargetAllocator.Enabled = true

expectedConfigBytes, err := os.ReadFile("testdata/relabel_config_expected_with_sd_config.yaml")
Expand All @@ -156,4 +187,17 @@ func TestReplaceConfig(t *testing.T) {

assert.Equal(t, expectedConfig, actualConfig)
})

t.Run("should remove scrape configs if TargetAllocator is enabled and feature flag is set", func(t *testing.T) {
param.OtelCol.Spec.TargetAllocator.Enabled = true

expectedConfigBytes, err := os.ReadFile("testdata/config_expected_targetallocator.yaml")
assert.NoError(t, err)
expectedConfig := string(expectedConfigBytes)

actualConfig, err := ReplaceConfig(param.OtelCol)
assert.NoError(t, err)

assert.Equal(t, expectedConfig, actualConfig)
})
}
18 changes: 13 additions & 5 deletions internal/manifests/collector/configmap_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,12 @@ service:

})

t.Run("should return expected collector config map with http_sd_config", func(t *testing.T) {
t.Run("should return expected collector config map with http_sd_config if rewrite flag disabled", func(t *testing.T) {
err := colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), false)
assert.NoError(t, err)
t.Cleanup(func() {
_ = colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), true)
})
expectedLables["app.kubernetes.io/component"] = "opentelemetry-collector"
expectedLables["app.kubernetes.io/name"] = "test-collector"

Expand Down Expand Up @@ -112,7 +117,13 @@ service:

})

t.Run("should return expected escaped collector config map with http_sd_config", func(t *testing.T) {
t.Run("should return expected escaped collector config map with http_sd_config if rewrite flag disabled", func(t *testing.T) {
err := colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), false)
assert.NoError(t, err)
t.Cleanup(func() {
_ = colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), true)
})

expectedLables["app.kubernetes.io/component"] = "opentelemetry-collector"
expectedLables["app.kubernetes.io/name"] = "test-collector"
expectedLables["app.kubernetes.io/version"] = "latest"
Expand Down Expand Up @@ -163,8 +174,6 @@ service:
expectedLables["app.kubernetes.io/component"] = "opentelemetry-collector"
expectedLables["app.kubernetes.io/name"] = "test-collector"
expectedLables["app.kubernetes.io/version"] = "latest"
err := colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), true)
assert.NoError(t, err)

expectedData := map[string]string{
"collector.yaml": `exporters:
Expand Down Expand Up @@ -199,7 +208,6 @@ service:

// Reset the value
expectedLables["app.kubernetes.io/version"] = "0.47.0"
err = colfeaturegate.GlobalRegistry().Set(featuregate.EnableTargetAllocatorRewrite.ID(), false)
assert.NoError(t, err)

})
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
debug: null
exporters: null
processors: null
receivers:
prometheus:
config:
global:
evaluation_interval: 1m
scrape_interval: 1m
scrape_timeout: 10s
target_allocator:
collector_id: ${POD_NAME}
endpoint: http://test-targetallocator:80
interval: 30s
service:
pipelines:
metrics:
exporters:
- debug
processors: []
receivers:
- prometheus
25 changes: 25 additions & 0 deletions internal/manifests/collector/testdata/http_sd_config_ta_test.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
processors:
receivers:
prometheus:
config:
scrape_configs:
- job_name: prometheus

static_configs:
- targets: ["prom.domain:9001", "prom.domain:9002", "prom.domain:9003"]
labels:
my: label
target_allocator:
collector_id: ${POD_NAME}
endpoint: http://test-sd-targetallocator:80
interval: 60s

exporters:
debug:

service:
pipelines:
metrics:
receivers: [prometheus]
processors: []
exporters: [debug]
2 changes: 1 addition & 1 deletion pkg/featuregate/featuregate.go
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ var (
// automatically be rewritten when the target allocator is enabled.
EnableTargetAllocatorRewrite = featuregate.GlobalRegistry().MustRegister(
"operator.collector.rewritetargetallocator",
featuregate.StageAlpha,
featuregate.StageBeta,
featuregate.WithRegisterDescription("controls whether the operator should configure the collector's targetAllocator configuration"),
featuregate.WithRegisterFromVersion("v0.76.1"),
)
Expand Down
28 changes: 28 additions & 0 deletions tests/e2e/smoke-targetallocator/00-assert.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,34 @@ kind: ConfigMap
metadata:
name: stateful-targetallocator
---
apiVersion: v1
kind: ConfigMap
metadata:
name: stateful-collector
data:
collector.yaml: |
exporters:
debug: null
processors: null
receivers:
jaeger:
protocols:
grpc: null
prometheus:
config: {}
target_allocator:
collector_id: ${POD_NAME}
endpoint: http://stateful-targetallocator:80
interval: 30s
service:
pipelines:
traces:
exporters:
- debug
processors: []
receivers:
- jaeger
---
# Print TA pod logs if test fails
apiVersion: kuttl.dev/v1beta1
kind: TestAssert
Expand Down

0 comments on commit 82a25a4

Please sign in to comment.