Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Config sources to support full remote config and value substitution #4468

Conversation

tigrannajaryan
Copy link
Member

@tigrannajaryan tigrannajaryan commented Nov 23, 2021

Resolves: #4190

This is a draft PR (not production quality code) that shows roughly what the end
goal is. Don't merge this PR, it is merely to show what the final destination
looks like. If we agree that this is a good end goal I will split this into
series of smaller PRs and will submit them.

I am primarily looking for the feedback on the new concept and the APIs. There
is no need to review all the implementation details, they are not production
quality yet.

Configuration File Format Changes

Introduced 2 new optional sections in the configuration file format:
config_sources and merge_configs. config_sources defines one or more
configuration sources. Configuration sources are a new type of component that is
pluggable like all other component types by implementing its Factory.
Configuration sources must implement a configmapprovider.Provider interface.

The merge_configs section contains a sequence of config source references. The
configs from these sources is retrieved and merged in the specified order to
form the top-level configuration of the Collector. The locally specified
top-level configuration is merged last (thus has the highest precedence).

This approach allows to maintain backward compatibility. Any previously valid
configuration file remains valid. New configuration files may specify only
config_sources/merge_configs section, move the local configuration to a separate
file and use that file as one of the configuration sources.

This approach also does not require us to introduce a new --config_sources
command line option.

Configuration Provider Interfaces

Introduced the concept of config MapProvider and config ValueProvider, both
implementing a base Provider interface.

MapProvider is equivalent to the old Provider interface and allows retrieving
configuration maps.

ValueProvider can return configuration values, which includes primitive values,
arrays, maps. ValueProvider also supports returning different values based on
the selector and params passed which allows to parameterize the returned value
at the place of usage.

Command Line Invocation

ValueProvicer-type config sources can be invoked directly from the command line.
The --config option now accepts either a file path (which was the existing
functionality) or values in the form of <sourcename>:<selector>?<params>
(which is a new functionality). The new functionality is triggered by the
presence of a colon in the --config option value. This makes it impossible to
pass file names that use a colon character, but we assume that this is an
acceptable limitation since colon is often used as a file path delimiter
character in Unix systems anyway.

The new functionality allows for example for the following command lines:
./otelcol --config=config.yaml
./otelcol --config=files:config.yaml
./otelcol --config=http://example.com/path/to/config.yaml
./otelcol --config=https://example.com/path/to/config.yaml?foo=bar&bar=baz

Example Config Sources

As an example I implemented 3 config sources to demonstrate how things work:

  • env is a ValueProvider that allows getting environment variables by name.

  • files is both a MapProvider and a ValueProvider, which allows it to be used
    both as a top-level config source and as command line config source (see
    example above).

  • http is a ValueProvider which has factories that can be registered for "http"
    and "https" config sources and which can perform an HTTP GET operation to
    retrieve the config.

Other Changes

Deleted Expand map provider. It is no longer needed since environment expansion
is now handled in the valueSubstitutor which needs to distinguish between env
variables and config value source subsitution both starting with $ character.

Deleted config/internal/configsource.Manager. The new functionality covers what
Manager was intended to do.

TODO

  • Refine RetrieveValue() to return a more specific type instead of interface{}

  • See if it is worth combining MapProvider and ValueProvider into just Provider
    (doesn't seem so, but need to check).

  • See if it possible/desirable to allow config source settings to be a single
    string such that name is not nessary for files config source and instead
    the file name can be specified directly as the setting of files source.

  • See if better names are possible for the new top-level config file keys.
    Perhaps root_configs is a better name instead of merge_configs.

  • Some more cleanup and refactoring may be necessary to tidy things up once we
    are fully settled on the functionality. Probably rename configmapprovider
    package to configprovider, rename Retrieve to RetriveMap, etc.

  • Add more comments to explain the design and how things work.

  • Add tests for everything.

Example Usage From End User Perspective

Below are some example usages.

Example 1. Single Local Config File.

This is the old (current) way. It is still fully supported and will continue to
be supported. It is not obsolete and is the preferable way when there is a
single local config file.

Command line: ./otelcol --config=local.yaml

local.yaml content:

receivers:
  otlp:
    grpc:

exporters:
  otlp:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]

Example 2. Config Sources, Single Local Config File.

Command line: ./otelcol --config=sources.yaml

sources.yaml content:

config_sources:
  files:
    name: local.yaml

merge_configs: [files]

local.yaml content:

receivers:
  otlp:
    grpc:

exporters:
  otlp:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]

This example results in a config that is completely equivalent to the config in
Example 1.

Example 3. Config Sources from Command Line

This uses a shorthand for specifying a single config source on the command line.

Command line: ./otelcol --configs=files:local.yaml

local.yaml content:

receivers:
  otlp:
    grpc:

exporters:
  otlp:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]

Example 4. Multiple Local Config Files

Command line: ./otelcol --config=sources.yaml

sources.yaml content:

config_sources:
  # Merge all files that match the specified glob into one file and use that as a config
  files:
    name: /var/lib/otelcol/config/**/*.yaml

merge_configs: [files]

Example 5. From HTTP Server

Command line: ./otelcol --config=sources.yaml

source.yaml content:

config_sources:
  https:

merge_configs:
  - https://example.com/path/file.yaml

This will do a HTTP GET request to https://example.com/path/file.yaml, will
download the content and will use the content as the config file.

The equivalent result can be achieved using only the command line:

./otelcol --config=https://example.com/path/file.yaml

Example 6. Multiple Sources

Command line: ./otelcol --config=sources.yaml

source.yaml content:

config_sources:
  files:
    name: local.yaml
  s3: 
    bucket: mybucket
    region: us-east-1

merge_configs: [files,s3]

This will merge a local.yaml file with the content of an S3 bucket and will use
the content as the config file.

Example 7. Value Sources

Command line: ./otelcol --config=sources.yaml

source.yaml content:

config_sources:
  files: local.yaml

  # define a value source of "vault" type
  # that can be referenced later.
  vault:
    # endpoint is the Vault server address.
    endpoint: http://localhost:8200
    # path is the Vault path to the secret location.
    path: secret/data/kv
    auth:
      token: foo

merge_configs: [files]

local.yaml content:

receivers:
  otlp:
    grpc:

exporters:
  signalfx:
    # this will read the access_token value from "vault" config source.
    realm: us0
    access_token: $vault:data.access_token

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [signalfx]

Example 8. Environment Variables

Command line: ./otelcol --config=config.yaml

config.yaml content:

config_sources:
  env:

receivers:
  otlp:
    grpc:

exporters:
  signalfx:
    # Both of the following values are read from env variables.
    realm: $SIGNALFX_REALM
    access_token: $env:SIGNALFX_ACCESSTOKEN

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [signalfx]

Use "merge_configs" to specify root config sources to merge
Resolves: open-telemetry#4190

This is a draft PR (not production quality code) that shows roughly what the end
goal is. Don't merge this PR, it is merely to show what the final destination
looks like. If we agree that this is a good end goal I will split this into
series of smaller PRs and will submit them.

I am primarily looking for the feedback on the new concept and the APIs. There
is no need to review all the implementation details, they are not production
quality yet.

## Configuration File Format Changes

Introduced 2 new optional sections in the configuration file format:
config_sources and merge_configs. config_sources defines one or more
configuration sources. Configuration sources are a new type of component that is
pluggable like all other component types by implementing its Factory.
Configuration sources must implement a configmapprovider.Provider interface.

The merge_configs section contains a sequence of config source references. The
configs from these sources is retrieved and merged in the specified order to
form the top-level configuration of the Collector. The locally specified
top-level configuration is merged last (thus has the highest precedence).

This approach allows to maintain backward compatibility. Any previously valid
configuration file remains valid. New configuration files may specify only
config_sources/merge_configs section, move the local configuration to a separate
file and use that file as one of the configuration sources.

This approach also does not require us to introduce a new --config_sources
command line option.

## Configuration Provider Interfaces

Introduced the concept of config MapProvider and config ValueProvider, both
implementing a base Provider interface.

MapProvider is equivalent to the old Provider interface and allows retrieving
configuration maps.

ValueProvider can return configuration values, which includes primitive values,
arrays, maps. ValueProvider also supports returning different values based on
the selector and params passed which allows to parameterize the returned value
at the place of usage.

## Command Line Invocation

ValueProvicer-type config sources can be invoked directly from the command line.
The --config option now accepts either a file path (which was the existing
functionality) or values in the form of <sourcename>:<selector>?<params> (which
is a new functionality). The new functionality is triggered by the presence of a
colon in the --config option value. This makes it impossible to pass file names
that use a colon character, but we assume that this is an acceptable limitation
since colon is often used as a file path delimiter character in Unix systems
anyway.

The new functionality allows for example for the following command lines:
./otelcol --config=config.yaml
./otelcol --config=files:config.yaml
./otelcol --config=http://example.com/path/to/config.yaml
./otelcol --config=http://example.com/path/to/config.yaml?foo=bar&bar=baz

## Example Config Sources

As an example I implemented 3 config sources to demonstrate how things work:

- env is a ValueProvider that allows getting environment variables by name.

- files is both a MapProvider and a ValueProvider, which allows it to be used
  both as a top-level config source and as command line config source (see
  example above).

- http is a ValueProvider which has factories that can be registered for "http"
  and "https" config sources and which can perform an HTTP GET operation to
  retrieve the config.

## Other Changes

Deleted Expand map provider. It is no longer needed since environment expansion
is now handled in the valueSubstitutor which needs to distinguish between env
variables and config value source subsitution both starting with $ character.

Deleted config/internal/configsource.Manager. The new functionality covers what
Manager was intended to do.

## TODO

- Refine RetrieveValue() to return a more specific type instead of interface{}

- See if it is worth combining MapProvider and ValueProvider into just Provider
  (doesn't seem so, but need to check).

- See if it possible/desirable to allow config source settings to be a single
  string such that `name` is not nessary for `files` config source and instead
  the file name can be specified directly as the setting of `files` source.

- See if better names are possible for the new top-level config file keys.
  Perhaps root_configs is a better name instead of merge_configs.

- Some more cleanup and refactoring may be necessary to tidy things up once we
  are fully settled on the functionality. Probably rename configmapprovider
  package to configprovider, rename Retrieve to RetriveMap, etc.

- Add more comments to explain the design and how things work.

- Add tests for everything.

## Example Usage From End User Perspective

Below are some example usages.

### Example 1. Single Local Config File.

This is the old (current) way. It is still fully supported and will continue to
be supported. It is not obsolete and is the preferable way when there is a
single local config file.

Command line: `./otelcol --config=local.yaml`

local.yaml content:
```yaml
receivers:
  otlp:
    grpc:

exporters:
  otlp:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
```

### Example 2. Config Sources, Single Local Config File.

Command line: `./otelcol --config=sources.yaml`

sources.yaml content:
```yaml
config_sources:
  files:
    name: local.yaml

merge_configs: [files]
```

local.yaml content:
```yaml
receivers:
  otlp:
    grpc:

exporters:
  otlp:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
```

This example results in a config that is completely equivalent to the config in
Example 1.

### Example 3. Config Sources from Command Line

This uses a shorthand for specifying a single config source on the command line.

Command line: `./otelcol --configs=files:local.yaml`

local.yaml content:
```yaml
receivers:
  otlp:
    grpc:

exporters:
  otlp:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
```

### Example 4. Multiple Local Config Files

Command line: `./otelcol --config=sources.yaml`

sources.yaml content:
```yaml
config_sources:
  # Merge all files that match the specified glob into one file and use that as a config
  files:
    name: /var/lib/otelcol/config/**/*.yaml

merge_configs: [files]
```

### Example 5. From HTTP Server

Command line: `./otelcol --config=sources.yaml`

source.yaml content:
```yaml
config_sources:
  https:

merge_configs:
  - https://example.com/path/file.yaml
```

This will do a HTTP GET request to https://example.com/path/file.yaml, will
download the content and will use the content as the config file.

The equivalent result can be achieved using only the command line:

`./otelcol --config=https://example.com/path/file.yaml`

### Example 6. Multiple Sources

Command line: `./otelcol --config=sources.yaml`

source.yaml content:
```yaml
config_sources:
  files:
    name: local.yaml
  s3:
    bucket: mybucket
    region: us-east-1

merge_configs: [files,s3]
```

This will merge a local.yaml file with the content of an S3 bucket and will use
the content as the config file.

### Example 7. Value Sources

Command line: `./otelcol --config=sources.yaml`

source.yaml content:
```yaml
config_sources:
  files: local.yaml

  # define a value source of "vault" type
  # that can be referenced later.
  vault:
    # endpoint is the Vault server address.
    endpoint: http://localhost:8200
    # path is the Vault path to the secret location.
    path: secret/data/kv
    auth:
      token: foo

merge_configs: [files]
```

local.yaml content:
```yaml
receivers:
  otlp:
    grpc:

exporters:
  signalfx:
    # this will read the access_token value from "vault" config source.
    realm: us0
    access_token: $vault:data.access_token

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [signalfx]
```

### Example 8. Environment Variables

Command line: `./otelcol --config=config.yaml`

config.yaml content:
```yaml
config_sources:
  env:

receivers:
  otlp:
    grpc:

exporters:
  signalfx:
    # Both of the following values are read from env variables.
    realm: $SIGNALFX_REALM
    access_token: $env:SIGNALFX_ACCESSTOKEN

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [signalfx]
```
Comment on lines +68 to +71
type ConfigSourceSettings struct {
ConfigSources []map[string]map[string]interface{} `mapstructure:"config_sources"`
MergeConfigs []string `mapstructure:"merge_configs"`
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we not discussed that this is a separate File?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we did. This is an slightly different approach to avoid adding a different file and new command line option, see PR description.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, it still can be a separate file or it can be the same file, both ways are valid and the user can decide how they want to use it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it happen if the source changes the config sources?

Comment on lines 49 to 62
func (fmp *fileMapProvider) RetrieveValue(_ context.Context, _ func(*ChangeEvent), selector string, paramsConfigMap *config.Map) (RetrievedValue, error) {
if selector == "" {
return nil, errors.New("config file not specified")
}

// TODO: support multiple file paths in selector (use some sort of delimiter).

cp, err := config.NewMapFromFile(selector)
if err != nil {
return nil, fmt.Errorf("error loading config file %q: %w", fmp.fileName, err)
}

return &simpleRetrievedValue{value: cp}, nil
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very confusing. RetrieveValue seems to be a "subset" of the RetrieveMap from the description but here is a super set.

Also you cannot return any "random" instance (interface{}) because we have to unmarshal after.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very confusing. RetrieveValue seems to be a "subset" of the RetrieveMap from the description but here is a super set.

It is a superset. Description needs to be fixed.

Also you cannot return any "random" instance (interface{}) because we have to unmarshal after.

It is in the TODO to fix this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved the description a bit. It still all needs a good pass of clarification, it is still draft.

)

// ConfigSource is a component that provides a configuration map or value.
type ConfigSource interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I saw it. It will be all deleted since it is no longer needed. I deleted some parts of it already in this PR (the Manager).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this feature can be achieved through helm chart

@codecov
Copy link

codecov bot commented Nov 23, 2021

Codecov Report

Merging #4468 (50915c4) into main (f69d234) will decrease coverage by 4.33%.
The diff coverage is 14.72%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4468      +/-   ##
==========================================
- Coverage   90.73%   86.39%   -4.34%     
==========================================
  Files         178      181       +3     
  Lines       10357    10496     +139     
==========================================
- Hits         9397     9068     -329     
- Misses        743     1221     +478     
+ Partials      217      207      -10     
Impacted Files Coverage Δ
component/factories.go 90.62% <ø> (ø)
config/configmapprovider/simple.go 28.57% <0.00%> (-71.43%) ⬇️
config/configsource.go 0.00% <0.00%> (ø)
service/defaultconfigprovider/inlinemapprovider.go 0.00% <0.00%> (ø)
...vice/defaultconfigprovider/mapfromvalueprovider.go 0.00% <0.00%> (ø)
service/defaultconfigprovider/valuesubstitutor.go 0.00% <0.00%> (ø)
...ice/defaultconfigprovider/defaultconfigprovider.go 12.41% <12.41%> (ø)
config/configmapprovider/file.go 62.50% <30.76%> (-37.50%) ⬇️
service/command.go 84.21% <40.00%> (-15.79%) ⬇️
config/configmapprovider/inmemory.go 70.00% <100.00%> (ø)
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f69d234...50915c4. Read the comment docs.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For "Example 4. Multiple Local Config Files" what is the order? Last Modify? Alphabetic by name? The order matters for merging.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For "Command Line Invocation" why not simple "URL" and use same inside the config (when referencing) and then you don't to do any special parsing for CLI vs Reference in file


// ConfigSource is a component that provides a configuration map or value.
type ConfigSource interface {
configmapprovider.Provider
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need a ConfigSource and a Provider if they are the same? This is what I was suggesting #3924. I think we should have only one concept.

@@ -22,18 +22,18 @@ import (
"go.opentelemetry.io/collector/config"
)

type fileMapProvider struct {
type FileMapProvider struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This provides both so probably FileProvider is better name

fileName: fileName,
}
}

func (fmp *fileMapProvider) Retrieve(_ context.Context, _ func(*ChangeEvent)) (Retrieved, error) {
func (fmp *FileMapProvider) Retrieve(_ context.Context, _ func(*ChangeEvent)) (RetrievedMap, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this if the other call can retrieve a map? That was my whole concern that we don't need 2 concepts we just need to find the "middle ground" between the full flexibility ConfigSource have and restricted scope of MapProvider. The reality is that the "ConfigSources" for example do not use params except for the "template" case for example, so we need to answer is that a reasonable use-case so we need to support params?

@tigrannajaryan
Copy link
Member Author

Chatted with @bogdandrutu and decided that this looks more complicated than we would like. Bogdan is going to see if he can come up with a simpler alternate. Keeping this in draft for now.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 1, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 9, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 2, 2022

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Jan 2, 2022
@bogdandrutu bogdandrutu removed the Stale label Jan 3, 2022
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions
Copy link
Contributor

github-actions bot commented Feb 2, 2022

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Feb 2, 2022
@tigrannajaryan tigrannajaryan deleted the add-configsources-provider branch March 3, 2022 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Config sources to support full remote config and individual config value substitution
3 participants