Config sources to support full remote config and individual config value substitution #4190
Comments
@alolita Can you please review this proposal and tell if it works for your use cases? |
@Aditya-Gollapudi I believe you wanted this capability for AWS. Please review and provide feedback so that we can move forward. If there is anyone else at AWS who needs remote configuration please ask them to review. |
@tigrannajaryan unfortunately I am no longer at AWS - I know this was a concern for @PettitWesley and @hossain-rayhan |
Hi @tigrannajaryan, thanks for putting everything together. Example 6 is a good use case which we are looking to support. @PettitWesley take a look here. |
@hossain-rayhan Thanks. What do you think about the fact that there will be a new command line option? I am looking for validation of the entire proposal from the user experience perspective. If we are happy with how this looks from the end user's point of view, then we can go ahead and see how we implement this in the code. |
@dmitryax what do you think about the proposal? |
Thanks for putting this together @tigrannajaryan. We are definitely looking for pattern 4, 5, 7. @vishalj82 please look into this proposal. Specifically the pattern 4 and 7. |
I like what is being suggested here w/r/t stacking multiple configs, incl. ability to pull them remotely. precedence rules for merging multiple configs would be required tho, right? even more so, I am interested in understanding if we have also considered allowing configuration push. in my ideal world, my observability backend could push configuration to a collector. in other words, say I have a to me, this would be preferable to having to specify endpoints and credentials multiple times. this is not to say any of the methods proposed here don't make sense in their own right. but I am looking for one more thing, and it would be to the most important one, and default case for "my" users :) naively, it would seem this would require some sort of collector-internal API to control configuration, available to exporters. some of it seems anticipated in #2374 but not in this proposal here. |
@tigrannajaryan One of the core reqs from AWS was to have a command line flag. Can there be a way to specify multiple config files on the command line without a separate config sources file?
Or maybe you could specify the flag multiple times. |
@kumoroku thanks for reviewing.
Correct.
Yes. See the proposal for management protocol that can push configuration: https://github.com/signalfx/opamp-spec
Yes, completely doable. One caveat: I believe the management protocol needs to be decoupled from the telemetry delivery protocol because different vendors use different telemetry protocols but we want everyone to use the same management protocol. This does not prevent the management protocol and telemetry delivery protocol from sharing the authentication credentials, and it also does not prevent implementing the management server on the same machine that accepts telemetry. So, essentially you could have a backend where http://example.com/v1/otlp/metrics accepts metrics in OTLP format and http://example.com/v1/opamp accepts management protocol connections.
It is possible to avoid the need to specify credentials multiple times. The management server can push other credentials to the Collector, so the Collector only needs to know how to connect to the management server. OpAMP spec describes one possible way to do it: https://github.com/signalfx/opamp-spec#connection-settings-management
That is possible, but requires more changes to Collector internals than may be desirable. Another, perhaps simpler way is for the Collector config to be defined using replaceable keywords which the remote configuration source knows are to be replaced by credentials received from the management server. |
@PettitWesley Yes, it should be doable. We will just have to implement the config source such that it allows passing multiple file names as the CLI argument. We will need to agree on a syntax and what file name separator character to use. |
@bogdandrutu comments? |
@PettitWesley your request is pretty simple right now (but you need to create the flag yourself), let's assume you have already a
|
One of the things worth considering is that, rather than rebuilding a consolidated service each time a config source is added or modified, we should rebuild only the portion that was affected, i.e. the affected pipelines. My proposal in this case is to keep both The benefit would be that whenever a new config-source file is added, we create its pipelines; when it is removed, we delete only those pipelines. A pipeline from a config source can only refer to All pipelines need to be refreshed only if the core config changes, as there might be references to the components defined in it. Thoughts? |
We want to do this in the future. We can detect the config deltas and only rebuild affected pipelines.
This does not seem to be necessary for the above goal. We can keep the current approach and still only rebuild affected pipelines. |
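A minimal sketch of the "detect the config deltas and only rebuild affected pipelines" idea, written in Python for brevity rather than in the Collector's Go. It assumes the effective configuration is available as plain nested dicts; the function name `affected_pipelines` and the sample endpoints are hypothetical and are not part of the Collector.

```python
import copy


def affected_pipelines(old_cfg, new_cfg):
    """Return the names of pipelines that differ between two effective configs."""
    old_pipes = old_cfg.get("service", {}).get("pipelines", {})
    new_pipes = new_cfg.get("service", {}).get("pipelines", {})
    affected = set()
    for name in old_pipes.keys() | new_pipes.keys():
        old_def, new_def = old_pipes.get(name), new_pipes.get(name)
        if old_def != new_def:
            # Pipeline was added, removed, or redefined.
            affected.add(name)
            continue
        # Same pipeline definition: rebuild only if a component it references changed.
        for kind in ("receivers", "processors", "exporters"):
            for component in (new_def or {}).get(kind, []):
                before = old_cfg.get(kind, {}).get(component)
                after = new_cfg.get(kind, {}).get(component)
                if before != after:
                    affected.add(name)
    return affected


# Hypothetical configs: one exporter setting changes and one pipeline is added.
old = {
    "receivers": {"otlp": {"grpc": None}},
    "exporters": {"otlp": {"endpoint": "a:4317"}, "logging": {}},
    "service": {"pipelines": {
        "traces": {"receivers": ["otlp"], "exporters": ["otlp"]},
        "metrics": {"receivers": ["otlp"], "exporters": ["logging"]},
    }},
}
new = copy.deepcopy(old)
new["exporters"]["otlp"]["endpoint"] = "b:4317"
new["service"]["pipelines"]["logs"] = {"receivers": ["otlp"], "exporters": ["logging"]}

print(sorted(affected_pipelines(old, new)))  # ['logs', 'traces']
```

Here the `metrics` pipeline is left alone because neither its definition nor the components it references changed.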
Heads up: I am also looking at a slightly different config file structure, while maintaining the functionality described in the above examples. We can do this:

```yaml
# Define all config sources (including value sources)
config_sources:
  files/local: local.yaml
  # define a value source of "vault" type
  # that can be referenced later.
  vault:
    # endpoint is the Vault server address.
    endpoint: http://localhost:8200
    # path is the Vault path to the secret location.
    path: secret/data/kv
    auth:
      token: foo

merge_configs:
  - files/local
```

In local.yaml:

```yaml
exporters:
  signalfx:
    # this will read the access_token value from "vault" config source.
    realm: us0
    access_token: $vault:data.access_token
```

Conceptually it is no different from the original proposal at the top of this thread. However, this is more uniform with other configuration settings, where we first define the component with its settings (e.g. a receiver) and then have a list of components used somewhere (e.g. the receivers list in the pipeline definition). |
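To illustrate how a reference such as `$vault:data.access_token` might be resolved, here is a rough Python sketch. It is not the Collector's implementation: the helper names `lookup` and `substitute`, the regular expression, and the in-memory `sources` dict (standing in for data that a real Vault source would fetch from `secret/data/kv`) are all assumptions made for the example.

```python
import re

# Matches "$<source>:<selector>", e.g. "$vault:data.access_token".
_REF = re.compile(r"^\$(?P<source>[A-Za-z_][\w/]*):(?P<selector>.+)$")


def lookup(data, selector):
    """Walk a nested dict following a dot-separated selector."""
    for key in selector.split("."):
        data = data[key]
    return data


def substitute(node, sources):
    """Return a copy of `node` with "$source:selector" strings replaced."""
    if isinstance(node, dict):
        return {key: substitute(value, sources) for key, value in node.items()}
    if isinstance(node, list):
        return [substitute(value, sources) for value in node]
    if isinstance(node, str):
        match = _REF.match(node)
        if match and match.group("source") in sources:
            return lookup(sources[match.group("source")], match.group("selector"))
    return node


# Hypothetical secret payload; a real value source would retrieve this remotely.
sources = {"vault": {"data": {"access_token": "s3cr3t"}}}
local = {"exporters": {"signalfx": {"realm": "us0",
                                    "access_token": "$vault:data.access_token"}}}
resolved = substitute(local, sources)
```

Values that do not match the `$source:selector` shape, such as `realm: us0`, pass through unchanged.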
Resolves: open-telemetry#4190

This is a draft PR (not production quality code) that shows roughly what the end goal is. Don't merge this PR, it is merely to show what the final destination looks like. If we agree that this is a good end goal I will split this into a series of smaller PRs and submit them.

I am primarily looking for feedback on the new concept and the APIs. There is no need to review all the implementation details, they are not production quality yet.

## Configuration File Format Changes

Introduced 2 new optional sections in the configuration file format: config_sources and merge_configs.

config_sources defines one or more configuration sources. Configuration sources are a new type of component that is pluggable like all other component types by implementing its Factory. Configuration sources must implement a configmapprovider.Provider interface.

The merge_configs section contains a sequence of config source references. The configs from these sources are retrieved and merged in the specified order to form the top-level configuration of the Collector. The locally specified top-level configuration is merged last (thus has the highest precedence).

This approach maintains backward compatibility. Any previously valid configuration file remains valid. New configuration files may specify only the config_sources/merge_configs sections, move the local configuration to a separate file and use that file as one of the configuration sources.

This approach also does not require us to introduce a new --config_sources command line option.

## Configuration Provider Interfaces

Introduced the concept of config MapProvider and config ValueProvider, both implementing a base Provider interface.

MapProvider is equivalent to the old Provider interface and allows retrieving configuration maps.

ValueProvider can return configuration values, which include primitive values, arrays and maps. ValueProvider also supports returning different values based on the selector and params passed, which allows parameterizing the returned value at the place of usage.

## Command Line Invocation

ValueProvider-type config sources can be invoked directly from the command line. The --config option now accepts either a file path (the existing functionality) or a value in the form of <sourcename>:<selector>?<params> (the new functionality).

The new functionality is triggered by the presence of a colon in the --config option value. This makes it impossible to pass file names that use a colon character, but we assume that this is an acceptable limitation since colon is often used as a file path delimiter character in Unix systems anyway.

The new functionality allows, for example, the following command lines:

- `./otelcol --config=config.yaml`
- `./otelcol --config=files:config.yaml`
- `./otelcol --config=http://example.com/path/to/config.yaml`
- `./otelcol --config=http://example.com/path/to/config.yaml?foo=bar&bar=baz`

## Example Config Sources

As an example I implemented 3 config sources to demonstrate how things work:

- env is a ValueProvider that allows getting environment variables by name.
- files is both a MapProvider and a ValueProvider, which allows it to be used both as a top-level config source and as a command line config source (see example above).
- http is a ValueProvider which has factories that can be registered for "http" and "https" config sources and which can perform an HTTP GET operation to retrieve the config.

## Other Changes

Deleted the Expand map provider. It is no longer needed since environment expansion is now handled in the valueSubstitutor, which needs to distinguish between env variables and config value source substitution, both starting with the $ character.

Deleted config/internal/configsource.Manager. The new functionality covers what Manager was intended to do.

## TODO

- Refine RetrieveValue() to return a more specific type instead of interface{}
- See if it is worth combining MapProvider and ValueProvider into just Provider (doesn't seem so, but need to check).
- See if it is possible/desirable to allow config source settings to be a single string such that `name` is not necessary for the `files` config source and instead the file name can be specified directly as the setting of the `files` source.
- See if better names are possible for the new top-level config file keys. Perhaps root_configs is a better name instead of merge_configs.
- Some more cleanup and refactoring may be necessary to tidy things up once we are fully settled on the functionality. Probably rename configmapprovider package to configprovider, rename Retrieve to RetrieveMap, etc.
- Add more comments to explain the design and how things work.
- Add tests for everything.

## Example Usage From End User Perspective

Below are some example usages.

### Example 1. Single Local Config File.

This is the old (current) way. It is still fully supported and will continue to be supported. It is not obsolete and is the preferable way when there is a single local config file.

Command line: `./otelcol --config=local.yaml`

local.yaml content:

```yaml
receivers:
  otlp:
    grpc:

exporters:
  otlp:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
```

### Example 2. Config Sources, Single Local Config File.

Command line: `./otelcol --config=sources.yaml`

sources.yaml content:

```yaml
config_sources:
  files:
    name: local.yaml

merge_configs: [files]
```

local.yaml content:

```yaml
receivers:
  otlp:
    grpc:

exporters:
  otlp:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
```

This example results in a config that is completely equivalent to the config in Example 1.

### Example 3. Config Sources from Command Line

This uses a shorthand for specifying a single config source on the command line.

Command line: `./otelcol --config=files:local.yaml`

local.yaml content:

```yaml
receivers:
  otlp:
    grpc:

exporters:
  otlp:

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp]
```

### Example 4. Multiple Local Config Files

Command line: `./otelcol --config=sources.yaml`

sources.yaml content:

```yaml
config_sources:
  # Merge all files that match the specified glob into one file and use that as a config
  files:
    name: /var/lib/otelcol/config/**/*.yaml

merge_configs: [files]
```

### Example 5. From HTTP Server

Command line: `./otelcol --config=sources.yaml`

sources.yaml content:

```yaml
config_sources:
  https:

merge_configs:
  - https://example.com/path/file.yaml
```

This will do an HTTP GET request to https://example.com/path/file.yaml, download the content and use it as the config file.

The equivalent result can be achieved using only the command line: `./otelcol --config=https://example.com/path/file.yaml`

### Example 6. Multiple Sources

Command line: `./otelcol --config=sources.yaml`

sources.yaml content:

```yaml
config_sources:
  files:
    name: local.yaml
  s3:
    bucket: mybucket
    region: us-east-1

merge_configs: [files,s3]
```

This will merge a local.yaml file with the content of an S3 bucket and will use the result as the config file.

### Example 7. Value Sources

Command line: `./otelcol --config=sources.yaml`

sources.yaml content:

```yaml
config_sources:
  files: local.yaml
  # define a value source of "vault" type
  # that can be referenced later.
  vault:
    # endpoint is the Vault server address.
    endpoint: http://localhost:8200
    # path is the Vault path to the secret location.
    path: secret/data/kv
    auth:
      token: foo

merge_configs: [files]
```

local.yaml content:

```yaml
receivers:
  otlp:
    grpc:

exporters:
  signalfx:
    # this will read the access_token value from "vault" config source.
    realm: us0
    access_token: $vault:data.access_token

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [signalfx]
```

### Example 8. Environment Variables

Command line: `./otelcol --config=config.yaml`

config.yaml content:

```yaml
config_sources:
  env:

receivers:
  otlp:
    grpc:

exporters:
  signalfx:
    # Both of the following values are read from env variables.
    realm: $SIGNALFX_REALM
    access_token: $env:SIGNALFX_ACCESSTOKEN

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [signalfx]
```
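The `<sourcename>:<selector>?<params>` rule from the Command Line Invocation section can be sketched roughly as follows. This is an illustrative Python sketch, not the PR's Go code: `ConfigRef` and `parse_config_arg` are hypothetical names, and how a real `http` source would reassemble the URL from the selector is not specified in the description, so the sketch only splits the string.

```python
from typing import NamedTuple, Optional
from urllib.parse import parse_qs


class ConfigRef(NamedTuple):
    source: Optional[str]  # None means "plain file path" (the existing behavior)
    selector: str
    params: dict


def parse_config_arg(value):
    """Split a --config value on the first ':' and the first '?' after it."""
    source, colon, rest = value.partition(":")
    if not colon:
        # No colon present: treat the whole value as a local file path.
        return ConfigRef(None, value, {})
    selector, _, query = rest.partition("?")
    return ConfigRef(source, selector, parse_qs(query))


plain = parse_config_arg("config.yaml")
files = parse_config_arg("files:config.yaml")
remote = parse_config_arg("http://example.com/path/to/config.yaml?foo=bar&bar=baz")
```

With this splitting, `remote.source` is `"http"` and `remote.selector` is `"//example.com/path/to/config.yaml"`, which is consistent with the description of factories being registered under the "http" and "https" names.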
Chatted with @bogdandrutu and decided that the PR that implements this looks more complicated than we would like. Bogdan is going to see if he can come up with a simpler alternate. I am un-assigning this from myself for now. |
@bogdandrutu will you be proposing a simpler alternative? |
Happy New Year everyone! Just checking in on this... specifically interested in pattern 4 (Multiple Local Config Files) |
@ttomsu are you interested to support "blob/regexp" or just multiple local files in general? |
@bogdandrutu - Apologies for the delay. I'm interested in both - using the blob/regexp to specify a directory such as |
Hey @bogdandrutu - I finally got some time to update and test drive the changes in the codebase. It looks like you're using the Config 1:
Config 2:
According to this issue:
So my second config clobbers the list of receivers, which is what I observe in practice. |
I spoke with Bogdan and Anthony Mirabella on Slack about this issue and they recommended using separate pipelines across different files - the best practice being to not modify pipelines outside of a single config file. Additional pipelines can be identified using the same |
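The clobbering reported above is what one would expect from a recursive map merge that replaces lists wholesale. The Python sketch below assumes that merge behavior (it is inferred from the observation in this thread, not taken from the Collector's code); the `merge` helper, the `hostmetrics` receiver entry, and the `metrics/host` pipeline name are illustrative.

```python
def merge(base, override):
    """Recursively merge two config maps; `override` wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale, not concatenated.
            merged[key] = value
    return merged


config1 = {"service": {"pipelines": {
    "metrics": {"receivers": ["otlp"], "exporters": ["otlp"]}}}}

# Second file touches the same pipeline: its receivers list replaces the first one.
config2 = {"service": {"pipelines": {
    "metrics": {"receivers": ["hostmetrics"]}}}}
clobbered = merge(config1, config2)

# Second file defines its own pipeline instead: both pipelines survive the merge.
config2_separate = {"service": {"pipelines": {
    "metrics/host": {"receivers": ["hostmetrics"], "exporters": ["otlp"]}}}}
side_by_side = merge(config1, config2_separate)
```

In `clobbered` the `otlp` receiver from the first file is lost, while `side_by_side` keeps both pipelines intact, which matches the recommendation to keep each pipeline within a single config file.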
@bogdandrutu @Aneurysm9 and I discussed how the Collector configuration can be extended to support the remote configuration needs.
One possible approach is to extend the concept of config sources that was proposed earlier in #3687 to allow the config sources to supply the full configuration and value sources to provide individual values to substitute in the configuration (see examples below).
The following proposal shows what it will look like from the end user perspective. Please review and comment. At this stage we would like to validate that this approach covers the use cases that the community is interested in. Once we agree on the end-user perspective, we will discuss what the internal API should look like.
Proposal
The config sources / value sources concepts enhance Collector configuration capabilities while keeping full backwards compatibility.
The following are examples of how the end user will specify the configuration, assuming the new concept of config sources is implemented.
Example 1. Single Local Config File.
This is the old (current) way. It is still fully supported and will continue to be supported. It is not obsolete and is the preferable way when there is a single local config file.
Command line:
./otelcol --config=local.yaml
local.yaml content:
Example 2. Config Sources, Single Local Config File.
This uses a new --config_sources command line flag.
Command line:
./otelcol --config_sources=sources.yaml
sources.yaml content:
local.yaml content:
This example results in a config that is completely equivalent to the config in Example 1.
Example 3. Config Sources from Command Line
This uses a shorthand for specifying a single config source on the command line.
Command line:
./otelcol --config_sources=files:local.yaml
local.yaml content:
Example 4. Multiple Local Config Files
Command line:
./otelcol --config_sources=sources.yaml
sources.yaml content:
/var/lib/otelcol/config/local.yaml content:
/var/lib/otelcol/config/local2.yaml content:
This results in the following effective configuration:
Example 5. From HTTP Server
Command line:
./otelcol --config_sources=sources.yaml
source.yaml content:
This will do a HTTP GET request to https://example.com/path/file.yaml, will download the content and will use the content as the config file.
The equivalent result can be achieved using only the command line:
./otelcol --config_sources=http:http://example.com/path/file.yaml
or
./otelcol --config_sources=http://example.com/path/file.yaml
(Note that http config_source automatically prepends "http" to the url if not already present).
Example 6. Multiple Sources
Command line:
./otelcol --config_sources=sources.yaml
source.yaml content:
This will merge a local.yaml file with the content of an S3 bucket and will use the content as the config file.
Example 7. Value Sources
Value sources are almost exactly what the experimental ConfigSources were proposed to be, i.e. they allow individual values in the config file to be substituted.
Command line:
./otelcol --config_sources=sources.yaml
source.yaml content:
local.yaml content:
Alternative Command Line Approach
Note that this proposal suggests that we add a new --config_sources flag and keep the existing --config for the single-file case. An alternate approach is to only have the --config flag and auto-detect the provided file format and either enable the current single-file config logic or enable the newly proposed "config sources" approach.
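One way the auto-detection could work is to look for a top-level `config_sources` key in the parsed file. The Python sketch below assumes exactly that; the key check and the names `uses_config_sources` and `select_loader` are hypothetical, and the issue does not say how detection would actually be done.

```python
def uses_config_sources(cfg):
    """Guess the file format from the presence of a top-level config_sources key."""
    return isinstance(cfg, dict) and "config_sources" in cfg


def select_loader(cfg):
    """Pick which loading logic to run for an already-parsed config file."""
    return "config-sources" if uses_config_sources(cfg) else "single-file"


legacy = {"receivers": {"otlp": None}, "service": {"pipelines": {}}}
sourced = {"config_sources": {"files": {"name": "local.yaml"}}}
```

A file without the key would keep going through the current single-file logic, so existing configurations would not need any changes.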