From 96175482d62025cc82e53ee636e1b12d692f98cd Mon Sep 17 00:00:00 2001
From: Qi Chen
Date: Fri, 3 Nov 2023 10:08:00 -0700
Subject: [PATCH] [DOC] data prepper secret extensions (#5202)

* ADD: extension docs in data-prepper-config

Signed-off-by: George Chen

* MAINT: updating secrets extension doc

Signed-off-by: George Chen

* MAINT: fix links

Signed-off-by: George Chen

* MAINT: fix one more dead link

Signed-off-by: George Chen

* MAINT: renaming

Signed-off-by: George Chen

* Update configuring-data-prepper.md

* Update pipelines.md

---------

Signed-off-by: George Chen
Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
---
 .../configuring-data-prepper.md | 106 ++++++++++++++++++
 .../configuration/processors/rename-keys.md | 2 +-
 .../configuration/sources/opensearch.md | 4 +-
 _data-prepper/pipelines/pipelines.md | 24 ++++
 4 files changed, 133 insertions(+), 3 deletions(-)

diff --git a/_data-prepper/managing-data-prepper/configuring-data-prepper.md b/_data-prepper/managing-data-prepper/configuring-data-prepper.md
index b27ba8e49d..0c91b37e2c 100644
--- a/_data-prepper/managing-data-prepper/configuring-data-prepper.md
+++ b/_data-prepper/managing-data-prepper/configuring-data-prepper.md
@@ -31,6 +31,7 @@ processorShutdownTimeout | No | Duration | The time given to processors to clear
 sinkShutdownTimeout | No | Duration | The time given to sinks to clear any in-flight data and gracefully shut down. Default is 30s.
 peer_forwarder | No | Object | Peer forwarder configurations. See [Peer forwarder options](#peer-forwarder-options) for more details.
 circuit_breakers | No | [circuit_breakers](#circuit-breakers) | Configures a circuit breaker on incoming data.
+extensions | No | Object | The pipeline extension plugin configurations. See [Extension plugins](#extension-plugins) for more details.
 
 ### Peer forwarder options
 
@@ -100,3 +101,108 @@ usage | Yes | Bytes | Specifies the JVM heap usage at which to trip a circuit br
 reset | No | Duration | After tripping the circuit breaker, no new checks are made until after this time has passed. This effectively sets the minimum time for a breaker to remain open to allow for clearing memory. Defaults to `1s`.
 check_interval | No | Duration | Specifies the time between checks of the heap size. Defaults to `500ms`.
 
+### Extension plugins
+
+Since Data Prepper 2.5, Data Prepper provides support for user configurable extension plugins. Extension plugins are shared common
+configurations shared across pipeline plugins, such as [sources, buffers, processors, and sinks]({{site.url}}{{site.baseurl}}/data-prepper/index/#concepts).
+
+### AWS extension plugins
+
+To use the AWS extension plugin, add the following setting to your `data-prepper-config.yaml` under `aws`.
+
+Option | Required | Type | Description
+:--- |:---|:---| :---
+aws | No | Object | The AWS extension plugins configuration.
+
+#### AWS secrets extension plugin
+
+The AWS secrets extension plugin configures the [AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html) to be
+referenced in pipeline plugin configurations, as shown in the following example:
+
+```json
+extensions:
+  aws:
+    secrets:
+      :
+        secret_id:
+        region:
+        sts_role_arn:
+        refresh_interval:
+      :
+        ...
+```
+
+To use the secrets extension plugin, add the following setting to your `pipeline.yaml` under `extensions` > `aws`.
+
+Option | Required | Type | Description
+:--- |:---|:---| :---
+secrets | No | Object | The AWS Secrets Manager extension plugin configuration. See [Secrets](#secrets) for more details.
+
+### Secrets
+
+Use the following settings under the `secrets` extension setting.
+
+
+Option | Required | Type | Description
+:--- |:---|:---| :---
+secret_id | Yes | String | The AWS secret name or ARN. |
+region | No | String | The AWS region of the secret. Defaults to `us-east-1`.
+sts_role_arn | No | String | The AWS Security Token Service (AWS STS) role to assume for requests to the AWS Secrets Manager. Defaults to `null`, which will use the [standard SDK behavior for credentials](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials.html).
+refresh_interval | No | Duration | The refreshment interval for AWS secrets extension plugin to poll new secret values. Defaults to `PT1H`. See [Automatically refreshing secrets](#automatically-refreshing-secrets) for details.
+
+#### Reference secrets
+
+In `pipelines.yaml`, secret values can be referenced within the pipeline plugins using the following formats:
+
+* plaintext: `${{aws_secrets:}}`.
+* JSON (key-value pairs): `${{aws_secrets::}}`
+
+
+Replace `` with the corresponding secret config ID under `/extensions/aws/secrets`. Replace `` with the desired key in the secret JSON value. The secret value reference string format can be interpreted for the following plugin setting data types:
+
+* String
+* Number
+* Long
+* Short
+* Integer
+* Double
+* Float
+* Boolean
+* Character
+
+The following example section of `data-prepper-config.yaml` names two secret config IDs, `host-secret-config` and `credential-secret-config`:
+
+
+```json
+extensions:
+  aws:
+    secrets:
+      host-secret-config:
+        secret_id:
+        region:
+        sts_role_arn:
+        refresh_interval:
+      credential-secret-config:
+        secret_id:
+        region:
+        sts_role_arn:
+        refresh_interval:
+```
+
+After `` is configured, you can reference the IDs in your `pipelines.yaml`:
+
+```
+sink:
+  - opensearch:
+      hosts: [ "${{aws_secrets:host-secret-config}}" ]
+      username: "${{aws_secrets:credential-secret-config:username}}"
+      password: "${{aws_secrets:credential-secret-config:password}}"
+      index: "test-migration"
+```
+
+
+#### Automatically refreshing secrets
+
+For each individual secret configuration, the latest secret value is polled on a regular interval to support refreshing secrets in AWS Secrets Manager. The refreshed secret values are utilized by certain pipeline plugins to refresh their components, such as connection and authentication to the backend service.
+
+For multiple secret configurations, jitter within `60s` will be applied across all configurations during the initial secrets polling.
diff --git a/_data-prepper/pipelines/configuration/processors/rename-keys.md b/_data-prepper/pipelines/configuration/processors/rename-keys.md
index d2c892d745..f57b4e509f 100644
--- a/_data-prepper/pipelines/configuration/processors/rename-keys.md
+++ b/_data-prepper/pipelines/configuration/processors/rename-keys.md
@@ -64,7 +64,7 @@ When you run the `rename_keys` processor, it parses the message into the followi
 
 ## Special considerations
 
-Renaming operations occur in the order that the key-value pair entries are listed in the `pipeline.yaml` file. This means that chaining (where key-value pairs are renamed in sequence) is implicit in the `rename_keys` processor. See the following example `pipline.yaml` file:
+Renaming operations occur in the order that the key-value pair entries are listed in the `pipeline.yaml` file. This means that chaining (where key-value pairs are renamed in sequence) is implicit in the `rename_keys` processor. See the following example `pipeline.yaml` file:
 
 ```yaml
 pipeline:
diff --git a/_data-prepper/pipelines/configuration/sources/opensearch.md b/_data-prepper/pipelines/configuration/sources/opensearch.md
index d5397a38b0..baddcc4998 100644
--- a/_data-prepper/pipelines/configuration/sources/opensearch.md
+++ b/_data-prepper/pipelines/configuration/sources/opensearch.md
@@ -97,8 +97,8 @@ The following table describes options you can configure for the `opensearch` sou
 Option | Required | Type | Description
 :--- | :--- |:--------| :---
 `hosts` | Yes | List | A list of OpenSearch hosts to write to, for example, `["https://localhost:9200", "https://remote-cluster:9200"]`.
-`username` | No | String | The username for HTTP basic authentication.
-`password` | No | String | The password for HTTP basic authentication.
+`username` | No | String | The username for HTTP basic authentication. Since Data Prepper 2.5, this setting can be refreshed at runtime if [AWS secrets reference]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-data-prepper/#reference-secrets) is applied.
+`password` | No | String | The password for HTTP basic authentication. Since Data Prepper 2.5, this setting can be refreshed at runtime if [AWS secrets reference]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-data-prepper/#reference-secrets) is applied.
 `disable_authentication` | No | Boolean | Whether authentication is disabled. Defaults to `false`.
 `aws` | No | Object | The AWS configuration. For more information, see [aws](#aws).
 `acknowledgments` | No | Boolean | When `true`, enables the `opensearch` source to receive [end-to-end acknowledgments]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/#end-to-end-acknowledgments) when events are received by OpenSearch sinks. Default is `false`.
diff --git a/_data-prepper/pipelines/pipelines.md b/_data-prepper/pipelines/pipelines.md
index 50063079e7..87c8ce5755 100644
--- a/_data-prepper/pipelines/pipelines.md
+++ b/_data-prepper/pipelines/pipelines.md
@@ -326,3 +326,27 @@ peer_forwarder:
 ```
 
 
+## Pipeline Configurations
+
+Since Data Prepper 2.5, shared pipeline components can be configured under the reserved section `pipeline_configurations` when all pipelines are defined in a single pipeline configuration YAML file.
+Shared pipeline configurations can include certain components within [Extension Plugins]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-data-prepper/#extension-plugins), as shown in the following example that refers to secrets configurations for an `opensearch` sink:
+
+```json
+pipeline_configurations:
+  aws:
+    secrets:
+      credential-secret-config:
+        secret_id:
+        region:
+        sts_role_arn:
+simple-sample-pipeline:
+  ...
+  sink:
+    - opensearch:
+        hosts: [ "${{aws_secrets:host-secret-config}}" ]
+        username: "${{aws_secrets:credential-secret-config:username}}"
+        password: "${{aws_secrets:credential-secret-config:password}}"
+        index: "test-migration"
+```
+
+When the same component is defined in both `pipelines.yaml` and `data-prepper-config.yaml`, the definition in the `pipelines.yaml` will overwrite the counterpart in `data-prepper-config.yaml`. For more information on shared pipeline components, see [AWS secrets extension plugin]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-data-prepper/#aws-secrets-extension-plugin) for details.
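
Note on the reference syntax documented in this patch: the two `aws_secrets` forms (a secret config ID alone for a plaintext secret, or a config ID followed by a key for a JSON secret) can be illustrated with a small Python sketch. This is not Data Prepper's implementation. The regular expression, the `resolve` helper, and the in-memory `fake_secrets` dictionary are all hypothetical stand-ins, and the sketch assumes a reference occupies the entire setting value.

```python
import json
import re

# Hypothetical pattern for the two reference forms in the patch:
#   ${{aws_secrets:<config-id>}}        -> the whole secret as plaintext
#   ${{aws_secrets:<config-id>:<key>}}  -> one key of a JSON secret
REFERENCE = re.compile(r"^\$\{\{aws_secrets:([^:}]+)(?::([^:}]+))?\}\}$")


def resolve(value, secrets):
    """Return the secret a reference points to, or the value unchanged."""
    match = REFERENCE.match(value)
    if match is None:
        return value  # not a secret reference, keep the literal setting
    config_id, key = match.groups()
    raw = secrets[config_id]  # stand-in for an AWS Secrets Manager lookup
    if key is None:
        return raw  # plaintext form
    return json.loads(raw)[key]  # JSON key-value form


# In-memory stand-in for secret values, keyed by secret config ID.
fake_secrets = {
    "host-secret-config": "https://localhost:9200",
    "credential-secret-config": json.dumps(
        {"username": "admin", "password": "example"}
    ),
}

host = resolve("${{aws_secrets:host-secret-config}}", fake_secrets)
user = resolve("${{aws_secrets:credential-secret-config:username}}", fake_secrets)
index = resolve("test-migration", fake_secrets)
```

Settings that are not references, such as the `index` value, pass through unchanged, which mirrors how the sink example mixes literal and secret-backed values.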