[DOC] data prepper secret extensions #5202

106 changes: 106 additions & 0 deletions _data-prepper/managing-data-prepper/configuring-data-prepper.md
@@ -31,6 +31,7 @@
sinkShutdownTimeout | No | Duration | The time given to sinks to clear any in-flight data and gracefully shut down. Default is 30s.
peer_forwarder | No | Object | Peer forwarder configurations. See [Peer forwarder options](#peer-forwarder-options) for more details.
circuit_breakers | No | [circuit_breakers](#circuit-breakers) | Configures a circuit breaker on incoming data.
extensions | No | Object | The pipeline extension plugin configurations. See [Extension plugins](#extension-plugins) for more details.

### Peer forwarder options

@@ -100,3 +101,108 @@
reset | No | Duration | After tripping the circuit breaker, no new checks are made until after this time has passed. This effectively sets the minimum time for a breaker to remain open to allow for clearing memory. Defaults to `1s`.
check_interval | No | Duration | Specifies the time between checks of the heap size. Defaults to `500ms`.

### Extension plugins

Data Prepper 2.5 and later supports user-configurable extension plugins. Extension plugins provide common configurations that are shared across pipeline plugins, such as [sources, buffers, processors, and sinks]({{site.url}}{{site.baseurl}}/data-prepper/index/#concepts).

### AWS extension plugins

Check failure on line 109 in _data-prepper/managing-data-prepper/configuring-data-prepper.md (GitHub Actions / style-job): [vale] [OpenSearch.HeadingCapitalization] 'AWS extension plugins' is a heading and should be in sentence case.

To use the AWS extension plugin, add the following setting to your `data-prepper-config.yaml` under `extensions`:

Option | Required | Type | Description
:--- |:---|:---| :---
aws | No | Object | The AWS extension plugins configuration.

#### AWS secrets extension plugin

Check failure on line 117 in _data-prepper/managing-data-prepper/configuring-data-prepper.md (GitHub Actions / style-job): [vale] [OpenSearch.HeadingCapitalization] 'AWS secrets extension plugin' is a heading and should be in sentence case.

The AWS secrets extension plugin configures [AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html) secrets so that they can be referenced in pipeline plugin configurations, as shown in the following example:

```yaml
extensions:
  aws:
    secrets:
      <YOUR_SECRET_CONFIG_ID_1>:
        secret_id: <YOUR_SECRET_ID_1>
        region: <YOUR_REGION_1>
        sts_role_arn: <YOUR_STS_ROLE_ARN_1>
        refresh_interval: <YOUR_REFRESH_INTERVAL>
      <YOUR_SECRET_CONFIG_ID_2>:
        ...
```

To use the secrets extension plugin, add the following setting to your `data-prepper-config.yaml` under `extensions` > `aws`:

Option | Required | Type | Description
:--- |:---|:---| :---
secrets | No | Object | The AWS Secrets Manager extension plugin configuration. See [Secrets](#secrets) for more details.

### Secrets

Use the following settings under the `secrets` extension setting.


Option | Required | Type | Description
:--- |:---|:---| :---
secret_id | Yes | String | The AWS secret name or ARN.
region | No | String | The AWS Region of the secret. Defaults to `us-east-1`.
sts_role_arn | No | String | The AWS Security Token Service (AWS STS) role to assume for requests to AWS Secrets Manager. Defaults to `null`, which uses the [standard SDK behavior for credentials](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials.html).
refresh_interval | No | Duration | The interval at which the AWS secrets extension plugin polls for new secret values. Defaults to `PT1H`. See [Automatically refreshing secrets](#automatically-refreshing-secrets) for details.
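
A minimal sketch of these settings, assuming two hypothetical secret config IDs (`minimal-secret-config` and `custom-secret-config`) and placeholder values: the first entry relies on the documented defaults, and the second overrides them. The `PT15M` value assumes that `refresh_interval` accepts ISO 8601 durations in the same notation as the `PT1H` default.

```yaml
extensions:
  aws:
    secrets:
      # Only the required setting is given. Per the table above, region
      # defaults to us-east-1, sts_role_arn to null, and refresh_interval to PT1H.
      minimal-secret-config:
        secret_id: <YOUR_SECRET_ID_1>
      # Every optional setting is overridden (all values are placeholders).
      custom-secret-config:
        secret_id: <YOUR_SECRET_ID_2>
        region: us-west-2
        sts_role_arn: <YOUR_STS_ROLE_ARN>
        refresh_interval: PT15M  # assumed ISO 8601 notation, as in the PT1H default
```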

#### Reference secrets

In `pipelines.yaml`, secret values can be referenced within the pipeline plugins using the following formats:

* Plaintext: `${{aws_secrets:<YOUR_SECRET_CONFIG_ID>}}`

Review comment (Member): We should clearly state in the preceding paragraph that users need to replace `<YOUR_SECRET_CONFIG_ID>`, including the `<>`.

* JSON (key-value pairs): `${{aws_secrets:<YOUR_SECRET_CONFIG_ID>:<YOUR_KEY>}}`


Replace `<YOUR_SECRET_CONFIG_ID>` with the corresponding secret config ID under `/extensions/aws/secrets`. Replace `<YOUR_KEY>` with the desired key in the secret JSON value. The secret value reference string format can be interpreted for the following plugin setting data types:

* String
* Number
* Long
* Short
* Integer
* Double
* Float
* Boolean
* Character
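
As a sketch of how a reference might populate a non-string setting, the following fragment assumes a hypothetical secret config ID, `sink-settings-secret-config`, whose JSON secret value contains a numeric `bulk_size` key. It also assumes that the `opensearch` sink has a numeric `bulk_size` setting and that the resolved value is converted to that type; verify both against the sink documentation before relying on this.

```yaml
# Assumed JSON secret value (illustrative only): {"bulk_size": 10}
sink:
  - opensearch:
      hosts: [ "https://localhost:9200" ]
      # The reference is written as a string. The resolved value is assumed
      # to be converted to the data type of the setting (here, a number).
      bulk_size: "${{aws_secrets:sink-settings-secret-config:bulk_size}}"
```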

The following example section of `data-prepper-config.yaml` names two secret config IDs, `host-secret-config` and `credential-secret-config`:


```yaml
extensions:
  aws:
    secrets:
      host-secret-config:
        secret_id: <YOUR_SECRET_ID_1>
        region: <YOUR_REGION_1>
        sts_role_arn: <YOUR_STS_ROLE_ARN_1>
        refresh_interval: <YOUR_REFRESH_INTERVAL_1>
      credential-secret-config:
        secret_id: <YOUR_SECRET_ID_2>
        region: <YOUR_REGION_2>
        sts_role_arn: <YOUR_STS_ROLE_ARN_2>
        refresh_interval: <YOUR_REFRESH_INTERVAL_2>
```

After the secret config IDs are configured, you can reference them in your `pipelines.yaml` file:

```yaml
sink:
  - opensearch:
      hosts: [ "${{aws_secrets:host-secret-config}}" ]
      username: "${{aws_secrets:credential-secret-config:username}}"
      password: "${{aws_secrets:credential-secret-config:password}}"
      index: "test-migration"
```


#### Automatically refreshing secrets

For each secret configuration, the latest secret value is polled at a regular interval to support refreshing secrets in AWS Secrets Manager. Certain pipeline plugins use the refreshed secret values to refresh their components, such as the connection to and authentication with the backend service.

When multiple secret configurations are defined, jitter of up to `60s` is applied across all configurations during the initial secrets polling.
@@ -64,7 +64,7 @@ When you run the `rename_keys` processor, it parses the message into the followi

## Special considerations

Renaming operations occur in the order that the key-value pair entries are listed in the `pipeline.yaml` file. This means that chaining (where key-value pairs are renamed in sequence) is implicit in the `rename_keys` processor. See the following example `pipline.yaml` file:
Renaming operations occur in the order that the key-value pair entries are listed in the `pipeline.yaml` file. This means that chaining (where key-value pairs are renamed in sequence) is implicit in the `rename_keys` processor. See the following example `pipeline.yaml` file:

```yaml
pipeline:
4 changes: 2 additions & 2 deletions _data-prepper/pipelines/configuration/sources/opensearch.md
@@ -97,8 +97,8 @@ The following table describes options you can configure for the `opensearch` sou
Option | Required | Type | Description
:--- | :--- |:--------| :---
`hosts` | Yes | List | A list of OpenSearch hosts to write to, for example, `["https://localhost:9200", "https://remote-cluster:9200"]`.
`username` | No | String | The username for HTTP basic authentication.
`password` | No | String | The password for HTTP basic authentication.
`username` | No | String | The username for HTTP basic authentication. Since Data Prepper 2.5, this setting can be refreshed at runtime if an [AWS secrets reference]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-data-prepper/#reference-secrets) is applied.
`password` | No | String | The password for HTTP basic authentication. Since Data Prepper 2.5, this setting can be refreshed at runtime if an [AWS secrets reference]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-data-prepper/#reference-secrets) is applied.
`disable_authentication` | No | Boolean | Whether authentication is disabled. Defaults to `false`.
`aws` | No | Object | The AWS configuration. For more information, see [aws](#aws).
`acknowledgments` | No | Boolean | When `true`, enables the `opensearch` source to receive [end-to-end acknowledgments]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/pipelines/#end-to-end-acknowledgments) when events are received by OpenSearch sinks. Default is `false`.
24 changes: 24 additions & 0 deletions _data-prepper/pipelines/pipelines.md
@@ -326,3 +326,27 @@
```


## Pipeline configurations

Since Data Prepper 2.5, shared pipeline components can be configured under the reserved section `pipeline_configurations` when all pipelines are defined in a single pipeline configuration YAML file.
Shared pipeline configurations can include certain components within [extension plugins]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-data-prepper/#extension-plugins), as shown in the following example that refers to secrets configurations for an `opensearch` sink:

```yaml
pipeline_configurations:
  aws:
    secrets:
      credential-secret-config:
        secret_id: <YOUR_SECRET_ID>
        region: <YOUR_REGION>
        sts_role_arn: <YOUR_STS_ROLE_ARN>
simple-sample-pipeline:
  ...
  sink:
    - opensearch:
        hosts: [ "${{aws_secrets:host-secret-config}}" ]
        username: "${{aws_secrets:credential-secret-config:username}}"
        password: "${{aws_secrets:credential-secret-config:password}}"
        index: "test-migration"
```

When the same component is defined in both `pipelines.yaml` and `data-prepper-config.yaml`, the definition in `pipelines.yaml` overrides its counterpart in `data-prepper-config.yaml`. For more information about shared pipeline components, see [AWS secrets extension plugin]({{site.url}}{{site.baseurl}}/data-prepper/managing-data-prepper/configuring-data-prepper/#aws-secrets-extension-plugin).
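
To illustrate the precedence rule stated above, the following sketch assumes that the same secret config ID, `credential-secret-config`, is defined in both files with different placeholder secret IDs. The file contents are hypothetical. In this case, pipelines that reference `credential-secret-config` would be expected to use the definition from `pipelines.yaml`.

```yaml
# data-prepper-config.yaml (illustrative placeholder values)
extensions:
  aws:
    secrets:
      credential-secret-config:
        secret_id: <YOUR_SECRET_ID_A>
        region: <YOUR_REGION>
```

```yaml
# pipelines.yaml (illustrative placeholder values)
pipeline_configurations:
  aws:
    secrets:
      # Same ID as in data-prepper-config.yaml, so this definition
      # is the one that takes effect.
      credential-secret-config:
        secret_id: <YOUR_SECRET_ID_B>
        region: <YOUR_REGION>
```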