Rehydrates OTLP data from Azure Blob Storage that was stored using the [Azure Blob Exporter](../../exporter/azureblobexporter/README.md).
This is not a traditional receiver that continually produces data; instead, it rehydrates all blobs found within a specified time range. Once all of the blobs in that time range have been rehydrated, the receiver stops producing data. After the receiver has detected three consecutive empty polls, it stops polling for new blobs to prevent unnecessary API calls.
- Introduced: v1.37.0
- Metrics
- Logs
- Traces
1. The receiver polls blob storage for pages of blobs in the specified container.
2. The receiver will parse each blob's path to determine if it matches a path created by the Azure Blob Exporter.
3. If the blob path is from the exporter, the receiver will parse the timestamp represented by the path.
4. If the timestamp is within the configured range, the receiver will download the blob and parse its contents into OTLP data.
   a. The receiver will process both uncompressed JSON blobs and blobs compressed with gzip.
Note: There is currently no way to request only a specific time range from the API, so blobs outside of the time range still need to be retrieved from the API before they can be filtered by the `starting_time` and `ending_time` configuration.
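
The steps above can be sketched in Python as follows. This is only an illustration, not the receiver's actual implementation: the regular expression is inferred from the example paths shown later in this README, the optional `.gz` suffix and the inclusive range bounds are assumptions, and the names `parse_blob_path`, `in_range`, and `decode_blob` are hypothetical. Paths are assumed to carry no `root_folder` prefix.

```python
import gzip
import json
import re
from datetime import datetime, timezone

# Layout inferred from the example paths in this README.
# The optional ".gz" suffix for compressed blobs is an assumption.
BLOB_PATH_RE = re.compile(
    r"^year=(\d{4})/month=(\d{2})/day=(\d{2})/hour=(\d{2})/minute=(\d{2})/"
    r"(metrics|logs|traces)_[^/]+\.json(\.gz)?$"
)


def parse_blob_path(path):
    """Return (timestamp, telemetry_type), or None if the path does not match."""
    match = BLOB_PATH_RE.match(path)
    if match is None:
        return None
    year, month, day, hour, minute = (int(g) for g in match.groups()[:5])
    try:
        timestamp = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    except ValueError:
        # Digits matched but do not form a real date, e.g. month=13.
        return None
    return timestamp, match.group(6)


def in_range(timestamp, start, end):
    """Time filter. Treating both bounds as inclusive is an assumption."""
    return start <= timestamp <= end


def decode_blob(data):
    """Parse blob bytes as JSON, transparently handling gzip compression."""
    # Gzip streams start with the magic bytes 0x1f 0x8b.
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    return json.loads(data)
```

Because the filter runs on the client, every listed blob name passes through `parse_blob_path`, but only names whose timestamp passes `in_range` would be downloaded and handed to `decode_blob`.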
Field | Type | Default | Required | Description |
---|---|---|---|---|
connection_string | string | | true | The connection string to the Azure Blob Storage account. Can be found under the Access keys section of your storage account. |
container | string | | true | The name of the container to rehydrate from. |
root_folder | string | | false | The root folder that prefixes the blob path. Should match the root_folder value of the Azure Blob Exporter. |
starting_time | string | | true | The UTC start time that represents the start of the time range to rehydrate from. Must be in the form YYYY-MM-DDTHH:MM. |
ending_time | string | | true | The UTC end time that represents the end of the time range to rehydrate from. Must be in the form YYYY-MM-DDTHH:MM. |
delete_on_read | bool | false | false | If true, the blob will be deleted after being rehydrated. |
storage | string | | false | The component ID of a storage extension. The storage extension prevents duplication of data after a collector restart by remembering which blobs were previously rehydrated. |
poll_interval* | string | | false | The interval at which the Azure API is scanned for blobs. |
poll_timeout* | string | | false | The timeout for the Azure API to scan for blobs. |
batch_size | int | 30 | false | The number of blobs to download and process in the pipeline simultaneously. This parameter directly impacts performance by controlling the concurrent blob download limit. |
page_size | int | 1000 | false | The maximum number of blob entries to request in a single API call. |
Deprecated*: `poll_interval` and `poll_timeout` are no longer supported; `batch_size` and `page_size` should be used instead.
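
As a rough illustration of the `YYYY-MM-DDTHH:MM` form required by `starting_time` and `ending_time`, the following Python sketch parses both values as UTC timestamps. The helper names are hypothetical, and rejecting a range whose end is not after its start is an assumption rather than documented receiver behavior.

```python
from datetime import datetime, timezone

# strptime layout corresponding to YYYY-MM-DDTHH:MM.
TIME_LAYOUT = "%Y-%m-%dT%H:%M"


def parse_config_time(value):
    """Parse a configured time string and mark it as UTC."""
    return datetime.strptime(value, TIME_LAYOUT).replace(tzinfo=timezone.utc)


def validate_time_range(starting_time, ending_time):
    """Return (start, end) as UTC datetimes, raising ValueError on bad input."""
    start = parse_config_time(starting_time)
    end = parse_config_time(ending_time)
    if end <= start:
        # Assumption: an empty or reversed range is treated as a config error.
        raise ValueError("ending_time must be after starting_time")
    return start, end
```

A value with seconds, such as `2023-10-01T13:00:00`, or with a space instead of `T` does not match this layout and raises `ValueError` in the sketch.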
This configuration specifies a `connection_string`, `container`, `starting_time`, and `ending_time`.
This will rehydrate all blobs in the container `my-container` whose paths indicate that they were created between 1:00 PM and 2:30 PM UTC on October 1, 2023.
Such a path could look like the following:
year=2023/month=10/day=01/hour=13/minute=30/metrics_12345.json
year=2023/month=10/day=01/hour=13/minute=30/logs_12345.json
year=2023/month=10/day=01/hour=13/minute=30/traces_12345.json
azureblobrehydration:
  connection_string: "DefaultEndpointsProtocol=https;AccountName=storage_account_name;AccountKey=storage_account_key;EndpointSuffix=core.windows.net"
  container: "my-container"
  starting_time: 2023-10-01T13:00
  ending_time: 2023-10-01T14:30
  batch_size: 100
  page_size: 1000
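
To give a feel for how `batch_size` acts as a concurrency limit, here is a minimal Python sketch that processes one page of blob names with a bounded worker pool. It is not the receiver's implementation: `process_page` and the `download` callable are hypothetical, and using a thread pool is simply one way to cap the number of downloads in flight.

```python
from concurrent.futures import ThreadPoolExecutor


def process_page(blob_names, batch_size, download):
    """Run download() over one page with at most batch_size calls in flight.

    blob_names stands in for one page of listing results (up to page_size
    entries); download is a hypothetical callable that fetches one blob.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        # map() preserves input order even though calls run concurrently.
        return list(pool.map(download, blob_names))
```

With `batch_size: 100` and `page_size: 1000` as configured above, a full page would be worked through with no more than 100 downloads running at once in this sketch.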
This configuration uses a storage extension to track rehydration progress across agent restarts. The `storage` field is set to the component ID of the storage extension.
extensions:
  file_storage:
    directory: $OIQ_OTEL_COLLECTOR_HOME/storage
receivers:
  azureblobrehydration:
    connection_string: "DefaultEndpointsProtocol=https;AccountName=storage_account_name;AccountKey=storage_account_key;EndpointSuffix=core.windows.net"
    container: "my-container"
    starting_time: 2023-10-01T13:00
    ending_time: 2023-10-01T14:30
    storage: "file_storage"
    batch_size: 100
    page_size: 1000
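
The idea of remembering which blobs were already rehydrated can be illustrated with a small Python sketch. The `Checkpoint` class and its JSON file format are hypothetical and chosen only for illustration; this README does not describe how the storage extension actually persists its state.

```python
import json
import os


class Checkpoint:
    """Hypothetical on-disk record of blob names that were already rehydrated."""

    def __init__(self, path):
        self.path = path
        self.seen = set()
        # Reload earlier progress, which is what makes a restart safe.
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.seen = set(json.load(f))

    def is_done(self, blob_name):
        return blob_name in self.seen

    def mark_done(self, blob_name):
        self.seen.add(blob_name)
        # Write to a temporary file and rename it so that a crash
        # mid-write cannot leave a truncated checkpoint behind.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(sorted(self.seen), f)
        os.replace(tmp_path, self.path)
```

After a restart, a new `Checkpoint` built from the same file reports previously processed blobs as done, so they can be skipped instead of being emitted twice.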
This configuration specifies an additional field, `root_folder`, to match the `root_folder` value of the Azure Blob Exporter.
The exporter's `root_folder` value prefixes each blob path with the root folder, so the rehydration receiver must be configured with the same value.
Such a path could look like the following:
root/year=2023/month=10/day=01/hour=13/minute=30/metrics_12345.json
root/year=2023/month=10/day=01/hour=13/minute=30/logs_12345.json
root/year=2023/month=10/day=01/hour=13/minute=30/traces_12345.json
azureblobrehydration:
  connection_string: "DefaultEndpointsProtocol=https;AccountName=storage_account_name;AccountKey=storage_account_key;EndpointSuffix=core.windows.net"
  container: "my-container"
  starting_time: 2023-10-01T13:00
  ending_time: 2023-10-01T14:30
  root_folder: "root"
  batch_size: 100
  page_size: 1000
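
One way to picture how the root folder is accounted for is the Python sketch below, which removes the prefix before the rest of the path is interpreted. The helper `strip_root_folder` is hypothetical, and skipping blobs that sit outside the root folder is an assumption about the behavior, not something this README states.

```python
def strip_root_folder(blob_path, root_folder):
    """Return blob_path without the root folder prefix.

    Returns None when a root folder is configured but the blob is not
    under it. With no root folder configured the path is returned as-is.
    """
    if not root_folder:
        return blob_path
    # Require a full path segment so that "root" does not match "rootless/".
    prefix = root_folder.strip("/") + "/"
    if not blob_path.startswith(prefix):
        return None
    return blob_path[len(prefix):]
```

For the example paths above, calling the helper with `root_folder` set to `root` leaves a remainder that starts at `year=2023/...`.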
This configuration enables the `delete_on_read` functionality, which deletes a blob from Azure after it has been successfully rehydrated into OTLP data and sent on to the next component in the pipeline.
azureblobrehydration:
  connection_string: "DefaultEndpointsProtocol=https;AccountName=storage_account_name;AccountKey=storage_account_key;EndpointSuffix=core.windows.net"
  container: "my-container"
  starting_time: 2023-10-01T13:00
  ending_time: 2023-10-01T14:30
  delete_on_read: true
  batch_size: 100
  page_size: 1000
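
The ordering described above, where deletion happens only after the data has been handed to the next component, can be sketched in Python as follows. The function `rehydrate_blob`, the dictionary standing in for the container, and the `consume` callable are all hypothetical; the sketch does not call the Azure SDK.

```python
def rehydrate_blob(name, store, consume, delete_on_read=False):
    """Pass one blob to the next component, then optionally delete it.

    store is a dict standing in for the blob container, and consume is a
    hypothetical callable that raises an exception if delivery fails.
    """
    data = store[name]
    # If consume() raises, the function exits here and the blob is kept.
    consume(data)
    if delete_on_read:
        del store[name]
```

Deleting only after `consume` returns means that a failed delivery leaves the blob in place, so it is still available for a later rehydration attempt.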
The following configuration fields are deprecated and will be removed in a future release.
Field | Deprecated |
---|---|
poll_interval | true |
poll_timeout | true |