Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/elasticsearch] route based on data stream attributes #33755

Conversation

andrzej-stencel
Copy link
Member

Description:

This adds a feature to the Elasticsearch exporter to route telemetry based on data stream attributes, as described in An introduction to the Elastic data stream naming scheme.

Link to tracking Issue:

Testing:

Added unit tests.

Documentation:

Added documentation in exporter's README.

Copy link
Contributor

@felixbarny felixbarny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should ensure that all docs have a data_stream.<type|dataset|namespace> property at the top-level, that's consistent with the _index ("${data_stream.type}-${data_stream.dataset:generic}-${data_stream.namespace:default}" when data_stream mode is enabled.

exporter/elasticsearchexporter/model.go Show resolved Hide resolved
Comment on lines +238 to +246
logs := newLogsWithAttributeAndResourceMap(
map[string]string{
dataStreamDataset: "record.dataset",
},
map[string]string{
dataStreamDataset: "resource.dataset",
dataStreamNamespace: "resource.namespace",
},
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test is only covering record and resource, not scope

Data point attributes take precedence over scope attributes, which take precedence over resource attributes.
- `prefix_suffix` - uses resource, scope or data point attributes `elasticsearch.index.prefix` and `elasticsearch.index.suffix`
to dynamically construct index name in the form `${elasticsearch.index.prefix}${metrics_index}${elasticsearch.index.suffix}`.
Data point attributes take precedence over scope attributes, which take precedence over resource attributes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

metrics prefix_suffix precedence is different to logs and traces. Is this deliberate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you've created #33725 but the question remains: is it worth it to have this exception / inconsistency for metrics prefix_suffix mode? Otherwise, does it make sense to break all of them at once when a decision is made in #33725 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I created #33725 because I think the current setting for prefix_suffix should be changed.

When introducing metrics, my thinking was to introduce the prefix_suffix mode for metrics using the correct precedence, to prevent a breaking change at least in metrics behavior.

I do see value in having those consistent between signals though. If you think we should rather keep prefix_suffix mode precedence consistent with other signals, I can change that.

@andrzej-stencel
Copy link
Member Author

We should ensure that all docs have a data_stream.<type|dataset|namespace> property at the top-level, that's consistent with the _index ("${data_stream.type}-${data_stream.dataset:generic}-${data_stream.namespace:default}" when data_stream mode is enabled.

@felixbarny I have added filling in missing data stream attributes in cb88ebc, can you take a look?

@@ -213,6 +213,43 @@ func TestExporterLogs(t *testing.T) {
rec.WaitItems(1)
})

t.Run("publish with dynamic index data_stream mode", func(t *testing.T) {
Copy link
Contributor

@carsonip carsonip Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some parts missing from the tests

  • scope attributes (mentioned in another comment)
  • fields in doc
    • default is used when no data_stream attributes apply
    • data_stream attributes are present
  • doc action
    • default is used when no data_stream attributes apply

Comment on lines +25 to +26
dataType := ensureAttribute(dataStreamType, defaultDataStreamTypeLogs, record.Attributes(), scope.Attributes(), resource.Attributes())
return fmt.Sprintf("%s-%s-%s", dataType, dataSet, namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking this PR but the type should be fixed based on the record type. Similar story for metrics and spans.

Suggested change
dataType := ensureAttribute(dataStreamType, defaultDataStreamTypeLogs, record.Attributes(), scope.Attributes(), resource.Attributes())
return fmt.Sprintf("%s-%s-%s", dataType, dataSet, namespace)
recordAttributes.PutStr(dataStreamType, "logs")
return fmt.Sprintf("logs-%s-%s", dataSet, namespace)

@andrzej-stencel
Copy link
Member Author

Superseded by #33794

@andrzej-stencel andrzej-stencel deleted the route-on-data-stream-attributes branch November 5, 2024 10:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants