Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[enrichment/trace] Handle exception for span events #99

Merged
merged 6 commits into from
Oct 1, 2024

Conversation

lahsivjar
Copy link
Contributor

@lahsivjar lahsivjar commented Sep 23, 2024

Related to: #86

@lahsivjar lahsivjar requested a review from a team as a code owner September 23, 2024 18:25
enrichments/trace/internal/elastic/span.go Outdated Show resolved Hide resolved
enrichments/trace/internal/elastic/span.go Show resolved Hide resolved
Comment on lines 437 to 438
// TODO(lahsivjar): Is error/log context handling be done in es-exporter?
// Ref: https://github.com/elastic/apm-data/blob/59d33b8113d629f43dc82f9b3931ceda63a19a7a/input/otlp/traces.go#L1243-L1260
Copy link
Contributor Author

@lahsivjar lahsivjar Sep 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[For reviewers] APM-data sets context for span-events where it sets fields like span.id, transaction.id, parent.id based on the span (ref), however, I am not sure if enrichments is the correct place to do this. Since the es-exporter could decide how to encode span-events into ES docs (separate or in the same doc as parent span) maybe it would be better to have this handled in the expoter? I am confused here, WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that all properties that are already mapped as top level fields in the logs data stream should not be handled by enrichments but in the ES exporter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC: @carsonip This would require some changes in the spanevent handling in es-exporter. I think just copying some attributes from the parent span should be good enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense

Copy link
Member

@carsonip carsonip Sep 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

span event span.id, transaction.id are already done in elasticsearchexporter https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/elasticsearchexporter/model.go#L645-L647

in that case, no change is needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need transaction.type for span events?

I don't know

If so, should they be in attributes.* or at top level?

If needed, it should be under attributes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that all properties that are already mapped as top level fields in the logs data stream should not be handled by enrichments but in the ES exporter.

If needed, it should be under attributes.

As they are under attributes, it is better done in enrichment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As they are under attributes, it is better done in enrichment

I don't think it is a good idea to do this as enrichments since the fields we are talking about here are derived from the span. The ES exporter decides HOW to extract and encode the span event (in a separate doc or in the same doc). Whlie currently we are encoding them as a separate doc making this dependent on enrichment doesn't feel correct to me.

(#99 (comment) discussion thread is also related to the same topic)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me and @carsonip talked about this IRL and decided to do it as enrichments for now due to the differences in the handling of error and log contexts in apm-data (ref). Once the specs are a bit more clearer around this we can revisit the enrichment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

222386e

The above commit adds transaction.{type, sampled} enrichment for span event.

Comment on lines 445 to 452
case semconv.AttributeExceptionEscaped:
s.exceptionEscaped = v.Bool()
case semconv.AttributeExceptionType:
s.exceptionType = v.Str()
case semconv.AttributeExceptionMessage:
s.exceptionMessage = v.Str()
case semconv.AttributeExceptionStacktrace:
s.exceptionStacktrace = v.Str()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregkalapos These attributes are mostly 1<>1 mappings however, I didn't see them in the otel-data plugin so added it here. Do you think we should do this here or in the otel-data plugin?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can create alias fields for these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, aliases are how we have been handling such fields. Will wait for @gregkalapos to confirm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'd be also for aliases as well.

One exception (no pun intended) may be the handled flag: from what I remember from the UI testing, the UI access that field, so it always needs to be present (We could question why, but from what I remember, that's how it is today), but exception.escaped is only recommended. So we always need to populate the handled flag.

That's the only edge case which comes to my mind, I'd expect the other ones to work with aliases.

Copy link
Member

@felixbarny felixbarny Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Could you add these aliases, Greg? One challenge will be how to add the attribute aliases just for logs, without these getting overridden with the standard set of attribute aliases. Maybe we'll need to add them to the alias definition that applies to all signals.
Also, I'm not sure if we need an alias for the exception stacktrace, since we likely need to fetch that from _source anyway, because IINM, we don't have a stored field for it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we don't even need the aliases, don't we? 🤔 This should all work out via passthrough.

Following could be sent via attributes and would be visible top level:

  • exception.type
  • exception.message
    ( above 2 are in SemConv - I don't think we even need to do anything with those here in the enrichment lib)
  • exception.handled: that is not part of OTel, se we need to set it here.

I opened a draft: elastic/elasticsearch#113636 I commented out all the aliases, the test is still green which queries exception.type top level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gregkalapos I think the alias should be for error.exception.{type, message} -> exception.{type, message}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think we might need to put alias for stacktrace too: error.stack_trace -> exception.stacktrace

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see.

Ok, we can do error.exception.{type, message} -> exception.{type, message}; I'll update the PR.

Same for stack trace.

@@ -47,4 +47,15 @@ const (
AttributeSpanDestinationServiceResource = "span.destination.service.resource"
AttributeSpanDurationUs = "span.duration.us"
AttributeSpanRepresentativeCount = "span.representative_count"

// span event attributes
AttributeParentID = "parent.id"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should solve that by adding a parent.id -> span_id alias in the logs data stream.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid the "parent" in parent.id is not the span, but the parent of span, i.e. parent.id != span_id.

I looked at the apm-data code, it appears that parent.id is really the span event's span's parent id. Not the span's id: https://github.com/elastic/apm-data/blob/cd0bb416903eebb5dc22e3eb6b6e4e9c75ad51fe/input/otlp/traces.go#L196

Parent id is passed onto the span event via cloning the span: https://github.com/elastic/apm-data/blob/cd0bb416903eebb5dc22e3eb6b6e4e9c75ad51fe/input/otlp/traces.go#L1072

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid the "parent" in parent.id is not the span, but the parent of span, i.e. parent.id != span_id.

I don't think this is true for errors. In case of errors, we specifically override the parent ID of the SpanEvent to be the SpanID (or `TransactionID for elastic transaction): https://github.com/elastic/apm-data/blob/bd5e34116889942c53b186deab0f76beaf182de0/input/otlp/traces.go#L1255-L1259

Copy link
Member

@carsonip carsonip Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh wow, I missed that. Then is it a bug for logs from span events in apm-server?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this fix in apm-data look right? elastic/apm-data#374
If so, yes, I think we can have parent.id alias -> span.id.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are discussing on slack the alternative to make the UI use span.id directly: https://elastic.slack.com/archives/C95SB62AG/p1727693736192519?thread_ts=1727462154.365409&cid=C95SB62AG

enrichments/trace/internal/elastic/attributes.go Outdated Show resolved Hide resolved
Comment on lines 437 to 438
// TODO(lahsivjar): Is error/log context handling be done in es-exporter?
// Ref: https://github.com/elastic/apm-data/blob/59d33b8113d629f43dc82f9b3931ceda63a19a7a/input/otlp/traces.go#L1243-L1260
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that all properties that are already mapped as top level fields in the logs data stream should not be handled by enrichments but in the ES exporter.

Comment on lines 445 to 452
case semconv.AttributeExceptionEscaped:
s.exceptionEscaped = v.Bool()
case semconv.AttributeExceptionType:
s.exceptionType = v.Str()
case semconv.AttributeExceptionMessage:
s.exceptionMessage = v.Str()
case semconv.AttributeExceptionStacktrace:
s.exceptionStacktrace = v.Str()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can create alias fields for these?

enrichments/trace/internal/elastic/span.go Outdated Show resolved Hide resolved
enrichments/trace/internal/elastic/span.go Outdated Show resolved Hide resolved
enrichments/trace/internal/elastic/span.go Outdated Show resolved Hide resolved
@lahsivjar lahsivjar merged commit 10e3f6c into elastic:main Oct 1, 2024
3 checks passed
@lahsivjar lahsivjar deleted the spanevent branch October 1, 2024 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants