-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[enrichment/trace] Handle exception for span events #99
Changes from 1 commit
26a9fe2
3883a14
4361255
80c485f
82d649e
222386e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,7 +18,11 @@ | |
package elastic | ||
|
||
import ( | ||
"crypto/md5" | ||
"crypto/rand" | ||
"encoding/hex" | ||
"fmt" | ||
"io" | ||
"math" | ||
"net" | ||
"net/http" | ||
|
@@ -152,7 +156,8 @@ | |
// Ensure all dependent attributes are handled. | ||
s.normalizeAttributes() | ||
|
||
if isElasticTransaction(span) { | ||
isElasticTxn := isElasticTransaction(span) | ||
if isElasticTxn { | ||
s.enrichTransaction(span, cfg.Transaction) | ||
} else { | ||
s.enrichSpan(span, cfg.Span) | ||
|
@@ -416,16 +421,39 @@ | |
} | ||
} | ||
|
||
type spanEventEnrichmentContext struct { | ||
exception bool | ||
exception bool | ||
exceptionEscaped bool | ||
|
||
exceptionType string | ||
exceptionMessage string | ||
exceptionStacktrace string | ||
} | ||
|
||
func (s *spanEventEnrichmentContext) enrich( | ||
se ptrace.SpanEvent, | ||
cfg config.SpanEventConfig, | ||
) { | ||
// TODO(lahsivjar): Is error/log context handling be done in es-exporter? | ||
// Ref: https://github.com/elastic/apm-data/blob/59d33b8113d629f43dc82f9b3931ceda63a19a7a/input/otlp/traces.go#L1243-L1260 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [For reviewers] APM-data sets context for span-events where it sets fields like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I think that all properties that are already mapped as top level fields in the logs data stream should not be handled by enrichments but in the ES exporter. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. CC: @carsonip This would require some changes in the spanevent handling in es-exporter. I think just copying some attributes from the parent span should be good enough. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. makes sense There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. span event span.id, transaction.id are already done in elasticsearchexporter https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/elasticsearchexporter/model.go#L645-L647 in that case, no change is needed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't know
If needed, it should be under attributes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
As they are under attributes, it is better done in enrichment There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't think it is a good idea to do this as enrichments since the fields we are talking about here are derived from the span. The ES exporter decides HOW to extract and encode the span event (in a separate doc or in the same doc). Whlie currently we are encoding them as a separate doc making this dependent on enrichment doesn't feel correct to me. (#99 (comment) discussion thread is also related to the same topic) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The above commit adds |
||
|
||
// Extract top level span event information. | ||
s.exception = se.Name() == "exception" | ||
if s.exception { | ||
se.Attributes().Range(func(k string, v pcommon.Value) bool { | ||
switch k { | ||
case semconv.AttributeExceptionEscaped: | ||
s.exceptionEscaped = v.Bool() | ||
case semconv.AttributeExceptionType: | ||
s.exceptionType = v.Str() | ||
case semconv.AttributeExceptionMessage: | ||
s.exceptionMessage = v.Str() | ||
case semconv.AttributeExceptionStacktrace: | ||
s.exceptionStacktrace = v.Str() | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gregkalapos These attributes are mostly 1<>1 mappings however, I didn't see them in the otel-data plugin so added it here. Do you think we should do this here or in the otel-data plugin? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe we can create alias fields for these? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, aliases are how we have been handling such fields. Will wait for @gregkalapos to confirm There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I'd be also for aliases as well. One exception (no pun intended) may be the That's the only edge case which comes to my mind, I'd expect the other ones to work with aliases. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good. Could you add these aliases, Greg? One challenge will be how to add the attribute aliases just for logs, without these getting overridden with the standard set of attribute aliases. Maybe we'll need to add them to the alias definition that applies to all signals. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But we don't even need the aliases, don't we? 🤔 This should all work out via Following could be sent via attributes and would be visible top level:
I opened a draft: elastic/elasticsearch#113636 I commented out all the aliases, the test is still green which queries There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gregkalapos I think the alias should be for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, I think we might need to put alias for stacktrace too: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, I see. Ok, we can do Same for stack trace. |
||
} | ||
return true | ||
}) | ||
} | ||
|
||
// Enrich span event attributes. | ||
if cfg.TimestampUs.Enabled { | ||
|
@@ -434,6 +462,43 @@ | |
if cfg.ProcessorEvent.Enabled && s.exception { | ||
se.Attributes().PutStr(AttributeProcessorEvent, "error") | ||
} | ||
if s.exceptionType == "" && s.exceptionMessage == "" { | ||
// Span event does not represent an exception | ||
se.Attributes().PutStr(AttributeEventKind, "event") | ||
lahsivjar marked this conversation as resolved.
Show resolved
Hide resolved
|
||
se.Attributes().PutStr(AttributeMessage, se.Name()) | ||
lahsivjar marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return | ||
} | ||
|
||
// Span event represents exception | ||
if cfg.ErrorID.Enabled { | ||
if id, err := newUniqueID(); err == nil { | ||
se.Attributes().PutStr(AttributeErrorID, id) | ||
} | ||
} | ||
if cfg.ErrorExceptionType.Enabled && s.exceptionType != "" { | ||
se.Attributes().PutStr(AttributeErrorExceptionType, s.exceptionType) | ||
} | ||
if cfg.ErrorExceptionMessage.Enabled { | ||
se.Attributes().PutStr(AttributeErrorExceptionMessage, s.exceptionMessage) | ||
} | ||
if cfg.ErrorExceptionHandled.Enabled { | ||
se.Attributes().PutBool(AttributeErrorExceptionHandled, !s.exceptionEscaped) | ||
} | ||
if cfg.ErrorStacktrace.Enabled { | ||
// TODO (lahsivjar): Where to parse stacktraces? | ||
lahsivjar marked this conversation as resolved.
Show resolved
Hide resolved
|
||
se.Attributes().PutStr(AttributeErrorStacktrace, s.exceptionStacktrace) | ||
} | ||
if cfg.ErrorGroupingKey.Enabled { | ||
// See https://github.com/elastic/apm-data/issues/299 | ||
hash := md5.New() | ||
// ignoring errors in hashing | ||
if s.exceptionType != "" { | ||
io.WriteString(hash, s.exceptionType) | ||
} else if s.exceptionMessage != "" { | ||
io.WriteString(hash, s.exceptionMessage) | ||
} | ||
se.Attributes().PutStr(AttributeErrorGroupingKey, hex.EncodeToString(hash.Sum(nil))) | ||
lahsivjar marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
} | ||
|
||
// getRepresentativeCount returns the number of spans represented by an | ||
|
@@ -540,3 +605,16 @@ | |
"HTTP 4xx", | ||
"HTTP 5xx", | ||
} | ||
|
||
func newUniqueID() (string, error) { | ||
var u [16]byte | ||
if _, err := io.ReadFull(rand.Reader, u[:]); err != nil { | ||
return "", err | ||
} | ||
|
||
// convert to string | ||
buf := make([]byte, 32) | ||
hex.Encode(buf, u[:]) | ||
|
||
return string(buf), nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should solve that by adding a
parent.id
->span_id
alias in the logs data stream.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid the "parent" in
parent.id
is not the span, but the parent of span, i.e. parent.id != span_id.I looked at the apm-data code, it appears that parent.id is really the span event's span's parent id. Not the span's id: https://github.com/elastic/apm-data/blob/cd0bb416903eebb5dc22e3eb6b6e4e9c75ad51fe/input/otlp/traces.go#L196
Parent id is passed onto the span event via cloning the span: https://github.com/elastic/apm-data/blob/cd0bb416903eebb5dc22e3eb6b6e4e9c75ad51fe/input/otlp/traces.go#L1072
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is true for errors. In case of errors, we specifically override the parent ID of the
SpanEvent
to be theSpanID
(or `TransactionID for elastic transaction): https://github.com/elastic/apm-data/blob/bd5e34116889942c53b186deab0f76beaf182de0/input/otlp/traces.go#L1255-L1259There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh wow, I missed that. Then is it a bug for logs from span events in apm-server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this fix in apm-data look right? elastic/apm-data#374
If so, yes, I think we can have
parent.id
alias ->span.id
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are discussing on slack the alternative to make the UI use
span.id
directly: https://elastic.slack.com/archives/C95SB62AG/p1727693736192519?thread_ts=1727462154.365409&cid=C95SB62AG