Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a metric for counting the number of suppressed spans. #2135

Merged
merged 6 commits into from
Feb 17, 2021

Conversation

jkwatson
Copy link
Contributor

This will help in diagnosing mysterious "missing span" issues.

Copy link
Contributor

@breedx-splk breedx-splk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, a few small things for polish if you wanna.

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I just remembered, we have been removing CallDepth tracking (which prevents an instrumentation from creating multiple nested spans) in places where it is redundant with the nested CLIENT span check (#907), but that means those instrumentations will trigger this "suppressed instrumentation" log even though that's not really the case.

I'm not really sure how to detect a "real" suppressed instrumentation 🤔

@jkwatson
Copy link
Contributor Author

Oh, I just remembered, we have been removing CallDepth tracking (which prevents an instrumentation from creating multiple nested spans) in places where it is redundant with the nested CLIENT span check (#907), but that means those instrumentations will trigger this "suppressed instrumentation" log even though that's not really the case.

I'm not really sure how to detect a "real" suppressed instrumentation 🤔

Can you point me at an example of this? Maybe we need to update the code to only use this code path when we're actually doing "real" span suppression (whatever that means).

@trask
Copy link
Member

trask commented Jan 29, 2021

Check out #1811 for a bunch of places where I removed the CallDepth tracking since we can now rely on nested client span suppression.

@jkwatson
Copy link
Contributor Author

Oh, I just remembered, we have been removing CallDepth tracking (which prevents an instrumentation from creating multiple nested spans) in places where it is redundant with the nested CLIENT span check (#907), but that means those instrumentations will trigger this "suppressed instrumentation" log even though that's not really the case.

I'm not really sure how to detect a "real" suppressed instrumentation 🤔

So... is this an objection to this PR? Or just a musing that the word "suppression" might not be 100% accurate? If we're suppressing spans, we're suppressing them, whether or not it's due to too many nested spans, or other reasons, no?

@trask
Copy link
Member

trask commented Jan 29, 2021

So... is this an objection to this PR? Or just a musing that the word "suppression" might not be 100% accurate? If we're suppressing spans, we're suppressing them, whether or not it's due to too many nested spans, or other reasons, no?

I'm not sure how it's going to be useful since it's going to log lots of false positives due to the removal of CallDepth tracking.

Interestingly, in order to implement #465 (comment), we will probably have to bring back CallDepth tracking (or something similar), so we know whether we should suppress due to "CallDepth", or whether we should add an event to current span.

@jkwatson
Copy link
Contributor Author

So... is this an objection to this PR? Or just a musing that the word "suppression" might not be 100% accurate? If we're suppressing spans, we're suppressing them, whether or not it's due to too many nested spans, or other reasons, no?

I'm not sure how it's going to be useful since it's going to log lots of false positives due to the removal of CallDepth tracking.

Interestingly, in order to implement #465 (comment), we will probably have to bring back CallDepth tracking (or something similar), so we know whether we should suppress due to "CallDepth", or whether we should add an event to current span.

Are those actually false positives though? We are suppressing the spans in those cases, aren't we?

@jkwatson jkwatson force-pushed the log_span_suppression branch from 7b43c42 to 46811c4 Compare February 5, 2021 20:47
@jkwatson jkwatson changed the title Add debug logging when a span is suppressed. Add a metric for counting the number of suppressed spans. Feb 5, 2021
@jkwatson
Copy link
Contributor Author

jkwatson commented Feb 5, 2021

I changed this to use a LongCounter to keep track of the suppressed spans. There is one optimization to be made if we want to go this route.

@jkwatson
Copy link
Contributor Author

@trask @iNikem @anuraaga thoughts on the current state, using metrics for this?

@jkwatson jkwatson force-pushed the log_span_suppression branch from 46811c4 to b010cab Compare February 11, 2021 23:13
Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on the current state, using metrics for this?

I think it's a cool idea, let's try it out 👍

@iNikem
Copy link
Contributor

iNikem commented Feb 12, 2021

Metrics are cool, but... :) They are alpha and will remain such for a long time. Do we want for instrumentation-api do depend on alpha artifact? We cannot then release this API as stable., can we?

@anuraaga
Copy link
Contributor

@iNikem Hmm this is a good point. Even if it's an implementation detail for us, it could still cause dependency hell for users regardless.

But my guess is metrics will stabilize before instrumentation - we still have the question of naming things tracer vs instrumenter, only because of metrics.

So I'm ok with it and if we actually release instrumentation first at the time we can flag guard or something similar.

@iNikem
Copy link
Contributor

iNikem commented Feb 15, 2021

But my guess is metrics will stabilize before instrumentation

You are an optimist about metrics and pessimist about instrumentations :D

@jkwatson
Copy link
Contributor Author

Just thinking out loud, but another option would be to create some sort of "Supportability" interface that could be implemented by by whatever the user would prefer (logging, or metrics, or micrometer metrics, or a no-op). So, we could decouple the implementation from the need for providing some sort of observability for the internals of the instrumentation APIs/agent.

@iNikem iNikem merged commit 836da8d into open-telemetry:main Feb 17, 2021
@jkwatson jkwatson deleted the log_span_suppression branch February 17, 2021 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants