-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SDK tracing: clarify multi-processors scenarios #338
Conversation
Could you add a (pseudo-)code example (as comment or in the OTEP) how that could look like? E.g. I'm not sure I understand
Does this mean if I have an |
@Oberon00 I've added an example in the spec. Another pipeline sends all spans (unfiltered) to ZPages or stdout implemented as processors or exporters. You may want to send all spans to different exporters and potentially process (filter and batch) them differently depending on exporter capabilities, storage cost and other factors. Tagging cannot be done independently as instances of spans are shared among processors/exporters. Please let me know if it clarifies your question. |
Well, the class AnnotateSpanWithThreadNameSpanProcessor: ISpanProcessor {
public override void OnStartSpan(Span span) {
span.setAttribute("thread_name", Thread.CurrentThread.Name);
}
} It would not seem useful to add an exporter behind that span processor, as (1) it modifies the Span in-line and (2) it does not even do anything upon ending the Span. To me it seems that chainability in the base SpanProcessor interface would require that base interface to be Split into SpanStartProcessor and SpanEndProcessor interfaces and chainability would only be incorporated in the latter. But actually I think that the base interface can stay as it is now, and we can still have e.g.: class FilteringSpanProcessor: ISpanProcessor {
public FilteringSpanProcessor(Predicate<ReadableSpan> filter, ISpanProcessor next) { /* ... */ }
}
class MultiSpanProcessor: ISpanProcessor {
public MultiSpanProcessor(params ISpanProcessor[] next) { /* ... */ }
}
// ...
tracerFactory.AddSpanProcessor(new FilteringSpanProcessor(s => s.HasAttribute("foo"), new MultiSpanProcessor(
new BatchExportProcessor(new ZipkinExporter(...)),
new FooSpanProcessor(...)
)); |
According to spec we have this:
What I propose (no change in SpanProcessor interface) in addition to what spec tells is to encouradge chaining on processors:
It might be useful to split them further into Start and End, but seems still useful without splitting. Configuration then looks like
this may be done in a much more friendly way like we've done in .NET open-telemetry/opentelemetry-dotnet#286, samples |
So basically what @Oberon00 you propose and what I propose is a very similar thing. You may fork processing pipeline in multiple places (multi-processor is a fork). Your example forks before filtering (like you've done it). I fork first by adding two processors (to filter spans differently for different destinations). All I'm asking for to spec to clearly allow and encourage possibility of such behavior. |
I agree with @lmolkova 's example here #338 (comment) What I don't understand is why the spec needs to be explicit about it and even allow for multiple Processors. Both chaining and forked processing can be achieved through composition. There can be provided standard processors (or rather factory methods) implementing chained or forked/parallel pipelines. |
@yurishkuro I think it is beneficial to document a concept of chained processors. So when you have a need to implement one of these features - this type of extensibility will be used and not any other. |
specification/sdk-tracing.md
Outdated
| | | | | | | | ||
| | | | BatchProcessor | | SpanExporter | | ||
| | Span.start() leProcessor +---> (JaegerExporter) | | ||
| SDK | Span.end() | | | | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the formatting broken here? I don't know how to read this.
@SergeyKanzhelev https://en.wikipedia.org/wiki/Decorator_pattern. What other explanation is needed? |
I think there is a different aspect to it - if there are some standard features that each SDK must provide, that's worth describing (assuming those qualify as MUST). |
specification/sdk-tracing.md
Outdated
or batching independently on each pipeline. | ||
|
||
SDK MUST allow implementing helpers as composable components that use the same | ||
chainable `SpanProcessor` interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we really want to have this functionality in a language library? This largely duplicates functionality and is very similar to what OpenTelemetry Collector implements. Batching, tagging, filtering and all sorts of processing already exists in the Collector.
I am worried that we are complicating client libraries with features that may be best implemented elsewhere (and are already implemented). In my opinion client libraries must stay as simple as possible and delegate all complicated functionality to a nearby running Agent or Collector.
This will result in considerable saving of time and effort we are spending on OpenTelemetry project (we have a bunch of language libraries but only one Collector) and ensure all that complicated behavior is implemented and once and consistently. If this is done in each individual library there is a chance they will behave slightly differently (which may be frustrating in mixed-language environments) and we are overburdening language library authors to write all this code and risk introducing bugs.
In my opinion language library should do only minimal processing that is important from performance perspective and just perform batching.
Even multiple pipelines seem to be an overkill to me.
I would like to know how others feel about this. I may be wrong here, so thoughts are welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certain tagging and filtering will be better off implemented in-proc. And to enable this we need to describe what extensibility points must be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SergeyKanzhelev yes, I understand that certain functionality is better done in-proc, particularly because otherwise it is significantly less efficient from overall CPU consumption perspective. I am arguing that we should however clearly articulate what is that certain functionality and instead of broadly recommending to implement rich set of processing capabilities in language libraries do exactly the opposite and recommend not to implement anything other than that minimal set of necessary functionality.
What exactly is the necessary functionality can be discussed separately. My argument is that this PR recommends and encourages library authors to do more than is necessary.
My initial take on this would be that we need only batching and head sampling (not even tail sampling). Everything else belongs to out-of process agent. I may be wrong on this so probably some experimentation and benchmarking is necessary to find the right minimal set of processors that we want.
Again, it may well be that we need all the listed processors (tagging, filtering, etc) but the recommendation should be expressed in the opposite form and encourages minimalism in implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see this PR recommending to do more than ideal. When it talks about example - there is a note that it is "complicated configuration". Can you suggest the language that will help to discourage people to do it? Or perhaps explicitly mention that Collector can do all the processing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this:
SDK authors are encouraged to implement common functionality such as queuing,
batching, tagging, etc. as helpers. This functionality will be applicable
regardless of what protocol exporter is used.
and instead say this:
SDK authors are encouraged to implement minimal functionality that must be done in-process for efficient operation (e.g. batching and head sampling) and delegate complicated processing functionality to out-of-process Agents or Collectors. This minimizes the burden of language library implementation and results in consistent behavior of such processing in multi-language deployment scenarios.
I would completely remove the following 2 sentences:
Processors (or exporters) may
implement tagging, batching, filtering and other advanced scenarios.
SDK MUST allow to end each pipeline with individual exporter and do filtering
or batching independently on each pipeline.
I would also remove the complicated example with the diagram.
@yurishkuro the problem here is that decorator pattern explains how to do chaining, but doesn't explain what kind of functionality will be chained. It also goes to your other comment. Filtering that can be explained via configuration is in many cases not descriptive enough. So it may be easier to tell how to code filters and where to plug them. |
Sorry I was on vacation and did not update it for a while. I'm all in to not require SDK to do any extra work. While batching and head sampling are needed everywhere (or almost everywhere), tagging and filtering are needed frequently and cannot/should not be done by Collector. Examples:
As active contributor to the OpenTelemtery .NET I wanted to understand how to design SpanProcessor and SpanExporter and their registration APIs on TraceFactory. I found that spec is too vague on what multiple processors mean and how to achieve the above scenarios in process. I'm attempting to close this gap. This PR and advanced example intend to clarify how it is going to be used: multi processors are for multiple destinations, chains of processors are advanced configuration users may do (or not). Based on the feedback I will:
|
@lmolkova I think what you do is very important, thank you. I am not against a particular functionality in client libraries. If we feel that it is a must have then let's have it. My objection is primarily conceptual: let's reverse our recommendation and instead of encouraging client libraries to add rich functionality, let's discourage them from adding functionality that is not absolutely required. The reason I want to do that is what I already wrote above:
If we feel that a particular functionality is important but can be implemented well outside the process let's recommend it to be implemented in OpenTelemetry Collector. |
@tigrannajaryan thanks, now I see your point and agree with it. I updated the doc. |
@@ -161,22 +162,29 @@ Manipulation of the span processors collection must only happen on `TracerFactor | |||
instances. This means methods like `addSpanProcessor` must be implemented on | |||
`TracerFactory`. | |||
|
|||
Each processor registered on `TracerFactory` is a start of pipeline that consist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lmolkova I understand that you are clarifying the document and not suggesting new functionality but if you don't mind I would like to bring attention of approvers to this issue while we are working on this part of the spec.
@open-telemetry/specs-approvers what are the arguments for having multi-pipeline feature in-process rather than use the already existing equivalent functionality in OpenTelemetry Collector? In my opinion this could be removed from the SDK and we encourage using Collector for this, but perhaps there are use cases that I am not aware of that require this to be in SDK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it can be removed from SDK. There are definitely cases when collector cannot be installed. Also some spans can be filtered out right from the process to save on cross-proc communication with collector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tigrannajaryan for the z-pages functionality that we want to support it is inefficient to stream everything to the collector. So at least one example that we want to support is:
- Send all spans that record events to the z-pages span processor
- Send all
sampled
spans to the configure exporter (collector probably)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to Sergey and Bogdan's comments
I don't think that having equivalent functionality is bad in this case (and I say this as someone who expects most of their customers to use the collector). As Sergey mentioned, there will always be cases where people can't use the collector or where it doesn't add much value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -161,22 +162,29 @@ Manipulation of the span processors collection must only happen on `TracerFactor | |||
instances. This means methods like `addSpanProcessor` must be implemented on | |||
`TracerFactory`. | |||
|
|||
Each processor registered on `TracerFactory` is a start of pipeline that consist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tigrannajaryan for the z-pages functionality that we want to support it is inefficient to stream everything to the collector. So at least one example that we want to support is:
- Send all spans that record events to the z-pages span processor
- Send all
sampled
spans to the configure exporter (collector probably)
Echoing from my comment further up: I don't think that having equivalent functionality in the libraries and the collector is bad in this case (and I say this as someone who expects most of their customers to use the collector). As Sergey mentioned, there will always be cases where people can't use the collector or where it doesn't add much value, and they should still be able to export to multiple backends simultaneously. |
Have enough approvals and active long enough. Merging |
* Recommend chaining on processors Fixes open-telemetry#316 * fix lint * fix lint2 * example for span chaining * fix formatting * Update sdk-tracing.md * Update sdk-tracing.md * Update sdk-tracing.md * remove advanced example, do not encourage implementing helpers in the SDK * recommend to move implement new common processing scearios out-of-proc * fix lint * BatchProcessor -> BatchEporterProcessor
* Recommend chaining on processors Fixes open-telemetry#316 * fix lint * fix lint2 * example for span chaining * fix formatting * Update sdk-tracing.md * Update sdk-tracing.md * Update sdk-tracing.md * remove advanced example, do not encourage implementing helpers in the SDK * recommend to move implement new common processing scearios out-of-proc * fix lint * BatchProcessor -> BatchEporterProcessor
Fixes #316
Clarifies that multiple processors registered on TracerFactory allow multiple exporters (i.e. not chainable) but chaining could be done on additionally.
Now spec recommends chaining on exporters, check out #316 for reasons why processors are better suited for it.