-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Changes to support W3C style IDs and propagation #33207
Conversation
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it. I suggested one API tweak to use enum instead of bool for exposing format.
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
let's assume new Activity is used with old HttpClient & old Asp.Net Core which are not aware of W3C. depending on If we want to be backward compatible we can try to detect such cases on tracing system level and inject traceparent ourselves, but I'd like to avoid this level of hacks. HttpClient in this case will send if (currentActivity.IsWC3Id)
request.Headers.Add(DiagnosticsHandlerLoggingStrings.TraceParentHeaderName, currentActivity.W3CId);
else
request.Headers.Add(DiagnosticsHandlerLoggingStrings.RequestIdHeaderName, currentActivity.Id); |
Under what situation would someone want to set Activity.UseW3CFormat = true, while at the same time continuing to use legacy software components in their services that do not support W3C? The closest scenario I am envisioning looks like this: In this case I'd still be scared to set Activity.UseW3CFormat = true in node A. In particular with HttpClient implemented as suggested that means that all my requests to Y and Z could switch from RequestID to TraceParent and by assumption Y and Z probably have old software in them that doesn't understand TraceParent. Because updating the software in the whole system will be hard, I'd want to make a very localized change that lets me send TraceParent to W, while leaving everything else untouched to minimize work and risk of regression. To accomplish this I might want to logically do something like this:
|
@lmolkova I don't think we have a scenario where you use old httpClient and new activity. Both are part of the .NET core framework and would be updated together. So I don't think your scenario happens. But on a larger, more philosophical note, I think we should keep our versioning scenarios few and as simple as possible. Cleary the 'all new case' has to work, as well as the 'all old case'. We MAY want to consider cases where some machines are old and some are new but even that case the main compelling scenario I want there is the ability to UPGRADE nodes of the system incrementally. People can stay on the old system until the whole system is able to use the new system. My skeptisism here is that what I have found is that if things like this are not SUPER SIMPLE, users don't end up doing them anyway (they just live without it or use the old system). I am also SUPER worried that it is fragile (we do our part, but other parts of the system do not, so it actually has no value to the end user). Thus my mantra here is to do the EASIEST/SIMPLEST thing first (the homogenous new, and while allowing the old system to work as well with the new code (no forced update)), and only add new scenarios beyond this if they have been vetted as useful, AND do not introduce much complexity. For what it is worth... |
Regarding old HttpClient + new DiagnosticSource. They do ship together, but the pace users update is different. ApplicationInsights will enforce new DiagnosticSource, while it may take years before users update to .NET Core 3.0. As a tracing system, we cannot require users to upgrade to latest .NET version. what you propose is that we detect that users do not have W3C enabled ASP.NET Core & HttpClient and turn off W3C for them. This will de-facto block W3C adoption and will create a discrepancy between different versions. Note that there is no way to convert Request-Id to W3C - old root Id is not compatible with new TraceId format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an absolute minimum of changes to support W3C it may work. But we clearly need to keep working on this. For instance, expose and allow to override span-id
, make tracestate
a list, measure perf an consider using specialized types for trace-id
and span-id
. In a spirit of moving forward can you please address feedback (mostly renaming's) before merging.
/// it is expected to be special cased (it has its own HTTP header), it is more | ||
/// convenient/efficient if it is not lumped in with other baggage. | ||
/// </summary> | ||
public string TraceState |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the fact it is a string, not a list is still up with discussions, correct? Should it be marked in comments that this API is not final?
/// it is expected to be special cased (it has its own HTTP header), it is more | ||
/// convenient/efficient if it is not lumped in with other baggage. | ||
/// </summary> | ||
public string TraceState |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spec uses tracestate as a word. Please do not upper case State
. See discussion census-instrumentation/opencensus-specs#171
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other http headers/consts that I know of in Microsoft/System.Net are written with .net naming conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracestate
follows naming convention if considered to be a single word.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am at best ambivalent about changing the case. Sure we are consistent with the spec, but is that the important thing? Is seems more useful to allow users (who probably never read the spec) to 'guess' the correct name. I would argue that they would guess TraceState (after all it is two English words).
In the end, this is a small thing (intellisense fixes it and frankly, the number of users of this API is very small), so will not fight things one way or the other. My recommendation is to let the API review people decide. I will leave it for now.
/// to store vendor-specific information that must flow. | ||
/// Logically it is just a kind of baggage (if flows just like baggage), but because | ||
/// it is expected to be special cased (it has its own HTTP header), it is more | ||
/// convenient/efficient if it is not lumped in with other baggage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this summary is helpful for the person uses this API. I'd suggest the following text:
The filed
tracestate
carries information supplemental to trace identity contained intraceparent
. List of key value pairs carries bytracestate
convey information about request position in multiple distributed tracing graphs. It is typically used by distributed tracing systems and should not be used as a general purpose baggage as this use may break correlation of a distributed trace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add this to the comment
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
I do not agree. We should understand what is the default and configurable behavior for Id format. Now it's defined by upstream service. We should allow either enforce Request-Id or W3C: deterministic behavior controlled by current service. |
@lmolkova I agree with you that we need to make it deterministic. This implementation make it an ASP.NET decision as parsing logic may decide to switch Activity to the new format when receiving Thus I'm saying it is an OK first step that may require many more steps |
I have added a strawman for Activity sampling. The idea is to make an 'always on' capability cheap enough so that we will indeed leave it always on (just turn the sampling down. Basically on incoming HTTP request (in RecordRequestStartEventLog) we now will do something like this
Logging systems can basically issue config (requests) to have a certain sampling level (SamplingPercent). Setting this to 15 will cause on average 15 out of every 100 events to be sampled. The activity class keeps track of all the outstanding request for sampling and basically chooses the largest. When the logging system disposes the config then that request for sampling goes away. The sampling system tries to keep the given sampling rate as well as honor any explicit requests to sample (e.g. from a W3C id or from SetRecordingDesired. |
@vancem would it be possible to make sampling proposal a separate PR? So we can keep making progress and sampling discussing will not block possible other discussions |
I have pulled the activity sampling logic into its own PR #33605. |
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Show resolved
Hide resolved
@vancem this seems like a stale PR with accidental merge instead of rebase (which makes it nearly impossible to review). I will close it for now. |
Reopening. I am updating the branch to fix the accidental merge. |
Is the "NO MERGE" still needed? Do we have rough ETA for the PR? |
Yes, the PR is still in a state that we would not want to check in. We want this in for 3.0, so we expect to finalize it this month. |
src/System.Diagnostics.DiagnosticSource/ref/System.Diagnostics.DiagnosticSource.cs
Outdated
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
see https://w3c.github.io/trace-context This is mostly for discussion purposes.
I added a workaround for the package cycle. This is going to fail package testing due to the dangling ref. Let me add a workaround for that. |
Reminding myself that the API review issue for this new API is https://github.com/dotnet/corefx/issues/34828 |
|
||
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> | ||
<ItemGroup Condition="$(TargetFramework.StartsWith('netstandard1.')) OR $(TargetFramework.StartsWith('netcoreapp1.'))"> | ||
<PackageReference Include="System.Memory" Version="4.5.1" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why pick this version over 4.5.2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a temporary workaround in the package testing infra. It's nothing that persists into the product. You can tell folks to manually reference the latest.
@@ -38,6 +38,9 @@ | |||
<PackageReference Include="$(TargetingPackNugetPackageId)"> | |||
<Version >$(_TargetingPackVersion)</Version> | |||
</PackageReference> | |||
<PackageReference Include="System.Memory" Condition="'$(TargetGroup)' == 'net45' OR '$(TargetGroup)' == 'net46'"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about other netfx versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/System.Diagnostics.DiagnosticSource/ref/System.Diagnostics.DiagnosticSourceActivity.cs
Outdated
Show resolved
Hide resolved
@@ -2,11 +2,8 @@ | |||
<Project ToolsVersion="14.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> | |||
<PropertyGroup> | |||
<BuildConfigurations> | |||
netstandard1.1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing this test configurations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We got failures trying to build these test configurations because of the way we build with xunit. After some discussion with @ericstj we concluded that these configurations did not make sense (it did not increase coverage over the other configurations that are still present). I can forward the e-mail conversation on it if you wish.
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
src/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/Activity.cs
Outdated
Show resolved
Hide resolved
Uses Utf8 helpers where possible.
OK, unless someone vetos it, I am merging this later today. I have incorperated @ahsonkhan feedback. I do want to get packages to the App-Insights folks. |
Question: Would we address API review feedback (if any) in a subsequent commit? I imagine people having opinions about the public ActivitySpanId(System.ReadOnlySpan<byte> idData, bool isUtf8Chars = false) { throw null; } |
if (idData.Length != 16) | ||
throw new ArgumentOutOfRangeException(nameof(idData)); | ||
idData.CopyTo(outBuff); | ||
_id1 = BinaryPrimitives.ReverseEndianness(_id1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not 100% confident about reversing the endianness here, mainly because I am not sure why we get the bytes as big endian which ends up requiring this reversal. I would have thought we would want: if (!BitConverter.IsLittleEndian) { reverse }
, i.e. reverse only for BigEndian. Just highlighting it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we want it the way it currently is. Basically the standard wants the bytes written in byte-by-byte order, which is 'standard' for network stuff. This happens to correspond to big-endian if you try to clump it as larger integers as we are doing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did not get any feedback on the API review on this API. We can change it, but the sooner the better, as it is a breaking change. The alternative is to pick one to use the span constructor and the other is a static (or make both methods that take strings (as Spans) to be static). If there is precedent for resolving this I am happy to follow it.
/// and returns the resulting string. | ||
/// </summary> | ||
internal static string SpanToHexString(ReadOnlySpan<byte> bytes) | ||
{ | ||
Debug.Assert(bytes.Length <= 16); // We want it to not be very bing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: very bing
Nice! |
* First set of changes to support W3C style IDs and propagation see https://w3c.github.io/trace-context This is mostly for discussion purposes. * Changed UseW3CFormat to be DefaultIdFormat * Review feedback * More comments. Small Renames * Added Sampling Support, Review feedback. * Added placeholder for SetRecordingDesired * Separated out the Sampling support into its own PR. * Remove more sampling support * Fix IsWC3Id -> IdFormat * Added Sampling Support, Review feedback. * Separated out the Sampling support into its own PR. * Add ForceW3C option. * Introduce ForceDefaultIdFormat * Adding SpanId and TraceId support * Change ulong->long in SpanID (probably temporary) * Remove undesired file changes * Fix up reference assembly. Note complete because of questions about Span<byte>, but closer now. * Turn on code that used Span<byte> * Defer setting IDFormat until start. This insures that the IDFormat property is either unknown or a given value (that never changes from there). * More implementation, made the interface more uniform. * Support to avoid using strings whenever possible Basically Id, SpanId TraceId properties are set lazily and only converted lazily. * Added some tests * First round of testing (and bugfixes) * More testing * Added Equality operators * Rename SpanId -> ActivitySpanId TraceId ->ActivityTraceId * Add Comments * Fix bad XML comment * Review feedback * Change AsBytes -> CopyTo Lifetime issues prevent returning Span<T> (which is what AsBytes does. Reverting to CopyTo instead. Also added System.Memory ref in attempt to resolve build errors (that don't reproduce locally). * Attempt to fix build break in netfx build * Deal with Verification errors. * Provide full namespace for SecuritySafeCriticalAttribute * More securitySafeCritical annotations to fix test failures on desktop * Another attempt on setting SecuritySafeCritical * Remove the readonly ref (to avoid perf issue) * Fix most -buildAllConfigurations issues * Tentative change to see if testing works if we ignore the older netstandard configs. * Workaround package cycle involving DiagnosticSource and Memory * Satisfy dangling System.Memory dangling reference in package tests * Add notes to remove the workarounds when Unsafe is fixed. * Review feedback Uses Utf8 helpers where possible. Commit migrated from dotnet/corefx@fa07e4f
* First set of changes to support W3C style IDs and propagation see https://w3c.github.io/trace-context This is mostly for discussion purposes. * Changed UseW3CFormat to be DefaultIdFormat * Review feedback * More comments. Small Renames * Added Sampling Support, Review feedback. * Added placeholder for SetRecordingDesired * Separated out the Sampling support into its own PR. * Remove more sampling support * Fix IsWC3Id -> IdFormat * Added Sampling Support, Review feedback. * Separated out the Sampling support into its own PR. * Add ForceW3C option. * Introduce ForceDefaultIdFormat * Adding SpanId and TraceId support * Change ulong->long in SpanID (probably temporary) * Remove undesired file changes * Fix up reference assembly. Note complete because of questions about Span<byte>, but closer now. * Turn on code that used Span<byte> * Defer setting IDFormat until start. This insures that the IDFormat property is either unknown or a given value (that never changes from there). * More implementation, made the interface more uniform. * Support to avoid using strings whenever possible Basically Id, SpanId TraceId properties are set lazily and only converted lazily. * Added some tests * First round of testing (and bugfixes) * More testing * Added Equality operators * Rename SpanId -> ActivitySpanId TraceId ->ActivityTraceId * Add Comments * Fix bad XML comment * Review feedback * Change AsBytes -> CopyTo Lifetime issues prevent returning Span<T> (which is what AsBytes does. Reverting to CopyTo instead. Also added System.Memory ref in attempt to resolve build errors (that don't reproduce locally). * Attempt to fix build break in netfx build * Deal with Verification errors. * Provide full namespace for SecuritySafeCriticalAttribute * More securitySafeCritical annotations to fix test failures on desktop * Another attempt on setting SecuritySafeCritical * Remove the readonly ref (to avoid perf issue) * Fix most -buildAllConfigurations issues * Tentative change to see if testing works if we ignore the older netstandard configs. * Workaround package cycle involving DiagnosticSource and Memory * Satisfy dangling System.Memory dangling reference in package tests * Add notes to remove the workarounds when Unsafe is fixed. * Review feedback Uses Utf8 helpers where possible. Commit migrated from dotnet/corefx@fa07e4f
See https://w3c.github.io/trace-context for more on what W3C Ids are.
This is mostly for design discussion purposes. Adding NO MERGE flag until we work through the details. (we will need a design review as well, but first things first).
Also see https://github.com/aspnet/Hosting/pull/1577/files for changes to the ASP.NET hosting code to support incoming HTTP propagation of the ID and TraceState.
This implementation was meant to be pretty much the 'minimal' change from where we are today that would seem to support the W3C style IDs. It does not (yet) support some important considerations including
This work includes feedback from Noah where we try to keep things super-simple and avoid 'support both' functionality if at all possible. Ideally in 5 years we only have one format (the W3C format), and everything else is only available via compatibility switches (or not supported).
You can see the changes to Activity are pretty minimal. Basically
@noahfalk @lmolkova @glennc @SergeyKanzhelev @davidfowl @pakrym