Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API Changes to System.Activity To support W3C Style IDs and Propagation #28508

Closed
vancem opened this issue Jan 25, 2019 · 20 comments
Closed

API Changes to System.Activity To support W3C Style IDs and Propagation #28508

vancem opened this issue Jan 25, 2019 · 20 comments

Comments

@vancem
Copy link
Contributor

vancem commented Jan 25, 2019

There is a World Wide Web Consortium (W3C) draft standard for the format and propagation of IDs used to correlate logging from various machines in a distributed application (for details see TraceContext. Microsoft (in particular the Applications Insights team), wishes to support this standard in a first-class way.

There is a corresponding Pull Request dotnet/corefx#33207 with an actual implementation. This issue just covers the public surface area additions.

Summary

The changes include the following additions to the existing 'System.Activity' class

    public partial class Activity
    {
        public static ActivityIdFormat DefaultIdFormat { get { throw null; } set { } }
        public static bool ForceDefaultIdFormat { get { throw null; } set { } }
        public ActivityIdFormat IdFormat { get { throw null; } }

        public ref readonly ActivitySpanId ParentSpanId { get { throw null; } }
        public ref readonly ActivitySpanId SpanId { get { throw null; } }
        public ref readonly ActivityTraceId TraceId { get { throw null; } }
        public string TraceStateString { get { throw null; } set { } }
        public Activity SetParentId(in ActivityTraceId traceId, in ActivitySpanId spanId);
}

As well as the following new classes that support these new methods

    public enum ActivityIdFormat
    {
        Hierarchical = (byte)1,
        Unknown = (byte)0,
        W3C = (byte)2,
    }

    public readonly partial struct ActivitySpanId : IEquatable<ActivitySpanId>
    {
        public void CopyTo(Span<byte> destination) { }
        public static ActivitySpanId CreateFromBytes(ReadOnlySpan<byte> idData) { throw null; }
        public static ActivitySpanId CreateFromString(ReadOnlySpan<char> idData) { throw null; }
        public static ActivitySpanId CreateFromUtf8String(eadOnlySpan<byte> idData) { throw null; }
        public static ActivitySpanId CreateRandom() { throw null; }
        public bool Equals(ActivitySpanId spanId) { throw null; }
        public override bool Equals(object obj) { throw null; }
        public override int GetHashCode() { throw null; }
        public static bool operator ==(in ActivitySpanId spanId1, in ActivitySpanId spandId2) { throw null; }
        public static bool operator !=(in ActivitySpanId spanId1, in ActivitySpanId spandId2) { throw null; }
        public string ToHexString() { throw null; }
        public override string ToString() { throw null; }
    }
    public readonly partial struct ActivityTraceId : IEquatable<ActivityTraceId>
    {
        public void CopyTo(Span<byte> destination) { }
        public static ActivityTraceId CreateFromBytes(ReadOnlySpan<byte> idData) { throw null; }
        public static ActivityTraceId CreateFromString(ReadOnlySpan<char> idData) { throw null; }
        public static ActivityTraceId CreateFromUtf8String(ReadOnlySpan<byte> idData) { throw null; }
        public static ActivityTraceId CreateRandom() { throw null; }
        public bool Equals(ActivityTraceId traceId) { throw null; }
        public override bool Equals(object obj) { throw null; }
        public override int GetHashCode() { throw null; }
        public static bool operator ==(in ActivityTraceId traceId1, in ActivityTraceId traceId2) { throw null; }
        public static bool operator !=(in ActivityTraceId traceId1, in ActivityTraceId traceId2) { throw null; }
        public string ToHexString() { throw null; }
        public override string ToString() { throw null; }
    }

Detailed description

These APIs represent two logically distinct sets of functionality

  1. Add the ability for Activity to have more then one ID format.
    Activity's have always supported the concept of an ID, and while it was flexible the format ID was willing to accept, when it was generating IDs it created them in a 'hierarchical' format that looked like a file path (a list of named separated with a particular syntax). The WC3 ID is a much flatter, 'two level' ID (consisting of ActivityTraceId (a 16 byte binary number) representing 'independent' activities, and a 'SpanId' (a 8 byte number) that is used to distinguish among all the activities related to a top level (independent) activity. To support W3C seamlessly, Activity would need to generate its IDs in this format.
    The first feature is APIs that define the new W3C format and allow logging systems to control it (Note that this is not a breaking change because by default activity does what it did before), you have to ask for the new format.
  2. Support efficient ID manipulation.
    In theory just supporting the new Activity ID format however the fact that ID is represented as a string is a source of inefficiency (part of the reason W3C chose this ID format was to make it efficient to manipulate the IDs).
    To this end we define a ActivityTraceId and ActivitySpanId (these are the preferred in the W3C standard), struct types. These types hold the two components of the ID, and allow non-allocating manipulation of them. In addition we add some APIs to allow you to use these types instead of strings in the most important APIs. Conceptually these classes do very little (the simply hold a 16 byte or 8 byte binary value), but they do allow caching, and support obvious functionality (converting to wand from their string representation).

APIs for supporting multiple ID formats

The APIs to support multiple ID formats include

    public partial class Activity
    {
        public static System.Diagnostics.ActivityIdFormat DefaultIdFormat { get { throw null; } set { } }
        public static bool ForceDefaultIdFormat { get { throw null; } set { } }
        public System.Diagnostics.ActivityIdFormat IdFormat { get { throw null; } }
    }

    public enum ActivityIdFormat : byte
    {
        Hierarchical = (byte)1,
        Unknown = (byte)0,
        W3C = (byte)2,
    }

Basically we define a new enum 'ActivityIdFormat' which defines the formats Activity knows about, and have a IdFormat property that indicates which one is in use. That is used when generated new IDs to figure out which format to ue. If a Activity has a parent, then then by default the child activity uses the same format. If the activity is at the top level, it uses the Stack 'DefaultIdFormat' to decide.

The only slightly subtle thing is that App-Insights wanted the ability to FORCE the format REGARDLESS of what the parent's format was. This is what the 'ForceDefaultIdFormat' does. When true you use the default format and ignore the parent format.

TraceState

The W3C standard also defines something called 'Tracestate' in addition to the ID format. The details of its use are not important, only that it needs to propagate along with the Activity (it is like the existing 'Baggage' fields. We add direct support for this.

        public string TraceStateString { get { throw null; } set { } }

In general we don't mind adding 'well know' state like this (as long as we believe 'most' users will want it) Its implementation is trivial (we get and set it, and propagate it to children). We anticipate more of these over time and it they are a good things (this direct support makes it more efficient and more discoverable).

The property was given the name 'TraceStateString' not just 'TraceState' because there is the possibility that we may want to create a struct for this (like we did for ActivityTraceId), and wanted to allow the best name to be used for that if we do that.

ActivityTraceId and ActivitySpanId types.

The new API surface to support efficient manipulation of IDs is as follows.

    public readonly partial struct ActivitySpanId : IEquatable<ActivitySpanId>
    {
        public void CopyTo(Span<byte> destination) { }
        public static ActivitySpanId CreateFromBytes(ReadOnlySpan<byte> idData) { throw null; }
        public static ActivitySpanId CreateFromString(ReadOnlySpan<char> idData) { throw null; }
        public static ActivitySpanId CreateFromUtf8String(eadOnlySpan<byte> idData) { throw null; }
        public static ActivitySpanId CreateRandom() { throw null; }
        public bool Equals(ActivitySpanId spanId) { throw null; }
        public override bool Equals(object obj) { throw null; }
        public override int GetHashCode() { throw null; }
        public static bool operator ==(in ActivitySpanId spanId1, in ActivitySpanId spandId2) { throw null; }
        public static bool operator !=(in ActivitySpanId spanId1, in ActivitySpanId spandId2) { throw null; }
        public string ToHexString() { throw null; }
        public override string ToString() { throw null; }
    }
    public readonly partial struct ActivityTraceId : IEquatable<ActivityTraceId>
    {
        public void CopyTo(Span<byte> destination) { }
        public static ActivityTraceId CreateFromBytes(ReadOnlySpan<byte> idData) { throw null; }
        public static ActivityTraceId CreateFromString(ReadOnlySpan<char> idData) { throw null; }
        public static ActivityTraceId CreateFromUtf8String(ReadOnlySpan<byte> idData) { throw null; }
        public static ActivityTraceId CreateRandom() { throw null; }
        public bool Equals(ActivityTraceId traceId) { throw null; }
        public override bool Equals(object obj) { throw null; }
        public override int GetHashCode() { throw null; }
        public static bool operator ==(in ActivityTraceId traceId1, in ActivityTraceId traceId2) { throw null; }
        public static bool operator !=(in ActivityTraceId traceId1, in ActivityTraceId traceId2) { throw null; }
        public string ToHexString() { throw null; }
        public override string ToString() { throw null; }
    }

Basically ActivitySpanId and ActivityTraceId are simply 8 byte and 16 byte IDs. You can create them either by

  1. Given them binary data
  2. Giving them a hexadecimal UTF8 encoded string (as a ReadOnlySpan)
  3. Giving them a hexadecimal unicode char (2 byte) encoded string (as a ReadOnlySpan)

The operations on these types are basically to fetch the bytes and to print as a string. We may add more over time, but this is enough to start with.

We then add propertied that fetch these IDs from the activity (they return the 'null' ID if the Activity does not have a W3C formatted ID). The main way that Activities get IDs is either by setting the ID of the parent (thus we needed a SetParentID API that takes these ids), and when they are generated randomly (for top level Activities without parents), which is what the 'NewTraceId() method is for.

We do anticipate that more additions to Activity will be needed (even before V3.0 is finalized), but this group is logically coherent, and covers much of the ground we need for V3.0 (and support of W3C IDs).

@vancem
Copy link
Contributor Author

vancem commented Jan 25, 2019

@noahfalk @lmolkova FYI

@benaadams
Copy link
Member

/cc @cwe1ss

@terrajobst
Copy link
Contributor

@vancem

Is this a feature we plan on shipping as part of .NET Core 3.0?

@noahfalk
Copy link
Member

@terrajobst - yep 3.0

@vancem
Copy link
Contributor Author

vancem commented Feb 5, 2019

I am back from being out of the office. I would like to check this change in soon (ideally this week), so that other teams can start building on it.

Any pushback on the design? I would like to close on this. Note that the main concern that was raised (offline), was a concern that the WC3 standard is not finalized. However with the exception of the TraceState variable (which frankly we support only by giving it a (potentially) optimized (trivial) property), the dependency on VERY stable parts of the standard (the ID format has been stable for YEARS, and because other vendors already use it is very unlikely to change). Moreover most of the API change above is simply allowing Activity to embrace the concept of another kind of ID format, and that that format has two parts (TraceId and SpanId). These concepts are very unlikely to change.

The details of the system (e.g. how these get encoded into HTTP, and versioned etc...) are potentially in flux, but these will not affect the API surface in a breaking way (we may add more things). Thus we feel very confident that this API surface is stable. (Indeed we are happy that we were able to make this (rather profound) change in an additive way, because we tried hard with the Activity API this is being added to make something that could be evolved over time. We believe this is just the next step.

So are there any issues? Any further pushback?
What is the next step in the approval process?

@terrajobst
Copy link
Contributor

The name SpanId is very unfortunate, especially because the API proposal also includes the usage of Span<T>. Could we rename this type?

@vancem
Copy link
Contributor Author

vancem commented Feb 7, 2019

I agree that the name collision is unfortunate. The standard https://w3c.github.io/trace-context/ as well as other related correlation technology like open tracing (https://opentracing.io/docs/overview/) use the term 'Span' for what we call 'Activity' (it is a logical operation that has a begining ending, and parent-child relationships). Indeed when Activity was first introduced there were people who wanted it named Span.

Here are some alternatives

  1. Leave it as SpanId. Relying on the fact that this lives in System.Diagnostics namespace to avoid confusion with System.Span.
  2. Put these declarations into the Activity Class. They become Activity.SpanId and Activity.TraceId
  3. Make a namespace System.Diagnostics.Activity and put in there.
  4. Rename them ActivitySpanId and ActivityTraceId

There are probably other options as well, but those are the ones that occur to me.

Note that these types are pretty specialized. They will not be used widely (only logging systems will care), and thus they don't have to be particularly 'Pretty' or Short. This argues against options (1) and event (3) above because the convenience factor is that that big of a deal. I kind of like (2), but I could also definitely live with (4).

Comments?

@noahfalk
Copy link
Member

noahfalk commented Feb 7, 2019

I'm going to add two more options that I thought of:
5) You could put them in a namespace that evokes the standard, for example System.Diagnostics.W3CTraceContext.SpanId
6) The standard also uses another name for SpanId called ParentId. I find ParentId very confusing for its intended usage, but its there.

(5) sounds nice to me but I'm ignorant of prior art we may have used naming things that implement standards. I agree with Vance that (2) and (4) seem pretty reasonable too.

@lmolkova
Copy link

lmolkova commented Feb 8, 2019

I'm fine with (1 - keep SpanId), (2 - Activity.SpanId), (5 - W3CTraceContext.SpanId).

Regarding 6:

The standard also uses another name for SpanId called ParentId

ParentId is existing API and we don't want to change it. We have new ParentSpanId, but somehow it is not mentioned here.

@vancem in the implementation PR dotnet/corefx#33207 we have public ref readonly System.Diagnostics.SpanId ParentSpanId on the Activity. Did you skip it here intentionally?

@vancem
Copy link
Contributor Author

vancem commented Feb 8, 2019

Because I would like to rap this up sooner rather than later, I would like to put a stake in the ground on the naming of SpanId and TraceId. Basically as implementer, I am expressing a preference. This is not a conclusion, but it is saying that unless someone champions another option, it I will implement this choice.

My preference is for both SpanId and TraceId to be nested in Activity. Normally we don't like to do this because it does make the names longer, however, I think that this is appropriate here because

  • These ID are strongly tied to Activity. Moreover we expect these W3C ID to become the standard format for Activity (that is why naming them WC3SpanId or something like that is not as valuable). I don't like WC3 in the names because frankly it is not a well know what it is, and frankly does not add value in the limit (Because we believe from most people's point of view there is only one ID format for activities and it is this one. Naming them Activity.SpanId makes that pretty clear.

  • This also has the effect of 'hiding' these types (only people playing with Activities will find them), and that is a good thing (they are rather specialized).

As mentioned, I could live with other options as well, but I am putting naming them as nested types of Activity (thus Activity.SpanId and Activity.TraceId) as the option to beat.

Does anyone want to champion another option or should we consider this closed?

@vancem
Copy link
Contributor Author

vancem commented Feb 8, 2019

In working with these APIs a bit, it seems very likely that users will want to compare SpanId and TraceId for equality. It is also very reasonable for these APIs to be used as keys in a Dictionary<K, T>. Thus I am also proposing that we add the following APIs to these types, and that both types implement IEquatable

        public static bool operator ==(in TraceId traceId1, in TraceId traceId2);
        public static bool operator !=(in TraceId traceId1, in TraceId traceId2);
        public bool Equals(TraceId traceId):
        public override bool Equals(object obj);
        public override int GetHashCode();

Note that I use 'in' (that is readonly refernce), for the == and != oerpator, but I don't use it on Equals becasue I need to match IEquatable's signature. This allows these IDs to be used as keys in dictionaries without boxing overhead.

Comments?

Note the 'in' operators for the parameters, which basically say pass it by ref.

@vancem
Copy link
Contributor Author

vancem commented Feb 8, 2019

I ran into a snag with making SpanId and TraceId be nested inside activity. We have fields with those names, and C# does not allow both the nested type name and the field name simultaneously. Thus one or the other needs to be moved/renamed.

That leaves us with

  1. ActivitySpanId
  2. Make a namespace System.Diagnostics.Activities and put the SpanId there. (can't use System.DiagnosticsActivty for namespace since that is the name of the class).
  3. A variation on 2, calling it System.Diagnostics.W3C (or something).

I am not a big fan of making the name of the standard part of the name. Standards are mostly NOT well known. Also that would be more appropriate if Activities were going to support several ID formats over time, but our intent is that there will only be this one (we will obsolete the existing Hierarchical format).

Right now I am implementing option 1.

@vancem
Copy link
Contributor Author

vancem commented Feb 20, 2019

I have updated the initial proposal to reflect the feedback and changes because of implementation issues. The changes are

  1. Rename SpanId -> ActivitySpandId - This avoids confusion with Span
  2. Rename TraceId -> ActivityTraceId - For consistancly with ActivitySpanId
  3. ActivitySpandId.AsBytes -> ActivitySpanId.CopyTo(Span byte) - this was needed because AsBytes could not be implemented safely because of lifetime issues. CopyTo avoids these lifetime issues. A analagous change was made for ActivityTraceId.AsBytes.
  4. Removed the readonly attribute for the ref return from Activity.SpanId and Activity.TraceId. The problem is that this forces a copy if you try to use any instance methods of ActivitySpanId or ActivityTraceId. This does open an avenue for mistakes/abuse, but this API is not likely to be used by novices.

The implementation is getting very close to passing all tests. I believe I have addressed all API design feedback, and would like to place a deadline for approval very soon (say tomorrow).

Thanks

@terrajobst

@benaadams
Copy link
Member

4. Removed the readonly attribute for the ref return from

Can they be readonly structs? Also then makes using in an option for passing them around

@vancem
Copy link
Contributor Author

vancem commented Feb 20, 2019

Can they be readonly structs?

The ActivitySpanId and ActivityTraceId are designed to be 'caches' for the binary and/or string value of the ID. Thus they are not perfectly read-only (since the cache of the other format is a write).

I played with the idea of declaring them as a readonly struct, and using some Unsafe casting to update the cache field. That would require a dependence on System.Runtime.CompilerServices.Unsafe which I thought was problematic, but have since learned that it may not be.

I will take one more run at it to see if I can make that work and if it does indeed work out I will put the 'readonly back in (and make the structs readonly).

@vancem
Copy link
Contributor Author

vancem commented Feb 22, 2019

I looked into the option of making ActivityTraceId and ActivitySpanId readonly and it works, but I needed System.Runtime.CompilerServices.Unsafe to make it work (I need to break the read-only-ness in the implementation).

I had a complete working implementation, but there was a snag referencing the Unsafe class (see issue #27918). The current plan is to check in the API above in preview, but to change ActivityTraceId and ActivitySpanId to be readonly when #27918 is fixed (which will be before we ship officially for V3.0).

So we will be making them read-only structs, just not initially...

@stephentoub
Copy link
Member

@terrajobst, the PR for this got merged. Can we please prioritize reviewing this at the next meeting?

@terrajobst
Copy link
Contributor

terrajobst commented Mar 4, 2019

Totally. I'll add it to the list tomorrow.

@stephentoub
Copy link
Member

Thanks.

@vancem
Copy link
Contributor Author

vancem commented Mar 8, 2019

@terrajobst - I have updated the Summary of the API above to include all the suggestions from the Review (they have actually also been implemented, see PRs above).

So take a look at the summary. If it looks good then we are done. Otherwise let me know and I will incorporate any additional feedback.

@vancem vancem closed this as completed Mar 25, 2019
@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the 3.0 milestone Feb 1, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 14, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants