Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistent transform of attribute values #2010

Closed
alanwest opened this issue Apr 24, 2021 · 11 comments
Closed

Consistent transform of attribute values #2010

alanwest opened this issue Apr 24, 2021 · 11 comments
Labels
enhancement New feature or request Stale Issues and pull requests which have been flagged for closing due to inactivity

Comments

@alanwest
Copy link
Member

There has been some recent discussion about how to handle the differences in
what the .NET APIs allows as attribute values and what is declared in the OpenTelemetry specification.
Opening this issue to continue the discussion. (#1966, #1973)

The specification states that an attribute values is either:

  • A primitive type: string, boolean, double precision floating point (IEEE 754-1985) or signed 64 bit integer.
  • An array of primitive type values. The array MUST be homogeneous, i.e. it MUST NOT contain values of different types. For protocols that do not natively support array values such values SHOULD be represented as JSON strings.

The API for setting a tag on a .NET Activity is broader in scope than what the OpenTelemetry specification requires.
A tag's value can be any type.

public System.Diagnostics.Activity SetTag (string key, object? value);

Currently, the exporters in this project convert tag values to a string, boolean, double, or int64 - or an array of these - as appropriate.

I propose we continue to embrace the flexibility of the .NET Activity API and ensure that we are consistently converting all C# built-in types to the most appropriate type declared in the OpenTelemetry specification as follows:

  • All numeric types which would not be truncated should be converted to a double or int64 value.
    • Examples: short, int, long, float, double
  • All numeric types which may be truncated if converted to a double or int64 should be converted to a string.
    • Examples: decimal, ulong
  • A bool is a bool
  • All other types would be converted to a string.

For the most part, I think this is pretty close to what the exporters already do, but there are a few open questions:

  1. There exist numeric built-in types that are currently converted to a string
    • byte, sbyte, ushort, uint
  2. Arrays of built-in types should work
    • Currently a byte[], char[], or decimal[], for example, are converted to the string System.Byte[] and System.Char[], and System.Decimal[].
  3. ToString can be perilous. Should we be concerned?
    • ToString on an arbitrary type could result in an exception (or worse a deadlock... I've actually seen this). Currently, I think our exporters would throw away an entire batch if an exception were thrown when transforming an attribute value. Should we be more defensive here and discard attributes that cannot be transformed?
@alanwest
Copy link
Member Author

Ah apologies, I just saw this related issue and comment from @cijothomas #1946 (comment). If this is our stance then this would alleviate the concern of risky calls to ToString. Though, I'd still like to consider broadening our handling of built-in types many of which are not called out as supported in the spec, but since the API allows for them I think it would be unintuitive to drop them.

@Oberon00
Copy link
Member

This needs some more clarification though. E.g. if you interpreted the OTel spec very strictly, then int values (32 bit in .NET) are not valid, only long. And what about unsigned?

@cg110
Copy link

cg110 commented May 4, 2022

I've certainly been bitten by conversion issues in this area (originally with the new relic exporter) so ended up with a custom processor that in the OnEnd updates the tags to be something that it handled, but I still use with the OpenTelemetry Exporter.

If it's a string it's left as a string
Other types are tested with:

var typeCode = Type.GetTypeCode(type);
return typeCode == TypeCode.Decimal || (type.IsPrimitive && typeCode != TypeCode.Object && typeCode != TypeCode.Char);

(I now look at it, and realize that Object is never primitive, so should just be IsPrimative and not Char)

If they don't meet the above test, then they get converted to strings, but with certain standardization:
enums are sent as strings (using an enum to string cache, as enum.ToString() is slow in netfx (slow enough that it shows up in profiling))
DateTime set to Utc (setting the Unspecified kind as UTC, as in our code base they are generally UTC if not specified) and converted with ToString("O", CultureInfo.InvariantCulture)
DateTimeOffset converted with ToString("O", CultureInfo.InvariantCulture)
TimeSpan converted with ToString("c", CultureInfo.InvariantCulture)

Everything else is put through:
Convert.ToString(value, CultureInfo.InvariantCulture)

I've no doubt that arrays won't be correct (however, we've not used arrays for tags yet)

So it would be useful to know what OTel will do with the above types, and if I need to maintain the conversions.

Perhaps it'd be useful if there was something I could call to check if OTel can handle the Type, so I can munge anything OTel can't.

@cg110
Copy link

cg110 commented May 4, 2022

Also how far do you go, eg should anything that's IEnumerable<int> or ICollection<int> actually be converted to an int array on the wire? (so lists or similar would also work alongside [])

@alanwest
Copy link
Member Author

alanwest commented May 9, 2022

This needs some more clarification though. E.g. if you interpreted the OTel spec very strictly, then int values (32 bit in .NET) are not valid, only long. And what about unsigned?

Given that the .NET API is not strict (allows object-typed tag values), I'd like us to settle on a happy medium ground between a strict read of the spec and handling what we think will be commonly used datatypes that are also easy to handle.

@alanwest
Copy link
Member Author

alanwest commented May 9, 2022

DateTime set to Utc (setting the Unspecified kind as UTC, as in our code base they are generally UTC if not specified) and converted with ToString("O", CultureInfo.InvariantCulture) DateTimeOffset converted with ToString("O", CultureInfo.InvariantCulture) TimeSpan converted with ToString("c", CultureInfo.InvariantCulture)

Everything else is put through: Convert.ToString(value, CultureInfo.InvariantCulture)

I'm leaning toward not handling enums, DateTime, TimeSpan, and "everything else". Different users may want these things formatted differently, so rather than the SDK imposing an opinion, I'd prefer users handle ToStringing these values to fit their specific need.

@cg110
Copy link

cg110 commented May 9, 2022

DateTime set to Utc (setting the Unspecified kind as UTC, as in our code base they are generally UTC if not specified) and converted with ToString("O", CultureInfo.InvariantCulture) DateTimeOffset converted with ToString("O", CultureInfo.InvariantCulture) TimeSpan converted with ToString("c", CultureInfo.InvariantCulture)

Everything else is put through: Convert.ToString(value, CultureInfo.InvariantCulture)

I'm leaning toward not handling enums, DateTime, TimeSpan, and "everything else". Different users may want these things formatted differently, so rather than the SDK imposing an opinion, I'd prefer users handle ToStringing these values to fit their specific need.

That's fine, as everyone will have different needs (some will want datetime as unix epoch etc), for me it'd be nice to know if something is already handled before attempting to handle it (rather than having a hard coded list, although I suspect it won't change much)

I don't know if that's a hook needed or similar that can add more conversions, or if processor is enough.

@alanwest
Copy link
Member Author

alanwest commented May 9, 2022

Also how far do you go, eg should anything that's IEnumerable or ICollection actually be converted to an int array on the wire? (so lists or similar would also work alongside [])

I think handling just array keeps things simple. Handling IEnumerable could be risky. I might have a class that implements IEnumerable<int> that cannot be safely enumerated concurrently or multiple times.

@cijothomas
Copy link
Member

I think handling just array keeps things simple.

+1 to this

Copy link
Contributor

This issue was marked stale due to lack of activity and will be closed in 7 days. Commenting will instruct the bot to automatically remove the label. This bot runs once per day.

@github-actions github-actions bot added the Stale Issues and pull requests which have been flagged for closing due to inactivity label Sep 25, 2024
Copy link
Contributor

github-actions bot commented Oct 3, 2024

Closed as inactive. Feel free to reopen if this issue is still a concern.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Stale Issues and pull requests which have been flagged for closing due to inactivity
Projects
None yet
Development

No branches or pull requests

4 participants