messaging.client_id -> messaging.client.id rename causes issues with code generation #1031
Comments
Looks like this change has not yet been released so there is still time to do something to avoid the collision. I would greatly appreciate it if it could be taken into consideration with the new name. |
it looks like this has happened before:
but hasn't caused a codegen problem yet because the deprecated attributes were just completely dropped from the yaml files in these cases |
I'd be interested to see if this affects other language code generators |
Yeah I think it's likely cc @jack-berg |
It's interesting to note that from the perspective of the user of the generated semconvs, this scenario is ideal because it does not require changing references to these constants. Wouldn't one possible approach be to consider that in the case of such a conflict, only the non-experimental version is retained? |
I agree this may be the best option
the non-ideal part is that it will automatically change (some of) the emitted telemetry to a newer schema version while the instrumentation is still emitting an older schema version url |
This looks to be affecting the Go code generation: https://github.com/open-telemetry/opentelemetry-go/actions/runs/9180409074/job/25244759934?pr=5394 |
Previously, when making renames of this variety, we DROPPED the old attribute. Now we do not. I see a few paths forward here:
In any case, I'll take the blame for not having more discussion of this issue prior to release. It's my opinion that the correct short (and longer term fix) will be on the codegen side. I shouldn't have forced that decision though. |
Also affecting PHP codegen, for the same reasons as others: |
In SemConv we differentiate between
I don't think Option 1 is a good solution for SemConv, as So, for the short term, that would leave us (IMHO) with options 2 or 5. |
maybe:
it's not perfect, and we could still end up back here if there's a rename from
|
We had some discussion on this in the semconv tooling group. I think there's a few options for how to rename keys, particularly:
No matter the path forward, we're going to pull together a quick statistic of how many |
Follow-up from the SC tooling meeting: It was asked how many attributes currently have an underscore, to measure how the above changes might affect generated code:
In addition to a number of prefixes with |
I would also consider there are a number of attributes that might become a namespace of another attribute if we convert
A blanket rewriting could make I think we might want to make a more flexible template, maybe one that lets the users of codegen specify how the normalization works best for their language. |
That sounds nice but we're getting 2 generated constants that conflict with each other and cause compilation issues. We would have to then post-process the generated code to remove conflicts, which seems clunky at best. If the generator could handle these collisions on its own maybe it would be ok
I agree we don't want the telemetry to change out from under the user. That seems likely to result in telemetry that doesn't match what its schema URL claims.
Both of these examples seem the opposite of what I would expect. I was surprised to see 1.26 released with this known issue. Will there be a 1.26.1 to rectify it? |
@brettmc keep in mind you're running into the situation mentioned above where users are going to have telemetry changed underneath them without realizing it. I'd caution against this. |
1.26 was released assuming this is a codegen specific issue as we've made renames like this in the past. (see my apology above for making the decision, perhaps preemptively). I still think this is an issue with codegen, but I'm asking the other semconv maintainers their opinion on backing off the change for now until a solution is found. |
cc @open-telemetry/specs-semconv-maintainers |
It would be nice if this could be handled by codegen, but keep in mind that changing the way the codegen works is thorny for languages which already have released stable semconv packages. It means likely deprecating all old names and moving to the new style, which results in a lot of unneeded work in instrumentations to follow the new naming scheme. |
Great point! It seems JavaScript is the only affected language. Given that it uses old tooling/templates and separates resource/other attributes into two different files, would it be fair to say that some breaking changes are inevitable there @dyladan ? If so, this and other changes can be batched together and released as a semconv v2 package. Since (it seems) the cost of breaking is still low, I think we should disambiguate and make sure that different attributes are guaranteed to have different constant names. The alternative I see is to tolerate the downside @trask brought up
We should never rename a stable attribute, and this would be a minor disturbance for experimental ones. Still, it might be surprising for users that their query no longer works even though the attribute constant name has not changed, and I'd prefer to fix it if we can. |
This also affects PHP and Go at least. I suspect it also affects others. Separating resource/other attributes into separate files is an unrelated issue though. The issue is that we need both old and new names in order to handle the double-emit telemetry for the compatibility story. JS is already planning to change how we generate semconv in the future (PR: open-telemetry/opentelemetry-js#4690). We're going to keep the old names around and mark them as deprecated, but the new names are causing this problem. See #1064 to see how we're generating the new names. I believe both the old and new generation scheme would have the same problems though.
I'm not sure I agree that the cost of the break is "low" because the level of surprise would be quite high if we changed names out from under users without them making code changes. My preferred fix would be to disallow any and all collisions, including with deprecated names, where non-alphanumeric characters are treated the same. For example |
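A check along these lines could be sketched as follows. This is only an illustration of the "treat non-alphanumeric characters the same" idea; `collision_key` and `find_collisions` are hypothetical names, not functions from build-tools or any existing policy check.

```python
import re
from collections import defaultdict


def collision_key(attr_name: str) -> str:
    """Normalize a name so '.', '_' and any other non-alphanumeric
    characters are ignored when comparing (hypothetical helper)."""
    return re.sub(r"[^a-z0-9]", "", attr_name.lower())


def find_collisions(attr_names):
    """Group attribute names by normalized key and return only the
    groups that contain more than one distinct name."""
    groups = defaultdict(list)
    # dict.fromkeys drops exact duplicates while preserving order
    for name in dict.fromkeys(attr_names):
        groups[collision_key(name)].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    # Deprecated names are included on purpose: they must not collide either.
    names = ["messaging.client_id", "messaging.client.id", "messaging.system"]
    print(find_collisions(names))
    # {'messagingclientid': ['messaging.client_id', 'messaging.client.id']}
```

Feeding it both current and deprecated names means the check does not depend on any one language's naming style, since every style discussed in this thread derives its constant from the same alphanumeric characters.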
In Go we release separate versions of semconv as separate packages. Dropping deprecated values would be acceptable for us in this situation given a user will need to explicitly make the upgrade by switching packages. |
@dyladan If I understand your reply, we're talking about the same solution:
|
Opentelemetry-cpp was affected also: Generation for the old name was disabled in the template.
|
Adding to this: I started writing a codegen for C# and I'm running into the same issue. The only way around it given the information at the time of rendering the template is to disable rendering of any deprecated attributes to avoid name clashes. |
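A narrower variant of that workaround, dropping only the deprecated attributes whose generated constant collides with a non-deprecated one, might look like the sketch below. The `Attribute` class, `const_name`, and `drop_colliding_deprecated` are hypothetical names for illustration; `const_name` assumes the uppercase-with-underscores style that yields MESSAGING_CLIENT_ID in the issue description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """Hypothetical stand-in for an attribute as seen by a template."""
    name: str
    deprecated: bool = False


def const_name(attr_name: str) -> str:
    # Assumed naming style: uppercase, dots become underscores.
    return attr_name.upper().replace(".", "_")


def drop_colliding_deprecated(attributes):
    """Keep every attribute except deprecated ones whose constant name
    is already taken by a non-deprecated attribute."""
    live = {const_name(a.name) for a in attributes if not a.deprecated}
    return [
        a for a in attributes
        if not (a.deprecated and const_name(a.name) in live)
    ]


if __name__ == "__main__":
    attrs = [
        Attribute("messaging.client_id", deprecated=True),
        Attribute("messaging.client.id"),
        Attribute("old.thing", deprecated=True),  # no collision, so it is kept
    ]
    print([a.name for a in drop_colliding_deprecated(attrs)])
    # ['messaging.client.id', 'old.thing']
```

As noted earlier in the thread, the surviving constant then refers to the new attribute key, so telemetry can change without the user touching their code.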
It should be possible to modify the function that generates the constant name. It will affect existing attributes, other than The tooling will provide the proper function for it, so this would be a workaround. What's important is to agree on the consistent formatting. For languages that use camelCase or PascalCase it could probably be formatted as it can be achieved with existing tooling with a macro similar to
{%- macro to_const_name_v2(attr_name) -%}
{%- set ns=namespace(up=True) -%}
{%- for l in attr_name -%}
{%- if ns.up -%}
{{l | upper}}
{%- elif l != '.' -%}
{{l}}
{%- endif -%}
{%- set ns.up=(l=='.' or l=='_') -%}
{%- endfor -%}
{%- endmacro %}
In any case, please do share your thoughts on the format we should provide in tooling (whether |
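For readers less familiar with Jinja, the macro above can be read as the following Python function. This is a line-by-line port written for illustration, not code from build-tools; it keeps the macro's name only to make the correspondence obvious.

```python
def to_const_name_v2(attr_name: str) -> str:
    """Python port of the Jinja macro: capitalize the first character and
    any character following '.' or '_', drop dots, keep underscores."""
    out = []
    upper_next = True  # mirrors ns.up in the macro
    for ch in attr_name:
        if upper_next:
            out.append(ch.upper())
        elif ch != ".":
            out.append(ch)
        # The next character is capitalized after a '.' or '_' separator.
        upper_next = ch in "._"
    return "".join(out)


if __name__ == "__main__":
    print(to_const_name_v2("messaging.client_id"))  # MessagingClient_Id
    print(to_const_name_v2("messaging.client.id"))  # MessagingClientId
```

Because the underscore survives while the dot is dropped, the old and new attribute names map to different constants.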
I think this needs more investigation. For example, there are collisions between |
They will result in different names: The alternative is to do something like |
Note this also affects class names
|
Correct, I missed the fact that the So, to summarize: Semantic conventions:
can be generated as:
or as:
depending on the language style (CamelCase, UPPERCASE). This is a breaking change for every semantic convention that contains a The breaking change cannot be avoided, by definition: the mapping for one of the colliding names has to change. @lmolkova This solution will work for us (opentelemetry-cpp). |
Assuming this is satisfactory for all SIG, could we have a new release of https://github.com/open-telemetry/build-tools, so that the primitives that convert names are adjusted (or new primitives are provided) ? Then each SIG can use the fixed primitives to generate code that disambiguates collisions. cc @open-telemetry/cpp-maintainers |
Thanks! I will use this for the time being until this is fixed in build-tools. I agree with @marcalff that the |
Discussed at the SemConv and maintainers SIGs:
|
The recommendation for Motivation:
semantic-conventions/docs/messaging/messaging-spans.md Lines 38 to 43 in cde003c
Example on how to implement configurable dropping in Jinja - https://github.com/crossoverJie/semantic-conventions-java/pull/1/files See #1118 (comment) for discussion on the general issue (and steps we're taking to prevent future collisions). |
closing this one: see #1031 (comment) for client_id specific guidance and #1118 (comment) for the future approach. For the time being such changes are prohibited and guarded with a policy check in CI #1209 |
Area(s)
area:messaging
What happened?
Last week in #948 messaging.client_id was renamed to messaging.client.id. In the JS code generator, we use {{ attribute.fqn | to_const_name }} to generate variable names. This results in conflicting constants with the same name MESSAGING_CLIENT_ID. I'm not sure anything can be done about this, but wanted to raise it to the semconv group as this is the first time I've seen such a conflict.
Semantic convention version
main
Additional context
We're currently updating our semconv package. We continue to export deprecated attributes in order to make changes to the package non-breaking. The conflicting names get in the way of the code generator. I don't want to special-case the name for a single attribute if I can avoid it, and the "good" name is currently squatted on by the old deprecated attribute.