Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid use of trace context static methods in favor of Tracer-defined behavior #3082

Merged
merged 13 commits into from
Apr 17, 2023

Conversation

raphw
Copy link
Contributor

@raphw raphw commented Mar 28, 2023

Currently, the agent-core version of TraceContext offers different constant values such as header names, and static methods to operate on these headers. This hard-codes plugins to use specific header names, and also leaves some interpretation towards how these names should be translated if they contain characters that are not otherwise allowed.

This change avoids all use of the TraceContext class and rather delegates to the Tracer API. The ElasticApmTracer in turn simply implements the new methods by using the existing TraceContext implementation. Additionally, the existing static constants are replaced by a getter via the Tracer API.

In this context, I found one detail that I did not fully understand. In multiple plugins, there is a propagation of W3C_TRACE_PARENT_TEXTUAL_HEADER_NAME and ELASTIC_TRACE_PARENT_TEXTUAL_HEADER_NAME inline in code. Within the copy/remove/contains method, a TRACESTATE_HEADER_NAME is however also considered. Is this meant to happen? I also wonder why containsTraceContextTextHeaders only checks for W3C_TRACE_PARENT_TEXTUAL_HEADER_NAME.

The currently failing SQS tests are an example for this: they transfer W3C_TRACE_PARENT_TEXTUAL_HEADER_NAME and TRACESTATE_HEADER_NAME, but not ELASTIC_TRACE_PARENT_TEXTUAL_HEADER_NAME. When is what header used?

Also, I found that different plugins use custom flavours of headers. I moved all this to a TraceHeaderDisplay which encapsulates the logic for all flavors that are used throughout the project, such that all methods become "flavor sensitive".

@github-actions github-actions bot added agent-java community Issues and PRs created by the community triage labels Mar 28, 2023
@github-actions
Copy link

👋 @raphw Thanks a lot for your contribution!

It may take some time before we review a PR, so even if you don’t see activity for some time, it does not mean that we have forgotten about it.

Every once in a while we go through a process of prioritization, after which we are focussing on the tasks that were planned for the upcoming milestone. The prioritization status is typically reflected through the PR labels. It could be pending triage, a candidate for a future milestone, or have a target milestone set to it.

@apmmachine
Copy link
Contributor

apmmachine commented Mar 28, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview previewSnapshots

Expand to view the summary

Build stats

  • Start Time: 2023-04-17T13:45:27.126+0000

  • Duration: 14 min 7 sec

❕ Flaky test report

No test was executed to be analysed.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • run benchmark tests : Run the benchmark tests.

  • run jdk compatibility tests : Run the JDK Compatibility tests.

  • run integration tests : Run the Agent Integration tests.

  • run end-to-end tests : Run the APM-ITs.

  • run windows tests : Build & tests on windows.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@JonasKunz
Copy link
Contributor

In this context, I found one detail that I did not fully understand. In multiple plugins, there is a propagation of W3C_TRACE_PARENT_TEXTUAL_HEADER_NAME and ELASTIC_TRACE_PARENT_TEXTUAL_HEADER_NAME inline in code. Within the copy/remove/contains method, a TRACESTATE_HEADER_NAME is however also considered. Is this meant to happen? I also wonder why containsTraceContextTextHeaders only checks for W3C_TRACE_PARENT_TEXTUAL_HEADER_NAME.

The currently failing SQS tests are an example for this: they transfer W3C_TRACE_PARENT_TEXTUAL_HEADER_NAME and TRACESTATE_HEADER_NAME, but not ELASTIC_TRACE_PARENT_TEXTUAL_HEADER_NAME. When is what header used?

You can find an explanation of the usage of the headers in this spec.
To quickly summarize the most important parts:

  • W3C_TRACE_PARENT_TEXTUAL_HEADER_NAME is the current way of propagating the actual TraceContext (e.g. traceId, spanId)
  • ELASTIC_TRACE_PARENT_TEXTUAL_HEADER_NAME is the legacy way of propagating the TraceContext. We switched from using this header to W3C_TRACE_PARENT_TEXTUAL_HEADER_NAME three years ago. For backwards compatibility reasons, our agent still looks for the header when decoding (e.g. if a request is received from an ancient APM agent) and has a configuration option to optionally use this header when encoding.
  • TRACESTATE_HEADER_NAME is used to propagate additonal information in addition to the standard TraceContext. For example, elastic uses it to propagate the used sampling rate for probabilistic sampling. This sampling rate is used in our APM server to extrapolate the sampled transactions to provide accurate metrics (e.g. if 10 Transaction per second are observed with a sampling rate of 0.1, the actual throughput is 100 Transactions/sec)

@JonasKunz JonasKunz self-assigned this Apr 13, 2023
@raphw
Copy link
Contributor Author

raphw commented Apr 14, 2023

I understand this better now, thanks for the clarifications. I agree with your arguments in the respect that the responsibility for the very (optional) encoding should lie with the plugins. It feels more right, and the enum does not cover all possible cases anyways.

I do however not agree that the need for such encoding will disappear. Imagine, for example, that you wanted to write some plugin for some service that only accepts captialized names as headers. In this case, the W3C headers would still need to be adjusted. I also think that the plugins for binary or JMS should keep a sanity check, in case some future header would, for some reason, fail some constraint. I don't think that the W3C has all JVM technologies in the back of their head when they write their specifications.

I think one of the reason I went for this approach was to avoid allocation during tracing, but revisiting, the encoding still allocates. With wrapping header getters and setters, the relevant headers can however be translated ahead of time and then it's only an (allocation free) map lookup from there. The objects are already immutable and not reallocated anyways.

What I would try to do would be to reduce the Tracer to only the getter for the headers. The contract would then be that the context propagation uses these headers and the header setters/getters have to translate these headers on the fly if that is what's relevant for their application.

@raphw
Copy link
Contributor Author

raphw commented Apr 14, 2023

I removed the encoding and let the plugins handle the header name translation now. This seems much cleaner to me, as you suggested.

raphw and others added 2 commits April 14, 2023 03:30
# Conflicts:
#	apm-agent-plugins/apm-spring-webflux/apm-spring-webflux-testapp/src/main/java/co/elastic/apm/agent/springwebflux/testapp/GreetingAnnotated.java
#	apm-agent-plugins/apm-spring-webflux/apm-spring-webflux-testapp/src/main/java/co/elastic/apm/agent/springwebflux/testapp/GreetingFunctional.java
#	apm-agent-plugins/apm-vertx/apm-vertx-common/src/main/java/co/elastic/apm/agent/vertx/AbstractVertxWebHelper.java
@JonasKunz
Copy link
Contributor

I do however not agree that the need for such encoding will disappear. Imagine, for example, that you wanted to write some plugin for some service that only accepts captialized names as headers. In this case, the W3C headers would still need to be adjusted. I also think that the plugins for binary or JMS should keep a sanity check, in case some future header would, for some reason, fail some constraint. I don't think that the W3C has all JVM technologies in the back of their head when they write their specifications.

Good points, I agree 100% here.

@JonasKunz
Copy link
Contributor

run elasticsearch-ci/docs

@JonasKunz JonasKunz merged commit 01b7db0 into elastic:main Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent-java community Issues and PRs created by the community triage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants