-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementing W3C TraceContext (fixes #116) #180
Implementing W3C TraceContext (fixes #116) #180
Conversation
This introduces a w3c TraceContext propagator, primarily inspired by opencensus.
Currently have one typing issue I'm sorting through, but the rest should be working as intended. |
pass | ||
if context == trace.INVALID_SPAN_CONTEXT: | ||
return | ||
traceparent_string = "-".join( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
traceparent_string = "00-{:032x}-{:016x}-{:02x}".format(trace_id, span_id, trace_options)
|
||
|
||
_DELIMITER_FORMAT = "[ \t]*,[ \t]*" | ||
_MEMBER_FORMAT = "(%s)(=)(%s)" % ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that we use "".format(...)
instead of %
.
def __setitem__(self, key: str, value: str) -> None: | ||
# According to the w3c spec, we can only store 32 values | ||
if len(self) >= self.MAX_TRACESTATE_VALUES: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we want to pop the oldest value and take the new one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, by this logic the tracestate becomes immutable once it reaches the limit. However, overwriting an existing key does not change length and should work IMO.
Also I am reading the limit in the spec as a MAY, so IMO it would make sense to apply the limit during extraction only (but not prevent people from sending large tracestate).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO we should just keep TraceState being a dictionary (or ordered dictionary) and not entangle it with the W3C spec more closely than required. So the formatter would just strip superfluous values on extract (and maybe inject).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good. @unitaker your point is valid: existing keys should be overrideable. Can add.
@reyang the spec is ambiguous as to what should be done when a maximum is reached. I think either could be an issue.
@Oberon00 that is a good point. will move it to the extract / injection part. Or probably remove that limit entirely.
All that said it's a good point that this is an optional choice. I'd probably just vote to remove this restriction in the basic implementation: I worry about people not knowing the spec well enough and running into a tracecontext member limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@toumorokoshi the spec is clear about this:
Then entries SHOULD be removed starting from the end of tracestate.
Note that other truncation strategies like safe list entries, blocked list entries,
or size-based truncation MAY be used, but are highly discouraged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, I've left some minor suggestions, none of them are blocking though.
Let's try to get this reviewed and merged, and then work on smaller PR to polish it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good to me 👍 , just nitpicks.
@@ -12,7 +12,7 @@ | |||
# See the License for the specific language governing permissions and | |||
# limitations under the License. | |||
# | |||
|
|||
import re |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not using regular expressions for this, as the format should be simple enough to process with "".split("-")
. Also, I think implementing https://www.w3.org/TR/trace-context/#versioning-of-traceparent would be easier that way. But that is no strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would require checking the validity of the individual values, but I can take a look. regex is very efficient for these type of operations.
I was just curious: the spec looks like it provides no guarantees around the format of future traces. so the most we could do is split on first "-", check the version, and pass-through or ignore if the version is above 00. would that be the right forward-compatible behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think regex is the most efficient way (scientifically) to handle string parsing - if implemented using the correct Ken Thompson DFA/NFA. No strong opinion though :)
return trace.INVALID_SPAN_CONTEXT | ||
"""Extracts a valid SpanContext from the carrier. | ||
|
||
If a header |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete sentence.
if not header: | ||
return trace.INVALID_SPAN_CONTEXT | ||
|
||
match = re.search(cls._TRACEPARENT_HEADER_FORMAT_RE, header[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
match = re.search(cls._TRACEPARENT_HEADER_FORMAT_RE, header[0]) | |
match = cls._TRACEPARENT_HEADER_FORMAT_RE.fullmatch(header[0]) |
- If you have a regex object, I think it looks better to use the method instead of the the
re
module's function (I did not know that this even works) search
is bad because it may skip over any junk before the content we are searching for.match
is better because it ignores only junk after the end butfullmatch
is probably what we should use here.
EDIT: I see now you have used ^$
in the regex, but IMHO fullmatch
is still more clear than search
(and maybe more efficient).
|
||
if trace_id == "0" * 32 or span_id == "0" * 16: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this check is very useful here. I'd only check after constructing span_context
at the end, that way we can do integer comparisons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems ok. The one point there is we have to reject all values if either is invalid, as well as this skips some level of processing. But should be fine.
A string that adheres to the w3c tracestate | ||
header format. | ||
""" | ||
return ",".join(map(lambda key: key + "=" + tracestate[key], tracestate)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return ",".join(map(lambda key: key + "=" + tracestate[key], tracestate)) | |
return ",".join(key + "=" + tracestate[key] for key in tracestate) |
or
return ",".join(map(lambda key: key + "=" + tracestate[key], tracestate)) | |
return ",".join(key + "=" + value for key, value in tracestate.items()) |
@@ -250,7 +252,7 @@ def get_default(cls) -> "TraceOptions": | |||
DEFAULT_TRACE_OPTIONS = TraceOptions.get_default() | |||
|
|||
|
|||
class TraceState(typing.Dict[str, str]): | |||
class TraceState(OrderedDict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not happy about that but it seems the best we can do since there is no generic OrderedDict. (And all dicts being ordered is only guaranteed since 3.7 and implemented since 3.6).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's no need for this anymore since we're making the "TraceState" object in otel more generic.
A string that adheres to the w3c tracestate | ||
header format. | ||
""" | ||
return ",".join(map(lambda key: key + "=" + tracestate[key], tracestate)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we now allow any key/value-types in the tracestate, should we use str(key)
/str(value)
maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good point. I'll probably add this logic in a subsequent pr.
@@ -261,10 +263,39 @@ class TraceState(typing.Dict[str, str]): | |||
https://www.w3.org/TR/trace-context/#tracestate-field | |||
""" | |||
|
|||
MAX_TRACESTATE_VALUES = 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these really belong here? I think TraceState is not tightly bound to the W3C trace state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed this logic entirely.
insertion order as iteration order is not a guarantee in older version of python.
I'm merging this now to unblock. Open to any further suggestions as well. |
|
||
RFC 4.2.2: | ||
|
||
If no traceparent header is received, the vendor creates a new trace-id and parent-id that represents the current request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you wrap these comments?
for start_time and end_time Make lint happy Addressing comments Addressing comments Allowing 0 as start and end time Fix lint issues Metrics API RFC 0003 cont'd (open-telemetry#136) * Create functions Comments for Meter More comments Add more comments Fix typos * fix lint * Fix lint * fix typing * Remove options, constructors, seperate labels * Consistent naming for float and int * Abstract time series * Use ABC * Fix typo * Fix docs * seperate measure classes * Add examples * fix lint * Update to RFC 0003 * Add spancontext, measurebatch * Fix docs * Fix comments * fix lint * fix lint * fix lint * skip examples * white space * fix spacing * fix imports * fix imports * LabelValues to str * Black formatting * fix isort * Remove aggregation * Fix names * Remove aggregation from docs * Fix lint * metric changes * Typing * Fix lint * Fix lint * Add space * Fix lint * fix comments * address comments * fix comments Adding a working propagator, adding to integrations and example (open-telemetry#137) Adding a full, end-to-end example of propagation at work in the example application, including a test. Adding the use of propagators into the integrations. Metrics API RFC 0009 (open-telemetry#140) * Create functions Comments for Meter More comments Add more comments Fix typos * fix lint * Fix lint * fix typing * Remove options, constructors, seperate labels * Consistent naming for float and int * Abstract time series * Use ABC * Fix typo * Fix docs * seperate measure classes * Add examples * fix lint * Update to RFC 0003 * Add spancontext, measurebatch * Fix docs * Fix comments * fix lint * fix lint * fix lint * skip examples * white space * fix spacing * fix imports * fix imports * LabelValues to str * Black formatting * fix isort * Remove aggregation * Fix names * Remove aggregation from docs * Fix lint * metric changes * Typing * Fix lint * Fix lint * Add space * Fix lint * fix comments * handle, recordbatch * docs * Update recordbatch * black * Fix typo * remove ValueType * fix lint Console exporter (open-telemetry#156) Make use_span more flexible (closes open-telemetry#147). (open-telemetry#154) Co-Authored-By: Reiley Yang <[email protected]> Co-Authored-By: Chris Kleinknecht <[email protected]> WSGI fixes (open-telemetry#148) Fix http.url. Don't delay calling wrapped app. Skeleton for azure monitor exporters (open-telemetry#151) Add link to docs to README (open-telemetry#170) Move example app to the examples folder (open-telemetry#172) WSGI: Fix port 80 always appended in http.host (open-telemetry#173) Build and host docs via github action (open-telemetry#167) Add missing license boilerplate to a few files (open-telemetry#176) sdk/trace/exporters: add batch span processor exporter (open-telemetry#153) The exporters specification states that two built-in span processors should be implemented, the simple processor span and the batch processor span. This commit implements the latter, it is mainly based on the opentelemetry/java one. The algorithm implements the following logic: - a condition variable is used to notify the worker thread in case the queue is half full, so that exporting can start before the queue gets full and spans are dropped. - export is called each schedule_delay_millis if there is a least one new span to export. - when the processor is shutdown all remaining spans are exported. Implementing W3C TraceContext (fixes open-telemetry#116) (open-telemetry#180) * Implementing TraceContext (fixes open-telemetry#116) This introduces a w3c TraceContext propagator, primarily inspired by opencensus. fix time conversion bug (open-telemetry#182) Introduce Context.suppress_instrumentation (open-telemetry#181) Metrics Implementation (open-telemetry#160) * Create functions Comments for Meter More comments Add more comments Fix typos * fix lint * Fix lint * fix typing * Remove options, constructors, seperate labels * Consistent naming for float and int * Abstract time series * Use ABC * Fix typo * Fix docs * seperate measure classes * Add examples * fix lint * Update to RFC 0003 * Add spancontext, measurebatch * Fix docs * Fix comments * fix lint * fix lint * fix lint * skip examples * white space * fix spacing * fix imports * fix imports * LabelValues to str * Black formatting * fix isort * Remove aggregation * Fix names * Remove aggregation from docs * Fix lint * metric changes * Typing * Fix lint * Fix lint * Add space * Fix lint * fix comments * handle, recordbatch * docs * Update recordbatch * black * Fix typo * remove ValueType * fix lint * sdk * metrics * example * counter * Tests * Address comments * ADd tests * Fix typing and examples * black * fix lint * remove override * Fix tests * mypy * fix lint * fix type * fix typing * fix tests * isort * isort * isort * isort * noop * lint * lint * fix tuple typing * fix type * black * address comments * fix type * fix lint * remove imports * default tests * fix lint * usse sequence * remove ellipses * remove ellipses * black * Fix typo * fix example * fix type * fix type * address comments Implement Azure Monitor Exporter (open-telemetry#175) Span add override parameters for start_time and end_time (open-telemetry#179) CONTRIBUTING.md: Fix clone URL (open-telemetry#177) Add B3 exporter to alpha release table (open-telemetry#164) Update README for alpha release (open-telemetry#189) Update Contributing.md doc (open-telemetry#194) Add **simple** client/server examples (open-telemetry#191) Remove unused dev-requirements.txt (open-telemetry#200) The requirements are contained in tox.ini now. Fx bug in BoundedList for Python 3.4 and add tests (open-telemetry#199) * fix bug in BoundedList for python 3.4 and add tests collections.deque.copy() was introduced in python 3.5, this commit changes that by the deque constructor and adds some tests to BoundedList and BoundedDict to avoid similar problems in the future. Also, improve docstrings of BoundedList and BoundedDict classes Move util.time_ns to API. (open-telemetry#205) Add Jaeger exporter (open-telemetry#174) This adds a Jeager exporter for OpenTelemetry. This exporter is based on https://github.com/census-instrumentation/opencensus-python/tree/master/contrib/opencensus-ext-jaeger. The exporter uses thrift and can be configured to send data to the agent and also to a remote collector. There is a long discussion going on about how to include generated files in the repo, so for now just put them here. Add code coverage Revert latest commit Fix some "errors" found by mypy. (open-telemetry#204) Fix some errors found by mypy (split from open-telemetry#201). Update README for new milestones (open-telemetry#218) Refactor current span handling for newly created spans. (open-telemetry#198) 1. Make Tracer.start_span() simply create and start the Span, without setting it as the current instance. 2. Add an extra Tracer.start_as_current_span() to create the Span and set it as the current instance automatically. Co-Authored-By: Chris Kleinknecht <[email protected]> Add set_status to Span (open-telemetry#213) Initial commit Initial version
This introduces a w3c TraceContext propagator, primarily inspired by opencensus.