[ENH] OTel tracing throughout the codebase #1238
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
    return decorator


def add_attributes_to_current_span(attributes: Dict[str, Any]) -> None:
Discussed offline: make this strongly typed and put the onus of conversion on the caller; raise an error internally if a value is not one of the expected types.
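One way to read this suggestion, as a minimal sketch: restrict the accepted value types to those the OpenTelemetry attribute model allows (str, bool, int, float, and homogeneous sequences of them), and raise if the caller passes anything else. The names `AttributeValue` and `validate_attributes` are hypothetical and not taken from this PR; attaching the validated attributes to the current span is left out so the example stays self-contained.

```python
from typing import Dict, Sequence, Union

# Value types permitted for span attributes: primitives and homogeneous
# sequences of primitives. `AttributeValue` is a hypothetical alias.
AttributeValue = Union[
    str, bool, int, float,
    Sequence[str], Sequence[bool], Sequence[int], Sequence[float],
]

_PRIMITIVES = (str, bool, int, float)


def validate_attributes(
    attributes: Dict[str, AttributeValue],
) -> Dict[str, AttributeValue]:
    """Return the attributes unchanged, or raise TypeError on a bad value."""
    for key, value in attributes.items():
        if not isinstance(key, str):
            raise TypeError(f"attribute key {key!r} must be a str")
        if isinstance(value, _PRIMITIVES):
            continue
        if isinstance(value, (list, tuple)):
            # A sequence is accepted only if all items share one primitive type.
            kinds = {type(item) for item in value}
            if len(kinds) <= 1 and all(kind in _PRIMITIVES for kind in kinds):
                continue
        # No implicit conversion: the caller is responsible for converting.
        raise TypeError(
            f"attribute {key!r} has unsupported type {type(value).__name__}; "
            "convert it before calling"
        )
    return attributes
```

With this shape, a caller holding a dict or a custom object as a value would have to convert it (for example with `str(...)`) before calling the helper, rather than relying on the helper to guess.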
    otel_collection_headers: Optional[Dict[str, str]],
    otel_granularity: OpenTelemetryGranularity,
) -> None:
    """Initializes module-level state for OpenTelemetry."""
Document params
Description of changes
This PR adds OpenTelemetry tracing to ~all major methods throughout our codebase. It also adds configuration to specify where these traces should be sent. I focused more on laying the groundwork for tracing than on collecting all the data we need everywhere.
Default behavior is unchanged: no tracing, no printing.
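To illustrate how tracing can stay off by default, here is a minimal sketch of a granularity-gated decorator. It does not use the OpenTelemetry SDK: spans are recorded in a plain list so the example is self-contained. The enum members, the simplified `otel_init` signature, and the `trace_method` name are assumptions for illustration, not the PR's actual code.

```python
from enum import IntEnum
from functools import wraps
from typing import Any, Callable, List, TypeVar, cast

F = TypeVar("F", bound=Callable[..., Any])


class Granularity(IntEnum):
    """Hypothetical tracing levels; higher values trace more."""
    NONE = 0
    OPERATION = 1
    ALL = 2


# Module-level state. Defaults to NONE, so nothing is traced until init.
_granularity = Granularity.NONE
recorded_spans: List[str] = []  # stand-in for a real span exporter


def otel_init(granularity: Granularity) -> None:
    """Set module-level tracing state (simplified, hypothetical signature)."""
    global _granularity
    _granularity = granularity


def trace_method(name: str, granularity: Granularity) -> Callable[[F], F]:
    """Record a span for the wrapped call if the configured level allows it."""
    def decorator(func: F) -> F:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Checked at call time, so init may happen after decoration.
            if _granularity != Granularity.NONE and _granularity >= granularity:
                recorded_spans.append(name)
            return func(*args, **kwargs)
        return cast(F, wrapper)
    return decorator


@trace_method("Collection.add", Granularity.OPERATION)
def add(x: int, y: int) -> int:
    return x + y
```

Before `otel_init` is called, `add(1, 2)` runs with no span recorded; after `otel_init(Granularity.OPERATION)`, each call appends one entry to `recorded_spans`.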
The only major design decision I made was to fully separate the OpenTelemetry code from the Posthog (product telemetry) code.
Justification:
It's tempting to combine OTel and product telemetry behind a single internal interface. I don't think this coupling is worth it. Product telemetry cares about a small and relatively static set of uses, whereas tracing by nature should be very deep in our codebase. I see two ways to couple them and problems with each:
collection.add()s, collection.delete()s, etc. This seems weird to me: tracing code would be implicit in some cases but explicit in others, making the codebase less easily comprehensible. Also, if an engineer later decides to add product telemetry to a codepath that already has tracing, they need to know to remove the existing tracing. This increases the cognitive overhead required to work on Chroma, reducing the readability and maintainability of our codebase.
Instead we have two well-encapsulated telemetry modules which we can modify and use without worrying about the other telemetry. The OTel module provides some lightweight helpers to make OTel a little easier to use, but we can put raw OTel code throughout our codebase and it'll play nicely.
Test plan
How are these changes tested?
pytest for Python, yarn test for JS
Manual testing:
various granularities.
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository?
Docs PR to land before this does.