Make OTLP exporter memory mode API public #6469
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #6469      +/-   ##
============================================
- Coverage     90.86%   90.85%   -0.01%
+ Complexity     6169     6154      -15
============================================
  Files           678      675       -3
  Lines         18507    18454      -53
  Branches       1818     1813       -5
============================================
- Hits          16816    16766      -50
+ Misses         1154     1151       -3
  Partials        537      537
☔ View full report in Codecov by Sentry.
thanks!
 * <p>>When memory mode is {@link MemoryMode#REUSABLE_DATA}, serialization is optimized to reduce
 * memory allocation.
 */
public OtlpHttpLogRecordExporterBuilder setMemoryMode(MemoryMode memoryMode) {
Do I understand correctly that this is part of public api of a stable module and once we have added this method we can't easily get rid of it? If so then perhaps it would be better to not tie this to implementation details like immutable or reusable data, but rather something abstract like minimize allocations and maximize throughput (I guess this would only make sense if the one that allocates more has a bit better performance). That way the behavior of this method could more easily accommodate future changes which could, for example, include deleting one of the memory modes.
> Do I understand correctly that this is part of public api of a stable module and once we have added this method we can't easily get rid of it?
We can't get rid of it without a major version bump, which we don't plan on doing anytime soon.
> perhaps it would be better to not tie this to implementation details like immutable or reusable data, but rather something abstract like minimize allocations and maximize throughput (I guess this would only make sense if the one that allocates more has a bit better performance). That way the behavior of this method could more easily accommodate future changes which could, for example, include deleting one of the memory modes.
The MemoryMode enum is also part of the stable API at this point, so we can't delete the memory modes. In hindsight it might have been preferable to choose a name like MemoryMode.LOW_ALLOCATION instead of MemoryMode.REUSABLE_DATA. LOW_ALLOCATION describes the intended outcome (more user facing), whereas REUSABLE_DATA describes what is happening under the covers. The flip side of this is that REUSABLE_DATA communicates some important information to MetricReader / MetricExporter implementations: we're going to reuse MetricData classes so they won't function right after the CompletableResultCode from MetricExporter.export() resolves. So there are benefits both to naming for the outcome and to naming for how the outcome is accomplished.
REUSABLE_DATA doesn't describe what's happening with the serializers as well as it describes what's happening with the metrics SDK, where it was originally introduced, but it's still somewhat accurate. One thing that sticks out is that with the metrics SDK, implementers of MetricReader / MetricExporter have to be aware of the semantics of REUSABLE_DATA. With serializers, there's no impact on user-facing semantics. The only indication a user can see that something has changed is a shift in CPU / memory behavior.
Overall, I think it's preferable to have one memory mode configuration concept, even if the words we chose for that concept (IMMUTABLE_DATA and REUSABLE_DATA) don't perfectly describe all the places where we make it configurable.
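To make the REUSABLE_DATA caveat for MetricReader / MetricExporter implementers concrete, here is a minimal, self-contained sketch. It does not use the OpenTelemetry SDK: `MutablePoint`, `RetainingExporter`, and `ReusableDataDemo` are hypothetical stand-ins invented for illustration, and the real MetricData reuse may differ in detail. It only shows why holding a reference to a reused object after export returns is unsafe, and why copying values out during export is safe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a reusable data holder; NOT the SDK's MetricData.
final class MutablePoint {
  private String name;
  private long value;

  void set(String name, long value) {
    this.name = name;
    this.value = value;
  }

  String getName() {
    return name;
  }

  long getValue() {
    return value;
  }
}

// Hypothetical exporter that keeps what it was handed after export() returns.
final class RetainingExporter {
  final List<MutablePoint> retainedReferences = new ArrayList<>();
  final List<Long> retainedCopies = new ArrayList<>();

  void export(List<MutablePoint> batch) {
    for (MutablePoint point : batch) {
      retainedReferences.add(point);        // unsafe if the producer reuses the object
      retainedCopies.add(point.getValue()); // safe: the value is copied out during export
    }
  }
}

final class ReusableDataDemo {
  // Simulates two collection cycles in which the producer reuses one object.
  static RetainingExporter runTwoCollections() {
    RetainingExporter exporter = new RetainingExporter();
    MutablePoint reused = new MutablePoint();
    List<MutablePoint> batch = new ArrayList<>();
    batch.add(reused);

    reused.set("requests", 10);
    exporter.export(batch);

    reused.set("requests", 25); // the second cycle overwrites the same instance
    exporter.export(batch);
    return exporter;
  }

  public static void main(String[] args) {
    RetainingExporter exporter = runTwoCollections();
    // The reference retained from the first export now shows the second cycle's value.
    System.out.println(exporter.retainedReferences.get(0).getValue()); // prints 25
    // The value copied during the first export is unchanged.
    System.out.println(exporter.retainedCopies.get(0)); // prints 10
  }
}
```

An exporter written like `retainedCopies` behaves the same under either memory mode, which is the property implementers need once data may be reused.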
> The flip side of this is that REUSABLE_DATA communicates some important information to MetricReader / MetricExporter implementations: we're going to reuse MetricData classes so they won't function right after the CompletableResultCode from MetricExporter.export() resolves
👍
The newly added memory mode setting works without a hitch as far as I can tell.
I talk about its benefits in a new blog post on opentelemetry.io called "Java Metric Systems Compared" - currently pending PR review open-telemetry/opentelemetry.io#4512.
Some considerations for making the API public:
MetricData after MetricExporter#export(Collection<MetricData>) returns. Although this type of thing is an edge case, it's still valid. Allowing memory mode to be configurable to IMMUTABLE_DATA acts as an escape hatch.
Logs and traces don't have the same use case, but maybe future JVMs can get so good at escape analysis that immutable data ends up outperforming data structure reuse. It seems harmless to make memory mode configurable for traces and logs.
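As a rough illustration of the escape hatch, the sketch below models a builder with a `setMemoryMode` option using local stand-in types. `SketchMemoryMode`, `SketchExporter`, and `SketchExporterBuilder` are hypothetical names, not the OpenTelemetry classes; the default mode and the buffer-reuse strategy are assumptions made only for this example. The point is that the caller sees identical output either way, and only the allocation behavior changes.

```java
import java.util.Objects;

// Local stand-ins so the sketch compiles on its own; NOT the OpenTelemetry classes.
enum SketchMemoryMode {
  IMMUTABLE_DATA,
  REUSABLE_DATA
}

final class SketchExporter {
  private final SketchMemoryMode memoryMode;
  private final StringBuilder reusableBuffer = new StringBuilder();

  SketchExporter(SketchMemoryMode memoryMode) {
    this.memoryMode = memoryMode;
  }

  SketchMemoryMode getMemoryMode() {
    return memoryMode;
  }

  // Serializes a record body and returns the buffer that holds the result.
  CharSequence serialize(String body) {
    StringBuilder buffer;
    if (memoryMode == SketchMemoryMode.REUSABLE_DATA) {
      reusableBuffer.setLength(0); // reset and reuse the same buffer
      buffer = reusableBuffer;
    } else {
      buffer = new StringBuilder(); // allocate a fresh buffer per call
    }
    buffer.append("{\"body\":\"").append(body).append("\"}");
    return buffer;
  }
}

final class SketchExporterBuilder {
  // The default chosen here is an assumption for the sketch only.
  private SketchMemoryMode memoryMode = SketchMemoryMode.IMMUTABLE_DATA;

  SketchExporterBuilder setMemoryMode(SketchMemoryMode memoryMode) {
    this.memoryMode = Objects.requireNonNull(memoryMode, "memoryMode");
    return this;
  }

  SketchExporter build() {
    return new SketchExporter(memoryMode);
  }
}

final class MemoryModeBuilderDemo {
  public static void main(String[] args) {
    SketchExporter exporter =
        new SketchExporterBuilder().setMemoryMode(SketchMemoryMode.REUSABLE_DATA).build();
    System.out.println(exporter.serialize("hello")); // prints {"body":"hello"}
  }
}
```

Switching the argument to `SketchMemoryMode.IMMUTABLE_DATA` is the whole opt-out in this sketch, which is what makes a configurable mode cheap to offer for logs and traces as well.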