Extrapolate Jaeger transaction count from reported sampling rate #3722
The rounding error on this is going to be pretty bad for some sampling rates (e.g. 0.3, 0.4), so it might be worth scaling up all counts recorded in the in-memory histograms when recording (e.g. multiplying by 100), and then scaling them back down when creating metricset docs.
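To make the rounding concern concrete, here is a minimal sketch in Go. It assumes each sampled transaction is recorded with an integer count of round(1/rate); the names `countScale`, `naiveTotal` and `scaledTotal` are hypothetical and are not taken from the apm-server code.

```go
package main

import (
	"fmt"
	"math"
)

// countScale is a hypothetical fixed-point factor, following the
// "multiply by 100" suggestion in the comment above.
const countScale = 100

// naiveTotal rounds each transaction's extrapolated count (1/rate) to an
// integer before summing, which is where the rounding error comes from.
func naiveTotal(sampled int, rate float64) int64 {
	perTx := int64(math.Round(1 / rate))
	return int64(sampled) * perTx
}

// scaledTotal records round(countScale/rate) per transaction and only
// scales back down once, when the total is read out.
func scaledTotal(sampled int, rate float64) int64 {
	perTx := int64(math.Round(countScale / rate))
	total := int64(sampled) * perTx
	return int64(math.Round(float64(total) / countScale))
}

func main() {
	// 300 sampled transactions at a rate of 0.3 stand for about 1000 real ones.
	fmt.Println(naiveTotal(300, 0.3))  // 900
	fmt.Println(scaledTotal(300, 0.3)) // 999
}
```

With a rate of 0.3, the unscaled version loses a third of a transaction per recorded sample, while the scaled version keeps two extra decimal digits until the counts are read back.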
Codecov Report
@@ Coverage Diff @@
## master #3722 +/- ##
==========================================
- Coverage 80.35% 80.30% -0.06%
==========================================
Files 131 131
Lines 6022 6042 +20
==========================================
+ Hits 4839 4852 +13
- Misses 1183 1190 +7
This looks like a fairly straightforward way to approximate the number of transactions. I'm not sure whether it could lead to problematic edge cases, but I couldn't think of any that this handling would trigger. It would be great to see this get in, to provide better APM UI usability for data collected by Jaeger agents.
69b8f86 to f266588
💚 Build Succeeded
6c7b5cf to e3aedd6
Set Transaction.RepresentativeCount based on the value of sampler.param if sampler.type=probabilistic.
e3aedd6 to dcb130d
Once the UI side of this is in, the histograms will be used for RPM graphs, which will take sampling into account.
Docs LGTM
…stic#3722)
* model: add Transaction.RepresentativeCount field
* jaeger: set Transaction.RepresentativeCount
  Set Transaction.RepresentativeCount based on the value of sampler.param if sampler.type=probabilistic.
* aggregation/txmetrics: use RepresentativeCount
* Update changelog
* docs: remove caveats about Jaeger sampling & RPMs
  Once the UI side of this is in, the histograms will be used for RPM graphs, which will take sampling into account.
* Fix/add comments
…) (#3932)
* model: add Transaction.RepresentativeCount field
* jaeger: set Transaction.RepresentativeCount
  Set Transaction.RepresentativeCount based on the value of sampler.param if sampler.type=probabilistic.
* aggregation/txmetrics: use RepresentativeCount
* Update changelog
* docs: remove caveats about Jaeger sampling & RPMs
  Once the UI side of this is in, the histograms will be used for RPM graphs, which will take sampling into account.
* Fix/add comments
…ate (elastic#3722)" This reverts commit 76ac96e.
Motivation/summary
When Jaeger is configured to sample a percentage of traces, the statistics reported in the APM UI are proportional to the sampling rate and do not reflect the actual number of operations. Jaeger reports the sampling rate as a pair of tags (sampler.type and sampler.param) with each span. We can use these to extrapolate the number of transactions when performing aggregations.
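The extrapolation described above can be sketched in Go as follows. This is only an illustration under stated assumptions: `representativeCount` is a hypothetical helper, the sampler tags are passed in as already-decoded values, and falling back to a count of 1 for non-probabilistic or invalid samplers is an assumption, not necessarily what apm-server does.

```go
package main

import "fmt"

// representativeCount returns how many real transactions one sampled
// transaction stands for, based on Jaeger's sampler.type and sampler.param
// tags. Hypothetical helper for illustration only.
func representativeCount(samplerType string, samplerParam float64) float64 {
	if samplerType != "probabilistic" {
		// Assumption: other sampler types are not extrapolated.
		return 1
	}
	if samplerParam <= 0 || samplerParam > 1 {
		// Assumption: an out-of-range probability is ignored.
		return 1
	}
	// A transaction sampled with probability p represents 1/p transactions.
	return 1 / samplerParam
}

func main() {
	// 250 transactions sampled at a rate of 0.25 stand for 1000 operations.
	total := 0.0
	for i := 0; i < 250; i++ {
		total += representativeCount("probabilistic", 0.25)
	}
	fmt.Println(total) // 1000
}
```

An aggregation would then add this value to a histogram bucket instead of incrementing by one for each transaction.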
Checklist
… make check-full for static code checks and linting)
How to test these changes
… apm-server.aggregation.enabled=true)
… transaction.duration.histogram fields is 1000 (e.g. using "Histogram field type support for ValueCount and Avg aggregations", elasticsearch#55933, when it lands)
Related issues
Closes #3011