Optimize metrics wrapper #1850

Merged: 7 commits into series/2.x from optimize-metrics-wrapper on Aug 30, 2023

Conversation

@kyri-petrou (Collaborator) commented Aug 24, 2023:

Fairly certain we can optimise it further, but at least now it's not cripplingly slow. Open to suggestions on what else can be improved. The metrics wrapper is now ~50x faster, but still about twice as slow as the ApolloTracing wrapper.

Before:

[info] Benchmark                                      Mode  Cnt    Score    Error  Units
[info] NestedZQueryBenchmark.apolloTracingBenchmark  thrpt    5  202.018 ± 21.100  ops/s
[info] NestedZQueryBenchmark.metricsBenchmark        thrpt    5    5.500 ±  0.258  ops/s
[info] NestedZQueryBenchmark.noWrappersBenchmark     thrpt    5  511.841 ±  8.066  ops/s

After:

[info] Benchmark                                      Mode  Cnt    Score   Error  Units
[info] NestedZQueryBenchmark.apolloTracingBenchmark  thrpt    5  199.308 ± 3.427  ops/s
[info] NestedZQueryBenchmark.metricsBenchmark        thrpt    5  117.211 ± 3.722  ops/s
[info] NestedZQueryBenchmark.noWrappersBenchmark     thrpt    5  515.891 ± 7.015  ops/s

If we fork the writing of metrics, then the metricsBenchmark becomes faster than the ApolloTracing wrapper:

[info] Benchmark                                Mode  Cnt    Score   Error  Units
[info] NestedZQueryBenchmark.metricsBenchmark  thrpt    5  213.406 ± 4.114  ops/s

@ghostdogpr requested a review from @frekw on August 25, 2023 04:36
Comment on lines +77 to +82
(for {
_ <- failures.get.flatMap(metrics.recordFailures)
timings <- timings.get
nodeOffsets = resolveNodeOffsets(timings)
_ <- metrics.recordSuccesses(nodeOffsets, timings)
} yield ()).forkDaemon
@kyri-petrou (Collaborator, Author) commented Aug 28, 2023:
I'm not sure how I feel about forking this effect, but it's by far the most performant implementation. With forking, it becomes faster than the ApolloTracing wrapper (see performance result in PR description)

@ghostdogpr (Owner) replied:
Makes sense to not make the execution slower because we're updating the metrics. And since this wrapper doesn't wrap pure fields, we don't create too many useless fibers.

@kyri-petrou (Collaborator, Author) replied:
By the way, we're forking only once per query execution here, so the overhead of the fork itself should be minimal. My only concern is that, in the very unlikely case that calculating the offsets and writing the metrics is slower than the query execution, this could lead to a memory leak.

However, the chances of that happening are extremely low, since in almost all cases the query execution will be slower than this.
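For reference, the pattern being discussed looks roughly like the following sketch. This is not the actual PR code; withMetrics, runQuery and recordMetrics are placeholder names for illustration only:

import zio._

// Sketch only: record metrics on a detached daemon fiber so the query's
// response is never delayed by metric aggregation. The fork happens once
// per query execution, after the result is already available.
def withMetrics[R, E, A](runQuery: ZIO[R, E, A])(recordMetrics: UIO[Unit]): ZIO[R, E, A] =
  for {
    result <- runQuery
    _      <- recordMetrics.forkDaemon // fire-and-forget; outlives the request fiber
  } yield result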

@frekw (Collaborator) replied:
I actually think being able to separate the measurements from the final calculation/reporting of metrics is the most beautiful and clever part of this change, so I definitely think we should keep the forkDaemon. :)

Also, a single forkDaemon call is likely to pale in comparison to all other fork calls during query execution.

@frekw (Collaborator) left a review:
Beautiful optimization (once I managed to figure out how it works :)).

The only thing that caught my eye is whether the .view here actually does anything, since we're directly materializing it into a vector?

@kyri-petrou (Collaborator, Author) commented Aug 30, 2023:

> The only thing that caught my eye is whether the .view here actually does anything, since we're directly materializing it into a vector?

@frekw it's a space-complexity optimization rather than a time-complexity one. Without .view we're materializing an intermediate List when we call .reverse, only to then convert it to a Vector. With .view we skip materializing the intermediate List.
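As a minimal illustration of the difference (not the actual PR code; timingsList is placeholder data):

val timingsList: List[Int] = List(1, 2, 3) // placeholder data for illustration

// Without .view: .reverse builds an intermediate reversed List,
// which .toVector then copies into a Vector.
val eager: Vector[Int] = timingsList.reverse.toVector

// With .view: the reversal is lazy, so the only collection
// materialized is the final Vector.
val lazier: Vector[Int] = timingsList.view.reverse.toVector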

@kyri-petrou (Collaborator, Author) commented:
@ghostdogpr are you okay with merging this one in?

@ghostdogpr merged commit cfce824 into series/2.x on Aug 30, 2023
@ghostdogpr deleted the optimize-metrics-wrapper branch on August 30, 2023 at 12:18