perf: Improve approx_topk performance by reducing allocations. #15450
So there's no actual perf change here now. The benchmark is worth including, but how are we getting to a state where the len/cap of the slice is going beyond the max we set in the first place?
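For context on the len/cap question, here is a minimal sketch of one way a top-k slice can stay at a fixed size: a min-heap capped at `max` entries that overwrites its smallest element in place once it is full. The names `entry`, `minHeap`, `topK`, and `observe` are hypothetical and are not taken from the PR; the real vector type in Loki may be structured differently.

```go
package main

import (
	"container/heap"
	"fmt"
)

// entry is a hypothetical (series, count) pair.
type entry struct {
	name  string
	count float64
}

// minHeap orders entries by ascending count, so the smallest sits at index 0.
type minHeap []entry

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].count < h[j].count }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(entry)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	e := old[n-1]
	*h = old[:n-1]
	return e
}

// topK keeps at most max entries (hypothetical type, for illustration only).
type topK struct {
	max   int
	items minHeap
}

func newTopK(max int) *topK {
	return &topK{max: max, items: make(minHeap, 0, max)}
}

// observe pushes e while there is room. Once the heap is full, it replaces
// the smallest entry in place if e is larger, so len and cap never exceed max.
func (t *topK) observe(e entry) {
	if t.max <= 0 {
		return
	}
	if len(t.items) < t.max {
		heap.Push(&t.items, e)
		return
	}
	if e.count > t.items[0].count {
		t.items[0] = e
		heap.Fix(&t.items, 0)
	}
}

func main() {
	t := newTopK(3)
	for i := 0; i < 100; i++ {
		t.observe(entry{name: fmt.Sprintf("series-%d", i), count: float64(i)})
	}
	fmt.Println(len(t.items), cap(t.items), t.items[0].count)
}
```

Because the replacement happens at index 0 followed by `heap.Fix`, there is no pop-then-append cycle that could grow the backing array past its initial capacity.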
nice, lgtm, just left some questions but not really things that need to result in changes to the code here
// Add our metric if we haven't seen it
if _, ok := v.observed[metricString]; !ok {
	id := xxhash.Sum64(v.buffer)
there's already a hash function for labels.Labels we could use? I suppose what you're doing is actually better, since if we called metric.Bytes and metric.Hash we're iterating over the whole slice of labels twice

the only difference is prometheus' labels hash function limits the total size of the bytes used for the hash https://github.com/prometheus/prometheus/blob/main/model/labels/labels.go#L76-L87
> prometheus' labels hash function limits the total size of the bytes used for the hash
Not quite. It grows the byte slice if it's not big enough. Unfortunately the function won't let one reuse the byte slice to avoid allocations. That's what I found while doing efac690
@@ -68,7 +71,7 @@ func TestCountMinSketchSerialization(t *testing.T) {
 	Sketch: &logproto.CountMinSketch{
 		Depth: 2,
 		Width: 4,
-		Counters: []float64{0, 0, 0, 42, 0, 42, 0, 0},
+		Counters: []float64{0, 42, 0, 0, 0, 42, 0, 0},
why was this change needed? how was the test failing?
Using labels.Bytes changes the hash values. That's why the position of 42 is different.
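To illustrate why a different hash moves the 42, here is a small count-min sketch with depth 2 and width 4 stored as a flat slice, where each row's column is derived from the key's hash. The column derivation (double hashing from the two 32-bit halves of a 64-bit hash) is an assumption for this sketch and may not match Loki's implementation, and the two hash values are made up so that they reproduce the old and new counter layouts from the test.

```go
package main

import "fmt"

// sketch is a hypothetical count-min sketch with a flat counters slice:
// row r occupies counters[r*width : (r+1)*width].
type sketch struct {
	depth, width uint32
	counters     []float64
}

func newSketch(depth, width uint32) *sketch {
	return &sketch{depth: depth, width: width, counters: make([]float64, depth*width)}
}

// column picks the column for a row using double hashing on the two
// 32-bit halves of the hash (an assumed scheme, for illustration only).
func (s *sketch) column(hash uint64, row uint32) uint32 {
	h1 := uint32(hash)
	h2 := uint32(hash >> 32)
	return (h1 + row*h2) % s.width
}

// add increments one counter per row for the given key hash.
func (s *sketch) add(hash uint64, v float64) {
	for row := uint32(0); row < s.depth; row++ {
		s.counters[row*s.width+s.column(hash, row)] += v
	}
}

func main() {
	// Two made-up hashes for the same logical key, standing in for the
	// hash before and after the serialization change.
	before := newSketch(2, 4)
	before.add(2<<32|3, 42)

	after := newSketch(2, 4)
	after.add(1, 42)

	fmt.Println(before.counters)
	fmt.Println(after.counters)
}
```

The sketch still holds the same count for the key in both cases; only the slot that the key maps to in each row changes, which is why the expected `Counters` literal in the serialization test had to be updated.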
…fana#15450)

**What this PR does / why we need it**:
The metrics slice should keep a constant amount of memory by removing the smallest item when the maximum of labels is reached.

```
› benchstat before.log after.log
goos: linux
goarch: amd64
pkg: github.com/grafana/loki/v3/pkg/logql
cpu: AMD Ryzen 7 3700X 8-Core Processor
                                │ before.log  │             after.log              │
                                │   sec/op    │   sec/op     vs base               │
_HeapCountMinSketchVectorAdd-16   839.0m ± 3%   418.9m ± 2%  -50.07% (p=0.000 n=10)

                                │  before.log  │              after.log              │
                                │     B/op     │     B/op      vs base               │
_HeapCountMinSketchVectorAdd-16   72.58Mi ± 0%   12.10Mi ± 0%  -83.33% (p=0.000 n=10)

                                │  before.log  │             after.log              │
                                │  allocs/op   │  allocs/op   vs base               │
_HeapCountMinSketchVectorAdd-16   4073.9k ± 0%   116.9k ± 0%  -97.13% (p=0.000 n=10)
```

**Checklist**
- [ ] Reviewed the [`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md) guide (**required**)
- [ ] Documentation added
- [x] Tests updated
- [ ] Title matches the required conventional commits format, see [here](https://www.conventionalcommits.org/en/v1.0.0/)
  - **Note** that Promtail is considered to be feature complete, and future development for logs collection will be in [Grafana Alloy](https://github.com/grafana/alloy). As such, `feat` PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
- [ ] Changes that require user attention or interaction to upgrade are documented in `docs/sources/setup/upgrade/_index.md`
- [ ] If the change is deprecating or removing a configuration option, update the `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. [Example PR](grafana@0d4416a)