perf: Improve `approx_topk` performance by reducing allocations. #15450

jeschkies · 2024-12-17T16:54:11Z

What this PR does / why we need it:

The metrics slice should keep a constant memory footprint by removing the smallest item once the maximum number of labels is reached.
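The idea of evicting the smallest item once a fixed maximum is reached can be sketched with a min-heap whose backing slice is allocated once. This is a minimal illustration, not the code from this PR: the names `entry`, `minHeap`, and `boundedTopK` are hypothetical, and the choice to replace the smallest entry only when the incoming count is larger is an assumption of the sketch.

```go
package main

import (
	"container/heap"
	"fmt"
)

// entry pairs a metric identifier with its estimated count (hypothetical type).
type entry struct {
	id    uint64
	count float64
}

// minHeap orders entries by ascending count, so the smallest sits at index 0.
type minHeap []entry

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].count < h[j].count }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(entry)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	e := old[n-1]
	*h = old[:n-1]
	return e
}

// boundedTopK keeps at most max entries in a slice allocated once.
type boundedTopK struct {
	max   int
	items minHeap
}

func newBoundedTopK(max int) *boundedTopK {
	return &boundedTopK{max: max, items: make(minHeap, 0, max)}
}

// add inserts e while there is room. Once full, it overwrites the smallest
// entry in place (assumption: only if e is larger) and restores heap order,
// so the slice never grows past max.
func (t *boundedTopK) add(e entry) {
	if len(t.items) < t.max {
		heap.Push(&t.items, e)
		return
	}
	if t.max > 0 && e.count > t.items[0].count {
		t.items[0] = e
		heap.Fix(&t.items, 0)
	}
}

func main() {
	topk := newBoundedTopK(3)
	for i, c := range []float64{5, 1, 9, 3, 7} {
		topk.add(entry{id: uint64(i), count: c})
	}
	// Length and capacity stay at the maximum; index 0 holds the smallest kept count.
	fmt.Println(len(topk.items), cap(topk.items), topk.items[0].count)
}
```

Overwriting index 0 and calling `heap.Fix` avoids a pop followed by a push, which keeps the operation allocation-free once the heap is full.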

› benchstat before.log after.log
goos: linux
goarch: amd64
pkg: github.com/grafana/loki/v3/pkg/logql
cpu: AMD Ryzen 7 3700X 8-Core Processor             
                                │ before.log  │              after.log              │
                                │   sec/op    │   sec/op     vs base                │
_HeapCountMinSketchVectorAdd-16   839.0m ± 3%   418.9m ± 2%  -50.07% (p=0.000 n=10)

                                │  before.log  │              after.log               │
                                │     B/op     │     B/op      vs base                │
_HeapCountMinSketchVectorAdd-16   72.58Mi ± 0%   12.10Mi ± 0%  -83.33% (p=0.000 n=10)

                                │  before.log  │              after.log              │
                                │  allocs/op   │  allocs/op   vs base                │
_HeapCountMinSketchVectorAdd-16   4073.9k ± 0%   116.9k ± 0%  -97.13% (p=0.000 n=10)

Checklist

Reviewed the CONTRIBUTING.md guide (required)
Documentation added
Tests updated
Title matches the required conventional commits format, see here
- Note that Promtail is considered to be feature complete, and future development for logs collection will be in Grafana Alloy. As such, feat PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.
Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

pkg/logql/count_min_sketch.go

cstyan · 2024-12-17T23:46:10Z

So there's no actual perf. change here now. The benchmark is worth including, but how are we getting to a state where the len/cap of the slice is going beyond the max we set in the first place?
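The thread does not record how the slice ended up past its maximum. As general Go background only, the sketch below contrasts an unguarded `append`, which reallocates a larger backing array when capacity runs out, with a guarded append that checks the length first. The helper `appendBounded` is hypothetical and not taken from the PR.

```go
package main

import "fmt"

// appendBounded appends v only while s holds fewer than max elements.
// When s is full it is returned unchanged, so the backing array never grows.
func appendBounded(s []int, v, max int) []int {
	if len(s) >= max {
		return s // full: the caller has to evict an element first
	}
	return append(s, v)
}

func main() {
	const max = 4
	bounded := make([]int, 0, max)
	var unbounded []int
	for i := 0; i < 10; i++ {
		bounded = appendBounded(bounded, i, max)
		unbounded = append(unbounded, i) // grows whenever capacity is exhausted
	}
	fmt.Println(len(bounded), cap(bounded))
	fmt.Println(len(unbounded), cap(unbounded) >= 10)
}
```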


cstyan

nice, lgtm, just left some questions but not really things that need to result in changes to the code here

cstyan · 2024-12-18T17:19:12Z

pkg/logql/count_min_sketch.go


 	// Add our metric if we haven't seen it
-	if _, ok := v.observed[metricString]; !ok {
+	id := xxhash.Sum64(v.buffer)


there's already a hash function for labels.Labels we could use? I suppose what you're doing is actually better, since if we called metric.Bytes and metric.Hash we're iterating over the whole slice of labels twice

the only difference is prometheus' labels hash function limits the total size of the bytes used for the hash https://github.com/prometheus/prometheus/blob/main/model/labels/labels.go#L76-L87

jeschkies · 2024-12-19

> prometheus' labels hash function limits the total size of the bytes used for the hash

Not quite. It grows the byte slice if it's not big enough. Unfortunately, the function doesn't let the caller reuse the byte slice to avoid allocations. That's what I found while working on efac690.
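The pattern discussed in this thread, serializing the labels into a buffer that is kept between calls and hashing those bytes once, could look roughly like the sketch below. It uses only the standard language: the `label` type, the `0xff` separator, and the `hasher` struct are assumptions for illustration, and a hand-written FNV-1a stands in for `xxhash.Sum64` so the example has no external dependencies.

```go
package main

import "fmt"

// label is a hypothetical stand-in for a Prometheus label pair.
type label struct{ name, value string }

// appendLabels serializes ls into buf, reusing buf's backing array.
// The 0xff separator is an assumption of this sketch; that byte never
// occurs in valid UTF-8, so names and values cannot run together.
func appendLabels(buf []byte, ls []label) []byte {
	buf = buf[:0]
	for _, l := range ls {
		buf = append(buf, l.name...)
		buf = append(buf, 0xff)
		buf = append(buf, l.value...)
		buf = append(buf, 0xff)
	}
	return buf
}

// fnv1a64 is a dependency-free stand-in for xxhash.Sum64.
func fnv1a64(b []byte) uint64 {
	const (
		offset = 14695981039346656037
		prime  = 1099511628211
	)
	h := uint64(offset)
	for _, c := range b {
		h ^= uint64(c)
		h *= prime
	}
	return h
}

// hasher owns the buffer so repeated calls do not allocate once it is big enough.
type hasher struct{ buffer []byte }

// id walks the labels once: serialize into the buffer, then hash the bytes.
func (h *hasher) id(ls []label) uint64 {
	h.buffer = appendLabels(h.buffer, ls)
	return fnv1a64(h.buffer)
}

func main() {
	h := &hasher{}
	a := []label{{"app", "a"}, {"env", "prod"}}
	b := []label{{"app", "b"}, {"env", "prod"}}
	first := h.id(a)
	capAfterFirst := cap(h.buffer)
	second := h.id(b)
	// Same-sized input reuses the buffer; different labels give different ids.
	fmt.Println(first != second, cap(h.buffer) == capAfterFirst)
}
```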

cstyan · 2024-12-18T17:23:01Z

pkg/logql/count_min_sketch_test.go

@@ -68,7 +71,7 @@ func TestCountMinSketchSerialization(t *testing.T) {
 		Sketch: &logproto.CountMinSketch{
 			Depth:       2,
 			Width:       4,
-			Counters:    []float64{0, 0, 0, 42, 0, 42, 0, 0},
+			Counters:    []float64{0, 42, 0, 0, 0, 42, 0, 0},


why was this change needed? how was the test failing?

jeschkies · 2024-12-19

Using labels.Bytes changes the hash values, which is why the 42 ends up in a different position.
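To see why a different hash moves the counter, consider a minimal count-min sketch in which each row's column is derived from the key's hash. The index derivation below (double hashing from the two 32-bit halves) is an assumption of this sketch and not necessarily what Loki does, and the two hash values are picked by hand purely so that the output matches the old and new `Counters` in the diff above.

```go
package main

import "fmt"

// sketch is a minimal count-min sketch: depth rows of width counters, row-major.
type sketch struct {
	depth, width uint64
	counters     []float64
}

func newSketch(depth, width uint64) *sketch {
	return &sketch{depth: depth, width: width, counters: make([]float64, depth*width)}
}

// position picks the column for a row from the two 32-bit halves of the hash
// (assumed double-hashing scheme, for illustration only).
func (s *sketch) position(row, h uint64) uint64 {
	h1, h2 := h&0xffffffff, h>>32
	return (h1 + row*h2) % s.width
}

// add increments one counter per row for the key identified by hash h.
func (s *sketch) add(h uint64, count float64) {
	for row := uint64(0); row < s.depth; row++ {
		s.counters[row*s.width+s.position(row, h)] += count
	}
}

func main() {
	// Hypothetical hash values for the same metric under two hash inputs.
	oldHash := uint64(2)<<32 | 3
	newHash := uint64(1)

	before := newSketch(2, 4)
	before.add(oldHash, 42)
	after := newSketch(2, 4)
	after.add(newHash, 42)

	fmt.Println(before.counters) // counters populated with the old hash
	fmt.Println(after.counters)  // same count, different columns
}
```

The count itself is unchanged; only the column that holds it depends on the hash, which is why a serialized sketch in a test fixture has to be updated when the hash input changes.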


jeschkies added 2 commits December 17, 2024 16:48

Benchmark HeapCountMinSketchVectorAdd

714aa0a

Remove the smallest element when max is reached.

e19a384

jeschkies requested a review from cstyan December 17, 2024 16:54

jeschkies requested a review from a team as a code owner December 17, 2024 16:54

pull-request-size bot added the size/S label Dec 17, 2024

jeschkies changed the title ~~Improve approx_topk performance by reducing allocations.~~ perf: Improve approx_topk performance by reducing allocations. Dec 17, 2024


jeschkies added 4 commits December 17, 2024 18:06

Add todo

dc5dacd

Check metrics capacity

fb1f72e

Check length and capacity

dd8fe62

Do net realloc

f15dcae

pull-request-size bot added size/M and removed size/S labels Dec 17, 2024

Convert labels to bytes slice with a buffer to avoid allocations.

19cfc52

pull-request-size bot added size/L and removed size/M labels Dec 18, 2024

jeschkies and others added 6 commits December 18, 2024 09:16

Remove todo

0024eaa

Fix test

c220803

Merge remote-tracking branch 'grafana/main' into karsten/reduce-alloc…

850016e

…ations

Revert formatting vendor

42bfbbb

Correct test

5ffb733

Fix typo

4d00e72

cstyan approved these changes Dec 18, 2024


jeschkies merged commit 04994ca into grafana:main Dec 19, 2024
58 checks passed

jeschkies deleted the karsten/reduce-allocations branch December 19, 2024 12:31

This was referenced Dec 23, 2024

chore(k234): release 3.4.0 #15536

Open

chore(k235): release 3.4.0 #15555

Open

loki-gh-app bot mentioned this pull request Jan 6, 2025

chore(k236): release 3.4.0 #15595

Open

loki-gh-app bot mentioned this pull request Jan 13, 2025

chore(k237): release 3.4.0 #15705

Open

loki-gh-app bot mentioned this pull request Jan 20, 2025

chore(k238): release 3.4.0 #15847

Open

This was referenced Feb 3, 2025

chore(k240): release 3.4.0 #16074

Closed

chore(k239): release 3.4.0 #16102

Merged

chore(k241): release 3.4.0 #16153

Closed

loki-gh-app bot mentioned this pull request Feb 12, 2025

chore(k239): release 3.4.0 (backport main) #16210

Merged
