Receiver: cache matchers for series calls #7353

Open · wants to merge 11 commits into main from store-proxy-cache-matchers
Conversation

@pedro-stanaka (Contributor) commented May 13, 2024

Summary

We have tried caching matchers before with a time-based expiration cache; this time we are trying an LRU cache.

We saw some of our receivers busy compiling regexes, with high CPU usage similar to the profile of the benchmark I added here:

[screenshot: CPU profile]

Benchmark results

Result on store-proxy-cache-matchers
BenchmarkProxySeriesRegex-11    	 1545795	       768.7 ns/op	    1144 B/op	      19 allocs/op
BenchmarkProxySeriesRegex-11    	 1548040	       769.4 ns/op	    1144 B/op	      19 allocs/op
BenchmarkProxySeriesRegex-11    	 1545019	       778.3 ns/op	    1144 B/op	      19 allocs/op
BenchmarkProxySeriesRegex-11    	 1539387	       771.1 ns/op	    1144 B/op	      19 allocs/op

Result on main
BenchmarkProxySeriesRegex-11    	  130292	      8803 ns/op	   10288 B/op	      78 allocs/op
BenchmarkProxySeriesRegex-11    	  124045	      8533 ns/op	   10288 B/op	      78 allocs/op
BenchmarkProxySeriesRegex-11    	  125092	      8712 ns/op	   10288 B/op	      78 allocs/op
BenchmarkProxySeriesRegex-11    	  120110	      8676 ns/op	   10288 B/op	      78 allocs/op

The results indicate that the "store-proxy-cache-matchers" branch considerably outperforms the "main" branch in all observed aspects of the BenchmarkProxySeriesRegex function. It is roughly 10 times faster in execution time while using about 9 times less memory and making about 4 times fewer allocations per operation. These improvements suggest significant optimizations in the regex handling or related data processing in the "store-proxy-cache-matchers" branch compared to the "main" branch.

Changes

  • Added a matcher cache for the method MatchersToPromMatchers, plus a new variant of the method that uses the cache.
  • The main change is in the matchesExternalLabels function, which now receives a cache instance.

Verification

  • I have added tests for the change, as well as new benchmarks.

@pedro-stanaka pedro-stanaka force-pushed the store-proxy-cache-matchers branch from 598b480 to d56e024 Compare May 13, 2024 06:41
@pedro-stanaka pedro-stanaka marked this pull request as ready for review May 13, 2024 08:13
@pedro-stanaka pedro-stanaka marked this pull request as draft May 13, 2024 08:14
@pedro-stanaka pedro-stanaka force-pushed the store-proxy-cache-matchers branch 2 times, most recently from 0528b9c to a58508d Compare May 13, 2024 09:29
@pedro-stanaka pedro-stanaka changed the title Receivers|Store: cache matchers for series calls Receiver: cache matchers for series calls May 13, 2024
@pedro-stanaka pedro-stanaka force-pushed the store-proxy-cache-matchers branch 3 times, most recently from 34e4852 to 3f852a5 Compare May 13, 2024 14:48
@pedro-stanaka pedro-stanaka marked this pull request as ready for review May 14, 2024 13:05
@GiedriusS (Member) left a comment:
> The results indicate that the "store-proxy-cache-matchers" branch considerably outperforms the "main" branch in all observed aspects of the BenchmarkProxySeriesRegex function. It is roughly 10 times faster regarding execution time while using about 9 times less memory and making about 4 times fewer allocations per operation. These improvements suggest significant optimizations in the regex handling or related data processing in the "store-proxy-cache-matchers" branch compared to the "main" branch

Was this AI generated? 😄


func (c *MatchersCache) GetOrSet(key LabelMatcher, newItem NewItemFunc) (*labels.Matcher, error) {
c.metrics.requestsTotal.Inc()
if item, ok := c.cache.Get(key); ok {
Member:
I suggest using singleflight here to reduce allocations even more

@pedro-stanaka (Contributor, Author):
Thanks! Did that.

@alanprot (Contributor):

Kindly asking from Cortex :D

Is it possible to make this interface receive the Prometheus types instead of the Thanos ones, so we can reuse the same implementation in Cortex?

Ex:

GetOrSet(t labels.MatchType, n, v string, newItem NewItemFunc) (*labels.Matcher, error)

@pedro-stanaka (Contributor, Author):
@alanprot could you link where in Cortex you would use this? I introduced an interface now, which prompb.LabelMatcher implements and made the storepb.LabelMatcher implement it as well. Let me know if that is enough for Cortex to reuse the code.

@alanprot (Contributor):
OK, I think that works!

But I think the interface definition should be in the storecache package? Other than that, I think it would work just fine for us.

@@ -973,6 +986,8 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
"about order.").
Default("false").Hidden().BoolVar(&rc.allowOutOfOrderUpload)

cmd.Flag("matcher-cache-size", "The size of the cache used for matching against external labels. Using 0 disables caching.").Default("0").IntVar(&rc.matcherCacheSize)
@GiedriusS (Member):
Should we add this to other components as well like Thanos Store?

@pedro-stanaka (Contributor, Author):
I can do it, though I fear making the PR hard to review.

pkg/store/prometheus.go (resolved, outdated)
}
}

func NewMatchersCache(opts ...MatcherCacheOption) (*MatchersCache, error) {
Member:
Maybe we can just use pkg/cache/inmemory.go? It's another LRU implementation that already exists in the tree.

@pedro-stanaka (Contributor, Author):
I would need to make it generic first, no? Or do you mean storing this LRU there as well? Also, I feel this would force users to configure the cache via YAML in the receiver, for example, which would get quite complex.

@yeya24 (Contributor) commented Dec 3, 2024:

Hi @pedro-stanaka, do you plan to continue this PR?
We want to add the same cache in Cortex cortexproject/cortex#6382 and think it would be nice to contribute the cache to Thanos directly so that both projects can share the same code.

If you are busy with something else and not planning to come back to this PR, can we create a new PR to add this cache?

@pedro-stanaka (Contributor, Author):
> Hi @pedro-stanaka, do you plan to continue this PR? We want to add the same cache in Cortex cortexproject/cortex#6382 and think it would be nice to contribute the cache to Thanos directly so that both projects can share the same code.
>
> If you are busy with something else and not planning to come back to this PR, can we create a new PR to add this cache?

I can probably take a stab at finishing this up this week. Will put it on my TODO list.

@pedro-stanaka pedro-stanaka force-pushed the store-proxy-cache-matchers branch from f7b7697 to 813b5fe Compare December 5, 2024 09:48
@pull-request-size pull-request-size bot added size/XL and removed size/L labels Dec 5, 2024
@pedro-stanaka (Contributor, Author):
@yeya24 @GiedriusS please take a look at the current version.

Some things I think might be worth doing, but I am not sure:

  • We now have feature flags for receivers; should we put this behind a feature flag with a sensible default for the LRU size (whilst still allowing users to override it)?
  • As Giedrius said, should I already change the Store gateway to use this cache as well?

@pedro-stanaka pedro-stanaka force-pushed the store-proxy-cache-matchers branch 9 times, most recently from f493eb7 to 3a10cf1 Compare December 6, 2024 15:23
@@ -0,0 +1,150 @@
// Copyright (c) The Thanos Authors.
Contributor:
Should we move this code out of storepb package? storepb sounds more related to the proto itself but this matcher cache can be more generic

@pedro-stanaka (Contributor, Author):
Moved it to the storecache package, which seems a generic cache package.

type MatchersCache interface {
// GetOrSet retrieves a matcher from cache or creates and stores it if not present.
// If the matcher is not in cache, it uses the provided newItem function to create it.
GetOrSet(key LabelMatcher, newItem NewItemFunc) (*labels.Matcher, error)
Contributor:
Same here. Can we take prometheus matcher as input key?

@pedro-stanaka (Contributor, Author):
Because we want to convert to a Prometheus matcher, I don't see a reason to use it as the key. I will create an intermediate matcher type that can be represented in a more neutral way. Let's see if, with that, we can use the cache in Cortex as well.

@pedro-stanaka pedro-stanaka force-pushed the store-proxy-cache-matchers branch 2 times, most recently from 417104d to b2b65a2 Compare December 9, 2024 16:58
@pedro-stanaka pedro-stanaka force-pushed the store-proxy-cache-matchers branch from b2b65a2 to c771511 Compare December 9, 2024 17:05

Commits:

  • adding matcher cache and refactor matchers (Co-authored-by: Andre Branchizio)
  • Using the cache in proxy and tsdb stores (only receiver)
  • fixing problem with deep equality
  • adding some docs
  • Adding benchmark
  • undo unnecessary changes
  • Adjusting metric names
  • adding changelog
  • wiring changes to the receiver
  • Fixing linting

All commits signed off by: Pedro Tanaka.
@pedro-stanaka pedro-stanaka force-pushed the store-proxy-cache-matchers branch from 77e479f to 314b4c6 Compare December 10, 2024 11:13