
Adding mapping cache #198

Merged 1 commit into prometheus:master on May 15, 2019

Conversation

@SpencerMalone (Contributor) commented Apr 3, 2019

Mapping is one of the most expensive operations we do, but putting a cache in front of it is an easy way to reduce that pain.

Alongside that, updating the config takes a lock and wipes the cache. On k8s, if your config is a ConfigMap, it is regularly synced into the container (even with no changes), at which point you get needless config reloads, each of which now wipes the cache just to be safe. Only reloading when the file's SHA changes is a good way to keep the cache around as long as possible.

Here's the benchmark results:

BenchmarkGlob10Rules-12                                 	 5000000	       258 ns/op
BenchmarkGlob10RulesCached-12                           	20000000	        71.8 ns/op
BenchmarkRegex10RulesAverage-12                         	 1000000	      1134 ns/op
BenchmarkRegex10RulesAverageCached-12                   	20000000	        71.9 ns/op
BenchmarkGlob100Rules-12                                	 5000000	       260 ns/op
BenchmarkGlob100RulesCached-12                          	20000000	        72.3 ns/op
BenchmarkGlob100RulesMultipleCaptures-12                	 1000000	      1908 ns/op
BenchmarkGlob100RulesMultipleCapturesCached-12          	20000000	        73.6 ns/op
BenchmarkRegex100RulesMultipleCapturesWorst-12          	  100000	     11116 ns/op
BenchmarkRegex100RulesMultipleCapturesWorstCached-12    	20000000	        74.7 ns/op

@SpencerMalone SpencerMalone force-pushed the mapping-cache branch 2 times, most recently from 72875fd to c6a635a Compare April 3, 2019 16:03
@SpencerMalone SpencerMalone changed the title Adding mapping cache Adding mapping cache, only refresh config if file contents are changed Apr 3, 2019
pkg/mapper/mapper_cache.go: outdated review thread (resolved)
@bakins (Contributor) commented Apr 3, 2019

There are two unrelated changes in this PR: caching mapping lookups and changing the way config file reloads are handled. IMO, these should be two PRs - one for each behavioral change.

@SpencerMalone (Contributor, Author) commented Apr 3, 2019

I think that's a fair point. I went down this path after a prior PR where it sounded like the preference was to group like changes together, but the context was different there (grouping a behavioral change with an observability increase, versus grouping two behavioral changes). I'll split the config reloading off into a new PR.

@SpencerMalone SpencerMalone force-pushed the mapping-cache branch 3 times, most recently from 74f4bbf to a9b901f Compare April 4, 2019 14:29
@SpencerMalone SpencerMalone changed the title Adding mapping cache, only refresh config if file contents are changed Adding mapping cache Apr 4, 2019
@SpencerMalone SpencerMalone force-pushed the mapping-cache branch 6 times, most recently from 7ed8341 to 2fd8a00 Compare April 15, 2019 14:32
@SpencerMalone (Contributor, Author) commented

So, using the LRU cache took me from ~70 ns/op to ~118 ns/op, which to me is acceptable for the feature gain of a max size. But adding a hit/miss counter bumped me up to ~500 ns/op with a goroutine-based approach, and to ~220 ns/op single-threaded, neither of which seemed worth it to me.

With that in mind, I'll be tracking the size of the cache, but not a counter of cache requests.
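The shape of such a bounded LRU can be sketched with the standard library alone (the PR itself uses a different implementation; types and names here are illustrative): bounded size, least-recently-used eviction, and a readable length to back a size gauge, but no hit/miss counters, per the comment above.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key   string
	value string // in the exporter this would be the mapping result
}

// lruCache is a minimal, non-goroutine-safe LRU; the exporter would need
// a mutex around it.
type lruCache struct {
	max   int
	order *list.List // front = most recently used
	items map[string]*list.Element
}

func newLRUCache(max int) *lruCache {
	return &lruCache{max: max, order: list.New(), items: map[string]*list.Element{}}
}

func (c *lruCache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

func (c *lruCache) Put(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.max {
		oldest := c.order.Back() // least recently used
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
	c.items[key] = c.order.PushFront(&entry{key, value})
}

// Len backs a cache-size gauge, the one cache metric kept in the PR.
func (c *lruCache) Len() int { return c.order.Len() }

func main() {
	c := newLRUCache(2)
	c.Put("a.b.c", "mapped_a")
	c.Put("d.e.f", "mapped_d")
	c.Get("a.b.c")             // touch a.b.c so it is most recent
	c.Put("g.h.i", "mapped_g") // evicts d.e.f, the least recently used
	_, ok := c.Get("d.e.f")
	fmt.Println(c.Len(), ok) // 2 false
}
```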

main.go Outdated
@@ -144,6 +144,8 @@ func main() {
mappingConfig = kingpin.Flag("statsd.mapping-config", "Metric mapping configuration file name.").String()
readBuffer = kingpin.Flag("statsd.read-buffer", "Size (in bytes) of the operating system's transmit read buffer associated with the UDP connection. Please make sure the kernel parameters net.core.rmem_max is set to a value greater than the value specified.").Int()
dumpFSMPath = kingpin.Flag("debug.dump-fsm", "The path to dump internal FSM generated for glob matching as Dot file.").Default("").String()

cacheSize = kingpin.Flag("statsd.cache-size", "Maximum size of your metric mapping cache. Relies on least recently used replacement policy if max size is reached.").Default("1000").Int()
Contributor:

I would keep the LRU out of the usage and document it in the README maybe?

Contributor:

The unit is not immediately clear from this: "size" suggests bytes, but I think it's the number of cached mappings?

Suggested change
cacheSize = kingpin.Flag("statsd.cache-size", "Maximum size of your metric mapping cache. Relies on least recently used replacement policy if max size is reached.").Default("1000").Int()
cacheSize = kingpin.Flag("statsd.cache-size", "Maximum number of cached mappings.").Default("1000").Int()

Contributor (Author):

You've got it!

} else {
	cache, err := NewMetricMapperCache(cacheSize)
	if err != nil {
		log.Warnf("Unable to setup metric cache. Performance may be negatively impacted. Caused by: %s", err)
Contributor:

Following the breadcrumbs, the only error that can happen here is an invalid configuration. I would prefer to

  • prevent negative cache sizes ourselves
  • if we fail to initialize the cache for another reason, fail hard

I'm not a fan of silent failures, even if it's something "optional" like a cache.

Contributor:

Aside from that, this is purely a matter of taste (so no need to change it if you disagree): I'm not a fan of "if $feature is enabled …" branches – could we install a NOOP cache instead?
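One way to realize the NOOP-cache suggestion (interface and names are hypothetical, not the exporter's actual types): callers always talk to a cache through an interface, and disabling caching just swaps in a do-nothing implementation, so the mapping path needs no enabled/disabled branch.

```go
package main

import "fmt"

type MetricMapperCache interface {
	Get(key string) (string, bool)
	Add(key, value string)
}

// noopCache satisfies the interface but never stores anything, so every
// lookup falls through to the real mapping.
type noopCache struct{}

func (noopCache) Get(string) (string, bool) { return "", false }
func (noopCache) Add(string, string)        {}

// lookup is branch-free: it neither knows nor cares whether the cache
// behind the interface actually caches.
func lookup(c MetricMapperCache, metric string) string {
	if v, ok := c.Get(metric); ok {
		return v
	}
	v := "mapped:" + metric // stand-in for the real (expensive) mapping
	c.Add(metric, v)
	return v
}

func main() {
	var c MetricMapperCache = noopCache{}
	fmt.Println(lookup(c, "a.b.c")) // mapped:a.b.c, always recomputed
}
```

An LRU implementation of the same interface would then be installed when caching is enabled, with no other code changes.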

Contributor (Author):

Happy to do so! I should have followed the breadcrumbs more carefully myself.

pkg/mapper/mapper_cache.go: review thread (resolved)
Signed-off-by: SpencerMalone <[email protected]>
@SpencerMalone (Contributor, Author) commented

I think this should be good to go!

@matthiasr (Contributor) commented

Beautiful, thanks a lot for pushing this through!

@matthiasr matthiasr merged commit 698bcdf into prometheus:master May 15, 2019
matthiasr pushed a commit that referenced this pull request May 15, 2019
Signed-off-by: Matthias Rampke <[email protected]>
matthiasr pushed a commit that referenced this pull request May 15, 2019
* [CHANGE] Do not run as root in the Docker container by default ([#202](#202))
* [FEATURE] Add metric for count of events by action ([#193](#193))
* [FEATURE] Add metric for count of distinct metric names ([#200](#200))
* [FEATURE] Add UNIX socket listener support ([#199](#199))
* [FEATURE] Accept Datadog [distributions](https://docs.datadoghq.com/graphing/metrics/distributions/) ([#211](#211))
* [ENHANCEMENT] Add a health check to the Docker container ([#182](#182))
* [ENHANCEMENT] Allow inconsistent label sets ([#194](#194))
* [ENHANCEMENT] Speed up sanitization of metric names ([#197](#197))
* [ENHANCEMENT] Enable pprof endpoints ([#205](#205))
* [ENHANCEMENT] DogStatsD tag parsing is faster ([#210](#210))
* [ENHANCEMENT] Cache mapped metrics ([#198](#198))
* [BUGFIX] Fix panic if a mapping resulted in an empty name ([#192](#192))
* [BUGFIX] Ensure that there are always default quantiles if using summaries ([#212](#212))
* [BUGFIX] Prevent ingesting conflicting metric types that would make scraping fail ([#213](#213))

With #192, the count of events rejected because of negative counter increments has moved into the `statsd_exporter_events_error_total` metric, instead of being lumped in with the different kinds of successful events.

Signed-off-by: Matthias Rampke <[email protected]>
matthiasr pushed a commit that referenced this pull request May 15, 2019
In #198 the signature for setting up the exporter changed slightly, a
change that #212 and #213 didn't have. This doesn't change anything
substantial.

Signed-off-by: Matthias Rampke <[email protected]>