Faster glob matching using a finite state machine #157

fffonion · 2018-09-10T19:07:51Z

This patch proposes a fast and efficient matching algorithm that as a drop in replacement to existing glob matching. On our production environment, the go regexp engine which glob matching uses takes a lot more CPU time than system calls under heavy workloads. We have deployed this patch on our environment and seeing a great performance boost. See this blog if you are interested with the background : )

By default, a general string takes higher priority than * when matching. Backtrack will run under the following circumstances:

The rules needing backtrack to be mapped correctly. A warning log like WARN[0000] backtracking required because of match "test2.*.*",matching performance may be degraded source="mapper.go:418" will be printed.
glob_disable_ordering is set to false or unset.

Full changelog:

Add a new cli argument statsd.dump-fsm to dump the current state machine to a GNU dot file. This can help to debug the behaviour of internal state machine.
Add a new config glob_disable_ordering to defaults section to indicate that the rules are not strictly in the order of it's occurrence in the mapping config. Setting this to on will boost matching performance under certain large amount of sets of rules.

Signed-off-by: Wangchong Zhou <[email protected]>

matthiasr · 2018-09-10T20:05:20Z

Hmm hmm, this is very intriguing. I am not sure about adding another match type that is so similar to glob … is it theoretically possible to make the FSM method fully compatible with the current behavior of the glob type? What would be the performance impact of that be? Where fallbacks are necessary, can this be automatically detected (e.g. whether backtracking is necessary?)

matthiasr · 2018-09-10T20:07:50Z

… what I am thinking is: can we turn this into a full-featured replacement of the current glob implementation? And then, if necessary, add specific settings for further optimizations like treating * with lower priority?

matthiasr · 2018-09-10T20:11:55Z

and where ever we can, auto-detect and warn instead of requiring an explicit setting, like

WARN: backtracking required for match client.*.request.size.*, falling back to slower glob method

(but I would prefer if we could sensibly do the backtracking in the FSM implementation instead of indefinitely carrying around the regex-based-glob implementation)

fffonion · 2018-09-10T21:15:09Z

Yes it's possible to implement the backtrack. Basically it will traverse over all possible transitions until it get one match. The performance impact added will be very little as I can see, since each lookup of transitions is in a map. I like this approach better too, optionally we can detect it when loading as well and also give user an option to do backtracking or not. (And we don't need the fsm_fallback anymore)

The ordering of rules can be implemented by replacing the map with array, so that we can keep the order when iterating over it. The performance impact will depend on the types of rules, the more transitions it has for each state, the longer the array will be, and the more the performance impact it will have. Having multiple captures in the label is similar to that and we can basically have a array to store the captureIdx'es and iterate it when formatting. One thing is that once we go this approach, the performance impact on doing backtrack will be magnified.

How does this sound to you:

By default this fsm matcher replaces glob, it uses array to find transitions, having multiple captures and supports backtrack and should pass all the tests for glob matching
Have a option that satisfy simpler use cases, it uses map to find transitions, can have only one capture per label

matthiasr · 2018-09-10T21:57:09Z

That sounds perfect! what would be a good name for the latter option? Should it be one, or a separate toggle for each? I would like to pull the debug "dump FSM" option out of the mapping configuration into a flag, just to keep the mapping tidy. That way the file writing can also be handled by the exporter, instead of mixing it in with the rather pure mapping logic. Does that sound reasonable?

…

On Mon, Sep 10, 2018, 23:15 Wangchong Zhou ***@***.***> wrote: Yes it's possible to implement the backtrack. Basically it will traverse over all possible transitions until it get one match. The performance impact added will be very little as I can see, since each lookup of transitions is in a map. I like this approach better too, optionally we can detect it when loading as well and also give user an option to do backtracking or not. (And we don't need the fsm_fallback anymore) The ordering of rules can be implemented by replacing the map with array, so that we can keep the order when iterating over it. The performance impact will depend on the types of rules, the more transitions it has for each state, the longer the array will be, and the more the performance impact it will have. Having multiple captures in the label is similar to that and we can basically have a array to store the captureIdx'es and iterate it when formatting. One thing is that once we go this approach, the performance impact on doing backtrack will be magnified. How does this sound to you: - By default this fsm matcher replaces glob, it uses array to find transitions, having multiple captures and supports backtrack and should pass all the tests for glob matching - Have a option that satisfy simpler use cases, it uses map to find transitions, can have only one capture per label, and don't support backtrack. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#157 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAICBoK3hM5U_jRpktDy_BeY3GJRGc5nks5uZtZdgaJpZM4Wh_2u> .

fffonion · 2018-09-10T22:10:13Z

That sounds perfect! what would be a good name for the latter option?
Should it be one, or a separate toggle for each?

Hmm it's hard to name it separately but I also feel it's better to separate them from end-user's view. Do you have any suggestions on this?

I would like to pull the debug "dump FSM" option out of the mapping
configuration into a flag

Yep this sounds good to me.

matthiasr · 2018-09-10T22:31:58Z

Thinking about it, only the ordering is something the mapper needs to know about ahead of time to preserve performance, right? if my rules don't require backtracking, the mapper won't spend time on it, so we don't need a setting for it – maybe log a warning like I sketched out earlier. For the label substitution, well, we do the work that the label requires, so if I only use a simple substitution won't it still be fast? if absolutely necessary, the mapper could switch modes automatically, but I'd like to see benchmarks or profiles showing that this complication is necessary. For anything that can be handled automatically I'd rather not even add a configuration knob.

…

On Tue, Sep 11, 2018, 00:10 Wangchong Zhou ***@***.***> wrote: That sounds perfect! what would be a good name for the latter option? Should it be one, or a separate toggle for each? Hmm it's hard to name it separately but I also feel it's better to separate them from end-user's view. Do you have any suggestions on this? I would like to pull the debug "dump FSM" option out of the mapping configuration into a flag Yep this sounds good to me. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#157 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAICBs-JIKm44yLeOM-anHIJ-oBFDJYdks5uZuNFgaJpZM4Wh_2u> .

fffonion · 2018-09-10T22:37:22Z

Yep let me try to implement this : ) Thanks for discussion!

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion · 2018-09-13T19:17:15Z

Hi @matthiasr, I've finished developing fsm as a drop-in replacement for glob matching. Could you take another look?

Ordering is implemented with minimum performance penalty, still using a map. And we do backtrack only we detect that rules need backtracking. Also I added a test TestRPS and shows that the new algorithm is about 33% faster than using regex for glob, when ordering is disabled and no backtrack is needed.

Signed-off-by: Wangchong Zhou <[email protected]>

matthiasr

Thank you! Functionality wise, this is the best solution. However, I would like to work on the code organization a bit:

With all the FSM traversal and regex matching in the GetMapping function, it becomes really hard to follow that. Can you break that out into one or more functions?

Would it be possible to collect everything FSM related (except for method calls from the initialization and GetMapping) into a separate file, or even a separate package? I would prefer the latter, because it forces us to think about a sensible interface for it.

I think for dumping the FSM state, there is still too much responsibility in the mapper – I don't want it to handle opening and closing a file. Can you change it so that there is a method to dump the FSM? This method accept a writer, and then main.go should decide when to call this method (after loading the configuration? or only after the initial configuration load) and what to write to – the FSM specific code should not have to care whether that's a bufio.Writer, a string writer, or stdout.

Finally, since the motivation for this change is performance, could you please add benchmarks that demonstrate the different cases (regex/glob, ordering/no ordering, backtracking necessary / no backtracking)? Ideally with different numbers of mappings. That will also give us something to point to in the README to help users understand the choices and their consequences.

And finally, all this needs to be documented in the README, especially the user-facing configuration and flags.

Thank you for working through this with me, and sorry for the delay in reviewing!

fffonion · 2018-09-20T17:03:32Z

I'll address the comment soon! Thank you!

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion · 2018-09-24T23:07:56Z

@matthiasr I've finished the dev part of your comment.

GetMapping func in mapper is cleaned up(no gotos 😄) also more comments are added.

FSM parts are moved to a sub package fsm inside mapper package, because I feel like they are not in a equal level and mapper should include fsm .

The function dumpFSM is also changed to accept a io.Writer instance.

For the performance part, the test func TestPerformance is refined and charged with more cases. In short, fsm will have best performance if the rules amount is not so less and captures (use of *) is not to much in a single rule. Ordering and backtracking won't hurt performance much.

I'll start working on the doc. Would you like that to be a separate PR or inside this one with separate commits?

Signed-off-by: Wangchong Zhou <[email protected]>

matthiasr · 2018-09-26T20:03:03Z

I renamed the PR to reflect the direction it has taken

fffonion · 2018-09-26T20:14:05Z

I'll reflect those in the doc : ) Awesome, thanks!

matthiasr

This is really starting to take on shape!

I am wondering about all the places where the FSM needs to understand the statsd name format and does split("."), and conversely does all the formatting into "names" and "labels". I feel like we are really close to not having this entanglement – what if we promote both into the mapper package? (or break them into another sub-package but that may be excessive)

For example, what if instead of func (f *FSM) GetMapping(statsdMetric string, statsdMetricType string) (interface{}, string, prometheus.Labels, bool) we had func (f *FSM) Get(fragments []string) (interface{} result, bool matched)? The caller (the mapping package) would have to take care of splitting the metric name, and prepending the mapping type since it's essentially a match before we even start matching names. The FSM in turn could be completely agnostic to these – it would simply feed a slice of strings into the state machine and spit out the result, without knowing the meaning of either.

matthiasr · 2018-09-26T20:06:22Z

main.go

@@ -118,6 +120,20 @@ func watchConfig(fileName string, mapper *mapper.MetricMapper) {
 	}
 }

+func dumpFSM(mapper *mapper.MetricMapper, dumpFilename string) error {
+	f, err := os.Create(dumpFilename)


What happens if the file already exists? What should the behavior be?

It will rewrite the existing file, I feel this is acceptable?

yes, sounds good!

matthiasr · 2018-09-26T20:08:59Z

main.go

@@ -126,6 +142,7 @@ func main() {
 		statsdListenTCP = kingpin.Flag("statsd.listen-tcp", "The TCP address on which to receive statsd metric lines. \"\" disables it.").Default(":9125").String()
 		mappingConfig   = kingpin.Flag("statsd.mapping-config", "Metric mapping configuration file name.").String()
 		readBuffer      = kingpin.Flag("statsd.read-buffer", "Size (in bytes) of the operating system's transmit read buffer associated with the UDP connection. Please make sure the kernel parameters net.core.rmem_max is set to a value greater than the value specified.").Int()
+		dumpFSMPath     = kingpin.Flag("statsd.dump-fsm", "The path to dump internal FSM generated for glob matching as Dot file.").Default("").String()


now I'm wondering if the whole --statsd.* flag namespace makes sense. Maybe there should be a "mapping" namespace? (then --statsd.mapping-config should also move into that, but that should be a separate change)

I'm really not sure about this, so it's fine to leave as is. just putting the idea here …

or for this particular one, it could be --debug.dump-fsm? since this is really only a debugging feature.

Yeah i second on the debug.dump-fsm, let me change it.

matthiasr · 2018-09-26T20:09:24Z

main.go

+		if *dumpFSMPath != "" {
+			err := dumpFSM(mapper, *dumpFSMPath)
+			if err != nil {
+				log.Fatal("Error dumpping FSM:", err)


typo: dumpping -> dumping

matthiasr · 2018-09-26T20:13:17Z

pkg/mapper/mapper.go

@@ -31,18 +31,24 @@ var (
 	metricLineRE = regexp.MustCompile(`^(\*\.|` + statsdMetricRE + `\.)+(\*|` + statsdMetricRE + `)$`)
 	metricNameRE = regexp.MustCompile(`^([a-zA-Z_]|` + templateReplaceRE + `)([a-zA-Z0-9_]|` + templateReplaceRE + `)*$`)
 	labelNameRE  = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]+$`)
+
+	templateReplaceCaptureRE = regexp.MustCompile(`\$\{?([a-zA-Z0-9_\$]+)\}?`)


is this still needed here? the only reference seems to be in the fsm package

no it's not needed, will cleanup

matthiasr · 2018-09-26T20:13:47Z

pkg/mapper/fsm/fsm.go

+)
+
+var (
+	templateReplaceCaptureRE = regexp.MustCompile(`\$\{?([a-zA-Z0-9_\$]+)\}?`)


move this to formatter.go please, since it's only used there

matthiasr · 2018-09-26T21:05:03Z

pkg/mapper/mapper.go

+				mappings = append(mappings, mapping.Match)
+			}
+		}
+		n.FSM.TestIfNeedBacktracking(mappings)


what happens to the value of this? as it is, isn't this a NOOP?

I see now, the method also mutates the FSM, I would prefer not to do that – see my comment there.

matthiasr · 2018-09-26T21:10:03Z

pkg/mapper/fsm/fsm.go

+				if i2 != i1 && len(re1.FindStringSubmatchIndex(r2)) > 0 {
+					// log if we care about ordering and the superset occurs before
+					if !f.disableOrdering && i1 < i2 {
+						log.Warnf("match \"%s\" is a super set of match \"%s\" but in a lower order, "+


for clarity: match \"%s\" is a super set of match \"%s\" but comes first, it will never be matched

I don't think you need \n when logging

matthiasr · 2018-09-26T21:14:50Z

pkg/mapper/fsm/fsm.go

+	// backtracking will always be needed if ordering of rules is not disabled
+	// since transistions are stored in (unordered) map
+	// note: don't move this branch to the beginning of this function
+	// since we need logs for superset rules


we only log if !f.disableOrdering, and we only honor the result of the computation otherwise. would it make sense to split this into two methods:

one that, assuming ordered matches, logs warnings for dead rules

one that, assuming unordered matches, calculates whether backtracking is needed

then call one or the other, depending on the disableOrdering configuration?

matthiasr · 2018-09-26T21:16:05Z

pkg/mapper/fsm/fsm.go

+	// since transistions are stored in (unordered) map
+	// note: don't move this branch to the beginning of this function
+	// since we need logs for superset rules
+	f.needsBacktracking = !f.disableOrdering || needBacktrack


please either change f (imperative style) or return something (functional style) and set in the caller. I prefer the latter.

as far as I can see, if you split "log warnings for ordered mappings" from "calculate backtracking need for unordered mappings", then neither method needs the FSM itself?

yes, i'll fix this along with the other comment

matthiasr · 2018-09-26T21:18:37Z

pkg/mapper/fsm/fsm.go

+
+}
+
+// DumpFSM accepts a io.writer and write the current FSM into dot file format


maybe move this method to its own file? it's fairly extra from everything else

matthiasr

For the documentation, I think I would like to have two for different audiences:

in the existing README, explain the ordering configuration option and why it matters.

in a new README in the mapper package explain how the FSM works, how mappings are translated into a state machine, and how a metric is matched against these. This can reference existing documents (Wikipedia or a some other intro). I don't have a CS background, and it took me a bit to get what is happening here. The FSM method is quite a bit more complex than the naive regexing that was happening before, and I would like to give me-in-half-a-year something to hold on to and understand what is happening.

In general, the comments before exported functions (what gets turned into godoc) could also be more extensive – describe what the inputs and outputs of each method mean, at least. The comments should form full sentences and end in a period.

matthiasr · 2018-09-26T21:30:10Z

pkg/mapper/mapper.go

@@ -75,6 +81,20 @@ var defaultQuantiles = []metricObjective{
 	{Quantile: 0.99, Error: 0.001},
 }

+func min(x, y int) int {


I think these are moved into the FSM now too?

oh yeah, sorry 😅

matthiasr · 2018-09-26T21:32:46Z

pkg/mapper/mapper_test.go

+	return rules
+}
+
+func TestPerformance(t *testing.T) {


Please write this in terms of a Go benchmark – it handles the number of iterations automatically for us. Also move it to a separate file, just because this one is so terribly long already.

matthiasr · 2018-09-26T21:39:08Z

Thank you for continuing to work on this! I am sorry for all the comments and back and forth – this is a pretty massive change and I am a bit scared that we won't be able to understand and improve it later unless we get it really well structured and documented now.

fffonion · 2018-09-26T21:47:28Z

No problem, I totally understand that.: ) I'll address the comments and finish the docs, it will take some time but I'll let you know when I get there.

Signed-off-by: Wangchong Zhou <[email protected]>

matthiasr · 2018-09-28T21:31:14Z

almost forgot: please also add go test -bench . to the TravisCI build so we can continuously see the results there. thank you!

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion · 2018-10-04T17:55:46Z

Hi @matthiasr, I've finished the doc and addressed your comments! Could you take another look please?

straightening out some wording, and fixing unquoted `*`. Signed-off-by: Matthias Rampke <[email protected]>

Format quote as such, capitalize FSM. Signed-off-by: Matthias Rampke <[email protected]>

Signed-off-by: Matthias Rampke <[email protected]>

Fixing a typo and language. Signed-off-by: Matthias Rampke <[email protected]>

we should not change the vendored Makefile.common here. Signed-off-by: Matthias Rampke <[email protected]>

matthiasr

I've made a few minor tweaks, and it's good to go now! Thank you very much for getting this together, the performance improvements are significant, and now everybody using the glob matching can benefit from them.

fffonion · 2018-10-10T22:03:08Z

Thank you @matthiasr for reviewing and taking care of this!

Implement "fsm" match type

825b734

Signed-off-by: Wangchong Zhou <[email protected]>

replace glob matching

bfe2329

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion force-pushed the feat/fsm-matcher branch from ca4f535 to bfe2329 Compare September 12, 2018 00:34

fffonion added 2 commits September 12, 2018 14:44

detect backtracking

dad04e9

Signed-off-by: Wangchong Zhou <[email protected]>

implement backtrack

c2742aa

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion force-pushed the feat/fsm-matcher branch from ff8ba8d to 4d2b6bc Compare September 13, 2018 19:12

ordering

9ebab25

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion force-pushed the feat/fsm-matcher branch from 4d2b6bc to 9ebab25 Compare September 13, 2018 19:13

cleanup

a751c0c

Signed-off-by: Wangchong Zhou <[email protected]>

matthiasr reviewed Sep 18, 2018

View reviewed changes

fffonion added 2 commits September 21, 2018 18:10

move fsm to sepeare pkg

e634997

Signed-off-by: Wangchong Zhou <[email protected]>

abstract dumpFsm

5e1df60

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion force-pushed the feat/fsm-matcher branch from 8805949 to 5e1df60 Compare September 22, 2018 01:11

fffonion added 3 commits September 21, 2018 18:13

license header

668e31e

Signed-off-by: Wangchong Zhou <[email protected]>

cleanup and add comments

f387766

Signed-off-by: Wangchong Zhou <[email protected]>

add more perf test

a8dcc58

Signed-off-by: Wangchong Zhou <[email protected]>

more perf test

a0681a0

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion force-pushed the feat/fsm-matcher branch from 3a4994b to a0681a0 Compare September 24, 2018 23:08

gofmt

6d709d5

Signed-off-by: Wangchong Zhou <[email protected]>

matthiasr changed the title ~~Implement "fsm" match type~~ Faster glob matching using a finite state machine Sep 26, 2018

matthiasr reviewed Sep 26, 2018

View reviewed changes

tidy up and use go benchmark

5262b29

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion force-pushed the feat/fsm-matcher branch from 6a6b96c to 5262b29 Compare September 28, 2018 19:58

promote formatLabels to mapper package

fcf11f0

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion force-pushed the feat/fsm-matcher branch from e1d6b26 to fcf11f0 Compare September 28, 2018 21:06

fffonion added 2 commits September 28, 2018 14:42

add benchmark test to ci

699fa13

Signed-off-by: Wangchong Zhou <[email protected]>

optimizations

c10e80c

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion force-pushed the feat/fsm-matcher branch 3 times, most recently from ea07aca to 581ea4e Compare October 3, 2018 23:54

add readmes

4d9ce8c

Signed-off-by: Wangchong Zhou <[email protected]>

fffonion force-pushed the feat/fsm-matcher branch from 581ea4e to 4d9ce8c Compare October 4, 2018 17:55

Matthias Rampke added 5 commits October 10, 2018 21:18

Copy-edits in README

761e64d

straightening out some wording, and fixing unquoted `*`. Signed-off-by: Matthias Rampke <[email protected]>

Copy edits in FSM README

9fc976d

Format quote as such, capitalize FSM. Signed-off-by: Matthias Rampke <[email protected]>

Put in-line image on its own paragraph

97f71db

Signed-off-by: Matthias Rampke <[email protected]>

Copy edit comments in fsm/formatter

e5734e3

Fixing a typo and language. Signed-off-by: Matthias Rampke <[email protected]>

Move bench target out of common Makefile

4e53440

we should not change the vendored Makefile.common here. Signed-off-by: Matthias Rampke <[email protected]>

matthiasr approved these changes Oct 10, 2018

View reviewed changes

matthiasr merged commit 00e2c3f into prometheus:master Oct 10, 2018

hbagdi deleted the feat/fsm-matcher branch October 10, 2018 22:43

matthiasr mentioned this pull request Dec 4, 2020

glob takes precedence over regex, bug or design? #349

Closed


		}

		// DumpFSM accepts a io.writer and write the current FSM into dot file format

Faster glob matching using a finite state machine #157

Faster glob matching using a finite state machine #157

Conversation

fffonion commented Sep 10, 2018 • edited Loading

matthiasr commented Sep 10, 2018

matthiasr commented Sep 10, 2018

matthiasr commented Sep 10, 2018

fffonion commented Sep 10, 2018 • edited Loading

matthiasr commented Sep 10, 2018 via email

fffonion commented Sep 10, 2018

matthiasr commented Sep 10, 2018 via email

fffonion commented Sep 10, 2018

fffonion commented Sep 13, 2018

matthiasr left a comment

Choose a reason for hiding this comment

fffonion commented Sep 20, 2018

fffonion commented Sep 24, 2018

matthiasr commented Sep 26, 2018

fffonion commented Sep 26, 2018

matthiasr left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

matthiasr left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

matthiasr commented Sep 26, 2018

fffonion commented Sep 26, 2018

matthiasr commented Sep 28, 2018

fffonion commented Oct 4, 2018

matthiasr left a comment

Choose a reason for hiding this comment

fffonion commented Oct 10, 2018

fffonion commented Sep 10, 2018 •

edited

Loading

fffonion commented Sep 10, 2018 •

edited

Loading