Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster glob matching using a finite state machine #157

Merged
merged 23 commits into from
Oct 10, 2018
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions main.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,10 @@
package main

import (
"bufio"
"net"
"net/http"
"os"
"strconv"

"github.com/howeyc/fsnotify"
Expand Down Expand Up @@ -118,6 +120,20 @@ func watchConfig(fileName string, mapper *mapper.MetricMapper) {
}
}

func dumpFSM(mapper *mapper.MetricMapper, dumpFilename string) error {
f, err := os.Create(dumpFilename)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the file already exists? What should the behavior be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will rewrite the existing file, I feel this is acceptable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, sounds good!

if err != nil {
return err
}
log.Infoln("Start dumping FSM to", dumpFilename)
w := bufio.NewWriter(f)
mapper.FSM.DumpFSM(w)
w.Flush()
f.Close()
log.Infoln("Finish dumping FSM")
return nil
}

func main() {
var (
listenAddress = kingpin.Flag("web.listen-address", "The address on which to expose the web interface and generated Prometheus metrics.").Default(":9102").String()
Expand All @@ -126,6 +142,7 @@ func main() {
statsdListenTCP = kingpin.Flag("statsd.listen-tcp", "The TCP address on which to receive statsd metric lines. \"\" disables it.").Default(":9125").String()
mappingConfig = kingpin.Flag("statsd.mapping-config", "Metric mapping configuration file name.").String()
readBuffer = kingpin.Flag("statsd.read-buffer", "Size (in bytes) of the operating system's transmit read buffer associated with the UDP connection. Please make sure the kernel parameters net.core.rmem_max is set to a value greater than the value specified.").Int()
dumpFSMPath = kingpin.Flag("statsd.dump-fsm", "The path to dump internal FSM generated for glob matching as Dot file.").Default("").String()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now I'm wondering if the whole --statsd.* flag namespace makes sense. Maybe there should be a "mapping" namespace? (then --statsd.mapping-config should also move into that, but that should be a separate change)

I'm really not sure about this, so it's fine to leave as is. just putting the idea here …

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or for this particular one, it could be --debug.dump-fsm? since this is really only a debugging feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah i second on the debug.dump-fsm, let me change it.

)

log.AddFlags(kingpin.CommandLine)
Expand Down Expand Up @@ -183,6 +200,12 @@ func main() {
if err != nil {
log.Fatal("Error loading config:", err)
}
if *dumpFSMPath != "" {
err := dumpFSM(mapper, *dumpFSMPath)
if err != nil {
log.Fatal("Error dumpping FSM:", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: dumpping -> dumping

}
}
go watchConfig(*mappingConfig, mapper)
}
exporter := NewExporter(mapper)
Expand Down
68 changes: 68 additions & 0 deletions pkg/mapper/fsm/formatter.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
// Copyright 2018 The Prometheus Authors
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package fsm

import (
"fmt"
"strconv"
"strings"
)

type templateFormatter struct {
captureIndexes []int
captureCount int
fmtString string
}

func newTemplateFormatter(valueExpr string, captureCount int) *templateFormatter {
matches := templateReplaceCaptureRE.FindAllStringSubmatch(valueExpr, -1)
if len(matches) == 0 {
// if no regex reference found, keep it as it is
return &templateFormatter{captureCount: 0, fmtString: valueExpr}
}

var indexes []int
valueFormatter := valueExpr
for _, match := range matches {
idx, err := strconv.Atoi(match[len(match)-1])
if err != nil || idx > captureCount || idx < 1 {
// if index larger than captured count or using unsupported named capture group,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

were named capture groups supported before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's only supported in regex, although glob used regex under the hood it was not possible to name the group

// replace with empty string
valueFormatter = strings.Replace(valueFormatter, match[0], "", -1)
} else {
valueFormatter = strings.Replace(valueFormatter, match[0], "%s", -1)
// note: the regex reference variable $? starts from 1
indexes = append(indexes, idx-1)
}
}
return &templateFormatter{
captureIndexes: indexes,
captureCount: len(indexes),
fmtString: valueFormatter,
}
}

func (formatter *templateFormatter) format(captures map[int]string) string {
if formatter.captureCount == 0 {
// no label substitution, keep as it is
return formatter.fmtString
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was told it's more Go idiomatic to not use an else here, but to just carry on:

if <bail condition> {
  return
}

<do stuff>
return

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is more interesting if there are multiple bail conditions / returns though, when otherwise you'd get deeply nested else's (or use monads), so I'm fine with leaving it as is.

indexes := formatter.captureIndexes
vargs := make([]interface{}, formatter.captureCount)
for i, idx := range indexes {
vargs[i] = captures[idx]
}
return fmt.Sprintf(formatter.fmtString, vargs...)
}
}
Loading