[4189] Fix colliding filters by updating match method for MultiCharSequenceFilter to iterate through all patterns for the most complete match and not return on the first pattern found #4188

Merged: 9 commits, Feb 22, 2023
29 changes: 25 additions & 4 deletions src/metrics/filters/filter.go
@@ -585,14 +585,35 @@ func (f *multiCharSequenceFilter) matches(val []byte) ([]byte, bool) {
return nil, false
}

var matchIndex int
var bestPattern []byte
for _, pattern := range f.patterns {
if f.backwards && bytes.HasSuffix(val, pattern) {
return val[:len(val)-len(pattern)], true
if len(pattern) > len(val) {
continue
}

if !f.backwards && bytes.HasPrefix(val, pattern) {
return val[len(pattern):], true
if f.backwards {
if bytes.HasSuffix(val, pattern) {
if bestPattern == nil || len(pattern) > len(bestPattern) {
Collaborator

Nit: seems like the bestPattern == nil check isn't needed.

Collaborator Author

Good catch, updated.

bestPattern = pattern
matchIndex = len(val) - len(pattern)
}
}
} else {
if bytes.HasPrefix(val, pattern) {
if bestPattern == nil || len(pattern) > len(bestPattern) {
bestPattern = pattern
matchIndex = len(pattern)
}
}
}
}

if bestPattern != nil {
if f.backwards {
return val[:matchIndex], true
}
return val[matchIndex:], true
}

return nil, false
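The longest-match selection introduced in the filter.go change above can be sketched in isolation. This is a minimal illustration, not the m3 implementation: `longestMatch` is a hypothetical free function that takes the patterns and direction as arguments instead of reading them from a filter struct. It also drops the `bestPattern == nil` check discussed in the review, which relies on the assumption that patterns are never empty.

```go
package main

import (
	"bytes"
	"fmt"
)

// longestMatch is a hypothetical stand-alone helper (not part of m3).
// It strips the longest pattern that matches at the start of val, or at
// the end of val when backwards is true, and returns the remainder.
func longestMatch(val []byte, patterns [][]byte, backwards bool) ([]byte, bool) {
	var best []byte
	for _, p := range patterns {
		matched := bytes.HasPrefix(val, p)
		if backwards {
			matched = bytes.HasSuffix(val, p)
		}
		// len(nil) is 0, so a separate nil check is unnecessary as long as
		// every pattern is non-empty (an assumption of this sketch).
		if matched && len(p) > len(best) {
			best = p
		}
	}
	if best == nil {
		return nil, false
	}
	if backwards {
		return val[:len(val)-len(best)], true
	}
	return val[len(best):], true
}

func main() {
	patterns := [][]byte{
		[]byte("arachne_failures"),
		[]byte("arachne_failures_by_rack"),
	}
	rest, ok := longestMatch([]byte("arachne_failures_by_rack"), patterns, false)
	fmt.Printf("remainder=%q matched=%v\n", rest, ok)
}
```

Because every pattern is examined before returning, the result does not depend on the order in which the patterns were listed.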
13 changes: 13 additions & 0 deletions src/metrics/filters/filter_test.go
@@ -27,6 +27,19 @@ import (
"github.com/stretchr/testify/require"
)

func TestPrefixCompositeR2Filter(t *testing.T) {
Collaborator

Seems like a better place to add these test cases is in TestMultiCharSequenceFilter in this file.

Collaborator Author

Agreed, I also think the naming isn't great. Is the R2 naming only relevant internally to Uber?

Collaborator

Nah, there's R2 references aplenty in OSS world.

id0 := "arachne_failures"
id1 := "arachne_failures_by_rack"
f, err := newMultiCharSequenceFilter([]byte("arachne_failures,arachne_failures_by_rack"), false)
require.NoError(t, err)
val1, matches1 := f.matches([]byte(id0))
require.True(t, matches1)
require.True(t, len(val1) == 0)
val2, matches2 := f.matches([]byte(id1))
require.True(t, matches2)
require.True(t, len(val2) == 0)
}

func TestNewFilterFromFilterValueInvalidPattern(t *testing.T) {
inputs := []string{"ab]c[sdf", "abc[z-a]", "*con[tT]ains*"}
for _, input := range inputs {