-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[dbnode] Validate indexable metric tags before insert into index queue #2046
Conversation
@@ -0,0 +1,398 @@ | |||
// Copyright (c) 2018 Uber Technologies, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit 19
Codecov Report
@@ Coverage Diff @@
## master #2046 +/- ##
========================================
+ Coverage 63.3% 72.4% +9%
========================================
Files 1001 1007 +6
Lines 86014 86414 +400
========================================
+ Hits 54523 62571 +8048
+ Misses 27580 19668 -7912
- Partials 3911 4175 +264
Continue to review full report at Codecov.
|
Makefile
Outdated
@@ -89,6 +89,7 @@ TOOLS := \ | |||
clone_fileset \ | |||
dtest \ | |||
verify_commitlogs \ | |||
verify_data_files \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be added to the license check? (Also comparator if you're adding that fix)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to FOSSA, looks like comparator is already there 👍
return ErrUsingReservedFieldName | ||
} | ||
if !utf8.Valid(tagValue) { | ||
return fmt.Errorf("document contains invalid field value: %v", tagValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit; add name in error log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
func ValidateMetricTag(tag ident.Tag) error { | ||
tagName := tag.Name.Bytes() | ||
tagValue := tag.Value.Bytes() | ||
if !utf8.Valid(tagName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is this a more likely error scenario than the ReservedFieldNameID case? Also can probably wait to get the value bytes until after name passes validation
"name_as_string=%s, name_as_hex=%s"+ | ||
"value_as_string=%s, value_as_hex=%s", | ||
err, | ||
tag.Name.Bytes(), tag.Name.Bytes(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think first one of these should be .String()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether you give it a string or []byte it acts the same thankfully - avoids allocations if leave it as []byte so I usually don't provide as a string to the formatter. e.g.:
https://play.golang.org/p/-yhIOkLan9H
if !utf8.Valid(id.Bytes()) { | ||
return readEntryResult{invalidID: true}, | ||
fmt.Errorf("invalid id: err=%s, as_string=%s, as_hex=%x", | ||
"non-utf8", id.Bytes(), id.Bytes()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think first should be a string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether you give it a string or []byte it acts the same thankfully - avoids allocations if leave it as []byte so I usually don't provide as a string to the formatter. e.g.:
https://play.golang.org/p/-yhIOkLan9H
// ValidateSeriesTag validates a series tag for use with m3ninx. | ||
func ValidateSeriesTag(tag ident.Tag) error { | ||
tagName := tag.Name.Bytes() | ||
tagValue := tag.Value.Bytes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: May be better to only take the value after name validation passes?
@@ -241,6 +241,11 @@ func NewTags(values ...Tag) Tags { | |||
return Tags{values: values} | |||
} | |||
|
|||
// Reset resets the tags for reuse. | |||
func (t *Tags) Reset(values []Tag) { | |||
t.values = values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to take ownership of the tags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Followed up, ident.ID does the ref tracking so we are ok here.
} | ||
|
||
// Get all fileset files. | ||
log.Info("discovering file sets", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meganit: could build up a fields := []zapcore.Fields{zap.Strings("namespaces", namespaces)}
and append to it as we go?
return nil, fmt.Errorf("could not read dir names: %v", err) | ||
} | ||
|
||
results := make([]string, 0, len(entries)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can do this in place without an alloc results := entries[:0]
and keep the rest the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
|
||
results := make([]string, 0, len(entries)) | ||
for _, p := range entries { | ||
if p == "." || p == ".." || p == "./.." || p == "./" || p == "../" || p == "./../" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do if p[0] == '.' { continue }
? Or is there a chance there'll be valid files starting with .
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct they could be "./foo"
return err | ||
} | ||
|
||
defer reader.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be deferred earlier before the error above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Close() is only valid after an Open(), hence doing it below the return
|
||
check, err := readEntry(id, tags, data, checksum) | ||
data.Finalize() // Always finalize data. | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: reduce nesting with a continue?
if err == nil {
continue
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good.
|
||
// NOTE: we output to a new directory so that we don't clobber files. | ||
writeFsOpts := fsOpts.SetFilePathPrefix(opts.fixDir) | ||
writer, err := fs.NewWriter(writeFsOpts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this defer writer.Close()
and return nil at the end instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since writer.Close()
does quite a few final steps, I'm going to continue to return an error if it returns an error.
To make sure we always close up I changed it to this:
success := false
defer func() {
if !success {
writer.Close()
}
}()
// do work
success = true
return writer.Close()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good 👍
// Need to remove invalid tags. | ||
tagsBuffer = tagsBuffer[:0] | ||
for _, tag := range currTags { | ||
if err := convert.ValidateSeriesTag(tag); err != nil { | ||
removedTags++ | ||
continue | ||
} | ||
tagsBuffer = append(tagsBuffer, tag) | ||
} | ||
|
||
// Make sure we write out the modified tags. | ||
writeTags = tagsBuffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this risk 2 series with the same tagset? Should probably skip the entire series instead IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, can remove the series.
What this PR does / why we need it:
This also adds a tool to fix existing file sets.
Special notes for your reviewer:
Does this PR introduce a user-facing and/or backwards incompatible change?:
Does this PR require updating code package or user-facing documentation?: