Reject invalid tags #1348

replay · 2019-06-14T02:46:46Z

This gives users the option to reject ingested metrics with invalid tags. Even when rejection is disabled it counts the metrics with invalid tags, this should make it easy for users to predict whether enabling the rejection is going to actually reject some of their metrics or not.
Also updates the vendored raintank/schema to the latest version, because it uses the new tag validation functions.

This is part of: grafana/grafana#15261
As discussed in: #1217 (comment)
Depends on this PR to be merged, because otherwise with the latest raintank/schema the serialization of cwr.ChunkWriteRequest is broken: #1335

The benchmarks look about the same with and without this change:

benchmark                                     old ns/op     new ns/op     delta
BenchmarkProcessMetricDataUniqueMetrics-3     656           731           +11.43%
BenchmarkProcessMetricDataSameMetric-3        657           754           +14.76%
benchmark                                     old allocs     new allocs     delta
BenchmarkProcessMetricDataUniqueMetrics-3     0              0              +0.00%
BenchmarkProcessMetricDataSameMetric-3        0              0              +0.00%
benchmark                                     old bytes     new bytes     delta
BenchmarkProcessMetricDataUniqueMetrics-3     5             5             +0.00%
BenchmarkProcessMetricDataSameMetric-3        5             5             +0.00%

Dieterbe · 2019-06-14T18:06:13Z

why is it optional? what happens to invalid tags if we "accept" them? and why would we accept them?

replay · 2019-06-14T18:27:05Z

It is optional because when enforced then it introduces new restrictions which previously have not been applied. That way some current user who isn't sure whether some of their metrics would get rejected can just disable the enforcement and look at the new invalid_tags metric to see if it increases or not, if it doesn't increase then none of their current metrics would get rejected.

replay · 2019-06-14T18:53:42Z

If metrics with invalid tags get accepted, as long as they still contain a = which is required for the index to be able to handle them, they will just get added to the index like others. If one does not contain a = at all, then the "invalid tag counter" gets increased and this tag gets ignored, but the metrics still get added to the index via their other tags and the name. If such invalid tags got accepted, it's then possible that they're not properly queriable, because graphite might not pass the query on.

adds new configuration option which decides whether ingested metrics with invalid tags should be rejected or not. even if rejection is disabled a new counter will be increased every time when an invalid metric has been received. this should make it easy for users who are not sure whether they currently send invalid metrics to MT to figure it out before they enable the rejection

ingests valid metrics, such with invalid tags, and such with invalid tag values. it does each of those cases once with invalid metric rejection enabled and once with it disabled. at the end of each test it verifies whether the counters have been updated correctly and whether the index has been grown by the expected size.

replay · 2019-06-18T22:43:43Z

Now that #1335 is merged I've rebased this onto master

Dieterbe · 2019-06-19T18:41:15Z

see https://github.com/grafana/metrictank/blob/master/docs/CONTRIBUTING.md#when-contributing-prs rule 16: potentially breaking change requires changelog update

Dieterbe · 2019-06-19T18:45:55Z

also rule 8!

Dieterbe · 2019-06-19T19:07:17Z

input/input.go

+			in.invalidTagMD.Inc()
+			if !rejectInvalidTags {
+				log.Debugf("in: Invalid metric %v, not rejecting it because rejection due to invalid tags is disabled: %s", md, err)
+				ignoreError = true


this is fragile because md.Validate() just returns the first failure it encounters (out of possibly many). we rely on the implementation detail that in md.Validate(), the tag check is the last one.
it's possible that an md is invalid for multiple reasons. if we ever add another validation check after the tag check, we would ignore that other failure.

But i don't see a simple solution, so it's probably fine. maybe add a comment about this

good point, will add a comment

Dieterbe · 2019-06-19T19:09:38Z

input/input_test.go

+	rejectInvalidTags = false
+	data := getTestMetricData()
+	data.Tags = []string{"valid=tag"}
+	testIngestMetricData(t, "valid_with_rejection_0", data, handler, index, 0, 0, 1)


why don't we follow the best practice of 1 function that has all the cases listed as entries in a table?

Just because it results in clearer errors when one fails. but I can easily change that

also, this is a lot of code for testing something that should be fairly trivial. especially all the code around dynamically generating test data, do we need it?

It definitely helps to make sure that the logic does what we want it to, and to ensure it doesn't get changed by accident in the future for example due to changes in raintank/schema.

Dieterbe

needs a couple tweaks

replay · 2019-06-19T21:41:31Z

@Dieterbe thx for the review. I think I've addressed all the comments.

replay force-pushed the reject_invalid_tags branch from a4f76c5 to 1b4d285 Compare June 14, 2019 02:49

replay added 5 commits June 18, 2019 22:35

update vendored raintank/schema to latest

a796977

fix benchmarks

d5c92ef

update docs with new counter

ae9ee2e

replay force-pushed the reject_invalid_tags branch from 1b4d285 to ae9ee2e Compare June 18, 2019 22:36

add log statement

ca2efba

Dieterbe reviewed Jun 19, 2019

View reviewed changes

Dieterbe suggested changes Jun 19, 2019

View reviewed changes

replay force-pushed the reject_invalid_tags branch from c22f89d to f3283df Compare June 19, 2019 21:31

replay added 3 commits June 19, 2019 21:38

update docs and example configs

61eaf69

add comment

dcc06fe

refactor tests

ba021f9

replay force-pushed the reject_invalid_tags branch from f3283df to ba021f9 Compare June 19, 2019 21:40

update dashboard

096af6c

Dieterbe approved these changes Jun 20, 2019

View reviewed changes

Dieterbe merged commit 4ee8716 into master Jun 20, 2019

Dieterbe deleted the reject_invalid_tags branch June 20, 2019 06:57

robert-milan restored the reject_invalid_tags branch June 20, 2019 15:00

replay deleted the reject_invalid_tags branch June 22, 2019 14:43

replay mentioned this pull request Jul 10, 2019

Characters supported in tag values #1216

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reject invalid tags #1348

Reject invalid tags #1348

replay commented Jun 14, 2019

Dieterbe commented Jun 14, 2019

replay commented Jun 14, 2019

replay commented Jun 14, 2019 •

edited

Loading

replay commented Jun 18, 2019

Dieterbe commented Jun 19, 2019

Dieterbe commented Jun 19, 2019

Dieterbe Jun 19, 2019

replay Jun 19, 2019

Dieterbe Jun 19, 2019

replay Jun 19, 2019

Dieterbe Jun 19, 2019

replay Jun 19, 2019

Dieterbe left a comment

replay commented Jun 19, 2019

Reject invalid tags #1348

Reject invalid tags #1348

Conversation

replay commented Jun 14, 2019

Dieterbe commented Jun 14, 2019

replay commented Jun 14, 2019

replay commented Jun 14, 2019 • edited Loading

replay commented Jun 18, 2019

Dieterbe commented Jun 19, 2019

Dieterbe commented Jun 19, 2019

Dieterbe Jun 19, 2019

Choose a reason for hiding this comment

replay Jun 19, 2019

Choose a reason for hiding this comment

Dieterbe Jun 19, 2019

Choose a reason for hiding this comment

replay Jun 19, 2019

Choose a reason for hiding this comment

Dieterbe Jun 19, 2019

Choose a reason for hiding this comment

replay Jun 19, 2019

Choose a reason for hiding this comment

Dieterbe left a comment

Choose a reason for hiding this comment

replay commented Jun 19, 2019

replay commented Jun 14, 2019 •

edited

Loading