-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Enum Processor #3772
Add Enum Processor #3772
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pull request, I'm very interested in this one and I know quite a few other are as well. Here are a few ideas and observations I have:
plugins/processors/enum/README.md
Outdated
[processors.enum.fields.value_mappings] | ||
green = 0 | ||
yellow = 1 | ||
red = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will want a way to specify a default case, or perhaps we should remove the field if it does not match? If we emit the original string the data type will change, which will cause problems on insertion to the database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was using this plugin together with #3763, which drops any string values and thus removing unmatched values. For a more generic use-case I see the benefit in having a default behaviour to ensure a constant data type. I proposed to make dropping the field if it contains an unspecified value the default behaviour and optionally use a configurable default value:
[[processors.enum]]
[[processors.enum.fields]]
source = "status"
default = -1
[processors.enum.fields.value_mappings]
green = 0
yellow = 1
Would that be a reasonable solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
plugins/processors/enum/README.md
Outdated
[processors.enum.fields.value_mappings] | ||
green = 0 | ||
yellow = 1 | ||
red = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be nice if you could optionally modify the output fieldkey, so that you could transform like this:
-foo status="red"
+foo status="red",status_code=2i
Maybe something like this?:
[[processors.enum]]
[[processors.enum.fields]]
source = "status"
destination = "status_code"
[processors.enum.fields.value_mappings]
green = 0
yellow = 1
red = 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good idea. I will implement it.
plugins/processors/enum/enum.go
Outdated
} | ||
|
||
func (mapper *EnumMapper) Apply(in ...telegraf.Metric) []telegraf.Metric { | ||
out := make([]telegraf.Metric, 0, len(in)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should try to make this as efficient as possible since it will be ran in over all metrics, perhaps we can modify in
directly?
plugins/processors/enum/enum.go
Outdated
} | ||
|
||
func (mapper *EnumMapper) applyMappings(source telegraf.Metric) (telegraf.Metric, error) { | ||
if fields, changed := mapper.applyFieldMappings(source.Fields()); changed == true { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of grabbing all the fields, it might be faster to use HasField
to check for the fields we are interested in and modify the metric instead of creating a new one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking with HasField
first is a great suggestion. I will implement that.
As discussed above I have trouble when modifying the metric directly in the case where it has just the one metric we are interested in. Note, that the implementation will only create a new metric when a substitution was made and pass on the original object otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I flipped through the other comments and couldn't find the discussion of this, can you tell me more about the problem modifying the metric directly or point me in the right direction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can provide a simple test to demonstrate my observations:
import (
"fmt"
"testing"
"time"
"github.com/influxdata/telegraf/metric"
)
func TestDirectModification(t *testing.T) {
metric, _ := metric.New("m",
map[string]string{},
map[string]interface{}{"test": "value"},
time.Now(),
)
fmt.Println(metric.String()) // original state
metric.RemoveField("test")
fmt.Println(metric.String()) // field is not removed
metric.AddField("test", 13)
fmt.Println(metric.String()) // added new value
metric.RemoveField("test")
fmt.Println(metric.String()) // remove old value, but colon error
metric.Fields() // panic because of colon from above
}
This creates the following output:
$ go test
m test="value" 1519041597145903000
m test="value" 1519041597145903000
m test="value",test=13i 1519041597145903000
m ,test=13i 1519041597145903000
--- FAIL: TestDirectModification (0.00s)
panic: runtime error: index out of range [recovered]
panic: runtime error: index out of range
goroutine 5 [running]:
testing.tRunner.func1(0xc4201080f0)
/usr/local/Cellar/go/1.9.3/libexec/src/testing/testing.go:711 +0x2d2
panic(0x1292140, 0x1458a40)
/usr/local/Cellar/go/1.9.3/libexec/src/runtime/panic.go:491 +0x283
github.com/influxdata/telegraf/metric.(*metric).Fields(0xc42009a200, 0x1)
/Workspaces/go/src/github.com/influxdata/telegraf/metric/metric.go:345 +0xa22
github.com/influxdata/telegraf/plugins/processors/enum.TestDirectModification(0xc4201080f0)
/Workspaces/go/src/github.com/influxdata/telegraf/plugins/processors/enum/demo_test.go:24 +0x4f9
testing.tRunner(0xc4201080f0, 0x12f5940)
/usr/local/Cellar/go/1.9.3/libexec/src/testing/testing.go:746 +0xd0
created by testing.(*T).Run
/usr/local/Cellar/go/1.9.3/libexec/src/testing/testing.go:789 +0x2de
exit status 2
FAIL github.com/influxdata/telegraf/plugins/processors/enum 0.020s
So the following behaviour can be observed:
RemoveField(test)
on the last field does not change the metric, the field remains.AddField(test,13)
creates a new field with a different valueRemoveField(test)
removes the old value but leads to a leading ',' which indicates a problem- the problem from above manifests in a panic when the
Fields()
is called
This problem is always present, if the leading field in the string representation is removed. Even when more fields are present. Therefore, I decided, that modifying a field value directly by this approach is not possible. The issue lies in the unreliability of RemoveField(...)
.
Is there another approach to a direct modification? I really want to keep the feature of replacing the original value not just adding the mapped value as a new field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm rewriting the implementation of telegraf.Metric
for 1.6 and I'll fix this, I'll let you know when its ready for testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix is included in #3924
closes #2791 |
What if we need to map few strings to the same numeric value? |
Using this processor you specify a map what string to map to what value. You can easily specify multiple strings mapping to the same value, e.g.: [[processors.enum]]
[processors.enum.fields.value_mappings]
green = 0
yellow = 1
red = 1 |
ce31a1d
to
4fbcd1f
Compare
I committed changes addressing the questions by @danielnelson quite some time ago. Can we have another review on the current state of this PR? |
This processor allows substitution of field values of string or bool type with numeric values. It can be used, e.g. to map status codes such as "red", "amber" and "green" to values such as 0, 1, 2. Multiple fields can be configured, each with its own mapping table. Signed-off-by: Karsten Schnitter <[email protected]>
4fbcd1f
to
994adb5
Compare
I re-pushed to fix the merge conflict. |
I tried to map MAC-Addresses to hostname, but it doesn't work for me. Could it be possible that handling of more complex strings (containing : or _ ) and hwaddr is not yet covered? Or is therer a limitation for tag-fields? With this setting hostname is not filled with default value nor hostnames to MAC-Addresses.
|
Hi @gutschein, thanks for trying out the enum processor plugin. Currently, the plugins supports only mapping for fields, but not for tags. If I understand your example correctly, the configuration You could try to first emit the mac address as a field using If your problem persists, please provide an example measurement generated by your input configuration and the desired output from the enum processor plugin. This makes it easier to understand the issue and provide a solution. Best Regards, |
thanks for writing enum... :) and thanks for your response. Good idea to change field to tag later usinge the converter. I give it a try after the basic is running. You mentioned an option to order of the processing, too...interessting I wasn't aware of this option. Back to the basics: I tried it with the destination field as a simple field (not tag) but it stilld doesn't work. Even the My target: simplified target It seems to me that enum just accepts strings w/o space (an w/o |
I do not think, that a whitespace character is an issue. I think, that you suffer from pitfall 1: The enum processor plugin can only read fields, but no tags. In your example the plugin would look for a field If you want support for tags in the enum processor plugin, I suggest to create an issue with the corresponding feature request. Regarding the other pitfalls (2 and 3) it would be nice to have a short example that proves the misbehaviour. Can you provide that? So far I am not convinced, that there is an issue. But I will look into this in the next few days. |
Ah, ok, thanks for explanation. So...I will try next weekend: Test with field instead of a tag containing a whitespace. |
Tag support for |
@glinton, I missed out on that one. This is a great feature. I guess @gutschein just needs to change |
I just tested it...indeed...my example from 7 days ago works perfect with |
This processor allows substitution of field values of string or bool type with
numeric values. It can be used, e.g. to map status codes such as "red", "amber"
and "green" to values such as 0, 1, 2. Multiple fields can be configured, each
with its own mapping table.
Required for all PRs: