Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: trim whitespace on ethtool input plugin #9901

Merged
merged 5 commits into from
Oct 19, 2021

Conversation

powersj
Copy link
Contributor

@powersj powersj commented Oct 11, 2021

The VMware vmxnet3 driver has hard-coded whitespace in the statistics descriptions. This is done to help make the ethtool output look "nice". However, this is causing lots of weird output and escaped characters in Telegraf output. This PR will remove any leading and trailing whitespace from these description keys.

Resolves: #9902

@powersj powersj changed the title debug ethtool fields fix: debug ethtool fields Oct 11, 2021
@powersj powersj changed the title fix: debug ethtool fields fix: trim whitespace on ethtool input plugin Oct 11, 2021
@influxdata influxdata deleted a comment from telegraf-tiger bot Oct 11, 2021
@powersj powersj marked this pull request as ready for review October 11, 2021 18:15
@powersj powersj added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Oct 11, 2021
@powersj
Copy link
Contributor Author

powersj commented Oct 12, 2021

@reimda would you want this to be a flag in order to not change things for users who are ignoring this? or feel it is a proper fix?

@reimda
Copy link
Contributor

reimda commented Oct 12, 2021

would you want this to be a flag in order to not change things for users who are ignoring this? or feel it is a proper fix?

Since it's a workaround for a single unusual network device driver I don't think it should need a flag. Metrics for these unusual devices seem pretty much unusable so I expect everyone would want the fix.

Usually tag and field names have underscores between words and not spaces. (see https://github.com/influxdata/telegraf/blob/master/docs/developers/REVIEWS.md#metric-schema and https://docs.influxdata.com/influxdb/v1.8/concepts/schema_and_data_layout/#encouraged-schema-design) Maybe we should replace the spaces between words with underscores for these fields too.

@powersj
Copy link
Contributor Author

powersj commented Oct 12, 2021

Hmm yes replacing spaces with underscores does seem to be a good idea here. Are you good with that being part of the bug fix as well? or something separate with a flag?

@reimda
Copy link
Contributor

reimda commented Oct 12, 2021

I don't know if other ethernet devices have spaces. If it was just vmxnet3 I'd say we don't need a flag for underscores, but we would need to be confident it wasn't going to affect other devices.

From a quick look at linux kernel source, here's where vmware added the whitespace and didn't use underscore:
https://elixir.bootlin.com/linux/latest/source/drivers/net/vmxnet3/vmxnet3_ethtool.c#L42

Compare with a typical driver:
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/intel/e1000/e1000_ethtool.c#L26

There are 160 or so drivers that return ethtool info. I spot checked 10 of them and found two with spaces. That's enough to show a significant portion of users would be affected if this plugin started switching spaces to underscores.

We definitely need a flag to switch spaces to underscores. I'm changing my mind on trimming whitespace too and think that should be a flag. Then there's no chance existing users will be affected by the change, but users who want cleaner names have a way to get them.

I did also run into one driver that uses camel case. It would be nice if we handled that too.

Here are the drivers I checked if you're interested. Maybe we should copy some of these into unit tests.

underscores: 7
https://elixir.bootlin.com/linux/latest/source/drivers/net/xen-netfront.c#L2354
https://elixir.bootlin.com/linux/latest/source/drivers/net/hyperv/netvsc_drv.c#L1475
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c#L42
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c#L263
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/marvell/mv643xx_eth.c#L1447
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/brocade/bna/bnad_ethtool.c#L1075
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/altera/altera_tse_ethtool.c#L231

spaces: 1
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-ethtool.c#L17

mixed underscore and spaces: 1
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c#L28

CamelCase: 1
https://elixir.bootlin.com/linux/latest/source/drivers/net/dsa/lantiq_gswip.c#L301

@reimda
Copy link
Contributor

reimda commented Oct 12, 2021

I'd prefer not to add multiple bool flags to select which field name normalizations should happen. trim_whitespace = true, replace_spaces = true, translate_uppercase = true. If we do it this way the configuration section for a plugin is many lines and future changes just make it longer.

If we are sure we know about all possible normalizations, we could have one setting normalize_field_names = true that does everything. The problem there is if we miss something, in a future release we would have to add another setting normalize_field_names_more = true which is ugly.

We should probably have a setting "workarounds" or "normalize_field_names" that's an array of strings, then users can choose which ones they want to apply: "trim", "underscore", "uppercase". Then if there's something else in the future, we just add another string. normalize_field_names = ["trim", "underscore"]

@powersj
Copy link
Contributor Author

powersj commented Oct 12, 2021

Thanks for looking into this! How about something like:

  ## Some drivers declare statistics with extra whitespace, different spacing,
  ## and mix cases. This list, when enabled, can be used to clean the keys.
  ## Here are the current possible normalizations:
  ##  * trim: removes leading and trailing whitespace
  ##  * lower: changes all capitalized letters to lowercase
  ##  * spaces: replaces spaces with underscores
  # normalize_key_names = ["trim", "lower", "spaces"]

@Hipska
Copy link
Contributor

Hipska commented Oct 14, 2021

Do you have an example of before/after output? That gives some context to reviewers.

@Hipska Hipska added area/network New plugins or feature relating to Network Monitoring feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins labels Oct 14, 2021
@powersj
Copy link
Contributor Author

powersj commented Oct 14, 2021

In #9902 I have one example that we saw based on the VMware vmxnet3 driver, here part of it:

dns_ethtool,driver=vmxnet3,host=foobar,interface=enp0s3 \ \ mcast\ pkts\ rx=14i,\ \ \ \ \ tso=0i,\ \ LRO\ byte\ rx=0i,Rx\ Queue#=0i,\ \ TSO\ pkts\ tx=10i,\ \ bcast\ pkts\ tx=3i,\ \ pkts\ linearized=0i,

There are numerous stats with leading whitespace and then whitespace between words in the stats. As @reimda pointed out, there are best practices for naming tags and fields, which involve not having leading/trialing whitespace or whitespace between words. When he looked at other drivers showed some hard-code camel case stat names as well.

If you look at the tests you can see some examples of how this would work. Take the following example

Original stat:        "  RX Sent "
Spaces normalization: "__RX_Sent_"
Lower normalization:  "  rx sent "
Trim normalization:   "RX Sent"
All three:            "rx_sent"

@reimda
Copy link
Contributor

reimda commented Oct 14, 2021

You added lower case which is nice, but we probably should handle camel case too since there are drivers that use it RxBytesReceived rx_bytes_received

Hipska
Hipska previously requested changes Oct 15, 2021
Copy link
Contributor

@Hipska Hipska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Init method should check if the given values are valid..

plugins/inputs/ethtool/ethtool_linux.go Outdated Show resolved Hide resolved
plugins/inputs/ethtool/ethtool_linux.go Show resolved Hide resolved
plugins/inputs/ethtool/ethtool_linux.go Show resolved Hide resolved
@powersj
Copy link
Contributor Author

powersj commented Oct 18, 2021

@reimda ready for re-review

Copy link
Contributor

@reimda reimda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the doc nitpick this looks good to me.

plugins/inputs/ethtool/README.md Outdated Show resolved Hide resolved
The upstream vmxnet3 driver declares the network driver stats
descriptions with leading and trailing whitespace. This was done to help
with the ethtool output on the CLI. The ethtool plugin uses a library
that polls the driver's stats directly and therefore this ugly
whitespace will show up in Telegraf output as well.
@powersj powersj dismissed Hipska’s stale review October 19, 2021 20:44

fixed naming items

@powersj powersj merged commit 3e1ebdb into influxdata:master Oct 19, 2021
phemmer added a commit to phemmer/telegraf that referenced this pull request Oct 26, 2021
* origin/master: (176 commits)
  fix: Linter fixes for plugins/inputs/[h-j]* (influxdata#9986)
  fix(inputs/kube_inventory): don't skip resources with zero s/ns timestamps (influxdata#9978)
  fix: update gjson to v1.10.2 (influxdata#9998)
  fix: procstat tags were not getting generated correctly (influxdata#9973)
  chore: create bug report form (influxdata#9976)
  fix: Allow for non x86 macs in Go install script (influxdata#9982)
  test: add sqlserver plugin integration tests (influxdata#9943)
  feat: plugins/common/tls/config.go: Filter client certificates by DNS names (influxdata#9910)
  feat: add option to skip table creation in azure data explorer output (influxdata#9942)
  docs: update nightlies links (influxdata#9989)
  fix: add s390x to nightlies (influxdata#9990)
  feat: Add more details to processors.ifname logmessages (influxdata#9984)
  docs: Create SECURITY.md (influxdata#9951)
  fix: set NIGHTLY=1 for correctly named nightly artifacts (influxdata#9987)
  feat: Kafka Add metadata full to config (influxdata#9833)
  chore: Update to AWS SDK v2 (influxdata#9647)
  chore: lint ignore fmt.Printf unhandled error (influxdata#9967)
  fix: starlark pop operation for non-existing keys (influxdata#9954)
  feat: update etc/telegraf.conf and etc/telegraf_windows.conf (influxdata#9876)
  fix: Check return code of zfs command for FreeBSD. (influxdata#9956)
  chore: update go to 1.17.2 (influxdata#9873)
  fix: Graylog plugin TLS support and message format (influxdata#9862)
  docs: update README with info on package repos (influxdata#9964)
  feat: Modbus connection settings (serial) (influxdata#9256)
  fix: segfault in ingress, persistentvolumeclaim, statefulset in kube_inventory (influxdata#9585)
  fix: add normalization of tags for ethtool input plugin (influxdata#9901)
  chore: remove empty build.py (influxdata#9958)
  fix: internet_speed input plugin not collecting/reporting latency (influxdata#9957)
  chore: reference db2 external plugin (influxdata#9952)
  chore: update readme go version from 1.14 to 1.17 (influxdata#9944)
  fix: decode Prometheus scrape path from Kuberentes labels (influxdata#9662)
  docs: fix broken link (influxdata#9812)
  fix: Correct conversion of int with specific bit size (influxdata#9933)
  fix: update golanci-lint to v1.42.1 (influxdata#9932)
  feat: Azure Event Hubs output plugin (influxdata#9346)
  feat: more fields for papertrail event webhook (influxdata#9940)
  fix: solve compatibility issue for mongodb inputs when using 5.x relicaset (influxdata#9892)
  docs: Add symlink to command documentation (influxdata#9926)
  docs: update contributing.md (influxdata#9914)
  chore: reference oracle external plugin (influxdata#9934)
  ...
phemmer added a commit to phemmer/telegraf that referenced this pull request Oct 26, 2021
* origin/master: (176 commits)
  fix: Linter fixes for plugins/inputs/[h-j]* (influxdata#9986)
  fix(inputs/kube_inventory): don't skip resources with zero s/ns timestamps (influxdata#9978)
  fix: update gjson to v1.10.2 (influxdata#9998)
  fix: procstat tags were not getting generated correctly (influxdata#9973)
  chore: create bug report form (influxdata#9976)
  fix: Allow for non x86 macs in Go install script (influxdata#9982)
  test: add sqlserver plugin integration tests (influxdata#9943)
  feat: plugins/common/tls/config.go: Filter client certificates by DNS names (influxdata#9910)
  feat: add option to skip table creation in azure data explorer output (influxdata#9942)
  docs: update nightlies links (influxdata#9989)
  fix: add s390x to nightlies (influxdata#9990)
  feat: Add more details to processors.ifname logmessages (influxdata#9984)
  docs: Create SECURITY.md (influxdata#9951)
  fix: set NIGHTLY=1 for correctly named nightly artifacts (influxdata#9987)
  feat: Kafka Add metadata full to config (influxdata#9833)
  chore: Update to AWS SDK v2 (influxdata#9647)
  chore: lint ignore fmt.Printf unhandled error (influxdata#9967)
  fix: starlark pop operation for non-existing keys (influxdata#9954)
  feat: update etc/telegraf.conf and etc/telegraf_windows.conf (influxdata#9876)
  fix: Check return code of zfs command for FreeBSD. (influxdata#9956)
  chore: update go to 1.17.2 (influxdata#9873)
  fix: Graylog plugin TLS support and message format (influxdata#9862)
  docs: update README with info on package repos (influxdata#9964)
  feat: Modbus connection settings (serial) (influxdata#9256)
  fix: segfault in ingress, persistentvolumeclaim, statefulset in kube_inventory (influxdata#9585)
  fix: add normalization of tags for ethtool input plugin (influxdata#9901)
  chore: remove empty build.py (influxdata#9958)
  fix: internet_speed input plugin not collecting/reporting latency (influxdata#9957)
  chore: reference db2 external plugin (influxdata#9952)
  chore: update readme go version from 1.14 to 1.17 (influxdata#9944)
  fix: decode Prometheus scrape path from Kuberentes labels (influxdata#9662)
  docs: fix broken link (influxdata#9812)
  fix: Correct conversion of int with specific bit size (influxdata#9933)
  fix: update golanci-lint to v1.42.1 (influxdata#9932)
  feat: Azure Event Hubs output plugin (influxdata#9346)
  feat: more fields for papertrail event webhook (influxdata#9940)
  fix: solve compatibility issue for mongodb inputs when using 5.x relicaset (influxdata#9892)
  docs: Add symlink to command documentation (influxdata#9926)
  docs: update contributing.md (influxdata#9914)
  chore: reference oracle external plugin (influxdata#9934)
  ...
powersj added a commit that referenced this pull request Oct 27, 2021
VladislavSenkevich pushed a commit to gwos/telegraf that referenced this pull request Nov 23, 2021
phemmer added a commit to phemmer/telegraf that referenced this pull request Nov 30, 2021
* origin/master: (176 commits)
  fix: Linter fixes for plugins/inputs/[h-j]* (influxdata#9986)
  fix(inputs/kube_inventory): don't skip resources with zero s/ns timestamps (influxdata#9978)
  fix: update gjson to v1.10.2 (influxdata#9998)
  fix: procstat tags were not getting generated correctly (influxdata#9973)
  chore: create bug report form (influxdata#9976)
  fix: Allow for non x86 macs in Go install script (influxdata#9982)
  test: add sqlserver plugin integration tests (influxdata#9943)
  feat: plugins/common/tls/config.go: Filter client certificates by DNS names (influxdata#9910)
  feat: add option to skip table creation in azure data explorer output (influxdata#9942)
  docs: update nightlies links (influxdata#9989)
  fix: add s390x to nightlies (influxdata#9990)
  feat: Add more details to processors.ifname logmessages (influxdata#9984)
  docs: Create SECURITY.md (influxdata#9951)
  fix: set NIGHTLY=1 for correctly named nightly artifacts (influxdata#9987)
  feat: Kafka Add metadata full to config (influxdata#9833)
  chore: Update to AWS SDK v2 (influxdata#9647)
  chore: lint ignore fmt.Printf unhandled error (influxdata#9967)
  fix: starlark pop operation for non-existing keys (influxdata#9954)
  feat: update etc/telegraf.conf and etc/telegraf_windows.conf (influxdata#9876)
  fix: Check return code of zfs command for FreeBSD. (influxdata#9956)
  chore: update go to 1.17.2 (influxdata#9873)
  fix: Graylog plugin TLS support and message format (influxdata#9862)
  docs: update README with info on package repos (influxdata#9964)
  feat: Modbus connection settings (serial) (influxdata#9256)
  fix: segfault in ingress, persistentvolumeclaim, statefulset in kube_inventory (influxdata#9585)
  fix: add normalization of tags for ethtool input plugin (influxdata#9901)
  chore: remove empty build.py (influxdata#9958)
  fix: internet_speed input plugin not collecting/reporting latency (influxdata#9957)
  chore: reference db2 external plugin (influxdata#9952)
  chore: update readme go version from 1.14 to 1.17 (influxdata#9944)
  fix: decode Prometheus scrape path from Kuberentes labels (influxdata#9662)
  docs: fix broken link (influxdata#9812)
  fix: Correct conversion of int with specific bit size (influxdata#9933)
  fix: update golanci-lint to v1.42.1 (influxdata#9932)
  feat: Azure Event Hubs output plugin (influxdata#9346)
  feat: more fields for papertrail event webhook (influxdata#9940)
  fix: solve compatibility issue for mongodb inputs when using 5.x relicaset (influxdata#9892)
  docs: Add symlink to command documentation (influxdata#9926)
  docs: update contributing.md (influxdata#9914)
  chore: reference oracle external plugin (influxdata#9934)
  ...
phemmer added a commit to phemmer/telegraf that referenced this pull request Jan 3, 2022
* origin/master: (176 commits)
  fix: Linter fixes for plugins/inputs/[h-j]* (influxdata#9986)
  fix(inputs/kube_inventory): don't skip resources with zero s/ns timestamps (influxdata#9978)
  fix: update gjson to v1.10.2 (influxdata#9998)
  fix: procstat tags were not getting generated correctly (influxdata#9973)
  chore: create bug report form (influxdata#9976)
  fix: Allow for non x86 macs in Go install script (influxdata#9982)
  test: add sqlserver plugin integration tests (influxdata#9943)
  feat: plugins/common/tls/config.go: Filter client certificates by DNS names (influxdata#9910)
  feat: add option to skip table creation in azure data explorer output (influxdata#9942)
  docs: update nightlies links (influxdata#9989)
  fix: add s390x to nightlies (influxdata#9990)
  feat: Add more details to processors.ifname logmessages (influxdata#9984)
  docs: Create SECURITY.md (influxdata#9951)
  fix: set NIGHTLY=1 for correctly named nightly artifacts (influxdata#9987)
  feat: Kafka Add metadata full to config (influxdata#9833)
  chore: Update to AWS SDK v2 (influxdata#9647)
  chore: lint ignore fmt.Printf unhandled error (influxdata#9967)
  fix: starlark pop operation for non-existing keys (influxdata#9954)
  feat: update etc/telegraf.conf and etc/telegraf_windows.conf (influxdata#9876)
  fix: Check return code of zfs command for FreeBSD. (influxdata#9956)
  chore: update go to 1.17.2 (influxdata#9873)
  fix: Graylog plugin TLS support and message format (influxdata#9862)
  docs: update README with info on package repos (influxdata#9964)
  feat: Modbus connection settings (serial) (influxdata#9256)
  fix: segfault in ingress, persistentvolumeclaim, statefulset in kube_inventory (influxdata#9585)
  fix: add normalization of tags for ethtool input plugin (influxdata#9901)
  chore: remove empty build.py (influxdata#9958)
  fix: internet_speed input plugin not collecting/reporting latency (influxdata#9957)
  chore: reference db2 external plugin (influxdata#9952)
  chore: update readme go version from 1.14 to 1.17 (influxdata#9944)
  fix: decode Prometheus scrape path from Kuberentes labels (influxdata#9662)
  docs: fix broken link (influxdata#9812)
  fix: Correct conversion of int with specific bit size (influxdata#9933)
  fix: update golanci-lint to v1.42.1 (influxdata#9932)
  feat: Azure Event Hubs output plugin (influxdata#9346)
  feat: more fields for papertrail event webhook (influxdata#9940)
  fix: solve compatibility issue for mongodb inputs when using 5.x relicaset (influxdata#9892)
  docs: Add symlink to command documentation (influxdata#9926)
  docs: update contributing.md (influxdata#9914)
  chore: reference oracle external plugin (influxdata#9934)
  ...
@powersj powersj deleted the debug/ethtool branch January 23, 2022 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/network New plugins or feature relating to Network Monitoring feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ethtool: lots of extra escaped charachters in output
3 participants