Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inputs.internet_speed.enable_file_download option appears to be misnamed #11870

Closed
g-a-c opened this issue Sep 22, 2022 · 6 comments · Fixed by #11877
Closed

inputs.internet_speed.enable_file_download option appears to be misnamed #11870

g-a-c opened this issue Sep 22, 2022 · 6 comments · Fixed by #11877
Labels
bug unexpected problem or unintended behavior

Comments

@g-a-c
Copy link
Contributor

g-a-c commented Sep 22, 2022

Relevant telegraf.conf

[[inputs.internet_speed]]
  enable_file_download = false

Logs from Telegraf

n/a

System info

Telegraf 1.23.4

Docker

No response

Steps to reproduce

  1. Enable the speed test plugin
  2. Enable enable_file_download
  3. Disable enable_file_download
  4. While both of these may work, with this setting disabled the test should run for longer
  5. It's also possible that with this mode disabled, the test will timeout because it takes longer

Expected behavior

The option should be more appropriately named, to align with the upstream behaviour

Actual behavior

The option does not appear to be named consistently with upstream

Additional info

Assuming you use enable_file_download = true, this is commented in Telegraf as ## Sets if runs file download test. But if we follow this step-by-step through the code...

We start in internet_speed.go

type InternetSpeed struct {
	EnableFileDownload bool            `toml:"enable_file_download"`
	Cache              bool            `toml:"cache"`
	Log                telegraf.Logger `toml:"-"`
	serverCache        *speedtest.Server
}

So we know that EnableFileDownload is set to true if it's in the config. This is then consumed further down within internet_speed.go and passed into the DownloadTest function as true:

	err = s.DownloadTest(is.EnableFileDownload)

This jumps us over to the speedtest-go library, specifically into the DownloadTest function in request.go, where this true parameter is actually being consumed to set the savingMode property. According to the speedtest-go README, "saving mode" is nothing to do with downloading files, but appears to be a memory-saving option, which acknowledges that when it is enabled the results will be inaccurate:

  --saving-mode        Using less memory (≒10MB), though low accuracy (especially > 30Mbps).

So I think this indicates several things here:

  • the enable_file_download option in Telegraf is mis-named; the option has nothing to do with downloading files but is actually a memory saving option
  • both the upload and download tests are always run regardless of whether this is set
  • enable_file_download should be set to false for the most accurate results; HOWEVER some Speedtest.net servers appear to not like this mode because the library sends/receives more data. There is an upstream bug (speedtest-go work only in saving-mode showwin/speedtest-go#73) for "the client only works with --saving-mode" which I was able to broadly reproduce
@g-a-c g-a-c added the bug unexpected problem or unintended behavior label Sep 22, 2022
@powersj
Copy link
Contributor

powersj commented Sep 22, 2022

Completely agree the naming was not great and something I regret not catching. It is something that has come up before as a source of confusion.

We do not go removing options between minor versions, but we could add a new option and deprecate the old one, so that in a future 2.x version, the old option is removed.

Would you be willing to put up a PR for that?

@powersj powersj added the waiting for response waiting for response from contributor label Sep 22, 2022
@g-a-c
Copy link
Contributor Author

g-a-c commented Sep 22, 2022

Would you be willing to put up a PR for that?

I could take a look when I have a minute, sure. What would be the feeling around changing the default for this option to be true since this appears to improve the reliability of running the tests (but the upstream claims that the accuracy may be reduced). My gut feeling would be that "speed tests with reduced accuracy that always run" are better than "speed tests that are more accurate when they run, but that may often fail".

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Sep 22, 2022
@Hipska
Copy link
Contributor

Hipska commented Sep 23, 2022

We tend to be backwards compatible, but is some cases we deprecated settings to not being used anymore.

I'm not sure what you want in the future? Have a setting for file download or one for save mode? Or maybe both?

@g-a-c
Copy link
Contributor Author

g-a-c commented Sep 23, 2022

I'm not sure what you want in the future? Have a setting for file download or one for save mode? Or maybe both?

There is no setting for "file download". The option called "file download" in Telegraf actually controls the "save resources" flag in the upstream library. When this mode is enabled, it appears to make the upstream library do a simpler, shorter test supposedly at the expense of some accuracy.

With Telegraf as it stands right now:

  • enable_file_download does not control anything to do with file downloads
  • enable_file_download defaults false
  • This means that:
    • by default "save resources" mode is off in the upstream library
    • the upstream library claims to be more accurate but is also more prone to failure (the box I currently have Telegraf on cannot complete a single test successfully with "save resources = off")

What I'm proposing as an end goal is:

  • rename enable_file_download to something more appropriate like low_resources, save_resources (and refactor the internal variables to suit)
  • make this new setting default to true
  • This means that:
    • the setting name is more consistent with what it actually controls
    • there is a greater chance of the tests "just working" with the minimum configuration, rather than someone just turning it on and finding that it doesn't work, as happened for me (and at least one other user has the same problem in the upstream bug that I linked to)

@powersj
Copy link
Contributor

powersj commented Sep 23, 2022

I would prefer we keep the current behavior (false) and change the variable name. As a part of this you could add a section in the readme about the behavior to explain why a user would want to turn this off and the trade off you mentioned above.

As the code comment says, changing the behavior to true results in lower accuracy in connections with >30Mbps, that bar seems to be pretty low for most users of telegraf.

@Hipska
Copy link
Contributor

Hipska commented Sep 23, 2022

Yeah, that would be something like this + some extra info in the readme + sample config:

diff --git a/plugins/inputs/internet_speed/internet_speed.go b/plugins/inputs/internet_speed/internet_speed.go
index f517e5227..9c20ff3c4 100644
--- a/plugins/inputs/internet_speed/internet_speed.go
+++ b/plugins/inputs/internet_speed/internet_speed.go
@@ -17,7 +17,8 @@ var sampleConfig string
 
 // InternetSpeed is used to store configuration values.
 type InternetSpeed struct {
-	EnableFileDownload bool            `toml:"enable_file_download"`
+	EnableFileDownload bool            `toml:"enable_file_download" deprecated:"1.25.0;use 'save_resources' instead"`
+	SaveResources      bool            `toml:"save_resources"`
 	Cache              bool            `toml:"cache"`
 	Log                telegraf.Logger `toml:"-"`
 	serverCache        *speedtest.Server
@@ -29,6 +30,12 @@ func (*InternetSpeed) SampleConfig() string {
 	return sampleConfig
 }
 
+func (is *InternetSpeed) Init() error {
+	is.SaveResources = is.SaveResources || is.EnableFileDownload
+
+	return nil
+}
+
 func (is *InternetSpeed) Gather(acc telegraf.Accumulator) error {
 	// Get closest server
 	s := is.serverCache
@@ -58,12 +65,12 @@ func (is *InternetSpeed) Gather(acc telegraf.Accumulator) error {
 		return fmt.Errorf("ping test failed: %v", err)
 	}
 	is.Log.Debug("Running Download...")
-	err = s.DownloadTest(is.EnableFileDownload)
+	err = s.DownloadTest(is.SaveResources)
 	if err != nil {
 		return fmt.Errorf("download test failed: %v", err)
 	}
 	is.Log.Debug("Running Upload...")
-	err = s.UploadTest(is.EnableFileDownload)
+	err = s.UploadTest(is.SaveResources)
 	if err != nil {
 		return fmt.Errorf("upload test failed failed: %v", err)
 	}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants