Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long startup time for influx CLI since introducing Flux 0.42 #15441

Closed
mark-rushakoff opened this issue Oct 16, 2019 · 13 comments · Fixed by #15777
Closed

Long startup time for influx CLI since introducing Flux 0.42 #15441

mark-rushakoff opened this issue Oct 16, 2019 · 13 comments · Fixed by #15777

Comments

@mark-rushakoff
Copy link
Contributor

I noticed that it is taking about 2.5 seconds to run influx help on my workstation.

I bisected to the commit that upgraded to Flux 0.42. Before that it takes around a second, which is still too long IMO.

I have not yet taken a CPU profile to understand where exactly time is being spent.

bash-3.2$ git log -1
commit 454003841aabeb74396d73541378bfa59c75b5db (HEAD)
Author: Nathaniel Cook <[email protected]>
Date:   Fri Aug 30 12:22:48 2019 -0600

    build(flux): upgrade to Flux v0.42.0

    Skips a test that requires special dependencies to pass.
bash-3.2$ GO111MODULE=on go build ./cmd/influx && time ./influx help
Influx Client

Usage:
  influx [flags]
  influx [command]
<snip>

real	0m2.355s
user	0m3.832s
sys	0m0.120s

bash-3.2$ git checkout 45400384^
Previous HEAD position was 454003841 build(flux): upgrade to Flux v0.42.0
HEAD is now at 333dc1d19 Merge pull request #14883 from influxdata/fix/flux-data-check

bash-3.2$ GO111MODULE=on go build ./cmd/influx && time ./influx help
Influx Client

Usage:
  influx [flags]
  influx [command]
<snip>

real	0m0.981s
user	0m2.099s
sys	0m0.086s
@cameronelliott
Copy link

When I am running “influx auth help”, it is taking between 4 to 10 seconds just to print the usage. strace reveals a lot of timing data being collected, along with futex.
It’s kind of a mood killer to wait 4 to 10 seconds to get help output.

root@120792:~# time influx auth help
… [ removed ] …
real 0m10.109s
user 0m9.637s
sys 0m0.154s

If I had to guess influx is running timing loops to try to discover timing granularity or resolution, it’s really going to be a bummer for people if they need to run x100 influx commands and the mean time to real work is ~5 seconds for each, meaning ~500 seconds or over 9 minutes waiting while system clock granularity is re-discovered on each run.

@Srokap
Copy link

Srokap commented Nov 5, 2019

It's over 1min on raspberrypi, prohibitively slow to use cli in a script for me. Also seems to be CPU intensive like crazy on initialization alone.

@fchikwekwe fchikwekwe self-assigned this Nov 5, 2019
@fchikwekwe
Copy link
Contributor

I'm trying to reproduce this issue, but I'm seeing 1 second times using InfluxDB's latest commit, which includes Flux v0.53.0. Screen Shot 2019-11-05 at 1.59.30 PM.png

@fchikwekwe
Copy link
Contributor

Does anyone have an idea as to why mine isn't slow?

@mark-rushakoff
Copy link
Contributor Author

@fchikwekwe one second to output a help message is still inappropriately slow. But, I suspect you are not running the build of influx that you think you are running.

cd /path/to/influxdb && GO111MODULE=on go build ./cmd/influx && time ./influx help still reports 2.5 real seconds on my machine, on current master (0d89697), built with go version go1.12.9 darwin/amd64.

@cameronelliott
Copy link

@fchikwekwe I would agree one second is too, long, but on some of my virtual machines, the time is ten seconds! I suspect there is a blocking calibration type routing occuring (I strace'd this)
which is blocking startup while calibration occurs.

As I built Ansible scripts to setup Influx, waiting 10 seconds for each test to understand how the cli/db/files interact was quite painful.

@fchikwekwe
Copy link
Contributor

Yeah, thanks. Running your command allows me to reproduce the error. I'll work on trying to profile it.

@mark-rushakoff
Copy link
Contributor Author

I applied this diff to influx:

diff --git a/cmd/influx/main.go b/cmd/influx/main.go
index 9f25ff33b..179540712 100644
--- a/cmd/influx/main.go
+++ b/cmd/influx/main.go
@@ -6,6 +6,8 @@ import (
 	"io/ioutil"
 	"os"
 	"path/filepath"
+	"runtime"
+	"runtime/pprof"
 
 	"github.com/influxdata/influxdb"
 	"github.com/influxdata/influxdb/bolt"
@@ -17,6 +19,20 @@ import (
 	"github.com/spf13/viper"
 )
 
+var profileFile *os.File
+
+func init() {
+	runtime.SetCPUProfileRate(50000)
+	f, err := os.Create("/tmp/influx.cpuprof")
+	if err != nil {
+		panic(err)
+	}
+	profileFile = f
+	if err := pprof.StartCPUProfile(f); err != nil {
+		panic(err)
+	}
+}
+
 var influxCmd = &cobra.Command{
 	Use:   "influx",
 	Short: "Influx Client",
@@ -74,6 +90,8 @@ func main() {
 	if err := influxCmd.Execute(); err != nil {
 		os.Exit(1)
 	}
+	pprof.StopCPUProfile()
+	profileFile.Close()
 }
 
 // Flags contains all the CLI flag values for influx.

But I am not seeing anything useful in the profiles -- I believe this is because imported packages' init functions (in this case, from flux) run before the init functions in the main package.

However, when I start influx and press ctrl-\ to send SIGQUIT, I seem to always see github.com/influxdata/influxdb/query/builtin.init.0 in the stack, which calls flux.FinalizeBuiltIns()

The simplest approach to fix this would be to lazily call flux.FinalizeBuiltins, probably via a sync.Once. I have no idea how many call sites would be affected by this.

@fchikwekwe
Copy link
Contributor

Okay, that it pretty helpful. Let me get some input from the rest of the flux team on this.

@Srokap
Copy link

Srokap commented Nov 6, 2019

Are nightly builds on download page working? Tried https://dl.influxdata.com/influxdb/nightlies/influxdb-nightly_linux_armhf.tar.gz but it seems to have modification date on 11th Jan 2019 on files within the package.

@fchikwekwe
Copy link
Contributor

fchikwekwe commented Nov 6, 2019

@Srokap From what I understand, it doesn't seem like nightly builds are being generated for 1.x anymore. I'm not sure about the ones on the downloads page though.

@fchikwekwe
Copy link
Contributor

fchikwekwe commented Nov 6, 2019

It seems like at some point the nightly builds may have inadvertently been killed. I've opened an issue for this here: #15789.

@Srokap
Copy link

Srokap commented Nov 6, 2019

Ok, was hoping to pick up nightly, but should be able to compile it myself to verify fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants