Add new metrics collected by debuginfo (#62)
1. Updated the docs to reflect the implementation that will be introduced by hypermodeinc/dgraph#7439
2. minor fixes
OmarAyo authored Mar 12, 2021
1 parent 2c1f6c2 commit 3d9d22f
Showing 1 changed file with 88 additions and 56 deletions.
144 changes: 88 additions & 56 deletions content/howto/retrieving-debug-information.md
@@ -57,45 +57,87 @@ The HTTP page `/debug/pprof/` is available at the HTTP port of a Dgraph Zero or

## Profiling Information with `debuginfo`

Instead of sending a request to the server for each CPU, Memory, and goroutine profile, you can use the `debuginfo` command to collect all the profiles you need in one go.
Instead of sending a request to the server for each CPU, memory, and `goroutine` profile, you can use the `debuginfo` command to collect all of these profiles, along with several metrics.

You can run the command like this:

```sh
dgraph debuginfo -a <alpha_address:port> -z <zero_address:port> -d <path_to_dir_to_store_profiles>
dgraph debuginfo -a <alpha_address:port> -z <zero_address:port> -d <path_to_dir_to_store_profiles>
```

Your output should look like:

```log
[Decoder]: Using assembly version of decoder
Page Size: 4096
I0120 14:57:43.722166 15018 run.go:85] using directory /tmp/dgraph-debuginfo121781350 for debug info dump.
I0120 14:57:43.722272 15018 pprof.go:72] fetching profile over HTTP from http://localhost:8080/debug/pprof/goroutine?duration=30
I0120 14:57:43.722281 15018 pprof.go:74] please wait... (30s)
I0120 14:57:43.724208 15018 pprof.go:62] saving goroutine profile in /tmp/dgraph-debuginfo121781350/alpha_goroutine.gz
I0120 14:57:43.724217 15018 pprof.go:72] fetching profile over HTTP from http://localhost:8080/debug/pprof/heap?duration=30
I0120 14:57:43.724222 15018 pprof.go:74] please wait... (30s)
I0120 14:57:43.726212 15018 pprof.go:62] saving heap profile in /tmp/dgraph-debuginfo121781350/alpha_heap.gz
I0120 14:57:43.726220 15018 pprof.go:72] fetching profile over HTTP from http://localhost:8080/debug/pprof/threadcreate?duration=30
I0120 14:57:43.726225 15018 pprof.go:74] please wait... (30s)
I0120 14:57:43.727054 15018 pprof.go:62] saving threadcreate profile in /tmp/dgraph-debuginfo121781350/alpha_threadcreate.gz
I0120 14:57:43.727064 15018 pprof.go:72] fetching profile over HTTP from http://localhost:8080/debug/pprof/block?duration=30
I0120 14:57:43.727071 15018 pprof.go:74] please wait... (30s)
I0120 14:57:43.727958 15018 pprof.go:62] saving block profile in /tmp/dgraph-debuginfo121781350/alpha_block.gz
I0120 14:57:43.727967 15018 pprof.go:72] fetching profile over HTTP from http://localhost:8080/debug/pprof/mutex?duration=30
I0120 14:57:43.727971 15018 pprof.go:74] please wait... (30s)
I0120 14:57:43.728622 15018 pprof.go:62] saving mutex profile in /tmp/dgraph-debuginfo121781350/alpha_mutex.gz
I0120 14:57:43.728630 15018 pprof.go:72] fetching profile over HTTP from http://localhost:8080/debug/pprof/profile?duration=30
I0120 14:57:43.728635 15018 pprof.go:74] please wait... (30s)
I0120 14:58:13.788794 15018 pprof.go:62] saving profile profile in /tmp/dgraph-debuginfo121781350/alpha_profile.gz
I0120 14:58:13.788827 15018 pprof.go:72] fetching profile over HTTP from http://localhost:8080/debug/pprof/trace?duration=30
I0120 14:58:13.788841 15018 pprof.go:74] please wait... (30s)
I0120 14:58:14.792110 15018 pprof.go:62] saving trace profile in /tmp/dgraph-debuginfo121781350/alpha_trace.gz
I0120 14:58:14.799585 15018 run.go:115] Debuginfo archive successful: dgraph-debuginfo121781350.tar.gz
```

When the command finishes, `debuginfo` returns the tarball's file name. In this example, it was saved in `/tmp/dgraph-debuginfo121781350/alpha_trace.gz`.
I0311 14:13:53.243667 32654 run.go:118] using directory /tmp/dgraph-debuginfo037351492 for debug info dump.
I0311 14:13:53.243864 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/debug/pprof/heap
I0311 14:13:53.243872 32654 debugging.go:70] please wait... (30s)
I0311 14:13:53.245338 32654 debugging.go:58] saving heap metric in /tmp/dgraph-debuginfo037351492/alpha_heap.gz
I0311 14:13:53.245349 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/debug/pprof/profile?seconds=30
I0311 14:13:53.245357 32654 debugging.go:70] please wait... (30s)
I0311 14:14:23.250079 32654 debugging.go:58] saving cpu metric in /tmp/dgraph-debuginfo037351492/alpha_cpu.gz
I0311 14:14:23.250148 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/state
I0311 14:14:23.250173 32654 debugging.go:70] please wait... (30s)
I0311 14:14:23.255467 32654 debugging.go:58] saving state metric in /tmp/dgraph-debuginfo037351492/alpha_state.gz
I0311 14:14:23.255507 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/health
I0311 14:14:23.255528 32654 debugging.go:70] please wait... (30s)
I0311 14:14:23.257453 32654 debugging.go:58] saving health metric in /tmp/dgraph-debuginfo037351492/alpha_health.gz
I0311 14:14:23.257507 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/jemalloc
I0311 14:14:23.257548 32654 debugging.go:70] please wait... (30s)
I0311 14:14:23.259009 32654 debugging.go:58] saving jemalloc metric in /tmp/dgraph-debuginfo037351492/alpha_jemalloc.gz
I0311 14:14:23.259055 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/debug/pprof/trace?seconds=30
I0311 14:14:23.259091 32654 debugging.go:70] please wait... (30s)
I0311 14:14:53.266092 32654 debugging.go:58] saving trace metric in /tmp/dgraph-debuginfo037351492/alpha_trace.gz
I0311 14:14:53.266152 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/metrics
I0311 14:14:53.266181 32654 debugging.go:70] please wait... (30s)
I0311 14:14:53.276357 32654 debugging.go:58] saving metrics metric in /tmp/dgraph-debuginfo037351492/alpha_metrics.gz
I0311 14:14:53.276414 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/debug/vars
I0311 14:14:53.276439 32654 debugging.go:70] please wait... (30s)
I0311 14:14:53.278295 32654 debugging.go:58] saving vars metric in /tmp/dgraph-debuginfo037351492/alpha_vars.gz
I0311 14:14:53.278340 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/debug/pprof/trace?seconds=30
I0311 14:14:53.278366 32654 debugging.go:70] please wait... (30s)
I0311 14:15:23.286770 32654 debugging.go:58] saving trace metric in /tmp/dgraph-debuginfo037351492/alpha_trace.gz
I0311 14:15:23.286830 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/debug/pprof/goroutine?debug=2
I0311 14:15:23.286886 32654 debugging.go:70] please wait... (30s)
I0311 14:15:23.291120 32654 debugging.go:58] saving goroutine metric in /tmp/dgraph-debuginfo037351492/alpha_goroutine.gz
I0311 14:15:23.291164 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/debug/pprof/block
I0311 14:15:23.291192 32654 debugging.go:70] please wait... (30s)
I0311 14:15:23.304562 32654 debugging.go:58] saving block metric in /tmp/dgraph-debuginfo037351492/alpha_block.gz
I0311 14:15:23.304664 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/debug/pprof/mutex
I0311 14:15:23.304706 32654 debugging.go:70] please wait... (30s)
I0311 14:15:23.309171 32654 debugging.go:58] saving mutex metric in /tmp/dgraph-debuginfo037351492/alpha_mutex.gz
I0311 14:15:23.309228 32654 debugging.go:68] fetching information over HTTP from http://localhost:8080/debug/pprof/threadcreate
I0311 14:15:23.309256 32654 debugging.go:70] please wait... (30s)
I0311 14:15:23.313026 32654 debugging.go:58] saving threadcreate metric in /tmp/dgraph-debuginfo037351492/alpha_threadcreate.gz
I0311 14:15:23.385359 32654 run.go:150] Debuginfo archive successful: dgraph-debuginfo037351492.tar.gz
```

When the command finishes, `debuginfo` returns the tarball's file name. If you don't specify a destination, the file is created in the directory from which you ran the `debuginfo` command.

The following files contain the metrics collected by the `debuginfo` command:

```
dgraph-debuginfo639541060
├── alpha_block.gz
├── alpha_goroutine.gz
├── alpha_health.gz
├── alpha_heap.gz
├── alpha_jemalloc.gz
├── alpha_mutex.gz
├── alpha_profile.gz
├── alpha_state.gz
├── alpha_threadcreate.gz
├── alpha_trace.gz
├── zero_block.gz
├── zero_goroutine.gz
├── zero_health.gz
├── zero_heap.gz
├── zero_jemalloc.gz
├── zero_mutex.gz
├── zero_profile.gz
├── zero_state.gz
├── zero_threadcreate.gz
└── zero_trace.gz
```

### Command parameters

@@ -104,42 +146,32 @@ When the command finishes, `debuginfo` returns the tarball's file name. In this
-x, --archive Whether to archive the generated report (default true)
-d, --directory string Directory to write the debug info into.
-h, --help help for debuginfo
-p, --profiles strings List of pprof profiles to dump in the report. (default [goroutine,heap,threadcreate,block,mutex,profile,trace])
-s, --seconds uint32 Duration for time-based profile collection. (default 15)
-m, --metrics strings List of metrics & profiles to dump in the report. (default [heap,cpu,state,health,jemalloc,trace,metrics,vars,trace,goroutine,block,mutex,threadcreate])
-s, --seconds uint32 Duration for time-based metric collection. (default 30)
-z, --zero string Address of running dgraph zero.
```
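As a rough sketch of how these flags combine, the hypothetical helper below (`debuginfo_cmd` is not part of Dgraph) only prints a `debuginfo` invocation so that it can be reviewed before being run. It uses the `-a`, `-z`, `-d`, and `-s` flags shown in this document. The addresses `localhost:8080` and `localhost:6080` and the directory `/tmp/debuginfo` are assumptions for illustration.

```sh
# Hypothetical helper: print (rather than run) a debuginfo command line.
# Arguments: alpha address, zero address, output directory, seconds (optional).
debuginfo_cmd() {
  alpha="$1"
  zero="$2"
  out_dir="$3"
  # Fall back to the documented default of 30 seconds when no value is given.
  seconds="${4:-30}"
  printf 'dgraph debuginfo -a %s -z %s -d %s -s %s\n' \
    "$alpha" "$zero" "$out_dir" "$seconds"
}

# Example addresses and directory are assumptions, not values required by Dgraph.
debuginfo_cmd localhost:8080 localhost:6080 /tmp/debuginfo 60
```

Printing the command first is a deliberate choice here: time-based collection blocks for the chosen duration, so it can help to check the arguments before starting a run.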

#### The profile flag (`-p`)
#### The metrics flag (`-m`)

By default, `debuginfo` collects:
- `goroutine`
- `heap`
- `threadcreate`
- `block`
- `mutex`
- `profile`
- `cpu`
- `state`
- `health`
- `jemalloc`
- `trace`
- `metrics`
- `vars`
- `trace`
- `goroutine`
- `block`
- `mutex`
- `threadcreate`

If needed, you can collect some of them (not necessarily all). For example, this command will collect only `goroutine` and `heap` profiles:

```sh
dgraph debuginfo -p goroutine,heap
```

#### The seconds flag (`-s`)

By default, the flag is set to 15 seconds. If you are collecting the CPU profile, this profile needs at least 30 seconds to be collected, therefore when you want to collect it, you need to set the `-s` flag as follows:
If needed, you can collect only a subset of them. For example, this command collects only the `jemalloc` and `health` metrics:

```sh
dgraph debuginfo -s 30
```

If you don't set the flag, when collecting a CPU profile you'll will get a `context deadline exceeded` error:

```log
I0120 14:06:49.840613 13589 pprof.go:72] fetching profile over HTTP from http://localhost:8080/debug/pprof/profile?duration=15
I0120 14:06:49.840622 13589 pprof.go:74] please wait... (15s)
E0120 14:07:14.341613 13589 pprof.go:58] error while saving pprof profile from http://localhost:8080/debug/pprof/profile?duration=15: http fetch: Get "http://localhost:8080/debug/pprof/profile?duration=15": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
dgraph debuginfo -m jemalloc,health
```
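When scripting around the `-m` flag, it may help to check the requested names against the default list above before starting a collection. The following is a minimal sketch under that assumption. `validate_metrics` and `KNOWN_METRICS` are hypothetical names, not part of Dgraph, and the final line only echoes the command as a dry run instead of executing it.

```sh
# Metric names copied from the default list documented above.
KNOWN_METRICS="heap cpu state health jemalloc trace metrics vars goroutine block mutex threadcreate"

# Hypothetical helper: succeed only if every comma-separated name in $1
# appears in KNOWN_METRICS; report the first unknown name on stderr.
validate_metrics() {
  for name in $(printf '%s' "$1" | tr ',' ' '); do
    case " $KNOWN_METRICS " in
      *" $name "*) ;;  # known name, keep checking
      *) echo "unknown metric: $name" >&2; return 1 ;;
    esac
  done
  return 0
}

# Dry run: print the command only when all requested names are known.
validate_metrics jemalloc,health && echo "dgraph debuginfo -m jemalloc,health"
```

The list is hard-coded from this page, so it would need updating if the set of supported metrics changes.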

### Profiles details