new plot options & custom templates
dmpetrov committed May 5, 2020
1 parent 70339ab commit 4b35a28
Showing 3 changed files with 138 additions and 46 deletions.
50 changes: 33 additions & 17 deletions content/docs/command-reference/plot/diff.md
@@ -7,10 +7,9 @@ them in a single image.
## Synopsis

```usage
usage: dvc plot diff [-h] [-q | -v] [-t [TEMPLATE]] [-d [DATAFILE]]
[-r RESULT] [--no-html] [-f FIELDS] [-o]
[--no-csv-header]
[revisions [revisions ...]]
usage: dvc plot diff [-h] [-q | -v] [-t [TEMPLATE]] [-d [DATAFILE]] [-f FILE]
[-s SELECT] [-x X] [-y Y] [--stdout] [--no-csv-header]
[--no-html] [--title TITLE] [--xlab XLAB] [--ylab YLAB]
positional arguments:
revisions Git revisions to plot from
@@ -44,21 +43,37 @@ an output.

## Options

- `-t [TEMPLATE], --template [TEMPLATE]` - File to be injected with data.

- `-d [DATAFILE], --datafile [DATAFILE]` - Continuous metrics file to visualize.

- `-r RESULT, --result RESULT` - Name of the generated file.
- `-t [TEMPLATE], --template [TEMPLATE]` - File to be injected with data. The
default template is `.dvc/plot/default.json`. See more details in `dvc plot`.

- `--no-html` - Do not wrap vega plot json with HTML.
- `-f FILE, --file FILE` - Name of the generated file. By default, the output
file name is equal to the input filename with additional `.html` suffix or
`.json` suffix for `--no-html` mode.

- `-f FIELDS, --fields FIELDS` - Choose which fileds or jsonpath to put into
plot.
- `--no-html` - Do not wrap output vega plot json with HTML.

- `--no-csv-header` - Provided CSV or TSV datafile does not have a header.
- `-s SELECT, --select SELECT` - Select which fields or jsonpath to put into the
plot. By default all fields are included, along with the DVC-generated `index`
field (see `dvc plot`).

- `-x X` - Field name for x axis. `index` is the default field for X.

- `-y Y` - Field name for y axis. The default field is the last field found in
the input file: the last column in a CSV file or the last field in the JSON
array object (the first object).

- `--xlab XLAB` - X axis title. The X column name is the default title.

- `--ylab YLAB` - Y axis title. The Y column name is the default title.

- `--title TITLE` - Plot title.

- `-o, --stdout` - Print plot content to stdout.

- `--no-csv-header` - Provided CSV or TSV datafile does not have a header.

- `-h`, `--help` - prints the usage/help message, and exit.

- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
@@ -73,18 +88,19 @@ one:

```dvc
$ dvc plot diff -d logs.csv
file:///Users/dmitry/src/plot/logs.html
file:///Users/dmitry/src/plot/logs.csv.html
```

A new file `logs.html` was generated. User can open it in a web browser.
A new file `logs.csv.html` was generated. User can open it in a web browser.

![](/img/plot_diff_workspace.svg)
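
The new default output name (`logs.csv.html` rather than `logs.html`) follows the rule stated for the `-f` option: the input file name plus an `.html` suffix, or a `.json` suffix in `--no-html` mode. A minimal Python sketch of that rule, assuming a hypothetical helper `default_output_name` that is not part of DVC:

```python
def default_output_name(datafile: str, no_html: bool = False) -> str:
    """Hypothetical helper: derive the default output name from the input name.

    The suffix is appended to the full input file name, so the original
    extension (.csv, .json, ...) is preserved.
    """
    suffix = ".json" if no_html else ".html"
    return datafile + suffix


if __name__ == "__main__":
    print(default_output_name("logs.csv"))
    print(default_output_name("logs.csv", no_html=True))
```

Keeping the original extension means `logs.csv` and `logs.json` no longer map to the same output file.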

The difference between two specified commits:
The difference between two specified commits (multiple commits, tags, or
branches can be specified):

```dvc
$ dvc plot diff -d logs.csv HEAD 11c0bf1
file:///Users/dmitry/src/plot/logs.html
file:///Users/dmitry/src/plot/logs.csv.html
```

![](/img/plot_diff.svg)
@@ -107,8 +123,8 @@ turtle,cat
```

```dvc
$ dvc plot diff -d classes.csv -t confusion_matrix
file:///Users/dmitry/src/test/plot_old/classes.html
$ dvc plot diff -d classes.csv -t confusion
file:///Users/dmitry/src/test/plot_old/classes.csv.html
```

![](/img/plot_diff_confusion.svg)
67 changes: 58 additions & 9 deletions content/docs/command-reference/plot/index.md
@@ -23,9 +23,9 @@ DVC provides a set of commands to visualize _continuous metrics_ of machine
learning experiments. Usual examples of plots are AUC curves, loss functions,
and confusion matrices.

Continous metrics represents a plot and should be stored as data series in one
Continuous metrics represent plots, and should be stored as data series in one
of the supported [file formats](#file-formats). These files are usually created
by users or generated by user's modeling or data processing code.
by users or generated by user modeling or data processing code.

The plot commands can work with these continuous metrics files that are committed
to a repository history, data files controlled by DVC or files from workspace.
@@ -113,8 +113,57 @@ programming language or environment which allows DVC stay programming language
agnostic.

Plot templates are stored in `.dvc/plot/` directory as json files. A user can
define it's own templates or modify the existing ones. Please see more details
in `dvc plot show` and `dvc plot diff`.
define their own templates or modify the existing ones. The default template is
`.dvc/plot/default.json`. Users can change the template by passing a file name
to the `--template` or `-t` option of the `dvc plot show` or `dvc plot diff`
commands.

For templates in the templates directory, the path and the `.json` extension
are not required. Users can specify `--template scatter` instead of
`--template .dvc/plot/scatter.json`. Any custom template can be added to the
templates directory.
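
The short-name lookup can be pictured with a small Python sketch. The function `resolve_template` and its rule (a bare name with no directory part and no `.json` extension is looked up in `.dvc/plot/`) are assumptions made for illustration, not DVC's actual code:

```python
from pathlib import Path
from typing import Optional

# Templates directory named in the text above.
TEMPLATES_DIR = Path(".dvc") / "plot"


def resolve_template(name: Optional[str] = None) -> Path:
    """Hypothetical resolver for the value passed to --template."""
    if name is None:
        # No template given: fall back to the default template.
        return TEMPLATES_DIR / "default.json"
    candidate = Path(name)
    is_bare_name = candidate.parent == Path(".") and candidate.suffix != ".json"
    if is_bare_name:
        # `scatter` is treated as `.dvc/plot/scatter.json`.
        return TEMPLATES_DIR / (name + ".json")
    # Anything else is treated as an explicit path to a template file.
    return candidate


if __name__ == "__main__":
    for value in (None, "scatter", ".dvc/plot/scatter.json"):
        print(value, "->", resolve_template(value).as_posix())
```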

### Custom templates

Users can define their own templates for specific plot types. Any template file
is a JSON specification with predefined DVC anchors that help DVC inject the
user's data properly.

All input JSON files of `dvc plot show` and `dvc plot diff` commands are
combined together into a single array for the injection to a template file.

There are two important additional signals or fields that DVC adds:

- `rev` - the specified revision, tag, or branch of the input file. This field
helps to distinguish between different revisions of the file in the
`dvc plot diff` command.

- `index` - an ordering number in the file. In many cases it corresponds to the
machine learning training epoch or step number.

DVC applies the same logic to all input CSV files, but first transforms the CSV
data into JSON. DVC uses the column names from the CSV header for the JSON
conversion.
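
As a rough illustration of this conversion, the Python sketch below turns CSV text into a list of JSON-style records and adds the `index` and `rev` fields. The function name `csv_to_records`, the zero-based `index`, and the use of column numbers as field names for header-less files are assumptions for this sketch; values are left as strings for simplicity.

```python
import csv
import io
import json


def csv_to_records(text, rev, header=True):
    """Hypothetical conversion of CSV text into records with `index` and `rev`."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return []
    if header:
        names, rows = rows[0], rows[1:]
    else:
        # Without a header, fields are named by column number, starting at 0.
        names = [str(i) for i in range(len(rows[0]))]
    records = []
    for index, row in enumerate(rows):
        record = dict(zip(names, row))
        record["index"] = index  # ordering number within the file (assumed 0-based)
        record["rev"] = rev      # revision the data was read from
        records.append(record)
    return records


if __name__ == "__main__":
    current = "epoch,loss\n0,0.5\n1,0.3\n"
    previous = "epoch,loss\n0,0.6\n1,0.4\n"
    # Records from several revisions are combined into a single array.
    combined = csv_to_records(current, "workspace") + csv_to_records(previous, "HEAD")
    print(json.dumps(combined, indent=2))
```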

DVC template anchors:

- `<DVC_METRIC_DATA>` - Plotting command input data from either CSV or JSON
files is converted to a JSON array and injected in place of this anchor. Two
additional signals are added: `index` and `rev` (the revision; see above).

- `<DVC_METRIC_TITLE>` - A plot title that can be defined by `--title` option.

- `<DVC_METRIC_Y>` - a field name for the Y axis of the plot. It can be defined
by the `-y` option of the commands. The default field is the last field found in
the input file: the last column in a CSV file or the last field in the JSON
array object.

- `<DVC_METRIC_X>` - a field name for the X axis. It can be defined by the `-x`
option. `index` is the default field for X.

- `<DVC_METRIC_Y_TITLE>` - a displayed field label for Y.

- `<DVC_METRIC_X_TITLE>` - a displayed field label for X.
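
To make the anchors concrete, here is a hedged Python sketch of how a template could be filled. The tiny `TEMPLATE` string, the function `fill_template`, and the assumption that the data anchor appears as a quoted JSON string are all illustrative; DVC's real templates and injection code may differ.

```python
import json

# Hypothetical miniature template; real templates in .dvc/plot/ are larger.
TEMPLATE = """{
  "title": "<DVC_METRIC_TITLE>",
  "data": {"values": "<DVC_METRIC_DATA>"},
  "encoding": {
    "x": {"field": "<DVC_METRIC_X>", "title": "<DVC_METRIC_X_TITLE>"},
    "y": {"field": "<DVC_METRIC_Y>", "title": "<DVC_METRIC_Y_TITLE>"}
  }
}"""


def fill_template(template, records, x="index", y=None, title="",
                  xlab=None, ylab=None):
    """Replace the DVC anchors in `template` (illustrative sketch only)."""
    if y is None:
        # Default Y: last field of the first object, ignoring the added fields.
        y = [name for name in records[0] if name not in ("index", "rev")][-1]
    text_anchors = {
        "<DVC_METRIC_TITLE>": title,
        "<DVC_METRIC_X>": x,
        "<DVC_METRIC_Y>": y,
        "<DVC_METRIC_X_TITLE>": x if xlab is None else xlab,
        "<DVC_METRIC_Y_TITLE>": y if ylab is None else ylab,
    }
    filled = template
    for anchor, value in text_anchors.items():
        # json.dumps(...)[1:-1] escapes quotes and backslashes inside the value.
        filled = filled.replace(anchor, json.dumps(value)[1:-1])
    # The quoted data anchor is swapped for the JSON array itself.
    return filled.replace('"<DVC_METRIC_DATA>"', json.dumps(records))


if __name__ == "__main__":
    data = [{"epoch": "0", "loss": "0.5", "index": 0, "rev": "HEAD"}]
    print(fill_template(TEMPLATE, data, title="Training loss"))
```

The data anchor is replaced last so that anchor-like strings inside the data are not rewritten by the earlier substitutions.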

## Options

@@ -142,7 +191,7 @@ epoch,accuracy,loss,val_accuracy,val_loss

```dvc
$ dvc plot show logs.csv
file:///Users/dmitry/src/plot/logs.html
file:///Users/dmitry/src/plot/logs.csv.html
```

![](/img/plot_show.svg)
@@ -151,15 +200,15 @@ Difference between the current file and the previous committed one:

```dvc
$ dvc plot diff -d logs.csv HEAD^
file:///Users/dmitry/src/plot/logs.html
file:///Users/dmitry/src/plot/logs.csv.html
```

![](/img/plot_diff.svg)

Visualize a specific field:

```dvc
$ dvc plot show --field loss logs.csv
$ dvc plot show -y loss logs.csv
file:///Users/dmitry/src/plot/logs.html
```

@@ -183,8 +232,8 @@ turtle,cat
```

```dvc
$ dvc plot show classes.csv --template confusion_matrix
file:///Users/dmitry/src/plot/classes.html
$ dvc plot show classes.csv --template confusion -x actual -y predicted
file:///Users/dmitry/src/plot/classes.csv.html
```

![](/img/plot_show_confusion.svg)
67 changes: 47 additions & 20 deletions content/docs/command-reference/plot/show.md
@@ -6,9 +6,9 @@ Generate a plot image from a
## Synopsis

```usage
usage: dvc plot show [-h] [-q | -v] [-t [TEMPLATE]] [-r RESULT] [--show-json]
[-f FIELDS]
[datafile]
usage: dvc plot show [-h] [-q | -v] [-t [TEMPLATE]] [-f FILE] [-s SELECT]
[-x X] [-y Y] [--stdout] [--no-csv-header] [--no-html]
[--title TITLE] [--xlab XLAB] [--ylab YLAB]
positional arguments:
datafile Data to be visualized.
@@ -22,14 +22,30 @@ information.

## Options

- `-t [TEMPLATE], --template [TEMPLATE]` - File to be injected with data.
- `-t [TEMPLATE], --template [TEMPLATE]` - File to be injected with data. The
default template is `.dvc/plot/default.json`. See more details in `dvc plot`.

- `-r RESULT, --result RESULT` - Name of the generated file.
- `-f FILE, --file FILE` - Name of the generated file. By default, the output
file name is equal to the input filename with additional `.html` suffix or
`.json` suffix for `--no-html` mode.

- `--no-html` - Do not wrap vega plot json with HTML.
- `--no-html` - Do not wrap output vega plot json with HTML.

- `-f FIELDS, --fields FIELDS` - Choose which fileds or jsonpath to put into
plot.
- `-s SELECT, --select SELECT` - Select which fields or jsonpath to put into the
plot. By default all fields are included, along with the DVC-generated `index`
field (see `dvc plot`).

- `-x X` - Field name for x axis. `index` is the default field for X.

- `-y Y` - Field name for y axis. The default field is the last field found in
the input file: the last column in a CSV file or the last field in the JSON
array object (the first object).

- `--xlab XLAB` - X axis title. The X column name is the default title.

- `--ylab YLAB` - Y axis title. The Y column name is the default title.

- `--title TITLE` - Plot title.

- `-o, --stdout` - Print plot content to stdout.

@@ -58,30 +74,41 @@ epoch,accuracy,loss,val_accuracy,val_loss
7,0.9954,0.01396906608727198,0.9802,0.07247738889862157
```

By default, the command plots the last column of the tabular file.
By default, the command plots the last column of the tabular file. Please look
at the default behaviour of `-y` option.

```dvc
$ dvc plot show logs.csv
file:///Users/dmitry/src/plot/logs.html
file:///Users/dmitry/src/plot/logs.csv.html
```

![](/img/plot_show.svg)

Use `--field` option to changing column to visualize:
Use `-y` option to change column to visualize:

```dvc
$ dvc plot show --field loss logs.csv
file:///Users/dmitry/src/plot/logs.html
$ dvc plot show -y loss logs.csv
file:///Users/dmitry/src/plot/logs.csv.html
```

![](/img/plot_show_field.svg)

In the previous example all the columns (or fields) were included in the
output file. You can select only a specified subset of the columns with the
`--select` option, which might be important for reducing the output file size.
In this case the default `index` column will still be included.

```dvc
$ dvc plot show -y loss --select loss logs.csv
file:///Users/dmitry/src/plot/logs.csv.html
```
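
The effect of `--select` on the injected data can be sketched in Python as a simple projection of each record. The helper `select_fields` and its `always_keep` parameter are hypothetical names used only for this illustration; the sketch keeps `index` because the text above says it is still included.

```python
def select_fields(records, fields, always_keep=("index",)):
    """Hypothetical projection: keep only `fields` plus the always-kept ones."""
    keep = list(fields) + [name for name in always_keep if name not in fields]
    return [
        {name: record[name] for name in keep if name in record}
        for record in records
    ]


if __name__ == "__main__":
    records = [
        {"epoch": "0", "accuracy": "0.91", "loss": "0.5", "index": 0},
        {"epoch": "1", "accuracy": "0.95", "loss": "0.3", "index": 1},
    ]
    # Roughly what `--select loss` would leave in the output data.
    print(select_fields(records, ["loss"]))
```

Dropping unused columns before injection is what keeps the generated file small when the input has many fields.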

A tabular file without header can be plotted with `--no-csv-header` option. A
field can be specified through column number (starting with 0):

```dvc
$ dvc plot show --no-csv-header --field 2 logs.csv
file:///Users/dmitry/src/plot/logs.html
file:///Users/dmitry/src/plot/logs.csv.html
```

In many automation scenarios (like CI/CD for ML), it is convenient to have Vega
@@ -92,7 +119,7 @@ Note, the result file extension changes to JSON:

```
$ dvc plot show --no-html logs.csv
file:///Users/dmitry/src/plot/logs.json
file:///Users/dmitry/src/plot/logs.csv.json
```

JSON file plotting example:
@@ -116,15 +143,15 @@ find.

```dvc
$ dvc plot show train.json
file:///Users/dmitry/src/plot/train.html
file:///Users/dmitry/src/plot/train.json.html
```

![](/img/plot_show.svg)

The field name can be specified with the same `--field` option. The signal from
the first JSON array with the specified name will be showned:
The field name can be specified with the same `-y` option. The signal from the
first JSON array with the specified name will be shown:

```dvc
$ dvc plot show --field accuracy logs.json
file:///Users/dmitry/src/plot/logs.html
$ dvc plot show -y accuracy logs.json
file:///Users/dmitry/src/plot/logs.json.html
```
