[WIP] Fix #313 #330

Merged
merged 3 commits into from
May 14, 2019
Changes from 2 commits
2 changes: 1 addition & 1 deletion src/Documentation/Markdown/Markdown.js
@@ -61,7 +61,7 @@ const HtmlRenderer = props => {
const CodeBlock = ({ value, language }) => {
const dvcStyle = Object.assign({}, docco)
dvcStyle['hljs-comment'] = { color: '#999' }
- dvcStyle['hljs-meta'] = { color: '#333', fontSize: '14px' }
+ dvcStyle['hljs-meta'] = { color: '#333', fontSize: '14px', paddingLeft: '8em' }
return (
<SyntaxHighlighter language={language} style={dvcStyle}>
{value}
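
A minimal sketch of why the override in this hunk leaves the imported theme untouched: `Object.assign({}, docco)` makes a shallow copy, and each assignment replaces a whole entry on the copy instead of mutating a nested object. The `docco` object below is a hypothetical stand-in with made-up colors, not the real theme shipped with the syntax highlighter.

```javascript
// Hypothetical stand-in for the imported `docco` theme (colors are invented).
const docco = {
  'hljs-comment': { color: '#408080' },
  'hljs-meta': { color: '#2b91af' }
}

// Same pattern as in Markdown.js: shallow-copy the theme, then override entries.
const dvcStyle = Object.assign({}, docco)
dvcStyle['hljs-comment'] = { color: '#999' }
dvcStyle['hljs-meta'] = { color: '#333', fontSize: '14px', paddingLeft: '8em' }

// The copy carries the overrides; the original theme keeps its own entries.
console.log(dvcStyle['hljs-meta'])
console.log(docco['hljs-meta'])
```

Writing `dvcStyle['hljs-meta'].paddingLeft = '8em'` instead would have modified the object shared with `docco`, since the copy is only one level deep.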
16 changes: 8 additions & 8 deletions static/docs/get-started/add-files.md
@@ -16,24 +16,24 @@ link as`(Chrome) or `Save object as`(Firefox).
</details>

```dvc
- $ mkdir data
- $ wget https://dvc.org/s3/get-started/data.xml -O data/data.xml
+ $ mkdir data
+ $ wget https://dvc.org/s3/get-started/data.xml -O data/data.xml
```

To take a file (or a directory) under DVC control just run `dvc add`, it accepts
any **file** or a **directory**:

```dvc
- $ dvc add data/data.xml
+ $ dvc add data/data.xml
```

DVC stores information about your data file in a special `.dvc` file, that has a
human-readable [description](/doc/user-guide/dvc-file-format) and can be
committed to Git to track versions of your file:

```dvc
- $ git add data/.gitignore data/data.xml.dvc
- $ git commit -m "add source data to DVC"
+ $ git add data/.gitignore data/data.xml.dvc
+ $ git commit -m "add source data to DVC"
```

<details>
@@ -44,9 +44,9 @@ You can see that actual data file has been moved to the `.dvc/cache` directory
(usually hardlink or reflink is created, so no physical copying is happening).

```dvc
- $ ls -R .dvc/cache
- .dvc/cache/a3:
- 04afb96060aad90176268345e10355
+ $ ls -R .dvc/cache
+ .dvc/cache/a3:
+ 04afb96060aad90176268345e10355
```

where `a304afb96060aad90176268345e10355` is an MD5 hash of the `data.xml` file.
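
The relationship between the hash and the listing above can be sketched as follows, assuming the cache layout is exactly what `ls -R` shows: the first two hex characters name the directory and the remaining characters name the file. `md5Of` and `cachePathFor` are hypothetical helpers written for illustration, not part of DVC.

```javascript
const crypto = require('crypto')
const path = require('path')

// Hex MD5 digest of a buffer (Node's built-in crypto module).
function md5Of(buffer) {
  return crypto.createHash('md5').update(buffer).digest('hex')
}

// Assumed layout: <cacheDir>/<first 2 hex chars>/<remaining 30 hex chars>.
function cachePathFor(md5, cacheDir = '.dvc/cache') {
  return path.posix.join(cacheDir, md5.slice(0, 2), md5.slice(2))
}

// The hash quoted in the docs maps back to the path from the listing.
console.log(cachePathFor('a304afb96060aad90176268345e10355'))
```

For a real file, `cachePathFor(md5Of(fs.readFileSync('data/data.xml')))` would give the expected location under the same assumption.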
20 changes: 10 additions & 10 deletions static/docs/get-started/compare-experiments.md
@@ -10,9 +10,9 @@ Let's run evaluate for the latest `bigram` experiment we created in one of the
previous steps. It mostly takes just running the `dvc repro`:

```dvc
- $ git checkout master
- $ dvc checkout
- $ dvc repro evaluate.dvc
+ $ git checkout master
+ $ dvc checkout
+ $ dvc repro evaluate.dvc
```

`git checkout master` and `dvc checkout` commands ensure that we have the latest
@@ -21,19 +21,19 @@ experiment code and data respectively. And `dvc repro`, as we discussed in the
commands to build the model and measure its performance.

```dvc
- $ git commit -a -m "evaluate bigram model"
- $ git tag -a "bigram-experiment" -m "bigrams"
+ $ git commit -a -m "evaluate bigram model"
+ $ git tag -a "bigram-experiment" -m "bigrams"
```
Now, we can use `-T` option of the `dvc metrics show` command to see the
difference between the `baseline` and `bigrams` experiments:

```dvc
- $ dvc metrics show -T
+ $ dvc metrics show -T

- baseline-experiment:
- auc.metric: 0.588765
- bigram-experiment:
- auc.metric: 0.620421
+ baseline-experiment:
+ auc.metric: 0.588765
+ bigram-experiment:
+ auc.metric: 0.620421
```
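
To make the comparison concrete, here is a rough sketch that parses output shaped like the listing above and computes the AUC difference between the two tags. It assumes the plain `tag:` / `metric: value` layout shown here; `parseMetrics` is a hypothetical helper, and real `dvc metrics show` output may differ between versions.

```javascript
// Sample text copied from the listing above.
const output = [
  'baseline-experiment:',
  '  auc.metric: 0.588765',
  'bigram-experiment:',
  '  auc.metric: 0.620421'
].join('\n')

// Lines ending in ':' start a new tag; 'name: value' lines belong to the current tag.
function parseMetrics(text) {
  const result = {}
  let current = null
  for (const line of text.split('\n')) {
    const trimmed = line.trim()
    if (!trimmed) continue
    const idx = trimmed.indexOf(':')
    const key = trimmed.slice(0, idx)
    const value = trimmed.slice(idx + 1).trim()
    if (value === '') {
      current = key
      result[current] = {}
    } else if (current !== null) {
      result[current][key] = Number(value)
    }
  }
  return result
}

const metrics = parseMetrics(output)
const delta =
  metrics['bigram-experiment']['auc.metric'] -
  metrics['baseline-experiment']['auc.metric']
console.log(delta.toFixed(6)) // 0.031656
```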

DVC provides built-in support to track and navigate `JSON`, `TSV` or `CSV`
6 changes: 3 additions & 3 deletions static/docs/get-started/configure.md
@@ -23,8 +23,8 @@ project/repository itself.
</details>

```dvc
- $ dvc remote add -d myremote /tmp/dvc-storage
- $ git commit .dvc/config -m "initialize DVC local remote"
+ $ dvc remote add -d myremote /tmp/dvc-storage
+ $ git commit .dvc/config -m "initialize DVC local remote"
```
> We only use a local remote in this guide for simplicity's sake in following
> these basic steps as you are learning to use DVC. We realize that for most
@@ -53,7 +53,7 @@ for all remotes.
For example, to setup an S3 remote we would use something like:

```dvc
- $ dvc remote add -d s3remote s3://mybucket/myproject
+ $ dvc remote add -d s3remote s3://mybucket/myproject
```
> This command is only shown for informational purposes. No need to actually run
> it in order to continue with this guide.
66 changes: 33 additions & 33 deletions static/docs/get-started/connect-code-and-data.md
@@ -12,9 +12,9 @@ to get the sample code:
> On Windows just use your browser to download the archive instead.

```dvc
- $ wget https://dvc.org/s3/get-started/code.zip
- $ unzip code.zip
- $ rm -f code.zip
+ $ wget https://dvc.org/s3/get-started/code.zip
+ $ unzip code.zip
+ $ rm -f code.zip
```

You'll also need to install its dependencies: Python packages like `pandas` and
@@ -27,34 +27,34 @@ You'll also need to install its dependencies: Python packages like `pandas` and
After downloading the sample code, your project structure should look like this:

```dvc
- $ tree
- .
- ├── data
- │   ├── data.xml
- │   └── data.xml.dvc
- ├── requirements.txt
- └── src
-    ├── evaluate.py
-    ├── featurization.py
-    ├── prepare.py
-  └── train.py
+ $ tree
+ .
+ ├── data
+ │   ├── data.xml
+ │   └── data.xml.dvc
+ ├── requirements.txt
+ └── src
+    ├── evaluate.py
+    ├── featurization.py
+    ├── prepare.py
+  └── train.py
```

We **strongly** recommend using `virtualenv` or a similar tool to isolate your
environment:

```dvc
- $ virtualenv .env
- $ echo ".env/" >> .gitignore
- $ source .env/bin/activate
+ $ virtualenv .env
+ $ echo ".env/" >> .gitignore
+ $ source .env/bin/activate
```

Now, we are ready to install dependencies to run the code:

```dvc
- $ pip install -U -r requirements.txt
- $ git add .
- $ git commit -m "add code"
+ $ pip install -U -r requirements.txt
+ $ git add .
+ $ git commit -m "add code"
```

</details>
@@ -64,10 +64,10 @@ command transforms it into a reproducible **stage** for the ML **pipeline**
(describes in the next chapter).

```dvc
- $ dvc run -f prepare.dvc \
- -d src/prepare.py -d data/data.xml \
- -o data/prepared \
- python src/prepare.py data/data.xml
+ $ dvc run -f prepare.dvc \
+ -d src/prepare.py -d data/data.xml \
+ -o data/prepared \
+ python src/prepare.py data/data.xml
```
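
The shape of the command above (one `-f` stage file, repeated `-d` dependencies, repeated `-o` outputs, then the command itself) can be sketched as a small argument builder. `dvcRunArgs` is a hypothetical helper for illustration only; it uses just the flags that appear in this guide and does not invoke DVC.

```javascript
// Build the argument list for a stage like the one shown above.
// Only -f, -d and -o are used, as in the guide's example.
function dvcRunArgs({ file, deps, outs, cmd }) {
  return [
    'dvc', 'run',
    '-f', file,
    ...deps.flatMap(d => ['-d', d]), // one -d per dependency
    ...outs.flatMap(o => ['-o', o]), // one -o per output
    ...cmd                           // the command to run comes last
  ]
}

const args = dvcRunArgs({
  file: 'prepare.dvc',
  deps: ['src/prepare.py', 'data/data.xml'],
  outs: ['data/prepared'],
  cmd: ['python', 'src/prepare.py', 'data/data.xml']
})
console.log(args.join(' '))
```

Keeping the arguments as an array rather than a single string avoids quoting problems if they are later passed to something like `child_process.spawn`.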

`dvc run` generates the `prepare.dvc` file. It has the same
@@ -86,18 +86,18 @@ This is how the result should look like now:
```diff
.
├── data
-    ├── data.xml
-    ├── data.xml.dvc
- + │   └── prepared
- + │   ├── test.tsv
- + │   └── train.tsv
+ ├── data.xml
+ ├── data.xml.dvc
+ + │ └── prepared
+ + │ ├── test.tsv
+ + │ └── train.tsv
+ ├── prepare.dvc
├── requirements.txt
└── src
-    ├── evaluate.py
-    ├── featurization.py
-    ├── prepare.py
-  └── train.py
+ ├── evaluate.py
+ ├── featurization.py
+ ├── prepare.py
+ └── train.py
```

This is how `prepare.dvc` looks like internally: