Update dvc push documentation #203
With the first `dvc push` we specified a stage in the middle of the pipeline
while using `--with-deps`. This started with the named stage and searched
backwards through the pipeline for data files to upload. Because the stage
Can we do a single space everywhere? :) (btw you have different styles in this document).
Does anyone read the raw Markdown? What people read is the Markdown rendered as HTML on the website, or else in the GitHub repository. The raw Markdown is for editing. Once it is rendered, the number of spaces after a period, how lines are wrapped into paragraphs, and so on all disappear into the rendered HTML.
I consider writing docs (at least with Markdown or LaTeX) the same as coding. It's easier for everyone to write/code when there is a common style guide. In this specific case, some editors automatically remove extra spaces (especially trailing ones). So if someone creates a PR to fix a simple mistake, there is a chance we end up with a lot of unnecessary changes all over the file.
Good points, and I'll keep this in mind. On the other hand consider how documentation editing is different than code editing. Inserting a few words in the middle of a paragraph causes the rest of the paragraph to reflow -- if one is manually adjusting the text to fit into 80 characters per line. Meaning, the one line with inserted words will overflow 80 columns, then every following line in that paragraph probably also overflows, resulting in an excessive diff.
For DVC docs I'm making sure to remove trailing spaces and to fit things into 80 columns. And I've adjusted the settings to insert spaces rather than tabs when hitting TAB.
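The reflow concern can be illustrated with a small sketch. It is only a toy model: it uses Python's `textwrap` to stand in for manual wrapping at 80 columns and `difflib` to stand in for the diff a reviewer would see. The function name `changed_lines` and the sample paragraph are made up for illustration.

```python
import difflib
import textwrap


def changed_lines(before: str, after: str, width: int = 80) -> tuple[int, int]:
    """Hard-wrap both texts to `width` columns and count (removed, added) lines."""
    old = textwrap.wrap(before, width=width)
    new = textwrap.wrap(after, width=width)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    removed = added = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


# Made-up paragraph: 60 seven-character words, which wrap to 6 lines at 80 columns.
paragraph = " ".join(f"word{i:03d}" for i in range(60))
# Insert a few words near the start of the paragraph.
edited = paragraph.replace("word005", "word005 two extra inserted words", 1)

wrapped = changed_lines(paragraph, edited, width=80)
# A very large width approximates keeping each paragraph on a single line.
unwrapped = changed_lines(paragraph, edited, width=10_000)
print("hard-wrapped at 80:", wrapped)
print("one line per paragraph:", unwrapped)
```

In this toy input, the insertion shifts every following line break, so the whole hard-wrapped paragraph shows up as changed, while the single-line form changes only one line.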
This command pushes all data file caches related to the current Git branch to
the remote storage.
Uploads files and directories from the current branch in the local workspace to
the [remote storage]('doc/commands-reference/remote').
We push data from the cache based on the DVC files in the workspace. For example (let's double-check this), if I run something with `--no-commit` and then `dvc push`, data from the workspace won't be uploaded to the remote. Again, let's confirm this and come up with a better summary.
## Description

The `dvc push` command is the twin pair to the `dvc pull` command, and together
I see that a lot of users still don't understand how all these commands work (`dvc status -c`, `dvc pull/push/fetch`, etc). Could we think about some explanation similar to what we have in `dvc add`? It might be helpful to try making examples more detailed - show DVC file content, explain that it will extract checksums from it and will be pushing/pulling only those files to/from cache to/from remote.
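As a starting point for that explanation, here is a minimal conceptual sketch of the relationship described above. It is a toy model under stated assumptions, not DVC's actual implementation: the function `files_to_push`, the file names, and the shortened checksums are all hypothetical, and the cache and remote are modeled as plain sets of checksums.

```python
def files_to_push(
    dvc_file_checksums: dict[str, str],
    cache: set[str],
    remote: set[str],
) -> tuple[set[str], set[str]]:
    """Toy model of a push: return (checksums to upload, checksums missing from cache).

    Only checksums referenced by DVC files are considered. Of those, the ones
    present in the local cache but absent from the remote would be uploaded.
    """
    referenced = set(dvc_file_checksums.values())
    missing_from_cache = referenced - cache
    to_upload = (referenced & cache) - remote
    return to_upload, missing_from_cache


# Hypothetical DVC files mapped to the (shortened, made-up) checksums they record.
dvc_files = {"data.xml.dvc": "a304af", "model.pkl.dvc": "3863d0"}
# Local cache contents; "ffffff" is cached but not referenced by any DVC file here.
cache = {"a304af", "3863d0", "ffffff"}
# The remote already holds one of the referenced files.
remote = {"a304af"}

to_upload, missing = files_to_push(dvc_files, cache, remote)
print("would upload:", sorted(to_upload))
print("referenced but not in cache:", sorted(missing))
```

The sketch keeps the three pieces separate on purpose: the DVC files decide which checksums are in scope, the cache is where the data is read from, and the remote is where anything not yet present ends up.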
## Examples

Using the `dvc push` command remote storage must be defined. For an existing
See the comment above ^^. I think it's worth explaining and illustrating this in more detail: show the state (with `tree .`) before and after, show the DVC file content, show whether a referenced file is in the cache or not, etc. It's definitely worth explaining in at least one of those examples.
I have put together an example focusing explicitly on what happens in the cache from `dvc push` operations. I've pushed it to the pull request so we can discuss whether this form is useful or not.
Looks good! Thanks!
Since it repeats a lot of the `dvc pull` document, let's spend a few cycles providing more details and polishing the explanation. It feels like it's still hard to understand that all these commands deal with three things: DVC files (scope is determined via various options), the cache, and the remote. All three are in play, and we need to come up with some language (similar to `dvc add`??) to explain this.