added a BC: workspace document #2197
Let's not go into versioning here, I think. At least not by implying they're all in the workspace, because in DVC the workspace only holds one version (the rest are cached and managed via Git, metafiles, etc.).
(Mentioned in #2197 (comment))
p.s. This is a better way to very subtly mention versioning (could even link to the corresponding Use Case doc).
Not exactly. File contents are organized in the cache with a special file structure (see https://dvc.org/doc/user-guide/project-structure/internal-files#structure-of-the-cache-directory).
That part is correct. And contradicts the previous part 🙂 (because "visible part" implies there's a hidden part which must be in other dirs).
Let's open this p with that sentence.
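For readers following this thread, here is a minimal sketch of the content-addressed idea behind that cache layout. It assumes files are named after an MD5 hash of their contents, with the first two hex characters used as a subdirectory, as the linked doc describes. The function name `cache_path` and the default `.dvc/cache/files/md5` root are illustrative assumptions, not DVC's actual code or API.

```python
import hashlib
from pathlib import Path


def cache_path(content: bytes, cache_root: str = ".dvc/cache/files/md5") -> Path:
    """Return a hypothetical cache location for the given file contents.

    Assumption: the cache is content-addressed by MD5, split as
    <first 2 hex chars>/<remaining 30 hex chars>.
    """
    digest = hashlib.md5(content).hexdigest()
    # The first two hex characters name the subdirectory; the rest name the file.
    return Path(cache_root) / digest[:2] / digest[2:]


if __name__ == "__main__":
    # Two versions of the "same" file map to two different cache entries,
    # while the workspace would hold only one of them at a time.
    print(cache_path(b"label,value\ncat,1\n"))
    print(cache_path(b"label,value\ncat,2\n"))
```

This is why the workspace can be described as holding a single visible version: other versions would live under hash-based names that are not meant to be browsed directly.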
"e.g., the raw data, source code, model files you're currently using"
Again contradicting 😕
Again probably too many details about versioning. Doesn't really fall within the 'workspace' concept, I think. This can be simpler.
Those are better mentions of versioning (clear benefits i.e. how much you'd have to suffer without DVC)
That specific example isn't great because that's still pretty common even with DVC (e.g. in our own example-get-started repo we have a `prepared/` dir).
This "ML file system" keyword is pretty tricky. No need to force it (just skip it if you can't find a correct way to use it).
I can only think of something like "DVC turns your project into a sort of machine learning file system for..." but not sure.
Again contradicting and also, repetitive at this point.
But actually `add` can also download data to the workspace (see `--out` and `--to-remote` options). Also, `import*` commands download AND track data. You may want to rephrase this part accordingly 🙂
Sometimes I think there is no command that hasn't got a duplicate, somehow :) I try to mention commands in passing; if we considered each and every option of every command, we'd need to duplicate the command reference here IMHO.
We may just delete the commands if you would like.
I can also list all possibilities for each functionality, like "In the workspace, you can [...] (`dvc add --out`, `dvc import-url` ...)", but I think this will turn the document into a list of commands and options.
No need to list every command usage of course, agreed!
My point was that these 3 commands mentioned actually overlap in a way that makes the current text slightly incorrect. In any case, the main use case of `add` is not to "add" but to "track", actually. Please check each cmd ref to try to find the right terms when needed 🙂
"Download" is correct for `get/import` but `add` can also download (and they can all "transfer") so I'd avoid that term probably. And in fact I wouldn't even mention `get` here, since it doesn't require a DVC project/workspace. For `import` I'd try to use the cmd name as the relevant action (to "import") I guess...
Then again, `import*` also track the downloaded data 😅 ("adds"). Maybe it should be a single sentence about tracking and put all `add`, `import`, `import-url` in the same parenthesis.
Maybe open the previous paragraph with that?
p.s. having this I think def. no need for the "ml file system" keyword. But keeping "machine learning" somewhere would be nice.
Let's try to incorporate the metafile mentions into the main (2nd) paragraph somehow. After simplifying it per my previous comments, there should be enough room in there. That way there's no need for this 4th p.