init: add subdir description #1022
First review with some relevant questions for @pared. Please take a look and address what you can (at least the Qs).
We can and will take it over too, but not sure how long it will take to get to this. Thanks!
- `--subdir` - initialize <abbr>DVC repository</abbr> in current directory and
  allow to search for Git repository in parent directories
allows initializing a <abbr>DVC repository</abbr> inside a directory of a Git repo. DVC in this _subdir_ DVC repo will search for Git in parent directories. This option should not be combined with `--no-scm` (above).
Questions though, @pared:
- Can the parent repo be a DVC repo (having both .git/ and .dvc/)?
- Will it traverse the hierarchy all the way up to `/` looking for .git/?
Thanks
- parent can be a DVC repo
- yes, if there is no `no-scm` inside config, we will search the whole tree
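For readers following the thread, the parent-directory search described above can be sketched roughly as follows. This is only an illustration, not DVC's actual implementation: the helper name `find_git_root` is hypothetical, and the `no-scm` config check mentioned in the answer is left out.

```python
from pathlib import Path
from typing import Optional


def find_git_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `.git`.

    Hypothetical helper for illustration only; not part of DVC.
    """
    current = Path(start).resolve()
    # Check the starting directory first, then every parent up to the
    # filesystem root.
    for candidate in (current, *current.parents):
        if (candidate / ".git").exists():
            return candidate
    # No Git repository was found anywhere up the tree.
    return None
```

With a layout such as `repo/.git` and `repo/a/b`, calling `find_git_root(Path("repo/a/b"))` would return the resolved path of `repo`.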
Thanks, I'll add some notes about these details 🙂
ce56664 to ecf994b
@jorgeorpinel introduced your suggestions
Thanks a lot @pared!
@shcheklein there are a few possible improvements to this but I think it is mergeable; I can address them in a later PR.
I was not following the DVC core PR, could you please clarify the following things? No need to add them right now in the PR, but they will help in taking this PR over. Would be great to have some answers here in any way you prefer :)
Let's start with this. I think we need more clarity on this before we merge. The way we've done it so far is too low level and technical; the motivation is not clear, and it is not clear how it affects behavior.
@pared Please, see the comment. No need to change anything in the PR yet, just asking to share a bit more context.
@shcheklein I had this same question in iterative/dvc#3257 (review), and Pawel's answer is
@shcheklein
1. Did we change anything about […]
   Yes, now […]
2. […]
   No, […]
   (2) Could you elaborate a bit please? What do you mean by default behavior? And why […]
   Now logic works following way: if we do not find […]
3. Can I do nesting - one […]
   Yes
4. Can I do […]
   Yes, nothing will change, we will look for git root starting from root, […]
   (4) So does it mean that when I do […]
   No, this behavior stays as it was, so if we don't specify […]
5. How having subdirs affect the flow (basically, why do we need them in the first place). Flow of […]
   Subdir does not influence normal use case, data sync operations will not be performed in subrepos, which are ignored during tree traversing.
   (5) what do you mean by normal here?
   Sorry for that, what I meant, data sync commands without specific target. Stages for subdirs will not be gathered.
   (5) _ can I still run […]
   Yes, though we will try to use current repo to perform the operation. That is a flaw, I did not think that through.
   (5) what about […]
   As above, we will receive some ambiguous errors.
6. What happens if I do […]
   Init error. Without this error, we would end up with NoSCM case, because it just happens that check for […]
   Q: what was the motivation behind this? why do they contradict each other?
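The point about subrepos being ignored during tree traversal, so that their stages are not gathered, can be illustrated with a small sketch. It is an assumption-laden example rather than DVC's code: the function name `collect_stage_files` is hypothetical, stage files are assumed to be files ending in `.dvc`, and a subdirectory counts as a nested DVC project when it contains its own `.dvc/` directory.

```python
import os
from pathlib import Path
from typing import List


def collect_stage_files(repo_root: Path) -> List[Path]:
    """Collect `*.dvc` files under `repo_root`, skipping nested DVC projects.

    Hypothetical helper for illustration only; not part of DVC.
    """
    repo_root = Path(repo_root)
    found: List[Path] = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        current = Path(dirpath)
        # Prune in place so os.walk does not descend into internal
        # directories or into any subdirectory that is itself a DVC project.
        dirnames[:] = [
            d
            for d in dirnames
            if d not in {".git", ".dvc"}
            and not (current / d / ".dvc").is_dir()
        ]
        # Assumption: a stage file is any regular file ending in ".dvc".
        found.extend(current / f for f in filenames if f.endswith(".dvc"))
    return sorted(found)
```

Under these assumptions, a stage file inside a nested project (for example `repo/sub/b.dvc` where `repo/sub/.dvc/` exists) is not returned when collecting from `repo`.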
@pared thanks! Please check the next round of questions to clarify the PR. Any details would be helpful. You can edit the same comment #1022 (comment)
@shcheklein Tried to provide a broader answer.
I believe that the second option is appropriate, as the original issue was about having multiple DVC repos inside one Git repo.
Sounds like we need several more changes.
Hmmm. I tried
👍
Kind of like Git subrepos?
Should we open a separate dvc core issue to discuss #1022 (comment) ?
`--subrepo` may also be confusing: it sounds like it's a full repo inside a full repo, but it's not a full repo per se; it's a DVC project inside a subdirectory of a Git/DVC repo. Maybe `--sub-scm`? (I'd consider renaming `--no-git` and `--sub-git`) Could making both `--no-scm` and this option automatic/implicit help?
It is the default behavior when handling SCM inside our code.
Agree
Yes
yes, i will do that
Probably yes, though I would discuss it, as for now
I would wait with adding information about that, as it's not decided yet whether we will support the repo-inside-repo scenario after detecting the "targeted sync" bug. Also, I introduced the rest of your points. Please ping me if it's OK and I'll squash them.
ecf994b to ddb46d4
@jorgeorpinel @pared I've updated the doc and added more information about these options, why they are needed, etc. Please review and let me know WYT. @jorgeorpinel feel free to put small fixes on top/merge/etc.
Maybe we should consider moving some parts to the User Guide.
@shcheklein I think those changes are desirable; they point out the "main" use case and provide info on additional possible workflows.
I'm back guys, sorry for the delay.
@pared OK so it doesn't affect the docs in this PR but I'm still trying to understand what "default behavior when handling SCM inside our code" means or what impacts it has. Can you elaborate a little? Maybe it impacts other docs... Other than that we've taken over this PR and I'm trying to merge it ASAP. Thanks
I will push some copy edits so this can be merged ASAP.
But here's a list of stuff pending from comments above:
- Update sync commands per "data sync commands without specific target: Stages for subdirs will not be gathered"
- Maybe we should consider moving some parts to the User Guide.
I also think (already discussed with Ivan) that the result here is a little too long. We went from 2 paragraphs to 21 plus code blocks. It's not necessarily a problem because all of this info is valuable, but I think part of it can be summarized at least a little, and the advanced cases could possibly be extracted into the user guide. We even state that they are "rare", so I'm not sure it's worth explaining them in detail in the description of this cmd ref. Another option is to make them into examples and keep the direct links from the description. Food for thought.
❗ Please read the guidelines in the Contributing to the Documentation list if you make any substantial changes to the documentation or JS engine.
🐛 Please make sure to mention `Fix #issue` (if applicable) in the description of the PR. This causes GitHub to close it automatically when the PR is merged.
Please choose to allow us to edit your branch when creating the PR.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏
Related to iterative/dvc#3257