Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce stage:list command with autocompletions #5282

Merged
merged 2 commits into from
Jan 19, 2021

Conversation

skshetry
Copy link
Member

Autocompletion only limited to the dvc repro for now.

Thank you for the contribution - we'll try to review it as soon as possible. πŸ™

@skshetry skshetry added enhancement Enhances DVC feature is a feature labels Jan 18, 2021
@skshetry skshetry requested review from pmrowla, efiop and pared January 18, 2021 09:25
@skshetry skshetry self-assigned this Jan 18, 2021
}
_dvc_compadd_stages() {
# this will also show up the description of the stages
_describe 'stages' "($(_dvc_stages_output))"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it'd be better to create a hidden flag for completions instead of using awk?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense, that would make completions "test"-able to some extent.

Copy link
Contributor

@pared pared left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Played around a bit on my own machine. There seems to be a bit of delay, but its much better than when we were testing it on #5191 πŸš€

@skshetry
Copy link
Member Author

skshetry commented Jan 18, 2021

There seems to be a bit of delay

Sadly, most of the time is just the startup time. If you are not in shtab >= 1.3.4 version, then it might be pkg_resources as well. Or, if you are in Python3.9, don't forget to pip install ruamel.yaml.clib.

For the example-get-started, it takes ~300ms and for a quite large dvc.yaml file with ~100 stages, it takes ~400ms on my machine (just to compare).

@pared
Copy link
Contributor

pared commented Jan 18, 2021

@skshetry I am using 3.7, shtab==1.3.4. Don't get me wrong, in my case the delay is small, so that I don't feel fear from using completions, which was the case last time we have been measuring it. So I would say it's pretty great.

formatter_class=argparse.RawDescriptionHelpFormatter,
)
stage_list_parser.add_argument(
"targets", nargs="*", default=["dvc.yaml"], help="Stages to list from",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Files to list from?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shcheklein, it can also be a directory/foreach-stage-name/stage-name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kk, my point that this combination default=["dvc.yaml"] and "Stages to list from" sounds strange a bit and it's not clear what does it exactly expect/can take.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been a common problem in most of our commands these days as they all support these kinds of target spec. I don't have a good idea to make it concise and also document it clearly.

I changed it to the following (removed or a stage name to keep it short for now):

Show stages from a dvc.yaml/.dvc file or a directory. 'dvc.yaml' by default.

I looked into pip install and git add, and they mostly seem to not explain positional arguments.

  1. pip
$ pip install -h

Usage:   
  pip install [options] <requirement specifier> [package-index-options] ...
  pip install [options] -r <requirements file> [package-index-options] ...
  pip install [options] [-e] <vcs project url> ...
  pip install [options] [-e] <local project path> ...
  pip install [options] <archive url/path> ...
  1. git
$ git add -h
usage: git add [<options>] [--] <pathspec>...

maybe we could do similar to pip in our commands?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, but in this case it should be less generic as targets. In case of pip local project path is a good example. It doesn't require any further explanations.

Git is so-so. They use pathspec but they are very consistent and it's easy to google it (it's very specific). Also, they have examples usually around.

Copy link
Member Author

@skshetry skshetry Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also replace similar to the pip with:

$ dvc stage list -h

Usage:
	dvc stage list [path to dvc.yaml]...
	dvc stage list -R <dir>...
	dvc stage list [path to dvc.yaml][:<stage name>]...
	dvc stage list --all

@skshetry skshetry merged commit 372d6f0 into iterative:master Jan 19, 2021
@skshetry skshetry deleted the stage-list branch January 19, 2021 11:50
@shcheklein
Copy link
Member

A few quick notes on the workflow:

  • since it affects UI/UX - it would be great to see some summary to review and ask @jorgeorpinel or @dberenbaum to review messages/descriptions
  • do we have a ticket for dvc.org on this? (it's checked, but I don't see it)

Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stage list description UI text suggestions:

dvc/command/stage.py Show resolved Hide resolved
dvc/command/stage.py Show resolved Hide resolved

plots_and_metrics = list(filter(is_plot_or_metric, stage.outs))
if plots_and_metrics:
desc.append("Generates " + part_desc(plots_and_metrics))
Copy link
Contributor

@jorgeorpinel jorgeorpinel Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Outputs metric " ?

Copy link
Member Author

@skshetry skshetry Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's too long. I'd only like to add a single verb here.

This comment was marked as resolved.

Copy link
Contributor

@jorgeorpinel jorgeorpinel Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this in a test repo I have and there's no indication about what "Produces/Outputs" and "Generates" mean.

$ dvc stage list
ab   Produces ab
abm  Generates ab.met

Maybe also use "Outputs" (single verb) and append "(metric)" or "(plot)" at the end of the line?

ab   Outputs ab
abm  Outputs ab.met (metric)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding metric and plot is one more word for each of the outputs. Would be better to find a single word to explain all these kinds of output.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember that user will have the output path that is already quite lengthy. I am trying to keep it within a single line.

Copy link
Contributor

@jorgeorpinel jorgeorpinel Feb 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops sorry super long late reply. I know this is merged so OK. But for the record, keeping "produces" and "generates" seems to me like the worst alternative. It's unclear why the former refers to regular outputs and the latter to metrics/plots.

@jorgeorpinel
Copy link
Contributor

(Most of the strings in https://github.com/iterative/dvc/pull/5282/files#diff-cd4d8ab7bbb98bdac7ebf8f89f4c65961526eea6bd78a142e11d4da9835e2bf7 are not covered in tests according to Codecov BTW, not sure if that's a problem)

Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stage list -h string suggestions and some Qs:

Comment on lines +305 to +310
"targets",
nargs="*",
default=["dvc.yaml"],
help=(
"Show stages from a dvc.yaml/.dvc file or a directory. "
"'dvc.yaml' by default"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Limit command scope to the stages in these .dvc or dvc.yaml files."

Similar to https://dvc.org/doc/command-reference/repro (and others)

We can explain the default and about directories in the options and in docs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But wait, why .dvc files? What stages are there?

And, so this doesn't accept stage names? E.g. to confirm that a stage exists (like regular list for artifacts)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, so this doesn't accept stage names?

It does, but I removed it to keep it short.

"Limit command scope to the stages in these .dvc or dvc.yaml files."

I don't think limit applies here, it's more like changing the scope completely. Though I don't like the repro's either.

Copy link
Contributor

@jorgeorpinel jorgeorpinel Feb 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Limit" applies as much as in all the existing targets arg descriptions in other commands, I think. It's usually dvc.yaml by default, and the base use case of this is to narrow it down.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't prefer Limit here as the main use of targets here is to change the scope of dvc.yaml to be used, rather than narrowing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK then something like `.dvc file or dvc.yaml files to show stages from". Should def. be plural at least since it's "targets".

"--all",
action="store_true",
default=False,
help="List all of the stages in the repo.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"List all of the stages in the repo regardless of `targets`." (if I'm understanding this correctly)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find "regardless of targets" to be redundant due to the "all of the stages" there.

Copy link
Contributor

@jorgeorpinel jorgeorpinel Feb 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to be redundant sometimes. We need to be extra clear.

We often describe things correctly but people still get confused. All the commands an options are not obvious at all and as the developers we are biased because we already deeply understand them.

Comment on lines +320 to +323
"--fail",
action="store_true",
default=False,
help="Fail immediately if there's an error.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't error imply failing already?
Not sure how to try this locally, what kind of error? Thanks

Copy link
Member Author

@skshetry skshetry Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntax Errors in yaml when using --all.

Copy link
Contributor

@jorgeorpinel jorgeorpinel Feb 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. So my point was that "fails if there's an error" doesn't explain much/ seems redundant. Maybe we can update these messages in some upcoming PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have opened #5390 regarding discussion of UI messages. Will update the flags as that will most likely affect it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I found that issue after checking this. Cool, thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to address this on #5499, please take a look.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#5499 LGTM πŸ‘

dvc/command/stage.py Show resolved Hide resolved
Comment on lines +67 to +70
class CmdStageList(CmdBase):
def _get_stages(self) -> Iterable["Stage"]:
if self.args.all:
stages: List["Stage"] = self.repo.stages # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last post-merge comment here. Should it be dvc stages add/list? In plural like params/metrics/plots/experiments (Only remote is in singular, which is OK for Git familiarity I think)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Subcommand should usually be plural, which we should better discuss on #5392 as a guideline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhances DVC feature is a feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants