Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QA: params/vars and loops in dvc.yaml (2.0) round 3 #5180

Closed
1 of 4 tasks
jorgeorpinel opened this issue Dec 30, 2020 · 9 comments
Closed
1 of 4 tasks

QA: params/vars and loops in dvc.yaml (2.0) round 3 #5180

jorgeorpinel opened this issue Dec 30, 2020 · 9 comments
Labels
A: templating Related to the templating feature discussion requires active participation to reach a conclusion

Comments

@jorgeorpinel
Copy link
Contributor

jorgeorpinel commented Dec 30, 2020

Continues #5165

Bugs?

  • Defining a stage with dvc repro isn't fully compatible with using ${} substitutions in commands e.g.

    $ dvc run -n echovar "echo \${var}"

    runs echo ${var} in which is interpreted by the terminal (and probably var doesn't exist in the env, so an empty line is printed).
    Upon dvc repro, the command executed is different: reads var from params.yaml (assuming it exists, otherwise errors out) and substitutes it in the cmd, for example echo foo (the value of the var param is printed).

  • (Fixed in run-cache: skip unhashable stages #5185) A strange edge error occurs when you try to overwrite a stage that has params and ${} substitution with dvc run --force. E.g.:

    stages:
      hasparam:
        cmd: echo hello
        params:
        - var
    $ dvc run -f -n hasparam -p var "echo \${var}"
    ERROR: unexpected error - 'outs'

    TBH I'm not sure under what conditions this happens or if it's definitely related to the 2.0 dvc.yaml changes. The verbose output seems to indicate it's also related to the run-cache (so feel free to extract this to its own issue).

Feature design

  • It's not possible to combine foreach and vars in a multi-stage. dvc repro gives the following error:
    ERROR: 'dvc.yaml' format error: extra keys not allowed @ data['stages']['mystage']['foreach']

    This is mentioned as a known limitation in https://github.com/iterative/dvc/wiki/Parametrization#stage-local-vars-section. But at least the error message could be improved for now because it seems unacurate.

  • Similarly, shouldn't wdir be able work with hardcoded local vars? Not from params files listed in local vars though, I get why that's not possible.

    Also a known limitation mentioned in the wiki page

Other

@jorgeorpinel jorgeorpinel added bug Did we break something? product: VSCode Integration with VSCode extension labels Dec 30, 2020
@skshetry
Copy link
Member

@jorgeorpinel, dvc run does not support parametrization. We should encourage users to write dvc.yaml themselves. dvc run does too many things. So, both of those are not bugs, but a misuse of dvc run.

Regarding the last error message, I have created #5182. Please post verbose logs when creating a bug report. Closing this issue, thanks.

@skshetry skshetry reopened this Dec 30, 2020
@skshetry
Copy link
Member

Looks like this is a draft for other QA-ing as well. Opening...

@jorgeorpinel jorgeorpinel removed the bug Did we break something? label Dec 31, 2020
@jorgeorpinel
Copy link
Contributor Author

jorgeorpinel commented Dec 31, 2020

dvc run does not support parametrization

Yup, that summarizes what I was trying to report 🙂. For example the following seems to me like a more convenient and scriptable option to editing a multi-line dvc.yaml file:

$ echo "foo: bar" > params.yaml
$ dvc run -n echofoo "echo \${foo}"

Please remember that during QA we are operating in the realm of the hypothetical. I'm trying to think of ways to use and abuse our features. Would anyone actually have this problem? Idk but let's try not to dismiss edge cases because they are unintended uses.

That said I guess I shouldn't label these bug if I'm not sure until after discussing a little (that may have implications for the core team). Changed that!

p.s. And yes this is a draft still. I added one a third thing so far but not quite done yet. Hopefully tomorrow

@jorgeorpinel
Copy link
Contributor Author

jorgeorpinel commented Jan 5, 2021

OK I think I'm done here: this is ready for addressing, discussing, etc. Cc @shcheklein

If you could label this discussion that would be great @skshetry or @efiop thanks

@jorgeorpinel jorgeorpinel changed the title [WIP] QA: params/vars and loops in dvc.yaml (2.0) round 3 QA: params/vars and loops in dvc.yaml (2.0) round 3 Jan 5, 2021
@efiop efiop added the discussion requires active participation to reach a conclusion label Jan 5, 2021
@skshetry
Copy link
Member

skshetry commented Jan 19, 2021

The status of the above reports are:

  1. Not a bug. We won't be supporting parameterization as it's not easy to support, and we are encouraging users to write dvc.yaml than use dvc run (which will be deprecated in 2.0 and instead stage add/create will be recommended).

  2. Fixed already.

  3. Regarding vars before foreach, what is the use case? I know it can be used to have local scope within the foreach and can be quite useful, but why not load it on globally? Can it be worked around by users?
    Anyway, I am not interested in introducing it in 2.0 as we can add it later without any incompatibilities. Let's wait for someone ask for it.

  4. Similarly, shouldn't wdir be able work with hardcoded local vars?

    I think dvc.yaml to be a kind of a programming language which has its own order of evaluation.
    Having two kinds of scope would break that thought process to me. It's not intuitive at all, later if I changed my mind to use vars from different files rather than the hardcoded ones.

    Another thing is that local hardcoded vars is mostly meant to be useful during cmd. eg: setting env vars, etc.

@jorgeorpinel
Copy link
Contributor Author

jorgeorpinel commented Jan 21, 2021

OK thanks, I'll read the reply carefully ASAP. I quickly discussed this and related issues with Ivan and I'll be reprioritizing and consolidating some (if any) remaining concerns in a separate issue (also for Dave to get involved) but probably nothing is major or blocking. Will submit that issue and close these... ⌛

@jorgeorpinel
Copy link
Contributor Author

Regarding vars before foreach, what is the use case? ... Can it be worked around by users?

I'd flip the question. Why prevent that combination? We say that all the regular stage fields are accepted under do, but for some reason not vars? (BTW as implied I wasn't thinking necessarily vars before foreach.)

we can add it later without any incompatibilities

Sure, this issue doesn't even suggest any kind of timeline, it's just a matter of keeping track of these things for when it's appropriate to do something about it (or decide not to).

language which has its own order of evaluation

I see. You'd have to first load local "write-in" vars, then eval wdir, and THEN load the file-based vars, which messes up with the order in which objects are merged, right? OK I agree it doesn't make sense — although it could make sense if we didn't merge objects (will expand on that in my next issue where everything is consolidated).

@skshetry
Copy link
Member

but for some reason not vars

vars should be supported inside do`. I'll check, but that should be supported. Maybe I forgot to account the schema.

@jorgeorpinel
Copy link
Contributor Author

Oh OK maybe I got that wring? I'll check too.

Closing this in favor of #5312 (see you there).

@skshetry skshetry added A: templating Related to the templating feature and removed product: VSCode Integration with VSCode extension labels May 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: templating Related to the templating feature discussion requires active participation to reach a conclusion
Projects
None yet
Development

No branches or pull requests

3 participants