Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dvc status always mark a param as changed #4184

Closed
WestXu opened this issue Jul 9, 2020 · 1 comment · Fixed by #4185
Closed

dvc status always mark a param as changed #4184

WestXu opened this issue Jul 9, 2020 · 1 comment · Fixed by #4185
Labels
bug Did we break something? p0-critical Critical issue. Needs to be fixed ASAP.

Comments

@WestXu
Copy link

WestXu commented Jul 9, 2020

Bug Report

Please provide information about your setup

Output of dvc version:

$ dvc version
DVC version: 1.1.7
Python version: 3.8.3
Platform: Linux-4.19.104-microsoft-standard-x86_64-with-glibc2.10
Binary: False
Package: pip
Supported remotes: azure, gdrive, gs, hdfs, http, https, s3, ssh, oss
Cache: reflink - not supported, hardlink - supported, symlink - supported
Filesystem type (cache directory): ('9p', 'D:\\')
Repo: dvc, git
Filesystem type (workspace): ('9p', 'D:\\')

Additional Information (if any):

dvc status always gives me a "changed deps", when there isn't one. dvc commit / repro doesn't help.

$ dvc status
eval:                                                                           
        changed deps:
                params.yaml:
                        new:                eval.filter_limitup
$ dvc commit
dependencies ['params.yaml'] of stage: 'get_base_dv' changed. Are you sure you want to commit it? [y/n] y
dependencies ['params.yaml'] of stage: 'eval' changed. Are you sure you want to commit it? [y/n] y                                                                                 
$ dvc status
eval:                                                                           
        changed deps:
                params.yaml:
                        new:                eval.filter_limitup
$ dvc repro -P
Stage 'get_base_dv' didn't change, skipping                                     
Stage 'log_float_mv' didn't change, skipping
Stage 'lgt' didn't change, skipping
Stage 'margin_trade' didn't change, skipping
Stage 'alpha' didn't change, skipping
Stage 'industry' didn't change, skipping
Stage 'future_alpha' didn't change, skipping
Stage 'preprocess' didn't change, skipping
Stage 'train' didn't change, skipping
Restored stage 'eval' from run-cache
Skipping run, checking out outputs
$ dvc status
eval:                                                                           
        changed deps:
                params.yaml:
                        new:                eval.filter_limitup

Here is the content of my params.yaml:

start: 20200101
end: 20200630
universe: "000050.SH"
benchmark: "000905.SH"

eval:
  group: 5
  hold_days: 5
  filter_limitup: false

Here is the related content of my dvc.yaml:

stages:
  get_base_dv:
    cmd: ipython --pdb new_daily_model/get_base_dv.py
    deps:
    - new_daily_model/get_base_dv.py
    params:
    - benchmark
    - end
    - start
    - universe
    outs:
    - data/base_dv
  eval:
    cmd: ipython --pdb new_daily_model/eval.py
    deps:
    - data/prediction_sr.pkl
    - new_daily_model/eval.py
    params:
    - benchmark
    - eval.filter_limitup
    - eval.group
    - eval.hold_days
    - universe
    outs:
    - reports/fa.html
    metrics:
    - reports/metrics.json:
        cache: false

So there are at least 2 problems here:

  1. dvc commit always shows the stage get_base_dv changed, even when dvc status didn't show that.
  2. dvc status always shows the param eval.filter_limitup of stage eval changed, when actually wasn't.
@triage-new-issues triage-new-issues bot added the triage Needs to be triaged label Jul 9, 2020
@skshetry skshetry added bug Did we break something? p0-critical Critical issue. Needs to be fixed ASAP. labels Jul 9, 2020
@triage-new-issues triage-new-issues bot removed triage Needs to be triaged labels Jul 9, 2020
@skshetry
Copy link
Member

skshetry commented Jul 9, 2020

Issue is that we are skipping false values:

if value:
self.info[param] = value

Should only have been:

self.info[param] = value

skshetry added a commit to skshetry/dvc that referenced this issue Jul 9, 2020
The falsy values were being skipped when merging params values
from dvc.yaml and dvc.lock, showing the values as always changed.

These values were never added to `info`, but were found in
`self.params` making them appear as changed.

Fixes iterative#4184
efiop pushed a commit that referenced this issue Jul 9, 2020
The falsy values were being skipped when merging params values
from dvc.yaml and dvc.lock, showing the values as always changed.

These values were never added to `info`, but were found in
`self.params` making them appear as changed.

Fixes #4184
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Did we break something? p0-critical Critical issue. Needs to be fixed ASAP.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants