Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exp run: --set-param does not override Hydra config parameter #10186

Closed
lefos99 opened this issue Dec 20, 2023 · 4 comments
Closed

exp run: --set-param does not override Hydra config parameter #10186

lefos99 opened this issue Dec 20, 2023 · 4 comments

Comments

@lefos99
Copy link

lefos99 commented Dec 20, 2023

Bug Report

Description

When using dvc exp run --set-param to update a parameter, the params.yaml file is updated correctly, but the change is not propagated to the training script when using Hydra for configuration management. Despite the params.yaml being updated, the training script executed by DVC uses the default value from conf/config.yaml, and the override is not applied.

Reproduce

  1. Create a DVC project with Hydra configuration management and the following structure:
    .
    ├── conf
    │   └── config.yaml
    ├── dvc.lock
    ├── dvc.yaml
    ├── params.yaml
    └── train.py
    
  2. Configure .dvc/config with the following content:
    [hydra]
        enabled = True
    
  3. Configure dvc.yaml with a stage that uses a parameter:
    stages:
      training:
        cmd: python3 train.py
        params:
          - training.pl_hparams.max_epochs
  4. Implement a training script train.py that uses Hydra to load configuration:
    from omegaconf import DictConfig
    import hydra
    
    @hydra.main(version_base=None, config_path='conf', config_name='config')
    def main(cfg: DictConfig) -> None:
        print('max epochs are ', cfg.training.pl_hparams.max_epochs)
    
    if __name__ == "__main__":
        main()
  5. Provide a conf/config.yaml with the default parameter value:
    training:
      pl_hparams:
        max_epochs: 11
  6. Run a DVC experiment with a parameter override, e.g., dvc exp run -sf training -S 'training.pl_hparams.max_epochs=41'.
  7. Observe that params.yaml is updated with the new max_epochs value.
  8. Notice that the output from the train.py script still shows the default max_epochs value from conf/config.yaml.

Expected

The dvc exp run --set-param command should update the parameter in params.yaml and ensure that the overridden parameter value is used by the training script when executed by DVC, even when using Hydra for configuration management.

Environment information

Output of dvc doctor:

$ dvc doctor
DVC version: 3.33.4 (pip)
-------------------------
Platform: Python 3.8.18 on Linux-5.15.0-91-generic-x86_64-with-glibc2.17
Subprojects:
        dvc_data = 2.24.0
        dvc_objects = 2.0.1
        dvc_render = 1.0.0
        dvc_task = 0.3.0
        scmrepo = 1.6.0
Supports:
        http (aiohttp = 3.8.5, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.8.5, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2023.9.2, boto3 = 1.28.17),
        ssh (sshfs = 2023.7.0)
Config:
        Global: /fastdata/users/lefos/.config/dvc
        System: /etc/xdg/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: None
Workspace directory: ext4 on /dev/mapper/fastdatagroup-fastdatavolume
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/76cee0c3cf39f2ccf3b8f67ddb9ff9ca
@dberenbaum
Copy link
Collaborator

It's expected but we need to clarify it better in the docs. To have the hydra overrides reflected in your @hydra.main() decorator, you would need to include the overrides in the cmd itself, like cmd: python3 train.py training.pl_hparams.max_epochs=41, but that's probably not what you want.

You don't need the @hydra.main() decorator in your code since all parameters have already been dumped to params.yaml. You can read directly from there:

from omegaconf import DictConfig
from ruamel.yaml import YAML

def main() -> None:
    cfg = DictConfig(YAML(typ="safe").load(open("params.yaml")))
    print('max epochs are ', cfg.training.pl_hparams.max_epochs)

if __name__ == "__main__":
    main()

@lefos99
Copy link
Author

lefos99 commented Dec 22, 2023

Thank you, @dberenbaum , for the clarification and the suggested workaround. I understand that the current behavior is by design and that the documentation could benefit from additional details on how parameter overrides are handled with Hydra.

However, the proposed solution to bypass the @hydra.main() decorator and read directly from params.yaml presents a challenge for us. Our project relies on custom OmegaConf resolvers, which are important for our configuration management. As noted in issue #9731, DVC currently does not support these custom resolvers, making it infeasible to switch to direct params.yaml reading without losing important functionality.

Is there a possibility to enhance DVC's compatibility with Hydra's features, specifically to support custom OmegaConf resolvers? This would allow us to utilize --set-param overrides (and usage of the VS code DVC addon as wel;) effectively while maintaining our existing configuration setup.

@dberenbaum
Copy link
Collaborator

Thanks for the explanation @lefos99! Would you mind moving your issue over to #9731 so that we have one consolidated issue about it? Also, do you have an example of what you are doing with custom resolvers?

@lefos99
Copy link
Author

lefos99 commented Jan 2, 2024

Thanks again for the reply @dberenbaum.

Would you mind moving your issue over to #9731 so that we have one consolidated issue about it? Also, do you have an example of what you are doing with custom resolvers?

I moved my issue to #9731 and added some examples for our custom resolvers. I will close this issue now.

@lefos99 lefos99 closed this as completed Jan 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants