[Bug] Error when using config with UTF characters #1659

Closed
2 tasks done
michaelitvin opened this issue Jun 9, 2021 · 4 comments · Fixed by #1660

@michaelitvin

michaelitvin commented Jun 9, 2021

🐛 Bug

Description

A UnicodeEncodeError is thrown when the config contains non-ASCII Unicode characters and Hydra saves it to the output directory.

Workaround

In hydra/core/utils.py:

def _save_config(cfg: DictConfig, filename: str, output_dir: Path) -> None:
    output_dir.mkdir(parents=True, exist_ok=True)
    yaml = OmegaConf.to_yaml(cfg).encode("utf8")
    with open(str(output_dir / filename), "wb") as file:
        file.write(yaml)
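An alternative to writing bytes is to keep the file in text mode and pass the encoding explicitly. The sketch below uses only the standard library: `save_text` is a hypothetical helper (not part of Hydra), and the hard-coded string stands in for the output of `OmegaConf.to_yaml(cfg)`.

```python
import tempfile
from pathlib import Path


def save_text(text: str, filename: str, output_dir: Path) -> Path:
    """Hypothetical helper: write text as UTF-8 regardless of the platform default."""
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / filename
    # An explicit encoding keeps open() from falling back to the locale
    # default (cp1252 on the Windows machine in this report).
    with open(target, "w", encoding="utf-8") as file:
        file.write(text)
    return target


# Stand-in for OmegaConf.to_yaml(cfg), using the config from this report.
yaml_text = "a: א\n"

with tempfile.TemporaryDirectory() as tmp:
    saved = save_text(yaml_text, "config.yaml", Path(tmp) / ".hydra")
    round_trip = saved.read_text(encoding="utf-8")

print(round_trip)
```

Both approaches produce UTF-8 output; the text-mode variant just lets Python handle the encoding step at write time.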

Checklist

  • I checked on the latest version of Hydra
  • I created a minimal repro (See this for tips).

To reproduce

Use a config.yaml with unicode chars, and make Hydra store it in outputs/.

Minimal example

Use the tutorial code:

from omegaconf import DictConfig, OmegaConf
import hydra

@hydra.main(config_name='config')
def my_app(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    my_app()

Along with a config that has Unicode chars:
config.yaml:

a: א

Stack trace/error message

  File "D:\proj\video-pipeline\venv38\lib\site-packages\hydra\main.py", line 32, in decorated_main
    _run_hydra(
  File "D:\proj\video-pipeline\venv38\lib\site-packages\hydra\_internal\utils.py", line 346, in _run_hydra
    run_and_report(
  File "D:\proj\video-pipeline\venv38\lib\site-packages\hydra\_internal\utils.py", line 201, in run_and_report
    raise ex
  File "D:\proj\video-pipeline\venv38\lib\site-packages\hydra\_internal\utils.py", line 198, in run_and_report
    return func()
  File "D:\proj\video-pipeline\venv38\lib\site-packages\hydra\_internal\utils.py", line 347, in <lambda>
    lambda: hydra.run(
  File "D:\proj\video-pipeline\venv38\lib\site-packages\hydra\_internal\hydra.py", line 107, in run
    return run_job(
  File "D:\proj\video-pipeline\venv38\lib\site-packages\hydra\core\utils.py", line 122, in run_job
    _save_config(task_cfg, "config.yaml", hydra_output)
  File "D:\proj\video-pipeline\venv38\lib\site-packages\hydra\core\utils.py", line 70, in _save_config
    file.write(OmegaConf.to_yaml(cfg))
  File "C:\Users\micha\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 267-271: character maps to <undefined>

Expected Behavior

System information

  • Hydra Version : 1.0.6
  • Python version : 3.8.10
  • Virtual environment type and version : pip
  • Operating system : Windows 10


@michaelitvin michaelitvin added the bug Something isn't working label Jun 9, 2021
@jieru-hu
Contributor

jieru-hu commented Jun 9, 2021

Hi @michaelitvin,
Thanks for the report. I do not have a Windows machine and haven't been able to reproduce this on my Mac.

Which exact version of Hydra 1.0 are you on? We fixed a bug related to Unicode in config files in a later Hydra 1.0 release. Could you try upgrading to the latest Hydra 1.0 and see if the issue goes away?

pip install hydra-core --upgrade

@michaelitvin
Author

I'm using the latest from pip - 1.0.6.
The exception does make sense, as when the config is written to file, an encoding isn't specified. So a default is used, and on Windows that's cp1252.
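The failure described here can be reproduced without Hydra. The following is a minimal sketch, assuming only that the text being written contains the Hebrew letter from the example config: encoding it with cp1252 fails, while UTF-8 succeeds.

```python
import locale

text = "a: א\n"

# With no encoding argument, open() in Python 3.8 uses the locale's
# preferred encoding; the value printed here is platform-dependent.
print("default encoding:", locale.getpreferredencoding(False))

try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    # cp1252 has no mapping for Hebrew letters, so encoding fails.
    cp1252_ok = False

# UTF-8 can represent every Unicode code point.
utf8_bytes = text.encode("utf-8")

print("cp1252 can encode the text:", cp1252_ok)
print("utf-8 bytes:", utf8_bytes)
```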

@jieru-hu
Contributor

jieru-hu commented Jun 9, 2021

> I'm using the latest from pip - 1.0.6.
> The exception does make sense, as when the config is written to file, an encoding isn't specified. So a default is used, and on Windows that's cp1252.

I see! I will look into this. Thanks for reporting.

@jieru-hu
Contributor

jieru-hu commented Jun 9, 2021

@michaelitvin This will be fixed in Hydra 1.1.
We are not adding minor fixes to Hydra 1.0 at the moment, so we encourage everyone to upgrade to Hydra 1.1.

Thanks!

@jieru-hu jieru-hu added this to the Hydra 1.1.0 milestone Jun 9, 2021