Report errors when loading YAML files with duplicate keys #257

Merged 4 commits on May 28, 2020

@jgehring (Contributor) commented May 26, 2020

This adds a custom constructor to the YAML loader created in
get_yaml_loader() so that duplicate keys in the YAML file result in a
PyYAML ConstructorError.

Duplicate keys are not allowed as per the YAML spec but are silently
ignored by PyYAML. This is a potential source of hard-to-detect errors
due to misconfigurations. It has been reported several times in the
PyYAML issues, e.g. yaml/pyyaml#165. The
specific fix here was proposed at https://gist.github.com/pypt/94d747fe5180851196eb.
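
A minimal sketch of the approach described above, for illustration only; it is not the code merged in this PR. `UniqueKeyLoader` is a hypothetical name, and the choice to subclass `yaml.SafeLoader` and to skip YAML merge keys (`<<`) is an assumption made to keep the example small.

```python
import yaml
from yaml.constructor import ConstructorError


class UniqueKeyLoader(yaml.SafeLoader):
    """Hypothetical loader that rejects mappings with duplicate keys."""

    def construct_mapping(self, node, deep=False):
        if isinstance(node, yaml.MappingNode):
            seen = set()
            for key_node, _value_node in node.value:
                # Assumption: merge keys ("<<") are left to the base class.
                if key_node.tag == "tag:yaml.org,2002:merge":
                    continue
                key = self.construct_object(key_node, deep=deep)
                try:
                    duplicate = key in seen
                except TypeError:
                    # Unhashable key: the base class reports this case.
                    continue
                if duplicate:
                    raise ConstructorError(
                        "while constructing a mapping",
                        node.start_mark,
                        f"found duplicate key {key!r}",
                        key_node.start_mark,
                    )
                seen.add(key)
        return super().construct_mapping(node, deep=deep)


# Stock PyYAML silently keeps the last value for a repeated key.
silently_merged = yaml.safe_load("a: 1\na: 2\n")  # -> {'a': 2}

# The sketch above turns the same situation into a ConstructorError.
try:
    yaml.load("a:\n b: 1\na:\n b: 2\n", Loader=UniqueKeyLoader)
    duplicate_rejected = False
except ConstructorError:
    duplicate_rejected = True
```

Overriding `construct_mapping` works here because PyYAML's mapping constructor calls it for every mapping node, so nested mappings are checked as well.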

@jgehring jgehring force-pushed the yaml-check-duplicate-keys branch from d4c262a to 1c8dbbb Compare May 26, 2020 12:10

@omry (Owner) left a comment

Awesome, a few notes:

  1. Fix lint.
  2. Add a news fragment file (this doc in Hydra has some general, mostly applicable info about it).
  3. See inline comment (applies to both tests).


try:
    with tempfile.NamedTemporaryFile(delete=False) as fp:
        fp.write("a:\n b: 1\na:\n b: 2\n".encode("utf-8"))

@omry (Owner) commented:

Switch the tests to use """ strings to make them easier to understand.

Suggested change
- fp.write("a:\n b: 1\na:\n b: 2\n".encode("utf-8"))
+ content = """
+ a:
+  b: 1
+ a:
+  b: 2
+ """
+ fp.write(content.encode("utf-8"))

@jgehring (Contributor, Author) commented May 27, 2020

I guess "bugfix" is the most appropriate label for this PR?

Comment on lines +122 to +123
    with pytest.raises(ConstructorError):
        OmegaConf.load(fp.name)

@omry (Owner) commented:

This becomes:

with pytest.warns(...)

@omry (Owner) commented May 28, 2020

Alright, error it is!

@omry omry merged commit 721efd2 into omry:master May 28, 2020
omry pushed a commit that referenced this pull request May 28, 2020
* Report errors when loading YAML files with duplicate keys
