Update error message when catalog entry is invalid #3944

Merged
ankatiyar merged 4 commits into main from catalog-error-message
Jun 13, 2024

Conversation

Contributor

@ankatiyar ankatiyar commented Jun 10, 2024

Description

Fix #3910

Development notes

Add an error message for the case where a catalog entry is not a dictionary, e.g.:

invalid_entry: whatever

Also update the error message for the case where the catalog entry is a dict but the type: key is missing.
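
The two cases described above can be sketched as a small validation routine. This is a minimal illustration only, not Kedro's implementation: `CatalogEntryError`, `validate_catalog_entries`, `first_error`, and the message wording are hypothetical names chosen for this example.

```python
from __future__ import annotations

from typing import Any


class CatalogEntryError(Exception):
    """Hypothetical error type standing in for the library's own exception."""


def validate_catalog_entries(catalog: dict[str, Any]) -> None:
    """Raise a descriptive error for the first invalid catalog entry."""
    for ds_name, ds_config in catalog.items():
        # Case 1: the entry is not a mapping, e.g. `invalid_entry: whatever`.
        if not isinstance(ds_config, dict):
            raise CatalogEntryError(
                f"Catalog entry '{ds_name}' is not a valid dataset configuration: "
                f"expected a dictionary, got {type(ds_config).__name__}."
            )
        # Case 2: the entry is a mapping but does not name a dataset class.
        if "type" not in ds_config:
            raise CatalogEntryError(
                f"Catalog entry '{ds_name}' is missing the required 'type' key."
            )


def first_error(catalog: dict[str, Any]) -> str | None:
    """Return the error message for an invalid catalog, or None if it is valid."""
    try:
        validate_catalog_entries(catalog)
    except CatalogEntryError as exc:
        return str(exc)
    return None
```

Calling `first_error({"invalid_entry": "whatever"})` returns a message that names the offending key, which is the kind of pointer the original error lacked.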

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

@ankatiyar ankatiyar marked this pull request as ready for review June 10, 2024 14:27
@ankatiyar ankatiyar requested a review from merelcht as a code owner June 10, 2024 14:27
@ankatiyar ankatiyar requested review from noklam and DimedS June 10, 2024 14:27
Member

@merelcht merelcht left a comment

I left one question. + don't forget to add this to the release notes ✍️

@@ -288,6 +288,12 @@ class to be loaded is specified with the key ``type`` and their
        user_default = {}

        for ds_name, ds_config in catalog.items():
            if not isinstance(ds_config, dict):
Member

In #3555 there was a bit of discussion on what to do with values that are dict type, but not meant as dataset. Is it worth updating the message for that case as well to mention interpolation?

Contributor Author

Yep, I've added a hint to the error that comes from AbstractDataset.from_config() as well!
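
A rough sketch of how such a hint might sit alongside the dict-but-not-a-dataset case. It assumes the convention from Kedro's catalog templating documentation that top-level keys meant only for interpolation start with an underscore; that convention is not stated in this thread, and `INTERPOLATION_HINT`, `split_catalog`, and `describe_missing_type` are hypothetical names with illustrative wording.

```python
from __future__ import annotations

from typing import Any

# Hypothetical hint text; the wording in the merged change may differ.
INTERPOLATION_HINT = (
    "Hint: if this entry is only meant for variable interpolation, "
    "prefix its top-level key with an underscore."
)


def split_catalog(raw: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate dataset entries from underscore-prefixed template entries."""
    datasets = {k: v for k, v in raw.items() if not k.startswith("_")}
    templates = {k: v for k, v in raw.items() if k.startswith("_")}
    return datasets, templates


def describe_missing_type(ds_name: str) -> str:
    """Build an error message for a dict entry that has no 'type' key."""
    return f"'type' is missing from catalog entry '{ds_name}'.\n{INTERPOLATION_HINT}"
```

With this split, a dict such as `_csv_defaults` is never treated as a dataset, while a dict without the underscore and without `type` produces a message that tells the user how to fix either intent.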

Signed-off-by: Ankita Katiyar <[email protected]>
@ankatiyar ankatiyar requested a review from merelcht June 11, 2024 10:36
Member

@merelcht merelcht left a comment

LGTM 👍

Member

@DimedS DimedS left a comment

Thank you, @ankatiyar! The PR looks great. I have one question: is it okay that we receive a different error if we put just some_value into the catalog, without a : after it?

@ankatiyar
Contributor Author

> Thank you, @ankatiyar! The PR looks great. I have one question: is it okay that we receive a different error if we put just some_value into the catalog, without a : after it?

Some offline discussion with @DimedS on this. Consider a catalog entry that is just a single word, like this:

some_value

This is invalid YAML syntax, so it errors out much earlier, when OmegaConfigLoader tries to load the catalog:

 File "/Users/ankita_katiyar/kedro/kedro/kedro/config/omegaconf_config.py", line 319, in load_and_merge_dir_config
    raise ParserError(
yaml.parser.ParserError: Invalid YAML or JSON file /Users/ankita_katiyar/kedro_projects/demo/conf/base/catalog.yml, unable to read line 73, position 0.

This is not Kedro-specific; afaik it is simply not allowed in YAML. So to me these seem like two different issues. This PR addresses the case where a catalog entry is syntactically valid YAML but "wrong" by Kedro's rules (e.g. not a valid dataset) in catalog.yml.
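
To illustrate why these are two separate failure stages, here is a minimal sketch. It uses the standard-library `json` parser as a stand-in for the YAML loader purely to stay dependency-free, so the exact inputs that count as malformed differ from YAML. `load_catalog`, `check_entries`, and `classify` are hypothetical helpers, not Kedro or OmegaConfigLoader APIs.

```python
from __future__ import annotations

import json
from typing import Any


def load_catalog(text: str) -> dict[str, Any]:
    """Stage 1: parse. Malformed input fails here, with a line and column."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"Invalid file, unable to read line {exc.lineno}, position {exc.colno}."
        ) from exc


def check_entries(catalog: dict[str, Any]) -> None:
    """Stage 2: validate. The file parsed, but an entry breaks the catalog rules."""
    for name, config in catalog.items():
        if not isinstance(config, dict):
            raise TypeError(
                f"Catalog entry '{name}' is not a valid dataset configuration."
            )


def classify(text: str) -> str:
    """Report which stage, if any, rejects the given catalog text."""
    try:
        catalog = load_catalog(text)
    except ValueError:
        return "syntax error"
    try:
        check_entries(catalog)
    except TypeError:
        return "invalid entry"
    return "ok"
```

A bare word inside the mapping never reaches stage 2, because the parser rejects the file first; a well-formed `"invalid_entry": "whatever"` pair passes parsing and is only caught by the entry check, which is the case this PR targets.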

Member

@DimedS DimedS left a comment

I agree. It seems it will be complicated to handle this properly on the Kedro side. Additionally, the error is meaningful as it points to the specific line number in catalog.yml where the parsing issue occurs, unlike the other error mentioned in the ticket.

@ankatiyar ankatiyar enabled auto-merge (squash) June 13, 2024 10:01
@ankatiyar ankatiyar merged commit b6e585f into main Jun 13, 2024
41 checks passed
@ankatiyar ankatiyar deleted the catalog-error-message branch June 13, 2024 10:17
bpmeek pushed a commit to bpmeek/kedro that referenced this pull request Jun 20, 2024
* Update error message when catalog entry is invalid

Signed-off-by: Ankita Katiyar <[email protected]>

* Update error message for dict type as well

Signed-off-by: Ankita Katiyar <[email protected]>

* Move error message

Signed-off-by: Ankita Katiyar <[email protected]>

---------

Signed-off-by: Ankita Katiyar <[email protected]>
Signed-off-by: bpmeek <[email protected]>
bpmeek pushed a commit to bpmeek/kedro that referenced this pull request Jul 18, 2024
Successfully merging this pull request may close these issues.

[DataCatalog]: Error message is confusing if the catalog.yaml is invalid
3 participants