prevent NoneType exception when profiling empty datasets #3144

Merged (2 commits) on Aug 23, 2021

Conversation

sgomezvillamor
Contributor

@sgomezvillamor sgomezvillamor commented Aug 23, 2021

res["unexpected_percent"] is None when profiling empty datasets, so the division raises a TypeError. This PR prevents that scenario.
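A minimal sketch of the failure mode, for illustration only. It assumes the pre-fix code checked only for the key's presence and then divided the value by 100 (the exact expression is not shown in this conversation); the res dictionary and the function name are hypothetical stand-ins for a profiling result on an empty table.

```python
from typing import Optional

# Hypothetical result payload for an empty dataset: the key is present,
# but its value is None because there are no rows to compute a percentage from.
res = {"unexpected_count": 0, "unexpected_percent": None}


def null_proportion_unguarded(res: dict) -> Optional[float]:
    # Assumed shape of the pre-fix logic: key presence is checked, the value is not.
    if "unexpected_percent" in res:
        return res["unexpected_percent"] / 100
    return None


try:
    null_proportion_unguarded(res)
    error_message = None
except TypeError as exc:
    # Dividing None by an int raises TypeError mentioning 'NoneType'.
    error_message = str(exc)

print(error_message)
```

The key-presence check passes because the key exists; only the value is missing, which is why the check alone is not enough.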

Issue #3132

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable)

@@ -219,7 +219,8 @@ def _handle_convert_column_evrs( # noqa: C901 (complexity)
elif exp == "expect_column_values_to_not_be_null":
    column_profile.nullCount = res["unexpected_count"]
    if "unexpected_percent" in res:
Contributor

I think the system can handle column_profile.nullProportion = None, so how about setting it only when both of these are true?

So I guess:

if "unexpected_percent" in res and res["unexpected_percent"] is not None:
    column_profile.nullProportion = res["unexpected_percent"]
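The combined check can be sketched as a small runnable example. This is an illustration under stated assumptions, not the merged code: ColumnProfile and apply_not_null_result are hypothetical stand-ins, and the division by 100 is assumed from the PR description's mention of a division.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnProfile:
    # Hypothetical stand-in for the real column profile object.
    nullCount: Optional[int] = None
    nullProportion: Optional[float] = None


def apply_not_null_result(column_profile: ColumnProfile, res: dict) -> ColumnProfile:
    column_profile.nullCount = res["unexpected_count"]
    # dict.get returns None both when the key is missing and when its value
    # is None, so one test covers both conditions from the suggestion.
    percent = res.get("unexpected_percent")
    if percent is not None:
        # Assumption: the percentage is scaled to a 0-1 proportion.
        column_profile.nullProportion = percent / 100
    return column_profile


# Empty dataset: nullProportion is left as None instead of raising.
empty = apply_not_null_result(
    ColumnProfile(), {"unexpected_count": 0, "unexpected_percent": None}
)
# Non-empty dataset: the proportion is computed as usual.
populated = apply_not_null_result(
    ColumnProfile(), {"unexpected_count": 5, "unexpected_percent": 25.0}
)
```

Leaving nullProportion unset for empty datasets relies on the point made above that the system can handle a None value there.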

Contributor Author

That makes total sense. I will try that!

@shirshanka
Contributor

@sgomezvillamor: it seems lint is complaining (https://github.com/linkedin/datahub/pull/3144/checks?check_run_id=3402002680)

You can fix it by running ./gradlew :metadata-ingestion:lintFix from the top-level directory.

Also, I left a suggestion for rewriting the check and combining it with the previous one.

@sgomezvillamor sgomezvillamor changed the title fix(profiling): prevent NoneType exception when profiling empty datasets prevent NoneType exception when profiling empty datasets Aug 23, 2021
Contributor

@shirshanka shirshanka left a comment

LGTM!

@shirshanka shirshanka merged commit dd7bead into datahub-project:master Aug 23, 2021
shirshanka pushed a commit to shirshanka/datahub that referenced this pull request Aug 27, 2021
gabe-lyons pushed a commit to gabe-lyons/datahub that referenced this pull request Aug 31, 2021
rahulbsw pushed a commit to rahulbsw/datahub that referenced this pull request Sep 2, 2021