Add verbosity to drop_unknown_references #1854

Merged
merged 6 commits into from
Mar 21, 2024

Conversation

R-Palazzo
Contributor

CU-86aznj0ht
Resolve #1845

@npatki I slightly changed the printed message compared to the one in the issue. Let me know if it works:

  • Data with referential integrity:
    [Screenshot 2024-03-18 at 14 43 30]
  • Data without referential integrity:
    [Screenshot 2024-03-18 at 14 44 03]

@R-Palazzo R-Palazzo requested a review from a team as a code owner March 18, 2024 14:46
@R-Palazzo R-Palazzo removed the request for review from a team March 18, 2024 14:46
expected_message = (
    'Success! All foreign keys have referential integrity.\n'
    'No rows were dropped.'
)
assert captured.out.strip() == expected_message
Contributor

I would rather we not have a special-case message here. In the spec, we said we should print the summary for every table, every time.

Even if nothing was dropped, I think it's useful reassurance to the user that we checked all tables. Plus, it simplifies our logic.

Contributor Author

Alright, done in 7a80b0a
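
For readers following this thread, the "report on every table, every time" behaviour requested above might look roughly like the sketch below. It is illustrative only and is not the code from this PR: the function name `summarize_dropped_rows` and the keys `table`, `rows_before`, `rows_dropped` and `rows_after` are hypothetical, and row counts are passed in as plain dictionaries to keep the example self-contained.

```python
def summarize_dropped_rows(rows_before, rows_after):
    """Build one summary entry per table, including tables that lost no rows.

    Both arguments map a table name to a row count. All names used here are
    hypothetical and chosen only for illustration.
    """
    summary = []
    for table_name in sorted(rows_before):
        before = rows_before[table_name]
        after = rows_after[table_name]
        # No special case for "nothing dropped": a table that kept all of
        # its rows is still listed, with rows_dropped equal to 0.
        summary.append({
            'table': table_name,
            'rows_before': before,
            'rows_dropped': before - after,
            'rows_after': after,
        })
    return summary
```

Because there is no branch for the "no rows dropped" case, the same code path produces the output whether or not the data already has referential integrity, which is the simplification the review comment points to.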

@npatki
Contributor

npatki commented Mar 18, 2024

@R-Palazzo I also think that in the first case you have added a line, "Summary of the number of rows dropped:", which is not really necessary.

@codecov-commenter

codecov-commenter commented Mar 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.33%. Comparing base (7b1b9b7) to head (6067e02).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1854   +/-   ##
=======================================
  Coverage   97.32%   97.33%           
=======================================
  Files          51       51           
  Lines        4823     4834   +11     
=======================================
+ Hits         4694     4705   +11     
  Misses        129      129           

sdv/utils/poc.py Outdated
    metadata.validate()
    try:
        metadata.validate_data(data)
        if drop_missing_values:
            _validate_foreign_keys_not_null(metadata, data)

        if verbose:
            sys.stdout.write('\n'.join([success_message, summary_table.to_string(index=False)]))
Member

Why sys.stdout instead of print? (Just curious.)

Should we add one more \n? This is how it looks now:
[image]

I think that this is a bit more digestible:
[image]

Contributor Author

Thanks, done in 6067e02. We get a lint error when using print, but I could also add # noqa: T001 (in SDMetrics we use sys.stdout).
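
To illustrate the point discussed in this thread: unlike print, sys.stdout.write does not append a newline, so every line break has to be written out explicitly. The sketch below assumes the extra \n goes between the message and the table, leaving a blank line there. That placement is an assumption, since the thread does not show the final code, and the helper name `write_summary` is hypothetical.

```python
import sys


def write_summary(success_message, table_text):
    """Write the message and the table to stdout, separated by a blank line.

    Hypothetical helper for illustration; not the code merged in this PR.
    """
    # Joining with two newlines leaves one empty line between the message
    # and the table. Nothing is appended after the table, because
    # sys.stdout.write adds no trailing newline of its own.
    sys.stdout.write('\n\n'.join([success_message, table_text]))
```

In a test, the written text can be captured and compared after stripping surrounding whitespace, in the same spirit as the captured.out.strip() assertion quoted earlier in this conversation.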

@R-Palazzo R-Palazzo merged commit bf204f2 into main Mar 21, 2024
37 checks passed
@R-Palazzo R-Palazzo deleted the issue-1845-verbosity-drop-unknown-references branch March 21, 2024 14:55