Add validate method to BaseMultiTableSynthesizer #1071

Closed
amontanez24 opened this issue Oct 19, 2022 · 0 comments
Labels
feature request Request for a new feature

Problem Description

As a user, I want to check whether the data I provide matches my metadata at the multi-table level.

Acceptance criteria

  • Add a method called validate to the BaseMultiTableSynthesizer class
    • This method should have the following parameters:
      • data: a dictionary mapping table name -> the pandas DataFrame for that table
    • The method should call SingleTableSynthesizer.validate for each table and accumulate the errors it reports
    • Additionally, it should perform the following check: every value in a column marked as a foreign key must reference a primary key that exists in the parent table (i.e. no unknown parent references).
      • If this check fails, raise the following error:
        Error: Foreign key column 'purchaser_id' contains unknown references: ('Unknown', 'USER_999', 'ZZZ', +more). All the values in this column must reference a primary key.
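
The foreign key check described above could be sketched roughly as follows. This is only an illustration, not SDV's implementation: the helper names `find_unknown_references` and `format_unknown_references`, and the `max_shown` cutoff of three values, are assumptions made for the example. The helpers accept any iterable of values, so a pandas column such as `data['purchases']['purchaser_id']` could be passed directly. Only the first sentence of the error message is reproduced.

```python
def find_unknown_references(child_values, parent_keys):
    """Return foreign key values missing from the parent keys, in first-seen order."""
    known = set(parent_keys)
    seen = set()
    unknown = []
    for value in child_values:
        if value not in known and value not in seen:
            seen.add(value)
            unknown.append(value)
    return unknown


def format_unknown_references(column_name, unknown, max_shown=3):
    """Build the first sentence of the error, listing at most `max_shown` values."""
    shown = ', '.join(repr(value) for value in unknown[:max_shown])
    if len(unknown) > max_shown:
        # Hypothetical truncation rule, modelled on the '+more' in the issue text.
        shown += ', +more'
    return (
        f"Error: Foreign key column '{column_name}' contains unknown "
        f"references: ({shown})."
    )


# Example with plain lists standing in for DataFrame columns.
parent_ids = ['USER_1', 'USER_2']
purchaser_ids = ['USER_1', 'Unknown', 'USER_999', 'ZZZ', 'Unknown', 'BAD']

unknown = find_unknown_references(purchaser_ids, parent_ids)
message = format_unknown_references('purchaser_id', unknown)
print(message)
```

Missing values in the foreign key column get no special treatment in this sketch. Whether a null foreign key counts as an unknown reference is left open by the issue.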

Expected behavior

>>> synthesizer = HMASynthesizer(metadata)
>>> synthesizer.validate(data)

InvalidDataError: The provided data does not match the metadata

Error: Foreign key column 'purchaser_id' contains unknown references: ('Unknown', 'USER_999', 'ZZZ', +more). All the values in this column must reference a primary key.
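
The accumulate-then-raise behavior shown above might look something like the sketch below. Everything here is a stand-in written for illustration: `InvalidDataError` is defined locally rather than imported from SDV, and `validate_all`, `table_checks`, and `cross_table_checks` are hypothetical names. The point is only that every check runs first and a single error carrying all the messages is raised at the end.

```python
class InvalidDataError(Exception):
    """Local stand-in for the error type named in the issue (not SDV's class)."""

    def __init__(self, errors):
        self.errors = list(errors)
        details = '\n\n'.join(self.errors)
        super().__init__(
            f'The provided data does not match the metadata\n\n{details}'
        )


def validate_all(data, table_checks, cross_table_checks):
    """Run every check, collect the error strings, and raise once at the end.

    data: dict mapping table name -> table data
    table_checks: dict mapping table name -> callable(table) -> list of errors
    cross_table_checks: list of callable(data) -> list of errors
    """
    errors = []
    # Per-table checks play the role of the single-table validation.
    for table_name, table in data.items():
        check = table_checks.get(table_name)
        if check is not None:
            errors.extend(check(table))
    # Cross-table checks cover rules such as the foreign key check.
    for check in cross_table_checks:
        errors.extend(check(data))
    if errors:
        raise InvalidDataError(errors)


def purchaser_check(data):
    """Hypothetical cross-table check for one hard-coded relationship."""
    known = set(data['users']['user_id'])
    bad = [v for v in data['purchases']['purchaser_id'] if v not in known]
    if bad:
        return [f"Foreign key column 'purchaser_id' has {len(bad)} unknown value(s)."]
    return []


# Plain dicts of lists stand in for the table name -> DataFrame mapping.
data = {
    'users': {'user_id': ['USER_1', 'USER_2']},
    'purchases': {'purchaser_id': ['USER_1', 'ZZZ']},
}

try:
    validate_all(data, table_checks={}, cross_table_checks=[purchaser_check])
except InvalidDataError as error:
    print(error)
```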