ValueError in DiagnosticReport if synthetic data does not match metadata #508

Closed
frances-h opened this issue Nov 9, 2023 · 0 comments · Fixed by #499

Environment Details

Please indicate the following details about the environment in which you found the bug:

  • SDMetrics version: diagnostic_report_updates
  • Python version:
  • Operating System:

Error Description

Currently, the DiagnosticReport raises a ValueError if the synthetic data does not match the given metadata. Because the DiagnosticReport includes metrics designed to evaluate exactly this situation, the report should not error in that case. The report should still validate that the real data matches the metadata, and the error message should be updated to indicate that only the real data has missing or extra columns.
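
The requested behavior can be sketched in plain Python as follows. This is only an illustration of the idea, not the SDMetrics implementation: the helper names `find_column_mismatch` and `validate_real_data_columns` are hypothetical, and the functions take column names (for example `list(data.columns)`) so the sketch runs without any dependencies. The metadata and column names mirror the reproduction in this issue.

```python
# Hypothetical helpers illustrating the requested behavior; these are NOT
# part of the SDMetrics API.

def find_column_mismatch(column_names, metadata):
    """Return (missing, extra) column names relative to the metadata."""
    expected = set(metadata['columns'])
    actual = set(column_names)
    missing = sorted(expected - actual)
    extra = sorted(actual - expected)
    return missing, extra


def validate_real_data_columns(real_column_names, metadata):
    """Raise ValueError only when the REAL data disagrees with the metadata."""
    missing, extra = find_column_mismatch(real_column_names, metadata)
    if missing or extra:
        raise ValueError(
            'The real data does not match the metadata. '
            f'Missing columns: {missing}. Extra columns: {extra}.'
        )


metadata = {
    'columns': {
        'id': {'sdtype': 'id'},
        'val1': {'sdtype': 'categorical'},
        'val2': {'sdtype': 'numerical'},
    },
    'primary_key': 'id',
}

real_columns = ['id', 'val1', 'val2']
synthetic_columns = ['id', 'extra_col', 'val1']

# The real data is validated strictly and passes here.
validate_real_data_columns(real_columns, metadata)

# The synthetic data is only inspected, never rejected, so a report could
# still score the mismatch instead of raising.
missing, extra = find_column_mismatch(synthetic_columns, metadata)
print(f'Synthetic data is missing {missing} and has extra {extra}')
```

The design choice in this sketch is to keep strict validation for the real data, since it defines the expected schema, while treating a synthetic mismatch as something to measure rather than as an error.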

Steps to reproduce

import pandas as pd
from sdmetrics.reports.single_table import DiagnosticReport

# Real data: matches the metadata below.
data = pd.DataFrame({
    'id': [0, 1, 2],
    'val1': ['a', 'a', 'b'],
    'val2': [0.1, 2.4, 5.7]
})

# Synthetic data: missing 'val2' and containing an extra column.
synthetic_data = pd.DataFrame({
    'id': [1, 2, 3],
    'extra_col': ['x', 'y', 'z'],
    'val1': ['c', 'd', 'd']
})

metadata = {
    'columns': {
        'id': {'sdtype': 'id'},
        'val1': {'sdtype': 'categorical'},
        'val2': {'sdtype': 'numerical'}
    },
    'primary_key': 'id'
}

# Raises ValueError because the synthetic data does not match the metadata.
report = DiagnosticReport()
report.generate(data, synthetic_data, metadata)

frances-h added the bug and new labels on Nov 9, 2023
npatki removed the new label on Nov 13, 2023
amontanez24 added this to the 0.13.0 milestone on Nov 30, 2023