
Investigate removing quality tests #642

Closed
amontanez24 opened this issue May 1, 2023 · 0 comments · Fixed by #650

Problem Description

As an engineer, it's confusing to have code that isn't being used or doesn't fully achieve its purpose.

The quality tests were originally added as a way to determine whether new transformers were of lower quality than existing ones. In these tests, quality was defined as how well a transformer captures relationships between columns. To test this, the following steps are taken (see the sketch after this list):

  1. It creates a list of test cases. Each test case has a dataset and a set of sdtypes to test for that dataset.
  2. A dictionary is created mapping each sdtype to a DataFrame containing the regression scores obtained from running the transformers of that sdtype against the datasets in the test cases. Each row in the DataFrame has the transformer name, dataset name, column name and score. The scores are computed as follows:
    1. For every transformer of the sdtype, transform all the columns of that sdtype.
    2. For every numerical column in the dataset, the transformed columns are used as features to train a regression model that predicts it.
    3. The score is the coefficient of determination obtained from that model when predicting the target column.
  3. Once the scores are gathered, a results table is created. Each row has a transformer name, a dataset name, the transformer's average score for the dataset, and a comparative score that relates the transformer's average for the dataset to the average of those per-dataset averages across all transformers of the same sdtype.
  4. For every unique transformer in the results, a test checks that, for each dataset, either the transformer's score or its comparative score is higher than the threshold.
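A minimal sketch of the per-column scoring in step 2, assuming pandas DataFrames and using scikit-learn's LinearRegression as a stand-in regression model; the function name, signature, and model choice are illustrative, not the actual quality-test code:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression


def score_transformed_features(transformed, data, numerical_columns,
                               transformer_name, dataset_name):
    """For each numerical column in the dataset, fit a regression model that
    predicts it from the transformed columns and record the coefficient of
    determination (R^2) as the score."""
    rows = []
    for target in numerical_columns:
        model = LinearRegression()
        model.fit(transformed, data[target])
        rows.append({
            'transformer_name': transformer_name,
            'dataset_name': dataset_name,
            'column_name': target,
            'score': model.score(transformed, data[target]),
        })

    return pd.DataFrame(rows)
```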

The problem is that the tests haven't been used as intended. Many of our transformers don't do a good job of capturing information that can be used to predict other columns, because that isn't necessarily what they were designed to do on their own. The only transformer that seems to do this well is OneHotEncoder, which makes sense since it creates a new column for every unique value of a category. If there are correlations between those categories and other columns, it can capture them.

To get the tests to pass for the other transformers, we simply set the score threshold very low. This isn't an effective way to test quality: if the goal was to catch new transformers with poor "quality", a low threshold means such a transformer would probably pass the tests anyway.
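For illustration, the pass/fail condition from step 4 boils down to something like the following check; the column names and threshold value here are hypothetical:

```python
def transformer_passes(results, threshold=0.1):
    """A transformer passes if, for every dataset, either its average score or
    its comparative score clears the threshold. With a very low threshold,
    nearly any transformer passes, so the check catches little."""
    per_dataset_ok = (results['score'] > threshold) | (results['comparative_score'] > threshold)
    return bool(per_dataset_ok.all())
```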

We removed the quality tests from the GitHub workflows because we temporarily lost access to the datasets they were using. The question moving forward is whether we should remove the code altogether, since it isn't being used.

Expected behavior

  • Investigate if there is any code in the quality test package that should be kept or provides some value.
  • Remove what isn't needed.

Additional context

@amontanez24 amontanez24 added the maintenance Task related to infrastructure & dependencies label May 1, 2023
@amontanez24 amontanez24 self-assigned this May 31, 2023
@amontanez24 amontanez24 added this to the 1.5.0 milestone May 31, 2023