Transformer should not be fit on test data #291

Closed
dionman opened this issue Jan 3, 2023 · 3 comments · Fixed by #492
Labels: bug (Something isn't working), feature:metrics (Related to any of the individual metrics)

Comments


dionman commented Jan 3, 2023

Environment Details

  • SDMetrics version: v0.8.1
  • Python version: 3.9
  • Operating System: Ubuntu

Error Description

In the current implementation of the MLEfficacyMetric base class, the transformer is fit on the test data:

test_data = ht.fit_transform(test_data)
This leaks information from the (unseen) test dataset into the preprocessing step and biases the reported metrics. Instead, fit_transform should be applied to the train data, and only transform should be applied to the test data.
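
To illustrate the difference, here is a minimal sketch in plain Python. `MeanStdScaler` is a hypothetical stand-in that I wrote for this example; it is not the transformer used in SDMetrics, and the data values are made up. It only shows the general pattern of learning statistics from the train data and reusing them on the test data.

```python
from statistics import mean, pstdev


class MeanStdScaler:
    """Hypothetical stand-in for a fitted transformer (not the SDMetrics/RDT one)."""

    def fit(self, values):
        # Learn the statistics from the data passed in.
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0  # avoid division by zero
        return self

    def transform(self, values):
        # Apply the previously learned statistics without refitting.
        return [(v - self.mean_) / self.std_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


train = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up train column
test = [10.0, 12.0]                # made-up test column, shifted upward

scaler = MeanStdScaler()
train_scaled = scaler.fit_transform(train)  # statistics come from train only
test_scaled = scaler.transform(test)        # train statistics are reused

# Leaky variant for comparison: refitting on test re-centers the test data,
# so the shift relative to train is no longer visible to the model.
leaky_test_scaled = MeanStdScaler().fit_transform(test)
```

With the non-leaky variant, the scaled test values stay far from zero because they are measured against the train distribution. With the leaky variant they are centered around zero, which hides the shift.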

@dionman added the bug and new labels Jan 3, 2023
Contributor

npatki commented Jan 3, 2023

Thanks for filing this @dionman. I'll check back in with the team in case there's any reason why it was implemented the way it is. Will keep this issue open to report any progress or updates.

@npatki added the under discussion and feature:metrics labels and removed the new label Jan 3, 2023
Author

dionman commented Jan 24, 2023

Thanks. Are there any updates on this?

Contributor

npatki commented Apr 6, 2023

Hi @dionman, this bug has been added to our queue. We'll use this issue to provide any updates.

@npatki removed the under discussion label Jun 5, 2023
@amontanez24 added this to the 0.12.1 milestone Nov 1, 2023