PipelineBase.compute_estimator_features fails for Undersampler #2272
Conversation
Codecov Report

```
@@           Coverage Diff           @@
##             main   #2272    +/-  ##
========================================
+ Coverage   100.0%  100.0%   +0.1%
========================================
  Files         280     280
  Lines       24369   24382     +13
========================================
+ Hits        24347   24360     +13
  Misses         22      22
```

Continue to review full report at Codecov.
Thanks 👍!! Is there an issue to track investigating how to fix the underlying bug?
@angela97lin not yet, no. I'm still enumerating some of the issues in the hotfix. There are 6 broken tests and this only fixed one, so I'll need re-review.
```diff
@@ -57,7 +57,7 @@ def transform(self, X, y=None):
         X = infer_feature_types(X)
         if y is not None:
             y = infer_feature_types(y)
-        return X, y
+        return X, None
```
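For reference, the patched method as it reads after this change (assembled from the diff context above; the surrounding `BaseSampler` class and the `infer_feature_types` import are assumed):

```python
def transform(self, X, y=None):
    """Predict-time no-op: pass the features through and drop the target."""
    X = infer_feature_types(X)
    if y is not None:
        # The target is still validated, but deliberately not returned, so
        # downstream feature computation sees no "Undersampler.y" output.
        y = infer_feature_types(y)
    return X, None
```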
I wrote up an explanation for why this fixes the issue in the PR description. Copying here for visibility:

> Because the undersampler's `transform` returns both X and y, our current `ComponentGraph` impl will save the unmodified target returned from that method under the key `"Undersampler.y"` in the component features dict during evaluation. This doesn't cause problems during pipeline fit and predict, but when features are computed for each transformer in `_fit_transform_features_helper`, that code erroneously grabs the `"Undersampler.y"` output and appends it to the input of the next component, which in automl and in our reproducer happens to be the estimator. This is why one of the stack traces at the bottom comes from the estimator and mentions too many features being present.
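To make the failure mode concrete, here is an illustrative schematic (my own sketch, not evalml's actual `ComponentGraph` internals):

```python
import pandas as pd

# Toy stand-ins for the real pipeline data.
X = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
y = pd.Series([0, 1, 0], name="target")

# Pre-fix, Undersampler.transform returned (X, y), so both got cached:
output_cache = {"Undersampler.x": X, "Undersampler.y": y}

# The helper gathered every cached output of the previous component as input
# to the next one, so the estimator saw an extra column:
estimator_input = pd.concat(list(output_cache.values()), axis=1)
print(estimator_input.shape)  # (3, 3): two real features plus the stray target
```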
Long-term, we'll need to update our component graph to be tolerant of this behavior. If a component's `transform` happens to return a target, we should use that target; if it doesn't, we should just use the last modified target (see the sketch below). We have logic for this already in the main component graph method for fit/predict, but not yet in `_fit_transform_features_helper`. We can address this separately once this patch is merged and tested.
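A conceptual sketch of that target-resolution rule (illustrative only; the function name is hypothetical and this is not evalml's actual implementation):

```python
def resolve_next_target(transform_output, last_target):
    """Prefer a target returned by the transformer; otherwise keep the last one."""
    if isinstance(transform_output, tuple) and len(transform_output) == 2:
        X_t, y_t = transform_output
        # A returned (non-None) target supersedes the previously tracked one.
        return X_t, (y_t if y_t is not None else last_target)
    # The transformer returned features only; carry the last target forward.
    return transform_output, last_target
```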
```python
    assert isinstance(transform_output[0], ww.DataTable)
    assert isinstance(transform_output[1], ww.DataColumn)
elif 'sampler' in component.name:
    assert isinstance(transform_output[0], ww.DataTable)
    assert transform_output[1] is None
```
👍
```python
if 'sampler' in transformer.name:
    X_t, y_t = transformer.transform(X, y)
    X_t = X_t.to_dataframe()
    assert y_t is None
```
👍
@chukarsten looks great!

Before merging, please add direct coverage for calling `calculate_permutation_importance` on a pipeline which contains the undersampler. We can worry about adding generalized coverage later. I think a good place to do that is in `evalml/tests/model_understanding_tests/test_permutation_importance.py`.
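A hypothetical shape for that test (the `X_y_binary` fixture, component names, and objective string are assumptions, not the final committed test):

```python
from evalml.model_understanding import calculate_permutation_importance
from evalml.pipelines import BinaryClassificationPipeline


def test_permutation_importance_with_undersampler(X_y_binary):
    X, y = X_y_binary
    pipeline = BinaryClassificationPipeline(
        component_graph=["Imputer", "Undersampler", "Random Forest Classifier"]
    )
    pipeline.fit(X, y)
    importance = calculate_permutation_importance(
        pipeline, X, y, objective="Log Loss Binary"
    )
    # Previously this call raised inside compute_estimator_features.
    assert not importance["importance"].isnull().any()
```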
Looks great @chukarsten!
LGTM!
Background
Our `ComponentGraph` (first added in Dec 2020) was not originally intended to support transformer components which can alter the supervised target data. The undersampler was recently added as a component in the graph (#2030). The undersampler component is the first component we've had which alters the target data. The way the undersampler currently works is as follows (see the sketch after this list):

- In `ComponentGraph`, each transformer's `fit_transform` is called only during fit, whereas `transform` is called during predict.
- The undersampler relies on `fit_transform` to apply undersampling to the input data and target at the same time.
- We defined the undersampler's `transform` to be a no-op, and to pass both X and y through from input to output, because we assume this only happens at pipeline `predict`-time.
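A minimal sketch of that contract (simplified; not evalml's actual base class, and `_undersample_indices` is a hypothetical helper):

```python
import numpy as np


class UndersamplerSketch:
    """Simplified illustration of the sampler contract described above."""

    def fit_transform(self, X, y):
        # Fit-time: actually resample rows of X and y together.
        positions = self._undersample_indices(y)
        return X.iloc[positions], y.iloc[positions]

    def transform(self, X, y=None):
        # Predict-time: intended as a pass-through no-op. Pre-fix this
        # returned (X, y); the patch in this PR makes it return (X, None).
        return X, y

    def _undersample_indices(self, y):
        # Hypothetical helper: keep an equal number of rows per class.
        n = y.value_counts().min()
        values = y.to_numpy()
        positions = []
        for label in y.unique():
            positions.extend(np.flatnonzero(values == label)[:n])
        return positions
```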
When we added the undersampler in #2030 and related PRs, we added test coverage for evaluating the component, and for using the component in a pipeline and calling pipeline `fit`/`predict`. That all works just great. What we didn't add coverage for is calling the method `PipelineBase.compute_estimator_features` on a pipeline with the undersampler in it! This is used by permutation importance, which is how we first noticed the bug.

Symptoms
Calling `PipelineBase.compute_estimator_features` on a pipeline which uses the undersampler results in a stack trace. Reproducer at the bottom.

Cause
Because the undersampler's `transform` returns both X and y, our current `ComponentGraph` impl will save the unmodified target returned from that method under the key `"Undersampler.y"` in the component features dict during evaluation. This doesn't cause problems during pipeline fit and predict, but when features are computed for each transformer in `_fit_transform_features_helper`, that code erroneously grabs the `"Undersampler.y"` output and appends it to the input of the next component, which in automl and in our reproducer happens to be the estimator. This is why one of the stack traces at the bottom comes from the estimator and mentions too many features being present.

Solution
This PR has a quick fix to change `BaseSampler.transform` to return None instead of y. This means that when the features are computed for each transformer, the resulting component output dict will no longer contain a key `"Undersampler.y"`.

Long-term: we need to add test coverage to ensure all pipeline methods work properly for components which modify the target like the undersampler.
@dsherry had a spike which cleans up the component graph impl and would fix this issue (#2110), and another spike (#2210) which simplifies the sampler API, although that wouldn't have fixed the issue here directly.
Repro
The following runs fine until the call to `calculate_permutation_importance`.
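A sketch of the reproducer (the data, component names, and exact pipeline-construction API are assumptions; at the time the pipeline may have needed to be defined via subclassing instead):

```python
import numpy as np
import pandas as pd

from evalml.model_understanding import calculate_permutation_importance
from evalml.pipelines import BinaryClassificationPipeline

# Imbalanced toy data; the original repro's exact data is not shown here.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 3)), columns=["a", "b", "c"])
y = pd.Series([0] * 80 + [1] * 20)  # unnamed Series, as in the original repro

pipeline = BinaryClassificationPipeline(
    component_graph=["Imputer", "Undersampler", "Random Forest Classifier"]
)
pipeline.fit(X, y)   # fine
pipeline.predict(X)  # fine

# Raises: this internally calls PipelineBase.compute_estimator_features.
calculate_permutation_importance(pipeline, X, y, objective="Log Loss Binary")
```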
That call produces this stack trace. If you run the above but add a name to the pd Series for the target, you get a slightly different stack trace, because the component graph somehow ends up returning all-NaNs for the target passed from the undersampler to the estimator.