-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make EvalML compatible with the new Woodwork Boolean inference #3892
Conversation
Codecov Report
@@ Coverage Diff @@
## main #3892 +/- ##
=======================================
- Coverage 99.7% 99.7% -0.0%
=======================================
Files 346 346
Lines 36640 36665 +25
=======================================
+ Hits 36510 36532 +22
- Misses 130 133 +3
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks good! Have 1-2 q's but ready to approve once addressed.
I'm curious about is if we need to have a test case for the case you specified in the PR description? E.g. transforming a column of 0 and 1 into boolean values if explicitly set with df.ww.init(logical_types={"col": "Boolean"})
after_to_before_inference_mapping = { | ||
new: old for old, new in zip(original_vc.keys(), new_vc.keys()) | ||
} | ||
before_to_after_inference_mapping = { | ||
old: new for new, old in after_to_before_inference_mapping.items() | ||
} | ||
if str(y.ww.logical_type) not in ["Boolean", "BooleanNullable"]: | ||
after_to_before_inference_mapping = { | ||
old: old for old in after_to_before_inference_mapping.keys() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is a bit convoluted but I think I follow it. Can we move lines 159 - 162 up to before 153 and set it as an if/else for the after_to_before_inference_mapping
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, added some comments for clarity too
@christopherbunn Added a test to verify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
The Boolean inference has been revised in Woodwork and has been broken into two stages:
Boolean
using this new inference by specifying theBoolean
orBooleanNullable
logical types. What does this mean? A column of1s and 0s
will not be inferred asBoolean
throughdf.ww.init()
, however specifyingdf.ww.init(logical_types={"col": "Boolean"})
will transform that column intoTrue and False
. This PR is solely to get EvalML compatible with this stage inwoodwork==0.22.0
.