[WIP] Accept fold index for TargetEncoder #4453
sync with upstream
merge with upstream
sync with upstream
stop |
PR looks great, just a request for updated docstring
@@ -114,7 +118,7 @@ def __init__(self, n_folds=4, smooth=0, seed=42,
         self.train = None
         self.output_type = output_type

-    def fit(self, x, y):
+    def fit(self, x, y, fold_ids=None):
Can you add `fold_ids` to the docstring?
Yes, thank you for the comments.
         self.train_encode = res
         self.train = train
         self._fitted = True
         return self

-    def fit_transform(self, x, y):
+    def fit_transform(self, x, y, fold_ids=None):
Same comment as above
Codecov Report
@@ Coverage Diff @@
## branch-22.02 #4453 +/- ##
===============================================
Coverage ? 85.74%
===============================================
Files ? 236
Lines ? 19322
Branches ? 0
===============================================
Hits ? 16567
Misses ? 2755
Partials ? 0
Flags with carried forward coverage won't be shown.
@gpucibot merge
As requested in issue rapidsai#4441, this PR lets TargetEncoder accept a customized fold index array in `fit()`. For example, in the following code:

```
from cuml.preprocessing import TargetEncoder

X = [1, 2, 3, 1, 2]
y = [1, 0, 0, 0, 1]
fold_id = [0, 1, 0, 0, 1]
encoder = TargetEncoder(split_method='customize')
# The keyword is `fold_ids`, matching the signature fit(self, x, y, fold_ids=None)
encoder.fit(X, y, fold_ids=fold_id)
```

the target encoder fits on the subarray of `X` and `y` where `fold_id==0` to encode the subarray of `X` where `fold_id==1`, and vice versa.

Authors:
- Jiwei Liu (https://github.com/daxiongshu)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: rapidsai#4453
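To make the fold behavior described in this PR concrete, here is a minimal pure-Python sketch of out-of-fold mean target encoding with user-supplied fold ids. It is not the cuML implementation: the function name `out_of_fold_target_encode` is hypothetical, smoothing is omitted, and falling back to the global mean of `y` when a category has no rows outside the current fold is an assumption made for this illustration.

```python
from collections import defaultdict


def out_of_fold_target_encode(x, y, fold_ids):
    """Encode each x[i] with the mean of y over rows that share its
    category but belong to a different fold (hypothetical helper)."""
    # Assumed fallback: global mean when no out-of-fold rows exist.
    global_mean = sum(y) / len(y)

    # Sums and counts of y per category, overall and per (fold, category).
    total_sum, total_cnt = defaultdict(float), defaultdict(int)
    fold_sum, fold_cnt = defaultdict(float), defaultdict(int)
    for xi, yi, fi in zip(x, y, fold_ids):
        total_sum[xi] += yi
        total_cnt[xi] += 1
        fold_sum[(fi, xi)] += yi
        fold_cnt[(fi, xi)] += 1

    encoded = []
    for xi, fi in zip(x, fold_ids):
        # Subtract the row's own fold so only other folds contribute.
        s = total_sum[xi] - fold_sum[(fi, xi)]
        c = total_cnt[xi] - fold_cnt[(fi, xi)]
        encoded.append(s / c if c > 0 else global_mean)
    return encoded


# Each category appears in both folds, so every row is encoded
# from the matching row in the other fold.
print(out_of_fold_target_encode([1, 1, 2, 2], [1, 0, 1, 1], [0, 1, 0, 1]))
```

With only two folds, "every other fold" is simply the opposite fold, which matches the description above: rows with fold id 0 are encoded from rows with fold id 1, and vice versa. In the PR's own example each category appears in a single fold, so under this sketch's assumed fallback every row would receive the global mean.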