Implement our own partial dependence method #2502

bchen1116 · 2021-07-13T19:37:59Z

With this PR, we add a section of code to handle datetime support for partial dependence. @chukarsten brings up a good point about the risks of relying on sklearn's private partial dependence functions, _grid_from_X and _partial_dependence_brute.

In my look into the issue, I found the following:

sklearn's partial_dependence makes calls to _grid_from_X and _partial_dependence_brute
_grid_from_X doesn't support datetime features. If a datetime column is passed in, the original datetimes are returned unchanged and the grid_resolution parameter is ignored. We resolve this by converting the datetimes to seconds (an int value) and passing those forward.
_partial_dependence_brute calls the predict/predict_proba methods of the trained pipeline. This means we can't alter the dataset only for partial dependence unless it was altered the same way for training (i.e., we cannot introduce a new column or otherwise change the X data).
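
To illustrate the seconds conversion described above, here is a minimal sketch using only NumPy. The helper name datetime_grid and its default percentiles are assumptions made for illustration; this is not the code from the PR and it does not call sklearn's private _grid_from_X.

```python
import numpy as np


def datetime_grid(values, grid_resolution=5, percentiles=(0.05, 0.95)):
    """Hypothetical helper: evenly spaced datetime grid between two percentiles."""
    # Represent each datetime as integer seconds since the Unix epoch so the
    # grid can be computed numerically.
    seconds = np.asarray(values, dtype="datetime64[s]").astype("int64")
    low, high = np.quantile(seconds, percentiles)
    grid_seconds = np.linspace(low, high, num=grid_resolution)
    # Convert the numeric grid back to datetimes, since the trained pipeline
    # expects datetime values at predict time.
    return np.round(grid_seconds).astype("int64").astype("datetime64[s]")
```

With a numeric representation, grid_resolution controls the number of grid points as intended, and the result can be handed back to the pipeline as datetimes.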

If we wanted to avoid the private functions, we would need to find a way to properly pass the data to sklearn's partial_dependence. The issue here is that we need to convert the datetimes to seconds to obtain the grid, but we still need the original datetimes for fitting, predicting, and computing the partial dependence.

It might be beneficial to build our own partial dependence feature, or to find another way to pass this through without relying on private methods. This issue tracks finding the best next steps to clean up our current partial dependence implementation.
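
For reference, a brute-force partial dependence computation that uses no sklearn internals could look roughly like the sketch below. It assumes a fitted model exposing predict on a pandas DataFrame; the function name and the toy DayOfYearModel are hypothetical, and this is not necessarily how the eventual implementation works.

```python
import numpy as np
import pandas as pd


def partial_dependence_brute(model, X, feature, grid):
    """Average prediction with `feature` fixed to each grid value in turn."""
    averaged = []
    for value in grid:
        # Work on a copy so the caller's data is untouched; the columns stay
        # the same as at training time, only the values of `feature` change.
        X_eval = X.copy()
        X_eval[feature] = value
        averaged.append(float(np.mean(model.predict(X_eval))))
    return np.array(averaged)


class DayOfYearModel:
    """Toy stand-in for a trained pipeline (hypothetical, for illustration)."""

    def predict(self, X):
        return X["date"].dt.dayofyear.to_numpy() + X["x"].to_numpy()
```

Because the grid values are written back into the feature column as datetimes, the model sees the same column types it was trained on, while the grid itself can be computed separately in seconds.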

fyi @chukarsten @freddyaboulton

dsherry · 2021-07-14T19:25:03Z

Agreed. Goal is to avoid more issues like #2475 and make our code easier to read and maintain.

Seems like we have two options. 1) clean up our current impl 2) don't use sklearn impl any more, write our own.

bchen1116 added the enhancement label Jul 13, 2021

bchen1116 mentioned this issue Jul 13, 2021

Datetime support for 1-way Partial Dependence #2454

Merged

freddyaboulton mentioned this issue Aug 9, 2021

Raise a custom exception for partial dependence errors #2604

Merged

freddyaboulton mentioned this issue Aug 30, 2021

Partial dependence errors with column with string and NaN values #2475

Closed

freddyaboulton changed the title ~~Clean up Partial Dependence~~ Implement our own partial dependence method Sep 2, 2021

freddyaboulton self-assigned this Sep 13, 2021

freddyaboulton mentioned this issue Sep 23, 2021

Our own Partial Dependence Implementation #2834

Merged

freddyaboulton closed this as completed in #2834 Sep 29, 2021
