Oversampler: Add nullable type handling for nullable y #3974

tamargrey · 2023-02-02T19:20:26Z

The following block of code raises ValueError: Unknown label type: 'unknown'.

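    import woodwork as ww
    from evalml.pipelines.components import Oversampler

    # X_y_binary is binary classification test data (the pytest fixture of that name)
    X, y = X_y_binary
    y = ww.init_series(y, logical_type="BooleanNullable")
    sn = Oversampler()
    _ = sn.fit_transform(X, y)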

This is not currently seen in AutoML search because of the ReplaceNullableTypes component, but we should consider adding nullable handling to the component class itself so that it can support nullable types independently.

Note that this is likely related to #3923, #3922, and #3910, which all stem from the inability of sklearn's type_of_target to assign a proper type to nullable data.
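As a rough illustration of what component-level handling could look like, the sketch below downcasts a nullable target to its non-nullable equivalent when no values are missing, using plain pandas only. The helper name downcast_nullable_y and the dtype mapping are hypothetical and are not part of evalml or woodwork.

```python
import pandas as pd

# Hypothetical mapping from pandas nullable dtypes to non-nullable equivalents.
_NON_NULLABLE = {"boolean": "bool", "Int64": "int64"}


def downcast_nullable_y(y: pd.Series) -> pd.Series:
    """Return y with a non-nullable dtype when the conversion is lossless."""
    target = _NON_NULLABLE.get(str(y.dtype))
    if target is None:
        # Not one of the nullable dtypes handled here; leave y untouched.
        return y
    if y.isna().any():
        # Missing values cannot be represented in bool or int64.
        raise ValueError("y contains missing values; cannot downcast")
    return y.astype(target)


y = pd.Series([True, False, True], dtype="boolean")
y_plain = downcast_nullable_y(y)
```

A target that actually contains missing values is rejected rather than silently coerced, since a sampler could not use it anyway.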

tamargrey · 2023-02-07T22:05:41Z

Not fixed by updating to sklearn 1.2.1

tamargrey · 2023-02-15T17:37:32Z

As part of implementing component-specific handling for the Oversampler, we need to remove the nullable type logic in the BaseSampler's _prepare_data.

Also worth noting: that logic did not even maintain woodwork types, which caused us to rerun type inference. That is unnecessary computation and could cause a bug if we lost column types that were influencing the type of sampler we chose.

tamargrey · 2023-03-09T15:40:40Z

The Oversampler's nullable type incompatibility is fixed by upgrading to sklearn 1.2.2, but we should still remove the nullable type logic in _prepare_data, which is now doubly unnecessary.

tamargrey changed the title ~~Oversampler can raise ValueError: Unknown label type: when nullable y passed in~~ Add nullable type handling for nullable y to Oversampler Feb 17, 2023

tamargrey mentioned this issue Feb 17, 2023

Oversampler: Remove nullable type handling when imblearn adds nullable type support #4013

Closed

tamargrey changed the title ~~Add nullable type handling for nullable y to Oversampler~~ Oversampler: Add nullable type handling for nullable y Feb 17, 2023

tamargrey mentioned this issue Mar 2, 2023

Use nullable type handling in components' fit, transform, and predict methods #4046

Merged

exalate-issue-sync bot assigned tamargrey Mar 7, 2023

gsheni linked a pull request Mar 8, 2023 that will close this issue

Use nullable type handling in components' fit, transform, and predict methods #4046

Merged

This was referenced Mar 10, 2023

Oversampler: Remove nullable type handling when support is added in sklearn #4067

Open

Handle new oversampler nullable type incompatibility in X #4068

Merged

tamargrey closed this as completed in #4068 Mar 16, 2023
