TimeSeriesImputer: Add nullable type handling for X and y when interpolate is impute method #4001

Closed
tamargrey opened this issue Feb 15, 2023 · 2 comments · Fixed by #4046

tamargrey commented Feb 15, 2023

When interpolate is the impute method, we use pandas' Series.interpolate method, which cannot handle nullable types: pandas-dev/pandas#40252

As part of component-specific nullable type handling, we should remove the calls to astype(float) we currently do:

This will be replaced with separate calls to _handle_nullable_types for X and y. Since we only need to transform when self._impute_target == "interpolate", we can consider implementing a special _handle_nullable_types that takes self._impute_target into account.
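A minimal sketch of the strategy-aware idea in plain pandas, not evalml's actual `_handle_nullable_types`. The helper name `downcast_nullable_for_interpolate`, its `strategy` argument, and the choice of `Int64` and `boolean` as the dtypes to convert are all assumptions made for illustration:

```python
import pandas as pd

# Assumption: these pandas dtypes are the ones backing the nullable logical types.
NULLABLE_DTYPES = (pd.Int64Dtype, pd.BooleanDtype)


def downcast_nullable_for_interpolate(X: pd.DataFrame, strategy: str) -> pd.DataFrame:
    """Hypothetical helper: cast nullable columns to float64 only when interpolating."""
    if strategy != "interpolate":
        # Other strategies are left untouched in this sketch.
        return X
    X = X.copy()
    for col in X.columns:
        if isinstance(X[col].dtype, NULLABLE_DTYPES):
            # float64 represents missing values as NaN, which interpolate can fill.
            X[col] = X[col].astype("float64")
    return X


X = pd.DataFrame({"age": pd.array([20, None, 40], dtype="Int64")})
interpolated = downcast_nullable_for_interpolate(X, "interpolate").interpolate()
print(interpolated["age"].tolist())
```

Keeping the check on the strategy inside the helper means callers for X and y can share it and only pay for the conversion when interpolation is actually requested.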

tamargrey changed the title from "Add nullable type handling to TimeSeriesImputer" to "TimeSeriesImputer: Add nullable type handling for X and y when interpolate is impute method" on Feb 17, 2023
@tamargrey

Currently, passing AgeNullable columns that contain NaNs raises a TypeConversionError when we try to convert from float64 back to Int64, because AgeNullable is not excluded at https://github.com/alteryx/evalml/blob/main/evalml/pipelines/components/transformers/imputers/time_series_imputer.py#L155-L158. That logic also needs to be removed once we can call _handle_nullable_types.

tamargrey commented Feb 24, 2023

Since NaNs will be present at this point, we may need some logic to maintain the original logical types, which the other components will not need. We should be able to just initialize the interpolated data with the original schema (minus any dropped fully null columns and any columns that can't do type conversions like float64 -> Int64) after interpolation occurs.
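A rough sketch of that restore step, using plain pandas dtypes as a stand-in for the Woodwork schema. The helper name `restore_dtypes` and the whole-number check are assumptions for illustration, not the logic in evalml:

```python
import pandas as pd


def restore_dtypes(interpolated: pd.DataFrame, original_dtypes: dict) -> pd.DataFrame:
    """Hypothetical helper: reapply original dtypes where the conversion is safe."""
    restored = interpolated.copy()
    for col, dtype in original_dtypes.items():
        if col not in restored.columns:
            # Column was dropped (for example, it was fully null); nothing to restore.
            continue
        if isinstance(dtype, pd.Int64Dtype):
            values = restored[col].dropna()
            # float64 -> Int64 only works when every value is a whole number.
            if not (values % 1 == 0).all():
                continue
        restored[col] = restored[col].astype(dtype)
    return restored


original = pd.DataFrame(
    {
        "age": pd.array([1, None, 3], dtype="Int64"),
        "count": pd.array([1, None, 4], dtype="Int64"),
        "empty": pd.array([None, None, None], dtype="Int64"),
    }
)
original_dtypes = original.dtypes.to_dict()

# Simulate the imputer: drop the fully null column, then interpolate as float64.
interpolated = original.drop(columns=["empty"]).astype("float64").interpolate()
restored = restore_dtypes(interpolated, original_dtypes)
print(restored.dtypes)
```

Here `age` interpolates to whole numbers and returns to `Int64`, while `count` picks up a fractional value and stays `float64`, which mirrors the "minus some cols that can't do type conversions" caveat above.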
