Fit fails when input data has categorical columns #970
Comments
When encoding a pandas array in autosklearn.data.validator, the columns are re-ordered by the ColumnTransformer. This PR re-orders the feature types so that when passing the data to the actual ML pipeline, columns and feature types are sorted the same way.
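The reordering described above can be reproduced with scikit-learn alone. The following is a minimal sketch, not the code from the PR: the toy frame, the `feat_types` list, and the variable names are hypothetical. It assumes the categorical columns are routed through an encoder and the rest are passed through, in which case `ColumnTransformer` emits the encoded columns first and the remainder afterwards.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: numerical and categorical columns interleaved.
df = pd.DataFrame({
    "age": [25, 40, 31],
    "job": ["admin", "tech", "admin"],
    "balance": [100.0, -20.5, 7.0],
    "marital": ["single", "married", "single"],
})
# Feature types in the ORIGINAL column order (illustrative labels).
feat_types = ["numerical", "categorical", "numerical", "categorical"]

cat_cols = [i for i, t in enumerate(feat_types) if t == "categorical"]
num_cols = [i for i, t in enumerate(feat_types) if t != "categorical"]

# Encoded columns come first in the output; passthrough columns follow.
ct = ColumnTransformer(
    [("cat", OrdinalEncoder(), cat_cols)],
    remainder="passthrough",
)
X = ct.fit_transform(df)

# Apply the same permutation to the feature types so they match X.
reordered_types = [feat_types[i] for i in cat_cols + num_cols]
print(reordered_types)
```

Without the last step, `feat_types[0]` would still say `"numerical"` while column 0 of `X` now holds the encoded `job` values, which is the kind of mismatch the PR description refers to.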
Thanks a lot for the bug report. I can reproduce it and suggest a fix in #975.
This was fixed by #975.
I'm still running into this problem, but I can't provide a small working example. I have a dataset of around 200,000 instances, which I split into train and test sets. I managed to train a model on the train set, but later, after loading the joblib model artifact, I got this error.
I'm trying to understand whether something is wrong with my test set. I did not find negative values in the columns that I set as categorical. I do have negative values in numeric columns, but I don't think that should be a problem. Any hints or tips on what I should look for? Thanks in advance.
Could you please
? This will allow us to debug the problem you're facing.
Describe the bug
auto-sklearn fails when the input data has categorical columns. I changed example_pandas_train_test.py to use the OpenML dataset with data_id 1558 and updated the lists of categorical and numerical columns accordingly.
Changes done:
As far as I understand, the categorical columns are not encoded, which is why it fails.
Actual behavior, stacktrace or logfile
Environment and installation:
Please give details about your installation: