Fix a bug in OrdinalEncoder #71

Merged
merged 13 commits into from
Nov 27, 2024
6 changes: 6 additions & 0 deletions README.md
@@ -68,6 +68,12 @@ pip install tabensemb[torch]

Please use `pip install tabensemb` instead if you already have `torch>=1.12.0` installed. Use `pip install tabensemb[test]` if you want to run unit tests.

To install from source,

```shell
pip install -e .[torch]
```

2. (Optional) Run unit tests after installed `tabensemb[test]`:

```shell
```
1 change: 0 additions & 1 deletion docs/source/api/utility.rst
@@ -11,5 +11,4 @@ tabensemb.utils
:toctree: generated/

utils
collate.fix_collate_fn
ranking
47 changes: 26 additions & 21 deletions requirements.txt
@@ -1,24 +1,29 @@
-autogluon.tabular[all]>=0.7.0,<1.0.0
-autogluon.common>=0.7.0,<1.0.0
-autogluon.core>=0.7.0,<1.0.0
-autogluon.features>=0.7.0,<1.0.0
+autogluon.tabular[all]>=0.8.2,<1.0.0
+autogluon.common>=0.8.2,<1.0.0
+autogluon.core>=0.8.2,<1.0.0
+autogluon.features>=0.8.2,<1.0.0
 captum>=0.6.0
-matplotlib>=3.3.4
-numpy>=1.22.1
-openpyxl>=3.0.10
-pandas>=1.4.2
-scikit-learn>=1.0.2,<1.4
+gensim>=4.3.0,<4.3.3
+matplotlib>=3.8.2
+numba>=0.58.0,<0.58.2
+numpy>=1.26.2
+opencv-contrib-python>=4.8.0,<4.8.2
+openpyxl>=3.1.2
+pandas>=1.5.3
+scikit-learn>=1.2.2,<1.4
 scikit-optimize>=0.9.0
-scipy>=1.7.3
-seaborn>=0.11.2
-tqdm>=4.64.1
-torchmetrics>=0.11.0
-pytorch-widedeep>=1.2.1
-pytorch_tabnet>=4.0
-pytorch_tabular>=1.0.2
-miceforest>=5.6.3
-shap>=0.41.0
-einops>=0.6.0
-pytorch_lightning>=1.8.6
-traitlets<=5.9.0
+scipy>=1.11.4
+seaborn>=0.13.0
+statsmodels>=0.14.0,<0.14.1
+tqdm>=4.66.1
+torchmetrics>=0.11.4
+pytorch-widedeep>=1.3.2,<1.6.2
+pytorch-tabnet>=4.0
+pytorch-tabular>=1.0.2
+ray>=2.3.1
+miceforest>=5.7.0
+shap>=0.43.0
+einops>=0.6.1
+pytorch-lightning>=1.9.5
+traitlets>=5.9.0

8 changes: 6 additions & 2 deletions tabensemb/data/utils.py
@@ -318,7 +318,9 @@ def _inverse_transform(self, df: pd.DataFrame):
 f"inverse-transformed."
 )

-transformed_values = np.ones_like(values).astype(self.dtypes[feature])
+transformed_values = np.ones_like(values).astype(
+self.dtypes[feature] if self.dtypes[feature] != str else "U256"
+)
 for i in range(len(transformed_values)):
 transformed_values[i] = unknown_val
 for val in unique_values:
@@ -327,5 +329,7 @@ def _inverse_transform(self, df: pd.DataFrame):
 if val in unknown_values
 else self.mapping[feature][int(val)]
 )
-df[feature] = transformed_values.astype(self.dtypes[feature])
+df[feature] = transformed_values.astype(
+self.dtypes[feature] if self.dtypes[feature] != str else "U256"
+)
 return df
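
A plausible reading of the `"U256"` change, sketched under assumptions rather than taken from the PR discussion: when a numeric NumPy array is cast with `astype(str)`, NumPy picks a fixed-width unicode dtype just wide enough for the numbers, and longer strings assigned afterwards are silently truncated. The array `codes` and the label `long_label` below are hypothetical stand-ins for the encoded column and an original category name; they are not part of `tabensemb`.

```python
import numpy as np

# Hypothetical ordinal codes, standing in for the encoded feature column.
codes = np.array([0, 1, 2])
# Hypothetical category label that is longer than a stringified integer.
long_label = "a_category_name_longer_than_twenty_one_chars"

# astype(str) on a numeric array gives a fixed-width unicode dtype sized
# for the numbers, so a longer string is cut off on assignment.
narrow = np.ones_like(codes).astype(str)
narrow[0] = long_label

# An explicit "U256" dtype reserves room for up to 256 characters.
wide = np.ones_like(codes).astype("U256")
wide[0] = long_label

print(narrow.dtype, repr(narrow[0]))
print(wide.dtype, repr(wide[0]))
```

With the explicit width, labels of up to 256 characters survive the round trip; anything longer would still be truncated, so an `object` dtype would be the alternative if category names are unbounded.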