KeyError: nan - CategoricalTransformer fails on numerical + nan data only #142

Closed
csala opened this issue Nov 26, 2020 · 0 comments · Fixed by #143

csala commented Nov 26, 2020

  • Reversible Data Transforms version: 0.2.8

Description

When the CategoricalTransformer is used with data that contains only numerical values and NaNs, it crashes with a KeyError when trying to transform the NaNs.

What I Did

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: data = pd.Series([1, 2, np.nan])

In [4]: from rdt.transformers import CategoricalTransformer

In [5]: ct = CategoricalTransformer()

In [6]: ct.fit_transform(data)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-6-9f958bc8253f> in <module>
----> 1 ct.fit_transform(data)

~/Projects/MIT/RDT/rdt/transformers/base.py in fit_transform(self, data)
     44         """
     45         self.fit(data)
---> 46         return self.transform(data)
     47 
     48     def reverse_transform(self, data):

~/Projects/MIT/RDT/rdt/transformers/categorical.py in transform(self, data)
    172             data = data.map(MAPS[id(self)])
    173 
--> 174         return data.fillna(np.nan).apply(self._get_value).to_numpy()
    175 
    176     def _normalize(self, data):

~/.virtualenvs/RDT/lib/python3.8/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   4210             else:
   4211                 values = self.astype(object)._values
-> 4212                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   4213 
   4214         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

~/Projects/MIT/RDT/rdt/transformers/categorical.py in _get_value(self, category)
    145     def _get_value(self, category):
    146         """Get the value that represents this category."""
--> 147         mean, std = self.intervals[category][2:]
    148         if self.fuzzy:
    149             return norm.rvs(mean, std)

KeyError: nan
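A possible explanation for the `KeyError: nan` above, offered as a sketch rather than a confirmed diagnosis: NaN never compares equal to itself, so a dict lookup with a NaN key only succeeds when the exact same float object is used. If `Series.apply` hands `_get_value` a freshly boxed NaN, `self.intervals[category]` cannot match it even if a NaN key was stored during `fit`. The snippet below reproduces that behaviour with plain Python; `intervals` here is a stand-in dict and `find_key` is a hypothetical helper, neither is RDT code nor the fix merged in #143.

```python
import math

# Stand-in for a mapping keyed by category; one key is a NaN object.
nan_key = float("nan")
intervals = {1.0: "a", 2.0: "b", nan_key: "c"}

# Looking up the very same NaN object works, because dict lookup
# checks object identity before falling back to equality.
same_object_found = nan_key in intervals

# A different NaN object is not identical and NaN != NaN, so the lookup misses.
other_nan = float("nan")
different_object_found = other_nan in intervals


def find_key(mapping, category):
    """Return the key of `mapping` that matches `category`, treating every NaN as the same key.

    Hypothetical helper for illustration only.
    """
    if isinstance(category, float) and math.isnan(category):
        for key in mapping:
            if isinstance(key, float) and math.isnan(key):
                return key
        raise KeyError(category)
    return category


print(same_object_found, different_object_found)
print(intervals[find_key(intervals, other_nan)])
```

Normalizing NaN to a single sentinel key before the lookup, as `find_key` does, is one way to sidestep the identity problem; whether RDT takes that route is not stated in this issue.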
csala self-assigned this Nov 26, 2020
csala added the bug (Something isn't working) label Nov 26, 2020
csala added this to the 0.2.9 milestone Nov 27, 2020