Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: from pandas.tests.extension.decimal.array import DecimalArray, make_data

In [3]: da1 = make_data()
   ...: da2 = make_data()
   ...:

In [4]: s1 = pd.Series(DecimalArray(da1))
   ...: s2 = pd.Series(DecimalArray(da2))
   ...:

In [5]: s1.head(), s2.head()
Out[5]:
(0    0.57581534881735985109685316274408251047134399...
 1    0.05647135567908745379384072293760254979133605...
 2    0.41049738961593973396446699553052894771099090...
 3    0.13724377491342376611527242857846431434154510...
 4    0.24154934068629707599740186196868307888507843...
 dtype: decimal,
 0    0.40855027024154888515283801098121330142021179...
 1    0.21243084028671055385473209753399714827537536...
 2    0.15218065149055393092680787958670407533645629...
 3    0.87747422249812989658579454044229350984096527...
 4    0.53991488184898328572813852588296867907047271...
 dtype: decimal)

In [6]: s1.combine(s2, lambda x1, x2: x1 if x1 < x2 else x2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-14abc20f0095> in <module>()
----> 1 s1.combine(s2, lambda x1, x2: x1 if x1 < x2 else x2)

C:\EclipseWorkspaces\LiClipseWorkspace\pandas-dev\pandas36\pandas\core\series.py in combine(self, other, func, fill_value)
   2220         new_index = self.index.union(other.index)
   2221         new_name = ops.get_op_result_name(self, other)
-> 2222         new_values = np.empty(len(new_index), dtype=self.dtype)
   2223         for i, idx in enumerate(new_index):
   2224             lv = self.get(idx, fill_value)

TypeError: data type not understood
Problem description

The Series.combine() method calls numpy.empty with the dtype of the ExtensionArray. numpy does not recognize that dtype, so the call raises "TypeError: data type not understood".
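The root cause can be reproduced without Series.combine() at all. The sketch below is only an illustration, not the code path from the traceback: it assumes pandas' built-in CategoricalDtype as a stand-in for an extension dtype (DecimalArray lives in the pandas test suite), and the names ext_dtype, numpy_accepted_dtype and buffer are made up for the example.

```python
import numpy as np
import pandas as pd

# Stand-in extension dtype (an assumption for illustration; the report uses DecimalArray).
ext_dtype = pd.CategoricalDtype(categories=["one", "two", "three"], ordered=True)

try:
    # numpy only knows its own dtypes, so it cannot allocate a buffer for this one.
    np.empty(3, dtype=ext_dtype)
    numpy_accepted_dtype = True
except TypeError as exc:
    numpy_accepted_dtype = False
    print("np.empty rejected the dtype:", exc)

# An object-dtype buffer, by contrast, can hold arbitrary Python values.
buffer = np.empty(3, dtype=object)
```

The exact wording of the TypeError message depends on the numpy version.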
Note: This also happens with Categorical in v0.22 and in master:
In [3]: cat1 = pd.Categorical(values=["one","two","three","three","two","one"],
   ...:                       categories=["one","two","three"], ordered=True)
   ...: cat2 = pd.Categorical(values=["three","two","one","one","two","three"],
   ...:                       categories=["one","two","three"], ordered=True)
   ...: s1 = pd.Series(cat1)
   ...: s2 = pd.Series(cat2)
   ...: s1, s2
   ...:
Out[3]:
(0      one
 1      two
 2    three
 3    three
 4      two
 5      one
 dtype: category
 Categories (3, object): [one < two < three],
 0    three
 1      two
 2      one
 3      one
 4      two
 5    three
 dtype: category
 Categories (3, object): [one < two < three])

In [4]: s1.combine(s2, lambda x1, x2: x1 <= x2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-b597231c2d3c> in <module>()
----> 1 s1.combine(s2, lambda x1, x2: x1 <= x2)

C:\Anaconda3\lib\site-packages\pandas\core\series.py in combine(self, other, func, fill_value)
   1768         new_index = self.index.union(other.index)
   1769         new_name = _maybe_match_name(self, other)
-> 1770         new_values = np.empty(len(new_index), dtype=self.dtype)
   1771         for i, idx in enumerate(new_index):
   1772             lv = self.get(idx, fill_value)

TypeError: data type not understood
Note: I will look into fixing this as part of my attempt to get ops() working for ExtensionArray.
Expected Output

A Series of True and False values.
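For illustration, the expected result for the Categorical example can be produced by collecting the per-element results in a plain list and letting pandas infer the result dtype, instead of preallocating with np.empty. This is a minimal sketch under that assumption, not the fix that went into pandas; combine_via_list is a hypothetical helper that mirrors the loop visible in the traceback.

```python
import pandas as pd


def combine_via_list(left, right, func, fill_value=None):
    """Hypothetical helper: element-wise combine that avoids np.empty(dtype=...)."""
    new_index = left.index.union(right.index)
    results = []
    for idx in new_index:
        # Series.get returns fill_value when the label is missing on one side.
        lv = left.get(idx, fill_value)
        rv = right.get(idx, fill_value)
        results.append(func(lv, rv))
    # pandas infers the dtype from the collected values (bool in this case).
    return pd.Series(results, index=new_index)


cat_type = pd.CategoricalDtype(categories=["one", "two", "three"], ordered=True)
s1 = pd.Series(["one", "two", "three", "three", "two", "one"], dtype=cat_type)
s2 = pd.Series(["three", "two", "one", "one", "two", "three"], dtype=cat_type)

result = combine_via_list(s1, s2, lambda x1, x2: x1 <= x2)
print(result)
```

Note that the scalars handed to the lambda are plain Python strings, so x1 <= x2 is an ordinary string comparison here rather than a comparison in category order.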
Output of pd.show_versions()

INSTALLED VERSIONS
commit: 60fe82c
python: 3.6.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.0.dev0+799.g60fe82c8a
pytest: 3.4.0
pip: 9.0.1
setuptools: 38.5.1
Cython: 0.25.1
numpy: 1.14.1
scipy: 1.0.0
pyarrow: 0.8.0
xarray: None
IPython: 6.2.1
sphinx: 1.7.1
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.3
blosc: 1.5.1
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.2.0
openpyxl: 2.5.0
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.5
pymysql: 0.8.0
psycopg2: None
jinja2: 2.10
s3fs: 0.1.3
fastparquet: None
pandas_gbq: None
pandas_datareader: None