BUG: Unable to create a MultiIndex with `nan` values in nullable `Float` dtypes
#39984
Closed
2 of 3 tasks
Labels
good first issue
MultiIndex
NA - MaskedArrays
Related to pd.NA and nullable extension arrays
Needs Tests
Unit test(s) needed to prevent regressions
Comments
galipremsagar added the Bug and Needs Triage (Issue that has not been reviewed by a pandas team member) labels on Feb 23, 2021
19 tasks
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue on Feb 26, 2021
Fixes: #7367, #7446

This PR upgrades pandas to `1.2.2` in `cudf`. Changes include:

- [x] Bumping up `pandas` version.
- [x] Fixing `isin` behavior, which now takes types into account: pandas-dev/pandas#38781
- [x] `CategoricalColumn.__setitem__` will now not allow setting of values that are not in existing categories.
- [x] Introduced `cudf.core._compat.PANDAS_GE_120` variable to provide backward compatibility.
- [x] Updated usages of `pd.core.tools.datetimes._guess_datetime_format` to `pd.core.tools.datetimes.guess_datetime_format`
- [x] Introduced `std` & `median` in `DateTimeColumn`.
- [x] Fixed incorrect handling of passing `StringMethods` as an input to methods in string APIs.
- [x] Fixed a typo in calling `is_valid` of `Scalar`.
- [x] Removed unnecessary special handling in `TimeDeltaColumn.sum` logic for empty inputs.
- [x] Introduced passing `dtype='float64'` wherever an empty series is created, since pandas will soon default to `object` dtype if no type is passed and we don't have an `object` dtype that perfectly resembles that of pandas.
- [x] Fixed deprecation warnings of `Index.__or__` and `Index.__xor__` by replacing them with the `union` & `symmetric_difference` APIs.
- [x] Introduced mapping of our `float32` & `float64` dtypes to pandas nullable dtypes `Float32Dtype` & `Float64Dtype` when `nullable=True` in `to_pandas`.
- [x] With the introduction of nullable float dtypes, there is an issue in creating a `MultiIndex` from a dataframe: pandas-dev/pandas#39984, so introduced a workaround in our `MultiIndex.__repr__` code.
- [x] Removed usages of `check_less_precise` in our code base, as this is deprecated and replaced with `rtol` & `atol`. Retained its usages in our testing APIs for backward compatibility.
- [x] Removed a good number of `xfail` cases which are actually passing right now because of resolved issues in both `pandas` & `cudf`.
- [x] Did some miscellaneous code cleanup in pytests.
- [x] Fixed pytests that will fail when run in parallel due to shared pytest params being manipulated in place.
- [x] Follow a standard import pattern across pytest files: some files do `from pandas import Series` and some do `from cudf.core import Series`, so removed both patterns and now use only plain `import cudf` & `import pandas as pd` to avoid confusion while debugging test failures across multiple files. (Made this change in all pytest files which I had to touch as part of the pandas upgrade; we can make similar changes in future for the files which we touch.)
- [x] Fix issue with assigning `np.nan` values to a `CategoricalColumn` and fix related `__repr__` code: #7446

Authors:
- GALI PREM SAGAR (@galipremsagar)

Approvers:
- Keith Kraus (@kkraus14)
- AJ Schmidt (@ajschmidt8)

URL: #7375
mzeitlin11 added the MultiIndex, NA - MaskedArrays, and Needs Discussion (Requires discussion from core team before further action) labels and removed the Needs Triage label on Jul 1, 2021
works on master, needs test
jbrockmendel added the Needs Tests label and removed the Needs Discussion label on Jan 9, 2022
I'll be working on the test for this issue.
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.
Code Sample, a copy-pastable example
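A minimal sketch of the kind of reproduction this report describes, assuming a DataFrame with a nullable `Float64` column built from data containing `np.nan`. The column names, values, and the helper `try_build_multiindex` are illustrative assumptions, not the reporter's original sample, and the outcome depends on the installed pandas version, so no output is shown.

```python
import numpy as np
import pandas as pd


def try_build_multiindex():
    """Attempt MultiIndex.from_frame on a frame with a nullable Float64 column.

    Returns (True, MultiIndex) on success or (False, exception) on failure.
    The data below is hypothetical and only illustrates the reported scenario.
    """
    df = pd.DataFrame(
        {
            # Nullable floating array created from data that contains np.nan
            "a": pd.array([1.0, np.nan, 3.0], dtype="Float64"),
            "b": ["x", "y", "z"],
        }
    )
    try:
        return True, pd.MultiIndex.from_frame(df)
    except Exception as exc:  # the exact exception type is not assumed here
        return False, exc


ok, result = try_build_multiindex()
print(pd.__version__, "succeeded" if ok else f"failed: {result!r}")
```

A later comment in this thread reports that the case works on master, so newer pandas releases may take the success branch.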
Problem description
Should `nan` be allowed in the first place while creating a nullable floating array? If yes, then we have an issue while creating a multi-index from the dataframe.
Output of `pd.show_versions()`
INSTALLED VERSIONS
commit : 7d32926
python : 3.7.3.final.0
python-bits : 64
OS : Darwin
OS-release : 20.3.0
Version : Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : en_US.UTF-8
pandas : 1.2.2
numpy : 1.19.2
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.3
setuptools : 50.3.0
Cython : None
pytest : None
hypothesis : 5.29.0
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 1.0.1
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None