[BUG] Unable to assign null/nan in categorical column #7446
Comments
I think we should only handle |
We may want to consider inheriting |
Ideally |
We technically have those, they're just typed. They basically represent libcudf scalars of a certain type with the validity set to be false. |
Fixes: #7367, #7446

This PR upgrades pandas to `1.2.2` in `cudf`. Changes include:

- [x] Bumped the `pandas` version.
- [x] Fixed `isin` behavior, which now takes types into account: pandas-dev/pandas#38781
- [x] `CategoricalColumn.__setitem__` no longer allows setting values that are not in the existing categories.
- [x] Introduced the `cudf.core._compat.PANDAS_GE_120` variable for backward compatibility.
- [x] Updated usages of `pd.core.tools.datetimes._guess_datetime_format` to `pd.core.tools.datetimes.guess_datetime_format`.
- [x] Introduced `std` & `median` in `DateTimeColumn`.
- [x] Fixed incorrect handling of `StringMethods` passed as an input to methods in string APIs.
- [x] Fixed a typo in calling `is_valid` of `Scalar`.
- [x] Removed unnecessary special handling in the `TimeDeltaColumn.sum` logic for empty inputs.
- [x] Now passing `dtype='float64'` wherever an empty series is created, since pandas will soon default to `object` dtype when no type is passed and we don't have a dtype that exactly matches pandas' `object` dtype.
- [x] Fixed deprecation warnings for `Index.__or__` and `Index.__xor__` by replacing them with the `union` & `symmetric_difference` APIs.
- [x] Introduced mapping of our `float32` & `float64` dtypes to the pandas nullable dtypes `Float32Dtype` & `Float64Dtype` when `nullable=True` in `to_pandas`.
- [x] With the introduction of nullable float dtypes, there is an issue in creating a `MultiIndex` from a dataframe: pandas-dev/pandas#39984, so introduced a workaround in our `MultiIndex.__repr__` code.
- [x] Removed usages of `check_less_precise` in our code base, as it is deprecated and replaced by `rtol` & `atol`. Retained its usages in our testing APIs for backward compatibility.
- [x] Removed a good number of `xfail` cases that now pass because of resolved issues in both `pandas` & `cudf`.
- [x] Did some miscellaneous code cleanup in pytests.
- [x] Fixed pytests that fail when run in parallel because shared pytest params were being manipulated in place.
- [x] Followed a standard import pattern across pytest files. Some files did `from pandas import Series` and some did `from cudf.core import Series`, so both patterns were removed in favor of plain `import cudf` & `import pandas as pd`, to avoid confusion while debugging test failures across multiple files. (This change was made in all pytest files touched as part of the pandas upgrade; similar changes can be made in the future for other files as they are touched.)
- [x] Fixed the issue with assigning `np.nan` values to a `CategoricalColumn` and fixed the related `__repr__` code: #7446

Authors:
- GALI PREM SAGAR (@galipremsagar)

Approvers:
- Keith Kraus (@kkraus14)
- AJ Schmidt (@ajschmidt8)

URL: #7375
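As a rough illustration of the categorical semantics mentioned in the change list above, here is a minimal sketch using plain pandas as a stand-in. It assumes cudf is meant to mirror pandas on this point; it is not taken from the PR and does not exercise cudf itself. The variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# pandas stand-in (assumption: cudf aims to match this behaviour).
s = pd.Series(["a", "b", "c"], dtype="category")

# Assigning a missing value is allowed and does not add a new category.
s[1] = np.nan
null_count = int(s.isna().sum())
categories_after_nan = list(s.cat.categories)

# Assigning a value that is not an existing category is rejected.
# The exception type differs between pandas versions, so catch both.
try:
    s[0] = "z"
    rejected = False
except (TypeError, ValueError):
    rejected = True

print(null_count, categories_after_nan, rejected)
```

In pandas the categories are left untouched by the null assignment; dropping categories that no longer occur is a separate, explicit step (`Series.cat.remove_unused_categories`).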
Describe the bug
We can have `null` values in a categorical column, and a user can at any time set the value at an index to `<NA>` or even `np.nan`. We are currently not able to do so.
Steps/Code to reproduce bug
Expected behavior
We should be able to set values to `<NA>`/`np.nan` (this won't add new categories, but could potentially remove some).
Environment overview (please complete the following information)
Environment details
Please run and paste the output of the `cudf/print_env.sh` script here, to gather any other relevant environment details.
Additional context
Add any other context about the problem here.