[FEA] Add support for pandas 1.2 #7367

Closed
galipremsagar opened this issue Feb 10, 2021 · 0 comments · Fixed by #7375
Labels
feature request New feature or request Python Affects Python cuDF API.

Comments

@galipremsagar
Contributor

Is your feature request related to a problem? Please describe.
cudf is currently pinned to pandas<1.2.0a0, but we would like to use some of the new features introduced in 1.2, for example the use_nullable_dtypes param in read_parquet: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_parquet.html

Describe the solution you'd like
Test the current cudf code-base against pandas 1.2 and make versioned fixes for any breakages.

@galipremsagar galipremsagar added feature request New feature or request Python Affects Python cuDF API. labels Feb 10, 2021
@galipremsagar galipremsagar self-assigned this Feb 10, 2021
rapids-bot bot pushed a commit that referenced this issue Feb 26, 2021
Fixes: #7367, #7446

This PR upgrades pandas to `1.2.2` in `cudf`. Changes include:

- [x] Bumping up `pandas` version.
- [x] Fixing `isin` behavior, which now takes types into account: pandas-dev/pandas#38781
- [x] `CategoricalColumn.__setitem__` no longer allows setting values that are not in the existing categories.
- [x] Introduced the `cudf.core._compat.PANDAS_GE_120` variable to maintain backward compatibility.
- [x] Updated usages of `pd.core.tools.datetimes._guess_datetime_format` to `pd.core.tools.datetimes.guess_datetime_format`
- [x] Introduced `std` & `median` in `DateTimeColumn`.
- [x] Fixed incorrect handling of passing `StringMethods` as an input to methods in string APIs.
- [x] Fixed a typo in calling `is_valid` of `Scalar`.
- [x] Removed unnecessary special handling in `TimeDeltaColumn.sum` logic for empty inputs.
- [x] Passed `dtype='float64'` explicitly wherever an empty series is created, since pandas will soon default to `object` dtype when no dtype is passed, and cudf has no `object` dtype that exactly matches the one in pandas.
- [x] Fixed deprecation warnings of `Index.__or__` and `Index.__xor__` by replacing with `union` & `symmetric_difference` APIs.
- [x] Introduced mapping of our `float32` & `float64` dtypes to the pandas nullable dtypes `Float32Dtype` & `Float64Dtype` when `nullable=True` in `to_pandas`.
- [x] With the introduction of nullable float dtypes, there is an issue in creating a `MultiIndex` from a dataframe: pandas-dev/pandas#39984, so introduced a workaround in our `MultiIndex.__repr__` code.
- [x] Removed usages of `check_less_precise` in our code-base, as it is deprecated and replaced by `rtol` & `atol`. Retained its usages in our testing APIs for backward compatibility.
- [x] Removed a good number of `xfail` cases that now pass because of issues resolved in both `pandas` & `cudf`.
- [x] Did some miscellaneous code-cleanup in pytests.
- [x] Fixed pytests that failed when run in parallel because shared pytest params were being modified in place.
- [x] Standardized the import pattern across pytest files: some files did `from pandas import Series` and others did `from cudf.core import Series`. Removed both patterns in favor of plain `import cudf` & `import pandas as pd`, to avoid confusion while debugging test failures across multiple files. (This change covers the pytest files touched as part of the pandas upgrade; similar changes can be made in other files as they are touched in the future.)
- [x] Fix issue with assigning `np.nan` values to a `CategoricalColumn` and fix related `__repr__` code: #7446
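
The compatibility flag mentioned in the list can be illustrated with a small, library-independent sketch. This is not the actual `cudf.core._compat` code: the helper names `version_tuple` and `at_least` and the flag `EXAMPLE_PANDAS_GE_120` are hypothetical, and pre-release suffixes such as `rc0` are simply ignored, which a real implementation may handle differently.

```python
import re


def version_tuple(version: str) -> tuple:
    """Return the leading (major, minor, patch) integers of a version string."""
    # Capture up to three leading numeric components; suffixes such as
    # "rc0" or "a0" are ignored (an assumption of this sketch).
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # Missing components (e.g. "2" or "1.2") are treated as zero.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def at_least(installed: str, minimum: str) -> bool:
    """Return True when `installed` is at or above `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


# Hypothetical flag in the spirit of PANDAS_GE_120. In real use the first
# argument would be pd.__version__ rather than a literal string.
EXAMPLE_PANDAS_GE_120 = at_least("1.2.2", "1.2.0")
```

Code paths that differ between pandas versions can then branch on such a flag, so the same code-base keeps working against both older and newer releases.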
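
The `Index.__or__` / `Index.__xor__` item above amounts to spelling set operations with explicit method calls. A minimal sketch in plain pandas (the variable names are illustrative and not taken from the cudf code-base):

```python
import pandas as pd

left = pd.Index([1, 2, 3])
right = pd.Index([3, 4, 5])

# Explicit replacements for the operator forms `left | right` and
# `left ^ right`, whose use as set operations is deprecated.
combined = left.union(right)
only_in_one = left.symmetric_difference(right)
```

The method calls make the set semantics explicit and do not depend on how the operators are interpreted by a given pandas version.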
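
The empty-series item above can likewise be sketched in plain pandas. The example only shows the general pattern of passing the dtype explicitly; it is not copied from the cudf changes.

```python
import pandas as pd

# Passing the dtype explicitly avoids relying on whatever default the
# installed pandas version picks for empty input.
empty = pd.Series([], dtype="float64")
```

With the dtype stated at construction time, comparisons between pandas and cudf results for empty inputs do not depend on the default dtype of either library.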

Authors:
  - GALI PREM SAGAR (@galipremsagar)

Approvers:
  - Keith Kraus (@kkraus14)
  - AJ Schmidt (@ajschmidt8)

URL: #7375