[BUG] Calling cudf.from_pandas on empty index isn't preserving the dtype for string column #14046

Closed
galipremsagar opened this issue Sep 6, 2023 · 0 comments · Fixed by #14052
Labels: bug (Something isn't working), Python (Affects Python cuDF API)

Comments

@galipremsagar
Contributor

Describe the bug
When an empty pandas Index with object (string) dtype is passed to cudf.from_pandas, the dtype is not preserved: the result is a Float64Index with dtype float64. An empty Series with the same dtype converts correctly.

Steps/Code to reproduce bug

In [1]: import pandas as pd

In [2]: s = pd.Series([], dtype='str')

In [3]: import cudf

In [4]: gs = cudf.from_pandas(s)

In [5]: s
Out[5]: Series([], dtype: object)

In [6]: gs
Out[6]: Series([], dtype: object)

In [7]: i = pd.Index([], dtype='str')

In [8]: i
Out[8]: Index([], dtype='object')

In [9]: cudf.from_pandas(i)
Out[9]: Float64Index([], dtype='float64')

Expected behavior

In [9]: cudf.from_pandas(i)
Out[9]: StringIndex([], dtype='object')

Environment overview

  • Environment location: Bare-metal
  • Method of cuDF install: from source
galipremsagar added the bug and Python labels Sep 6, 2023
galipremsagar self-assigned this Sep 6, 2023
rapids-bot bot pushed a commit that referenced this issue Sep 7, 2023
Fixes #14046 

This PR fixes the construction of empty string columns, which failed because of a corner case in the way pyarrow constructs arrays.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #14052