Raise error on constructing an array from mixed type inputs #13768

Merged
merged 8 commits into from
Jul 28, 2023

Conversation

galipremsagar
Contributor

Description

We currently allow construction from mixed-type inputs by silently casting them to a common type, as shown below:

In [1]: import cudf

In [2]: import pandas as pd

In [3]: s = pd.Series([1, 2, 3], dtype='datetime64[ns]')


In [5]: p = pd.Series([10, 11])

In [6]: new_s = pd.concat([s, p])

In [7]: new_s
Out[7]: 
0    1970-01-01 00:00:00.000000001
1    1970-01-01 00:00:00.000000002
2    1970-01-01 00:00:00.000000003
0                               10
1                               11
dtype: object

In [8]: cudf.Series(new_s)
Out[8]: 
0   1970-01-01 00:00:00.000000
1   1970-01-01 00:00:00.000000
2   1970-01-01 00:00:00.000000
0   1970-01-01 00:00:00.000010
1   1970-01-01 00:00:00.000011
dtype: datetime64[us]

This behavior is incorrect; it originates in the `pa.array` constructor, which casts the mixed inputs to a common type. This PR adds proper handling for such cases so that an error is raised instead.
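For illustration only, here is a minimal, library-independent sketch of the kind of check described above: reject the input when its non-null values do not share a single type, rather than casting them. It is not the code from this PR. The helper name `ensure_single_type` and the error message are hypothetical, and the sketch uses plain Python objects rather than cudf, pandas, or pyarrow.

```python
from datetime import datetime


def ensure_single_type(values):
    """Raise TypeError if the non-null values have more than one Python type.

    Hypothetical helper for illustration; not part of cudf, pandas, or pyarrow.
    """
    # Collect the distinct types of all non-null values.
    kinds = {type(v) for v in values if v is not None}
    if len(kinds) > 1:
        names = ", ".join(sorted(k.__name__ for k in kinds))
        raise TypeError(f"Cannot create column with mixed types: {names}")
    return list(values)


# Datetimes mixed with integers, analogous to `new_s` in the example above.
mixed = [datetime(1970, 1, 1), datetime(1970, 1, 2), 10, 11]
try:
    ensure_single_type(mixed)
    outcome = "accepted"
except TypeError as err:
    outcome = f"rejected: {err}"
print(outcome)
```

Homogeneous input (including input with nulls, such as `[1, 2, None]`) passes through unchanged, so only genuinely mixed input is rejected.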

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@galipremsagar galipremsagar added labels: bug (Something isn't working), Python (Affects Python cuDF API.), 4 - Needs cuDF (Python) Reviewer, breaking (Breaking change). Jul 26, 2023
@galipremsagar galipremsagar self-assigned this Jul 26, 2023
@galipremsagar galipremsagar requested a review from a team as a code owner July 26, 2023 17:49
@galipremsagar galipremsagar requested review from shwina and isVoid July 26, 2023 17:49
@galipremsagar galipremsagar added label 5 - Ready to Merge (Testing and reviews complete, ready to merge) and removed label 4 - Needs cuDF (Python) Reviewer. Jul 27, 2023
Contributor

@bdice bdice left a comment

One suggestion for improvement.

@galipremsagar galipremsagar requested a review from bdice July 28, 2023 13:58
@galipremsagar
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 0910046 into rapidsai:branch-23.10 Jul 28, 2023
rapids-bot bot pushed a commit that referenced this pull request Aug 16, 2023
…13889)

Continuation of #13768. In #13768 we prevented construction of mixed types in `Index` & `Series`. This PR implements the same for `DataFrame`.

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - https://github.com/brandon-b-miller

URL: #13889
Labels
5 - Ready to Merge (Testing and reviews complete, ready to merge), breaking (Breaking change), bug (Something isn't working), Python (Affects Python cuDF API.)
3 participants