Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Support datetime64 in row-wise op #6415

Merged
merged 19 commits into from
Oct 12, 2020
Merged

Conversation

isVoid
Copy link
Contributor

@isVoid isVoid commented Oct 3, 2020

Closes #6129

In solving this issue a bug from numpy surfaced: numpy/numpy#17428
find_common_types is not suggested. Should we start replacing the use of this function to np.promote_types?

@isVoid isVoid added 2 - In Progress Currently a work in progress Python Affects Python cuDF API. labels Oct 3, 2020
@isVoid isVoid requested review from shwina and beckernick October 3, 2020 01:58
@isVoid isVoid requested a review from a team as a code owner October 3, 2020 01:58
@isVoid isVoid requested a review from kkraus14 October 3, 2020 01:58
@isVoid isVoid marked this pull request as draft October 3, 2020 02:07
@jakirkham
Copy link
Member

Sounds like a good idea 🙂

@seberg
Copy link
Contributor

seberg commented Oct 3, 2020

Seems np.promote_types is currently limited to only two inputs, so np.result_type is probably more convenient.

EDIT: You also will have to do a try:/except Typerror: to fall back to object unfortunately.

@kkraus14
Copy link
Collaborator

kkraus14 commented Oct 3, 2020

Thanks @seberg! We don't support object types in general so we should likely be fine 😄

@kkraus14
Copy link
Collaborator

kkraus14 commented Oct 3, 2020

cc @brandon-b-miller for dtype related discussion since you've been working on implementing cudf dtypes.

@brandon-b-miller
Copy link
Contributor

I'll look into if this affects anything and report back - I know we use it in a few places in join.

@GPUtester
Copy link
Collaborator

Please update the changelog in order to start CI tests.

View the gpuCI docs here.

Copy link
Contributor Author

@isVoid isVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two other things I would like to have some comments:

  1. Memory footprint. I could not come up with an easier solution when there are mixed resolution of datetime64, since the underlying values are stored w.r.t the resolution.
  2. (Question thanks to @brandon-b-miller ) What are the expected behavior when overflow happens when converting to a higher resolution?

python/cudf/cudf/tests/test_dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/tests/test_dataframe.py Show resolved Hide resolved
@isVoid isVoid marked this pull request as ready for review October 7, 2020 03:38
@isVoid isVoid changed the title [WIP] Support datetime64[ms] in row-wise op [REVIEW] Support datetime64[ms] in row-wise op Oct 7, 2020
@isVoid isVoid added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Oct 7, 2020
python/cudf/cudf/tests/test_dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/tests/test_dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
Michael Wang added 3 commits October 7, 2020 12:45
- Separating pandas bug cases into separate assertion
- Remove excess skipif

PR 6415
Address review comments
PR6415
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/tests/test_dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/tests/test_dataframe.py Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Oct 8, 2020

Codecov Report

Merging #6415 into branch-0.17 will increase coverage by 0.38%.
The diff coverage is 92.30%.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.17    #6415      +/-   ##
===============================================
+ Coverage        82.72%   83.11%   +0.38%     
===============================================
  Files               90       90              
  Lines            14444    14816     +372     
===============================================
+ Hits             11949    12314     +365     
- Misses            2495     2502       +7     
Impacted Files Coverage Δ
python/cudf/cudf/utils/dtypes.py 88.54% <84.61%> (+1.63%) ⬆️
python/cudf/cudf/core/dataframe.py 90.75% <100.00%> (+0.16%) ⬆️
python/cudf/cudf/io/feather.py 100.00% <0.00%> (ø)
python/cudf/cudf/comm/serialize.py 0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/io.py 0.00% <0.00%> (ø)
python/custreamz/custreamz/_version.py 0.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/_version.py 0.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_csv.py 100.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_orc.py 100.00% <0.00%> (ø)
python/dask_cudf/dask_cudf/io/tests/test_json.py 100.00% <0.00%> (ø)
... and 38 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b50662d...573640c. Read the comment docs.

- Handle both type and value exception for result type
- Comment to explain why test memory idendity
- Use with pytest.raise block
Michael Wang added 3 commits October 8, 2020 16:22
Addresses review comments:
- Using a wrapper of np.find_common_type to accomodate future
need of cudf type promotion rules
- Avoids the need to modify dataframe by deep copying underlying
values used for row-wise op.

PR 6415
CHANGELOG.md Outdated Show resolved Hide resolved
@kkraus14 kkraus14 changed the title [REVIEW] Support datetime64[ms] in row-wise op [REVIEW] Support datetime64 in row-wise op Oct 9, 2020
@kkraus14
Copy link
Collaborator

cc @brandon-b-miller to review again before we merge

Copy link
Contributor

@brandon-b-miller brandon-b-miller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@kkraus14 kkraus14 added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Oct 12, 2020
@kkraus14 kkraus14 merged commit 8c32b80 into rapidsai:branch-0.17 Oct 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants