Adding cudf.cut method #8002

Merged 70 commits on Jun 11, 2021
Changes from 3 commits
fd6fb9c
interval dtype and tests
marlenezw Dec 11, 2020
bdce72c
fixing merge conflicts
marlenezw Apr 20, 2021
f4c8329
adding updates from branch-20
marlenezw Apr 20, 2021
7cfd192
removing faulty merge.
marlenezw Apr 20, 2021
800e134
more merge conflict fixes.
marlenezw Apr 20, 2021
43e74d1
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw Apr 21, 2021
aae0d16
changes that allow us to return catindex.
marlenezw Apr 23, 2021
9709562
updating branch.
marlenezw Apr 23, 2021
5733daf
final changes and tests.
marlenezw Apr 27, 2021
044f54e
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw Apr 27, 2021
d926f83
updated changes and removing old code.
marlenezw Apr 27, 2021
0892a36
removing unnecessary changes.
marlenezw Apr 27, 2021
5b4936f
changing closed parameters to fix failing tests.
marlenezw Apr 27, 2021
b5a8982
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw Apr 27, 2021
534bf73
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw Apr 28, 2021
93ee3bd
more tests.
marlenezw May 5, 2021
0a94e42
resolving merge conflicts
marlenezw May 5, 2021
6a8bcfa
changes for series input.
marlenezw May 5, 2021
ab54eee
adding changes for parameters retbins, labels, and precision. Also allo…
marlenezw May 6, 2021
2ded2e5
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw May 6, 2021
2a6ea6b
removing breakpoint that was causing failures.
marlenezw May 7, 2021
e1b34b7
handling for bins that are interval index and or a sequence of scalar…
marlenezw May 10, 2021
58cf825
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw May 10, 2021
33f8d0f
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw May 11, 2021
548d71c
adding changes to give correct output with series and one more test.
marlenezw May 11, 2021
1ee4f1a
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw May 11, 2021
257b4d5
adding handling for the case where we have a series, duplicates dropp…
marlenezw May 12, 2021
10d4316
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw May 12, 2021
eb4f3cb
changing x min and max into scalars to avoid using cupy.
marlenezw May 14, 2021
b20da61
removing breakpoint
marlenezw May 14, 2021
4b0a837
Merge branch 'branch-0.20' of https://github.com/rapidsai/cudf into c…
marlenezw May 14, 2021
30ecf09
fixing some style issues.
marlenezw May 14, 2021
d5d8dc6
Update python/cudf/cudf/tests/test_cut.py
marlenezw May 19, 2021
809bc36
Update python/cudf/cudf/tests/test_cut.py
marlenezw May 19, 2021
00a819f
Update python/cudf/cudf/core/cut.py
marlenezw May 19, 2021
1011cb1
Update python/cudf/cudf/core/cut.py
marlenezw May 19, 2021
8b05f16
Update python/cudf/cudf/core/column/categorical.py
marlenezw May 19, 2021
f32613f
adding base_mask to col to get correct null later.
marlenezw May 19, 2021
0e7f270
resolve merge conflicts.
marlenezw May 19, 2021
2d73337
more changes to tests.
marlenezw May 24, 2021
ad66b38
Merge branch 'branch-21.06' of https://github.com/rapidsai/cudf into …
marlenezw May 24, 2021
17ca933
Merge branch 'branch-21.06' of https://github.com/rapidsai/cudf into …
marlenezw May 26, 2021
c78a0cd
updating tests.
marlenezw May 26, 2021
646f6a8
Merge branch 'branch-21.08' of https://github.com/rapidsai/cudf into …
marlenezw Jun 2, 2021
4d30454
style changes.
marlenezw Jun 2, 2021
b19edc3
fixing mypy style issue.
marlenezw Jun 2, 2021
f1a4e43
Merge branch 'branch-21.08' of https://github.com/rapidsai/cudf into …
marlenezw Jun 4, 2021
8dc5070
fixing error that assumes all categories have a dtype.
marlenezw Jun 4, 2021
62f5739
Update python/cudf/cudf/core/cut.py
marlenezw Jun 4, 2021
9aeafda
Update python/cudf/cudf/core/cut.py
marlenezw Jun 4, 2021
ad85c37
using cupy instead of sequence for calculating bin value
marlenezw Jun 7, 2021
13dbe21
using cupy.linspace instead of sequence.
marlenezw Jun 8, 2021
c413cac
Merge branch 'branch-21.08' of https://github.com/rapidsai/cudf into …
marlenezw Jun 8, 2021
8bb81fc
style fixes.
marlenezw Jun 8, 2021
e4c7ae5
fixing merge conflicts.
marlenezw Jun 8, 2021
1fa45f1
changing to numpy.
marlenezw Jun 8, 2021
ec4d5b8
keeping bins computations on the host.
marlenezw Jun 8, 2021
d3bb368
style changes.
marlenezw Jun 8, 2021
c8d8ffd
Update python/cudf/cudf/core/cut.py
marlenezw Jun 9, 2021
588d44a
Update python/cudf/cudf/core/dtypes.py
marlenezw Jun 9, 2021
4acb1a7
Update python/cudf/cudf/tests/test_cut.py
marlenezw Jun 9, 2021
e667f64
Update python/cudf/cudf/core/cut.py
marlenezw Jun 9, 2021
57486cd
adding test for raise exception and removing stale code.
marlenezw Jun 9, 2021
d3ffa19
Update python/cudf/cudf/core/cut.py
marlenezw Jun 10, 2021
32a9255
Update python/cudf/cudf/core/cut.py
marlenezw Jun 10, 2021
ac20cb0
Update python/cudf/cudf/core/cut.py
marlenezw Jun 10, 2021
9c924be
removing stale code after switching to host.
marlenezw Jun 10, 2021
a038312
Merge branch 'cut-pr' of https://github.com/marlenezw/cudf into cut-pr
marlenezw Jun 10, 2021
8f9264d
style changes and updates from reviews.
marlenezw Jun 10, 2021
002752a
Merge branch 'branch-21.08' of https://github.com/rapidsai/cudf into …
marlenezw Jun 10, 2021
10 changes: 3 additions & 7 deletions python/cudf/cudf/core/cut.py
@@ -138,12 +138,8 @@ def cut(
                 bins = list(dict.fromkeys(bins))
 
     # if bins is an intervalIndex we ignore the value of right
-    if (
-        right is False
-        and isinstance(bins, pd.IntervalIndex)
-        and bins.closed == "right"
-    ):
-        right = True
+    if isinstance(bins, (pd.IntervalIndex, cudf.IntervalIndex)):
+        right = bins.closed == "right"
 
     # create bins if given an int or single scalar
     if not isinstance(bins, pd.IntervalIndex):
@@ -178,7 +174,7 @@ def cut(
             bins[0] = bins[0] - 10 ** (-precision)
 
         # if right is false the last bin edge is not included
-        if right is False:
+        if not right:
             right_edge = bins[len(bins) - 1]
             x = cupy.asarray(x)
             if isinstance(right_edge, cupy._core.core.ndarray):
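To make the effect of the change in python/cudf/cudf/core/cut.py concrete, here is a minimal sketch comparing the old and new handling of `right` when `bins` is an interval index. It is an illustration under stated assumptions, not the cudf implementation: `FakeIntervalIndex`, `resolve_right_old`, and `resolve_right_new` are hypothetical names that do not exist in cudf or pandas, and the stand-in class models only the `closed` attribute so the example runs without a GPU.

```python
from dataclasses import dataclass


@dataclass
class FakeIntervalIndex:
    """Hypothetical stand-in exposing only the `closed` attribute."""

    closed: str  # "left" or "right"


def resolve_right_old(bins, right=True):
    # Old check: only flips `right` from False to True for right-closed bins.
    if (
        right is False
        and isinstance(bins, FakeIntervalIndex)
        and bins.closed == "right"
    ):
        right = True
    return right


def resolve_right_new(bins, right=True):
    # New check: an interval index always decides the value of `right`.
    if isinstance(bins, FakeIntervalIndex):
        right = bins.closed == "right"
    return right


left_closed = FakeIntervalIndex(closed="left")
right_closed = FakeIntervalIndex(closed="right")

# Left-closed bins with right=True: the old check leaves `right` untouched.
print(resolve_right_old(left_closed, right=True))    # True
print(resolve_right_new(left_closed, right=True))    # False

# Right-closed bins with right=False: both versions agree.
print(resolve_right_old(right_closed, right=False))  # True
print(resolve_right_new(right_closed, right=False))  # True

# `is False` is an identity test, `not` is a truthiness test.
falsy = 0
print(falsy is False)  # False
print(not falsy)       # True
```

The last two lines relate to the switch from `if right is False:` to `if not right:`. The identity test matches only the `False` singleton, while the truthiness test also accepts other falsy values.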