[RELEASE] cudf v0.18 #7405

GPUtester · 2021-02-17T20:43:17Z

❄️ Code freeze for `branch-0.18` and v0.18 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-0.18 until release (merging of this PR).

What is the purpose of this PR?

- Update documentation
- Allow testing for the new release
- Enable a means to merge branch-0.18 into main for the release

Add a cmake find module to locate cuFile. If found, add the include directory and link to the shared library. This shouldn't have any effect if cuFile is not installed locally.

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

This implements the `numeric_only` argument for `DataFrame.quantile`, meaning that it now works on `datetime` and `timedelta` data. However, because of the difference in how `DataFrame.iloc` behaves between Pandas and cuDF, this implementation returns a DataFrame when `numeric_only=False` even when Pandas returns a Series. Passes tests locally. This closes #6799. Authors: - Chris Jarrett <[email protected]> - ChrisJar <[email protected]> Approvers: - Keith Kraus URL: #6902

When the `--rmm_mode=managed` parameter is passed to the gtests, an `Invalid RMM allocation mode: managed` exception is thrown. The logic in `include/cudf_test/base_fixture.hpp` is just missing a return statement. Authors: - davidwendt <[email protected]> Approvers: - Paul Taylor - Mark Harris URL: #6912

Resolves: #6870 This PR adds support for `set_names` API in both `Index` & `MultiIndex`. Authors: - galipremsagar <[email protected]> - GALI PREM SAGAR <[email protected]> Approvers: - Keith Kraus URL: #6929

Fixes: #6821 This PR fixes an issue where `columns` and `index` are not handled correctly in specific scenarios. Authors: - galipremsagar <[email protected]> - GALI PREM SAGAR <[email protected]> Approvers: - Richard (Rick) Zamora - Ashwin Srinath URL: #6838


Update to libcu++ on Github. Authors: - ptaylor <[email protected]> - Paul Taylor <[email protected]> Approvers: - Mark Harris - Keith Kraus - Christopher Harris - Mark Harris URL: #6275

This PR removes `**kwargs` from the string/categorical accessors where unnecessary, and exposes keyword arguments like `inplace` to the user directly. If we want to maintain parity with Pandas APIs for Dask/others using cuDF internally, we can consider using the approach described in #6135, which will automatically raise `NotImplementedError` when unsupported kwargs are passed. Authors: - Ashwin Srinath <[email protected]> Approvers: - GALI PREM SAGAR - Keith Kraus - Keith Kraus URL: #6750
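As a rough illustration of rejecting unsupported keyword arguments, here is a minimal sketch in plain Python. It is not the approach from #6135, which is not reproduced here; the decorator `reject_unsupported_kwargs` and the function `rename_categories` are hypothetical names invented for this example, not cuDF APIs.

```python
import functools
import inspect


def reject_unsupported_kwargs(func):
    """Hypothetical guard: raise NotImplementedError for undeclared kwargs."""
    # Names the wrapped function explicitly declares in its signature.
    supported = set(inspect.signature(func).parameters)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        unsupported = sorted(set(kwargs) - supported)
        if unsupported:
            raise NotImplementedError(
                f"unsupported keyword arguments: {unsupported}"
            )
        return func(*args, **kwargs)

    return wrapper


@reject_unsupported_kwargs
def rename_categories(values, mapping, inplace=False):
    """Hypothetical accessor-style function with an explicit `inplace` keyword."""
    renamed = [mapping.get(v, v) for v in values]
    if inplace:
        values[:] = renamed
        return None
    return renamed
```

With this guard, a call such as `rename_categories(vals, m, errors="ignore")` fails loudly instead of silently ignoring `errors`.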

Fixes #6682, #6680

Currently, empty fields are treated as N/A regardless of parsing options. However, the desired behavior is to handle empty fields the same way as fields with special values (apply `default_na_values`, `na_filter` logic). This PR irons out the behavior so it matches Pandas in this regard.

- Tries now support matching empty strings.
- The list of special NA values is now generated more robustly, so it has correct elements in any parameter combination.
- Empty string is added to the list of special NA values.
- Quoted empty string (`"\"\""`) is added to the NA value list if empty string (`""`) is included (mirrors Pandas behavior).
- Added tests for previously failing parameter combinations.
- Reworked some of the tests to check against Pandas results instead of assumed desired behavior.

Authors: - vuule <[email protected]> - vuule <[email protected]> - Vukasin Milovanovic <[email protected]> - Vukasin Milovanovic <[email protected]> Approvers: - Ram (Ramakrishna Prabhu) - Christopher Harris - Keith Kraus URL: #6922
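The parameter interaction described above can be sketched in plain Python. This is a simplified model of the Pandas-style rules, not the libcudf implementation: `DEFAULT_NA` is only a small subset of the real default list, and `build_na_set` and `is_na` are hypothetical helper names.

```python
# Small illustrative subset of the default NA strings (assumption: the real
# list is longer). Note that the empty string is part of it.
DEFAULT_NA = {"", "#N/A", "NA", "NULL", "NaN", "n/a", "nan", "null"}


def build_na_set(na_values=(), keep_default_na=True, na_filter=True):
    """Return the set of strings that should be parsed as N/A."""
    if not na_filter:
        # With NA filtering disabled, nothing is treated as N/A.
        return set()
    na_set = set(na_values)
    if keep_default_na:
        na_set |= DEFAULT_NA
    return na_set


def is_na(field, **options):
    """Treat an empty field like any other special value."""
    return field in build_na_set(**options)
```

Under this model an empty field is N/A by default, but not when `keep_default_na=False` is passed without listing `""` in `na_values`, and never when `na_filter=False`.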

The include directory was renamed from `simt` to `cuda`. Authors: - Rong Ou <[email protected]> Approvers: - Jason Lowe URL: #6948

The `cudf::merge` API expects the key columns to be sorted. This means that if null rows are included, these null entries should all appear either at the beginning or at the end of the column, depending on the `null_order` for the sort. The `MergeDictionaryTest.WithNull` gtest placed null rows in the middle of the column. The expected results should also have included null entries at the beginning or the end. This PR also includes an extra test for checking that merge results are consistent with the sort parameters `cudf::order` and `cudf::null_order`. This test also includes a larger number of rows to ensure `thrust::merge` requires more than one tile/block in its runtime logic. Authors: - davidwendt <[email protected]> Approvers: - Ram (Ramakrishna Prabhu) - Vukasin Milovanovic URL: #6942
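To make the sortedness precondition concrete, here is a small host-side sketch in plain Python. It is not `cudf::merge`: it uses `heapq.merge` from the standard library, and `null_aware_key`, `is_sorted`, and `merge_sorted` are hypothetical helpers. `None` stands in for a null row, and `nulls_first` plays the role of the null order.

```python
import heapq


def null_aware_key(value, nulls_first):
    """Sort key that places None before or after every non-null value."""
    if value is None:
        return (0, 0) if nulls_first else (1, 0)
    return (1, value) if nulls_first else (0, value)


def is_sorted(column, nulls_first=True):
    """Check the precondition: ascending order with nulls grouped at one end."""
    keys = [null_aware_key(v, nulls_first) for v in column]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def merge_sorted(left, right, nulls_first=True):
    """Merge two columns that are already sorted under the same null order."""
    return list(
        heapq.merge(left, right, key=lambda v: null_aware_key(v, nulls_first))
    )


merged_first = merge_sorted([None, 1, 4, 7], [None, None, 2, 4])
# -> [None, None, None, 1, 2, 4, 4, 7]
merged_last = merge_sorted([1, 4, 7, None], [2, 4, None, None], nulls_first=False)
# -> [1, 2, 4, 4, 7, None, None, None]
```

A column such as `[1, None, 4]` fails `is_sorted` under either null order, which mirrors the problem with the original test input.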

Updating the Java bindings package version to match the libcudf version. Authors: - Jason Lowe <[email protected]> Approvers: - Robert (Bobby) Evans URL: #6949

@shwina

Fixes #7249 Copies dtype metadata after calling `ColumnBase.copy()`. Moves logic for copying dtype metadata after calling libcudf functions from `Frame` to `ColumnBase`. Authors: - Ashwin Srinath (@shwina) Approvers: - Keith Kraus (@kkraus14) - GALI PREM SAGAR (@galipremsagar) URL: #7271

@isVoid

Small PR to provide two fixes: - Use `rmm::device_uvector` in place of `device_vector` to improve efficiency. This is a scratch space, so the supplied stream and the default memory resource are used. Part of #5380 - Update `sort_helper::grouped_value` docstring to reflect change after use of stable sort. Authors: - Michael Wang (@isVoid) Approvers: - Vukasin Milovanovic (@vuule) - Ram (Ramakrishna Prabhu) (@rgsl888prabhu) - Mark Harris (@harrism) URL: #7256

@vuule

Use a buffer for output in the newly added ORC test. Authors: - Vukasin Milovanovic (@vuule) Approvers: - GALI PREM SAGAR (@galipremsagar) URL: #7313

@firestarman

Add unit tests for aggregate 'collect' with windowing. This PR depends on the PR #7189 . Signed-off-by: Liangcai Li <[email protected]> Authors: - Liangcai Li (@firestarman) Approvers: - MithunR (@mythrocks) - Robert (Bobby) Evans (@revans2) URL: #7121

@adelevie

change: on -> one I read the contributing guidelines, but since this is just a documentation fix, I'm not sure which apply. Great library, I just got started using it. A little rough around the edges, but great so far, and well worth some of the added steps. Authors: - Alan deLevie (@adelevie) - AJ Schmidt (@ajschmidt8) Approvers: - GALI PREM SAGAR (@galipremsagar) - Keith Kraus (@kkraus14) - Michael Wang (@isVoid) - Ray Douglass (@raydouglass) URL: #7253

@davidwendt

Returning a unique pointer using `std::move` causes a compile error for gcc 9 and above. Simple fix to remove the incorrect move semantic in `segmented_sort.cu` `get_segment_indices`. Authors: - David (@davidwendt) Approvers: - Karthikeyan (@karthikeyann) - Devavret Makkar (@devavret) URL: #7319

@galipremsagar

Constructing a DataFrame from a ColumnAccessor previously had unintended side-effects:

```python
In [1]: import cudf

In [2]: a = cudf.DataFrame({'a': [1, 2, 3]})

In [3]: a._data['a'].__cuda_array_interface__
Out[3]:
{'shape': (3,),
 'strides': (8,),
 'typestr': '<i8',
 'data': (140409137266688, False),
 'version': 1}

In [4]: a[['a']]
Out[4]:
   a
0  1
1  2
2  3

In [5]: a._data['a'].__cuda_array_interface__
Out[5]:
{'shape': (3,),
 'strides': (8,),
 'typestr': '<i8',
 'data': (140409137267200, False),
 'version': 1}
```

In a discussion with @galipremsagar, we decided that it's probably best not to handle `ColumnAccessor` in the frame constructors.

* Remove special handling of `ColumnAccessor` in `Frame` constructors
* Collapse `Series.copy()` and `DataFrame.copy()` into a single `Frame.copy()`

Authors: - Ashwin Srinath (@shwina) - GALI PREM SAGAR (@galipremsagar) Approvers: - GALI PREM SAGAR (@galipremsagar) URL: #7298

@isVoid

Closes #7246 This PR fixes a bug in `DataFrame.iloc`. When the slice provided to `iloc` is decrementing and also terminates at the `before-the-zero` position, such as `slice(2, -1, -1)` or `slice(4, None, -1)`, the terminal position still gets wrapped around. `Frame._slice` is moved to `DataFrame._slice` to resolve a typing issue. Authors: - Michael Wang (@isVoid) Approvers: - Keith Kraus (@kkraus14) - GALI PREM SAGAR (@galipremsagar) URL: #7277
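The wrap-around pitfall can be reproduced with plain Python sequences, independent of cuDF. The sketch below is only an analogy for the bug class, not the cuDF code: once a decrementing slice has been normalized with `slice.indices`, its stop value of `-1` means "before position zero", and feeding it back into ordinary slicing reinterprets it as "the last element".

```python
data = list(range(5))

# Normalizing a decrementing, open-ended slice yields a stop of -1,
# which here means "one step before index 0".
start, stop, step = slice(4, None, -1).indices(len(data))
normalized = (start, stop, step)  # (4, -1, -1)

# Re-slicing with the normalized values wraps -1 around to the last
# position, so the selection is unexpectedly empty.
wrapped = data[slice(start, stop, step)]  # []

# Treating the normalized values as a range avoids the wrap-around.
positions = list(range(start, stop, step))  # [4, 3, 2, 1, 0]
selected = [data[i] for i in positions]  # [4, 3, 2, 1, 0]
```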

@ChrisJar

This updates the "10 minutes to cuDF and CuPy" notebook to use the new methods for moving between cuDF data structures and CuPy arrays. Closes #7160 Authors: - @ChrisJar Approvers: - Ashwin Srinath (@shwina) URL: #7158

@shwina

Closes #7311 Authors: - Ashwin Srinath (@shwina) Approvers: - Keith Kraus (@kkraus14) - AJ Schmidt (@ajschmidt8) URL: #7318

@jolorunyomi

This PR adds the GitHub action [PR Labeler](https://github.com/actions/labeler) to auto-label PRs based on their content. Labeling is managed with a configuration file `.github/labeler.yml` using the following [options](https://github.com/actions/labeler#usage). Authors: - Joseph (@jolorunyomi) - Mike Wendt (@mike-wendt) Approvers: - AJ Schmidt (@ajschmidt8) - Keith Kraus (@kkraus14) - Mike Wendt (@mike-wendt) URL: #7044

@shwina

Authors: - Ashwin Srinath (@shwina) Approvers: - Keith Kraus (@kkraus14) - @jakirkham - Ray Douglass (@raydouglass) URL: #7335

@dillon-cullinan

Issues and PRs without activity for 30d will be marked as stale. If there is no activity for 90d, they will be marked as rotten. Authors: - Jordan Jacobelli (@Ethyling) Approvers: - Dillon Cullinan (@dillon-cullinan) URL: #7388

@mike-wendt

Follows #7388

Updates the stale GHA with the following changes:

- [x] Uses `inactive-30d` and `inactive-90d` labels instead of `stale` and `rotten`
- [x] Updates comments to reflect changes in labels
- [x] Exempts the following labels from being marked `inactive-30d` or `inactive-90d`
  - `0 - Blocked`
  - `0 - Backlog`
  - `good first issue`

Authors: - Mike Wendt (@mike-wendt) Approvers: - Keith Kraus (@kkraus14) - Ray Douglass (@raydouglass) URL: #7395
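The labeling rule can be modeled in a few lines of Python. This is only a sketch of the policy as described, not the GitHub Action or its configuration: `inactivity_label` is a hypothetical function, and treating exactly 30 or 90 days as already inactive is an assumption.

```python
from datetime import datetime, timedelta

# Labels that exempt an issue or PR from inactivity marking.
EXEMPT_LABELS = {"0 - Blocked", "0 - Backlog", "good first issue"}


def inactivity_label(last_activity, now, labels=()):
    """Return the inactivity label to apply, or None if none applies."""
    if EXEMPT_LABELS.intersection(labels):
        return None
    idle = now - last_activity
    # Assumption: the thresholds are inclusive.
    if idle >= timedelta(days=90):
        return "inactive-90d"
    if idle >= timedelta(days=30):
        return "inactive-30d"
    return None


now = datetime(2021, 2, 17)
recent = inactivity_label(datetime(2021, 2, 10), now)  # None
month_old = inactivity_label(datetime(2021, 1, 1), now)  # "inactive-30d"
quarter_old = inactivity_label(datetime(2020, 11, 1), now)  # "inactive-90d"
```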

review-notebook-app · 2021-02-17T20:43:23Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

ajschmidt8 and others added 30 commits November 24, 2020 15:47

DOC v0.18 Updates

80464ce

Add a cmake option to link to GDS/cuFile (#6847)

0e94bab

Add a cmake find module to locate cuFile. If found, add the include directory and link to the shared library. This shouldn't have any effect if cuFile is not installed locally.

Merge pull request #6866 from rapidsai/branch-0.17

2ed7e13

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6867 from rapidsai/branch-0.17

a091304

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6874 from rapidsai/branch-0.17

c0e03d6

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6876 from rapidsai/branch-0.17

018d036

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6877 from rapidsai/branch-0.17

7aa3863

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6878 from rapidsai/branch-0.17

36c03a5

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6879 from rapidsai/branch-0.17

48adcc0

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6880 from rapidsai/branch-0.17

36d5205

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge branch 'branch-0.17' into fix_automerge

536d23a

Merge pull request #6890 from kkraus14/fix_automerge

737e715

Merge pull request #6896 from rapidsai/branch-0.17

3d80bb8

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6900 from rapidsai/branch-0.17

c6f39b1

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6904 from rapidsai/branch-0.17

009c307

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6906 from rapidsai/branch-0.17

dd6cf15

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6910 from rapidsai/branch-0.17

8c8e05f

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Merge pull request #6913 from rapidsai/branch-0.17

522103d

[gpuCI] Auto-merge branch-0.17 to branch-0.18 [skip ci]

Add Index.set_names api(#6929)

917759b

Resolves: #6870 This PR adds support for `set_names` API in both `Index` & `MultiIndex`. Authors: - galipremsagar <[email protected]> - GALI PREM SAGAR <[email protected]> Approvers: - Keith Kraus URL: #6929

Implement cudf::reduce for decimal32 and decimal64 (part 1) (#6814)

9120992

Update to official libcu++ on Github(#6275)

78f9789

Update to libcu++ on Github. Authors: - ptaylor <[email protected]> - Paul Taylor <[email protected]> Approvers: - Mark Harris - Keith Kraus - Christopher Harris - Mark Harris URL: #6275

fix libcu++ include path for jni(#6948)

83b1851

The include directory was renamed from `simt` to `cuda`. Authors: - Rong Ou <[email protected]> Approvers: - Jason Lowe URL: #6948

Update Java bindings version to 0.18-SNAPSHOT(#6949)

44eeb70

Updating the Java bindings package version to match the libcudf version. Authors: - Jason Lowe <[email protected]> Approvers: - Robert (Bobby) Evans URL: #6949

fixed_point_value double-shifts in fixed_point construction (#6950)

6d230ee

shwina and others added 14 commits February 4, 2021 18:35

Fix failing CI ORC test (#7313)

3a52d93

Use a buffer for output in the newly added ORC test. Authors: - Vukasin Milovanovic (@vuule) Approvers: - GALI PREM SAGAR (@galipremsagar) URL: #7313

Update 10 minutes to cuDF and CuPy with new APIs (#7158)

658e91a

This updates the "10 minutes to cuDF and CuPy" notebook to use the new methods for moving between cuDF data structures and CuPy arrays. Closes #7160 Authors: - @ChrisJar Approvers: - Ashwin Srinath (@shwina) URL: #7158

Update readme (#7318)

da0e794

Closes #7311 Authors: - Ashwin Srinath (@shwina) Approvers: - Keith Kraus (@kkraus14) - AJ Schmidt (@ajschmidt8) URL: #7318

Unpin from numpy < 1.20 (#7335)

d3f5add

Authors: - Ashwin Srinath (@shwina) Approvers: - Keith Kraus (@kkraus14) - @jakirkham - Ray Douglass (@raydouglass) URL: #7335

Add GHA to mark issues/prs as stale/rotten (#7388)

26c2dfe

Issues and PRs without activity for 30d will be marked as stale. If there is no activity for 90d, they will be marked as rotten. Authors: - Jordan Jacobelli (@Ethyling) Approvers: - Dillon Cullinan (@dillon-cullinan) URL: #7388

GPUtester requested review from a team as code owners February 17, 2021 20:43

GPUtester requested review from vuule, nvdbaranec, isVoid and brandon-b-miller February 17, 2021 20:43

ajschmidt8 added and removed the non-breaking (Non-breaking change) label Feb 23, 2021

update changelog

1544474

raydouglass added the non-breaking (Non-breaking change) label Feb 24, 2021

raydouglass merged commit b7e1a85 into main Feb 24, 2021
