Branch 0.19 merge 0.18 #7310

Closed

nvdbaranec
Contributor

@nvdbaranec nvdbaranec commented Feb 4, 2021

Fixes #7305

nvdbaranec and others added 9 commits February 4, 2021 02:01
…format. (rapidsai#7096)

Addresses rapidsai#3793

Depends on rapidsai#6864 (which affects `contiguous_split.cu`; for the purposes of this PR, the only relevant changes are those that involve the generation of metadata).

- `pack()` performs a `contiguous_split()` on the incoming table to arrange the memory into a unified device buffer, and generates a host-side metadata buffer. These are returned in the `packed_columns` struct.

- `unpack()` takes the data stored in the `packed_columns` struct and returns a deserialized `table_view` that points into it.

The intent of this functionality is as follows (pseudocode):

```
// serialize-side
table_view t;
packed_columns p = pack(t);
send_over_network(p.gpu_data);
send_over_network(p.metadata);

// deserialize-side
packed_columns p = receive_from_network();
table_view t = unpack(p);
```

This PR also renames `contiguous_split_result` to `packed_table` (which is just a bundled `table_view` and `packed_columns`).

Authors:
  - @nvdbaranec

Approvers:
  - Jake Hemstad (@jrhemstad)
  - Paul Taylor (@trxcllnt)
  - Mike Wilson (@hyperbolic2346)

URL: rapidsai#7096
Fixes rapidsai#7265.

`cudf::detail::get_num_child_rows()` is currently defined in `cudf/lists/detail/utilities.cuh`. The build pipelines for rapidsai#7189 are fine, but there seem to be build failures in dependent projects such as `spark-rapids`:
```
[2021-01-31T08:12:10.611Z] /.../workspace/spark/cudf18_nightly/cpp/include/cudf/lists/detail/utilities.cuh:31:18: error: 'cudf::size_type cudf::detail::get_num_child_rows(const cudf::column_view&, rmm::cuda_stream_view)' defined but not used [-Werror=unused-function]
[2021-01-31T08:12:10.611Z]  static cudf::size_type get_num_child_rows(cudf::column_view const& list_offsets,
[2021-01-31T08:12:10.611Z]                   ^~~~~~~~~~~~~~~~~~
[2021-01-31T08:12:11.981Z] cc1plus: all warnings being treated as errors
[2021-01-31T08:12:12.238Z] make[2]: *** [CMakeFiles/cudf_hash.dir/build.make:82: CMakeFiles/cudf_hash.dir/src/hash/hashing.cu.o] Error 1
[2021-01-31T08:12:12.238Z] make[1]: *** [CMakeFiles/Makefile2:220: CMakeFiles/cudf_hash.dir/all] Error 2
```
In any case, it is less than ideal for the function to be completely defined in the header, especially given that the likes of `hashing.cu` are exposed to it (by way of `scatter.cuh`). 

This commit moves the function definition to a separate translation unit, without changing implementation or interface.

Authors:
  - MithunR (@mythrocks)

Approvers:
  - @nvdbaranec
  - Mike Wilson (@hyperbolic2346)
  - David (@davidwendt)

URL: rapidsai#7266
Addresses part of rapidsai#6541 (segmented sort of lists)

- [x] lists_column_view segmented_sort
- [x] numerical types (cub segmented sort limitation)
- [x] sort_lists(table_view)
- [x] unit tests

Closes rapidsai#4603 (segmented sort)

- [x] segmented_sort
- [x] unit tests
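
To make the term concrete, here is a minimal CPU-side sketch of what a segmented sort computes, assuming segments are described by an offsets array in the style of a lists column. The function name `segmented_sort_reference` and its parameters are hypothetical; this is not the libcudf implementation, which runs on the GPU.

```python
def segmented_sort_reference(values, offsets):
    """Sort each segment values[offsets[i]:offsets[i + 1]] independently.

    Hypothetical reference helper for illustration only; it is not part of
    libcudf or cudf.
    """
    out = list(values)
    # Consecutive offset pairs delimit one segment each.
    for start, end in zip(offsets, offsets[1:]):
        out[start:end] = sorted(out[start:end])
    return out


# Three segments: [3, 1, 2], [9, 7], and [5].
result = segmented_sort_reference([3, 1, 2, 9, 7, 5], [0, 3, 5, 6])
```

Elements never move across a segment boundary, which is what distinguishes this from sorting the whole array.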

Authors:
  - Karthikeyan (@karthikeyann)

Approvers:
  - AJ Schmidt (@ajschmidt8)
  - Keith Kraus (@kkraus14)
  - Jake Hemstad (@jrhemstad)
  - Conor Hoekstra (@codereport)

URL: rapidsai#7122
…rapidsai#7261)

Issue rapidsai#6763

Authors:
  - Vukasin Milovanovic (@vuule)

Approvers:
  - Ram (Ramakrishna Prabhu) (@rgsl888prabhu)
  - @nvdbaranec
  - GALI PREM SAGAR (@galipremsagar)
  - Keith Kraus (@kkraus14)

URL: rapidsai#7261
This PR requires the libcudf changes in rapidsai#7096, fixing the Java bindings to `contiguous_split` that are broken by that change.

This also adds the ability to create a `ContiguousTable` instance without manifesting a `Table` instance and all of the `ColumnVector` instances underneath it, which should prove useful during Spark's shuffle.

Authors:
  - Jason Lowe (@jlowe)

Approvers:
  - Robert (Bobby) Evans (@revans2)
  - Alessandro Bellina (@abellina)

URL: rapidsai#7127
It turns out we need version > 5.4 of the JUnit Jupiter engine to support `@TempDir`.

- Changed the file mode to match Spark's disk manager.
- Changed to use `fstat` to get the file length when appending.
- Added tests for when a file already exists.
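
The `fstat` idea can be sketched in Python as follows. This is only an analogy under stated assumptions: the helper name `append_bytes` is hypothetical, the default `mode` is an arbitrary placeholder rather than the value chosen in the PR, and the actual change lives in the Java bindings and their native code.

```python
import os
import tempfile


def append_bytes(path, data, mode=0o600):
    """Append data to path and return the offset at which it starts.

    Hypothetical helper; mode is a placeholder, not the PR's value.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, mode)
    try:
        # fstat on the open descriptor reports the current file length,
        # which is where the appended bytes will begin.
        offset = os.fstat(fd).st_size
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        return offset
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "buffer.bin")
    first = append_bytes(path, b"abc")   # file did not exist yet
    second = append_bytes(path, b"de")   # file already exists
    with open(path, "rb") as f:
        contents = f.read()
```

Querying the length through the open descriptor avoids a second path lookup and covers both the new-file and existing-file cases.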

Authors:
  - Rong Ou (@rongou)

Approvers:
  - Jason Lowe (@jlowe)
  - Robert (Bobby) Evans (@revans2)

URL: rapidsai#7296
Closes rapidsai#7199

Refactors scalar handling inside `assert_eq`. At a high level, this PR proposes "whitelist"-style testing: all comparisons should go through the "strict equal" code path unless explicitly allowed. This lets the test system catch every unintended inequality except those that have been explicitly agreed upon. For example, this PR creates two whitelist items:
- If the operands override `__eq__`, use it to determine equality.
- If the operands are floating-point types, assert approximate equality.

For all other cases, the operands should be strictly equal. Note that for testing purposes, `np.nan` is considered equal to itself.
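
The whitelist approach can be sketched with the standard library alone. The helper name `assert_scalar_eq` and its tolerance parameter are hypothetical, and `math.isclose` stands in for whatever approximate comparison the test utilities actually use; this is not the code added by the PR.

```python
import math


def assert_scalar_eq(left, right, rel_tol=1e-9):
    """Hypothetical sketch of whitelist-style scalar comparison."""
    # Whitelist item 1: let the operands' own __eq__ decide first.
    if left == right:
        return
    # Whitelist item 2: floating-point operands are compared approximately,
    # and NaN is treated as equal to NaN for testing purposes.
    both_numeric = all(isinstance(x, (int, float)) for x in (left, right))
    any_float = any(isinstance(x, float) for x in (left, right))
    if both_numeric and any_float:
        if math.isnan(left) and math.isnan(right):
            return
        if math.isclose(left, right, rel_tol=rel_tol):
            return
    # Everything else takes the strict path and fails.
    raise AssertionError(f"{left!r} != {right!r}")


assert_scalar_eq(0.1 + 0.2, 0.3)                  # approximately equal
assert_scalar_eq(float("nan"), float("nan"))      # NaN matches NaN
```

Anything not covered by one of the two listed cases fails, so a new kind of tolerated inequality has to be added deliberately.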

Authors:
  - Michael Wang (@isVoid)

Approvers:
  - GALI PREM SAGAR (@galipremsagar)
  - @brandon-b-miller

URL: rapidsai#7220
This PR prepares the changelog to be automatically updated during releases.

Authors:
  - GALI PREM SAGAR (@galipremsagar)

Approvers:
  - Keith Kraus (@kkraus14)
  - AJ Schmidt (@ajschmidt8)

URL: rapidsai#7309
@nvdbaranec nvdbaranec requested review from a team as code owners February 4, 2021 15:29
@nvdbaranec nvdbaranec changed the base branch from branch-0.18 to branch-0.19 February 4, 2021 15:32
Member

@jlowe jlowe left a comment

Java approval

@kkraus14
Collaborator

kkraus14 commented Feb 4, 2021

I think we can just merge #7305 directly via a merge commit instead of this. Pinged ops to confirm.

Contributor

@revans2 revans2 left a comment

The Java merge looks fine.

@kkraus14
Collaborator

kkraus14 commented Feb 4, 2021

#7305 merged so closing this.

@kkraus14 kkraus14 closed this Feb 4, 2021