[gpuCI] Forward-merge branch-21.10 to branch-21.12 [skip gpuci] #9274

GPUtester · 2021-09-22T19:00:18Z

Forward-merge triggered by push to branch-21.10 that creates a PR to keep branch-21.12 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge.

Removes `-g` from the compile commands generated by distutils to compile Cython files. This will make our container images, conda packages, and python wheels smaller.

Signed-off-by: Jordan Jacobelli <[email protected]>

Fixes: #9234

- [x] This PR optimizes `sort_index` for an `Index` that is already sorted: it skips the sort and the `take` operation that would follow it. This **alleviates** a lot of **memory pressure** and gives **a 3x to 6x speedup.**

On a T4 GPU:

`This PR`:

```python
In [1]: import cudf

In [2]: df = cudf.DataFrame({'a':[1, 2, 3]*100000000, 'b':['a', 'b', 'c']*100000000, 'c':[0.0, 0.12, 10.12]*100000000})

In [3]: %timeit df.sort_index()
174 ms ± 368 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

`branch-21.10`: Won't fit into memory and will error :( on T4 as it tries to perform argsort on an already sorted index.

`THIS PR`:

```python
In [1]: import cudf

In [2]: df = cudf.DataFrame({'a':[1, 2, 3]*10000000, 'b':['a', 'b', 'c']*10000000, 'c':[0.0, 0.12, 10.12]*10000000})

In [3]: %timeit df.sort_index(ascending=False)
69.1 ms ± 221 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit df.sort_index()
15.2 ms ± 213 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [5]: df_reversed = df[::-1]

In [6]: %timeit df_reversed.sort_index()
72.6 ms ± 433 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [7]: %timeit df_reversed.sort_index(ascending=False)
24.1 ms ± 423 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

`branch-21.10`:

```python
In [1]: import cudf

In [2]: df = cudf.DataFrame({'a':[1, 2, 3]*10000000, 'b':['a', 'b', 'c']*10000000, 'c':[0.0, 0.12, 10.12]*10000000})

In [3]: %timeit df.sort_index(ascending=False)
71.6 ms ± 141 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit df.sort_index()
71.7 ms ± 189 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [5]: df_reversed = df[::-1]

In [6]: %timeit df_reversed.sort_index()
69.1 ms ± 201 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [7]: %timeit df_reversed.sort_index(ascending=False)
69 ms ± 127 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

- [x] Also expands the parameters of `Series.sort_index` and refactors the common implementation into `Frame._sort_index`.

Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Michael Wang (https://github.com/isVoid) - Benjamin Zaitlen (https://github.com/quasiben) URL: #9238
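
The `sort_index` fast path can be illustrated with a minimal pure-Python sketch. It is not the cuDF implementation: `is_monotonic` and `sort_by_index` are hypothetical names, and plain lists stand in for GPU columns. The timings in the PR suggest only the case where the index is already in the requested order gets faster, so that is the only shortcut modelled here.

```python
def is_monotonic(values, ascending=True):
    """Return True if values are already ordered in the requested direction."""
    pairs = zip(values, values[1:])
    if ascending:
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)


def sort_by_index(index, rows, ascending=True):
    """Sort rows by index, skipping the argsort + take when possible."""
    if is_monotonic(index, ascending):
        # Fast path: already in the requested order, so no permutation
        # has to be computed or applied.
        return list(index), list(rows)
    # General path: compute a permutation (argsort), then gather (take).
    order = sorted(range(len(index)), key=index.__getitem__,
                   reverse=not ascending)
    return [index[i] for i in order], [rows[i] for i in order]
```

The memory saving comes from the fast path never materializing `order`, which for a large frame is an extra integer column plus the gathered copy of every data column.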

This PR fixes the `gather` API for structs columns when the input is a sliced column. Previously, `gather` called `child_begin()` and `child_end()` to access the child columns, so if the input structs column was sliced the output was incorrect. This closes #9213, and is blocked by #9194 due to conflicting work. Authors: - Nghia Truong (https://github.com/ttnghia) Approvers: - MithunR (https://github.com/mythrocks) - Mark Harris (https://github.com/harrism) URL: #9218
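
To see why reading the raw children of a sliced structs column goes wrong, here is a small Python model. It is only a sketch under assumptions: `StructColumnView`, `sliced_children`, and both gather functions are hypothetical names, and lists stand in for device columns. The assumption is that a sliced parent records an `offset` and `size` while its children still hold every row.

```python
from dataclasses import dataclass


@dataclass
class StructColumnView:
    children: list      # each child is a full-length list of values
    offset: int = 0     # first parent row covered by this view
    size: int = 0       # number of rows covered by this view

    def sliced_children(self):
        """Children restricted to the rows this view covers."""
        end = self.offset + self.size
        return [child[self.offset:end] for child in self.children]


def gather_ignoring_offset(col, gather_map):
    # Reads the raw children, so view row i is taken from parent row i.
    return [[child[i] for i in gather_map] for child in col.children]


def gather_with_offset(col, gather_map):
    # Applies the slice first, so view row i is parent row offset + i.
    return [[child[i] for i in gather_map] for child in col.sliced_children()]
```

For a view with `offset=2, size=2` over four rows, a gather map of `[1, 0]` should return parent rows 3 and 2; the offset-unaware version returns parent rows 1 and 0 instead.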

When #9030 was merged, the conflict in `get_cucollections.cmake` was resolved incorrectly: the file used features of `rapids_cpm_find` but still called `CPMFindPackage`. This corrects the issue by properly calling `rapids_cpm_find`. Authors: - Robert Maynard (https://github.com/robertmaynard) Approvers: - Keith Kraus (https://github.com/kkraus14) - Mark Harris (https://github.com/harrism) URL: #9189

libcudf doesn't expose zlib in the public facing API, and therefore C++ consumers don't need to also link / include zlib. Authors: - Robert Maynard (https://github.com/robertmaynard) - Keith Kraus (https://github.com/kkraus14) Approvers: - Keith Kraus (https://github.com/kkraus14) - Mark Harris (https://github.com/harrism) URL: #9204

Only run imports tests on x86_64

Provides the Python/Cython bindings for #8702 multibyte_split. This PR depends on #8702 being merged first. Closes #8557 Authors: - Jeremy Dyer (https://github.com/jdye64) - Christopher Harris (https://github.com/cwharris) Approvers: - https://github.com/nvdbaranec - Vyas Ramasubramani (https://github.com/vyasr) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #8998

Issue #9205 Authors: - Devavret Makkar (https://github.com/devavret) Approvers: - Mike Wilson (https://github.com/hyperbolic2346) - Vukasin Milovanovic (https://github.com/vuule) URL: #9235

Temporary workaround for `arm64`. Importing cudf on arm64 CPU-only nodes is currently not working due to a difference in reported GPU devices between arm64 and amd64. Authors: - Jordan Jacobelli (https://github.com/Ethyling) Approvers: - Ray Douglass (https://github.com/raydouglass) URL: #9252

Fixes #8905.

Attempting groupby aggregations with `LIST` keys leads to silent failures and bad results. For instance, attempting hash-based `groupby` aggregations with `LIST` keys only fails on DEBUG builds, thus:

```
/home/myth/dev/cudf/2/cpp/include/cudf/table/row_operators.cuh:447: unsigned int cudf::element_hasher_with_seed<hash_function, has_nulls>::operator()(cudf::column_device_view, signed int) const [with T = cudf::list_view; void *<anonymous> = (void *)nullptr; hash_function = default_hash; __nv_bool has_nulls = false]: block: [0,0,0], thread: [0,0,0] Assertion `false && "Unsupported type in hash."` failed.
```

In RELEASE builds, a copy of the input `LIST` column is returned, causing each output row to be interpreted as its own group.

This commit adds an explicit failure for unsupported groupby key types, i.e. those that don't support equality comparisons (like `LIST`).

Authors: - MithunR (https://github.com/mythrocks) Approvers: - Nghia Truong (https://github.com/ttnghia) - Robert Maynard (https://github.com/robertmaynard) - Jake Hemstad (https://github.com/jrhemstad) URL: #9227
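
The explicit up-front check for groupby key types can be sketched in Python as follows. This is an illustration, not the libcudf code: `TypeId`, `NOT_EQUALITY_COMPARABLE`, and `validate_groupby_keys` are hypothetical names, and the set of unsupported types is assumed to contain only `LIST`, the one type the commit message names.

```python
from enum import Enum, auto


class TypeId(Enum):
    INT32 = auto()
    STRING = auto()
    LIST = auto()


# Assumption for this sketch: LIST is the only type without equality comparison.
NOT_EQUALITY_COMPARABLE = {TypeId.LIST}


def validate_groupby_keys(key_types):
    """Raise before any aggregation runs if a key column cannot be compared."""
    for position, type_id in enumerate(key_types):
        if type_id in NOT_EQUALITY_COMPARABLE:
            raise TypeError(
                f"groupby key {position} has type {type_id.name}, "
                "which does not support equality comparison"
            )
```

Failing at validation time gives the same behavior in debug and release builds, instead of an assertion in one and silently wrong groups in the other.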

Fixes: #9254 This PR fixes `deserialize` in `cudf.MultiIndex` so that data is no longer corrupted when there are duplicate level names. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #9258
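
The commit message does not say how duplicate names corrupted the data. One plausible mechanism, shown here purely as an assumption, is rebuilding the levels through a mapping keyed by name, where a later level overwrites an earlier one. Both function names below are hypothetical.

```python
def deserialize_by_name(names, columns):
    # A dict keyed by level name keeps one entry per name, so a later
    # level with a duplicate name replaces the earlier level's data.
    return list(dict(zip(names, columns)).items())


def deserialize_by_position(names, columns):
    # Pairing names and columns positionally keeps every level intact.
    return list(zip(names, columns))
```

With `names = ['a', 'a']`, the name-keyed version returns a single level while the positional version returns both.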

This PR adds support for the struct type to the existing `drop_list_duplicates` API. This is the first time a nested type is supported in this function. Some more code cleanup has also been done. To be clear: only structs of basic types and structs of structs are supported. Structs of lists are not, due to their complex nature. Closes #8972. Blocked by #9218 (it is merged). Authors: - Nghia Truong (https://github.com/ttnghia) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - https://github.com/nvdbaranec - Mark Harris (https://github.com/harrism) URL: #9202

…9263) Closes #9156 This PR simplifies the parameters when calling thrust::reduce_by_key for the argmin/argmax aggregations in cudf::groupby. The illegalMemoryAccess found in #9156 was due to invalid data being passed from thrust::reduce_by_key through to the BinaryPredicate function, as documented in NVIDIA/thrust#1525. The invalid data is only a real issue for strings columns, where the device pointer was neither nullptr nor a valid address. The new logic provides only size_type values to thrust::reduce_by_key, so invalid values can only be out-of-bounds indices for the input column, which is easily checked before retrieving the string_view objects within the ArgMin and ArgMax operators. This is the same as #9244 but based on 21.10. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Devavret Makkar (https://github.com/devavret) - Nghia Truong (https://github.com/ttnghia) - Robert Maynard (https://github.com/robertmaynard) URL: #9263
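
The idea of reducing over row indices and bounds-checking them before touching the data can be sketched in Python. This is a sequential stand-in, not the CUDA code: `argmin_by_key` and `pick_smaller` are hypothetical names, `itertools.groupby` plays the role of thrust::reduce_by_key over runs of equal adjacent keys, and `-1` is used as a deliberately invalid index to exercise the check.

```python
from itertools import groupby


def argmin_by_key(keys, values):
    """For each run of equal adjacent keys, return the index of the smallest value."""
    n = len(values)

    def pick_smaller(lhs, rhs):
        # Only indices are passed around, so an invalid operand can only be
        # an out-of-bounds index; check that before reading any value.
        if not 0 <= lhs < n:
            return rhs
        if not 0 <= rhs < n:
            return lhs
        return rhs if values[rhs] < values[lhs] else lhs

    result = []
    for key, run in groupby(range(n), key=keys.__getitem__):
        best = -1  # invalid on purpose; the bounds check discards it
        for i in run:
            best = pick_smaller(best, i)
        result.append((key, best))
    return result
```

Because the operator never dereferences a value before validating the index, a garbage operand cannot turn into a read through an invalid string pointer.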

…lean (#9192)

Currently, we map the boolean type to `pa.int8` because the bit width of a cudf boolean does not match that of an arrow boolean. However, the implications of this mapping are subtle and it can cause unwanted results such as:

```python
>>> cudf.StructDtype({
...     "a": np.bool_,
...     "b": np.int8,
... })
StructDtype({'a': dtype('int8'), 'b': dtype('int8')})
```

This PR changes the mapping back to `pa.bool_`, and uses explicit type handling when dealing with type conversion to arrow.

Authors: - Michael Wang (https://github.com/isVoid) Approvers: - https://github.com/brandon-b-miller - H. Thomson Comer (https://github.com/thomcom) URL: #9192

Fixes a Java column vector leak in TableTest#testParquetWriteMap. Authors: - Jason Lowe (https://github.com/jlowe) Approvers: - Robert (Bobby) Evans (https://github.com/revans2) URL: #9271

Forward-merge `branch-21.08` into `branch-21.10`

This changes the calls in java/cudf to check for an empty input and return an empty result instead of crashing. Fixes #9253 Authors: - Mike Wilson (https://github.com/hyperbolic2346) Approvers: - Jason Lowe (https://github.com/jlowe) URL: #9262

Issue #9205 depends on #9235 Authors: - Devavret Makkar (https://github.com/devavret) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Elias Stehle (https://github.com/elstehle) - https://github.com/nvdbaranec - Mike Wilson (https://github.com/hyperbolic2346) URL: #9242

GPUtester · 2021-09-22T19:00:29Z

FAILURE - Unable to forward-merge due to conflicts, manual merge is necessary. Do not use the Resolve conflicts option in this PR, follow these instructions https://docs.rapids.ai/maintainers/gpuci/#forward-mergers
IMPORTANT: Before merging and approving this PR, be sure to change the merging strategy to Create a Merge Commit. Otherwise, history will be lost and the branches become incompatible.

Closes #8660

Per discussions in thread #8872, this PR adds a struct-accessor member function to provide a lateral view to a struct type series.

Example:

```python
>>> import cudf, dask_cudf as dgd
>>> ds = dgd.from_cudf(cudf.Series(
...     [{'a': 42, 'b': 'str1', 'c': [-1]},
...      {'a': 0, 'b': 'str2', 'c': [400, 500]},
...      {'a': 7, 'b': '', 'c': []}]), npartitions=2)
>>> ds.struct.explode().compute()
    a     b           c
0  42  str1        [-1]
1   0  str2  [400, 500]
2   7                []
```

Authors: - Michael Wang (https://github.com/isVoid) Approvers: - Richard (Rick) Zamora (https://github.com/rjzamora) URL: #9086

…view (#9185)

Fixes #9140
- Added `shallow_hash(column_view)`
- Added unit tests

It computes hash values based on the shallow states of `column_view`: type, size, data pointer, null_mask pointer, offset, and the hash value of the children. `null_count` is not used since it is a cached value and it may vary based on contents of `null_mask`, and may be pre-computed or not.

Fixes #9139
- Added `is_shallow_equivalent(column_view, column_view)` ~shallow_equal~
- Added unit tests

It compares two `column_view`s based on the shallow states of `column_view`: type, size, data pointer, null_mask pointer, offset, and the `column_view` of the children. `null_count` is not used since it is a cached value and it may vary based on contents of `null_mask`, and may be pre-computed or not.

Authors: - Karthikeyan (https://github.com/karthikeyann) Approvers: - Mark Harris (https://github.com/harrism) - Vyas Ramasubramani (https://github.com/vyasr) - Jake Hemstad (https://github.com/jrhemstad) - David Wendt (https://github.com/davidwendt) URL: #9185
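
The "shallow state" described above can be modelled in a few lines of Python. This is a sketch of the idea, not the libcudf implementation: the `ColumnView` dataclass and `shallow_state` helper are hypothetical, pointers are modelled as integers, and Python's built-in `hash` stands in for whatever hash combiner the C++ code uses.

```python
from dataclasses import dataclass


@dataclass
class ColumnView:
    dtype: str
    size: int
    data_ptr: int
    null_mask_ptr: int
    offset: int = 0
    null_count: int = -1   # cached value; deliberately left out of the state
    children: tuple = ()


def shallow_state(col):
    """Collect the fields that identify a view without reading its data."""
    return (col.dtype, col.size, col.data_ptr, col.null_mask_ptr, col.offset,
            tuple(shallow_state(child) for child in col.children))


def shallow_hash(col):
    return hash(shallow_state(col))


def is_shallow_equivalent(lhs, rhs):
    return shallow_state(lhs) == shallow_state(rhs)
```

Two views over the same buffers compare equal even if only one of them has its `null_count` computed, while changing the offset or any child makes them differ.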

This PR strips the pyarrow-NativeFile component out of #9225 (since those changes are not yet stable). I feel that it is reasonable to start by merging these fsspec-specific optimizations for 21.10, because they are stable and already result in a significant performance boost over the existing approach to remote storage. I still think it is very important that we eventually plumb NativeFile support into python (cudf and dask_cudf), but we will likely need to target 21.12 for that improvement. Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Ashwin Srinath (https://github.com/shwina) - Benjamin Zaitlen (https://github.com/quasiben) URL: #9265

Fixes #7830, #8443

Features:
- Use the new table metadata type that matches the table hierarchy, `table_input_metadata`.
- Support struct columns in the writer.

Changes:
- Null masks are encoded as aligned rowgroups to avoid invalid bits when the number of encoded rows is not divisible by 8 (except for the last rowgroup in each stripe). This also affects list columns. The issue is equivalent to #6763 (boolean columns only).
- Added pushdown masks that are used to determine which child elements should not be encoded, including null mask bits.
- Use pushdown masks for rowgroup alignment, null mask encoding and value encoding.
- Separated the null mask encoding from value encoding; this can be moved further into a separate kernel call.

Breaking because the table metadata type has changed.

Authors: - Vukasin Milovanovic (https://github.com/vuule) - Jason Lowe (https://github.com/jlowe) Approvers: - Robert Maynard (https://github.com/robertmaynard) - AJ Schmidt (https://github.com/ajschmidt8) - Robert (Bobby) Evans (https://github.com/revans2) - Vyas Ramasubramani (https://github.com/vyasr) - Devavret Makkar (https://github.com/devavret) - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu) URL: #9025

Authors: - Devavret Makkar (https://github.com/devavret) Approvers: - Robert Maynard (https://github.com/robertmaynard) - Keith Kraus (https://github.com/kkraus14) URL: #9276

Aligns the function signature for `cudf.DataFrame.apply` with that of `pandas.DataFrame.apply`. This is needed so that dask can build on a common `apply` interface between backends among other reasons. Authors: - https://github.com/brandon-b-miller Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) URL: #9275

trxcllnt and others added 19 commits August 25, 2021 14:15

Remove -g from cython compile commands (#9074)

0a00579

Removes `-g` from the compile commands generated by distutils to compile Cython files. This will make our container images, conda packages, and python wheels smaller.

Only run imports tests on x86_64

95f31dd

Signed-off-by: Jordan Jacobelli <[email protected]>

Merge pull request #9241 from Ethyling/arm64

156d5b5

Only run imports tests on x86_64

Use nvcomp's snappy decompression in ORC reader (#9235)

f08d6f1

Issue #9205 Authors: - Devavret Makkar (https://github.com/devavret) Approvers: - Mike Wilson (https://github.com/hyperbolic2346) - Vukasin Milovanovic (https://github.com/vuule) URL: #9235

Fix Java column leak in testParquetWriteMap (#9271)

b0c8bbb

Fixes a Java column vector leak in TableTest#testParquetWriteMap. Authors: - Jason Lowe (https://github.com/jlowe) Approvers: - Robert (Bobby) Evans (https://github.com/revans2) URL: #9271

Merge pull request #9273 from rapidsai/branch-21.08

20a04bc

Forward-merge `branch-21.08` into `branch-21.10`

GPUtester requested review from a team as code owners September 22, 2021 19:00

GPUtester requested review from trxcllnt and rgsl888prabhu September 22, 2021 19:00

github-actions bot added the CMake (CMake build issue) and conda labels Sep 22, 2021

github-actions bot added the Java (Affects Java cuDF API.), Python (Affects Python cuDF API.), gpuCI, and libcudf (Affects libcudf (C++/CUDA) code.) labels Sep 22, 2021

isVoid and others added 6 commits September 22, 2021 19:05

Update nvcomp to include fixes for installation of headers (#9276)

eaedf17

Authors: - Devavret Makkar (https://github.com/devavret) Approvers: - Robert Maynard (https://github.com/robertmaynard) - Keith Kraus (https://github.com/kkraus14) URL: #9276

galipremsagar mentioned this pull request Sep 23, 2021

Fix Automerger for Branch-21.12 from branch-21.10 #9285

Merged

ajschmidt8 merged commit 3ed97af into branch-21.12 Sep 23, 2021
