Resolve auto merge conflicts for Branch 21.08 from branch 21.06 #8329

Merged
30 commits
c732cef
Support create lists column from a `list_scalar` (#8185)
isVoid May 20, 2021
2da8473
Create a String column from UTF8 String byte arrays (#8257)
firestarman May 20, 2021
48647aa
Java: Support creating a scalar from utf8 string (#8294)
firestarman May 20, 2021
0ebf7e6
support RMM aligned resource adapter in JNI (#8266)
rongou May 20, 2021
deee1f6
update changelog (#8297)
ajschmidt8 May 20, 2021
7427049
Remove abc inheritance from Serializable (#8254)
vyasr May 20, 2021
944e932
Implement `lists::concatenate_list_elements` (#8231)
ttnghia May 20, 2021
75e12d1
Actually test equality in assert_groupby_results_equal (#8272)
shwina May 20, 2021
47c3572
Merge remote-tracking branch 'upstream/branch-0.19' into branch-21.06…
ajschmidt8 May 20, 2021
9e308de
Merge pull request #8302 from ajschmidt8/branch-21.06-merge-0.19
ajschmidt8 May 20, 2021
3975f10
Update `CHANGELOG.md` links for calver (#8303)
ajschmidt8 May 20, 2021
2a1075e
use address and length for GDS reads/writes (#8301)
rongou May 20, 2021
b553144
Return python lists for __getitem__ calls to list type series (#8265)
brandon-b-miller May 20, 2021
c7d0524
Copy nested types upon construction (#8244)
isVoid May 20, 2021
9a85b3b
Update cudfjni version to 21.06.0 (#8292)
pxLi May 21, 2021
b84c792
Fix concatenate_lists_ignore_null on rows of all_nulls (#8312)
sperlingxx May 21, 2021
6920f9b
Update readme with correct CUDA versions (#8315)
raydouglass May 21, 2021
5c6b92a
COLLECT_LIST support returning empty output columns. (#8279)
mythrocks May 21, 2021
de579a5
Added decimal writing for CSV writer (#8296)
kaatish May 21, 2021
696902d
Enable implicit casting when concatenating mixed types (#8276)
ChrisJar May 23, 2021
ef20706
Add separator-on-null parameter to strings concatenate APIs (#8282)
davidwendt May 24, 2021
b9588d1
JNI: Refactor the code of making column from scalar (#8310)
firestarman May 24, 2021
936b02d
Add description of the cuIO GDS integration (#8293)
vuule May 24, 2021
259d69b
Revert "patch thrust to fix intmax num elements limitation in scan_by…
cwharris May 24, 2021
3da0d12
added _is_homogeneous property (#8299)
shaneding May 24, 2021
63faf2f
Use empty_like in scatter (#8314)
revans2 May 24, 2021
e555643
Update environment variable used to determine `cuda_version` (#8321)
ajschmidt8 May 24, 2021
b1d7788
Update Java string concatenate test for single column (#8330)
tgravescs May 24, 2021
5c0a75b
Fix cudf release version in readme (#8331)
galipremsagar May 24, 2021
cd4cfba
Merge branch-21.06 into branch-21.08
galipremsagar May 24, 2021
326 changes: 319 additions & 7 deletions CHANGELOG.md

Large diffs are not rendered by default.

11 changes: 6 additions & 5 deletions README.md
@@ -65,15 +65,16 @@ Please see the [Demo Docker Repository](https://hub.docker.com/r/rapidsai/rapids

cuDF can be installed with conda ([miniconda](https://conda.io/miniconda.html), or the full [Anaconda distribution](https://www.anaconda.com/download)) from the `rapidsai` channel:

- For `cudf version == 21.081.06` :
+ <<<<<<< HEAD
+ For `cudf version == 21.06` :
```bash
- # for CUDA 10.1
+ # for CUDA 11.0
conda install -c rapidsai -c nvidia -c numba -c conda-forge \
- cudf=21.081.06 python=3.7 cudatoolkit=10.1
+ cudf=21.06 python=3.7 cudatoolkit=11.0

- # or, for CUDA 10.2
+ # or, for CUDA 11.2
conda install -c rapidsai -c nvidia -c numba -c conda-forge \
- cudf=21.081.06 python=3.7 cudatoolkit=10.2
+ cudf=21.06 python=3.7 cudatoolkit=11.2

```

2 changes: 1 addition & 1 deletion conda/recipes/cudf/meta.yaml
@@ -3,7 +3,7 @@
{% set version = environ.get('GIT_DESCRIBE_TAG', '0.0.0.dev').lstrip('v') + environ.get('VERSION_SUFFIX', '') %}
{% set minor_version = version.split('.')[0] + '.' + version.split('.')[1] %}
{% set py_version=environ.get('CONDA_PY', 36) %}
- {% set cuda_version='.'.join(environ.get('CUDA_VERSION', '10.1').split('.')[:2]) %}
+ {% set cuda_version='.'.join(environ.get('CUDA', '10.1').split('.')[:2]) %}

package:
name: cudf
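
The `cuda_version` expression in these recipes reads an environment variable (`CUDA` after this change, with `10.1` as the fallback), splits the value on `.`, and keeps the first two components. As a rough illustration only, the same truncation can be modelled in C++. The names `major_minor` and `cuda_version_from_env` are hypothetical and appear nowhere in the recipes.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Host-side model of the recipe's Jinja expression
//   '.'.join(environ.get('CUDA', '10.1').split('.')[:2])
// It keeps at most the first two dot-separated components of `value`.
std::string major_minor(std::string const& value)
{
  auto const first = value.find('.');
  if (first == std::string::npos) { return value; }  // no dot: nothing to trim
  auto const second = value.find('.', first + 1);
  // When there is no second dot, `second` is npos and substr keeps everything.
  return value.substr(0, second);
}

// Reads the `CUDA` environment variable, falling back to "10.1" as the recipe does.
std::string cuda_version_from_env()
{
  char const* raw = std::getenv("CUDA");
  return major_minor(raw != nullptr ? raw : "10.1");
}
```

Under this model, a value such as `11.2.72` is reduced to `11.2`, while `10.1` passes through unchanged.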
2 changes: 1 addition & 1 deletion conda/recipes/cudf_kafka/meta.yaml
@@ -3,7 +3,7 @@
{% set version = environ.get('GIT_DESCRIBE_TAG', '0.0.0.dev').lstrip('v') + environ.get('VERSION_SUFFIX', '') %}
{% set minor_version = version.split('.')[0] + '.' + version.split('.')[1] %}
{% set py_version=environ.get('CONDA_PY', 36) %}
- {% set cuda_version='.'.join(environ.get('CUDA_VERSION', '10.1').split('.')[:2]) %}
+ {% set cuda_version='.'.join(environ.get('CUDA', '10.1').split('.')[:2]) %}

package:
name: cudf_kafka
2 changes: 1 addition & 1 deletion conda/recipes/custreamz/meta.yaml
@@ -3,7 +3,7 @@
{% set version = environ.get('GIT_DESCRIBE_TAG', '0.0.0.dev').lstrip('v') + environ.get('VERSION_SUFFIX', '') %}
{% set minor_version = version.split('.')[0] + '.' + version.split('.')[1] %}
{% set py_version=environ.get('CONDA_PY', 36) %}
- {% set cuda_version='.'.join(environ.get('CUDA_VERSION', '10.1').split('.')[:2]) %}
+ {% set cuda_version='.'.join(environ.get('CUDA', '10.1').split('.')[:2]) %}

package:
name: custreamz
2 changes: 1 addition & 1 deletion conda/recipes/dask-cudf/meta.yaml
@@ -3,7 +3,7 @@
{% set version = environ.get('GIT_DESCRIBE_TAG', '0.0.0.dev').lstrip('v') + environ.get('VERSION_SUFFIX', '') %}
{% set minor_version = version.split('.')[0] + '.' + version.split('.')[1] %}
{% set py_version=environ.get('CONDA_PY', 36) %}
- {% set cuda_version='.'.join(environ.get('CUDA_VERSION', '10.1').split('.')[:2]) %}
+ {% set cuda_version='.'.join(environ.get('CUDA', '10.1').split('.')[:2]) %}

package:
name: dask-cudf
6 changes: 4 additions & 2 deletions conda/recipes/libcudf/meta.yaml
@@ -2,7 +2,7 @@

{% set version = environ.get('GIT_DESCRIBE_TAG', '0.0.0.dev').lstrip('v') + environ.get('VERSION_SUFFIX', '') %}
{% set minor_version = version.split('.')[0] + '.' + version.split('.')[1] %}
- {% set cuda_version='.'.join(environ.get('CUDA_VERSION', '10.1').split('.')[:2]) %}
+ {% set cuda_version='.'.join(environ.get('CUDA', '10.1').split('.')[:2]) %}

package:
name: libcudf
@@ -133,12 +133,14 @@ test:
  - test -f $PREFIX/include/cudf/io/types.hpp
  - test -f $PREFIX/include/cudf/ipc.hpp
  - test -f $PREFIX/include/cudf/join.hpp
+ - test -f $PREFIX/include/cudf/lists/detail/combine.hpp
  - test -f $PREFIX/include/cudf/lists/detail/concatenate.hpp
  - test -f $PREFIX/include/cudf/lists/detail/copying.hpp
+ - test -f $PREFIX/include/cudf/lists/lists_column_factories.hpp
  - test -f $PREFIX/include/cudf/lists/detail/drop_list_duplicates.hpp
  - test -f $PREFIX/include/cudf/lists/detail/interleave_columns.hpp
  - test -f $PREFIX/include/cudf/lists/detail/sorting.hpp
- - test -f $PREFIX/include/cudf/lists/concatenate_rows.hpp
+ - test -f $PREFIX/include/cudf/lists/combine.hpp
  - test -f $PREFIX/include/cudf/lists/count_elements.hpp
  - test -f $PREFIX/include/cudf/lists/explode.hpp
  - test -f $PREFIX/include/cudf/lists/drop_list_duplicates.hpp
5 changes: 3 additions & 2 deletions cpp/CMakeLists.txt
@@ -266,7 +266,8 @@ add_library(cudf
src/join/join.cu
src/join/semi_join.cu
src/lists/contains.cu
- src/lists/concatenate_rows.cu
+ src/lists/combine/concatenate_list_elements.cu
+ src/lists/combine/concatenate_rows.cu
src/lists/copying/concatenate.cu
src/lists/copying/copying.cu
src/lists/copying/gather.cu
@@ -332,8 +333,8 @@ add_library(cudf
src/strings/char_types/char_cases.cu
src/strings/char_types/char_types.cu
src/strings/combine/concatenate.cu
- src/strings/combine/concatenate_list_elements.cu
src/strings/combine/join.cu
+ src/strings/combine/join_list_elements.cu
src/strings/contains.cu
src/strings/convert/convert_booleans.cu
src/strings/convert/convert_datetime.cu
22 changes: 0 additions & 22 deletions cpp/cmake/thrust.patch
@@ -81,25 +81,3 @@ index c0c6d59..937ee31 100644
{
typedef AgentScanPolicy<
128, 15, ///< Threads per block, items per thread
diff --git a/thrust/system/cuda/detail/scan_by_key.h b/thrust/system/cuda/detail/scan_by_key.h
index fe4b321c..b3974c69 100644
--- a/thrust/system/cuda/detail/scan_by_key.h
+++ b/thrust/system/cuda/detail/scan_by_key.h
@@ -513,7 +513,7 @@ namespace __scan_by_key {
scan_op(scan_op_)
{
int tile_idx = blockIdx.x;
- Size tile_base = ITEMS_PER_TILE * tile_idx;
+ Size tile_base = ITEMS_PER_TILE * static_cast<Size>(tile_idx);
Size num_remaining = num_items - tile_base;

if (num_remaining > ITEMS_PER_TILE)
@@ -734,7 +734,7 @@ namespace __scan_by_key {
ScanOp scan_op,
AddInitToScan add_init_to_scan)
{
- int num_items = static_cast<int>(thrust::distance(keys_first, keys_last));
+ size_t num_items = static_cast<size_t>(thrust::distance(keys_first, keys_last));
size_t storage_size = 0;
cudaStream_t stream = cuda_cub::stream(policy);
bool debug_sync = THRUST_DEBUG_SYNC_FLAG;
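
The hunk removed from `thrust.patch` had widened two values in Thrust's `scan_by_key`: the tile offset and the item count. Below is a minimal host-side sketch of why the `static_cast<Size>` on `tile_idx` mattered. It assumes a 32-bit `int`, a `Size` type wider than `int`, and a made-up `ITEMS_PER_TILE` (the real value is not shown in this diff). The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical tile size, for illustration only (not taken from Thrust).
constexpr int ITEMS_PER_TILE = 1920;

// Offset of a tile's first item, computed in 64 bits. This mirrors the effect of
// `ITEMS_PER_TILE * static_cast<Size>(tile_idx)` when `Size` is a 64-bit type.
constexpr std::int64_t tile_base(int tile_idx)
{
  return ITEMS_PER_TILE * static_cast<std::int64_t>(tile_idx);
}

// True when the offset does not fit in `int`, i.e. when the plain `int * int`
// product would overflow (undefined behaviour for signed integers).
constexpr bool overflows_int(int tile_idx)
{
  return tile_base(tile_idx) > std::numeric_limits<int>::max();
}
```

With these assumed numbers, tile index 1118481 is the last one whose offset still fits in a 32-bit `int`. The next tile needs the wider type.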
3 changes: 2 additions & 1 deletion cpp/include/cudf/column/column_factories.hpp
@@ -541,7 +541,8 @@ std::unique_ptr<cudf::column> make_structs_column(
*
* The output column will have the same type as `s.type()`
* The output column will contain all null rows if `s.invalid()==false`
- * The output column will be empty if `size==0`.
+ * The output column will be empty if `size==0`. For LIST scalars, the column hierarchy
+ * from @p s is preserved.
*
* @param[in] s The scalar to use for values in the column.
* @param[in] size The number of rows for the output column.
Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@
namespace cudf {
namespace lists {
/**
- * @addtogroup lists_concatenate_rows
+ * @addtogroup lists_combine
* @{
* @file
*/
@@ -53,16 +53,47 @@ enum class concatenate_null_policy { IGNORE, NULLIFY_OUTPUT_ROW };
*
* @param input Table of lists to be concatenated.
* @param null_policy The parameter to specify whether a null list element will be ignored from
- * concatenation, or any concatenation involving a null list element will result in a null list.
+ * concatenation, or any concatenation involving a null element will result in a null list.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return A new column in which each row is a list resulted from concatenating all list elements in
- * the corresponding row of the input table.
+ * the corresponding row of the input table.
*/
std::unique_ptr<column> concatenate_rows(
table_view const& input,
concatenate_null_policy null_policy = concatenate_null_policy::IGNORE,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

+ /**
+ * @brief Concatenating multiple lists on the same row of a lists column into a single list.
+ *
+ * Given a lists column where each row in the column is a list of lists of entries, an output lists
+ * column is generated by concatenating all the list elements at the same row together. If any row
+ * contains null list elements, the concatenation process will either ignore those null elements, or
+ * will simply set the entire resulting row to be a null element.
+ *
+ * @code{.pseudo}
+ * l = [ [{1, 2}, {3, 4}, {5}], [{6}, {}, {7, 8, 9}] ]
+ * r = lists::concatenate_list_elements(l);
+ * r is [ {1, 2, 3, 4, 5}, {6, 7, 8, 9} ]
+ * @endcode
+ *
+ * @throws cudf::logic_error if the input column is not at least two-level depth lists column (i.e.,
+ * each row must be a list of list).
+ * @throws cudf::logic_error if the input lists column contains nested typed entries that are not
+ * lists.
+ *
+ * @param input The lists column containing lists of list elements to concatenate.
+ * @param null_policy The parameter to specify whether a null list element will be ignored from
+ * concatenation, or any concatenation involving a null element will result in a null list.
+ * @param mr Device memory resource used to allocate the returned column's device memory.
+ * @return A new column in which each row is a list resulted from concatenating all list elements in
+ * the corresponding row of the input lists column.
+ */
+ std::unique_ptr<column> concatenate_list_elements(
+ column_view const& input,
+ concatenate_null_policy null_policy = concatenate_null_policy::IGNORE,
+ rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());
+
/** @} */ // end of group
} // namespace lists
} // namespace cudf
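
The documented behaviour of `concatenate_list_elements` can be modelled on the host with standard containers. The following is a sketch of the semantics only, not the libcudf implementation. `list_t`, `row_t`, `null_policy` and `concatenate_list_elements_model` are hypothetical names, `std::nullopt` stands in for a null list element, and edge cases such as null rows or rows made up entirely of nulls are not modelled.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

using list_t = std::optional<std::vector<int>>;  // nullopt models a null list element
using row_t  = std::vector<list_t>;              // one input row: a list of lists

enum class null_policy { IGNORE, NULLIFY_OUTPUT_ROW };

// Flattens each row's inner lists into one list per row.
// IGNORE skips null inner lists; NULLIFY_OUTPUT_ROW turns the whole row null
// as soon as one null inner list is seen.
std::vector<list_t> concatenate_list_elements_model(std::vector<row_t> const& input,
                                                    null_policy policy = null_policy::IGNORE)
{
  std::vector<list_t> output;
  output.reserve(input.size());
  for (auto const& row : input) {
    std::vector<int> merged;
    bool nullify = false;
    for (auto const& element : row) {
      if (element.has_value()) {
        merged.insert(merged.end(), element->begin(), element->end());
      } else if (policy == null_policy::NULLIFY_OUTPUT_ROW) {
        nullify = true;
        break;
      }
    }
    if (nullify) {
      output.push_back(std::nullopt);
    } else {
      output.push_back(std::move(merged));
    }
  }
  return output;
}
```

Applied to the rows from the `@code{.pseudo}` example in the header, this model produces `{1, 2, 3, 4, 5}` and `{6, 7, 8, 9}`, matching the documented result.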
49 changes: 49 additions & 0 deletions cpp/include/cudf/lists/detail/combine.hpp
@@ -0,0 +1,49 @@
/*
* Copyright (c) 2021, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <cudf/column/column.hpp>
#include <cudf/lists/combine.hpp>
#include <cudf/lists/lists_column_view.hpp>

namespace cudf {
namespace lists {
namespace detail {
/**
* @copydoc cudf::lists::concatenate_rows
*
* @param stream CUDA stream used for device memory operations and kernel launches.
*/
std::unique_ptr<column> concatenate_rows(
table_view const& input,
concatenate_null_policy null_policy,
rmm::cuda_stream_view stream,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @copydoc cudf::lists::concatenate_list_elements
*
* @param stream CUDA stream used for device memory operations and kernel launches.
*/
std::unique_ptr<column> concatenate_list_elements(
column_view const& input,
concatenate_null_policy null_policy,
rmm::cuda_stream_view stream,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

} // namespace detail
} // namespace lists
} // namespace cudf
17 changes: 1 addition & 16 deletions cpp/include/cudf/lists/detail/copying.hpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2020, NVIDIA CORPORATION.
+ * Copyright (c) 2020-2021, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -48,21 +48,6 @@ std::unique_ptr<cudf::column> copy_slice(lists_column_view const& lists,
rmm::cuda_stream_view stream,
rmm::mr::device_memory_resource* mr);

- /**
- * @brief Create a single-level empty lists column.
- *
- * An empty lists column contains empty children so the column's
- * basic type is recorded.
- *
- * @param child_type The type used for the child column.
- * @param stream CUDA stream used for device memory operations and kernel launches
- * @param mr Device memory resource used to allocate the returned column's device memory.
- * @return New empty lists column.
- */
- std::unique_ptr<cudf::column> make_empty_lists_column(data_type child_type,
- rmm::cuda_stream_view stream,
- rmm::mr::device_memory_resource* mr);
-
} // namespace detail
} // namespace lists
} // namespace cudf
5 changes: 1 addition & 4 deletions cpp/include/cudf/lists/detail/scatter.cuh
@@ -526,10 +526,7 @@ struct list_child_constructor {

if (num_child_rows == 0) {
// make an empty lists column using the input child type
- return make_empty_lists_column(
- source_lists_column_view.child().child(lists_column_view::child_column_index).type(),
- stream,
- mr);
+ return empty_like(source_lists_column_view.child());
}

auto child_list_views = rmm::device_uvector<unbound_list_view>(num_child_rows, stream, mr);
42 changes: 42 additions & 0 deletions cpp/include/cudf/lists/lists_column_factories.hpp
@@ -0,0 +1,42 @@
/*
* Copyright (c) 2021, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <cudf/column/column.hpp>
#include <cudf/scalar/scalar.hpp>
#include <cudf/types.hpp>

namespace cudf {
namespace lists {
namespace detail {

/**
* @brief Internal API to construct a lists column from a `list_scalar`, for public
* use, use `cudf::make_column_from_scalar`.
*
* @param[in] value The `list_scalar` to construct from
* @param[in] size The number of rows for the output column.
* @param[in] stream CUDA stream used for device memory operations and kernel launches.
* @param[in] mr Device memory resource used to allocate the returned column's device memory.
*/
std::unique_ptr<cudf::column> make_lists_column_from_scalar(
list_scalar const& value,
size_type size,
rmm::cuda_stream_view stream = rmm::cuda_stream_default,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

} // namespace detail
} // namespace lists
} // namespace cudf
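
As a rough host-side illustration of what building a lists column from a `list_scalar` means, the sketch below repeats one list in every row. It is not the libcudf implementation. `list_scalar_model`, `lists_column_model` and `make_lists_column_from_scalar_model` are hypothetical names, and it reads the `make_column_from_scalar` documentation as saying that an invalid scalar yields all-null rows and that `size == 0` yields an empty column.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// nullopt models an invalid (null) list scalar; the payload models its elements.
using list_scalar_model = std::optional<std::vector<int>>;
// Each row of the modelled column is either a list or null.
using lists_column_model = std::vector<std::optional<std::vector<int>>>;

// Repeats `value` in each of `size` rows:
//   size == 0      -> empty column
//   invalid scalar -> every row is null
//   valid scalar   -> every row holds a copy of the scalar's list
lists_column_model make_lists_column_from_scalar_model(list_scalar_model const& value,
                                                       std::size_t size)
{
  return lists_column_model(size, value);
}
```

The model deliberately ignores nested child types, device memory and streams, which the real factory has to handle.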