Implement strings::repeat_strings #8423

Merged Jun 9, 2021 (30 commits)

Commits
b44caba
Add skeleton for the new API
ttnghia May 28, 2021
1797d9d
Rename folder `copying` to `copy`
ttnghia May 31, 2021
980b52f
Rewrite doxygen
ttnghia May 31, 2021
925ff3f
Finish a draft implementation
ttnghia Jun 1, 2021
43d1bce
Add some unit tests
ttnghia Jun 1, 2021
1675b17
Rewrite doxygen
ttnghia Jun 1, 2021
186867c
Handle errors
ttnghia Jun 1, 2021
b674815
Complete unit tests
ttnghia Jun 1, 2021
db2dc83
Merge branch 'branch-21.08' into strings_repeat
ttnghia Jun 1, 2021
579dd15
Add a throw test
ttnghia Jun 1, 2021
078efc2
Fix copyright header
ttnghia Jun 1, 2021
c57056d
Rename files
ttnghia Jun 2, 2021
b17f14c
Rename `repeat_join` to `repeat_strings`
ttnghia Jun 2, 2021
98e5180
Some optimizations
ttnghia Jun 2, 2021
ec6f2ab
Using `stream` parameter for calling `scalar::is_valid`
ttnghia Jun 3, 2021
4aace60
Remove error bound check and add doxygen
ttnghia Jun 4, 2021
d4cbf8b
Merge branch 'branch-21.08' into strings_repeat
ttnghia Jun 4, 2021
77e95cc
Fix doxygen
ttnghia Jun 4, 2021
4b34d63
Fix typo
ttnghia Jun 4, 2021
41687a6
Move `src/strings/copy` back to `src/strings/copying`
ttnghia Jun 4, 2021
3eec00c
Rename `copy.hpp` to `repeat_strings.hpp`
ttnghia Jun 4, 2021
3ffd253
Change order in CMakeLists.txt
ttnghia Jun 4, 2021
80a5801
Fix typo
ttnghia Jun 4, 2021
4805344
Rename test file
ttnghia Jun 4, 2021
f8b93f2
Address review comments
ttnghia Jun 7, 2021
fbce944
Merge remote-tracking branch 'origin/branch-21.08' into strings_repeat
ttnghia Jun 8, 2021
b684f1e
Resolve merge conflicts
ttnghia Jun 8, 2021
123a551
Use `copy_bitmask` without null check
ttnghia Jun 8, 2021
105d630
Fix doxygen, change return type for the scalar version, and fix tests
ttnghia Jun 8, 2021
45df559
Reorder headers in `meta.yaml`
ttnghia Jun 8, 2021
Add skeleton for the new API
ttnghia committed May 30, 2021
commit b44cabae8ab690eac871b1b21a1eb43ea705d4be
1 change: 1 addition & 0 deletions conda/recipes/libcudf/meta.yaml
@@ -180,6 +180,7 @@ test:
- test -f $PREFIX/include/cudf/strings/convert/convert_integers.hpp
- test -f $PREFIX/include/cudf/strings/convert/convert_ipv4.hpp
- test -f $PREFIX/include/cudf/strings/convert/convert_urls.hpp
- test -f $PREFIX/include/cudf/strings/copy.hpp
- test -f $PREFIX/include/cudf/strings/detail/combine.hpp
- test -f $PREFIX/include/cudf/strings/detail/concatenate.hpp
- test -f $PREFIX/include/cudf/strings/detail/converters.hpp
1 change: 1 addition & 0 deletions cpp/CMakeLists.txt
@@ -349,6 +349,7 @@ add_library(cudf
src/strings/convert/convert_urls.cu
src/strings/copying/concatenate.cu
src/strings/copying/copying.cu
src/strings/copying/repeat.cu
src/strings/extract.cu
src/strings/filling/fill.cu
src/strings/filter_chars.cu
81 changes: 81 additions & 0 deletions cpp/include/cudf/strings/copy.hpp
@@ -0,0 +1,81 @@
/*
* Copyright (c) 2021, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <cudf/scalar/scalar.hpp>
#include <cudf/strings/strings_column_view.hpp>

namespace cudf {
namespace strings {
/**
* @addtogroup strings_copy
* @{
* @file strings/copy.hpp
* @brief Strings APIs for copying
*/

/**
* @brief Repeats the given string scalar a given number of times and
* returns the result in a strings column.
*
* @code{.pseudo}
* Example:
* s = 'abc-'
* out = repeat(s, 3)
* out holds 'abc-abc-abc-'
* @endcode
*
* @note Skeleton declaration: the handling of an invalid (null) scalar and of a
* non-positive @p repeat_times is still to be specified.
*
* @param string The string scalar to repeat.
* @param repeat_times Number of times to repeat the input string.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return New column with the repeated string.
*/
std::unique_ptr<column> repeat(
string_scalar const& string,
size_type repeat_times,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
* @brief Repeats each string in the given strings column a given number of
* times and returns the results in a new strings column.
*
* @code{.pseudo}
* Example:
* c0 = ['aa', null, '', 'ee', null, 'ff']
* out = repeat(c0, 2)
* out is ['aaaa', null, '', 'eeee', null, 'ffff']
*
* @endcode
*
* @note Skeleton declaration: null rows are expected to stay null; the handling
* of a non-positive @p repeat_times is still to be specified.
*
* @param strings_column Strings column whose rows are repeated.
* @param repeat_times Number of times to repeat each string.
* @param mr Device memory resource used to allocate the returned column's device memory.
* @return New column with the repeated strings.
*/
std::unique_ptr<column> repeat(
strings_column_view const& strings_column,
size_type repeat_times,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/** @} */ // end of doxygen group
} // namespace strings
} // namespace cudf
1 change: 1 addition & 0 deletions cpp/include/doxygen_groups.h
@@ -122,6 +122,7 @@
* @defgroup strings_combine Combining
* @defgroup strings_contains Searching
* @defgroup strings_convert Converting
* @defgroup strings_copy Copying
* @defgroup strings_substring Substring
* @defgroup strings_find Finding
* @defgroup strings_modify Modifying
Empty file.
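
The two `repeat` overloads declared in `copy.hpp` are only a skeleton at this commit, so the intended row-wise behavior may be easier to review against a small host-side reference. The sketch below is not the libcudf implementation and uses no cudf types: `host_strings`, `repeat_one`, and `repeat_rows` are hypothetical names introduced here for illustration. It assumes that a null row stays null and that a non-positive `repeat_times` yields an empty string; neither rule is fixed by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical host-side stand-in for a nullable strings column:
// std::nullopt models a null row.
using host_strings = std::vector<std::optional<std::string>>;

// Repeats one string `repeat_times` times.
// Assumption: a non-positive count produces an empty string.
std::string repeat_one(std::string const& s, int repeat_times)
{
  std::string out;
  if (repeat_times <= 0) { return out; }
  out.reserve(s.size() * static_cast<std::size_t>(repeat_times));
  for (int i = 0; i < repeat_times; ++i) { out += s; }
  return out;
}

// Repeats every row of `input` `repeat_times` times.
// Assumption: null rows are passed through as null.
host_strings repeat_rows(host_strings const& input, int repeat_times)
{
  host_strings out;
  out.reserve(input.size());
  for (auto const& row : input) {
    if (row.has_value()) {
      out.push_back(repeat_one(*row, repeat_times));
    } else {
      out.push_back(std::nullopt);
    }
  }
  return out;
}

// Small self-check mirroring the doxygen-style examples.
void self_check()
{
  assert(repeat_one("abc-", 3) == "abc-abc-abc-");
  host_strings const in{std::string{"aa"}, std::nullopt, std::string{}};
  auto const out = repeat_rows(in, 2);
  assert(out[0].has_value() && *out[0] == "aaaa");
  assert(!out[1].has_value());
  assert(out[2].has_value() && out[2]->empty());
}
```

A reference like this could back the unit tests added later in the PR: expected columns can be computed on the host and compared against the device result, whatever null and bounds rules the final API settles on.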