Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix calls to deprecated strings factory API #14771

Merged

Conversation

davidwendt
Copy link
Contributor

Description

Fixes deprecation warnings introduced when #14202 merged.
Most of these are for calls to cudf::make_strings_column which deprecated the chars-column function overload.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added bug Something isn't working 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change labels Jan 17, 2024
@davidwendt davidwendt self-assigned this Jan 17, 2024
@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Jan 18, 2024
@davidwendt davidwendt marked this pull request as ready for review January 18, 2024 13:04
@davidwendt davidwendt requested a review from a team as a code owner January 18, 2024 13:04
@bdice
Copy link
Contributor

bdice commented Jan 18, 2024

@davidwendt Is there any planned refactoring for make_strings_children? It seems like we would need the chars column that it returns to be a raw buffer, to eliminate the size_type limitation on that column. Is that correct?

@davidwendt
Copy link
Contributor Author

@davidwendt Is there any planned refactoring for make_strings_children? It seems like we would need the chars column that it returns to be a raw buffer, to eliminate the size_type limitation on that column. Is that correct?

Yes, that is correct.
My plan is to make a new make_strings_children in an experimental namespace so we can make changes to callers incrementally.

jb, num_rows, cudf::get_default_stream(), rmm::mr::get_current_device_resource());
return cudf::make_strings_column(
num_rows, std::move(children.first), std::move(children.second), 0, {});
num_rows, std::move(offsets), std::move(chars->release().data.release()[0]), 0, {});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repeating pattern chars->release().data.release()[0] throughout this PR is a bit awkward. This relates to my comment about refactoring make_strings_children to return a buffer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully the awkwardness is only temporary once the new utility is created.

auto chars_column = strings::detail::create_chars_child_column(bytes, stream, mr);
// merge the strings
auto d_chars = chars_column->mutable_view().template data<char>();
rmm::device_uvector<char> chars(bytes, stream, mr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we prefer a device_uvector instead of a raw buffer here, given that we want to hold character buffers at the end? (This pattern reoccurs.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further thought, maybe it's just nicer to have the typed container than a raw buffer. It plays well with our sync/async vector factories and the .release() is fairly easy to deal with when you're ready for a buffer. I'm okay with keeping this, just interested in hearing your thoughts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typing aids in handling the remaining code that would have needed casting. Additionally, the inline call to release() doesn't necessitate a call to std::move.

cpp/include/cudf_test/column_wrapper.hpp Outdated Show resolved Hide resolved
std::get<0>(cudf::test::to_host<cudf::size_type>(str_col.offsets())));
col_data.emplace_back(std::get<0>(cudf::test::to_host<int8_t>(str_col.chars())));
}

Copy link
Contributor Author

@davidwendt davidwendt Jan 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code was not involved in verifying the results.
I removed it because it was calling the deprecated chars() function.

@karthikeyann
Copy link
Contributor

These deprecated warnings didn't show up, while compiling with rapids-compose. @trxcllnt
I started using dev container now, which shows these deprecated warnings.

Copy link
Contributor

@karthikeyann karthikeyann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rmm::device_uvector<char> chars(size_t, stream, mr) pattern works well (although I would prefer rmm buffer and chars_data = static_cast<char*>(buffer.data()), and a std::move). But this is fine.

could we update cudf::strings::detail::make_strings_children to return pair of column and rmm::device_buffer ? (perhaps, should this be part of another PR?). This should help us avoid the std::move(chars_column->release().data.release()[0]).

@davidwendt
Copy link
Contributor Author

could we update cudf::strings::detail::make_strings_children to return pair of column and rmm::device_buffer ? (perhaps, should this be part of another PR?). This should help us avoid the std::move(chars_column->release().data.release()[0]).

Yes, a separate PR. See these comments/responses:
#14771 (comment)
#14771 (comment)

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. Thanks for explaining the next steps a bit more. I like that plan.

@davidwendt
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 4317313 into rapidsai:branch-24.02 Jan 19, 2024
68 checks passed
@davidwendt davidwendt deleted the fix-dep-strs-factory-calls branch January 19, 2024 14:32
rapids-bot bot pushed a commit that referenced this pull request Jan 24, 2024
Follow-up PR to #14771.

I noticed the strings example code still had a deprecated function call:

```
-- Build files have been written to: /opt/conda/conda-bld/work/cpp/examples/strings/build
[1/8] Building CXX object CMakeFiles/libcudf_apis.dir/libcudf_apis.cpp.o
[2/8] Linking CXX executable libcudf_apis
[3/8] Building CUDA object CMakeFiles/custom_prealloc.dir/custom_prealloc.cu.o
[4/8] Building CUDA object CMakeFiles/custom_with_malloc.dir/custom_with_malloc.cu.o
[5/8] Linking CUDA executable custom_prealloc
[6/8] Linking CUDA executable custom_with_malloc
[7/8] Building CUDA object CMakeFiles/custom_optimized.dir/custom_optimized.cu.o
/opt/conda/conda-bld/work/cpp/examples/strings/custom_optimized.cu: In function 'std::unique_ptr<cudf::column> redact_strings(const cudf::column_view&, const cudf::column_view&)':
/opt/conda/conda-bld/work/cpp/examples/strings/custom_optimized.cu:158:40: warning: 'std::unique_ptr<cudf::column> cudf::make_strings_column(cudf::size_type, rmm::device_uvector<int>&&, rmm::device_uvector<char>&&, rmm::device_buffer&&, cudf::size_type)' is deprecated [-Wdeprecated-declarations]
  158 |   auto result =
      |               ~                        ^                                                          
/opt/conda/conda-bld/work/cpp/include/cudf/column/column_factories.hpp:510:42: note: declared here
  510 | [[deprecated]] std::unique_ptr<column> make_strings_column(size_type num_strings,
      |                                          ^~~~~~~~~~~~~~~~~~~
[8/8] Linking CUDA executable custom_optimized
```

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - David Wendt (https://github.com/davidwendt)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Mark Harris (https://github.com/harrism)

URL: #14838
PointKernel pushed a commit to PointKernel/cudf that referenced this pull request Jan 25, 2024
)

Follow-up PR to rapidsai#14771.

I noticed the strings example code still had a deprecated function call:

```
-- Build files have been written to: /opt/conda/conda-bld/work/cpp/examples/strings/build
[1/8] Building CXX object CMakeFiles/libcudf_apis.dir/libcudf_apis.cpp.o
[2/8] Linking CXX executable libcudf_apis
[3/8] Building CUDA object CMakeFiles/custom_prealloc.dir/custom_prealloc.cu.o
[4/8] Building CUDA object CMakeFiles/custom_with_malloc.dir/custom_with_malloc.cu.o
[5/8] Linking CUDA executable custom_prealloc
[6/8] Linking CUDA executable custom_with_malloc
[7/8] Building CUDA object CMakeFiles/custom_optimized.dir/custom_optimized.cu.o
/opt/conda/conda-bld/work/cpp/examples/strings/custom_optimized.cu: In function 'std::unique_ptr<cudf::column> redact_strings(const cudf::column_view&, const cudf::column_view&)':
/opt/conda/conda-bld/work/cpp/examples/strings/custom_optimized.cu:158:40: warning: 'std::unique_ptr<cudf::column> cudf::make_strings_column(cudf::size_type, rmm::device_uvector<int>&&, rmm::device_uvector<char>&&, rmm::device_buffer&&, cudf::size_type)' is deprecated [-Wdeprecated-declarations]
  158 |   auto result =
      |               ~                        ^                                                          
/opt/conda/conda-bld/work/cpp/include/cudf/column/column_factories.hpp:510:42: note: declared here
  510 | [[deprecated]] std::unique_ptr<column> make_strings_column(size_type num_strings,
      |                                          ^~~~~~~~~~~~~~~~~~~
[8/8] Linking CUDA executable custom_optimized
```

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - David Wendt (https://github.com/davidwendt)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Mark Harris (https://github.com/harrism)

URL: rapidsai#14838
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants