Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate round to pylibcudf #15863

Merged
merged 13 commits into from
May 31, 2024
Merged

Conversation

lithomas1
Copy link
Contributor

@lithomas1 lithomas1 commented May 24, 2024

Description

xref #15162

Migrate round.pxd to use pylibcudf APIs.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added Python Affects Python cuDF API. CMake CMake build issue labels May 24, 2024
@lithomas1 lithomas1 added feature request New feature or request non-breaking Non-breaking change labels May 24, 2024
@lithomas1 lithomas1 marked this pull request as ready for review May 28, 2024 15:14
@lithomas1 lithomas1 requested a review from a team as a code owner May 28, 2024 15:14
@@ -9,9 +9,9 @@ from cudf._lib.pylibcudf.libcudf.column.column_view cimport column_view

cdef extern from "cudf/round.hpp" namespace "cudf" nogil:

ctypedef enum rounding_method "cudf::rounding_method":
HALF_UP "cudf::rounding_method::HALF_UP"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the strings after the enum values mean anything (e.g. "cudf::rounding_method::HALF_UP") or are they purely cosmetic?

(asking since I don't want to unintentionally change behavior by changing this here)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strings are meaningful, but removing them is the right answer. The strings are a way to alias the names of Cython objects so that they appear differently in generated C code. IOW when you write cdef rounding_method x = HALF_UP in Cython code, that will translate to C code that says enum rounding_method X = cudf::rounding_method::HALF_UP. The reason that we previously need this in enums in our Cython is because prior to version 3.0 Cython did not support C++ scoped enumerations (enum class), but all of our C++ enums are defined that way, so the alias was the only way to trick Cython into generating the code that we needed (with raw C enums you don't need to do cudf::rounding_method::HALF_UP, you can just do cudf::HALF_UP because the enum doesn't create it's own namespace). By switching to using an enum class here we remove the need for that.

@lithomas1 lithomas1 requested review from vyasr and removed request for mroeschke May 28, 2024 23:12
@vyasr vyasr added the pylibcudf Issues specific to the pylibcudf package label May 28, 2024
vyasr
vyasr previously requested changes May 29, 2024
@@ -9,9 +9,9 @@ from cudf._lib.pylibcudf.libcudf.column.column_view cimport column_view

cdef extern from "cudf/round.hpp" namespace "cudf" nogil:

ctypedef enum rounding_method "cudf::rounding_method":
HALF_UP "cudf::rounding_method::HALF_UP"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The strings are meaningful, but removing them is the right answer. The strings are a way to alias the names of Cython objects so that they appear differently in generated C code. IOW when you write cdef rounding_method x = HALF_UP in Cython code, that will translate to C code that says enum rounding_method X = cudf::rounding_method::HALF_UP. The reason that we previously need this in enums in our Cython is because prior to version 3.0 Cython did not support C++ scoped enumerations (enum class), but all of our C++ enums are defined that way, so the alias was the only way to trick Cython into generating the code that we needed (with raw C enums you don't need to do cudf::rounding_method::HALF_UP, you can just do cudf::HALF_UP because the enum doesn't create it's own namespace). By switching to using an enum class here we remove the need for that.

round as cpp_round,
rounding_method as cpp_rounding_method,
)
from cudf._lib.pylibcudf.round cimport rounding_method
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general rule we want to avoid cimporting anything from pylibcudf into cudf._lib. The goal is for cudf to become pure Python, or as close to it as possible (we may still use Cython, but primarily for cudf-specific internal optimizations; calls to the libcudf APIs should be done by treating pylibcudf as a pure Python API).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Followup:
Is there still a plan for a Cython API to pylibcudf then?
(or are the pxd's for internal pylibcudf use only)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I still want the Cython API, but it's a lower priority. The Cython API will probably end up used in specific parts of cudf internals where we might actually see performance benefits, but in general I expect the bulk of usage to be of the Python API. The exceptions I foresee are cases like groupby or joins where generating many groupby/agg objects from cuDF can be accelerated using the pylibcudf Cython API.

python/cudf/cudf/_lib/round.pyx Outdated Show resolved Hide resolved
@lithomas1 lithomas1 mentioned this pull request May 29, 2024
3 tasks
@lithomas1 lithomas1 requested a review from vyasr May 29, 2024 23:34
Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@lithomas1 lithomas1 added the 5 - DO NOT MERGE Hold off on merging; see PR for details label May 30, 2024
@lithomas1
Copy link
Contributor Author

lithomas1 commented May 30, 2024

I need to update the tests to swap the order of result and expected.

EDIT: Done

@lithomas1 lithomas1 removed the 5 - DO NOT MERGE Hold off on merging; see PR for details label May 30, 2024
Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@wence- wence- dismissed vyasr’s stale review May 31, 2024 15:14

Issues addressed

@wence-
Copy link
Contributor

wence- commented May 31, 2024

/merge

@rapids-bot rapids-bot bot merged commit e7be142 into rapidsai:branch-24.08 May 31, 2024
70 checks passed
@lithomas1 lithomas1 deleted the pylibcudf-round branch May 31, 2024 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake CMake build issue feature request New feature or request non-breaking Non-breaking change pylibcudf Issues specific to the pylibcudf package Python Affects Python cuDF API.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants