Use fused types for overloaded function signatures #14969

vyasr · 2024-02-05T18:00:52Z

Description

This change makes the pylibcudf API more convenient and a more faithful reproduction of the underlying libcudf APIs that offer overloaded signatures. In cases like binary ops where we were previously using runtime instance checks, this change also removes unnecessary runtime overhead if the calling code is Cython since in those cases the types at the call site are known at compile time.

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

shwina

Nice! I believe this is the first use of fused types in our Cython

python/cudf/cudf/_lib/pylibcudf/copying.pxd

vyasr · 2024-02-05T19:46:45Z

/merge

bdice · 2024-02-05T19:45:26Z

python/cudf/cudf/_lib/copying.pyx

-            pylibcudf.Table([col.to_pylibcudf(mode="read") for col in target_columns]),
-        )
+    tbl = pylibcudf.copying.scatter(
+        pylibcudf.Table([col.to_pylibcudf(mode="read") for col in sources])


Can we pull out the line that defines the table/scalar here and declare a separate variable for it? It is hard to read a three-line expression that just supplies one argument.

Yeah that's fair. TBH I'm not too concerned about how legacy Cython code looks right now because once we've completed pylibcudf we'll need to rewrite cudf internals substantially to leverage pylibcudf properly, so all of this will go away.

Would pulling that out into a variable be able to avoid the conditional cdef-ing problem? Why/why not?

I'm confused, was this question meant for the other thread?

No, I'm trying to ask a related question here. I realize now it's because this is def and the other is cpdef.

Ah right that's why I was confused, now I see. You're on the right track; it's less about this function being a def or a cpdef, and more about what it's calling. We're calling pylibcudf as a Python API here, so it accepts arbitrary objects as input and we're not cdefing them. In the other places we have to cdef the inputs because we are calling cdef functions (C++ cdef extern functions) that require strongly typed input.

bdice · 2024-02-05T19:48:19Z

python/cudf/cudf/_lib/pylibcudf/binaryop.pyx

-                    (<Column> lhs).view(),
-                    dereference((<Scalar> rhs).c_obj),
+                    lhs.view(),
+                    dereference(rhs.c_obj),


Is it possible to evaluate the lhs/rhs objects ahead of time and reduce the dispatching logic here?

I'm assuming that what you're envisioning is something like (pseudocode):

if lhs is column and rhs is column: c_lhs = lhs.view() c_rhs = rhs.view() elif lhs is column and rhs is scalar: c_lhs = lhs.view() c_rhs = dereference(rhs.c_obj) ... result = move(cpp_binaryop(c_lhs, c_rhs, ...)) ...

Is that what you're thinking?

If so, unfortunately no that doesn't work. There are two reasons:

cdefs can't go inside conditionals. That includes conditionals that are compile-time branches based on fused types. Therefore, you'd still need separate cdefs for c_lhs_scalar and c_lhs_column, for example, then you'd assign them inside the conditional block. That's at least as verbose, but one could perhaps argue that it's more DRY. However:

The C++ function call needs to resolve to the correct overload, so you have to pass the appropriately-typed (known at compile-time) objects to calling cpp_binaryop. So no matter what you need to invoke the function separately for each possible overload.

Ultimately the translation from Cython types (fused or not) to C++ types remains a point of friction.

Let me know if that answered your question or if you meant something else.

Cool. That was my expectation but I didn't want to miss out on anything special if fused types would allow this.

bdice · 2024-02-05T19:50:06Z

python/cudf/cudf/_lib/pylibcudf/copying.pyx

-            cpp_copying.empty_like(
-                (<Column> input).view(),
+    else:
+        source_scalars = _as_vector(source)


As above, is it possible to reduce this dispatching logic when using fused types?

vyasr added 3 commits February 4, 2024 08:42

Implement binaryop using fused types

ba50dae

Use fused types for copying APIs

982e78d

Document use of fused types

4f7ca1e

vyasr added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Feb 5, 2024

vyasr self-assigned this Feb 5, 2024

vyasr requested a review from a team as a code owner February 5, 2024 18:00

vyasr requested review from bdice and isVoid February 5, 2024 18:00

github-actions bot added the Python Affects Python cuDF API. label Feb 5, 2024

shwina approved these changes Feb 5, 2024

View reviewed changes

python/cudf/cudf/_lib/pylibcudf/copying.pxd Show resolved Hide resolved

rapids-bot bot merged commit dfc7f25 into rapidsai:branch-24.04 Feb 5, 2024
68 checks passed

vyasr deleted the feat/pylibcudf_fused_types branch February 5, 2024 19:46

bdice reviewed Feb 5, 2024

View reviewed changes

vyasr added the pylibcudf Issues specific to the pylibcudf package label May 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use fused types for overloaded function signatures #14969

Use fused types for overloaded function signatures #14969

vyasr commented Feb 5, 2024

shwina left a comment

vyasr commented Feb 5, 2024

bdice Feb 5, 2024

vyasr Feb 5, 2024

bdice Feb 5, 2024 •

edited

Loading

vyasr Feb 5, 2024

bdice Feb 5, 2024

vyasr Feb 5, 2024

bdice Feb 5, 2024

vyasr Feb 5, 2024

bdice Feb 5, 2024

bdice Feb 5, 2024

Use fused types for overloaded function signatures #14969

Use fused types for overloaded function signatures #14969

Conversation

vyasr commented Feb 5, 2024

Description

Checklist

shwina left a comment

Choose a reason for hiding this comment

vyasr commented Feb 5, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bdice Feb 5, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bdice Feb 5, 2024 •

edited

Loading