Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing the FrameProxyObject workaround #751

Merged
merged 4 commits into from
Oct 11, 2021

Conversation

madsbk
Copy link
Member

@madsbk madsbk commented Oct 7, 2021

Adding testing of cudf dataframe copy, which fails prior to rapidsai/cudf#9397

UPDATE: This PR now handles the jit-unspill failures we have been seeing with cudf v21.12.

Some background, when we introduced ProxyObject, cudf was using cudf._lib.table.Table as the base class for all frame-like objects (dataframes, series, etc). Table was implemented in cython so our proxy object had to derive from Table in order to support cython functions. The result was to use the class FrameProxyObject as a workaround:

    # In order to support the cuDF API implemented in Cython, we inherit from
    # `cudf._lib.table.Table`, which is the base class of Index, Series, and
    # Dataframes in cuDF.
    # Notice, the order of base classes matters. Since ProxyObject is the first
    # base class, ProxyObject.__init__() is called on creation, which doesn't
    # define the Table._data and Table._index attributes. Thus, accessing
    # FrameProxyObject._data and FrameProxyObject._index is pass-through to
    # ProxyObejct.__getattr__(), which is what we want.
    class FrameProxyObject(ProxyObject, cudf._lib.table.Table):
        pass

Now that cudf uses Frame, a pure Python class, we don't need this workaround anymore. In fact, using FrameProxyObject triggers some infinite recursion errors when used together with v21.12+

cc. @randerzander

@madsbk madsbk requested a review from a team as a code owner October 7, 2021 08:06
@github-actions github-actions bot added the python python code needed label Oct 7, 2021
@madsbk madsbk added 3 - Ready for Review Ready for review by team improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 7, 2021
Copy link
Member

@pentschev pentschev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @madsbk !

@pentschev
Copy link
Member

Actually the failure is legit @madsbk , seems like it's an infinite recursion.

@madsbk
Copy link
Member Author

madsbk commented Oct 7, 2021

Actually the failure is legit @madsbk , seems like it's an infinite recursion.

Yes, that is because rapidsai/cudf#9397 hasn't been merged yet.

@madsbk madsbk added 2 - In Progress Currently a work in progress and removed 3 - Ready for Review Ready for review by team labels Oct 7, 2021
@madsbk
Copy link
Member Author

madsbk commented Oct 7, 2021

I will work a bit more on this PR, maybe we can remove the need of the FrameProxyObject completely.

@codecov-commenter
Copy link

codecov-commenter commented Oct 7, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.12@7a426e8). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff               @@
##             branch-21.12     #751   +/-   ##
===============================================
  Coverage                ?   62.93%           
===============================================
  Files                   ?       21           
  Lines                   ?     2898           
  Branches                ?        0           
===============================================
  Hits                    ?     1824           
  Misses                  ?     1074           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7a426e8...f3d7604. Read the comment docs.

Now that cuDF doesn't use the cython implemented Table class,
we don't need the specialized FrameProxyObject class anymore.
We can simpy use ProxyObject everywhere.
@madsbk madsbk changed the title Adding test_cudf_copy Removing the FrameProxyObject workaround Oct 7, 2021
@madsbk madsbk requested a review from pentschev October 7, 2021 13:40
@pentschev
Copy link
Member

centos7 in queue for over 4 hours.

rerun tests

@pentschev
Copy link
Member

rerun tests

1 similar comment
@pentschev
Copy link
Member

rerun tests

@pentschev
Copy link
Member

Seems like the errors are waiting for dask/distributed#5380 to be merged.

@madsbk madsbk added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 2 - In Progress Currently a work in progress labels Oct 8, 2021
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this pull request Oct 8, 2021
This PR address the asymmetry in the following line in `Frame.copy()`:
```python
new_frame = self.__class__.__new__(type(self))
```
The problem is that `self.__class__` might not equal `type(self)`, which is the case for proxy objects like the one used in Dask-CUDA.
Related Dask-CUDA PR: rapidsai/dask-cuda#751

cc. @randerzander

Authors:
  - Mads R. B. Kristensen (https://github.com/madsbk)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #9397
@pentschev
Copy link
Member

rerun tests

@madsbk
Copy link
Member Author

madsbk commented Oct 11, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit bf4e00f into rapidsai:branch-21.12 Oct 11, 2021
@madsbk madsbk deleted the test_cudf_copy branch October 11, 2021 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge improvement Improvement / enhancement to an existing function non-breaking Non-breaking change python python code needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants