Fix build from changes in rapidsai/cudf#8528 #427

Closed

@cwharris cwharris commented Jun 26, 2021

rapidsai/cudf#8528 introduced some changes that broke the cuspatial build. Specifically, cudf.DataFrame._from_data was changed to a @classmethod; cudf no longer accidentally treats _from_data as an instance method, which is bad behavior cuspatial was depending on. To fix this, adjustments need to be made to cuspatial.GeoDataFrame and some associated utilities.

PR is WIP, but so far fixes all but one failing test.
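
For readers unfamiliar with this failure mode, here is a minimal sketch of how a subclass that overrides a classmethod hook with an instance-method signature can break once the base class starts calling the hook on the class. The classes `Frame`, `BrokenGeoFrame`, and `FixedGeoFrame` and the helper `copy_fails` are hypothetical stand-ins, not cudf or cuspatial code, and the exact mechanism in cudf may differ.

```python
class Frame:
    """Hypothetical stand-in for a base frame class (not cudf's real code)."""

    def __init__(self, data):
        self._data = dict(data)

    @classmethod
    def _from_data(cls, data):
        # Build an instance of whichever class the hook was called on.
        out = cls.__new__(cls)
        out._data = dict(data)
        return out

    def copy(self):
        # The base class calls the hook on the class, not on an instance.
        return type(self)._from_data(self._data)


class BrokenGeoFrame(Frame):
    # Overrides the hook with an instance-method signature. Calling it on
    # the class binds the data dict to `self` and leaves `data` unfilled.
    def _from_data(self, data):
        out = type(self).__new__(type(self))
        out._data = dict(data)
        return out


class FixedGeoFrame(Frame):
    # Keeps the classmethod contract and delegates to the base class.
    @classmethod
    def _from_data(cls, data):
        return super()._from_data(data)


def copy_fails(frame):
    """Return True if frame.copy() raises TypeError."""
    try:
        frame.copy()
    except TypeError:
        return True
    return False
```

In this sketch `copy_fails(BrokenGeoFrame({"x": [1]}))` is true, while `FixedGeoFrame({"x": [1]}).copy()` returns a `FixedGeoFrame`.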

@cwharris cwharris requested a review from thomcom June 26, 2021 21:12
@github-actions github-actions bot added the Python Related to Python code label Jun 26, 2021
@cwharris cwharris added bug Something isn't working cuDF non-breaking Non-breaking change and removed Python Related to Python code labels Jun 26, 2021
@cwharris cwharris requested review from vyasr and shwina June 26, 2021 21:13
out._index = index
if columns is not None:
    out.columns = columns
return out
Contributor

This seems identical to DataFrame._from_data, so maybe we just don't override it at all?

Contributor Author

Yeah, still playing around trying to fix the last CI error, but this probably just goes away.

Contributor

Looking at the constructor of this class, it seems like this method probably needs some logic like

if not isinstance(next(iter(data.values())), GeoColumn):
    # Convert cudf Columns to GeoColumns in `data`
return super()._from_data(data, index, columns)

unless this method is never called with a dict of GeoColumns.
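
A fuller sketch of that suggestion, under stated assumptions: `Column`, `GeoColumn`, `DataFrame`, and `GeoDataFrame` below are simplified stand-ins written for illustration, and wrapping via `GeoColumn(col.values)` is a hypothetical conversion rather than cuspatial's actual constructor. It checks every column instead of only the first one, so mixed or empty inputs do not need special handling.

```python
class Column:
    """Hypothetical plain column holding a list of values."""

    def __init__(self, values):
        self.values = list(values)


class GeoColumn(Column):
    """Hypothetical geometry-aware column."""


class DataFrame:
    @classmethod
    def _from_data(cls, data, index=None, columns=None):
        out = cls.__new__(cls)
        out._data = dict(data)
        out._index = index
        if columns is not None:
            out.columns = columns
        return out


class GeoDataFrame(DataFrame):
    @classmethod
    def _from_data(cls, data, index=None, columns=None):
        # Wrap any plain Column as a GeoColumn, then defer to the base class.
        converted = {
            name: col if isinstance(col, GeoColumn) else GeoColumn(col.values)
            for name, col in data.items()
        }
        return super()._from_data(converted, index, columns)
```

Whether every column should become a `GeoColumn`, or only geometry columns, depends on how the real class stores non-geometry data; this sketch converts all of them for simplicity.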

Comment on lines +115 to +116
@classmethod
def _from_data(cls, new_data, name=None, index=False):
Contributor

👍

@vyasr vyasr left a comment

I was very confused by the original description of this PR since _from_data was always a classmethod :) I see the edit, but perhaps the change you're referring to happened on a different PR of mine? AFAICT the one you're linking to didn't modify the relevant APIs at all.

@github-actions github-actions bot added the Python Related to Python code label Jun 28, 2021
vyasr commented Jul 2, 2021

I'm aiming to post a fix for this issue tomorrow.

@cwharris cwharris closed this Jul 2, 2021
rapids-bot bot pushed a commit that referenced this pull request Jul 2, 2021
…ing (#430)

This PR contains three distinct changes required to get cuspatial builds working and tests passing again:
1. RMM switched to rapids-cmake (rapidsai/rmm#800), which requires CMake 3.20.1, so this PR includes the required updates for that.
2. The Arrow upgrade in cudf also moved the location of testing utilities (rapidsai/cudf#7495). Long term, cuspatial needs to move away from use of the testing utilities, which are not part of cudf's public API, but we are currently blocked by rapidsai/cudf#8646, so this PR just imports the internal `assert_eq` method as a stopgap to get tests passing.
3. The changes in rapidsai/cudf#8373 altered the way that metadata was propagated to libcudf outputs from previously existing cuDF Python objects. The new code paths require cuspatial to override metadata copying at the GeoDataFrame rather than the GeoColumn level, because information about column types is lost in the libcudf round trip and the metadata copying functions are now called on the output DataFrame rather than the input one.
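
As a rough illustration of item 3, the sketch below restores column types at the frame level after a round trip that returns untyped columns. All names here (`Column`, `GeoColumn`, `DataFrame`, `GeoDataFrame`, `round_trip`, and this `_copy_type_metadata` signature) are assumptions made for the example; they are not the cudf or cuspatial implementations.

```python
class Column:
    """Hypothetical plain column holding a list of values."""

    def __init__(self, values):
        self.values = list(values)


class GeoColumn(Column):
    """Hypothetical column subtype whose identity must survive the round trip."""


class DataFrame:
    def __init__(self, data):
        self._data = dict(data)

    def _copy_type_metadata(self, other):
        # Base behavior in this sketch: nothing to restore.
        return self


class GeoDataFrame(DataFrame):
    def _copy_type_metadata(self, other):
        # Called on the OUTPUT frame, with the INPUT frame as `other`.
        for name, source in other._data.items():
            col = self._data.get(name)
            if (
                col is not None
                and isinstance(source, GeoColumn)
                and not isinstance(col, GeoColumn)
            ):
                self._data[name] = GeoColumn(col.values)
        return self


def round_trip(frame):
    """Stand-in for a lower-level call that returns untyped columns."""
    stripped = {name: Column(col.values) for name, col in frame._data.items()}
    out = type(frame)(stripped)
    # The output frame restores its own column types from the input frame.
    return out._copy_type_metadata(frame)
```

Because the output columns come back as plain `Column` objects, a column-level override has nothing type-specific to dispatch on; the frame-level override can consult the input frame instead.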

This PR supersedes #427, #428, and #429, all of which can now be closed.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Christopher Harris (https://github.com/cwharris)

URL: #430
Labels
bug Something isn't working cuDF non-breaking Non-breaking change Python Related to Python code