Add Java methods to split and write column views [skip ci] #8546

Merged 8 commits on Jun 18, 2021

razajafri
Contributor

This PR makes two optimizations for cases where an operation does not need a vector to be formed:

  • Split returns a ColumnView in cases where we don't need to own the underlying buffers
  • Added a method to write ColumnView to parquet directly, circumventing the formation of Table with ColumnVectors
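
To illustrate the first bullet, here is a minimal sketch of the difference between an owning column and a non-owning view over it. This is a toy model in plain Java, not the cuDF API: `OwnedColumn`, `ColumnSlice`, and `splitAsSlices` are hypothetical names. The sketch assumes split points are given as ascending row indices.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, NOT the cuDF API: a toy model of owning vs. non-owning columns.
final class ViewSplitSketch {

    /** Owns its backing array, loosely analogous to a vector that owns its buffers. */
    static final class OwnedColumn {
        final int[] data;

        OwnedColumn(int[] data) {
            this.data = data;
        }
    }

    /** Non-owning window over a parent's array; creating one copies no data. */
    static final class ColumnSlice {
        final int[] backing; // shared with the parent, never copied
        final int offset;
        final int length;

        ColumnSlice(int[] backing, int offset, int length) {
            this.backing = backing;
            this.offset = offset;
            this.length = length;
        }

        int get(int i) {
            if (i < 0 || i >= length) {
                throw new IndexOutOfBoundsException("index " + i);
            }
            return backing[offset + i];
        }
    }

    /** Splits at the given ascending indices, returning indices.length + 1 slices. */
    static List<ColumnSlice> splitAsSlices(OwnedColumn col, int... indices) {
        List<ColumnSlice> out = new ArrayList<>();
        int start = 0;
        for (int idx : indices) {
            if (idx < start || idx > col.data.length) {
                throw new IllegalArgumentException("bad split index " + idx);
            }
            out.add(new ColumnSlice(col.data, start, idx - start));
            start = idx;
        }
        // The final slice covers everything after the last split index.
        out.add(new ColumnSlice(col.data, start, col.data.length - start));
        return out;
    }
}
```

In this model, splitting a five-element column at indices 2 and 4 yields three slices that all point at the parent's array, so no buffer is allocated or copied per piece.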

@razajafri razajafri requested a review from a team as a code owner June 17, 2021 20:43
@github-actions github-actions bot added the CMake (CMake build issue) and Java (Affects Java cuDF API.) labels Jun 17, 2021
@razajafri razajafri changed the title from "Performance optimizations" to "Performance optimizations [skip ci]" Jun 17, 2021
@razajafri razajafri added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Jun 17, 2021
@razajafri
Contributor Author

build

Comment on lines +55 to +59
if (DEFINED ENV{CUDF_CPP_BUILD_DIR})
set(CUDF_CPP_BUILD_DIR "$ENV{CUDF_CPP_BUILD_DIR}")
else()
set(CUDF_CPP_BUILD_DIR "${CUDF_SOURCE_DIR}/build")
endif()
Contributor Author

I have tried to push this change a few times; it works fine on my workstation. Can someone please test this patch to confirm that it doesn't break the flow that most devs use?
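
For readers less familiar with CMake, the fallback logic in the snippet under review can be sketched in Java as follows. The class and method names are hypothetical and exist only for illustration; the environment variable name and the `build` subdirectory come from the snippet itself.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical illustration of the CMake logic above; not part of the cuDF build.
final class BuildDirSketch {

    /**
     * Returns the directory named by CUDF_CPP_BUILD_DIR when that variable is set,
     * otherwise the "build" subdirectory of the given source directory.
     */
    static Path resolveCppBuildDir(Map<String, String> env, Path cudfSourceDir) {
        String override = env.get("CUDF_CPP_BUILD_DIR");
        if (override != null) {
            return Paths.get(override);
        }
        return cudfSourceDir.resolve("build");
    }
}
```

Passing the environment in as a map (for example `System.getenv()`) makes both branches easy to exercise: one call with the variable set and one without.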

Member

You should be able to test the nightly Java build environment by following the instructions in java/ci/README.md

Contributor Author

@razajafri razajafri Jun 17, 2021

Looks like it passes?


2021-06-17 22:40:54 (3.44 MB/s) - 'gds-redistrib-0.95.1.tgz' saved [2143346/2143346]

Removing intermediate container 16dfd7215a18
 ---> bb7efd8d7571
Successfully built bb7efd8d7571
Successfully tagged cudf-build:11.2.2-devel-centos7

Contributor Author

So, the above was just building the Docker image. I followed the instructions, and CI is building with my PR. Thank you for this, @jlowe.

@jlowe jlowe changed the title from "Performance optimizations [skip ci]" to "Add Java methods to split and write column views [skip ci]" Jun 17, 2021
@codecov
codecov bot commented Jun 17, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.08@9070f48).
The diff coverage is n/a.

❗ Current head 84a90c4 differs from pull request most recent head f9858e0. Consider uploading reports for the commit f9858e0 to get more accurate results

@@               Coverage Diff               @@
##             branch-21.08    #8546   +/-   ##
===============================================
  Coverage                ?   82.59%           
===============================================
  Files                   ?      109           
  Lines                   ?    17858           
  Branches                ?        0           
===============================================
  Hits                    ?    14750           
  Misses                  ?     3108           
  Partials                ?        0           

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9070f48...f9858e0.

@razajafri razajafri requested a review from jlowe June 18, 2021 17:15
@razajafri
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 0099f11 into rapidsai:branch-21.08 Jun 18, 2021
@razajafri razajafri deleted the split-optimization branch June 18, 2021 18:00