API Doc for Polars GPU Engine #16753

Merged
merged 34 commits into from
Sep 16, 2024

singhmanas1
Contributor

Modified the cudf API docs to add a page on cudf-polars covering: 1) how to use it, 2) how to learn more, and 3) how to try it on Google Colab.

copy-pr-bot bot commented Sep 5, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Contributor

@wence- wence- left a comment

I am not fully convinced that it makes sense to show the basic queries and install instructions, rather than just linking to the polars docs.

Rationale: this leaves us two places we need to update things if anything needs changed.

docs/cudf/source/cudf_polars/index.rst (outdated)
Comment on lines 10 to 48
.. code-block:: bash

    pip install polars[gpu] --extra-index-url=https://pypi.nvidia.com

GPU-based execution can be triggered by simply running ``.collect(engine="gpu")`` instead of ``.collect()``.

.. code-block:: python

    # Import the necessary library
    import polars as pl

    # Define the data for the LazyFrame
    ldf = pl.LazyFrame({
        "a": [1.242, 1.535],
    })

    print(ldf.select(pl.col("a").round(1)).collect(engine="gpu"))


For finer control, you can pass a GPUEngine object with additional configuration parameters to the ``engine=`` parameter.

.. code-block:: python

    # Import the necessary library
    import polars as pl

    # Define the data for the LazyFrame
    ldf = pl.LazyFrame({
        "a": [1.242, 1.535],
    })

    # Configure the GPU engine with advanced settings
    gpu_engine = pl.GPUEngine(
        device=0,
        raise_on_fail=True,  # Fail loudly if the query cannot execute on the GPU
    )

    # Execute the collection with the custom GPU engine configuration
    print(ldf.select(pl.col("a").round(1)).collect(engine=gpu_engine))

This replicates (approximately) the information that we are maintaining on the polars site. I think the better approach is to not have that here, but to just immediately link there. Perhaps we can have some benchmark results on this landing page?

removed the installation and sample code snippet.
@singhmanas1
Contributor Author

I am aligned with the flow. Will add the benchmarks to the page next week.

See - latest flow 093ce0c

@wence- added the cudf.polars label (Issues specific to cudf.polars) on Sep 9, 2024
Contributor

bdice commented Sep 9, 2024

@singhmanas1 Can you write a proper title for this PR?

@github-actions bot removed the cudf.polars label (Issues specific to cudf.polars) on Sep 11, 2024
Speed ups experience with Polars GPU Engine
Speed up with Polars GPU Engine for an 80 GB dataset
Added the benchmarks:
1. Query processing time versus dataset size.
2. Per-query speedup for all 22 PDS-H queries.
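
A benchmark of query processing time, like the one these commits describe, could be timed with a small harness along the following lines. This is only a sketch: the `time_query` helper and the stand-in workload are hypothetical and are not taken from the PR. A real measurement would pass callables that run the same query with `.collect()` and with `.collect(engine="gpu")`.

```python
import time


def time_query(run_query, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        best = min(best, time.perf_counter() - start)
    return best


def stand_in_query():
    # Hypothetical placeholder workload; a real benchmark would execute a
    # Polars query here, e.g. a lambda that calls lazy_frame.collect().
    return sum(i * i for i in range(200_000))


elapsed = time_query(stand_in_query)
print(f"best of 3 runs: {elapsed:.4f} s")
```

Taking the best of several runs reduces the effect of one-off noise such as cold caches; reporting the mean or median instead would be an equally reasonable choice.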
@singhmanas1 changed the title from "Feat/manas polars docs" to "API Doc for Polars GPU Engine" on Sep 12, 2024
Contributor

@wence- wence- left a comment

Thanks, LGTM

Contributor

bdice commented Sep 16, 2024

Why is there no CI being run here? I want to preview these docs...

@raydouglass
Member

/ok to test

Contributor

@bdice bdice left a comment

A few change requests - the only blocker is the "TBD" link. Everything else can be fixed in a follow-up PR if needed.

Contributor

@bdice bdice Sep 16, 2024

Y axis label should be “Speedup (Polars CPU runtime / Polars GPU runtime)”
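
The ratio named in that label can be illustrated with a short sketch. The runtimes below are hypothetical placeholders, not results from the benchmark in this PR, and the `speedup` helper is introduced here only for illustration.

```python
# Hypothetical per-query runtimes in seconds; illustrative values only,
# not measurements from the PR's benchmark.
cpu_runtime_s = {"Q1": 12.0, "Q2": 3.0, "Q3": 8.0}
gpu_runtime_s = {"Q1": 1.5, "Q2": 1.0, "Q3": 2.0}


def speedup(cpu_s, gpu_s):
    """Return CPU runtime / GPU runtime; values above 1 mean the GPU run was faster."""
    if gpu_s <= 0:
        raise ValueError("GPU runtime must be positive")
    return cpu_s / gpu_s


speedups = {q: speedup(cpu_runtime_s[q], gpu_runtime_s[q]) for q in cpu_runtime_s}

# Axis label as requested in the review comment.
y_axis_label = "Speedup (Polars CPU runtime / Polars GPU runtime)"

for query, value in speedups.items():
    print(f"{query}: {value:.1f}x")
```

Putting the ratio in the label makes the direction unambiguous: a bar of height 4 means the CPU run took four times as long as the GPU run.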

docs/cudf/source/cudf_polars/index.rst (outdated)
:width: 200px
:target: https://colab.research.google.com/github/rapidsai-community/showcase/blob/main/accelerated_data_processing_examples/polars_gpu_engine_demo.ipynb

Take the cuDF backend for Polars for a test-drive in a free GPU-enabled notebook environment using your Google account by `launching on Colab <TBD>`_

Reminder to fix this before merging!

Contributor

bdice commented Sep 16, 2024

One other change request: where do we link to this page? It needs to be linked from somewhere in the cuDF docs so that it is not an orphaned page. Maybe in https://github.com/rapidsai/cudf/blob/branch-24.10/docs/cudf/source/index.rst.

Contributor

bdice commented Sep 16, 2024

/ok to test

@bdice added the doc (Documentation) and non-breaking (Non-breaking change) labels on Sep 16, 2024
singhmanas1 and others added 13 commits September 16, 2024 10:45
1. Updated benchmark with a graph of speedups on compute-heavy queries
2. Updated text description for the graph with compute-heavy queries
Minor edits to the language
Minor language edits
Minor language edits
Minor language edits
Added hardware configuration for the benchmark
Updated the hardware specs
@brandon-b-miller
Contributor

/ok to test

@brandon-b-miller
Contributor

/ok to test

@bdice bdice merged commit b6a110e into rapidsai:feature/cudf-polars Sep 16, 2024
5 of 6 checks passed