Update overview docs
wence- committed Jul 24, 2024
1 parent 834d17c commit 798046f
Showing 1 changed file with 41 additions and 23 deletions.
python/cudf_polars/docs/overview.md
@@ -59,7 +59,7 @@ The executor for the polars logical plan lives in the cudf repo, in

```sh
cd cudf/python/cudf_polars
pip install --no-build-isolation --no-deps -e .
```

You should now be able to run the tests in the `cudf_polars` package:
@@ -69,45 +69,63 @@ pytest -v tests

# Executor design

The polars `LazyFrame.collect` functionality offers configuration of
the engine to use for collection through the `engine` argument. At a
low level, this provides for configuration of a "post-optimization"
callback that may be used by a third party library to replace a node
(or more, though we only replace a single node) in the optimized
logical plan with a Python callback that delivers the result of
evaluating the plan. This splits the execution of the plan into two
phases: first, a symbolic phase which translates to our internal
representation (IR); second, an execution phase which executes using
our IR.

The translation phase receives a low-level Rust `NodeTraverser`
object which delivers Python representations of the plan nodes (and
expressions) one at a time. During translation, we endeavour to raise
`NotImplementedError` for any unsupported functionality. This way, if
we can't execute something, we just don't modify the logical plan at
all: if we can translate the IR, it is assumed that evaluation will
later succeed.
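
As an illustration, a hypothetical, much simplified sketch of that
"refuse by raising" pattern (not the real translation code, which
visits `NodeTraverser` nodes) looks like this:

```python
# Hypothetical sketch only: SUPPORTED_NODES and translate_node are made up
# for illustration and are not part of the cudf_polars API.
SUPPORTED_NODES = {"scan", "select", "filter"}


def translate_node(kind: str) -> str:
    """Translate one plan node into our IR, or refuse.

    Raising NotImplementedError leaves the polars logical plan untouched,
    so collection proceeds with the default CPU engine instead.
    """
    if kind not in SUPPORTED_NODES:
        raise NotImplementedError(f"Unsupported plan node: {kind}")
    return f"IR({kind})"
```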

The usage of the cudf-based executor is therefore selected with the
gpu engine:

```python
import polars as pl

result = q.collect(engine="gpu")
```

This should either transparently run on the GPU and deliver a polars
dataframe, or else fail (but be handled) and just run the normal CPU
execution.
execution. If `POLARS_VERBOSE` is true, then fallback is logged with a
`PerformanceWarning`.
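
If fallback should never happen silently (for example in a benchmark
script), one way to surface it is to turn the warning into an error.
This is a sketch; it assumes the warning is emitted through the
standard `warnings` machinery as polars' `PerformanceWarning` and that
`q` is an existing `LazyFrame`:

```python
import os
import warnings

import polars as pl

os.environ["POLARS_VERBOSE"] = "1"  # ask for fallback to be reported

q = pl.LazyFrame({"a": [1, 2, 3]}).select(pl.col("a") * 2)

with warnings.catch_warnings():
    # Treat the fallback notification as an error rather than a log message.
    warnings.simplefilter("error", pl.exceptions.PerformanceWarning)
    result = q.collect(engine="gpu")
```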

As well as a string argument, the engine can also be specified with a
polars `GPUEngine` object. This allows additional configuration to be passed in.
Currently, the public properties are `device`, to select the device,
and `memory_resource`, to select the RMM memory resource used for
allocations during the collection phase.

For example:
```python
import polars as pl

result = q.collect(engine=pl.GPUEngine(device=1, memory_resource=mr))
```

This uses device 1 and the given memory resource. Note that the memory
resource provided _must_ be valid for allocations on the specified
device; no checking is performed.
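
The `mr` above must be constructed beforehand. A minimal sketch using
RMM (illustrative only: a 1 GiB pool on device 0, with `q` an existing
`LazyFrame` as before) might be:

```python
import polars as pl
import rmm

# A pool allocator on top of the default CUDA memory resource. Any RMM
# memory resource that is valid for the chosen device would work as well.
mr = rmm.mr.PoolMemoryResource(
    rmm.mr.CudaMemoryResource(),
    initial_pool_size=2**30,  # start with a 1 GiB pool, grow on demand
)

result = q.collect(engine=pl.GPUEngine(device=0, memory_resource=mr))
```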

For debugging purposes, we can also pass undocumented keyword
arguments. At the moment, `raise_on_fail` is supported, which raises
during translation rather than falling back:

```python
result = q.collect(engine=pl.GPUEngine(raise_on_fail=True))
```

This is mostly useful when writing tests, since in that case we want
@@ -175,7 +193,7 @@ around their pylibcudf counterparts. We have four (in

1. `Scalar` (a wrapper around a pylibcudf `Scalar`)
2. `Column` (a wrapper around a pylibcudf `Column`)
3. `NamedColumn` (a `Column` with an additional name)
4. `DataFrame` (a wrapper around a pylibcudf `Table`)
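
As a rough mental model, each wrapper pairs a device-side object with
the metadata polars needs. The following is a hypothetical sketch,
not the actual class definitions:

```python
# Illustrative only: the real containers wrap pylibcudf objects and
# expose richer interfaces than shown here.
from dataclasses import dataclass
from typing import Any


@dataclass
class NamedColumn:
    """A column plus the name polars knows it by."""

    obj: Any  # stand-in for the wrapped pylibcudf Column
    name: str


@dataclass
class DataFrame:
    """An ordered collection of named columns, convertible to a pylibcudf Table."""

    columns: list[NamedColumn]
```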

The interfaces offered by these are somewhat in flux, but broadly