Add notes to performance comparisons notebook (#13044)
This PR adds a `note` section to the performance comparisons notebook to give users a disclaimer on what they need to do to run this notebook on lower-end hardware.
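
For readers on smaller GPUs, the adjustment the note asks for could look roughly like the sketch below. The helper `scale_num_rows` and both memory figures are hypothetical and are not part of this PR or of cuDF; the notebook itself only exposes `num_rows` as a plain variable to lower by hand.

```python
def scale_num_rows(default_rows: int, available_gib: float, reference_gib: float) -> int:
    """Shrink a row count in proportion to available memory.

    Hypothetical helper for illustration; not part of the notebook or cuDF.
    """
    if available_gib <= 0 or reference_gib <= 0:
        raise ValueError("memory sizes must be positive")
    # Never scale above the notebook's default row count.
    fraction = min(1.0, available_gib / reference_gib)
    # Keep at least one row so downstream constructors still receive a valid size.
    return max(1, int(default_rows * fraction))


# Assumed figures: the defaults were tuned for a GPU with `reference_gib` of
# memory, and the current GPU has `available_gib`. Both numbers are made up.
num_rows = scale_num_rows(300_000_000, available_gib=16, reference_gib=64)
print(f"num_rows = {num_rows:_}")
```

The resulting `num_rows` would then be used wherever the notebook passes `size=num_rows` or `num_rows` to the NumPy generators. Linear scaling is only a starting point, since the note also says results vary with data size and hardware.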

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Ray Douglass (https://github.com/raydouglass)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #13044
galipremsagar authored Mar 31, 2023
1 parent eeddf74 commit f3f84f2
Showing 6 changed files with 21 additions and 9 deletions.
2 changes: 1 addition & 1 deletion ci/test_notebooks.sh
@@ -34,7 +34,7 @@ pushd notebooks

# Add notebooks that should be skipped here
# (space-separated list of filenames without paths)
-SKIPNBS="performance_comparisons.ipynb"
+SKIPNBS="performance-comparisons.ipynb"

EXITCODE=0
trap "EXITCODE=1" ERR
2 changes: 1 addition & 1 deletion docs/cudf/source/user_guide/index.md
@@ -12,7 +12,7 @@ groupby
guide-to-udfs
cupy-interop
options
-performance-comparisons
+performance-comparisons/index
PandasCompat
copy-on-write
```
8 changes: 8 additions & 0 deletions docs/cudf/source/user_guide/performance-comparisons/index.md
@@ -0,0 +1,8 @@
# Performance comparisons

```{toctree}
:maxdepth: 2
performance-comparisons
```
Original file line number Diff line number Diff line change
@@ -8,13 +8,16 @@
]
},
{
+"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook compares the performance of `cuDF` and `pandas`. The comparisons performed are on identical data sizes. This notebook primarily showcases the factor\n",
"of speedups users can have when the similar `pandas` APIs are run on GPUs using `cudf`.\n",
"\n",
-"The hardware details used to run these performance comparisons are at the end of this page."
+"The hardware details used to run these performance comparisons are at the end of this page.\n",
+"\n",
+"**Note**: This notebook is written to measure performance on NVIDIA GPUs with large memory. If running on hardware with lower memory, please consider lowering the `num_rows` values. Performance results may vary by data size, as well as the CPU and GPU used."
]
},
{
@@ -576,9 +579,10 @@
},
"outputs": [],
"source": [
+"num_rows = 300_000_000\n",
"pd_series = pd.Series(\n",
" np.random.choice(\n",
-" [\"123\", \"56.234\", \"Walmart\", \"Costco\", \"rapids ai\"], size=300_000_000\n",
+" [\"123\", \"56.234\", \"Walmart\", \"Costco\", \"rapids ai\"], size=num_rows\n",
" )\n",
")"
]
@@ -1368,10 +1372,10 @@
},
"outputs": [],
"source": [
-"size = 100_000_000\n",
+"num_rows = 100_000_000\n",
"pdf = pd.DataFrame()\n",
-"pdf[\"key\"] = np.random.randint(0, 2, size)\n",
-"pdf[\"val\"] = np.random.randint(0, 7, size)\n",
+"pdf[\"key\"] = np.random.randint(0, 2, num_rows)\n",
+"pdf[\"val\"] = np.random.randint(0, 7, num_rows)\n",
"\n",
"\n",
"def custom_formula_udf(df):\n",
@@ -1634,7 +1638,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.10.9"
+"version": "3.10.10"
},
"vscode": {
"interpreter": {
1 change: 1 addition & 0 deletions notebooks/performance-comparisons
1 change: 0 additions & 1 deletion notebooks/performance_comparisons.ipynb

This file was deleted.
