diff --git a/source/examples/rapids-1brc-single-node/notebook.ipynb b/source/examples/rapids-1brc-single-node/notebook.ipynb index cf1df511..aee011e5 100755 --- a/source/examples/rapids-1brc-single-node/notebook.ipynb +++ b/source/examples/rapids-1brc-single-node/notebook.ipynb @@ -20,7 +20,7 @@ ] }, "source": [ - "# Measuring performance with the One Billion Row Challenge\n", + "# Measuring Performance with the One Billion Row Challenge\n", "\n", "The [One Billion Row Challenge](https://www.morling.dev/blog/one-billion-row-challenge/) is a programming competition aimed at Java developers to write the most efficient code to process a one billion line text file and calculate some metrics. The challenge has inspired solutions in many languages beyond Java including [Python](https://github.com/gunnarmorling/1brc/discussions/62).\n", "\n", @@ -53,7 +53,7 @@ "id": "d2037a13", "metadata": {}, "source": [ - "## Reference implementation\n", + "## Reference Implementation\n", "\n", "A reference implementation written with popular PyData tools would likely be something along the lines of the following Pandas code (assuming you have enough RAM to fit the data into memory).\n", "\n", @@ -335,7 +335,7 @@ "id": "04cc7676", "metadata": {}, "source": [ - "## GPU solution with RAPIDS\n", + "## GPU Solution with RAPIDS\n", "\n", "Now let's look at using RAPIDS to speed up our Pandas implementation of the challenge. If you directly convert the reference implementation from Pandas to cuDF you will run into some [limitations cuDF has with string columns](https://github.com/rapidsai/cudf/issues/13733). Also depending on your GPU you may run into memory limits as cuDF will read the whole dataset into memory and machines typically have less GPU memory than CPU memory.\n", "\n", @@ -382,7 +382,7 @@ "id": "24acf4d9-c5dc-42c5-8f73-92026f0cc581", "metadata": {}, "source": [ - "### Dask dashboard\n", + "### Dask Dashboard\n", "\n", "We can also make use of the [Dask Dashboard](https://docs.dask.org/en/latest/dashboard.html) to see what is going on. \n", "\n", diff --git a/source/examples/rapids-autoscaling-multi-tenant-kubernetes/notebook.ipynb b/source/examples/rapids-autoscaling-multi-tenant-kubernetes/notebook.ipynb index 4b61f5ae..886a359d 100644 --- a/source/examples/rapids-autoscaling-multi-tenant-kubernetes/notebook.ipynb +++ b/source/examples/rapids-autoscaling-multi-tenant-kubernetes/notebook.ipynb @@ -15,7 +15,7 @@ ] }, "source": [ - "# Autoscaling multi-tenant Kubernetes Deep-Dive\n", + "# Autoscaling Multi-Tenant Kubernetes Deep-Dive\n", "\n", "In this example we are going to take a deep-dive into launching an autoscaling multi-tenant RAPIDS environment on Kubernetes.\n", "\n", @@ -135,7 +135,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Prometheus stack\n", + "### Prometheus Stack\n", "\n", "Let's start by installing the [Kubernetes Prometheus Stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) which includes everything we need to run Prometheus on our cluster.\n", "\n", @@ -217,7 +217,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Image steaming (optional)\n", + "### Image Steaming (optional)\n", "\n", "In order to steam the container image to the GKE nodes our image needs to be stored in [Google Cloud Artifact Registry](https://cloud.google.com/artifact-registry/) in the same region as our cluster.\n", "\n", @@ -236,7 +236,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Image prepuller (optional)\n", + "### Image Prepuller (optional)\n", "\n", "If you know that many users are going to want to frequently pull a specific container image I like to run a small `DaemonSet` which ensures that image starts streaming onto a node as soon as it joins the cluster. This is optional but can reduce wait time for users." ] @@ -376,7 +376,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Running some work\n", + "## Running Some Work\n", "\n", "Next let's connect to the Jupyter session and run some work on our cluster. You can do this by port forwarding the Jupyter service to your local machine.\n", "\n", @@ -397,7 +397,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Check our capabilities\n", + "### Check Capabilities\n", "\n", "Let's make sure our environment is all set up correctly by checking out our capabilities. We can start by running `nvidia-smi` to inspect our Notebook GPU." ] @@ -521,7 +521,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Small workload\n", + "### Small Workload\n", "\n", "Let's run a small RAPIDS workload that stretches our Kubernetes cluster a little and causes it to scale. \n", "\n", @@ -1071,7 +1071,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Simulating many multi-tenant workloads\n", + "## Simulating Many Multi-Tenant Workloads\n", "\n", "Now we have a toy workload which we can use to represent one user on our multi-tenant cluster.\n", "\n", @@ -1180,7 +1180,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Simulating our multi-tenant workloads\n", + "### Simulating our Multi-Tenant Workloads\n", "\n", "To see how our Kubernetes cluster behaves when many users are sharing it we want to run our haversine workload a bunch of times. \n", "\n", @@ -1744,7 +1744,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Pending pods\n", + "### Pending Pods\n", "\n", "First let's see how long each of our Pods spent in a `Pending` phase. This is the amount of time users would have to wait for their work to start running when they create their Dask clusters." ] diff --git a/source/examples/rapids-azureml-hpo/notebook.ipynb b/source/examples/rapids-azureml-hpo/notebook.ipynb index 50a73321..d4bee24c 100644 --- a/source/examples/rapids-azureml-hpo/notebook.ipynb +++ b/source/examples/rapids-azureml-hpo/notebook.ipynb @@ -12,7 +12,7 @@ ] }, "source": [ - "# Train and hyperparameter tune with RAPIDS" + "# Train and Hyperparameter-Tune with RAPIDS" ] }, { @@ -65,7 +65,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Initialize workspace" + "# Initialize Workspace" ] }, { @@ -119,7 +119,7 @@ "tags": [] }, "source": [ - "# Access data from Datastore URI" + "# Access Data from Datastore URI" ] }, { @@ -157,7 +157,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Create AML compute" + "# Create AML Compute" ] }, { @@ -271,14 +271,14 @@ "tags": [] }, "source": [ - "# Train Model on remote compute" + "# Train Model on Remote Compute" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## Create experiment\n", + "## Create Experiment\n", "\n", "Track all the runs in your workspace" ] @@ -352,7 +352,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Submit the training job " + "## Submit the Training Job " ] }, { @@ -435,7 +435,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Tune model hyperparameters" + "# Tune Model Hyperparameters" ] }, { @@ -449,7 +449,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Start a hyperparameter sweep" + "## Start a Hyperparameter Sweep" ] }, { @@ -540,7 +540,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Find and register best model" + "## Find and Register Best Model" ] }, { @@ -563,7 +563,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Delete cluster" + "# Delete Cluster" ] }, { diff --git a/source/examples/rapids-ec2-mnmg/notebook.ipynb b/source/examples/rapids-ec2-mnmg/notebook.ipynb index 6e4a3038..79ca421a 100644 --- a/source/examples/rapids-ec2-mnmg/notebook.ipynb +++ b/source/examples/rapids-ec2-mnmg/notebook.ipynb @@ -18,7 +18,7 @@ ] }, "source": [ - "# Multi-node multi-GPU example on AWS using dask-cloudprovider\n", + "# Multi-node Multi-GPU Example on AWS using dask-cloudprovider\n", "\n", "[Dask Cloud Provider](https://cloudprovider.dask.org/en/latest/) is a native cloud integration for dask. It helps manage Dask clusters on different cloud platforms. In this notebook, we will look at how we can use the package to set-up a AWS cluster and run a multi-node multi-GPU (MNMG) example with [RAPIDS](https://rapids.ai/). RAPIDS provides a suite of libraries to accelerate data science pipelines on the GPU entirely. This can be scaled to multiple nodes using Dask as we will see through this notebook. " ] @@ -28,7 +28,7 @@ "id": "98cf9dae", "metadata": {}, "source": [ - "## Create your cluster\n", + "## Create Your Cluster\n", "\n", "\n", "```{note}\n", @@ -49,7 +49,7 @@ "id": "3af8c63a", "metadata": {}, "source": [ - "## Client set up\n", + "## Client Set Up\n", "\n", "Now we can create a [Dask Client](https://distributed.dask.org/en/latest/client.html) with the cluster we just defined. " ] diff --git a/source/examples/rapids-optuna-hpo/notebook.ipynb b/source/examples/rapids-optuna-hpo/notebook.ipynb index 392a07b0..127d08ce 100644 --- a/source/examples/rapids-optuna-hpo/notebook.ipynb +++ b/source/examples/rapids-optuna-hpo/notebook.ipynb @@ -92,7 +92,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Loading the data\n", + "# Loading the Data\n", "## Data Acquisition\n", "Dataset can be acquired from Kaggle: [BNP Paribas Cardif Claims Management](https://www.kaggle.com/c/bnp-paribas-cardif-claims-management/data). To download the dataset:\n", " \n", diff --git a/source/examples/rapids-sagemaker-higgs/notebook.ipynb b/source/examples/rapids-sagemaker-higgs/notebook.ipynb index 90618de6..2f710c26 100644 --- a/source/examples/rapids-sagemaker-higgs/notebook.ipynb +++ b/source/examples/rapids-sagemaker-higgs/notebook.ipynb @@ -14,7 +14,7 @@ ] }, "source": [ - "# Running RAPIDS hyperparameter experiments at scale on Amazon SageMaker" + "# Running RAPIDS Hyperparameter Experiments at Scale on Amazon SageMaker" ] }, { diff --git a/source/examples/rapids-sagemaker-hpo/notebook.ipynb b/source/examples/rapids-sagemaker-hpo/notebook.ipynb index 4227da90..9ab5d7b0 100644 --- a/source/examples/rapids-sagemaker-hpo/notebook.ipynb +++ b/source/examples/rapids-sagemaker-hpo/notebook.ipynb @@ -16,7 +16,7 @@ ] }, "source": [ - "# Deep Dive into running Hyper Parameter Optimization on AWS SageMaker" + "# Deep Dive into Running Hyper Parameter Optimization on AWS SageMaker" ] }, { @@ -1922,7 +1922,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "### Getting the best Model" + "### Getting the Best Model" ] }, { diff --git a/source/examples/time-series-forecasting-with-hpo/notebook.ipynb b/source/examples/time-series-forecasting-with-hpo/notebook.ipynb index 2ec93a23..a85dd241 100644 --- a/source/examples/time-series-forecasting-with-hpo/notebook.ipynb +++ b/source/examples/time-series-forecasting-with-hpo/notebook.ipynb @@ -18,7 +18,7 @@ ] }, "source": [ - "# Perform time series forecasting on Google Kubernetes Engine with NVIDIA GPUs" + "# Perform Time Series Forecasting on Google Kubernetes Engine with NVIDIA GPUs" ] }, { @@ -271,7 +271,7 @@ "id": "f304ea68-381f-45b4-9e27-201a35e31239", "metadata": {}, "source": [ - "## Data preprocessing" + "## Data Preprocessing" ] }, { diff --git a/source/examples/xgboost-azure-mnmg-daskcloudprovider/notebook.ipynb b/source/examples/xgboost-azure-mnmg-daskcloudprovider/notebook.ipynb index 2760896c..01586d14 100644 --- a/source/examples/xgboost-azure-mnmg-daskcloudprovider/notebook.ipynb +++ b/source/examples/xgboost-azure-mnmg-daskcloudprovider/notebook.ipynb @@ -21,7 +21,7 @@ ] }, "source": [ - "# Multi-Node Multi-GPU XGBoost example on Azure using dask-cloudprovider\n", + "# Multi-Node Multi-GPU XGBoost Example on Azure using dask-cloudprovider\n", "\n", "[Dask Cloud Provider](https://cloudprovider.dask.org/en/latest/) is a native cloud intergration library for Dask. It helps manage Dask clusters on different cloud platforms. In this notebook, we will look at how we can use this package to set-up an Azure cluster and run a multi-node multi-GPU (MNMG) example with [RAPIDS](https://rapids.ai/). RAPIDS provides a suite of libraries to accelerate data science pipelines on the GPU entirely. This can be scaled to multiple nodes using Dask as we will see in this notebook. \n", "\n", diff --git a/source/examples/xgboost-gpu-hpo-job-parallel-k8s/notebook.ipynb b/source/examples/xgboost-gpu-hpo-job-parallel-k8s/notebook.ipynb index dff1f6d0..944b106f 100644 --- a/source/examples/xgboost-gpu-hpo-job-parallel-k8s/notebook.ipynb +++ b/source/examples/xgboost-gpu-hpo-job-parallel-k8s/notebook.ipynb @@ -16,7 +16,7 @@ ] }, "source": [ - "# Scaling up hyperparameter optimization with Kubernetes and XGBoost GPU algorithm" + "# Scaling up Hyperparameter Optimization with Kubernetes and XGBoost GPU Algorithm" ] }, { diff --git a/source/examples/xgboost-gpu-hpo-job-parallel-ngc/notebook.ipynb b/source/examples/xgboost-gpu-hpo-job-parallel-ngc/notebook.ipynb index 8d6fe211..051464ac 100644 --- a/source/examples/xgboost-gpu-hpo-job-parallel-ngc/notebook.ipynb +++ b/source/examples/xgboost-gpu-hpo-job-parallel-ngc/notebook.ipynb @@ -15,7 +15,7 @@ ] }, "source": [ - "# Scaling up hyperparameter optimization with NVIDIA DGX Cloud and XGBoost GPU algorithm" + "# Scaling up Hyperparameter Optimization with NVIDIA DGX Cloud and XGBoost GPU Algorithm" ] }, { diff --git a/source/examples/xgboost-gpu-hpo-mnmg-parallel-k8s/notebook.ipynb b/source/examples/xgboost-gpu-hpo-mnmg-parallel-k8s/notebook.ipynb index a2e7e1bd..37200062 100644 --- a/source/examples/xgboost-gpu-hpo-mnmg-parallel-k8s/notebook.ipynb +++ b/source/examples/xgboost-gpu-hpo-mnmg-parallel-k8s/notebook.ipynb @@ -19,7 +19,7 @@ ] }, "source": [ - "# Scaling up hyperparameter optimization with multi-GPU workload on Kubernetes" + "# Scaling up Hyperparameter Optimization with Multi-GPU Workload on Kubernetes" ] }, { diff --git a/source/index.md b/source/index.md index 1fdcd08c..c84ae0ac 100644 --- a/source/index.md +++ b/source/index.md @@ -77,7 +77,7 @@ There are many tools to deploy RAPIDS. ````{grid-item-card} :link: examples/index :link-type: doc -{fas}`book;sd-text-primary` Workflow examples +{fas}`book;sd-text-primary` Workflow Examples ^^^ For inspiration see our example notebooks with opinionated deployments of RAPIDS to boost machine learning workflows.