diff --git a/r_examples/r_api_serving_examples/API Serving Examples.ipynb b/r_examples/r_api_serving_examples/API Serving Examples.ipynb deleted file mode 100644 index cb85c5bac6..0000000000 --- a/r_examples/r_api_serving_examples/API Serving Examples.ipynb +++ /dev/null @@ -1,610 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# R API Serving Examples\n", - "\n", - "In this example, we demonstrate how to quickly compare the runtimes of three methods for serving a model from an R hosted REST API. The following SageMaker examples discuss each method in detail:\n", - "\n", - "* **Plumber**\n", - " * Website: [https://www.rplumber.io/](https://www.rplumber.io)\n", - " * SageMaker Example: [r_serving_with_plumber](../r_serving_with_plumber)\n", - "* **RestRServe**\n", - " * Website: [https://restrserve.org](https://restrserve.org)\n", - " * SageMaker Example: [r_serving_with_restrserve](../r_serving_with_restrserve)\n", - "* **FastAPI** (reticulated from Python)\n", - " * Website: [https://fastapi.tiangolo.com](https://fastapi.tiangolo.com)\n", - " * SageMaker Example: [r_serving_with_fastapi](../r_serving_with_fastapi)\n", - " \n", - "We will reuse the docker images from each of these examples. Each one is configured to serve a small XGBoost model which has already been trained on the classical Iris dataset." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Building Docker Images for Serving\n", - "\n", - "First, we will build each docker image from the provided SageMaker Examples." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Plumber Serving Image" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "scrolled": true - }, - "outputs": [], - "source": [ - "!cd .. 
&& docker build -t r-plumber -f r_serving_with_plumber/Dockerfile r_serving_with_plumber" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### RestRServe Serving Image" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "scrolled": true - }, - "outputs": [], - "source": [ - "!cd .. && docker build -t r-restrserve -f r_serving_with_restrserve/Dockerfile r_serving_with_restrserve" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### FastAPI Serving Image" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "scrolled": true - }, - "outputs": [], - "source": [ - "!cd .. && docker build -t r-fastapi -f r_serving_with_fastapi/Dockerfile r_serving_with_fastapi" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Launch Serving Containers" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we will launch each serving container. The containers will be launched on the following ports to avoid port collisions on your local machine or SageMaker Notebook instance:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ports = {\n", - " \"plumber\": 5000,\n", - " \"restrserve\": 5001,\n", - " \"fastapi\": 5002,\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!bash launch.sh" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!docker container list" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Define Simple Client" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import requests\n", - "from tqdm import tqdm\n", - "import pandas as pd" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - 
"outputs": [], - "source": [ - "def get_predictions(examples, instance=requests, port=5000):\n", - " payload = {\"features\": examples}\n", - " return instance.post(f\"http://127.0.0.1:{port}/invocations\", json=payload)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def get_health(instance=requests, port=5000):\n", - " instance.get(f\"http://127.0.0.1:{port}/ping\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Define Example Inputs" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we define a example inputs from the classical [Iris](https://archive.ics.uci.edu/ml/datasets/iris) dataset.\n", - "* Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "column_names = [\"Sepal.Length\", \"Sepal.Width\", \"Petal.Length\", \"Petal.Width\", \"Label\"]\n", - "iris = pd.read_csv(\n", - " \"s3://sagemaker-sample-files/datasets/tabular/iris/iris.data\", names=column_names\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "iris_features = iris[[\"Sepal.Length\", \"Sepal.Width\", \"Petal.Length\", \"Petal.Width\"]]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "example = iris_features.values[:1].tolist()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "many_examples = iris_features.values[:100].tolist()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Testing\n", - "\n", - "Now it's time to test how each API server performs under stress." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We will test two use cases:\n", - "* **New Requests**: In this scenario, we test how quickly the server can respond with predictions when each client request establishes a new connection with the server. This simulates the server's ability to handle real-time requests. We could make this more realistic by creating an asynchronous environment that tests the server's ability to fulfill concurrent rather than sequential requests.\n", - "* **Keep Alive / Reuse Session**: In this scenario, we test how quickly the server can respond with predictions when each client request uses a session to keep its connection to the server alive between requests. This simulates the server's ability to handle sequential batch requests from the same client." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "For each of the two use cases, we will test the performance in the following situations:\n", - "\n", - "* 1000 requests of a single example\n", - "* 1000 requests of 100 examples\n", - "* 1000 pings for health status" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## New Requests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Plumber" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# verify the prediction output\n", - "get_predictions(example, port=ports[\"plumber\"]).json()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(example, port=ports[\"plumber\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(many_examples, port=ports[\"plumber\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, 
- "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " get_health(port=ports[\"plumber\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### RestRserve" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# verify the prediction output\n", - "get_predictions(example, port=ports[\"restrserve\"]).json()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(example, port=ports[\"restrserve\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(many_examples, port=ports[\"restrserve\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " get_health(port=ports[\"restrserve\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### FastAPI" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# verify the prediction output\n", - "get_predictions(example, port=ports[\"fastapi\"]).json()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(example, port=ports[\"fastapi\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(many_examples, port=ports[\"fastapi\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " get_health(port=ports[\"fastapi\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - 
"source": [ - "## Keep Alive (Reuse Session)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now, let's test how each one performs when each request reuses a session connection. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# reuse the session for each post and get request\n", - "instance = requests.Session()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Plumber" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(example, instance=instance, port=ports[\"plumber\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(many_examples, instance=instance, port=ports[\"plumber\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " get_health(instance=instance, port=ports[\"plumber\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### RestRserve" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(example, instance=instance, port=ports[\"restrserve\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(many_examples, instance=instance, port=ports[\"restrserve\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " get_health(instance=instance, port=ports[\"restrserve\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - 
"### FastAPI" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(example, instance=instance, port=ports[\"fastapi\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " _ = get_predictions(many_examples, instance=instance, port=ports[\"fastapi\"])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for i in tqdm(range(1000)):\n", - " get_health(instance=instance, port=ports[\"fastapi\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Stop All Serving Containers" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally, we will shut down the serving containers we launched for the tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "!docker kill $(docker ps -q)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Conclusion" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In this example, we demonstrated how to conduct a simple performance benchmark across three R model serving solutions. 
We leave the choice of serving solution to the reader, since in some cases it might be appropriate to customize the benchmark in the following ways:\n", - "\n", - "* Update the serving example to serve a specific model\n", - "* Perform the tests across multiple instance types\n", - "* Modify the serving example and client to test asynchronous requests\n", - "* Deploy the serving examples to SageMaker Endpoints to test within an autoscaling environment\n", - "\n", - "For more information on serving your models in custom containers on SageMaker, please see our [support documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-main.html) for the latest updates and best practices." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "conda_python3", - "language": "python", - "name": "conda_python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.13" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/r_examples/r_api_serving_examples/iris.csv b/r_examples/r_api_serving_examples/iris.csv deleted file mode 100644 index 8b6393099a..0000000000 --- a/r_examples/r_api_serving_examples/iris.csv +++ /dev/null @@ -1,151 +0,0 @@ -Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species -5.1,3.5,1.4,0.2,setosa -4.9,3,1.4,0.2,setosa -4.7,3.2,1.3,0.2,setosa -4.6,3.1,1.5,0.2,setosa -5,3.6,1.4,0.2,setosa -5.4,3.9,1.7,0.4,setosa -4.6,3.4,1.4,0.3,setosa -5,3.4,1.5,0.2,setosa -4.4,2.9,1.4,0.2,setosa -4.9,3.1,1.5,0.1,setosa -5.4,3.7,1.5,0.2,setosa -4.8,3.4,1.6,0.2,setosa -4.8,3,1.4,0.1,setosa -4.3,3,1.1,0.1,setosa -5.8,4,1.2,0.2,setosa -5.7,4.4,1.5,0.4,setosa -5.4,3.9,1.3,0.4,setosa -5.1,3.5,1.4,0.3,setosa 
-5.7,3.8,1.7,0.3,setosa -5.1,3.8,1.5,0.3,setosa -5.4,3.4,1.7,0.2,setosa -5.1,3.7,1.5,0.4,setosa -4.6,3.6,1,0.2,setosa -5.1,3.3,1.7,0.5,setosa -4.8,3.4,1.9,0.2,setosa -5,3,1.6,0.2,setosa -5,3.4,1.6,0.4,setosa -5.2,3.5,1.5,0.2,setosa -5.2,3.4,1.4,0.2,setosa -4.7,3.2,1.6,0.2,setosa -4.8,3.1,1.6,0.2,setosa -5.4,3.4,1.5,0.4,setosa -5.2,4.1,1.5,0.1,setosa -5.5,4.2,1.4,0.2,setosa -4.9,3.1,1.5,0.2,setosa -5,3.2,1.2,0.2,setosa -5.5,3.5,1.3,0.2,setosa -4.9,3.6,1.4,0.1,setosa -4.4,3,1.3,0.2,setosa -5.1,3.4,1.5,0.2,setosa -5,3.5,1.3,0.3,setosa -4.5,2.3,1.3,0.3,setosa -4.4,3.2,1.3,0.2,setosa -5,3.5,1.6,0.6,setosa -5.1,3.8,1.9,0.4,setosa -4.8,3,1.4,0.3,setosa -5.1,3.8,1.6,0.2,setosa -4.6,3.2,1.4,0.2,setosa -5.3,3.7,1.5,0.2,setosa -5,3.3,1.4,0.2,setosa -7,3.2,4.7,1.4,versicolor -6.4,3.2,4.5,1.5,versicolor -6.9,3.1,4.9,1.5,versicolor -5.5,2.3,4,1.3,versicolor -6.5,2.8,4.6,1.5,versicolor -5.7,2.8,4.5,1.3,versicolor -6.3,3.3,4.7,1.6,versicolor -4.9,2.4,3.3,1,versicolor -6.6,2.9,4.6,1.3,versicolor -5.2,2.7,3.9,1.4,versicolor -5,2,3.5,1,versicolor -5.9,3,4.2,1.5,versicolor -6,2.2,4,1,versicolor -6.1,2.9,4.7,1.4,versicolor -5.6,2.9,3.6,1.3,versicolor -6.7,3.1,4.4,1.4,versicolor -5.6,3,4.5,1.5,versicolor -5.8,2.7,4.1,1,versicolor -6.2,2.2,4.5,1.5,versicolor -5.6,2.5,3.9,1.1,versicolor -5.9,3.2,4.8,1.8,versicolor -6.1,2.8,4,1.3,versicolor -6.3,2.5,4.9,1.5,versicolor -6.1,2.8,4.7,1.2,versicolor -6.4,2.9,4.3,1.3,versicolor -6.6,3,4.4,1.4,versicolor -6.8,2.8,4.8,1.4,versicolor -6.7,3,5,1.7,versicolor -6,2.9,4.5,1.5,versicolor -5.7,2.6,3.5,1,versicolor -5.5,2.4,3.8,1.1,versicolor -5.5,2.4,3.7,1,versicolor -5.8,2.7,3.9,1.2,versicolor -6,2.7,5.1,1.6,versicolor -5.4,3,4.5,1.5,versicolor -6,3.4,4.5,1.6,versicolor -6.7,3.1,4.7,1.5,versicolor -6.3,2.3,4.4,1.3,versicolor -5.6,3,4.1,1.3,versicolor -5.5,2.5,4,1.3,versicolor -5.5,2.6,4.4,1.2,versicolor -6.1,3,4.6,1.4,versicolor -5.8,2.6,4,1.2,versicolor -5,2.3,3.3,1,versicolor -5.6,2.7,4.2,1.3,versicolor -5.7,3,4.2,1.2,versicolor 
-5.7,2.9,4.2,1.3,versicolor -6.2,2.9,4.3,1.3,versicolor -5.1,2.5,3,1.1,versicolor -5.7,2.8,4.1,1.3,versicolor -6.3,3.3,6,2.5,virginica -5.8,2.7,5.1,1.9,virginica -7.1,3,5.9,2.1,virginica -6.3,2.9,5.6,1.8,virginica -6.5,3,5.8,2.2,virginica -7.6,3,6.6,2.1,virginica -4.9,2.5,4.5,1.7,virginica -7.3,2.9,6.3,1.8,virginica -6.7,2.5,5.8,1.8,virginica -7.2,3.6,6.1,2.5,virginica -6.5,3.2,5.1,2,virginica -6.4,2.7,5.3,1.9,virginica -6.8,3,5.5,2.1,virginica -5.7,2.5,5,2,virginica -5.8,2.8,5.1,2.4,virginica -6.4,3.2,5.3,2.3,virginica -6.5,3,5.5,1.8,virginica -7.7,3.8,6.7,2.2,virginica -7.7,2.6,6.9,2.3,virginica -6,2.2,5,1.5,virginica -6.9,3.2,5.7,2.3,virginica -5.6,2.8,4.9,2,virginica -7.7,2.8,6.7,2,virginica -6.3,2.7,4.9,1.8,virginica -6.7,3.3,5.7,2.1,virginica -7.2,3.2,6,1.8,virginica -6.2,2.8,4.8,1.8,virginica -6.1,3,4.9,1.8,virginica -6.4,2.8,5.6,2.1,virginica -7.2,3,5.8,1.6,virginica -7.4,2.8,6.1,1.9,virginica -7.9,3.8,6.4,2,virginica -6.4,2.8,5.6,2.2,virginica -6.3,2.8,5.1,1.5,virginica -6.1,2.6,5.6,1.4,virginica -7.7,3,6.1,2.3,virginica -6.3,3.4,5.6,2.4,virginica -6.4,3.1,5.5,1.8,virginica -6,3,4.8,1.8,virginica -6.9,3.1,5.4,2.1,virginica -6.7,3.1,5.6,2.4,virginica -6.9,3.1,5.1,2.3,virginica -5.8,2.7,5.1,1.9,virginica -6.8,3.2,5.9,2.3,virginica -6.7,3.3,5.7,2.5,virginica -6.7,3,5.2,2.3,virginica -6.3,2.5,5,1.9,virginica -6.5,3,5.2,2,virginica -6.2,3.4,5.4,2.3,virginica -5.9,3,5.1,1.8,virginica diff --git a/r_examples/r_api_serving_examples/launch.sh b/r_examples/r_api_serving_examples/launch.sh deleted file mode 100644 index e456602d35..0000000000 --- a/r_examples/r_api_serving_examples/launch.sh +++ /dev/null @@ -1,11 +0,0 @@ -#!/bin/bash - -echo "Launching Plumber" -docker run -d --rm -p 5000:8080 r-plumber - -echo "Launching RestRserve" -docker run -d --rm -p 5001:8080 r-restrserve - -echo "Launching FastAPI" -docker run -d --rm -p 5002:8080 r-fastapi -
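The deleted notebook's core benchmarking idea, timing repeated requests with and without a reused connection, can be sketched end-to-end using only the Python standard library, with no Docker containers required. This is a minimal illustration, not the notebook's actual client: the in-process `/ping` server below is a stand-in for the R serving endpoints, and the request count `N` is reduced from the notebook's 1000 to keep the sketch fast.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 enables persistent (keep-alive) connections

    def do_GET(self):
        body = b"pong"
        self.send_response(200)
        # Content-Length is required for the connection to stay open under HTTP/1.1
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


# Stand-in for one of the serving containers; port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), PingHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

N = 200  # reduced from the notebook's 1000 requests

# "New Requests": open a fresh TCP connection for every request.
start = time.perf_counter()
for _ in range(N):
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/ping")
    assert conn.getresponse().read() == b"pong"
    conn.close()
new_conn_time = time.perf_counter() - start

# "Keep Alive / Reuse Session": reuse a single connection for all requests.
conn = http.client.HTTPConnection("127.0.0.1", port)
start = time.perf_counter()
for _ in range(N):
    conn.request("GET", "/ping")
    # the response must be fully read before the next request on the same connection
    assert conn.getresponse().read() == b"pong"
reused_time = time.perf_counter() - start
conn.close()
server.shutdown()

print(f"new connections: {new_conn_time:.3f}s, keep-alive: {reused_time:.3f}s")
```

Reusing the connection typically wins because it skips the per-request TCP handshake; this is the same effect the notebook measures by passing a `requests.Session` as `instance` instead of the `requests` module itself.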