Add new azure serverless notebook #506

Open
wants to merge 1 commit into
base: main
353 changes: 353 additions & 0 deletions docs/notebooks/azure/navigator_tabular_azure_maas.ipynb
@@ -0,0 +1,353 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<a target=\"_parent\" href=\"https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/docs/notebooks/azure/navigator_tabular_azure_maas.ipynb\">\n",
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
"</a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Gretel Navigator Tabular on Azure MaaS\n",
"\n",
"This notebook walks you through using Gretel Navigator Tabular deployed as a serverless endpoint on Azure. Once the endpoint is deployed, you can interact with the model using the Gretel SDK.\n",
"\n",
"This notebook covers the following steps:\n",
"\n",
"* Deploy Gretel Navigator Tabular on Azure\n",
"* Install and configure the Gretel SDK\n",
"* Generate synthetic data with the Gretel SDK and the Azure endpoint\n",
"* Edit and augment existing data with the Gretel SDK and the Azure endpoint"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# TODO\n",
"# Deploy Gretel Navigator\n",
"\n",
"To get started, visit the [Amazon Bedrock homepage](https://us-west-2.console.aws.amazon.com/bedrock/home?region=us-west-2#/) in the AWS Console. In this example we'll be using `us-west-2`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Under **Foundation Models**, select **Model Catalog**:\n",
"\n",
"<img src=\"https://gretel-blueprints-pub.s3.us-west-2.amazonaws.com/navigator_bedrock/1_model-catalog.png\" alt=\"Model Catalog\" width=\"70%\">\n",
"\n",
"2. Under **Providers** on the left side, select **Gretel**:\n",
"\n",
"<img src=\"https://gretel-blueprints-pub.s3.us-west-2.amazonaws.com/navigator_bedrock/2_providers.png\" alt=\"Provider\" width=\"70%\">\n",
"\n",
"3. Click on **View subscription options**:\n",
"\n",
"<img src=\"https://gretel-blueprints-pub.s3.us-west-2.amazonaws.com/navigator_bedrock/3_subscription-options.png\" alt=\"Subscription Options\" width=\"70%\">\n",
"\n",
"\n",
"4. Click on **Subscribe**:\n",
"\n",
"<img src=\"https://gretel-blueprints-pub.s3.us-west-2.amazonaws.com/navigator_bedrock/4_subscribe.png\" alt=\"Subscribe\" width=\"60%\">\n",
"\n",
"\n",
"5. Wait for the subscription to complete:\n",
"\n",
"<img src=\"https://gretel-blueprints-pub.s3.us-west-2.amazonaws.com/navigator_bedrock/5_subscription_complete.png\" alt=\"Subscription Complete\" width=\"70%\">\n",
"\n",
"\n",
"6. Once the subscription is complete, click **Deploy**:\n",
"\n",
"<img src=\"https://gretel-blueprints-pub.s3.us-west-2.amazonaws.com/navigator_bedrock/6_deploy.png\" alt=\"Deploy\" width=\"70%\">\n",
"\n",
"\n",
"7. You should reach a configuration screen like the one below. For this example, we will use the defaults. Update the fields for your use case and modify the **Advanced Settings** as required.\n",
"\n",
"\n",
"When you are done with the configuration, click the **Deploy** button on the bottom right.\n",
"\n",
"<img src=\"https://gretel-blueprints-pub.s3.us-west-2.amazonaws.com/navigator_bedrock/7_config_deploy.png\" alt=\"Configure and Deploy\" width=\"70%\">\n",
"\n",
"\n",
"8. Remain on the page, and you should eventually see something like this:\n",
"\n",
"<img src=\"https://gretel-blueprints-pub.s3.us-west-2.amazonaws.com/navigator_bedrock/8_in_progress.png\" alt=\"Deployment Progress\" width=\"70%\">\n",
"\n",
"\n",
"Wait for the model to deploy and the **Endpoint status** to change from **Creating** to **In Service**."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Setup"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Retrieve the endpoint URL and API key for your Azure deployment. You will be prompted for both in the cells below."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Install the OpenAI and Gretel SDKs (if you do not already have them)\n",
"\n",
"!pip install -U -qq openai gretel-client"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Import required libraries\n",
"from openai import OpenAI\n",
"from getpass import getpass"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"Azure endpoint: https://gretelserverlessendpointmaas.eastus2.models.ai.azure.com\n",
"Azure API key: ········\n"
]
}
],
"source": [
"# Prompt for the endpoint URL, and read the API key without echoing it\n",
"AZURE_ENDPOINT = input(\"Azure endpoint: \")\n",
"AZURE_API_KEY = getpass(\"Azure API key: \")\n",
"\n",
"# the `AzureOpenAI` client mangles the URL, so we stick with the default `OpenAI` client\n",
"oai_client = OpenAI(base_url=AZURE_ENDPOINT, api_key=AZURE_API_KEY)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Create an Azure Adapter using the Gretel SDK\n",
"\n",
"from gretel_client import Gretel\n",
"\n",
"azure_open_ai = Gretel.create_navigator_azure_oai_adapter(oai_client)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Generate and Augment Datasets using Gretel Navigator\n",
"\n",
"Alright, we're now ready to start creating data! We'll first generate some data using a single prompt, and then we'll add a couple of new columns. Try out some of your own prompts to see how it works."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Generating data: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:10, 0.96 records/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" first_name last_name email gender city country\n",
"0 Antoine Roux [email protected] Male Marseille France\n",
"1 Léa Durand [email protected] Female Strasbourg France\n",
"2 Gaspard Fournier [email protected] Male Bordeaux France\n",
"3 Juliette Pierre [email protected] Female Rennes France\n",
"4 Étienne Marchand [email protected] Male Toulouse France\n",
"5 Aurélie Benoit [email protected] Female Lille France\n",
"6 Cédric Renaud [email protected] Male Nice France\n",
"7 Charlotte Garnier [email protected] Female Clermont-Ferrand France\n",
"8 Olivier Dumas [email protected] Male Nantes France\n",
"9 Adèle Carpentier [email protected] Female Montpellier France\n",
"*******\n",
"ResponseMetadata(completion_id='b18cae2d-0a46-4b42-b2d8-f347ac1c6225', usage={'completion_tokens': 550, 'prompt_tokens': 99, 'total_tokens': 649, 'completion_tokens_details': None, 'prompt_tokens_details': None, 'input_bytes': 398, 'output_bytes': 2201, 'total_bytes': 2599, 'billed_bytes': 2600, 'billed_credits': 0.026}, model_id='gretelai/auto')\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"# First, we'll generate data from a prompt, passing one sample record to guide the generation process.\n",
"\n",
"import pandas as pd\n",
"\n",
"PROMPT = \"\"\"Generate a mock dataset for users from the Foo company based in France.\n",
"Each user should have the following columns:\n",
"* first_name: traditional French first names.\n",
"* last_name: traditional French surnames.\n",
"* email: formatted as the first letter of their first name followed by their last name @foo.io (e.g., [email protected])\n",
"* gender: Male/Female\n",
"* city: a city in France\n",
"* country: always 'France'.\n",
"\"\"\"\n",
"\n",
"table_headers = [\"first_name\", \"last_name\", \"email\", \"gender\", \"city\", \"country\"]\n",
"table_data = [\n",
" {\n",
" \"first_name\": \"Lea\",\n",
" \"last_name\": \"Martin\",\n",
" \"email\": \"[email protected]\",\n",
" \"gender\": \"Female\",\n",
" \"city\": \"Lyon\",\n",
" \"country\": \"France\",\n",
" }\n",
"]\n",
"\n",
"SAMPLE_DATA = pd.DataFrame(table_data, columns=table_headers)\n",
"\n",
"metadata, synthetic_df = azure_open_ai.generate(\n",
" \"gretelai/auto\",\n",
" PROMPT,\n",
" num_records=10,\n",
" sample_data=SAMPLE_DATA,\n",
")\n",
"\n",
"print(synthetic_df)\n",
"print(\"*******\")\n",
"print(metadata)\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Editing data: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:04, 2.05 records/s]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" first_name last_name email gender city \\\n",
"0 Antoine Roux [email protected] Male Marseille \n",
"1 Léa Durand [email protected] Female Strasbourg \n",
"2 Gaspard Fournier [email protected] Male Bordeaux \n",
"3 Juliette Pierre [email protected] Female Rennes \n",
"4 Étienne Marchand [email protected] Male Toulouse \n",
"5 Aurélie Benoit [email protected] Female Lille \n",
"6 Cédric Renaud [email protected] Male Nice \n",
"7 Charlotte Garnier [email protected] Female Clermont-Ferrand \n",
"8 Olivier Dumas [email protected] Male Nantes \n",
"9 Adèle Carpentier [email protected] Female Montpellier \n",
"\n",
" country occupation education level \n",
"0 France Chef Culinary Arts \n",
"1 France Lawyer Juris Doctor (J.D.) \n",
"2 France Engineer Bachelor's in Engineering \n",
"3 France Teacher Master's in Education \n",
"4 France Doctor Medical Degree (M.D.) \n",
"5 France Artist Bachelor's in Fine Arts \n",
"6 France Programmer Bachelor's in Computer Science \n",
"7 France Nurse Bachelor's in Nursing \n",
"8 France Scientist Ph.D. in Science \n",
"9 France Journalist Bachelor's in Journalism \n",
"*******\n",
"ResponseMetadata(completion_id='989307a2-a741-4b52-9967-26b3b5c7e229', usage={'completion_tokens': 791, 'prompt_tokens': 33, 'total_tokens': 824, 'completion_tokens_details': None, 'prompt_tokens_details': None, 'input_bytes': 134, 'output_bytes': 3165, 'total_bytes': 3299, 'billed_bytes': 3300, 'billed_credits': 0.033}, model_id='gretelai/auto')\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"# Finally, we'll demonstrate Navigator's edit mode, which can augment existing datasets. In this example we'll take the\n",
"# synthetic DataFrame generated above and ask Navigator to augment it with new columns.\n",
"\n",
"EDIT_PROMPT = \"\"\"Edit the table and add the following columns:\n",
"* occupation: a random occupation\n",
"* education level: make it relevant to the occupation\n",
"\"\"\"\n",
"\n",
"metadata, augmented_df = azure_open_ai.edit(\n",
" \"gretelai/auto\",\n",
" EDIT_PROMPT,\n",
" seed_data=synthetic_df\n",
")\n",
"\n",
"print(augmented_df)\n",
"print(\"*******\")\n",
"print(metadata)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "monogretel_venv",
"language": "python",
"name": "monogretel_venv"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.17"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
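
The generation prompt in the notebook asks for emails built from the first letter of the first name followed by the last name at `foo.io`. A minimal sketch of how the generated records could be checked against that rule is below. The helper names `expected_email` and `find_mismatches` are hypothetical, and the accent folding and lowercasing are assumptions, since the prompt does not say how names such as "Étienne" should be normalized.

```python
import unicodedata


def expected_email(first_name: str, last_name: str, domain: str = "foo.io") -> str:
    """Build first-initial + last-name @ domain (hypothetical helper)."""

    def ascii_fold(text: str) -> str:
        # Assumption: accented letters are reduced to their base letter (é -> e).
        decomposed = unicodedata.normalize("NFKD", text)
        return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    local = (ascii_fold(first_name)[:1] + ascii_fold(last_name)).lower()
    # Drop spaces, hyphens, and anything else that is not a letter or digit.
    local = "".join(ch for ch in local if ch.isalnum())
    return f"{local}@{domain}"


def find_mismatches(records):
    """Return the records whose email does not follow the prompt's rule."""
    return [
        row
        for row in records
        if row["email"].lower() != expected_email(row["first_name"], row["last_name"])
    ]
```

In the notebook, this could be called as `find_mismatches(synthetic_df.to_dict("records"))`; an empty list would mean every generated email follows the requested format under the assumptions above.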