Skip to content

Commit

Permalink
fix typo in blueprint
Browse files Browse the repository at this point in the history
  • Loading branch information
johnnygreco committed Oct 23, 2023
1 parent 41a7c81 commit 3249de9
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion sdk_blueprints/Gretel_101_Blueprint.ipynb
Original file line number Diff line number Diff line change
@@ -1 +1 @@
{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[],"authorship_tag":"ABX9TyMuh9CqcuqP+k1q0cAuIMgJ"},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"}},"cells":[{"cell_type":"markdown","source":["[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/sdk_blueprints/Gretel_101_Blueprint.ipynb)\n","\n","<br>\n","\n","<center><a href=https://gretel.ai/><img src=\"https://gretel-public-website.s3.us-west-2.amazonaws.com/assets/brand/gretel_brand_wordmark.svg\" alt=\"Gretel\" width=\"350\"/></a></center>\n","\n","<br>\n","\n","## Welcome to the Gretel 101 Blueprint!\n","\n","In this Blueprint, we will use Gretel to train a deep generative model and use it to generate high-quality synthetic (tabular) data. We will accomplish this by submitting training and generation jobs to the [Gretel Cloud](https://gretel.ai/faqs/gretel-cloud) via [Gretel's Python SDK](https://docs.gretel.ai/guides/environment-setup/cli-and-sdk).\n","\n","Behind the scenes, Gretel will spin up workers with the necessary compute resources, set up the model with your desired configuration, and perform the submitted task.\n","\n","## Create your Gretel account\n","\n","To get started, you will need to [sign up for a free Gretel account](https://console.gretel.ai/).\n","\n","<br>\n","\n","#### Ready? Let's go πŸš€"],"metadata":{"id":"nwpvdB3Jn5hG"}},{"cell_type":"markdown","source":["## πŸ’Ύ Install `gretel-client` and its dependencies"],"metadata":{"id":"MPHEAxLufyEo"}},{"cell_type":"code","execution_count":null,"metadata":{"id":"zFeKqpkunEo1"},"outputs":[],"source":["%%capture\n","!pip install gretel-client"]},{"cell_type":"markdown","source":["## πŸ›œ Configure your Gretel session\n","\n","- The `Gretel` object provides a high-level interface for streamlining interactions with Gretel's APIs.\n","\n","- Each `Gretel` instance is bound to a single [Gretel project](https://docs.gretel.ai/guides/gretel-fundamentals/projects).\n","\n","- Running the cell below will prompt you for your Gretel API key, which you can retrieve [here](https://console.gretel.ai/users/me/key).\n","\n","- With `validate=True`, your login credentials will be validated immediately at instantiation."],"metadata":{"id":"DNdDXiI-Xkf1"}},{"cell_type":"code","source":["from gretel_client import Gretel\n","\n","gretel = Gretel(api_key=\"prompt\", validate=True)"],"metadata":{"id":"5qnVwoPZx4j0"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# @title πŸ—‚οΈ Pick a tabular dataset πŸ‘‡ { display-mode: \"form\" }\n","dataset_path_dict = {\n"," \"adult income in the USA (14000 records, 15 fields)\": \"https://raw.githubusercontent.com/gretelai/gretel-blueprints/main/sample_data/us-adult-income.csv\",\n"," \"hospital length of stay (9999 records, 18 fields)\": \"https://raw.githubusercontent.com/gretelai/gretel-blueprints/main/sample_data/sample-synthetic-healthcare.csv\",\n"," \"customer churn (7032 records, 21 fields)\": \"https://raw.githubusercontent.com/gretelai/gretel-blueprints/main/sample_data/monthly-customer-payments.csv\"\n","}\n","\n","dataset = \"adult income in the USA (14000 records, 15 fields)\" # @param [\"adult income in the USA (14000 records, 15 fields)\", \"hospital length of stay (9999 records, 18 fields)\", \"customer churn (7032 records, 21 fields)\"]\n","dataset = dataset_path_dict[dataset]\n"],"metadata":{"id":"uRbY7vk3tSBg"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["import pandas as pd\n","\n","# explore the data using pandas\n","df = pd.read_csv(dataset)\n","df.head()"],"metadata":{"id":"cW3VKpyPvm6W"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## πŸ‹οΈβ€β™‚οΈ Train a generative model\n","\n","- The [tabular-actgan](https://github.com/gretelai/gretel-blueprints/blob/main/config_templates/gretel/synthetics/tabular-actgan.yml) base config tells Gretel which model to train and how to configure it.\n","\n","- You can replace `tabular-actgan` with the path to a custom config or select any of the tabular configs [listed here](https://github.com/gretelai/gretel-blueprints/tree/main/config_templates/gretel/synthetics).\n","\n","- The training data is passed in using the `data_source` argument. Its type can be a file path or `DataFrame`.\n","\n","- **Tip:** Click the printed Console URL to monitor your job's progress in the Gretel Console."],"metadata":{"id":"SwROZthrvXil"}},{"cell_type":"code","source":["trained = gretel.submit_train(\"tabular-actgan\", data_source=dataset)"],"metadata":{"id":"i89eGZwIxSCW"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## 🧐 Evaluate the synthetic data quality\n","\n","- Gretel automatically creates a [synthetic data quality report](https://docs.gretel.ai/reference/evaluate/synthetic-data-quality-report) for each model you train.\n","\n","- The training results object returned by `submit_train` has a `GretelReport` attribute for viewing the quality report.\n"],"metadata":{"id":"eljkfb8jb_hK"}},{"cell_type":"code","source":["# view the quality scores\n","print(trained.report)"],"metadata":{"id":"bNZqhFPOclrV"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# display the full report within this notebook\n","trained.report.display_in_notebook()"],"metadata":{"id":"3QMiP7lKecE5"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# inspect the synthetic data used to create the report\n","df_synth_report = trained.fetch_report_synthetic_data()\n","df_synth_report.head()"],"metadata":{"id":"2dHuQT_cuIno"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## πŸ€– Generate synthetic data\n","\n","- The `model_id` argument can be the ID of any trained model within the current project.\n"],"metadata":{"id":"ZIeY7TczxvDV"}},{"cell_type":"code","source":["generated = gretel.submit_generate(trained.model_id, num_records=1000)"],"metadata":{"id":"J6XZUuR2eguX"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# inspect the generated synthetic data\n","generated.synthetic_data.head()"],"metadata":{"id":"-_do0Kvvunv2"},"execution_count":null,"outputs":[]}]}
{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[],"authorship_tag":"ABX9TyNosAwAWvwVU9i43TeCxQrP"},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"}},"cells":[{"cell_type":"markdown","source":["[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/sdk_blueprints/Gretel_101_Blueprint.ipynb)\n","\n","<br>\n","\n","<center><a href=https://gretel.ai/><img src=\"https://gretel-public-website.s3.us-west-2.amazonaws.com/assets/brand/gretel_brand_wordmark.svg\" alt=\"Gretel\" width=\"350\"/></a></center>\n","\n","<br>\n","\n","## Welcome to the Gretel 101 Blueprint!\n","\n","In this Blueprint, we will use Gretel to train a deep generative model and use it to generate high-quality synthetic (tabular) data. We will accomplish this by submitting training and generation jobs to the [Gretel Cloud](https://gretel.ai/faqs/gretel-cloud) via [Gretel's Python SDK](https://docs.gretel.ai/guides/environment-setup/cli-and-sdk).\n","\n","Behind the scenes, Gretel will spin up workers with the necessary compute resources, set up the model with your desired configuration, and perform the submitted task.\n","\n","## Create your Gretel account\n","\n","To get started, you will need to [sign up for a free Gretel account](https://console.gretel.ai/).\n","\n","<br>\n","\n","#### Ready? Let's go πŸš€"],"metadata":{"id":"nwpvdB3Jn5hG"}},{"cell_type":"markdown","source":["## πŸ’Ύ Install `gretel-client` and its dependencies"],"metadata":{"id":"MPHEAxLufyEo"}},{"cell_type":"code","execution_count":null,"metadata":{"id":"zFeKqpkunEo1"},"outputs":[],"source":["%%capture\n","!pip install gretel-client"]},{"cell_type":"markdown","source":["## πŸ›œ Configure your Gretel session\n","\n","- The `Gretel` object provides a high-level interface for streamlining interactions with Gretel's APIs.\n","\n","- Each `Gretel` instance is bound to a single [Gretel project](https://docs.gretel.ai/guides/gretel-fundamentals/projects).\n","\n","- Running the cell below will prompt you for your Gretel API key, which you can retrieve [here](https://console.gretel.ai/users/me/key).\n","\n","- With `validate=True`, your login credentials will be validated immediately at instantiation."],"metadata":{"id":"DNdDXiI-Xkf1"}},{"cell_type":"code","source":["from gretel_client import Gretel\n","\n","gretel = Gretel(api_key=\"prompt\", validate=True)"],"metadata":{"id":"5qnVwoPZx4j0"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# @title πŸ—‚οΈ Pick a tabular dataset πŸ‘‡ { display-mode: \"form\" }\n","dataset_path_dict = {\n"," \"adult income in the USA (14000 records, 15 fields)\": \"https://raw.githubusercontent.com/gretelai/gretel-blueprints/main/sample_data/us-adult-income.csv\",\n"," \"hospital length of stay (9999 records, 18 fields)\": \"https://raw.githubusercontent.com/gretelai/gretel-blueprints/main/sample_data/sample-synthetic-healthcare.csv\",\n"," \"customer churn (7032 records, 21 fields)\": \"https://raw.githubusercontent.com/gretelai/gretel-blueprints/main/sample_data/monthly-customer-payments.csv\"\n","}\n","\n","dataset = \"adult income in the USA (14000 records, 15 fields)\" # @param [\"adult income in the USA (14000 records, 15 fields)\", \"hospital length of stay (9999 records, 18 fields)\", \"customer churn (7032 records, 21 fields)\"]\n","dataset = dataset_path_dict[dataset]\n"],"metadata":{"id":"uRbY7vk3tSBg"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["import pandas as pd\n","\n","# explore the data using pandas\n","df = pd.read_csv(dataset)\n","df.head()"],"metadata":{"id":"cW3VKpyPvm6W"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## πŸ‹οΈβ€β™‚οΈ Train a generative model\n","\n","- The [tabular-actgan](https://github.com/gretelai/gretel-blueprints/blob/main/config_templates/gretel/synthetics/tabular-actgan.yml) base config tells Gretel which model to train and how to configure it.\n","\n","- You can replace `tabular-actgan` with the path to a custom config file, or you can select any of the tabular configs [listed here](https://github.com/gretelai/gretel-blueprints/tree/main/config_templates/gretel/synthetics).\n","\n","- The training data is passed in using the `data_source` argument. Its type can be a file path or `DataFrame`.\n","\n","- **Tip:** Click the printed Console URL to monitor your job's progress in the Gretel Console."],"metadata":{"id":"SwROZthrvXil"}},{"cell_type":"code","source":["trained = gretel.submit_train(\"tabular-actgan\", data_source=dataset)"],"metadata":{"id":"i89eGZwIxSCW"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## 🧐 Evaluate the synthetic data quality\n","\n","- Gretel automatically creates a [synthetic data quality report](https://docs.gretel.ai/reference/evaluate/synthetic-data-quality-report) for each model you train.\n","\n","- The training results object returned by `submit_train` has a `GretelReport` attribute for viewing the quality report.\n"],"metadata":{"id":"eljkfb8jb_hK"}},{"cell_type":"code","source":["# view the quality scores\n","print(trained.report)"],"metadata":{"id":"bNZqhFPOclrV"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# display the full report within this notebook\n","trained.report.display_in_notebook()"],"metadata":{"id":"3QMiP7lKecE5"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# inspect the synthetic data used to create the report\n","df_synth_report = trained.fetch_report_synthetic_data()\n","df_synth_report.head()"],"metadata":{"id":"2dHuQT_cuIno"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## πŸ€– Generate synthetic data\n","\n","- The `model_id` argument can be the ID of any trained model within the current project.\n"],"metadata":{"id":"ZIeY7TczxvDV"}},{"cell_type":"code","source":["generated = gretel.submit_generate(trained.model_id, num_records=1000)"],"metadata":{"id":"J6XZUuR2eguX"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["# inspect the generated synthetic data\n","generated.synthetic_data.head()"],"metadata":{"id":"-_do0Kvvunv2"},"execution_count":null,"outputs":[]}]}

0 comments on commit 3249de9

Please sign in to comment.