diff --git a/use_cases/details/gpt-dp.md b/use_cases/details/gpt-dp.md new file mode 100644 index 00000000..ceb9d5cd --- /dev/null +++ b/use_cases/details/gpt-dp.md @@ -0,0 +1,7 @@ +![Create free text data with privacy guarantees](https://blueprints.gretel.cloud/use_cases/images/gpt-dp.png "Create free text data with privacy guarantees") + +Unlock the potential of your text data while ensuring privacy by applying [differentially private fine-tuning using GPT](https://gretel.ai/blog/generate-differentially-private-synthetic-text-with-gretel-gpt). This method allows you to create a version of your free text data that maintains the integrity of sensitive information while still providing high-quality outputs. + +We recommend having a dataset of at least 10,000 samples to ensure reasonable quality. Note that differential privacy requires more epochs, which leads to longer training times compared to running without differential privacy. + +Prefer coding? Check out the [SDK notebook](https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/docs/notebooks/generate_differentially_private_synthetic_text.ipynb) example. \ No newline at end of file diff --git a/use_cases/details/navigator-ft-simple.md b/use_cases/details/navigator-ft-simple.md new file mode 100644 index 00000000..9e4f56af --- /dev/null +++ b/use_cases/details/navigator-ft-simple.md @@ -0,0 +1,8 @@ +![Generate multi-modal synthetic data with Navigator Fine Tuning](https://blueprints.gretel.cloud/use_cases/images/navigator-ft-hero.png "Generate multi-modal synthetic data with Navigator Fine Tuning") + +If you’re new to Gretel, our Navigator Fine-Tuning blueprint is a great place to start. This blueprint automatically selects our comprehensive multi-modal model, a great one-stop shop for most synthetic data generation needs. Just answer a few questions, review the model configuration and hit **Run**. 
+ +Navigator Fine-Tuning supports multiple tabular modalities of data within a single model, such as numeric, categorical, and free text data. + +Prefer coding? Check out the [SDK notebook](https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/docs/notebooks/demo/navigator-fine-tuning-intro-tutorial.ipynb) example. + diff --git a/use_cases/details/navigator-ft.md b/use_cases/details/navigator-ft.md index f106d3e3..85e9a486 100644 --- a/use_cases/details/navigator-ft.md +++ b/use_cases/details/navigator-ft.md @@ -1,6 +1,6 @@ ![Generate synthetic tabular, text and time series data](https://blueprints.gretel.cloud/use_cases/images/navigator-ft-hero.png "Generate synthetic tabular, text and time series data") -We are excited to announce the public preview of **Navigator Fine Tuning**, the latest advancement in our suite of synthetic data solutions. This new feature builds upon the recent general availability of [Gretel Navigator](https://console.gretel.ai/navigator), enabling you to generate data not only from a prompt, but also from fine-tuning the underlying model on your domain-specific real-world datasets to generate the highest quality synthetic data. +**Navigator Fine Tuning** is the latest advancement in our suite of synthetic data solutions. It builds upon the recent general availability of [Gretel Navigator](https://console.gretel.ai/navigator), enabling you to generate data not only from a prompt, but also from fine-tuning the underlying model on your domain-specific real-world datasets to generate the highest quality synthetic data. One of the standout features of Navigator Fine Tuning is its support for multiple tabular data modalities within a single model. 
This means you can now generate datasets that maintain correlations across: - Numeric Data: Continuous or discrete numbers diff --git a/use_cases/details/synthetic.md b/use_cases/details/synthetic.md index e9e9ed33..9003b4e2 100644 --- a/use_cases/details/synthetic.md +++ b/use_cases/details/synthetic.md @@ -1,6 +1,6 @@ ![Generate synthetic tabular data](https://blueprints.gretel.cloud/use_cases/images/synthetic-tabular-generation.png "Generate synthetic tabular data") -If you’re new to Gretel, our synthetic data blueprint is a great place to start. This gentle introduction to synthetic data generation automatically selects our popular [ACTGAN model](https://gretel.ai/blog/scale-synthetic-data-to-millions-of-rows-with-actgan) and provides a sample healthcare dataset. Just answer a few questions, review the model configuration and hit **Run**. +The synthetic data blueprint is a great introduction to synthetic data generation using our [ACTGAN model](https://gretel.ai/blog/scale-synthetic-data-to-millions-of-rows-with-actgan) for numeric and categorical data using a sample healthcare dataset. Just answer a few questions, review the model configuration and hit **Run**. Prefer coding? Check out the [Gretel 101 notebook](https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/sdk_blueprints/Gretel_101_Blueprint.ipynb) example. Synthesize data in just 4 lines of code! 
diff --git a/use_cases/gretel.json b/use_cases/gretel.json index 805a198c..03a333cf 100644 --- a/use_cases/gretel.json +++ b/use_cases/gretel.json @@ -17,62 +17,22 @@ } }, { - "gtmId": "use-case-synthetic", - "title": "Generate synthetic data from complex tabular datasets", - "description": "Handle high-dimensional data with thousands of columns and millions of rows.", - "cardType": "Console", - "icon": "synthetics.png", - "detailsFileName": "synthetic.md", - "modelType": "synthetics", - "modelCategory": "synthetics", - "defaultConfig": "config_templates/gretel/synthetics/tabular-actgan.yml", - "sampleDataset": { - "fileName": "sample-synthetic-healthcare.csv", - "description": "Use this sample electronic health records (EHR) dataset to synthesize an entirely new set of statistically equivalent records.", - "records": 9999, - "fields": 18, - "trainingTime": "6 mins", - "bytes": 830021 - }, - "button1": { - "label": "Gretel 101 Notebook", - "link": "https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/sdk_blueprints/Gretel_101_Blueprint.ipynb" - }, - "button2": { - "label": "Advanced Examples Notebook", - "link": "https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/sdk_blueprints/Gretel_Advanced_Tabular_Blueprint.ipynb" - } - }, - { - "gtmId": "use-case-navigator-ft", - "title": "[Public Preview] Generate synthetic tabular, text and time series data with Navigator Fine Tuning ", - "description": "Try out our latest synthetic model supporting tabular, text, JSON and time series data in a single dataset.", + "gtmId": "use-case-navigator-ft-simple", + "title": "Generate multi-modal synthetic data with Navigator Fine Tuning", + "description": "Try out our latest synthetic model to combine numeric data, categorical data, free text data, and more in a single dataset.", "cardType": "Console", - "tag": "Preview", - "icon": "navigator-ft.png", - "detailsFileName": "navigator-ft.md", + "tag": "New", + "icon": 
"navigator-ft-simple.png", + "detailsFileName": "navigator-ft-simple.md", "modelType": "navigator_ft", "modelCategory": "synthetics", - "defaultConfig": "config_templates/gretel/synthetics/navigator-ft.yml", - "sampleDataset": { - "fileName": "sample-patient-events.csv", - "description": "This medical dataset contains sequences of annotated events (such as hospital admission, diagnosis, treatment, etc.) for 1,712 synthetic patients.", - "records": 7348, - "fields": 17, - "trainingTime": "25 mins", - "bytes": 2386363 - }, - "button1": { - "label": "SDK Notebook", - "link": "https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/docs/notebooks/demo/navigator-fine-tuning-intro-tutorial.ipynb" - } + "defaultConfig": "config_templates/gretel/synthetics/navigator-ft.yml" }, { "gtmId": "use-case-redact-pii", "title": "Transform unstructured data into AI-ready formats", "description": "De-identify, transform, or label text and tabular data for AI.", "cardType": "Console", - "tag": "New", "icon": "transform.png", "modelType": "transform_v2", "modelCategory": "transform", @@ -86,6 +46,22 @@ "bytes": 5647 } }, + { + "gtmId": "use-case-gpt-dp", + "title": "Create free text data with privacy guarantees", + "description": "Leverage differentially private fine-tuning with GPT to generate a provably-private version of your free text data.", + "cardType": "Console", + "tag": "New", + "icon": "GPTwithDP.png", + "detailsFileName": "gpt-dp.md", + "modelType": "gpt_x", + "modelCategory": "synthetics", + "defaultConfig": "config_templates/gretel/synthetics/natural-language-differential-privacy.yml", + "button1": { + "label": "SDK Notebook", + "link": "https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/docs/notebooks/generate_differentially_private_synthetic_text.ipynb" + } + }, { "gtmId": "use-case-gretel_tuner", "title": "Optimize your synthetic data", @@ -102,6 +78,33 @@ "link": 
"https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/docs/notebooks/demo/gretel-tuner-advanced-tutorial.ipynb" } }, + { + "gtmId": "use-case-synthetic", + "title": "Generate synthetic data from complex tabular datasets", + "description": "Handle high-dimensional data with thousands of columns and millions of rows.", + "cardType": "Console", + "icon": "synthetics.png", + "detailsFileName": "synthetic.md", + "modelType": "synthetics", + "modelCategory": "synthetics", + "defaultConfig": "config_templates/gretel/synthetics/tabular-actgan.yml", + "sampleDataset": { + "fileName": "sample-synthetic-healthcare.csv", + "description": "Use this sample electronic health records (EHR) dataset to synthesize an entirely new set of statistically equivalent records.", + "records": 9999, + "fields": 18, + "trainingTime": "6 mins", + "bytes": 830021 + }, + "button1": { + "label": "Gretel 101 Notebook", + "link": "https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/sdk_blueprints/Gretel_101_Blueprint.ipynb" + }, + "button2": { + "label": "Advanced Examples Notebook", + "link": "https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/sdk_blueprints/Gretel_Advanced_Tabular_Blueprint.ipynb" + } + }, { "gtmId": "use-case-tabular-dp", "title": "Create provably private versions of sensitive data", @@ -165,6 +168,29 @@ "bytes": 63000 } }, + { + "gtmId": "use-case-navigator-ft", + "title": "Generate synthetic tabular, text and time series data with Navigator Fine Tuning ", + "description": "Try out our latest synthetic model supporting tabular, text, JSON and time series data in a single dataset.", + "cardType": "Console", + "icon": "navigator-ft.png", + "detailsFileName": "navigator-ft.md", + "modelType": "navigator_ft", + "modelCategory": "synthetics", + "defaultConfig": "config_templates/gretel/synthetics/navigator-ft.yml", + "sampleDataset": { + "fileName": "sample-patient-events.csv", + "description": "This 
medical dataset contains sequences of annotated events (such as hospital admission, diagnosis, treatment, etc.) for 1,712 synthetic patients.", + "records": 7348, + "fields": 17, + "trainingTime": "25 mins", + "bytes": 2386363 + }, + "button1": { + "label": "SDK Notebook", + "link": "https://colab.research.google.com/github/gretelai/gretel-blueprints/blob/main/docs/notebooks/demo/navigator-fine-tuning-intro-tutorial.ipynb" + } + }, { "gtmId": "use-case-transform-database", "title": "Redact PII in a database", diff --git a/use_cases/icons/navigator-ft-simple.png b/use_cases/icons/navigator-ft-simple.png new file mode 100644 index 00000000..4bfdc0fe Binary files /dev/null and b/use_cases/icons/navigator-ft-simple.png differ diff --git a/use_cases/icons/navigator-ft-simple@2x.png b/use_cases/icons/navigator-ft-simple@2x.png new file mode 100644 index 00000000..b9a45549 Binary files /dev/null and b/use_cases/icons/navigator-ft-simple@2x.png differ diff --git a/use_cases/icons/navigator-ft-simple@3x.png b/use_cases/icons/navigator-ft-simple@3x.png new file mode 100644 index 00000000..9c189d1a Binary files /dev/null and b/use_cases/icons/navigator-ft-simple@3x.png differ diff --git a/use_cases/images/gpt-dp.png b/use_cases/images/gpt-dp.png new file mode 100644 index 00000000..b26d5d9e Binary files /dev/null and b/use_cases/images/gpt-dp.png differ