Merge branch 'main' into console_not_supported
Showing 22 changed files with 248,274 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
![Amazon SageMaker Data Wrangler](https://github.com/aws/amazon-sagemaker-examples/raw/main/_static/sagemaker-banner.png) | ||
|
||
# Amazon SageMaker Data Wrangler Examples | ||
|
||
Example flows that demonstrate how to aggregate and prepare data for Machine Learning using Amazon SageMaker Data Wrangler. | ||
|
||
## :books: Background | ||
|
||
[Amazon SageMaker Data Wrangler](https://aws.amazon.com/sagemaker/data-wrangler/) reduces the time it takes to aggregate and prepare data for ML. From a single interface in SageMaker Studio, you can import data from Amazon S3, Amazon Athena, Amazon Redshift, AWS Lake Formation, and Amazon SageMaker Feature Store, and in just a few clicks SageMaker Data Wrangler will automatically load, aggregate, and display the raw data. It will then make conversion recommendations based on the source data, transform the data into new features, validate the features, and provide visualizations with recommendations on how to remove common sources of error such as incorrect labels. Once your data is prepared, you can build fully automated ML workflows with Amazon SageMaker Pipelines or import that data into Amazon SageMaker Feature Store. | ||
|
||
|
||
|
||
The [SageMaker example notebooks](https://sagemaker-examples.readthedocs.io/en/latest/) are Jupyter notebooks that demonstrate the usage of Amazon SageMaker. | ||
|
||
## :hammer_and_wrench: Setup | ||
|
||
Amazon SageMaker Data Wrangler is a feature in Amazon SageMaker Studio. Use this section to learn how to access and get started using Data Wrangler. Do the following: | ||
|
||
* Complete each step in [Prerequisites](https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-getting-started.html#data-wrangler-getting-started-prerequisite). | ||
|
||
* Follow the procedure in [Access Data Wrangler](https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-getting-started.html#data-wrangler-getting-started-access) to start using Data Wrangler. | ||
|
||
|
||
|
||
|
||
## :notebook: Examples | ||
|
||
### **[Tabular DataFlow](tabular-dataflow/README.md)** | ||
|
||
This example provides a quick walkthrough of how to aggregate and prepare a tabular dataset for Machine Learning using Amazon SageMaker Data Wrangler. | ||
|
||
### **[Timeseries DataFlow](timeseries-dataflow/readme.md)** | ||
|
||
This example provides a quick walkthrough of how to aggregate and prepare a time series dataset for Machine Learning using Amazon SageMaker Data Wrangler. | ||
|
||
### **[Joined DataFlow](joined-dataflow/readme.md)** | ||
|
||
This example provides a quick walkthrough of how to aggregate and prepare a joined dataset for Machine Learning using Amazon SageMaker Data Wrangler. | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
## Import Flow | ||
|
||
Each example includes a flow file that you can import directly to expedite the process or to validate the flow. | ||
|
||
Follow these steps to import the flow: | ||
|
||
* Download the flow file | ||
|
||
* In SageMaker Studio, drag and drop the flow file, or use the upload button to browse to the flow file and upload it | ||
|
||
![uploadflow](/uploadflow.png) |
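
Before uploading, it can help to confirm that the downloaded flow file is intact. The sketch below is only an illustration: it assumes the flow file is a JSON document, and the `nodes` key and the `summarize_flow` helper are hypothetical names, not part of any Data Wrangler API.

```python
import json
from pathlib import Path


def summarize_flow(path):
    """Return basic facts about a flow file, assuming it is a JSON document."""
    text = Path(path).read_text(encoding="utf-8")
    # json.loads raises json.JSONDecodeError if the download is truncated or corrupt.
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    # "nodes" is an assumed key name; adjust it to match your flow file.
    nodes = data.get("nodes", [])
    return {"top_level_keys": sorted(data), "node_count": len(nodes)}
```

Calling `summarize_flow` with the path of the downloaded file returns the top-level keys and a step count if the file parses, and raises an error otherwise, which is a quick signal to download the file again before importing it into SageMaker Studio.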
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
|
||
|
||
Amazon SageMaker Data Wrangler | ||
======================================= | ||
|
||
These example flows demonstrate how to aggregate and prepare data for | ||
Machine Learning using Amazon SageMaker Data Wrangler. | ||
|
||
|
||
------------------ | ||
|
||
`Amazon SageMaker Data | ||
Wrangler <https://aws.amazon.com/sagemaker/data-wrangler/>`__ reduces | ||
the time it takes to aggregate and prepare data for ML. From a single | ||
interface in SageMaker Studio, you can import data from Amazon S3, | ||
Amazon Athena, Amazon Redshift, AWS Lake Formation, and Amazon SageMaker | ||
Feature Store, and in just a few clicks SageMaker Data Wrangler will | ||
automatically load, aggregate, and display the raw data. It will then | ||
make conversion recommendations based on the source data, transform the | ||
data into new features, validate the features, and provide | ||
visualizations with recommendations on how to remove common sources of | ||
error such as incorrect labels. Once your data is prepared, you can | ||
build fully automated ML workflows with Amazon SageMaker Pipelines or | ||
import that data into Amazon SageMaker Feature Store. | ||
|
||
The `SageMaker example | ||
notebooks <https://sagemaker-examples.readthedocs.io/en/latest/>`__ are | ||
Jupyter notebooks that demonstrate the usage of Amazon SageMaker. | ||
|
||
Setup | ||
------------------------- | ||
|
||
Amazon SageMaker Data Wrangler is a feature in Amazon SageMaker Studio. | ||
Use this section to learn how to access and get started using Data | ||
Wrangler. Do the following: | ||
|
||
- Complete each step in | ||
`Prerequisites <https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-getting-started.html#data-wrangler-getting-started-prerequisite>`__. | ||
|
||
- Follow the procedure in `Access Data | ||
Wrangler <https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-getting-started.html#data-wrangler-getting-started-access>`__ | ||
to start using Data Wrangler. | ||
|
||
Examples | ||
------------------- | ||
|
||
Tabular Dataflow | ||
--------------------------- | ||
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
|
||
tabular-dataflow/index | ||
|
||
Timeseries Dataflow | ||
---------------------------- | ||
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
|
||
timeseries-dataflow/index | ||
|
||
Joined Dataflow | ||
---------------------------- | ||
|
||
.. toctree:: | ||
:maxdepth: 1 | ||
|
||
joined-dataflow/index |