Merge branch 'main' into pt111-example-update
atqy authored Sep 23, 2022
2 parents 5c64ca5 + 7af5f4e commit 1ecc0e2
Showing 24 changed files with 248,278 additions and 3 deletions.
@@ -114,8 +114,7 @@
"\n",
"\n",
"# # Build the docker image locally with the image name and then push it to ECR\n",
"image_id=$(docker images -q | head -n1)\n",
"docker tag $image_id ${fullname}\n",
"docker tag deepjavalibrary/djl-serving:0.18.0-deepspeed ${fullname}\n",
"\n",
"docker push $fullname"
]
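The cell above pins the base image tag (`deepjavalibrary/djl-serving:0.18.0-deepspeed`) instead of grabbing whatever `docker images -q | head -n1` happened to return, then tags it as `${fullname}` for the push to ECR. As a minimal sketch (a hypothetical helper, not part of the notebook), the fully qualified ECR URI that `${fullname}` is expected to hold is composed like this:

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Compose the fully qualified ECR image URI that `docker tag` and
    `docker push` expect, e.g.
    123456789012.dkr.ecr.us-east-1.amazonaws.com/my-repo:latest."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"

# Example with a placeholder account ID and region:
fullname = ecr_image_uri("123456789012", "us-east-1",
                         "djl-serving", "0.18.0-deepspeed")
```

Pinning the source tag makes the `docker tag` step deterministic across reruns, which is the point of this change.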
@@ -8,6 +8,9 @@
"\n",
"You can use a streaming labeling job to perpetually send new data objects to Amazon SageMaker Ground Truth to be labeled. Ground Truth streaming labeling jobs remain active until they are manually stopped or have been idle for more than 10 days. You can intermittently send new data objects to workers while the labeling job is active. \n",
"\n",
"#### Note: Streaming labeling jobs are currently not supported in the Ground Truth console. Instead, launch the streaming job via the API and view its statistics on the Ground Truth labeling job.\n",
"\n",
"\n",
"Use this notebook to create a Ground Truth streaming labeling job using any of the [built-in task types](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-task-types.html). You can make necessary parameter changes for the custom workflow. You can either configure the notebook to create a labeling job using your own input data, or run the notebook on *default* mode and use provided, image input data. **To use your own input data, set `DEFAULT` to `False`**."
]
},
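Because streaming jobs must be launched via the API, the streaming-specific part of a `CreateLabelingJob` request replaces the usual S3 manifest with an SNS input topic. A hedged sketch of that request skeleton (all ARNs and names below are hypothetical placeholders, and a full request also needs a `HumanTaskConfig` for the chosen task type):

```python
def streaming_labeling_job_request(job_name, input_topic_arn, output_topic_arn,
                                   s3_output_path, role_arn):
    """Build the streaming-specific portion of a CreateLabelingJob request.
    InputConfig points at an SNS topic that feeds new data objects to the
    job; OutputConfig can name a second topic that publishes results."""
    return {
        "LabelingJobName": job_name,
        "LabelAttributeName": f"{job_name}-label",
        "InputConfig": {
            "DataSource": {"SnsDataSource": {"SnsTopicArn": input_topic_arn}}
        },
        "OutputConfig": {
            "S3OutputPath": s3_output_path,
            "SnsTopicArn": output_topic_arn,
        },
        "RoleArn": role_arn,
    }

request = streaming_labeling_job_request(
    "my-streaming-job",
    "arn:aws:sns:us-east-1:123456789012:gt-input",
    "arn:aws:sns:us-east-1:123456789012:gt-output",
    "s3://my-bucket/gt-output/",
    "arn:aws:iam::123456789012:role/GroundTruthRole",
)
```

The completed dictionary would be passed as keyword arguments to boto3's `sagemaker` client `create_labeling_job` call; sending a message to the input topic then enqueues a new data object for labeling.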
9 changes: 8 additions & 1 deletion index.rst
@@ -39,6 +39,7 @@ We recommend the following notebooks as a broad introduction to the capabilities
:maxdepth: 1
:caption: Prepare data

sagemaker-datawrangler/index
sagemaker_processing/spark_distributed_data_processing/sagemaker-spark-processing_outputs
sagemaker_processing/basic_sagemaker_data_processing/basic_sagemaker_processing_outputs

@@ -210,10 +211,16 @@ More examples
sagemaker-clarify/index
scientific_details_of_algorithms/index
aws_marketplace/index



.. toctree::
:maxdepth: 1
:caption: Community examples

contrib/index
contrib/index





41 changes: 41 additions & 0 deletions sagemaker-datawrangler/README.md
@@ -0,0 +1,41 @@
![Amazon SageMaker Data Wrangler](https://github.com/aws/amazon-sagemaker-examples/raw/main/_static/sagemaker-banner.png)

# Amazon SageMaker Data Wrangler Examples

Example flows that demonstrate how to aggregate and prepare data for Machine Learning using Amazon SageMaker Data Wrangler.

## :books: Background

[Amazon SageMaker Data Wrangler](https://aws.amazon.com/sagemaker/data-wrangler/) reduces the time it takes to aggregate and prepare data for ML. From a single interface in SageMaker Studio, you can import data from Amazon S3, Amazon Athena, Amazon Redshift, AWS Lake Formation, and Amazon SageMaker Feature Store, and in just a few clicks SageMaker Data Wrangler will automatically load, aggregate, and display the raw data. It will then make conversion recommendations based on the source data, transform the data into new features, validate the features, and provide visualizations with recommendations on how to remove common sources of error such as incorrect labels. Once your data is prepared, you can build fully automated ML workflows with Amazon SageMaker Pipelines or import that data into Amazon SageMaker Feature Store.



The [SageMaker example notebooks](https://sagemaker-examples.readthedocs.io/en/latest/) are Jupyter notebooks that demonstrate the usage of Amazon SageMaker.

## :hammer_and_wrench: Setup

Amazon SageMaker Data Wrangler is a feature in Amazon SageMaker Studio. Use this section to learn how to access and get started using Data Wrangler. Do the following:

* Complete each step in [Prerequisites](https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-getting-started.html#data-wrangler-getting-started-prerequisite).

* Follow the procedure in [Access Data Wrangler](https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-getting-started.html#data-wrangler-getting-started-access) to start using Data Wrangler.




## :notebook: Examples

### **[Tabular DataFlow](tabular-dataflow/README.md)**

This example provides a quick walkthrough of how to aggregate and prepare a tabular dataset for Machine Learning using Amazon SageMaker Data Wrangler.

### **[Timeseries DataFlow](timeseries-dataflow/readme.md)**

This example provides a quick walkthrough of how to aggregate and prepare a time-series dataset for Machine Learning using Amazon SageMaker Data Wrangler.

### **[Joined DataFlow](joined-dataflow/readme.md)**

This example provides a quick walkthrough of how to aggregate and prepare a joined dataset for Machine Learning using Amazon SageMaker Data Wrangler.



11 changes: 11 additions & 0 deletions sagemaker-datawrangler/import-flow.md
@@ -0,0 +1,11 @@
## Import Flow

Each example has a flow file available, which you can import directly to expedite the process or to validate the flow.

Here are the steps to import a flow:

* Download the flow file

* In SageMaker Studio, drag and drop the flow file, or use the upload button to browse to the flow file and upload it

![uploadflow](/uploadflow.png)
69 changes: 69 additions & 0 deletions sagemaker-datawrangler/index.rst
@@ -0,0 +1,69 @@


Amazon SageMaker Data Wrangler
=======================================

These example flows demonstrate how to aggregate and prepare data for
Machine Learning using Amazon SageMaker Data Wrangler.


------------------

`Amazon SageMaker Data
Wrangler <https://aws.amazon.com/sagemaker/data-wrangler/>`__ reduces
the time it takes to aggregate and prepare data for ML. From a single
interface in SageMaker Studio, you can import data from Amazon S3,
Amazon Athena, Amazon Redshift, AWS Lake Formation, and Amazon SageMaker
Feature Store, and in just a few clicks SageMaker Data Wrangler will
automatically load, aggregate, and display the raw data. It will then
make conversion recommendations based on the source data, transform the
data into new features, validate the features, and provide
visualizations with recommendations on how to remove common sources of
error such as incorrect labels. Once your data is prepared, you can
build fully automated ML workflows with Amazon SageMaker Pipelines or
import that data into Amazon SageMaker Feature Store.

The `SageMaker example
notebooks <https://sagemaker-examples.readthedocs.io/en/latest/>`__ are
Jupyter notebooks that demonstrate the usage of Amazon SageMaker.

Setup
-------------------------

Amazon SageMaker Data Wrangler is a feature in Amazon SageMaker Studio.
Use this section to learn how to access and get started using Data
Wrangler. Do the following:

- Complete each step in
`Prerequisites <https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-getting-started.html#data-wrangler-getting-started-prerequisite>`__.

- Follow the procedure in `Access Data
Wrangler <https://docs.aws.amazon.com/sagemaker/latest/dg/data-wrangler-getting-started.html#data-wrangler-getting-started-access>`__
to start using Data Wrangler.

Examples
-------------------

Tabular Dataflow
---------------------------

.. toctree::
:maxdepth: 1

tabular-dataflow/index

Timeseries Dataflow
----------------------------

.. toctree::
:maxdepth: 1

timeseries-dataflow/index

Joined Dataflow
----------------------------

.. toctree::
:maxdepth: 1

joined-dataflow/index