This repository demonstrates an approach to using Amazon SageMaker's support for bring-your-own-algorithms and frameworks to train and deploy MATLAB machine learning models.
For more information, see the webinar "Running MATLAB Machine Learning Jobs in Amazon SageMaker" from MATLAB Expo 2023.
- An Amazon Web Services™ (AWS) account.
- A Linux® computer with
This repository includes code to build a Docker container that uses MATLAB batch token licensing. Batch token licensing is currently only available as part of a pilot program. For more information about batch token eligibility, contact the MathWorks cloud team at [email protected].
You are responsible for the cost of the AWS services used.
- matlab - MATLAB code that provides sagemaker.MATLABEstimator() and other classes. It uses the Amazon SageMaker Python SDK to call Amazon SageMaker APIs. Training a model requires building the training image from the docker folder. Deploying a model uses MATLAB Compiler SDK's compiler.package.microserviceDockerImage functionality.
- docker - a Dockerfile that builds a container suitable for use in a SageMaker training job to train a MATLAB model. This image contains MATLAB code that glues the Amazon SageMaker training environment to the user-supplied MATLAB code for training models.
- TrainAndDeployClassificationTree - shows using MATLAB in an Amazon SageMaker training job to train a decision tree (using fitctree) on the Fisher iris data, deploying that model to an Amazon SageMaker endpoint, and then requesting a prediction from that endpoint.
- DeployExistingModel.mlx - shows deploying a pretrained MATLAB model to an Amazon SageMaker endpoint, and then requesting a prediction from that endpoint.
If you are using an AWS profile other than default, add it to the .env file in the root of the repo:
echo AWS_PROFILE=myprofile > .env
Put your MATLAB batch licensing token in training.env
This container is based on mathworks/matlab-deps with the following changes and additions:
- Runs as root (Amazon SageMaker requires this for access to mounted volumes).
- Installs MATLAB, Statistics and Machine Learning Toolbox, and Parallel Computing Toolbox.
- Adds matlab-batch.
- Adds MATLAB code from this repository that glues the Amazon SageMaker training environment to the user-supplied MATLAB code.
- Provides an entrypoint that installs any other required products and then calls matlab-batch to run the training code.
To build and push the training image:
cd docker
make build
make test-local
make push
In MATLAB:
- Create a sagemaker.MATLABEstimator().
- Upload training data to Amazon S3.
- Call fit().
- The TrainingFunction is analysed and a .mltbx file is created containing the TrainingFunction and all files required to execute it.
- This .mltbx file is copied to Amazon S3 and its location is passed to the training job via the hyperparameters.
- Any additional products required are passed to the training image via the MATLAB_REQUIRED_PRODUCTS environment variable.
- SageMaker runs the training image with the train command.
- The training image installs any products specified by MATLAB_REQUIRED_PRODUCTS and then calls matlab-batch train.
- train is a function provided by this repo that installs the training job .mltbx file and executes the training function from it.
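The container-side part of these steps can be sketched in Python for illustration. This is not the repository's entrypoint: the helper names `required_products` and `training_command` are hypothetical, and the whitespace-separated format of `MATLAB_REQUIRED_PRODUCTS` is an assumption the source does not confirm. Product installation is left out because the source does not describe how it is done.

```python
import os


def required_products(env=None):
    """Return the product names listed in MATLAB_REQUIRED_PRODUCTS.

    Assumption: the variable holds a whitespace-separated list.
    The real format used by the training image may differ.
    """
    env = os.environ if env is None else env
    return env.get("MATLAB_REQUIRED_PRODUCTS", "").split()


def training_command(sagemaker_command="train"):
    """Map the command SageMaker passes to the container onto matlab-batch."""
    if sagemaker_command != "train":
        raise ValueError(f"unsupported command: {sagemaker_command!r}")
    # `train` is the MATLAB function that installs the job's .mltbx file
    # and then runs the user's training function.
    return ["matlab-batch", "train"]
```

A real entrypoint would install each product returned by `required_products()` before running the command returned by `training_command()`.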
- To deploy a model, you need to provide an inference handler that subclasses sagemaker_inference.DefaultInferenceHandler.
- When deploying a model, the inference handler is compiled and packaged using Compiler SDK's compiler.package.microserviceDockerImage functionality.
  - The Dockerfile generated by compiler.package.microserviceDockerImage is modified before building the image to meet SageMaker's requirements for an inference container.
- The image is pushed to an Amazon Elastic Container Registry (Amazon ECR) registry.
- An Amazon SageMaker endpoint is created that uses that container, and predictions can then be made against that endpoint just like any other Amazon SageMaker endpoint.
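Because the result is an ordinary SageMaker endpoint, it can also be called from outside MATLAB. The following is a minimal sketch using boto3's SageMaker Runtime client. The endpoint name and the helper names are hypothetical, and whether the handler expects a CSV header row is not stated here, so this sketch sends none.

```python
def build_invoke_request(endpoint_name, rows):
    """Build keyword arguments for SageMaker Runtime's invoke_endpoint call."""
    # One CSV line per observation; no header row (an assumption).
    body = "\n".join(",".join(str(value) for value in row) for row in rows)
    return {
        "EndpointName": endpoint_name,
        "ContentType": "text/csv",
        "Accept": "text/csv",
        "Body": body.encode("utf-8"),
    }


def request_prediction(endpoint_name, rows):
    """Send rows to the endpoint and return the raw text response."""
    import boto3  # third-party; imported here so the helper above has no dependencies

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(**build_invoke_request(endpoint_name, rows))
    return response["Body"].read().decode("utf-8")
```

For example, `request_prediction("my-matlab-endpoint", [[5.1, 3.5, 1.4, 0.2]])` would send one row of iris measurements, assuming an endpoint with that name exists and AWS credentials are configured.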
- Create a new MATLAB class that inherits from sagemaker_inference.DefaultInferenceHandler.
- At a minimum, add a %#function pragma to this implementation to specify the MATLAB functions needed to evaluate the model types you want to support.
- Override any of the decode_input, load_model, predict, or encode_output methods.
- decode_input: the default implementation supports input data of type text/csv and returns the data as a MATLAB table.
- encode_output: the default implementation supports encoding a MATLAB table of output data as text/csv.
- load_model: the default implementation loads a variable called model from a MAT file called model.mat in the SageMaker model folder (typically /opt/ml/model).
- predict: the default implementation attempts to evaluate the loaded model as output = model(inputData).
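The division of work between the four methods can be illustrated with a Python analogy. This is not the MATLAB class: `DefaultHandlerSketch`, `TwoDecimalHandler`, and the `handle` method that chains the stages are hypothetical, the order in which the real handler calls its methods is assumed, and the stand-in model simply sums each row instead of loading model.mat.

```python
import csv
import io


class DefaultHandlerSketch:
    """Python analogy of the four handler stages; not the MATLAB class."""

    def decode_input(self, body, content_type="text/csv"):
        if content_type != "text/csv":
            raise ValueError(f"unsupported content type: {content_type}")
        # Stand-in for returning a MATLAB table: a list of numeric rows.
        return [[float(x) for x in row] for row in csv.reader(io.StringIO(body)) if row]

    def load_model(self, model_dir="/opt/ml/model"):
        # The MATLAB default loads the variable `model` from model.mat in
        # this folder. Here a stand-in callable sums each row instead.
        return lambda rows: [[sum(row)] for row in rows]

    def predict(self, model, input_data):
        # Mirrors the default of evaluating output = model(inputData).
        return model(input_data)

    def encode_output(self, output, accept="text/csv"):
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(output)
        return buffer.getvalue()

    def handle(self, body):
        # Assumed order of the stages for one request.
        model = self.load_model()
        return self.encode_output(self.predict(model, self.decode_input(body)))


class TwoDecimalHandler(DefaultHandlerSketch):
    """Overrides a single stage and inherits the other three."""

    def encode_output(self, output, accept="text/csv"):
        formatted = [[f"{value:.2f}" for value in row] for row in output]
        return super().encode_output(formatted, accept)
```

Overriding only `encode_output` in the subclass changes how results are formatted while the decoding, model loading, and prediction defaults stay in place, which is the same pattern the MATLAB handler is meant to support.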
Copyright 2023 The MathWorks, Inc.