Problem statement: One of the most anticipated capabilities of machine learning and AI is helping people with disabilities. Deaf and hard-of-hearing people face daily communication barriers in situations that most of the population takes for granted. In this ZenML Project you will see how computer vision can be used to build a model that helps bridge that gap by learning to recognize American Sign Language signs and the meaning of each one.
This project uses ZenML to create a pipeline that trains a model to detect and recognize the American Sign Language alphabet in real-time images using YOLOv5, MLflow and the Vertex AI Platform.
The purpose of this repository is to demonstrate how ZenML empowers you to build, track and deploy a computer vision pipeline using some of the most popular tools in the industry:
- By offering you a framework and template on which you can base your own work
- By running a custom-code object detection algorithm, YOLOv5
- By integrating with tools like MLflow to track the hyperparameters and metrics of the model
- By allowing you to train your model on Google Vertex AI Platform with minimal effort
Note: This project is based on Interactive ABC's with American Sign Language. The main difference is that this project uses ZenML to create the training pipeline, together with YOLOv5, MLflow and the Vertex AI Platform.
In order to build a model that can detect and recognize the American Sign Language alphabet in real-time images we will need to do the following steps:
- Download the dataset from Roboflow
- Augment the training and validation sets using Albumentations
- Train the model starting from a pretrained YOLOv5 model, tracking the hyperparameters and metrics with MLflow, in a GPU environment provided by Google's Vertex AI Step Operator stack component.
- Load the model in a different pipeline that deploys the model using BentoML and the provided ZenML integration.
- Create an inference pipeline that will use the deployed model to detect and recognize the American Sign Language alphabet in test images from the first pipeline.
In order to follow this tutorial, you need to have the following software installed on your local machine:
- Python (version 3.7-3.9)
- Docker
- GCloud CLI (authenticated)
- MLflow Tracking Server (deployed remotely)
- Remote ZenML Server: a remote deployment of the ZenML HTTP server and database
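As a quick sanity check on the first prerequisite, the snippet below is a minimal sketch that tests whether an interpreter version falls inside the 3.7-3.9 range listed above. The helper name `is_supported` is hypothetical and not part of the project.

```python
import sys

# Supported range taken from the prerequisites list (inclusive on both ends).
SUPPORTED_MIN = (3, 7)
SUPPORTED_MAX = (3, 9)


def is_supported(version=None):
    """Return True if the (major, minor) version is within 3.7-3.9."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:2])
    status = "supported" if is_supported() else "outside the supported range"
    print(f"Python {current} is {status}")
```

Running it inside the environment you plan to use shows whether you need to switch interpreters before installing the requirements.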
For advanced use cases, such as using a remote orchestrator or step operator like Vertex AI, or sharing stacks and pipeline information with a team, you need a separate, non-local ZenML Server that is accessible from your machine and from every stack component that needs to reach it. See the ZenML documentation for more information about this use case.
There are two ways to get access to a remote ZenML Server:
- Deploy and manage the server manually on your own cloud.
- Sign up for ZenML Enterprise and get access to a hosted version of the ZenML Server with no setup required.
Let's jump into the Python packages you need. Within the Python environment of your choice, run:
git clone https://github.com/zenml-io/zenml-projects.git
cd zenml-projects
git submodule update --init --recursive
cd sign-language-detection-yolov5
pip install -r requirements.txt
pip install -r yolov5/requirements.txt
Starting with ZenML 0.20.0, ZenML comes bundled with a React-based dashboard that lets you observe your stacks, stack components and pipeline DAGs. Because this project relies on a remote ZenML Server, connect your client to that server and then initialize a ZenML repository in the project directory:
zenml connect --url=$ZENML_SERVER_URL
zenml init
This section shows how to create the Google Cloud resources for this project using the gcloud CLI. Follow the gcloud installation guide first if you don't have it set up.
List the current configurations and check that project_id is set to your GCP project:
gcloud config list
If not, use:
gcloud config set project <PROJECT_ID>
Create a service account:
gcloud iam service-accounts create <NAME>
# Example:
gcloud iam service-accounts create zenml-sa
Grant permission to the service account:
gcloud projects add-iam-policy-binding <PROJECT_ID> --member="serviceAccount:<SA-NAME>@<PROJECT_ID>.iam.gserviceaccount.com" --role=<ROLE>
# Example:
gcloud projects add-iam-policy-binding zenml-vertex-ai --member="serviceAccount:zenml-sa@zenml-vertex-ai.iam.gserviceaccount.com" --role=roles/storage.admin
gcloud projects add-iam-policy-binding zenml-vertex-ai --member="serviceAccount:zenml-sa@zenml-vertex-ai.iam.gserviceaccount.com" --role=roles/aiplatform.admin
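The member string in these commands follows a fixed pattern, so it can help to build it programmatically when granting several roles. The sketch below only assembles the argument lists for the gcloud command shown above and does not execute anything; the function names are hypothetical.

```python
def service_account_member(sa_name, project_id):
    """Build the IAM member string for a service account."""
    return f"serviceAccount:{sa_name}@{project_id}.iam.gserviceaccount.com"


def add_binding_args(sa_name, project_id, role):
    """Return the argv list for one add-iam-policy-binding call."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member={service_account_member(sa_name, project_id)}",
        f"--role={role}",
    ]


# The two roles granted in the example above.
ROLES = ["roles/storage.admin", "roles/aiplatform.admin"]

for role in ROLES:
    print(" ".join(add_binding_args("zenml-sa", "zenml-vertex-ai", role)))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run` if you prefer to run the commands from Python.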
Generate a key file:
gcloud iam service-accounts keys create <FILE-NAME>.json --iam-account=<SA-NAME>@<PROJECT_ID>.iam.gserviceaccount.com
# Example:
gcloud iam service-accounts keys create credentials.json --iam-account=zenml-sa@zenml-vertex-ai.iam.gserviceaccount.com
Make sure that these credentials are available in your environment so that the relevant permissions are picked up for GCP authentication:
export GOOGLE_APPLICATION_CREDENTIALS="<PATH_TO_JSON_KEY_FILE>"
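Before launching a pipeline, it can be useful to confirm that the variable points at a readable service account key. The following is a minimal sketch using only the standard library; `check_key_file` is a hypothetical helper, and the set of required fields is an assumption based on the usual layout of a service account JSON key.

```python
import json
import os
from pathlib import Path

# Fields normally present in a service account key file (assumed subset).
REQUIRED_KEYS = {"type", "project_id", "client_email", "private_key"}


def check_key_file(path):
    """Return a list of problems found; an empty list means the file looks fine."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"{path} does not exist or is not a file"]
    try:
        data = json.loads(key_path.read_text())
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]
    problems = []
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if data.get("type") != "service_account":
        problems.append("field 'type' is not 'service_account'")
    return problems


if __name__ == "__main__":
    key_file = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_file is None:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set")
    else:
        print(check_key_file(key_file) or "key file looks fine")
```

This only checks the file's shape. It does not verify that the key is still valid or that the account has the roles granted earlier.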
Create a GCS bucket. Vertex AI and ZenML will use this bucket to store the output artifacts of the training run:
gsutil mb -l <REGION> gs://<BUCKET-NAME>
# Example:
gsutil mb -l europe-west1 gs://zenml-bucket
Set up a container registry. ZenML will push your job images to this registry for Vertex AI to use.
a) Enable Container Registry
b) Authenticate your local docker CLI with your GCP container registry:
docker pull busybox
docker tag busybox gcr.io/<PROJECT-ID>/busybox
docker push gcr.io/<PROJECT-ID>/busybox
Enable the Vertex AI API
To be able to use custom Vertex AI jobs, you first need to enable the Vertex AI API in the Google Cloud console.
Set a GCP bucket as your artifact store:
zenml artifact-store register <NAME> --flavor=gcp --path=<GCS_BUCKET_PATH>
# Example:
zenml artifact-store register gcp-store --flavor=gcp --path=gs://zenml-bucket
Create a Vertex step operator:
zenml step-operator register <NAME> \
--flavor=vertex \
--project=<PROJECT-ID> \
--region=<REGION> \
--machine_type=<MACHINE-TYPE> \
--accelerator_type=<ACCELERATOR-TYPE> \
--accelerator_count=<ACCELERATOR-COUNT> \
--service_account_path=<SERVICE-ACCOUNT-KEY-FILE-PATH>
# Example:
zenml step-operator register vertex \
--flavor=vertex \
--project=zenml-vertex-ai \
--region=europe-west1 \
--machine_type=n1-standard-4 \
--accelerator_type=NVIDIA_TESLA_K80 \
--accelerator_count=1 \
--service_account_path=credentials.json
See the Vertex AI documentation for the list of available machine types.
Register a container registry:
zenml container-registry register <NAME> --flavor=default --uri=gcr.io/<PROJECT-ID>/<IMAGE>
# Example:
zenml container-registry register gcr_registry --flavor=default --uri=gcr.io/zenml-vertex-ai/busybox
Register a remote MLflow tracking server:
zenml experiment-tracker register <NAME> --flavor=mlflow --tracking_uri=<MLFLOW-TRACKING-SERVER-URI> --tracking_username=<USERNAME> --tracking_password=<PASSWORD>
# Example:
zenml experiment-tracker register mlflow --flavor=mlflow --tracking_uri=http://mlflow_zenml_yolo:5000 --tracking_username=admin --tracking_password=admin
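Typing the tracking password directly into the command leaves it in your shell history. One option, sketched below, is to read the values from environment variables and assemble the same command from them. The variable names are the ones MLflow clients conventionally read; using them here is a choice made for this sketch, and `tracker_register_args` is a hypothetical helper.

```python
import os


def tracker_register_args(name, env=None):
    """Build the argv list for registering an MLflow experiment tracker.

    Values are read from `env` (defaults to os.environ); a KeyError is
    raised if one of the three variables is missing.
    """
    env = os.environ if env is None else env
    return [
        "zenml", "experiment-tracker", "register", name,
        "--flavor=mlflow",
        f"--tracking_uri={env['MLFLOW_TRACKING_URI']}",
        f"--tracking_username={env['MLFLOW_TRACKING_USERNAME']}",
        f"--tracking_password={env['MLFLOW_TRACKING_PASSWORD']}",
    ]


# Example with an explicit mapping instead of the real environment.
example_env = {
    "MLFLOW_TRACKING_URI": "http://mlflow_zenml_yolo:5000",
    "MLFLOW_TRACKING_USERNAME": "admin",
    "MLFLOW_TRACKING_PASSWORD": "admin",
}
print(tracker_register_args("mlflow", example_env)[:5])
```

The resulting list can be handed to `subprocess.run` so the password never appears on an interactive command line.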
Register the new stack (change names accordingly):
zenml stack register vertex_mlflow_stack \
-o default \
-c gcr_registry \
-a gcp-store \
-s vertex \
-e mlflow
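The short flags in this command each select one stack component. As a reading aid, the sketch below maps component types to the flags used above and rebuilds the same argument list from a dictionary. The mapping is inferred from the component names registered in the previous steps, and `stack_register_args` is a hypothetical helper.

```python
# Component type -> short flag, as used in the command above.
COMPONENT_FLAGS = {
    "orchestrator": "-o",
    "container_registry": "-c",
    "artifact_store": "-a",
    "step_operator": "-s",
    "experiment_tracker": "-e",
}


def stack_register_args(stack_name, components):
    """Build the argv list for `zenml stack register` from a component dict."""
    unknown = set(components) - set(COMPONENT_FLAGS)
    if unknown:
        raise ValueError(f"unknown component types: {sorted(unknown)}")
    args = ["zenml", "stack", "register", stack_name]
    for kind, flag in COMPONENT_FLAGS.items():
        if kind in components:
            args += [flag, components[kind]]
    return args


components = {
    "orchestrator": "default",
    "container_registry": "gcr_registry",
    "artifact_store": "gcp-store",
    "step_operator": "vertex",
    "experiment_tracker": "mlflow",
}
print(" ".join(stack_register_args("vertex_mlflow_stack", components)))
```

Keeping the component names in one dictionary makes it harder to register a stack under one name and activate it under another.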
View all your stacks: zenml stack list
Activate the stack:
zenml stack set vertex_mlflow_stack
Install relevant integrations:
zenml integration install bentoml gcp -y
In order to run the project, simply run:
python run.py -c train_and_deploy_and_predict
A blog explaining this project in depth is forthcoming!
A video that explains the project is also available.
The training pipeline is made up of the following steps:
- data_loader.py: Loads the data from the Roboflow platform using the given API key and saves it to the artifact store as a dictionary that contains all information about each set, using the zenml.artifacts.DatasetArtifact class.
- train_augmenter.py: Loads the training set data from the artifact store and performs data augmentation using the albumentations library. It then saves the augmented data to the artifact store as a zenml.artifacts.DatasetArtifact.
- vald_augmenter.py: Loads the validation set data from the artifact store and performs data augmentation using the albumentations library. It then saves the augmented data to the artifact store as a zenml.artifacts.DatasetArtifact.
- trainer.py: Loads the augmented training and validation data from the artifact store and trains the model using the yolov5 library in a custom Vertex AI job, tracking the training process with the mlflow library. It then saves the trained model to the artifact store as a zenml.artifacts.ModelArtifact.
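Augmenting an object detection dataset means transforming the bounding boxes together with the image. The sketch below illustrates this for one simple transform, a horizontal flip, on labels in the YOLO text format (class id followed by normalized x-center, y-center, width and height). It is a standard-library illustration of the idea only, not the augmentation code this project runs through albumentations, and both function names are hypothetical.

```python
from typing import List, Tuple

# (class_id, x_center, y_center, width, height), coordinates normalized to [0, 1]
Box = Tuple[int, float, float, float, float]


def parse_yolo_labels(text: str) -> List[Box]:
    """Parse the contents of a YOLO label file into a list of boxes."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        class_id = int(parts[0])
        x, y, w, h = (float(value) for value in parts[1:])
        boxes.append((class_id, x, y, w, h))
    return boxes


def hflip_boxes(boxes: List[Box]) -> List[Box]:
    """Mirror boxes left-to-right; only the x-center changes."""
    return [(c, 1.0 - x, y, w, h) for c, x, y, w, h in boxes]


labels = parse_yolo_labels("3 0.25 0.5 0.1 0.2\n")
print(hflip_boxes(labels))
```

A library such as albumentations applies the same kind of coordinate bookkeeping for every transform in its pipeline, which is why the augmenter steps can hand consistent image and label pairs to the trainer.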
The Deployment pipeline is made up of the following steps:
- model_loader.py: Loads the trained model from the previously run training pipeline and saves it locally.
- deployment_trigger.py: Triggers the deployment process once the model is loaded locally.
- bento_builder.py: Builds a BentoML bundle from the model, saves it to the artifact store and passes it to the next step, bento_deployer.py.
- bento_deployer.py: Deploys the BentoML bundle to the Vertex AI endpoint.
The Inference pipeline is made up of the following steps:
- inference_loader.py: Loads the test set data from the first step of the training pipeline and saves it locally.
- prediction_service_loader.py: Loads the ZenML prediction service in order to make predictions on the test set data.
- predictor.py: Runs the prediction service on the test set data and prints the results.
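To make the last step more concrete, the sketch below shows one way raw detections could be turned into readable letters: drop low-confidence detections, sort the rest, and map class ids to names. The `Detection` type, the 0.5 threshold and the assumption that class id i corresponds to the i-th letter of the alphabet are all hypothetical; the actual output format of the prediction service and the dataset's class order may differ.

```python
import string
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: class ids 0..25 map to the letters A..Z in order.
CLASS_NAMES = list(string.ascii_uppercase)


@dataclass
class Detection:
    class_id: int
    confidence: float


def decode(detections: List[Detection], threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Keep detections at or above the threshold, most confident first."""
    kept = [d for d in detections if d.confidence >= threshold]
    kept.sort(key=lambda d: d.confidence, reverse=True)
    return [(CLASS_NAMES[d.class_id], d.confidence) for d in kept]


raw = [Detection(0, 0.9), Detection(1, 0.3), Detection(25, 0.6)]
for letter, confidence in decode(raw):
    print(f"{letter}: {confidence:.2f}")
```

Raising the threshold trades missed signs for fewer false positives, so it is worth tuning against the test images rather than fixing it up front.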
- Documentation on Step Operators
- More on Step Operators
- Documentation on how to create a GCP service account
- ZenML CLI documentation