Authors: Elizaveta Isianova, Lukas Malek, Lukas Rasocha, Vratislav Besta, Weihang Li
This project presents an attempt to predict, autonomously, whether a tire has been successfully assembled onto a wheel. Currently, the method uses purely image-based classification to decide whether the tire was assembled correctly. To enrich this, we attempt to use an LSTM model to analyze inputs from the torque and force sensors of the assembling robot, enabling the system to determine the optimal conditions for tire assembly without human intervention. The goal is to increase efficiency and accuracy in tire assembly processes, reducing reliance on manual labor and minimizing errors.
The project is based on a real use case situated within the Testbed for Industry 4.0 at CTU Prague. The current quality-control methodology uses CNNs for the visual inspection of tire assemblies.
Data are measured and labelled by the lab. The dataset is generated through robotic cell runs; every sample is then labeled as true (successful assembly) or false (unsuccessful assembly).
This project aims to introduce a new method for enhancing the quality control process in car wheel assembly executed by a delta robot.
Departing from the picture-based assessment using CNNs, our approach aims to evaluate the correctness of the assembly based on the data from a force-torque sensor. This transforms the dataset into a collection of time series, capturing recorded sensor data from individual tire assemblies. Each element of a series is a 6D vector combining a 3-DOF force vector and a 3-DOF torque vector.
The chosen methodology is an implementation of Long Short-Term Memory Recurrent Neural Networks (LSTM RNNs) in PyTorch, since the data are time series. There is no existing baseline solution for this problem; the project can therefore be evaluated against the existing CNN approach.
Due to the small dataset, limited by time constraints and the amount of labelled data, we do not expect to obtain a well-performing model, but rather want to present a method for further development.
As third-party frameworks we are going to use PyTorch Lightning, possibly together with the PyTorch Forecasting package built on top of Lightning.
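To make the methodology concrete, here is a minimal sketch of such a sequence classifier in plain PyTorch. Layer sizes, sequence length and the single-logit head are illustrative assumptions, not the project's actual architecture:

```python
import torch
import torch.nn as nn

class AssemblyLSTM(nn.Module):
    """Binary classifier over 6D force/torque sequences (sketch only)."""

    def __init__(self, input_size=6, hidden_size=32, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        # x: (batch, seq_len, 6) - one force/torque reading per timestep
        _, (h_n, _) = self.lstm(x)      # h_n: (num_layers, batch, hidden)
        return self.head(h_n[-1])       # one logit per sequence

model = AssemblyLSTM()
batch = torch.randn(4, 50, 6)           # 4 assemblies, 50 timesteps each
logits = model(batch)
print(tuple(logits.shape))              # (4, 1)
```

A sigmoid over the logit then yields the probability of a successful assembly.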
git clone https://github.com/malek-luky/Automatic-Wheel-Assembly-Detection.git
cd Automatic-Wheel-Assembly-Detection
make conda
This will build an image of our project and run it in a container. The container includes all the dependencies, data and code needed to run the project. We have three different dockerfiles:
- conda_setup: for debugging purposes; sets up the environment and waits for the user to run it in interactive mode
- train_model: downloads dependencies, trains the model and sends it to Weights & Biases (wandb)
- deploy_model: downloads dependencies and the model from wandb and waits for user input to make predictions
The following build and run steps are written for train_model only, but they can easily be adapted to any dockerfile.
git clone https://github.com/malek-luky/Automatic-Wheel-Assembly-Detection.git
cd Automatic-Wheel-Assembly-Detection
<uncomment lines 21 and 22 inside dockerfiles/train_model.dockerfile>
docker build -f dockerfiles/train_model.dockerfile . -t trainer:latest
docker run --name trainer -e WANDB_API_KEY=<WANDB_API_KEY> trainer:latest
There is an error while loading the data from the bucket. Unfortunately, there is no workaround at this moment.
make train_model
docker run --name trainer -e WANDB_API_KEY=<WANDB_API_KEY> europe-west1-docker.pkg.dev/wheel-assembly-detection/wheel-assembly-detection-images/train_model:latest
- Open Compute Engine
- Create a name
- Region: europe-west1 (Belgium)
- Zone: europe-west1-b
- Machine configuration: Compute-optimized
- Series: C2D
- Machine Type: c2d-standard-4 (must have at least 16 GB RAM)
- Boot disk: 20 GB
- Container image: <ADDRESS-OF-IMAGE-IN-ARTIFACT-REGISTRY> (click Deploy Container)
- Restart policy: never
- The rest is default
If the `gcloud` command is unknown, follow the installation steps for your OS. Otherwise, there are three dockerfiles that can be deployed to a Virtual Machine in GCP (suffix `_vm` added to the dockerfile name). All of them create the same instance, but with a specific container. The instance name follows the dockerfile name (conda_setup/train_model/deploy_model).
make train_model_vm
gcloud compute ssh --zone "europe-west1-b" "train-model" --project "wheel-assembly-detection"
- Can be done via SSH inside the browser in Compute Engine
- Or locally using a command similar to this one:
gcloud compute ssh --zone "europe-west1-b" "<name_of_instance>" --project "wheel-assembly-detection"
(the instances can be listed using `gcloud compute instances list`)
- `docker ps`: shows the containers running on the machine
- `docker logs <CONTAINER_ID>`: wait until the image is successfully pulled
- `docker ps`: the pulled container has a new ID
- `docker exec -it <CONTAINER_ID> /bin/bash`: starts the container in an interactive shell (only for conda_wheel_assembly_detection; the others only train the model, upload it and exit; setting the restart policy to "never" should fix this issue)
It re-creates the `filtered`, `normalized` and `processed` folders. The processed data is stored in `data/processed/dataset_concatenated.csv` and is used for training.
python src/data/make_dataset.py
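The normalization step of this pipeline can be pictured roughly as follows. This min-max scaling over hypothetical `Fx`/`Tz` channels is only an assumed illustration; the actual scripts live in `src/data/` and may normalize differently:

```python
import pandas as pd

def normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Min-max scale each force/torque channel to [0, 1] (sketch only)."""
    return (df - df.min()) / (df.max() - df.min())

# Toy measurement with one force and one torque channel
raw = pd.DataFrame({"Fx": [0.0, 5.0, 10.0], "Tz": [-1.0, 0.0, 1.0]})
norm = normalize(raw)
print(norm["Fx"].tolist())   # [0.0, 0.5, 1.0]
```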
python src/models/train_model.py
python src/models/train_model.py --wandb_on
conda remove --name DTU_ML_Ops --all
This repository is configured for deployment using Google Cloud. The images in this repository are re-built and deployed automatically using GitHub Actions and stored in Google Artifact Registry on every push to the `main` branch.
We also automatically re-train the model using Vertex AI, store it in Weights & Biases model registry and deploy it using Google Cloud Run.
With access to GCP you can simply make your changes and merge them into main. When the merge is done, GitHub Actions will automatically train and deploy the model. We have 4 workflows in total:
- build_conda: builds the image and stores it in GCP
- build_train: runs the built image on Vertex AI to train the model and sends it to wandb
- build_deploy: deploys the image to Cloud Run to handle user requests and serve predictions via FastAPI
- pytests: runs the data and model pytests
The model is deployed using Google Cloud Run. You can make a prediction using the following command:
curl -X 'POST' \
'https://deployed-model-service-t2tcujqlqq-ew.a.run.app/predict' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"sequence": [
[0.1, 0.2, 0.3, 0.2, 0.3, 0.4, 0.2, 0.1],
[0.2, 0.3, 0.4, 0.3, 0.4, 0.5, 0.3, 0.2],
[0.1, 0.2, 0.3, 0.2, 0.3, 0.4, 0.2, 0.1],
[0.1, 0.2, 0.3, 0.2, 0.3, 0.4, 0.2, 0.1],
[0.1, 0.2, 0.3, 0.2, 0.3, 0.4, 0.2, 0.1],
[0.1, 0.2, 0.3, 0.2, 0.3, 0.4, 0.2, 0.1],
[0.1, 0.2, 0.3, 0.2, 0.3, 0.4, 0.2, 0.1],
[0.1, 0.2, 0.3, 0.2, 0.3, 0.4, 0.2, 0.1],
[0.1, 0.2, 0.3, 0.2, 0.3, 0.4, 0.2, 0.1],
[0.2, 0.3, 0.4, 0.3, 0.4, 0.5, 0.3, 0.2]
]
}'
Our model can also be deployed locally. The guidelines for running a local server and making predictions are here
Contributions are always welcome! If you have any ideas or suggestions for the project, please create an issue or submit a pull request. Please follow these conventions for commit messages.
- Docker: "PC Setup" inside the docker file
- Conda: Package manager
- GCP
- Cloud Storage: Stores data for dvc pull
- Artifact Registry: Stores built docker images (which can be run as containers)
- Compute Engine: Enables creating virtual machines
- Functions / Run: Deployment
- Vertex AI: includes virtual machines, training of AI models ("abstraction above VM...")
- OmegaConf: Handle the config data for train_model.py
- CodeCov: Creates the coverage report and submit it as a comments to the pull request
- CookieCutter: Template used for generating the code structure
- DVC: Data versioning tool, similar to GitHub but for data
- GitHub: Versioning tool for written code
- GitHub Actions: Run pytest, Codecov and upload built docker images to GCP
- Pytest: Runs some tests to check whether the code is working
- CodeCov: Tool for uploading coverage report from pytest as a comment to pull requests
- Weights & Biases: wandb, used for storing and tracking the trained model
- PyTorch Lightning: Framework for training our LSTM model and storing default config values
- Forecasting: Abstraction above PyTorch Lightning for working with time-series data
- Torchserve: Used for local deployment
- FastAPI: Creates an API for our model and wraps it into a container so it can be accessed anywhere
- Slack/SMS: Handles the alerts; Slack for the deployed model, SMS for a server cold-run
The directory structure of the project looks like this:
├── .dvc/                  <- Cache and config for data version control
├── .github/workflows      <- Includes the steps for GitHub Actions
│   ├── build_conda        <- Conda dockerfile: Build conda image and push it to GCP
│   ├── build_deploy       <- Deploy dockerfile: build, push and deploy
│   ├── build_train        <- Train dockerfile: Build train image and push it to GCP
│   └── pytests            <- Runs the data and model pytests
├── data                   <- Run dvc pull to see this folder
│   ├── filtered           <- Separated raw data, one file is one measurement
│   ├── normalized         <- Normalized filtered data
│   ├── processed          <- Torch tensors from normalized data and concatenated csv
│   └── raw                <- Original measurements
├── deployment             <- Other deployment options such as Cloud Function and torchserve
│   ├── cloud_functions    <- File that can be run as a Cloud Function on GCP (WIP)
│   └── torchserve/        <- All data needed for local deployment
├── dockerfiles            <- Storage of our dockerfiles
│   ├── conda_wheel        <- Sets up the machine and opens an interactive environment
│   ├── train_wheel        <- Runs train_model.py, which uploads the new model to wandb
│   ├── serve_model        <- Uses FastAPI; as the only dockerfile it also deploys the model
│   └── README             <- Notes and a few commands regarding the dockerfile struggles
├── docs                   <- Documentation folder
│   ├── index.md           <- Homepage for your documentation
│   ├── mkdocs.yml         <- Configuration file for mkdocs
│   └── source/            <- Source directory for documentation files
├── reports                <- Generated analysis as HTML, PDF, LaTeX, etc.
│   ├── figures/           <- Generated graphics and figures to be used in reporting
│   └── README             <- Exam questions and project work progress
├── src                    <- Source code
│   ├── data               <- Scripts to download or generate data
│   │   ├── filter         <- Separates the measurement into csv files
│   │   ├── make_dataset   <- Runs filter->normalize->process as one script
│   │   ├── normalize      <- Normalizes the filtered data
│   │   ├── process        <- Turns normalized data into torch files and a concatenated csv
│   │   ├── README         <- Includes more details about the scripts
│   │   └── utils          <- File with custom functions
│   ├── helper             <- Folder with custom functions
│   │   ├── convert_reqs   <- Function that mirrors the requirements to environment.yml
│   │   ├── gcp_utils      <- Function that returns wandb_api on GCP cloud via secret
│   │   └── logger         <- Creates logs in the logs/ folder for easier debugging
│   └── models             <- Model implementations, training script and prediction script
│       ├── arch_model     <- Old model class definition and function calls
│       ├── arch_train_m   <- Old model using Forecasting and TemporalFusionTransformer
│       ├── model          <- New lightweight model class definition and function calls
│       ├── predict_model  <- Predicts the result from unseen data
│       └── train_model    <- New lightweight model using Lightning's LSTM
├── tests                  <- Contains all pytests for the GitHub workflow
│   ├── test_data          <- Checks if data exist and the data shape
│   └── test_model         <- Checks if the trained model is correct
├── .gitignore             <- Data that are not pushed to GitHub
├── .pre-commit-config     <- Formats the code following pep8 and mirrors requirements.txt
├── LICENSE                <- Open-source license info
├── Makefile               <- Makefile with convenience commands like `make data` or `make train`
├── README.md              <- The top-level README which you are reading right now
├── data.dvc               <- Links the newest data from GCP Cloud Storage
├── environment.yml        <- Requirements for new conda env, also used inside docker
├── pyproject.toml         <- Project (python) configuration file
└── requirements.txt       <- The pip requirements file for reproducing the environment
Created using mlops_template, a cookiecutter template for getting started with Machine Learning Operations (MLOps).