This project demonstrates a modular machine learning workflow that provides a structured approach to building, training, and deploying ML models.
The aim of the project is to develop an ML system that predicts customer churn, so that at-risk customers can be targeted for retention.
The provided dataset can be found at: Telco Customer Churn
In this repository: data-source/Telco-Customer-Churn.csv
About the Telco Customer Churn Dataset
Each row represents a customer; each column contains a customer attribute, as described in the dataset's Metadata column.
The raw data contains 7043 rows (customers) and 21 columns (features).
The “Churn” column is our target.
- python3.8 (or another Python 3 version)
- python3.8-venv (the venv package for that Python version)
To run this project, first create a virtual environment:
python3 -m venv venv
Then to activate the Python virtual environment, run the following command:
source venv/bin/activate
Then set PYTHONPATH to the project root so Python can resolve the project's modules:
export PYTHONPATH=~/path_to_directory/model-training-with-modular-workflow
Make sure to replace path_to_directory with your local path.
To install all packages and initialize the project, run the following command:
pip install -e .
This will run the setup.py file to install the project in editable mode and save its metadata. Make sure you have requirements.txt and README.md ready.
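For reference, a minimal setup.py for this kind of layout might look like the sketch below. This is an illustration only, not the repository's actual file; the package name and version are placeholders.

```python
# Hypothetical minimal setup.py; the real file in this repository may differ.
from setuptools import find_packages, setup

def read_requirements(path="requirements.txt"):
    """Collect dependencies from requirements.txt, skipping the editable-install line."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip() and line.strip() != "-e ."]

setup(
    name="model-training-with-modular-workflow",  # placeholder package name
    version="0.0.1",                              # placeholder version
    packages=find_packages(),
    install_requires=read_requirements(),
)
```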
To set up the folder structure, run folder_structure_setup.sh:
bash folder_structure_setup.sh
This will generate a folder structure necessary for modular workflow. See below:
├── data-source
├── src
│   ├── __init__.py
│   ├── components
│   │   ├── __init__.py
│   │   ├── data_ingestion.py
│   │   ├── data_transformation.py
│   │   ├── model_monitoring.py
│   │   └── model_trainer.py
│   ├── exception.py
│   ├── logger.py
│   ├── pipelines
│   │   ├── __init__.py
│   │   ├── prediction_pipeline.py
│   │   └── training_pipeline.py
│   └── utils.py
├── .gitignore
├── main.py
├── app.py
├── EDA.ipynb
├── README.md
├── requirements.txt
├── folder_structure_setup.sh
├── test-logging-integration.py
└── test-request.py
The machine learning training and development phase can be divided into four steps (a simplified pipeline sketch follows this list):
- Data Ingestion: Raw data is read from a data source (e.g., a database or data warehouse), preprocessed, and split into training, test, and validation sets.
- Data Transformation: This stage covers data exploration, data cleaning, and feature engineering. It takes the raw data from the ingestion stage and produces featurized data for model training.
- Model Training: This stage takes the featurized data from the transformation stage and trains models on it, selecting a model architecture and continuously training and tuning it. Trained models are the output of this stage.
- Model Evaluation: This stage compares the trained models, selects the best one, and prepares it for deployment.
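The training pipeline in src/pipelines/training_pipeline.py ties these stages together. The sketch below shows the general flow only; the class and method names (DataIngestion, initiate_data_ingestion, and so on) are illustrative assumptions and may not match the actual components.

```python
# Illustrative flow of the training pipeline; class and method names are
# assumptions and may differ from the real code in src/components.
from src.components.data_ingestion import DataIngestion
from src.components.data_transformation import DataTransformation
from src.components.model_trainer import ModelTrainer

def run_training_pipeline():
    # 1. Data ingestion: read the raw CSV and split it into train/test sets.
    train_path, test_path = DataIngestion().initiate_data_ingestion()

    # 2. Data transformation: clean the data and build feature arrays.
    train_arr, test_arr, _ = DataTransformation().initiate_data_transformation(
        train_path, test_path
    )

    # 3. Model training and evaluation: fit candidate models, keep the best one.
    best_score = ModelTrainer().initiate_model_trainer(train_arr, test_arr)
    print(f"Best model score: {best_score}")

if __name__ == "__main__":
    run_training_pipeline()
```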
DVC is used for tracking data files and ensuring version control for datasets.
dvc init
dvc add artifacts/data_ingestion/raw.csv
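Note that dvc add writes a small .dvc pointer file next to the data; commit that pointer to git instead of the raw CSV. If you need to read a DVC-tracked file from Python, dvc.api can open it, as in the minimal sketch below (it assumes the file was added with the command above and that the script runs from inside the repository).

```python
# Minimal sketch: read a DVC-tracked file from Python with dvc.api.
import dvc.api
import pandas as pd

# Assumes artifacts/data_ingestion/raw.csv was tracked via `dvc add`.
with dvc.api.open("artifacts/data_ingestion/raw.csv", mode="r") as f:
    df = pd.read_csv(f)

print(df.shape)  # the full Telco dataset is 7043 rows x 21 columns
```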
The feature store is managed using Feast, allowing storage and retrieval of features.
cd feature_repo
feast ui
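Features can also be retrieved programmatically with the Feast SDK (assuming the definitions in feature_repo have been applied with feast apply). The snippet below is a sketch only: the feature view name, feature names, and join key are placeholders, so check feature_repo for the actual definitions.

```python
# Sketch: retrieving historical features from the Feast repo in feature_repo/.
# The feature view name ("customer_features"), feature names, and the
# "customerID" join key are placeholders for this project's real definitions.
from datetime import datetime
import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path="feature_repo")

# Entity dataframe: which customers, and as of what time.
entity_df = pd.DataFrame(
    {
        "customerID": ["7590-VHVEG"],
        "event_timestamp": [datetime.utcnow()],
    }
)

training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["customer_features:tenure", "customer_features:MonthlyCharges"],
).to_df()
print(training_df.head())
```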
MLflow is used to track experiments and visualize metrics.
mlflow ui
Execute the training pipeline and track the experiment metrics on MLflow.
Ensure the MLflow server is running before executing:
python3 src/pipelines/training_pipeline.py
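Inside the pipeline, experiment tracking typically looks like the sketch below; the experiment name, parameters, and metric values are illustrative, not necessarily what training_pipeline.py logs.

```python
# Sketch of MLflow experiment tracking; names and values are illustrative.
import mlflow

mlflow.set_tracking_uri("http://127.0.0.1:5000")  # default address of `mlflow ui`
mlflow.set_experiment("telco-churn")

with mlflow.start_run():
    mlflow.log_param("model", "logistic_regression")
    mlflow.log_metric("accuracy", 0.80)
    # mlflow.sklearn.log_model(model, "model")  # optionally persist the estimator
```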
A Flask application is provided for serving the model.
python3 app.py
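Roughly, app.py exposes a /predict route that wraps the prediction pipeline and returns a JSON response like the one shown further down. The sketch below is an assumption about its structure (the PredictionPipeline class name and the port are placeholders), not the actual file.

```python
# Sketch of a minimal Flask serving app; the real app.py may differ.
# PredictionPipeline and the port number are assumptions.
from flask import Flask, jsonify, request
from src.pipelines.prediction_pipeline import PredictionPipeline

app = Flask(__name__)
pipeline = PredictionPipeline()

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()                  # customer attributes as JSON
    prediction = int(pipeline.predict(payload))   # 0 = no churn, 1 = churn
    return jsonify({
        "status": "success",
        "prediction": prediction,
        "churn_category": "Yes" if prediction == 1 else "No",
    })

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)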
Test the API using the provided test file.
python3 test-request.py
Status Code: 200
Response: {'churn_category': 'No', 'prediction': 0, 'status': 'success'}
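test-request.py is essentially a small client like the sketch below; the feature fields shown are a subset of the dataset's columns, and the exact payload expected depends on the prediction pipeline.

```python
# Sketch of a test client; the payload fields and endpoint URL are assumptions.
import requests

sample_customer = {
    "gender": "Female",
    "SeniorCitizen": 0,
    "tenure": 1,
    "Contract": "Month-to-month",
    "MonthlyCharges": 29.85,
    # ... remaining columns from the Telco dataset
}

response = requests.post("http://localhost:8000/predict", json=sample_customer)
print("Status Code:", response.status_code)
print("Response:", response.json())
```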
Run the FastAPI app in dev mode:
- First create and activate a separate virtual environment, v-fast:
python3 -m venv v-fast
source v-fast/bin/activate
- Install the required packages:
pip install -r fast-requirements.txt
- Run the app in dev mode:
fastapi dev main.py
You will see the app at http://localhost:8000
The API endpoint created for ML model inference is http://localhost:8000/predict
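For reference, a FastAPI version of the inference endpoint generally looks like the sketch below; the request model fields and the pipeline name are assumptions, so consult main.py for the real implementation.

```python
# Sketch of a FastAPI inference endpoint; fields and pipeline name are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel
from src.pipelines.prediction_pipeline import PredictionPipeline

app = FastAPI()
pipeline = PredictionPipeline()

class Customer(BaseModel):
    gender: str
    SeniorCitizen: int
    tenure: int
    Contract: str
    MonthlyCharges: float
    # ... remaining Telco columns

@app.post("/predict")
def predict(customer: Customer):
    prediction = int(pipeline.predict(customer.model_dump()))  # Pydantic v2 assumed
    return {
        "status": "success",
        "prediction": prediction,
        "churn_category": "Yes" if prediction == 1 else "No",
    }
```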
Use Docker to build an image to deploy:
docker build -t flask-ml:0 .
Run the container in detached mode to deploy:
docker run -d -p 8000:8000 flask-ml:0