The goal of this project is to analyze a real machine monitoring dataset and create a machine learning model to predict when the machine being monitored requires maintenance.
This project uses the data provided for the 2010 PHM Society Conference Data Challenge. The challenge focused on RUL (remaining useful life) estimation for a high-speed CNC (computer numerical control) milling machine cutter using measurements from:
- Dynamometers: Force readings on the pieces being cut.
- Accelerometers: Vibration data during cutting operations.
- Acoustic emission sensors: High-frequency energy emissions linked to tool wear.
- The data set is publicly available on Kaggle: PHM data challenge 2010
- Licensed under CC0-1.0.
Most of the work was done in Jupyter notebooks, which combine written analysis, code, diagrams, and data visualizations.
Notebook | Description |
---|---|
1.0-exploratory-data-analysis.ipynb | Initial data exploration and visualization of signal and wear data. |
2.0-feature-engineering.ipynb | Feature extraction from signals, including frequency-domain features. |
3.0-linear-regression.ipynb | Training and evaluation of a Lasso regression model. |
4.0-CNN-regression.ipynb | Training and evaluation of a 1-dimensional CNN model for regression. |
A dashboard was built with Streamlit to visualize key insights from the data analysis and model predictions. It was developed in a separate GitHub repo but you can explore it directly at:
- Download the data from the source, then save and unzip it in a directory called `/data/raw/`.
- Install the dependencies in `requirements.txt`, e.g. via pip by running: `pip install -r requirements.txt`
- Run the notebooks in the editor of your choice.
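
Before running the notebooks, it can help to confirm that the data landed where it is expected. The sketch below is a hypothetical helper, not part of the project: it assumes the raw directory is `data/raw` relative to the project root and that the unzipped files are CSVs, so adjust `raw_dir` and `pattern` if your layout differs.

```python
from pathlib import Path


def find_raw_files(raw_dir="data/raw", pattern="*.csv"):
    """Return sorted paths under raw_dir that match pattern.

    Raises FileNotFoundError if raw_dir does not exist, which usually means
    the dataset has not been downloaded or was unzipped somewhere else.
    """
    root = Path(raw_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Expected the unzipped dataset in {root.resolve()}")
    # rglob searches subdirectories too, in case the archive unpacks into nested folders.
    return sorted(root.rglob(pattern))
```

Calling `len(find_raw_files())` from the project root gives a quick count of the files found; an empty list suggests the archive was unzipped into a different directory.
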
Contributors: