Build data analysis and machine learning skills with the 'Industrial Copper Modeling' project: tackle complex sales data challenges, use regression models to predict selling prices, and classify leads for targeted customer solutions.
- Gain a deep understanding of dataset variables and types.
- Handle missing data with appropriate strategies.
- Prepare categorical features through encoding and data type conversion.
- Address skewness and ensure data balance.
- Identify and manage outliers.
- Resolve date discrepancies for data integrity.
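
The preprocessing steps above could look roughly like the sketch below. It is a minimal illustration on toy data, not the project's actual code: the column names (`quantity_tons`, `status`, `item_date`, `delivery_date`) and the choice of median/mode imputation are assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions, not the project's schema.
df = pd.DataFrame({
    "quantity_tons": ["12.5", "e", "30.0", None],   # numeric values stored as text
    "status": ["Won", "Lost", "Won", None],
    "item_date": ["20210401", "20210315", "20210230", "20210110"],  # third date is invalid
    "delivery_date": ["20210501", "20210301", "20210401", "20210201"],
})

# Data type conversion: unparseable entries become NaN / NaT instead of raising.
df["quantity_tons"] = pd.to_numeric(df["quantity_tons"], errors="coerce")
for col in ["item_date", "delivery_date"]:
    df[col] = pd.to_datetime(df[col], format="%Y%m%d", errors="coerce")

# Missing data: median for the numeric column, most frequent value for the categorical one.
df["quantity_tons"] = df["quantity_tons"].fillna(df["quantity_tons"].median())
df["status"] = df["status"].fillna(df["status"].mode()[0])

# Date discrepancy: flag rows where delivery is recorded before the order date.
df["date_mismatch"] = df["delivery_date"] < df["item_date"]

# Encoding: map the categorical status to integer codes.
df["status_code"] = df["status"].map({"Lost": 0, "Won": 1})

print(df)
```

Flagging mismatched dates rather than dropping them keeps the decision (swap, drop, or impute) visible to whoever reviews the data.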
- Visualize and correct skewness.
- Identify and rectify outliers.
- Feature improvement and creation for more effective modeling.
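
As a rough illustration of the skewness and outlier steps, the sketch below applies a log transform and then clips values with the interquartile range (IQR) rule. Both techniques are assumptions chosen for the example, since the list above does not name the methods used, and the `selling_price` series is synthetic.

```python
import numpy as np
import pandas as pd

# Synthetic right-skewed feature standing in for a real column (hypothetical name).
rng = np.random.default_rng(0)
values = pd.Series(rng.lognormal(mean=3.0, sigma=1.0, size=1000), name="selling_price")

# Correct skewness: log1p compresses the long right tail.
skew_before = values.skew()
log_values = np.log1p(values)
skew_after = log_values.skew()

# Manage outliers: clip to the usual 1.5 * IQR fences instead of dropping rows.
q1, q3 = log_values.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
clipped = log_values.clip(lower, upper)

print(f"skew before: {skew_before:.2f}, after log1p: {skew_after:.2f}")
print(f"values clipped: {(log_values != clipped).sum()}")
```

Clipping preserves the row count, which matters when the same rows feed both the classification and regression models.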
- Success and Failure Classification: Focusing on 'Won' and 'Lost' status.
- Algorithm Assessment: Evaluating algorithms for classification.
- Algorithm Selection: Choosing the Random Forest Classifier.
- Hyperparameter Tuning: Fine-tuning with GridSearchCV and cross-validation.
- Model Accuracy and Metrics: Assessing performance and metrics.
- Model Persistence: Saving the model for future use.
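
The classification workflow above can be sketched as follows. This is a minimal example under stated assumptions: the data comes from `make_classification` rather than the copper dataset, and the parameter grid is a small illustrative one, not the grid used in the project.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary data standing in for the 'Won' (1) / 'Lost' (0) target.
X, y = make_classification(n_samples=300, n_features=6, n_informative=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Hyperparameter tuning: exhaustive search over a small, illustrative grid with 3-fold CV.
param_grid = {"n_estimators": [25, 50], "max_depth": [4, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid, cv=3, scoring="accuracy"
)
search.fit(X_train, y_train)

# Metrics: evaluate the refit best estimator on the held-out test split.
best_model = search.best_estimator_
accuracy = accuracy_score(y_test, best_model.predict(X_test))
print("best params:", search.best_params_)
print(f"test accuracy: {accuracy:.3f}")

# Persistence: serialize and restore in memory; pickle.dump to a file works the same way.
payload = pickle.dumps(best_model)
restored = pickle.loads(payload)
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.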
- Algorithm Assessment: Identifying algorithms for regression.
- Algorithm Selection: Opting for the Random Forest Regressor.
- Hyperparameter Tuning: Fine-tuning with GridSearchCV and cross-validation.
- Model Accuracy and Metrics: Evaluating regression model performance.
- Model Persistence: Saving the regression model for future applications.
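
A comparable sketch for the regression workflow is shown below. It assumes synthetic data from `make_regression`, an illustrative parameter grid, and R² plus mean absolute error as the metrics; the project's actual features, grid, and metrics may differ.

```python
import pickle

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic continuous target standing in for the selling price.
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Hyperparameter tuning: small, illustrative grid scored by R^2 with 3-fold CV.
param_grid = {"n_estimators": [25, 50], "max_depth": [6, None]}
search = GridSearchCV(RandomForestRegressor(random_state=42), param_grid, cv=3, scoring="r2")
search.fit(X_train, y_train)

# Metrics: evaluate the refit best estimator on the held-out test split.
best_model = search.best_estimator_
predictions = best_model.predict(X_test)
r2 = r2_score(y_test, predictions)
mae = mean_absolute_error(y_test, predictions)
print("best params:", search.best_params_)
print(f"R^2: {r2:.3f}, MAE: {mae:.2f}")

# Persistence: in-memory round trip; pickle.dump to a file works the same way.
restored = pickle.loads(pickle.dumps(best_model))
```

Reporting an error metric such as MAE alongside R² makes the result easier to interpret in the target's own units.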
https://github.com/praveendecode/Industrial_Copper_Modeling
pip install -r requirements.txt
streamlit run app.py
http://localhost:8501
- Python
- NumPy
- Pandas
- scikit-learn
- Matplotlib
- Seaborn
- Pickle
- Streamlit
- Docker
- Classification: Achieved 98.999% accuracy with the Extra Trees Classifier.
- Regression: Achieved 98.3% accuracy with the Extra Trees Regressor.