Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year. CVDs are a group of disorders of the heart and blood vessels and include coronary heart disease, cerebrovascular disease, rheumatic heart disease and other conditions. Four out of 5 CVD deaths are due to heart attacks and strokes, and one third of these deaths occur prematurely in people under 70 years of age. Most cardiovascular diseases can be prevented by using population-wide strategies to address behavioural risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol.
Individuals at risk of CVD may demonstrate raised blood pressure, glucose, and lipids as well as overweight and obesity. These can all be easily measured in primary care facilities. Identifying those at highest risk of CVDs and ensuring they receive appropriate treatment can prevent premature deaths. Access to essential noncommunicable disease medicines and basic health technologies in all primary health care facilities is essential to ensure that those in need receive treatment and counselling.

The dataset contains the medical records of 299 patients who had heart failure, collected during their follow-up period, where each patient profile has 13 clinical features. https://archive.ics.uci.edu/ml/datasets/Heart+failure+clinical+records
- Data Analysis
  - 5 columns contain outliers: creatinine_phosphokinase, ejection_fraction, platelets, serum_creatinine and serum_sodium.
  - The target class is imbalanced (I used resampling techniques to add more copies of the minority class).
- Performance Evaluation
  - The dataset is split into 80% for the training set and 20% for the validation set.
- Training and Validation
  - After training and comparing different algorithms, ensemble models achieved better accuracy scores than the linear and nonlinear models.
  - Gradient Boosting Classifier (93% accuracy score)
- Fine Tuning
  - Using {'learning_rate': 0.1, 'max_depth': 9, 'n_estimators': 1000, 'subsample': 0.7} for the Gradient Boosting Classifier improved the accuracy by 1%.
- Performance Results
  - Validation Score: 97%
  - ROC_AUC Score: 96.9%
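
The data preparation steps listed above (flagging outliers, the 80/20 split, and adding copies of the minority class) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code from the Kaggle kernel. The 1.5 × IQR outlier rule, the helper names (`iqr_outliers`, `split_80_20`, `oversample_minority`), the toy data, and the choice to oversample only the training set are all assumptions made for the example.

```python
import random
import statistics
from collections import Counter


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def split_80_20(rows, seed=0):
    """Shuffle a copy of the rows and return (train, validation) at 80/20."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def oversample_minority(rows, label_of, seed=0):
    """Append random copies of minority-class rows until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(label_of(r) for r in rows)
    target = max(counts.values())
    balanced = list(rows)
    for label, count in counts.items():
        pool = [r for r in rows if label_of(r) == label]
        balanced.extend(rng.choice(pool) for _ in range(target - count))
    return balanced


# Toy data, NOT the real dataset: (feature value, class label) pairs,
# 70 rows of class 0 and 30 rows of class 1, with one extreme value (9.4).
rows = (
    [(1.0 + 0.01 * i, 0) for i in range(70)]
    + [(1.2 + 0.01 * i, 1) for i in range(29)]
    + [(9.4, 1)]
)

outliers = iqr_outliers([value for value, _ in rows])
train, validation = split_80_20(rows)
# Assumption: only the training set is resampled, so the validation set
# keeps the original class ratio and contains no duplicated training rows.
balanced_train = oversample_minority(train, label_of=lambda r: r[1])

print("outliers:", outliers)
print("train / validation sizes:", len(train), len(validation))
print("class counts before:", Counter(label for _, label in train))
print("class counts after: ", Counter(label for _, label in balanced_train))
```

Resampling after the split is a design choice of this sketch: if copies of minority rows were created before splitting, the same patient record could end up in both the training and validation sets and inflate the validation score.
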
Live demo: https://heartfailure-predictor.herokuapp.com/
Kaggle Kernel: https://www.kaggle.com/gabbygab/patients-survival-prediction-web-application