
cuizy1017/MOOC-Dropout-Prediction

 
 


KDD Cup 2015

This project addresses MOOC attrition (student dropout) by applying machine learning techniques to the KDD Cup 2015 dataset. We performed feature engineering on the dataset to generate categorical, time-based, and course-completion-based features while guarding against overfitting. We explored sampling methods to address the class imbalance in the original data. We trained KNN, Logistic Regression, Neural Network, Random Forest, Gradient Boosting, and XGBoost models, then used the best-performing ones as estimators in a weighted voting classifier. The final model reports an accuracy of 87.74% and an AUC of 87.97%.
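To illustrate the weighted voting step, here is a minimal dependency-free sketch of weighted soft voting. It is not the code from the notebooks: the function name `weighted_soft_vote`, the weights, the threshold, and the probabilities are all hypothetical, and the project may have used hard voting or a library implementation instead (for example, scikit-learn's `VotingClassifier` with `voting="soft"` and `weights`).

```python
from typing import List, Sequence


def weighted_soft_vote(
    probabilities: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> List[int]:
    """Combine per-model dropout probabilities into one label per sample.

    probabilities[m][i] is model m's predicted probability that
    sample i drops out; weights[m] is that model's voting weight.
    """
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one weight per model")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(probabilities[0])
    if any(len(p) != n_samples for p in probabilities):
        raise ValueError("every model must score the same samples")

    labels = []
    for i in range(n_samples):
        # Weighted average of the models' probabilities for sample i.
        avg = sum(w * p[i] for w, p in zip(weights, probabilities)) / total_weight
        labels.append(1 if avg >= threshold else 0)
    return labels


# Hypothetical probabilities from three models for four enrollments.
model_probs = [
    [0.9, 0.2, 0.6, 0.4],
    [0.8, 0.4, 0.3, 0.5],
    [0.7, 0.1, 0.7, 0.2],
]
# Hypothetical weights: the first model counts twice as much as the others.
model_weights = [2.0, 1.0, 1.0]

predictions = weighted_soft_vote(model_probs, model_weights)
print(predictions)
```

Giving a stronger model a larger weight lets it outvote weaker ones on borderline samples, which is the usual motivation for weighting the ensemble rather than averaging uniformly.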

The data folder contains the individual feature-engineered CSV files, and the notebooks folder contains the Jupyter notebooks for the experiments we conducted.

The submissions folder contains the final submission for the course.
