Course : [CS7.403] Statistical Methods in AI
Instructor : Dr. Anoop Namboodiri
Team Name : Skynet
Participants :
- Megha Bose
- Samyak Jain
- Nirmal Manoj
- Varul Srivastava
The paper shows that combining its method of over-sampling the minority (abnormal) class, SMOTE, with under-sampling the majority (normal) class achieves better classifier performance (in ROC space) than under-sampling the majority class alone.
It also shows that this combination achieves better classifier performance (in ROC space) than varying the loss ratios in Ripper or the class priors in Naive Bayes.
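
The combination described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the project's implementation: the function names `smote` and `undersample`, the parameters `n_synthetic`, `k`, `n_keep` and `seed`, and the toy data are all assumptions made for the example. Synthetic minority points are created by interpolating between a minority sample and one of its `k` nearest minority-class neighbours, and the majority class is reduced by random sampling without replacement.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for i in range(n_synthetic):
        idx = i % len(minority)  # cycle through the minority samples
        base = minority[idx]
        others = [p for j, p in enumerate(minority) if j != idx]
        # k nearest neighbours of `base` inside the minority class
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def undersample(majority, n_keep, seed=0):
    """Randomly keep n_keep majority samples (without replacement)."""
    rng = random.Random(seed)
    return rng.sample(majority, n_keep)


# Toy data (hypothetical): 4 minority points, 20 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority = [(float(i), float(i % 7)) for i in range(5, 25)]

new_points = smote(minority, n_synthetic=8, k=2)
kept = undersample(majority, n_keep=12)

# Balanced training set: 4 original + 8 synthetic minority, 12 majority.
balanced_minority = minority + new_points
print(len(balanced_minority), len(kept))
```

Because every synthetic point lies on a line segment between two existing minority samples, the new points stay inside the region already occupied by the minority class instead of duplicating samples exactly.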
Fig. 1: SMOTE Approach

DATASET :
We plan to use the Pima Indians Diabetes Database, which contains 768 samples in 2 classes. We chose it because it is a medical dataset and therefore a realistic scenario in which such class imbalances occur: only 268 of the samples belong to the positive class.
Link-to-Dataset
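
To make the imbalance concrete, the short calculation below derives the class split from the two figures quoted above (768 samples, 268 positive). The variable names are illustrative, and the 1:1 target is only an assumption used to show how many synthetic minority samples over-sampling alone would have to add.

```python
total_samples = 768     # from the dataset description
positive_samples = 268  # minority (positive) class
negative_samples = total_samples - positive_samples

positive_fraction = positive_samples / total_samples
imbalance_ratio = negative_samples / positive_samples

# Synthetic minority samples needed for a 1:1 balance (assumed target)
# if only over-sampling were used.
synthetic_needed = negative_samples - positive_samples

print(f"negative samples: {negative_samples}")
print(f"positive fraction: {positive_fraction:.3f}")
print(f"majority:minority ratio: {imbalance_ratio:.2f}")
print(f"synthetic samples for 1:1 balance: {synthetic_needed}")
```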
Timeline | Milestone |
---|---|
November 2 | Project Allocation |
November 7 | Project Proposal Finalized |
November 14 | Implementation of 4 of 8 Approaches |
November 21 | Implementation complete |
November 22 | Project report complete |
November 24 | Project Completion |