Breast Cancer Prediction using fuzzy clustering and classification
The objective is to classify each patient case as either benign (noncancerous) or malignant (cancerous).
The experimental study is based on the Wisconsin Breast Cancer database from the UC Irvine Machine Learning Repository.
The Breast Cancer database was obtained from Dr. William H. Wolberg at the University of Wisconsin Hospitals, Madison. It contains 699 instances: 458 (65.5%) benign and 241 (34.5%) malignant. Each case is described by the 9 attributes listed below and belongs to one of two classes (benign or malignant).
Attributes and domains are as follows:
- Clump Thickness: 1 – 10
- Uniformity of Cell Size: 1 – 10
- Uniformity of Cell Shape: 1 – 10
- Marginal Adhesion: 1 – 10
- Single Epithelial Cell Size: 1 – 10
- Bare Nuclei: 1 – 10
- Bland Chromatin: 1 – 10
- Normal Nucleoli: 1 – 10
- Mitoses: 1 – 10
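As a rough illustration of how records with these nine attributes might be read, the sketch below parses comma-separated rows into feature vectors and class names. The column order (sample id, nine attributes, class), the label encoding (2 for benign, 4 for malignant), the `?` marker for missing values, and the choice to drop incomplete rows are all assumptions about the repository file rather than details stated here. The sample rows are made up for illustration.

```python
import csv
import io

# Assumed column order: sample id, the nine attributes above, class label.
ATTRIBUTES = [
    "Clump Thickness",
    "Uniformity of Cell Size",
    "Uniformity of Cell Shape",
    "Marginal Adhesion",
    "Single Epithelial Cell Size",
    "Bare Nuclei",
    "Bland Chromatin",
    "Normal Nucleoli",
    "Mitoses",
]
CLASS_NAMES = {"2": "benign", "4": "malignant"}  # assumed label encoding


def parse_records(text):
    """Return (features, labels), skipping rows with missing ('?') values."""
    features, labels = [], []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != len(ATTRIBUTES) + 2:
            continue  # blank or malformed line
        values, label = row[1:-1], row[-1]
        if "?" in values:
            continue  # assumed missing-value marker; row is dropped
        numbers = [int(v) for v in values]
        if not all(1 <= n <= 10 for n in numbers):
            raise ValueError(f"attribute outside 1-10 in row {row[0]}")
        features.append(numbers)
        labels.append(CLASS_NAMES[label])
    return features, labels


# Illustrative rows in the assumed format (not real patient records).
SAMPLE = """\
1000001,5,1,1,1,2,1,3,1,1,2
1000002,8,10,10,8,7,10,9,7,1,4
1000003,8,4,5,1,2,?,7,3,1,4
"""

features, labels = parse_records(SAMPLE)
print(len(features), labels)
```

Dropping incomplete rows is only one option; imputing a value for the missing attribute would keep all instances.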
Results for each model are as follows:
- C4.5 classifier: 89.6%
- KNN classifier: 95.4%
- K-means clustering and C4.5 decision tree classifier: 95.1%
- Fuzzy K-means clustering and C4.5 decision tree classifier: 96.5%
- K-means clustering and fuzzy KNN classifier: 93.7%
- Fuzzy K-means clustering and fuzzy KNN classifier: 93.7%
- Fuzzy K-means clustering and fuzzy KNN classifier with feature selection (final model): 96.5%
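To make the two fuzzy components named above more concrete, here is a minimal pure-Python sketch of fuzzy K-means (also called fuzzy c-means) followed by a fuzzy KNN step. It is not the project's implementation: the fuzzifier `m = 2`, the random initialization, the toy three-attribute points, and especially the way the stages are connected (cluster memberships used as soft labels for the neighbours) are assumptions made for illustration. Feature selection is not shown.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(points, centers, m=2.0):
    """Membership of every point in every cluster; each row sums to 1."""
    power = 1.0 / (m - 1.0)
    rows = []
    for p in points:
        d2 = [sq_dist(p, c) for c in centers]
        if 0.0 in d2:
            # Point coincides with a center: full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            rows.append([h / sum(hits) for h in hits])
            continue
        inv = [(1.0 / d) ** power for d in d2]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def update_centers(points, u, m=2.0):
    """Each center is the mean of all points weighted by membership**m."""
    centers = []
    for i in range(len(u[0])):
        weights = [row[i] ** m for row in u]
        total = sum(weights)
        centers.append([
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(len(points[0]))
        ])
    return centers


def fuzzy_c_means(points, n_clusters=2, m=2.0, iters=100, tol=1e-6, seed=0):
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = update_centers(points, u, m)
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


def fuzzy_knn(query, points, u, k=3, m=2.0):
    """Soft class memberships of `query` from its k nearest points."""
    order = sorted(range(len(points)), key=lambda j: sq_dist(query, points[j]))
    nearest = order[:k]
    # Inverse-distance weights; the small constant avoids division by zero.
    weights = [
        1.0 / (sq_dist(query, points[j]) ** (1.0 / (m - 1.0)) + 1e-12)
        for j in nearest
    ]
    total = sum(weights)
    return [
        sum(w * u[j][c] for w, j in zip(weights, nearest)) / total
        for c in range(len(u[0]))
    ]


# Toy points with three attributes each (hypothetical, not from the dataset).
points = [[1, 1, 1], [2, 1, 1], [1, 2, 1], [9, 10, 9], [10, 9, 10], [9, 9, 10]]
centers, u = fuzzy_c_means(points)
hard = [row.index(max(row)) for row in u]  # crisp cluster per point
query_membership = fuzzy_knn([2, 2, 2], points, u)
print(hard, [round(v, 3) for v in query_membership])
```

Unlike ordinary K-means, every point keeps a degree of membership in each cluster, and the fuzzy KNN step passes those degrees on to a new case instead of taking a simple majority vote.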