Using Three Machine Learning Techniques for Predicting Breast Cancer Recurrence
Abstract
Ahmad LG*, Eshlaghy AT, Poorebrahimi A, Ebrahimi M, Razavi AR
Objective: The number and size of medical databases are increasing rapidly, but most of these data are never analyzed
to uncover the valuable knowledge hidden in them. Advanced data mining techniques can be used to discover hidden
patterns and relationships, and models developed with these techniques can help medical practitioners make sound
decisions. The present research studied the application of data mining techniques to develop predictive models for
breast cancer recurrence in patients who were followed up for two years.
Method: The patients were registered in the Iranian Center for Breast Cancer (ICBC) program from 1997 to 2008.
The dataset contained 1189 records, 22 predictor variables, and one outcome variable. We implemented three machine
learning techniques, Decision Tree (C4.5), Support Vector Machine (SVM), and Artificial Neural Network (ANN), to
develop the predictive models. The main goal of this paper is to compare the performance of these three well-known
algorithms on our data in terms of sensitivity, specificity, and accuracy.
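As an illustration of the three comparison measures named above, the following minimal sketch computes sensitivity, specificity, and accuracy from true and predicted binary labels. The function name `classification_metrics`, the coding of recurrence as 1, and the toy labels are assumptions made for this example; they are not taken from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity, and accuracy for binary labels.

    `positive` is the label treated as the event of interest
    (assumed here to be recurrence, coded as 1).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # recurrence correctly predicted
            else:
                fn += 1  # recurrence missed
        else:
            if pred == positive:
                fp += 1  # false alarm
            else:
                tn += 1  # non-recurrence correctly predicted

    total = tp + fp + tn + fn
    return {
        # Share of actual positives that the model detects.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of actual negatives that the model detects.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of all cases that are classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Toy labels invented for illustration only (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when one class is much rarer than the other, because a model can reach high accuracy while missing most cases of the rare class.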
Results and Conclusion: Our analysis shows that the accuracy of the DT, ANN, and SVM models is 0.936, 0.947, and 0.957,
respectively. The SVM classification model predicts breast cancer recurrence with the lowest error rate and highest accuracy.
The accuracy of the DT model is the lowest of the three. The results were obtained using 10-fold cross-validation to
obtain an unbiased estimate of the prediction accuracy of each model.
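The 10-fold cross-validation procedure can be sketched as follows. This is a minimal standard-library illustration, not the authors' implementation: the abstract does not state which software was used or whether the folds were stratified, so the helper names (`k_fold_indices`, `cross_validated_accuracy`), the majority-class baseline standing in for a real classifier, and the synthetic labels are all assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_validated_accuracy(X, y, fit, predict, k=10, seed=0):
    """Unweighted mean of per-fold accuracy for a fit/predict pair."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        # Train only on the k-1 folds that exclude the held-out fold.
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical stand-in classifier: always predicts the majority training label.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, X_test):
    return [model] * len(X_test)


# Synthetic data sized like the study's dataset (1189 records); values are invented.
n = 1189
X = [[i] for i in range(n)]
y = [1 if i % 5 == 0 else 0 for i in range(n)]

fold_sizes = [len(test_idx) for _, test_idx in k_fold_indices(n)]
mean_acc = cross_validated_accuracy(X, y, fit_majority, predict_majority)
print(fold_sizes, round(mean_acc, 3))
```

Each record is held out exactly once, so every prediction that contributes to the reported accuracy is made by a model that did not see that record during training.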