| Improving quality of breast cancer data through pre-processing | |
|---|---|
| DOI | |
| Creator | 1. Vatinee Sukmak 2. Jaree Thongkam |
| Title | Improving quality of breast cancer data through pre-processing |
| Publisher | Faculty of Engineering, Khon Kaen University |
| Publication Year | 2013 (B.E. 2556) |
| Journal Title | KKU Engineering Journal |
| Journal Vol. | 40 |
| Journal No. | 4 |
| Page no. | 493-504 |
| Keyword | Breast cancer data, Pre-processing, Data mining, Decision rules |
| ISSN | 0125-8273 |
| Abstract | Data mining has recently become a promising approach to medical prognosis. In the mining process, raw data commonly suffer from outliers and class imbalance, which degrade the model's performance in predicting unseen data. Thus, choosing appropriate data mining algorithms has a direct impact on the prediction model. The objective of this study is to investigate three data pre-processing techniques, namely outlier filtering, the Synthetic Minority Over-sampling TEchnique (SMOTE), and attribute selection, for improving the quality of breast cancer data from Srinagarind Hospital in Thailand. Three decision rule building techniques were employed: Decision Table with Naïve Bayes (DTNB), Repeated Incremental Pruning to Produce Error Reduction (RIPPER), and PART Decision List. The performance of the proposed approaches was evaluated through the Area Under the receiver operating characteristic Curve (AUC) of the decision rules. Experimental results show that applying suitable data pre-processing, especially outlier filtering, can significantly improve the prediction performance of decision rule models. |
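
As a rough illustration of two ideas named in the abstract, the sketch below shows SMOTE-style over-sampling (interpolating between a minority-class sample and one of its nearest minority-class neighbours) and a rank-based AUC. It is a minimal sketch and not the authors' implementation: the function names `smote` and `auc`, the parameters `k` and `seed`, and the use of Euclidean distance on numeric features are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority-class
    samples and their k nearest minority-class neighbours (SMOTE idea).
    Hypothetical helper; assumes purely numeric feature vectors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Every synthetic point lies on a segment between two real minority samples, so over-sampling stays inside the region the minority class already occupies. The pairwise AUC is quadratic in the number of samples, which is acceptable for a sketch but not for large datasets.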