Data mining model and application for stroke prediction: A combination of demographic and medical screening data approach |
---|---|
DOI | |
Creator | Sotarat Thammaboosadee |
Title | Data mining model and application for stroke prediction: A combination of demographic and medical screening data approach |
Contributor | Teerapat Kansadub |
Publisher | Nakhon Pathom Rajabhat University |
Publication Year | 2562 BE (2019 CE) |
Journal Title | Interdisciplinary Research Review |
Journal Vol. | 14 |
Journal No. | 4 |
Page no. | 61 |
Keyword | Stroke prediction, data mining, medical screening data |
URL Website | http://apps.npru.ac.th |
Website title | Research and Development Institute, Nakhon Pathom Rajabhat University |
ISSN | 2697-522X |
Abstract | This paper presents the data mining process used to build a stroke prediction model based on demographic information and medical screening data. The data, gathered from a physical therapy center in Thailand, comprised outpatients' medical records, medical screening forms, and a target variable. A group of 147 stroke patients and 294 non-stroke individuals with six demographic predictors was selected for the study. Three classification algorithms were used: Naïve Bayes, Decision Tree, and Artificial Neural Network (ANN). They were applied to the collected data, evaluated with 10-fold cross-validation, and their results compared. The primary selection criteria were accuracy and the area under the ROC curve (AUC). The secondary selection criteria were the False-Positive Rate (FPR) and False-Negative Rate (FNR). The best-performing approach was ANN combined with integrated data, with an overall accuracy of 0.84, an AUC of 0.90, an FPR of 0.12, and an FNR of 0.25. The results demonstrated that ANN with the integration of demographic and medical screening data produced the best predictive performance of the models compared, according to both the primary and secondary model selection criteria. |
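
As a rough consistency check on the figures in the abstract, the reported class sizes and error rates can be combined to see what overall accuracy they imply. The sketch below is illustrative only: the function name `implied_accuracy` is hypothetical, and it assumes the FPR and FNR apply to the whole sample of 147 stroke and 294 non-stroke cases, which may differ slightly from how the study aggregated its 10-fold cross-validation results.

```python
def implied_accuracy(n_pos, n_neg, fpr, fnr):
    """Overall accuracy implied by class sizes and error rates.

    Assumption: fpr and fnr are pooled rates over the full sample,
    not per-fold averages.
    """
    true_pos = n_pos * (1.0 - fnr)  # stroke cases correctly identified
    true_neg = n_neg * (1.0 - fpr)  # non-stroke cases correctly identified
    return (true_pos + true_neg) / (n_pos + n_neg)


# Figures taken from the abstract: 147 stroke, 294 non-stroke,
# FPR = 0.12, FNR = 0.25.
acc = implied_accuracy(n_pos=147, n_neg=294, fpr=0.12, fnr=0.25)
print(round(acc, 2))  # 0.84
```

Under that assumption, the implied value agrees with the overall accuracy of 0.84 reported in the abstract.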