The density-based minority over-sampling framework for class imbalanced problems |
---|---|
DOI | |
Title | The density-based minority over-sampling framework for class imbalanced problems |
Creator | Chumphol Bunkhumpornpat |
Contributor | Krung Sinapiromsaran, Chidchanok Lursinsap |
Publisher | Chulalongkorn University |
Publication Year | 2554 BE (2011 CE) |
Keyword | Data mining, Cluster analysis, Sampling |
Abstract | A dataset exhibits the class imbalance problem when the target class has very few instances relative to the other classes. A trivial classifier typically fails to predict the positive instances because there are so few of them. In this thesis, the density-based minority over-sampling framework is proposed. It relies on a density-based notion of clusters and is designed to over-sample an arbitrarily shaped cluster discovered by a density-based clustering algorithm. In detail, my framework generates a synthetic instance along the shortest path from each instance in a minority-class cluster to the pseudo-centroid of that cluster. Consequently, the synthetic instances are dense near the pseudo-centroid and sparse far from it. Because of this distribution, a classifier places more emphasis on the core region than on the border region. The experimental results show that my framework improves the accuracy, F-value (a combined measure of Precision and Recall), and AUC of a classifier more than SMOTE and Safe-Level-SMOTE do. |
URL Website | cuir.car.chula.ac.th |
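
The over-sampling step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation; it rests on several assumptions that the record does not state: the minority cluster is already given (the density-based clustering step is skipped), cluster members are connected in a graph when they lie within a hypothetical radius `eps` of each other, the pseudo-centroid is taken to be the cluster member nearest to the arithmetic mean, and each synthetic instance is placed at a random position on a randomly chosen edge of the shortest path. All function names are illustrative.

```python
import heapq
import math
import random


def pseudo_centroid(points):
    """Index of the cluster member closest to the arithmetic mean (assumed definition)."""
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    return min(range(len(points)), key=lambda i: math.dist(points[i], mean))


def shortest_path_tree(points, source, eps):
    """Dijkstra over the eps-neighbourhood graph; returns each node's predecessor."""
    n = len(points)
    dist = [math.inf] * n
    pred = [None] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v in range(n):
            w = math.dist(points[u], points[v])
            if v != u and w <= eps and d + w < dist[v]:
                dist[v] = d + w
                pred[v] = u
                heapq.heappush(heap, (d + w, v))
    return pred


def oversample_cluster(points, eps, seed=0):
    """One synthetic instance per cluster member, placed on its path to the pseudo-centroid."""
    rng = random.Random(seed)
    c = pseudo_centroid(points)
    pred = shortest_path_tree(points, c, eps)
    synthetic = []
    for i in range(len(points)):
        if i == c or pred[i] is None:
            continue  # skip the pseudo-centroid itself and unreachable members
        # Collect the edges on the shortest path from member i to the pseudo-centroid.
        edges = []
        u = i
        while u != c:
            edges.append((u, pred[u]))
            u = pred[u]
        a, b = rng.choice(edges)  # assumed: pick one edge uniformly at random
        t = rng.random()          # interpolation position along that edge
        synthetic.append(
            tuple(pa + t * (pb - pa) for pa, pb in zip(points[a], points[b]))
        )
    return synthetic


if __name__ == "__main__":
    cluster = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
    for point in oversample_cluster(cluster, eps=1.5):
        print(point)
```

Because every path ends at the pseudo-centroid, edges close to it are shared by many paths and are chosen more often, which is one way to obtain the dense-core, sparse-border distribution the abstract describes.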