| Identifying Text-based Online Thai Hate Speech in Social Media | |
|---|---|
| DOI | |
| Creator | Siranuch Hemtanon |
| Title | Identifying Text-based Online Thai Hate Speech in Social Media |
| Contributor | Ketsara Phetkrachang, Wachira Yangyuen |
| Publisher | Department of Information Science, Faculty of Humanities and Social Sciences, Khon Kaen University |
| Publication Year | 2567 BE (2024 CE) |
| Journal Title | Journal of Information Science Research and Practice |
| Journal Vol. | 42 |
| Journal No. | 4 |
| Page no. | 44–56 |
| Keyword | Online hate speech, Classification, Social network service, Keyword detection, Text mining |
| URL Website | https://www.tci-thaijo.org/index.php/jiskku/index |
| Website title | Journal of Information Science Research and Practice |
| ISSN | 3027-6586 |
| Abstract | Purpose: This work proposes a method for detecting Thai online hate speech, which is categorized into five types: ethnic-based, gender-based, ableism, belief-based, and social status-based hate speech. Online comments from popular social network services in Thailand were collected and annotated as training data. Methodology: Machine learning approaches are employed to perform multiclass classification of hate speech. In addition, the information gain (IG) score is used to determine which terms are significant in conveying the hateful intent of each hate speech class. Findings: The detection results show that a model combining TF-IDF and trigram features with an SVM classifier achieved the best performance, with an average F-measure of 0.76. The IG score also yields a list of significant terms associated with each hate speech class. Applications of this study: Hate speech detection helps analyze Thai text messages that may be hurtful to recipients. It can proactively filter and block a message before it is posted to prevent cyberbullying on social media platforms, and it can remind users who may unintentionally choose risky Thai words that could cause emotional harm to readers. |
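
The abstract mentions ranking terms by information gain (IG) for each hate speech class. The following is a minimal sketch of how such a score can be computed for a term's presence or absence, assuming documents have already been segmented into token sets. The function names, placeholder tokens, and toy labels are hypothetical and are not taken from the paper, and the authors' actual feature pipeline (TF-IDF, trigrams, SVM) is not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(docs, labels, term):
    """IG of a term's presence/absence with respect to the class labels.

    IG(t) = H(C) - [P(t) * H(C | t) + P(not t) * H(C | not t)]
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


# Hypothetical toy data: each document is a set of already-segmented tokens.
# The tokens are neutral placeholders, and only two of the five classes appear.
docs = [
    {"a", "x"},
    {"a", "y"},
    {"b", "x"},
    {"b", "y"},
]
labels = ["ethnic", "ethnic", "gender", "gender"]

# Rank the vocabulary from most to least informative about the class.
vocabulary = {t for d in docs for t in d}
ranked = sorted(
    vocabulary,
    key=lambda t: information_gain(docs, labels, t),
    reverse=True,
)
```

In this toy set, the tokens `a` and `b` separate the two classes perfectly and rank above `x` and `y`, which occur equally often in both classes and therefore carry no information about the label.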