Leveraging PyThaiNLP for Sentiment Analysis of Thai Online Text: A Comparative Study of Logistic Regression and Support Vector Machine
DOI
Creator Sunisa Duangtham
Title Leveraging PyThaiNLP for Sentiment Analysis of Thai Online Text: A Comparative Study of Logistic Regression and Support Vector Machine
Contributor Setthaphong Lertritrungrot, Nattavadee Hongboonmee, Wansuree Massagram
Publisher Faculty of Informatics, Mahasarakham University
Publication Year 2568 BE (2025 CE)
Journal Title Journal of Applied Informatics and Technology
Journal Vol. 7
Journal No. 2
Page no. 268-282
Keyword PyThaiNLP, Sentiment Analysis, Thai Online Text
URL Website https://ph01.tci-thaijo.org/index.php/jait
Website title Journal of Applied Informatics and Technology
ISSN 3088-1803
Abstract This study compares the performance of sentiment analysis models for Thai online text built with the existing PyThaiNLP library. Text was collected from online sources to create a dataset and manually labeled as positive, neutral, or negative. Preprocessing consisted of removing punctuation marks, tokenizing, removing non-Thai characters, and building a bag-of-words representation. The data was then split into training and testing sets, and models were built with three algorithms: logistic regression, logistic regression with stochastic gradient descent (SGD), and support vector machine (SVM). Logistic regression performed best, achieving an accuracy of 80.73% with a 90:10 train-test split, the newmm word tokenizer, and the augmented dictionary. Accuracy was 81.10% for positive sentiment, 80.16% for neutral sentiment, and 80.97% for negative sentiment.
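The preprocessing steps named in the abstract (stripping punctuation and non-Thai characters, tokenizing, building a bag of words) can be illustrated with a minimal sketch. This is not the authors' code: the function names `clean_text` and `bag_of_words` are hypothetical, the tokenizer is passed in as a parameter, and cleaning is done before tokenizing as a simplification of the order given in the abstract.

```python
import re
from collections import Counter
from typing import Callable, Iterable, List, Tuple

# Assumption: "non-Thai" means anything outside the Thai Unicode block
# (U+0E00..U+0E7F). Thai digits and Thai punctuation inside the block are kept.
NON_THAI = re.compile(r"[^\u0E00-\u0E7F\s]+")


def clean_text(text: str) -> str:
    """Replace punctuation, Latin letters, Arabic digits and other
    non-Thai characters with spaces, then collapse whitespace."""
    return re.sub(r"\s+", " ", NON_THAI.sub(" ", text)).strip()


def bag_of_words(
    docs: Iterable[str],
    tokenize: Callable[[str], List[str]],
) -> Tuple[List[str], List[List[int]]]:
    """Return (vocabulary, count vectors) for the cleaned, tokenized documents."""
    token_lists = [
        [tok for tok in tokenize(clean_text(doc)) if tok.strip()]
        for doc in docs
    ]
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for tokens in token_lists:
        row = [0] * len(vocab)
        for tok, count in Counter(tokens).items():
            row[index[tok]] = count
        vectors.append(row)
    return vocab, vectors


if __name__ == "__main__":
    # Pre-spaced toy sentences so a whitespace split can stand in for a
    # real Thai word tokenizer in this demonstration.
    docs = ["ดี มาก!!! 555", "ไม่ ดี เลย :(", "ดี ดี"]
    vocab, vectors = bag_of_words(docs, str.split)
    print(vocab)
    print(vectors)
```

With PyThaiNLP installed, the `tokenize` argument could instead be something like `lambda s: word_tokenize(s, engine="newmm")`, using `word_tokenize` from `pythainlp.tokenize`. The resulting count vectors, together with the manual sentiment labels, are the kind of input a logistic regression or SVM classifier would be trained on; the abstract does not say which classifier implementation the authors used.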
Faculty of Informatics
