| Development of a Thai Impolite Word Dictionary and an Algorithm for Detecting Thai Impolite Words using Multiple Patterns Inverted Lists | |
|---|---|
| DOI | |
| Creator | Nattawut Kaewsiri |
| Title | Development of a Thai Impolite Word Dictionary and an Algorithm for Detecting Thai Impolite Words using Multiple Patterns Inverted Lists |
| Contributor | Kanida Charungchit, Chouvalit Khancome |
| Publisher | KKU Science Journal |
| Publication Year | 2024 (B.E. 2567) |
| Journal Title | KKU Science Journal |
| Journal Vol. | 52 |
| Journal No. | 1 |
| Page no. | 106 - 120 |
| Keyword | Impolite Words, Algorithm, Rude Words Detection, Inverted List, Data Structure |
| URL Website | https://ph01.tci-thaijo.org/index.php/KKUSciJ/article/view/255944 |
| Website title | Thai Journal Online (ThaiJO) |
| ISSN | 3027-6667 |
| Abstract | This article presents a new approach to detecting impolite Thai words, built on a new dictionary structure for storing them. Instead of storing whole words as in a traditional dictionary, the structure stores character positions: the characters of each word are organized into inverted lists and stored in a hash-based table for rapid access. On top of this dictionary, the article develops an algorithm that detects impolite words immediately as characters are entered, without waiting for the complete input as previous algorithms require. The dictionary is built in O(W) time and O(|Σ|+|W|) space, where W is the total length of all words in the dictionary. Detection takes O(n) time, where n is the length of the input text. In experiments, the program built on the new dictionary model detected impolite words significantly faster than the traditional dictionary-based approach, and it detected the impolite words contained in the dictionary with 100% accuracy. |
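
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `build_inverted_lists`, `StreamDetector`, and `feed` are hypothetical, the slot layout `(word_id, position, is_last)` is an assumption about what "character positions" might look like, and the sample words are harmless English placeholders. Building the table visits each dictionary character once, in line with the stated O(W) construction cost. The detection loop here scans every slot of the incoming character, so it does not by itself guarantee the O(n) bound reported for the published algorithm.

```python
from collections import defaultdict


def build_inverted_lists(words):
    """Map each character to the (word_id, position, is_last) slots where it occurs."""
    table = defaultdict(list)  # hash-based table: expected O(1) lookup per character
    for word_id, word in enumerate(words):
        for pos, ch in enumerate(word):
            table[ch].append((word_id, pos, pos == len(word) - 1))
    return table


class StreamDetector:
    """Feed characters one at a time; report dictionary words as soon as they complete."""

    def __init__(self, words):
        self.words = list(words)
        self.table = build_inverted_lists(self.words)
        self.active = set()  # (word_id, position) slots matched by the previous character

    def feed(self, ch):
        next_active = set()
        found = []
        for word_id, pos, is_last in self.table.get(ch, ()):
            # A slot continues a match if it starts a word or extends an active prefix.
            if pos == 0 or (word_id, pos - 1) in self.active:
                if is_last:
                    found.append(self.words[word_id])
                else:
                    next_active.add((word_id, pos))
        self.active = next_active
        return found


# Placeholder dictionary (assumption): stand-ins for the real word list.
detector = StreamDetector(["bad", "badly", "ad"])
for ch in "xbadly":
    hits = detector.feed(ch)
    if hits:
        print(repr(ch), hits)
```

Each call to `feed` returns the words that end at the character just entered, so a caller can react before the rest of the text arrives. Overlapping words such as "bad" and "ad" are reported together because every dictionary word keeps its own position slots.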