| LCS-based Thai Trending Keyword Extraction from Online News | |
|---|---|
| DOI | |
| Creator | Kietikul Jearanaitanakij |
| Title | LCS-based Thai Trending Keyword Extraction from Online News |
| Contributor | Nattapong Kueakool, Puwadol Limwanichsin, Tiwat Kullawan, Chankit Yongpiyakul |
| Publisher | Faculty of Engineering, Naresuan University |
| Publication Year | 2565 BE (2022 CE) |
| Journal Title | Naresuan University Engineering Journal |
| Journal Vol. | 17 |
| Journal No. | 2 |
| Page no. | 54-61 |
| Keyword | Longest common substring, Natural language processing, Online news, Thai trending keyword, Varying-length keyword |
| URL Website | https://ph01.tci-thaijo.org/index.php/nuej |
| Website title | https://ph01.tci-thaijo.org/index.php/nuej |
| ISSN | 1905-615X |
| Abstract | A trending keyword is a common word or phrase that is mentioned most frequently in the current period. Extracting trending keywords from Thai online news is not trivial. A keyword that is too short may carry no specific meaning, because it may be just a common word with no significance to the interpretation. A longer common keyword conveys more meaning, but the time needed to extract long keywords from a collection of documents may not be bounded within a reasonable limit. This research addresses the problem of finding varying-length trending keywords in Thai online news within a reasonable running time. We propose a novel method that extracts trending keywords by applying the longest common substring (LCS) algorithm. Common keywords with a high occurrence frequency are selected as the trending keywords. The proposed method inherits its reasonable running time from the dynamic programming technique of the LCS algorithm. Experimental results on news from various Thai online news agencies indicate that the proposed method achieves higher precision than the char-N-gram and word-N-gram strategies. |
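
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how documents are paired, filtered, or ranked, so the pairwise comparison, the `min_len` threshold, and the function names `longest_common_substring` and `trending_candidates` are all assumptions made for illustration. The sketch uses the standard dynamic programming recurrence for the longest common substring, which runs in time proportional to the product of the two string lengths, and then counts how often each common substring appears across pairs of headlines.

```python
from collections import Counter
from itertools import combinations


def longest_common_substring(a: str, b: str) -> str:
    """Return one longest contiguous substring shared by a and b.

    Standard dynamic programming: curr[j] holds the length of the longest
    common suffix of a[:i] and b[:j]. Time is O(len(a) * len(b)).
    """
    best_len = 0
    best_end = 0  # exclusive end index in `a` of the best match so far
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


def trending_candidates(headlines, min_len=3):
    """Count common substrings over all headline pairs (hypothetical scheme).

    min_len is an assumed threshold that discards very short matches,
    which are unlikely to carry a specific meaning.
    """
    counts = Counter()
    for x, y in combinations(headlines, 2):
        common = longest_common_substring(x, y).strip()
        if len(common) >= min_len:
            counts[common] += 1
    return counts.most_common()


if __name__ == "__main__":
    print(longest_common_substring("abcdef", "zcdemn"))  # prints "cde"
    sample = [
        "flood warning in Bangkok today",
        "Bangkok today sees heavy rain",
        "traffic jam in Bangkok today",
    ]
    for keyword, freq in trending_candidates(sample):
        print(freq, keyword)
```

Because the match operates on characters rather than pre-segmented words, a sketch like this yields keywords of varying length without requiring a word tokenizer, which is relevant for Thai text written without spaces between words. Comparing every pair of documents is quadratic in the number of documents, so a real system would likely need a cheaper pairing strategy.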