Towards Electronic Version of the Royin Thai Dictionary from Information-Heavily Semi-structured Data Source |
---|---|
DOI | |
Creator | Taneth Ruangrajitpakor |
Title | Towards Electronic Version of the Royin Thai Dictionary from Information-Heavily Semi-structured Data Source |
Contributor | Adisak Kingkaewkanthong, Thepchai Supnithi |
Publisher | Sirindhorn International Institute of Technology, Bangkadi Campus (SIIT-BKD) |
Publication Year | 2561 BE (2018 CE) |
Journal Title | Journal of Intelligent Informatics and Smart Technology |
Journal Vol. | 3 |
Page no. | 1-7 |
Keyword | Dictionary Development, Electronic Dictionary, Information Extraction, Semi-structure Source |
URL Website | https://ph05.tci-thaijo.org/index.php/JIIST |
Website title | Journal of Intelligent Informatics and Smart Technology |
ISSN | 2586-9167 |
Abstract | To provide knowledge of Thai words, it was decided to digitise the Royin dictionary. This work describes the processes of extracting information from the printed version of the dictionary. Because the information source is in a semi-structured format, an automatic type-detection method is used to extract the respective details into a database. The patterns and formatting of the source are used throughout as hints for extraction. Ambiguities in the extraction process, and their solutions, are also discussed. As a result, lexical entries are stored systematically with distinguishable details, and entries are connected to one another by interoperable relations. In the evaluation, the automatic extraction processes handled more than 80% of entries overall, and the remaining ambiguous entries were sent to experts for a decision. |