| A new weighting scheme for document ranking based on the modified word-embedding method | |
|---|---|
| DOI | |
| Creator | 1. Mohammad Edalatfard 2. Morteza Mohammadi Zanjireh 3. Mahdi Bahaghighat |
| Title | A new weighting scheme for document ranking based on the modified word-embedding method |
| Publisher | Faculty of Engineering, Khon Kaen University |
| Publication Year | 2024 (2567 BE) |
| Journal Title | Engineering and Applied Science Research |
| Journal Vol. | 51 |
| Journal No. | 2 |
| Page no. | 259-266 |
| Keyword | Ad-hoc retrieval, Information retrieval, Weighting methods, Word embedding, Word2vec |
| URL Website | https://ph01.tci-thaijo.org/index.php/easr/index |
| Website title | Engineering and Applied Science Research |
| ISSN | 2539-6161 |
| Abstract | Finding documents relevant to a search query, or similar to a given document, is a core task of information retrieval. The vector space model underlies fundamental techniques such as the bag-of-words and TF-IDF models, which are the main strategies for measuring document similarity. Another way to produce a document vector is to combine word vectors. Thanks to recent advances in distributed representations of meaning, word vectors can be learned from large volumes of unlabeled text, primarily with artificial neural network (ANN)-based methods. A semantic space is built from this data, and word-embedding vectors represent words in that space. The present study examines various approaches for transforming word-embedding vectors into document vectors and proposes a new one. Ad-hoc retrieval is one of the information retrieval tasks that employs these techniques. The approaches are evaluated and compared using mean average precision (MAP) and normalized discounted cumulative gain (NDCG). The results show that the proposed TAW-TFIDF method outperforms the alternative weighting methods. |
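
The abstract describes turning word-embedding vectors into document vectors with a weighting scheme and scoring the resulting rankings with MAP. The sketch below illustrates that general pipeline with a plain TF-IDF-weighted average of word vectors; it is not the TAW-TFIDF method, whose definition is not given in this record. The toy embeddings, the smoothed IDF formula, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def idf_table(docs):
    """Smoothed IDF per term (assumed formula: log((1 + N) / (1 + df)) + 1)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def doc_vector(tokens, embeddings, idf):
    """TF-IDF-weighted mean of the word vectors of in-vocabulary tokens."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for term, tf in Counter(tokens).items():
        if term not in embeddings:
            continue  # out-of-vocabulary terms contribute nothing
        weight = tf * idf.get(term, 1.0)
        for i, x in enumerate(embeddings[term]):
            total[i] += weight * x
        weight_sum += weight
    return [x / weight_sum for x in total] if weight_sum else total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query, docs, embeddings):
    """Return document indices ordered by decreasing similarity to the query."""
    idf = idf_table(docs)
    q = doc_vector(query, embeddings, idf)
    scores = [(cosine(q, doc_vector(d, embeddings, idf)), i) for i, d in enumerate(docs)]
    return [i for _, i in sorted(scores, key=lambda pair: -pair[0])]


def average_precision(ranking, relevant):
    """Average precision of one ranking; MAP is its mean over all queries."""
    hits, total = 0, 0.0
    for k, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


# Hypothetical 2-dimensional embeddings; real ones would come from word2vec.
EMB = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0], "road": [0.1, 0.9]}
DOCS = [["car", "road"], ["cat", "dog"], ["cat", "car"]]

ranking = rank(["dog"], DOCS, EMB)
ap = average_precision(ranking, relevant={1})
```

Swapping the `weight` expression in `doc_vector` for another scheme is the only change needed to compare weighting methods under the same ranking and evaluation code.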