Web Scraping-based System for E-commerce Price Comparison and Similar Product Segmentation
DOI
Creator Pongsin Jankaew
Title Web Scraping-based System for E-commerce Price Comparison and Similar Product Segmentation
Contributor Wachirawut Thamviset
Publisher Faculty of Informatics, Mahasarakham University
Publication Year 2568 BE (2025 CE)
Journal Title Journal of Applied Informatics and Technology
Journal Vol. 7
Journal No. 2
Page no. 346-362
Keyword Agglomerative Clustering, E-commerce, Product Identification, Web Scraping
URL Website https://ph01.tci-thaijo.org/index.php/jait
Website title Journal of Applied Informatics and Technology
ISSN 3088-1803
Abstract With the rapid growth of e-commerce, finding the best deals across a multitude of online shopping websites has become a challenge. Consumers often spend considerable time manually sifting through and comparing data, which leads to uncertainty in decision-making. To address this issue, our research proposes a system that uses web scraping techniques to identify top deals from multiple e-commerce sites. We have developed Python-based web scraping scripts and incorporated a configuration file for customization, enabling users to extract product data from diverse websites. The system scrapes data and displays the results each time the user enters a query, ensuring that the scraped data is up to date. Furthermore, the system enhances the user experience by incorporating product model datasets for product identification, enabling specific searches based on product specifications and offering recommendations for similar product models. Finally, for products that remain unidentified, we introduce a feature that groups similar products using an agglomerative clustering method. This method uses product name features extracted by TF-IDF and image features extracted by Convolutional Neural Networks (CNN), allowing price comparisons among similar products and enhancing the overall shopping experience. Preliminary evaluations show that the system successfully extracts data from target websites when properly customized. The evaluations of similar product clustering demonstrate that combining product name and image features significantly improves clustering performance, surpassing product names alone by 9 percent and images alone by 18 percent.
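The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no details on linkage, distance metric, feature weighting, or stopping rule, so average linkage, cosine distance, equal weighting of name and image features, and the 0.5 threshold are all assumptions. The short `image_features` vectors are hypothetical stand-ins for CNN embeddings, and the product names are invented for illustration.

```python
import math
from collections import Counter


def l2_normalise(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def tfidf_vectors(names):
    """Unit-length TF-IDF vectors for whitespace-tokenised product names."""
    docs = [name.lower().split() for name in names]
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (an assumption; other weighting schemes are possible).
    idf = {tok: math.log((1 + n) / (1 + df[tok])) + 1 for tok in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(l2_normalise([tf[tok] * idf[tok] for tok in vocab]))
    return vectors


def cosine_distance(a, b):
    """Cosine distance between two unit-length vectors."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def agglomerative(vectors, threshold):
    """Average-linkage agglomerative clustering; stops merging once the
    closest pair of clusters is farther apart than `threshold`."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def cluster_products(names, image_features, threshold):
    """Concatenate unit-length name and image features (equal weight,
    an assumption) and cluster the combined vectors."""
    text_vecs = tfidf_vectors(names)
    combined = [
        l2_normalise(t + l2_normalise(v)) for t, v in zip(text_vecs, image_features)
    ]
    return agglomerative(combined, threshold)


# Invented example listings; image_features stand in for CNN embeddings.
names = [
    "acme phone x1 128gb black",
    "acme phone x1 128gb blue",
    "zeta blender 500w",
    "zeta blender 500w steel",
]
image_features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

clusters = cluster_products(names, image_features, threshold=0.5)
for group in clusters:
    print([names[i] for i in group])
```

Each resulting group holds indices of listings judged similar, which is the unit over which prices would then be compared. In practice the threshold and the relative weight of name versus image features would need tuning against labelled data.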
Faculty of Informatics
