Hybrid machine learning models: A comprehensive, data-driven evaluation with diverse data partitioning strategies for net radiation estimation
DOI
Creator 1. Kristian Lorenz Bajao
2. Kittisak Phetpan
3. Ponlawat Chophuk
4. Rattapong Suwalak
Title Hybrid machine learning models: A comprehensive, data-driven evaluation with diverse data partitioning strategies for net radiation estimation
Publisher Faculty of Engineering, Khon Kaen University
Publication Year 2025 (B.E. 2568)
Journal Title Engineering and Applied Science Research
Journal Vol. 52
Journal No. 3
Page no. 240-250
Keyword Artificial intelligence, Crop water requirement, Smart irrigation, Climate change, Net radiation, Data partitioning
URL Website https://ph01.tci-thaijo.org/index.php/easr/index
Website title Engineering and Applied Science Research
ISSN 2539-6161
Abstract Surface net radiation (Rn) is crucial for climate modeling and agricultural management but is often not readily available, especially in regions like Thailand. Accurate prediction of Rn is essential for estimating evapotranspiration, which is vital for irrigation planning and agricultural productivity. This study develops a hybrid machine learning framework that incorporates K-Nearest Neighbors (KNN) for missing data imputation, Random Forest-Recursive Feature Elimination (RF-RFE) for feature selection, and machine learning models (Multi-layer Perceptron, K-Nearest Neighbors, and Random Forest) for prediction. The research evaluates various data partitioning methods, including hold-out split, K-fold cross-validation, and growing-window forward-validation (gwFV), alongside hyperparameter tuning using GridSearch to enhance model robustness and prevent overfitting. The primary objectives are to develop and evaluate the hybrid ML models for daily Rn estimation using basic meteorological inputs (temperature, relative humidity, and sunshine duration), assess the impact of different input combinations on prediction accuracy in Sawi, Chumphon, Thailand, and compare data partitioning techniques to determine the optimal model performance. Utilizing FAO56PM-calculated Rn as a reference, this study finds that the Random Forest model, with average temperature and sunshine duration (M2) as inputs evaluated under the gwFV method, achieves the highest stability and high accuracy (R² of 0.972, RMSE of 0.457 MJ m⁻² day⁻¹, and MAPE of 3.50%). The Random Forest demonstrates strong generalization capabilities, making it a reliable choice. Even models using only sunshine duration (M3) perform adequately, offering a solution when data availability is scarce. This study concludes that hybrid machine learning models, combined with careful data partitioning, significantly improve Rn estimation. These advancements provide valuable insights for climate modeling, agricultural management, and irrigation scheduling, particularly in data-scarce regions.
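The growing-window forward-validation (gwFV) scheme and the three error metrics named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. The record does not state the fold count or the initial window size, so the function name `growing_window_splits` and its parameters `n_folds` and `min_train` are hypothetical. R² is assumed here to be the coefficient of determination.

```python
import math


def growing_window_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs in time order.

    The training window always starts at index 0 and grows by one test
    block per fold, so each fold is tested only on later observations.
    Fold count and initial window size are illustrative assumptions.
    """
    if n_folds < 1 or min_train < 1 or min_train >= n_samples:
        raise ValueError("need n_folds >= 1 and 0 < min_train < n_samples")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        test_end = train_end + test_size
        yield list(range(train_end)), list(range(train_end, test_end))


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (e.g. MJ m^-2 day^-1)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; y_true must not contain zeros."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination (assumed meaning of R^2 in the abstract)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Ten daily samples, three folds, first four days reserved for initial training.
splits = list(growing_window_splits(n_samples=10, n_folds=3, min_train=4))
for train_idx, test_idx in splits:
    print(len(train_idx), "training days ->", test_idx)
```

Unlike shuffled K-fold cross-validation, every test block here lies strictly after its training window, which is the property that makes forward validation suitable for daily time series.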