Advancing soil moisture retrieval from SAR data using meta-heuristic optimization XGBoost and SHAP analysis

CL Hou and ML Tan and ZM Yaseen and F Zhang, INTERNATIONAL JOURNAL OF REMOTE SENSING, 46, 5354-5383 (2025).

DOI: 10.1080/01431161.2025.2518507

Accurate soil moisture (SM) retrieval from synthetic aperture radar (SAR) data presents persistent challenges due to complex surface-signal interactions and optimization difficulties in machine learning approaches. This study proposes a novel framework integrating meta-heuristic optimization with extreme gradient boosting (XGBoost) to enhance SM estimation while systematically investigating SAR-SM relationships. A comprehensive evaluation of 29 meta-heuristic algorithms for hyperparameter optimization of XGBoost was conducted, alongside an analysis of 24 SAR-derived features, using dual-polarization Sentinel-1A data from the Northwest Shandong Plain, China. Field measurements from 159 sampling locations across three seasonal periods (summer monsoon, winter dry season and spring transition) provided ground-truth validation. The Snake Optimization algorithm-enhanced XGBoost (SO-XGBoost) demonstrated superior performance (R² = 0.554 ± 0.223, RMSE = 5.471 ± 1.686 m³ m⁻³), improving accuracy by 15.7% over the baseline model while maintaining reasonable computational efficiency (157.89 s fold⁻¹). SHapley Additive exPlanations (SHAP) analysis revealed that entropy-based features, particularly the normalized intensity component of Shannon Entropy (SEIn), were dominant predictors of SM (mean absolute SHAP value = 4.11), substantially outweighing traditional backscatter coefficients. While faster alternatives like the Kepler Optimization Algorithm (KOA-XGBoost, 25.89 s fold⁻¹) exist, SO-XGBoost offered an optimal balance between accuracy and computational efficiency for operational applications. The spatial SM maps generated for three periods effectively captured seasonal dynamics and land-use influences. These findings suggest a paradigm shift in SAR-based SM retrieval methodology, emphasizing statistical distribution approaches over conventional intensity-based methods. The study demonstrates that meta-heuristic optimization significantly enhances machine learning performance for SM retrieval, while entropy-based polarimetric features should be prioritized in future operational applications for improved accuracy.
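
The "mean absolute SHAP value" quoted in the abstract is commonly computed by averaging the absolute per-sample SHAP attributions of each feature and then ranking features by that average. The sketch below illustrates only that ranking step, under stated assumptions: the function name, the feature labels other than SEIn, and all numbers are hypothetical placeholders, not values or code from the study. In practice the attribution matrix would come from a SHAP explainer applied to the fitted model.

```python
from statistics import mean


def rank_features_by_mean_abs_shap(shap_rows, feature_names):
    """Return (feature, mean |SHAP|) pairs, most important first.

    shap_rows: one list of per-feature SHAP values for each sample.
    feature_names: labels for the columns of shap_rows.
    """
    n_features = len(feature_names)
    if any(len(row) != n_features for row in shap_rows):
        raise ValueError("each SHAP row needs one value per feature")

    # Average the magnitude of each feature's attribution over all samples.
    importances = {
        name: mean(abs(row[j]) for row in shap_rows)
        for j, name in enumerate(feature_names)
    }
    return sorted(importances.items(), key=lambda kv: kv[1], reverse=True)


# Made-up attributions for three samples and three features (assumption:
# "VV" and "VH" stand in for backscatter channels purely for illustration).
feature_names = ["SEIn", "VV", "VH"]
shap_rows = [
    [4.0, -1.0, 0.5],
    [-5.0, 0.5, -0.5],
    [3.0, -1.5, 0.5],
]

ranking = rank_features_by_mean_abs_shap(shap_rows, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so a feature that pushes predictions strongly in both directions still ranks as influential.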
