Abstract:
Objective High-altitude forest monitoring faces multiple constraints, including cloud interference, limited training samples, and high spectral similarity among tree species. These factors severely restrict accurate mapping of dominant species distributions. This study, focusing on typical pure forests in Shangri-La, aimed to enhance species identification accuracy and model generalization through multi-source data and multi-strategy feature optimization.
Method Sentinel-2 optical time-series, Sentinel-1 radar, and Shuttle Radar Topography Mission (SRTM) topographic data were retrieved via Google Earth Engine (GEE). We extracted spectral, texture, vegetation index, radar polarization, topographic, and temporal features to construct a baseline feature set. A Random Forest (RF) model trained on this full set served as the pre-selection benchmark. Jeffries-Matusita (J-M) distance, ReliefF, and recursive feature elimination (RFE) were then run in parallel to generate three individual feature subsets, which were merged by union to form a parallel hybrid feature set. Both the individual and the hybrid feature sets were input into RF models for classification, and the optimal scheme was identified by comparing results across all feature sets. Accuracy was evaluated using Producer’s Accuracy (PA), User’s Accuracy (UA), F1-score, Overall Accuracy (OA), and the Kappa coefficient.
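As an illustration of the union fusion and the accuracy measures named above, the following is a minimal sketch rather than the study's actual code. The function names `union_fusion` and `accuracy_report`, the feature names, and the confusion-matrix values are hypothetical; the matrix is assumed to have reference classes in rows and predicted classes in columns.

```python
from typing import Dict, List, Sequence


def union_fusion(*subsets: Sequence[str]) -> List[str]:
    """Merge feature subsets by set union, keeping first-seen order."""
    merged: Dict[str, None] = {}
    for subset in subsets:
        for name in subset:
            merged.setdefault(name, None)
    return list(merged)


def accuracy_report(cm: List[List[int]]) -> dict:
    """Compute PA, UA, F1, OA and Kappa from a square confusion matrix.

    Assumed convention: rows are reference classes, columns are predictions.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    diag = [cm[i][i] for i in range(n)]

    # Producer's accuracy: correct / reference total (per class).
    pa = [diag[i] / row_sums[i] if row_sums[i] else 0.0 for i in range(n)]
    # User's accuracy: correct / predicted total (per class).
    ua = [diag[i] / col_sums[i] if col_sums[i] else 0.0 for i in range(n)]
    # F1 is the harmonic mean of PA and UA.
    f1 = [2 * p * u / (p + u) if (p + u) else 0.0 for p, u in zip(pa, ua)]

    oa = sum(diag) / total
    # Chance agreement expected from the row and column marginals.
    pe = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 0.0
    return {"PA": pa, "UA": ua, "F1": f1, "OA": oa, "Kappa": kappa}


# Hypothetical subsets standing in for the J-M, ReliefF and RFE outputs.
jm_subset = ["NDVI_autumn", "B8", "VV"]
relieff_subset = ["B8", "elevation"]
rfe_subset = ["VV", "slope", "NDVI_autumn"]
hybrid = union_fusion(jm_subset, relieff_subset, rfe_subset)

# Hypothetical two-class confusion matrix, for illustration only.
report = accuracy_report([[8, 2], [1, 9]])
```

Under union fusion a feature is kept if any one of the three selectors retains it, so the hybrid set is never smaller than the largest individual subset.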
Result (1) Scheme 9, based on the parallel hybrid of J-M distance, ReliefF, and RFE, achieved the highest accuracy (OA = 94.82%, Kappa = 0.94), surpassing the pre-selection baseline (Scheme 5). (2) Multi-source data integration outperformed single-source data. Using Sentinel-2 data alone yielded an OA of 83.35% (Kappa = 0.79). Successively adding Sentinel-1 radar features, Sentinel-1 texture features, topographic features, and Sentinel-2 temporal features raised OA cumulatively by 0.87, 6.28, 8.08, and 10.18 percentage points, respectively (Kappa = 0.81, 0.86, 0.90, 0.92). Notably, the final step of adding Sentinel-2 temporal features contributed 2.10 percentage points on its own. (3) Temporal vegetation index curves revealed significant inter-species differences and strong separability during autumn and winter.
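The arithmetic behind result (2) can be checked with a short sketch. It assumes the four reported gains are cumulative relative to the Sentinel-2-only OA, a reading suggested by the 2.10-point figure for temporal features (10.18 minus 8.08); the helper name `marginal_gains` is hypothetical.

```python
baseline_oa = 83.35  # OA in percent, Sentinel-2 data alone

# Cumulative OA gains (percentage points) as each feature group is added.
cumulative_gains = [
    ("Sentinel-1 radar", 0.87),
    ("Sentinel-1 texture", 6.28),
    ("topography", 8.08),
    ("Sentinel-2 temporal", 10.18),
]


def marginal_gains(baseline, gains):
    """Return (group, resulting OA, marginal gain) for each added group."""
    rows = []
    previous = 0.0
    for name, gain in gains:
        rows.append((name, round(baseline + gain, 2), round(gain - previous, 2)))
        previous = gain
    return rows


rows = marginal_gains(baseline_oa, cumulative_gains)
for name, oa, step in rows:
    print(f"{name:22s} OA = {oa:.2f}%  step = +{step:.2f} pp")
```

On this reading, the Sentinel-1 texture features account for the largest single step, and the full multi-source feature set reaches an OA of 93.53%.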
Conclusion The parallel hybrid feature selection approach, combined with multi-source GEE data, effectively improved the identification accuracy of dominant forest species in Shangri-La. It also revealed their spatial distribution patterns and offers technical support for precision monitoring of forest resources in high-altitude regions.