Abstract:
Objective This study investigates a multispectral image-based method for estimating chlorophyll content in Hopea hainanensis, explores the feasibility of incorporating multispectral frequency domain features into the estimation, and aims to provide an effective tool for the nondestructive monitoring of chlorophyll content in H. hainanensis.
Method Vegetation indices were combined with traditional threshold segmentation methods to remove the background from multispectral images of H. hainanensis, and the optimal segmentation method was selected using the F1 score as the measure of segmentation accuracy. Spatial domain features (vegetation indices and texture features) were then extracted from the segmented multispectral images, and three frequency domain features were introduced. Relative chlorophyll content (SPAD value) was measured with a portable SPAD chlorophyll meter. Image features were screened by correlation analysis and the Lasso algorithm to identify the preferred features that were strongly correlated with the SPAD value of H. hainanensis. Finally, spatial domain, frequency domain and fused feature models were built with partial least-squares regression (PLSR), random forest (RF) and XGBoost, and their accuracy was verified to determine the optimal model form for estimating the SPAD value of young H. hainanensis.
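The segmentation step described above can be illustrated with a minimal sketch. It assumes that DVI is computed as near-infrared minus red reflectance, that Kapur's method is applied to the DVI histogram by maximizing the sum of the two class entropies, and that pixels above the threshold are treated as canopy. The function names, the bin count and the synthetic scene are hypothetical and are not taken from the study.

```python
import numpy as np


def dvi(nir, red):
    """Difference vegetation index, assumed here to be NIR minus red."""
    return np.asarray(nir, dtype=float) - np.asarray(red, dtype=float)


def kapur_threshold(values, bins=256):
    """Threshold that maximizes the summed entropy of the two classes."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(bins - 1):
        low, high = p[: t + 1], p[t + 1:]
        w0, w1 = low.sum(), high.sum()
        if w0 <= 0 or w1 <= 0:
            continue  # one class is empty, so its entropy is undefined
        q0 = low[low > 0] / w0
        q1 = high[high > 0] / w1
        h = -(q0 * np.log(q0)).sum() - (q1 * np.log(q1)).sum()
        if h > best_h:
            best_h, best_t = h, t
    return edges[best_t + 1]  # upper edge of the last "background" bin


def f1_score(pred, truth):
    """F1 of a predicted foreground mask against a reference mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Synthetic scene (hypothetical values): the left half is canopy.
rng = np.random.default_rng(0)
truth = np.zeros((64, 64), dtype=bool)
truth[:, :32] = True
nir = np.where(truth, 0.60, 0.30) + rng.normal(scale=0.03, size=truth.shape)
red = np.where(truth, 0.05, 0.20) + rng.normal(scale=0.03, size=truth.shape)

index = dvi(nir, red)
threshold = kapur_threshold(index)
mask = index > threshold
print("threshold:", threshold, "F1:", f1_score(mask, truth))
```

In practice the reference mask would come from manual annotation, and the same F1 comparison would be repeated for each combination of vegetation index and threshold method.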
Result The segmentation method combining DVI with the Kapur threshold achieved the highest segmentation accuracy, with an F1 score of 0.917, and was therefore the most suitable segmentation method for multispectral images of the H. hainanensis canopy. Many spatial and frequency domain features of the multispectral images were significantly correlated with the SPAD values of H. hainanensis. The most strongly correlated feature was the modified chlorophyll absorption reflectance index, with a correlation coefficient of −0.780, making it the preferred feature when estimating SPAD values from a single image feature. Among the three frequency domain features, wavelet features showed the strongest correlation, so the wavelet transform was the preferred frequency domain transformation for H. hainanensis multispectral images. The SPAD value estimation models built from different image features ranked in performance as single frequency domain feature model < single spatial domain feature model < fused feature model, and the corresponding optimal modeling algorithms were RF and XGBoost, respectively. The fused feature model based on RF was the optimal model, with a test R² of 0.791, which was 7.9% higher than the test R² of the single spatial domain feature model.
Conclusion Introducing the three frequency domain features improves the accuracy of SPAD value estimation for H. hainanensis, and the RF-based fused feature model achieves good estimation accuracy. Combining multispectral spatial and frequency domain features with machine learning algorithms is therefore an effective tool for estimating the relative chlorophyll content of young H. hainanensis, and supports the intelligent cultivation and management of H. hainanensis.
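The comparison of spatial, frequency and fused feature models can be sketched as follows. This is an illustration on synthetic data, not the study's pipeline: the feature matrices, the Lasso penalty `alpha`, the 70/30 split and the forest size are all assumptions, feature screening uses Lasso only (the correlation analysis step is omitted), and the printed scores have no bearing on the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins (hypothetical): 200 samples, 6 spatial domain
# features and 4 frequency domain features, with a SPAD-like target
# that depends on one feature from each group.
rng = np.random.default_rng(0)
n = 200
spatial = rng.normal(size=(n, 6))
frequency = rng.normal(size=(n, 4))
spad = 40 - 5 * spatial[:, 0] + 3 * frequency[:, 1] + rng.normal(size=n)


def fit_and_score(features, target, alpha=0.1, seed=0):
    """Screen features with Lasso, fit a random forest, return test R^2."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, target, test_size=0.3, random_state=seed
    )
    # Standardize so the Lasso penalty treats all features alike.
    scaler = StandardScaler().fit(X_train)
    lasso = Lasso(alpha=alpha).fit(scaler.transform(X_train), y_train)
    keep = np.flatnonzero(lasso.coef_)
    if keep.size == 0:  # fall back to all features if none survive
        keep = np.arange(features.shape[1])
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    model.fit(X_train[:, keep], y_train)
    return keep, r2_score(y_test, model.predict(X_test[:, keep]))


# Feature fusion is taken here to mean simple column-wise concatenation.
fused = np.hstack([spatial, frequency])
for name, feats in [("spatial", spatial), ("frequency", frequency), ("fused", fused)]:
    keep, r2 = fit_and_score(feats, spad)
    print(f"{name}: kept {keep.size} of {feats.shape[1]} features, test R^2 = {r2:.3f}")
```

Swapping `RandomForestRegressor` for a PLSR or XGBoost regressor inside `fit_and_score` would give the other model families named in the abstract under the same split.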