
    Individual tree DBH growth prediction of larch-spruce-fir mixed forests based on random forest algorithm

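    • Abstract:
      Objective: Individual-tree growth is influenced by many factors, including climate and stand conditions, so an appropriate method is needed to identify the dominant climatic and stand factors affecting tree growth. Machine learning methods such as random forest offer a new approach. The reliability of the random forest algorithm for analyzing the effects of climatic and stand factors on tree growth needs to be tested, with the aim of providing a new method for forest growth and yield prediction.
      Methods: Continuous survey data from 20 permanent sample plots of larch−spruce-fir mixed forest in the Wangqing Forestry Bureau, Jilin Province, covering 25 years (1986−2010), were used as study material, with 52 candidate climatic and stand factors. An individual-tree DBH growth model containing climatic and stand factors was built with the random forest algorithm to analyze the effects of these factors on the annual mean DBH increment of individual trees. Fifty-two random forest models were built from 52 hyperparameter combinations (number of decision trees ntree = 1 000; number of predictors randomly selected at each node of a tree mtry = 1, 2, ···, 52), and each model was trained and evaluated by 10-fold cross-validation. Using the complete data set, the optimal random forest model was then used to analyze the relative importance and partial dependence of the independent variables with respect to annual mean DBH increment.
      Results: The model with ntree = 1 000 and mtry = 12 had the best generalization ability of the 52 models. It had the largest cross-validated coefficient of determination (R²cv = 0.54) and the smallest cross-validated root mean square error (RMSEcv), mean absolute deviation (MAEcv) and relative root mean square error (rRMSEcv) (RMSEcv = 0.14 cm, MAEcv = 0.10 cm, rRMSEcv = 50%). Stand factors had a very strong influence on annual mean DBH increment, with a relative importance above 80.00%. Among the 8 stand factors, the basal area of trees larger than the subject tree (BAL) had the greatest influence and the number of trees per hectare (N) the least, with the other factors in between. Annual mean DBH increment decreased as BAL, stand basal area per hectare (BA), N and the quadratic mean DBH of the stand (Dg) increased, and it increased with the ratio of subject-tree DBH to quadratic mean DBH (RD), the initial DBH (D0) and the ratio of subject-tree DBH to the largest DBH in the stand (DDM). Climatic factors had little influence on annual mean DBH increment, with a relative importance below 20.00%. Each of the 44 climatic factors had only a small effect (relative importance < 1% for every factor). Of these, the ratio of mean growing-season precipitation (April−September) to mean annual precipitation (Pratio), the total annual solar radiation duration (Asr), the ratio of mean growing-season precipitation (April−September) to growing-season relative humidity (April−September) (Gspgsrh) and the growing-season solar radiation duration (April−September) (Gssr) were the four most important.
      Conclusions: The random forest model resolved the complex relationships between the predictors and annual mean DBH increment reasonably well. Annual mean DBH increment was strongly affected by stand factors and only weakly affected by climatic factors. Overall, at the local scale, stand factors are the dominant drivers of individual-tree DBH growth, whereas climatic factors have limited power to explain it. The random forest model showed a certain degree of generalization ability and statistical reliability, and the variable importance and partial dependence plots it produced are meaningful in forestry terms.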

       

      Abstract:
      Objective: Individual tree growth is controlled by many factors, such as climate, competition and stand conditions, so appropriate methods are needed to identify the dominant climatic and stand factors affecting tree growth. Machine learning methods such as random forest offer a new approach. This study tests the reliability of the random forest algorithm for analyzing the effects of stand factors and climate on individual tree growth, with the aim of providing a new method for forest growth and yield prediction.
      Method: Long-term monitoring data from 20 larch-spruce-fir sample plots, repeatedly measured over 25 years (1986−2010) in the Wangqing Forest Bureau of Jilin Province, northeastern China, were used. A random forest model of individual tree radial growth was built from 52 candidate independent variables describing competition, stand conditions and climate, and the effects of climate and stand factors on individual tree radial growth were analyzed. Specifically, 52 random forest models were built from 52 hyperparameter combinations (ntree = 1 000 and mtry = 1, 2, 3, ···, 52, where ntree is the number of decision trees and mtry is the number of predictors randomly selected at each node), and 10-fold cross-validation was used to train and evaluate them. The relative importance and partial dependence of the independent variables were then analyzed using the full data set and the optimal random forest model.
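
      The selection procedure described above (one model per mtry value, each scored by 10-fold cross-validation) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the names `k_fold_indices`, `cv_rmse`, `select_by_cv` and `make_knn_fit_predict` are hypothetical, the nearest-neighbour learner is only a stand-in so the loop runs without a random forest library, and ranking by pooled out-of-fold RMSE is an assumption (the abstract reports several criteria).

```python
import random
from statistics import mean


def k_fold_indices(n, k=10, seed=0):
    """Shuffle row indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_rmse(X, y, fit_predict, k=10, seed=0):
    """Pooled out-of-fold RMSE: each row is predicted by a model that never saw it."""
    sq_err = []
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in held_out])
        sq_err.extend((p - y[i]) ** 2 for p, i in zip(preds, held_out))
    return mean(sq_err) ** 0.5


def select_by_cv(X, y, make_fit_predict, grid, k=10, seed=0):
    """Score every candidate on the same folds; return (best value, all scores)."""
    scores = {g: cv_rmse(X, y, make_fit_predict(g), k, seed) for g in grid}
    return min(scores, key=scores.get), scores


def make_knn_fit_predict(m):
    """Hypothetical stand-in learner: mean target of the m nearest training rows.
    Here m merely plays the role that mtry plays for a random forest."""
    def fit_predict(X_train, y_train, X_test):
        preds = []
        for row in X_test:
            ranked = sorted((sum((a - b) ** 2 for a, b in zip(row, r)), t)
                            for r, t in zip(X_train, y_train))
            preds.append(mean(t for _, t in ranked[:m]))
        return preds
    return fit_predict
```

      To reproduce the setting in the text, `make_fit_predict` would wrap a random forest with 1 000 trees and the candidate mtry, and `grid` would be `range(1, 53)`. In scikit-learn, for example, that roughly corresponds to `RandomForestRegressor(n_estimators=1000, max_features=m)`; the abstract does not say which software was used.
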
      Result: The random forest model with ntree = 1 000 and mtry = 12 had the best generalization ability among the 52 random forest models. This model had the highest cross-validated coefficient of determination (R²cv = 0.54) and the lowest cross-validated root mean square error (RMSEcv), mean absolute deviation (MAEcv) and relative root mean square error (rRMSEcv) (RMSEcv = 0.14 cm, MAEcv = 0.10 cm and rRMSEcv = 50%). Individual tree radial growth was affected mostly by stand factors, whose relative importance exceeded 80.00%. Among the 8 stand factors, the sum of the basal area of trees larger than the subject tree (BAL) had the greatest impact on individual tree radial growth, the number of trees per hectare (N) had the least impact, and the impacts of the remaining factors fell in between. Individual tree radial growth decreased as BAL, basal area per hectare (BA), N and the quadratic mean DBH (Dg) increased. Conversely, it increased with the ratio of the DBH of the subject tree to the quadratic mean DBH (RD), the DBH at the beginning of the period (D0) and the ratio of the DBH of the subject tree to the maximal DBH in the stand (DDM). Individual tree radial growth was less affected by climatic factors, whose relative importance was below 20.00%. Each of the 44 climatic variables had little impact on individual tree radial growth (relative importance < 1% for every variable). The ratio of average growing-season precipitation (April−September) to annual precipitation (Pratio), the total annual solar radiation duration (Asr), the ratio of average growing-season precipitation (April−September) to growing-season relative humidity (April−September) (Gspgsrh) and the growing-season solar radiation duration (April−September) (Gssr) were the four most important climatic variables.
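
      The four cross-validation statistics quoted above can be computed from pooled out-of-fold predictions roughly as follows. This is a sketch under stated assumptions rather than the authors' definitions: the function name `cv_metrics` is hypothetical, R² is taken as 1 − SS_res/SS_tot, and rRMSE is assumed to be RMSE expressed as a percentage of the mean observed increment, which the abstract does not spell out.

```python
from math import sqrt


def cv_metrics(observed, predicted):
    """Return R2, RMSE, MAE and rRMSE (%) for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    rmse = sqrt(ss_res / n)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "MAE": sum(abs(r) for r in residuals) / n,
        # Assumption: relative RMSE = RMSE as a percentage of the observed mean.
        "rRMSE": 100.0 * rmse / mean_obs,
    }
```

      If rRMSE is indeed defined this way, an RMSEcv of 0.14 cm together with an rRMSEcv of 50% would correspond to a mean annual DBH increment of about 0.28 cm, which puts the size of the error in context.
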
      Conclusion: The random forest model can reasonably resolve the complex relationships between the predictor variables and individual tree radial growth. Individual tree radial growth is strongly affected by stand factors and only weakly affected by climatic factors. In general, at the local scale, climatic factors have very limited ability to explain the variation in individual tree radial growth, while stand factors, including competition, are its main drivers. The random forest model shows a certain degree of generalization ability and statistical reliability, and both the variable importance and the partial dependence plots have a reasonable forestry interpretation.
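
      The two diagnostics named in the conclusion can be illustrated with a short, model-agnostic sketch. It assumes that relative importance means raw importance scores rescaled to sum to 100%, and it computes partial dependence by its usual definition: fix one predictor at a grid value in every row, predict, and average. All function names are hypothetical, and the numbers in the usage note are arbitrary, not the paper's values.

```python
def relative_importance(raw):
    """Rescale raw importance scores so that they sum to 100 (percent)."""
    total = sum(raw.values())
    return {name: 100.0 * value / total for name, value in raw.items()}


def group_importance(rel, groups):
    """Sum relative importance over named groups of variables."""
    return {g: sum(rel[v] for v in members) for g, members in groups.items()}


def partial_dependence(predict, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value.

    `predict` takes a list of rows and returns one prediction per row;
    each row of X must be a list.
    """
    curve = []
    for value in grid:
        modified = [row[:feature] + [value] + row[feature + 1:] for row in X]
        preds = predict(modified)
        curve.append(sum(preds) / len(preds))
    return curve
```

      For instance, with made-up scores `{"BAL": 5.0, "D0": 3.0, "Pratio": 2.0}` and groups `{"stand": ["BAL", "D0"], "climate": ["Pratio"]}`, `group_importance(relative_importance(...), ...)` attributes 80% to the stand group and 20% to the climate group. A partial dependence curve that falls along the grid indicates that growth is predicted to decline as that variable increases, which is the pattern the abstract reports for BAL.
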

       
