Applications of machine learning algorithms in forest growth and yield prediction
-
Abstract: Forest growth and yield prediction is an important field of forest management science, and modelling forest growth and yield is a key prerequisite for forest management decision-making. Traditional statistical approaches, such as linear and nonlinear regression, mixed-effects models, quantile regression, and errors-in-variables models, have been widely used to study tree growth, but they usually rest on statistical assumptions such as independent, normally distributed, and homoscedastic data. Because forest growth data are repeatedly measured and hierarchical, these assumptions are often difficult to meet. With the recent development of artificial intelligence, machine learning offers a new approach to forest growth and yield prediction: it makes no assumptions about the distribution of the input data, can reveal hidden structure in the data, and often predicts well. Its application in forest growth and yield prediction nevertheless remains limited. We review the application, software, and parameter tuning of the main machine learning algorithms in this field, including classification and regression trees (CART), multivariate adaptive regression splines (MARS), bagging regression, boosted regression trees (BRT), random forests (RF), artificial neural networks (ANN), support vector machines (SVM), and k-nearest neighbors (k-NN), and we discuss the advantages and challenges of machine learning methods. We conclude that machine learning has great potential and will be widely applied in forest growth and yield prediction, and that combining it with traditional statistical models will become a trend in the development of growth and yield models.
-
Table 1. R packages and hyper-parameter tuning methods for machine learning algorithms

| ML algorithm | R package | Adjustable hyper-parameter | Method of parameter tuning | Default value (regression) |
| --- | --- | --- | --- | --- |
| CART | rpart | Complexity parameter (cp) | cp is a real number between 0 and 1. It can be tuned by grid search based on cross-validation (e.g., 10-fold cross-validation), for example over cp = 0.0001, 0.001, 0.01, 0.1. | cp = 0.01 |
| MARS | earth | ① Degree of interaction among input variables (degree); ② Maximum number of terms (including the intercept) in the model (nprune) | degree is an integer ≥ 1; Hastie et al.[93] suggest setting an upper limit on degree (e.g., degree ≤ 3). 2 ≤ nprune ≤ nk, where nk = min(200, max(20, 2 × ncol(x))) and ncol(x) is the number of input variables. degree and nprune can be tuned by grid search based on cross-validation. | degree = 1; no default provided for nprune |
| Bagging regression tree | ipred | Number of decision trees (nbagg) | nbagg is an integer ≥ 1, and the appropriate value depends on the data. To keep predictions reliable without hurting computational efficiency, nbagg can be set above 25 (e.g., 50). | nbagg = 25 |
| RF | randomForest | ① Number of decision trees (ntree); ② Number of input variables randomly sampled as candidates at each node (mtry) | ntree and mtry are integers ≥ 1. The overall error tends to stabilize once ntree exceeds 500, although this depends on the data; to keep predictions reliable without hurting computational efficiency, ntree can be set above 500 (e.g., 1 000). 1 ≤ mtry ≤ P, where P is the total number of input variables; mtry can be tuned by grid search based on cross-validation. | ntree = 500; mtry = one-third of the number of input variables (as an integer) |
| BRT | gbm | ① Form of the loss function, given as the name of the distribution (distribution); ② Total number of trees to fit, i.e., the number of iterations (n.trees); ③ Learning rate or shrinkage (shrinkage); ④ Fraction of the training observations randomly selected to propose the next tree (bag.fraction); ⑤ Maximum depth of variable interactions (interaction.depth) | For regression, distribution is set to gaussian. 0 < bag.fraction ≤ 1; Friedman[96] recommends a value of about 0.5. 0 < shrinkage ≤ 1, and n.trees is an integer greater than 1. shrinkage affects the choice of n.trees: Ridgeway[95] suggests shrinkage between 0.001 and 0.01 together with n.trees between 3 000 and 10 000. interaction.depth is an integer ≥ 1; to balance computational cost and model performance, several values can be tried (e.g., 1, 3, 5, 7, 9). All of these hyper-parameters can be tuned by grid search based on cross-validation. | distribution = gaussian; n.trees = 100; shrinkage = 0.1; bag.fraction = 0.5; interaction.depth = 1 |
| SVM | kernlab | ① Kernel function (kernel); ② Cost of constraint violation (C) | For regression, the linear kernel and the radial basis function kernel are the two commonly used kernels. C is a real number greater than 0 (e.g., 0.01, 0.1, 1, 10, 100, …). Both can be tuned by grid search based on cross-validation. | kernel = radial basis function; C = 1 |
| ANN | nnet | ① Number of hidden nodes (size); ② Weight decay (decay) | size is an integer ≥ 0 and is commonly determined as $\mathrm{size} = \sqrt{P + O} + m$, where P is the number of input variables, O is the number of output variables, and m is an integer between 0 and 10. decay is a real number between 0 and 0.1. Both can be tuned by grid search based on cross-validation (e.g., 10-fold cross-validation). | decay = 0; no default provided for size |
| k-NN | caret | Number of nearest neighbors (k) | k is an integer ≥ 1 and is usually set between 3 and 10 according to Lantz[94]. It can be tuned by grid search based on cross-validation. | k = 5 |
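The rules of thumb in Table 1 can be written out in a few lines. The sketch below is only illustrative: it is plain Python, it does not call any of the R packages listed in the table, and the helper names (`mars_nk`, `ann_size`, `rf_default_mtry`, `expand_grid`) as well as the example value P = 12 are hypothetical. Rounding the square root in the hidden-node rule and flooring the one-third rule for mtry are assumptions, since the table only says that these values are integers.

```python
import itertools
import math


def mars_nk(n_inputs: int) -> int:
    """Upper bound nk for nprune, using the expression given in Table 1."""
    return min(200, max(20, 2 * n_inputs))


def ann_size(n_inputs: int, n_outputs: int, m: int) -> int:
    """Hidden-node rule of thumb: size = sqrt(P + O) + m.

    Rounding sqrt(P + O) to the nearest integer is an assumption.
    """
    if not 0 <= m <= 10:
        raise ValueError("m is expected to be an integer between 0 and 10")
    return round(math.sqrt(n_inputs + n_outputs)) + m


def rf_default_mtry(n_inputs: int) -> int:
    """One-third of the input variables as an integer.

    Flooring and the lower bound of 1 are assumptions.
    """
    return max(n_inputs // 3, 1)


def expand_grid(**candidates):
    """Return every combination of candidate values, one dict per combination."""
    names = list(candidates)
    return [dict(zip(names, combo))
            for combo in itertools.product(*candidates.values())]


P = 12  # hypothetical number of input variables

print(mars_nk(P))          # min(200, max(20, 24)) = 24
print(ann_size(P, 1, 2))   # round(sqrt(13)) + 2 = 4 + 2 = 6
print(rf_default_mtry(P))  # 12 // 3 = 4

# Candidate grid for BRT built from the ranges quoted in Table 1.
brt_grid = expand_grid(
    shrinkage=[0.001, 0.01],
    n_trees=[3000, 10000],
    interaction_depth=[1, 3, 5, 7, 9],
)
print(len(brt_grid))       # 2 * 2 * 5 = 20 combinations
```

Each dictionary in `brt_grid` is one candidate setting; in a tuning run, every candidate would be scored by cross-validation and the best-scoring one retained.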
Table 2. Advantages and disadvantages of the main machine learning algorithms

| ML algorithm | Advantages | Disadvantages |
| --- | --- | --- |
| CART | Distribution-free; little affected by collinearity and outliers; simple and easy to interpret; handles interactions automatically; handles continuous and categorical variables | The structure of a single tree is unstable; prone to overfitting; large decision trees are hard to interpret |
| Bagging regression | Distribution-free; few tuning parameters; little affected by collinearity and outliers; better generalization than CART; handles interactions and continuous and categorical variables | Less interpretable than a single decision tree; sensitive to too many input variables |
| RF | Distribution-free; few tuning parameters; little affected by collinearity and outliers; produces variable importance and partial dependence plots; lower model variance and better generalization than bagging; handles continuous and categorical variables | Less interpretable than a single decision tree; sensitive to too many input variables |
| BRT | Distribution-free; little affected by collinearity and outliers; handles complex nonlinear relationships; produces partial dependence and relative variable importance plots; handles continuous and categorical variables | Many hyper-parameters; tuning is complex |
| SVM | Distribution-free; works for high-dimensional problems; handles interactions among nonlinear features; good generalization; handles continuous and categorical variables | Black box and hard to interpret; inefficient for large samples; a suitable kernel function is hard to find for nonlinear problems; difficult to implement when there are too many categorical variables or too many levels; easily affected by collinearity |
| ANN | Distribution-free; handles nonlinear data; handles continuous and categorical variables | Black box; risk of overfitting; poor predictions when there are too many input variables; easily affected by collinearity |
| k-NN | Distribution-free; simple to build; handles complex nonlinear relationships; handles continuous and categorical variables | Black box; k must be tuned; large prediction errors for imbalanced samples; easily affected by collinearity |
| MARS | Distribution-free; few tuning parameters; little affected by collinearity and outliers; handles nonlinear relationships and variable interactions; computationally efficient; handles continuous and categorical variables | Easily affected by local features of the data |

-
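Table 2 lists the need to tune k as a drawback of k-NN, and Table 1 suggests a cross-validated grid search over k, typically within 3 to 10. The following is a minimal sketch of that procedure in plain Python, not the caret implementation. The function names, the synthetic height data, and the choice of root mean squared error as the score are all assumptions made for illustration, and the predictors are left unscaled for brevity.

```python
import math
import random


def knn_predict(train, x_new, k):
    """Mean response of the k training points closest to x_new (Euclidean distance)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x_new))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cv_rmse(data, k, n_folds=10):
    """Root mean squared error of k-NN regression under n_folds-fold cross-validation."""
    squared_errors = []
    for fold in range(n_folds):
        # Every n_folds-th observation forms the held-out fold.
        test = data[fold::n_folds]
        train = [row for i, row in enumerate(data) if i % n_folds != fold]
        squared_errors += [(knn_predict(train, x, k) - y) ** 2 for x, y in test]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def tune_k(data, candidates=range(3, 11), n_folds=10):
    """Grid search: score every candidate k and return the one with the lowest RMSE."""
    scores = {k: cv_rmse(data, k, n_folds) for k in candidates}
    return min(scores, key=scores.get), scores


# Synthetic, hypothetical data: tree height as a saturating function of diameter.
rng = random.Random(42)
data = []
for _ in range(200):
    dbh = rng.uniform(5, 40)   # diameter at breast height, cm
    age = rng.uniform(10, 80)  # stand age, years (carries no signal here)
    height = 1.3 + 25 * (1 - math.exp(-0.06 * dbh)) + rng.gauss(0, 1)
    data.append(((dbh, age), height))

best_k, scores = tune_k(data)
print(best_k, round(scores[best_k], 3))
```

The same loop applies to any hyper-parameter grid in Table 1: only the model-fitting step inside `cv_rmse` changes.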
[1] Weiskittel A R, Hann D W, Kershaw Jr J A, et al. Forest growth and yield modeling[M]. Chichester: John Wiley and Sons, 2011.
[2] Tang S Z, Li X F, Meng Z H. The development of studies on stand growth models[J]. Forest Research, 1993, 6(6): 672−679 (in Chinese). doi: 10.3321/j.issn:1001-1498.1993.06.018
[3] Peng C H. Growth and yield models for uneven-aged stands: past, present and future[J]. Forest Ecology and Management, 2000, 132(2−3): 259−279. doi: 10.1016/S0378-1127(99)00229-7
[4] Huang S L, Ramirez C, McElhaney M, et al. F3: simulating spatiotemporal forest change from field inventory, remote sensing, growth modeling, and management actions[J]. Forest Ecology and Management, 2018, 415−416: 26−37. doi: 10.1016/j.foreco.2018.02.026
[5] Cutler D R, Edwards T C, Beard K H, et al. Random forests for classification in ecology[J]. Ecology, 2007, 88(11): 2783−2792. doi: 10.1890/07-0539.1
[6] Wu C F, Shen H H, Shen A H, et al. Comparison of machine-learning methods for above-ground biomass estimation based on Landsat imagery[J]. Journal of Applied Remote Sensing, 2016, 10(3): 035010. doi: 10.1117/1.JRS.10.035010
[7] Recknagel F. Applications of machine learning to ecological modelling[J]. Ecological Modelling, 2001, 146(1−3): 303−310.
[8] Liu Z L, Peng C H, Xiang W H, et al. Application of artificial neural networks in global climate change and ecological research: an overview[J]. Chinese Science Bulletin, 2010, 55(34): 3853−3863. doi: 10.1007/s11434-010-4183-3
[9] Guan B T, Gertner G. Modeling red pine tree survival with an artificial neural network[J]. Forest Science, 1991, 37(5): 1429−1440.
[10] Guan B T, Gertner G. Using a parallel distributed processing system to model individual tree mortality[J]. Forest Science, 1991, 37(3): 871−885.
[11] Li J P, Yao D H. Application of BP neural network model to the simulation of breast height diameter and tree-height growth[J]. Journal of Central-South Forestry University, 1996, 16(3): 34−36 (in Chinese).
[12] Hong W, Wu C Z, He D J. A study on the model of forest resources management based on the artificial neural network[J]. Journal of Natural Resources, 1998, 13(1): 69−72 (in Chinese). doi: 10.3321/j.issn:1000-3037.1998.01.012
[13] Pu R L, Gong P. Forest yield prediction with an artificial neural network and multiple regression[J]. Chinese Journal of Applied Ecology, 1999, 10(2): 129−134 (in Chinese). doi: 10.3321/j.issn:1001-9332.1999.02.001
[14] Lin H, Peng C H. Application of artificial neural network in forest resource management[J]. World Forestry Research, 2002, 15(3): 22−31 (in Chinese). doi: 10.3969/j.issn.1001-4241.2002.03.004
[15] Huang J R, Meng X Y, Guan Y X. The study on neural network models of individual tree growth in Pinus massoniana plantation[J]. Journal of Mountain Agriculture and Biology, 2004, 23(5): 386−391 (in Chinese). doi: 10.3969/j.issn.1008-0457.2004.05.003
[16] Peng C H, Wen X Z. Recent applications of artificial neural networks in forest resource management: an overview[C/OL]//Environmental Decision Support Systems and Artificial Intelligence. AAAI, 1999 [2019−06−16]. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.487.7652.
[17] Liu Z L, Peng C H, Work T, et al. Application of machine-learning methods in forest ecology: recent progress and future challenges[J]. Environmental Reviews, 2018, 26(4): 339−350. doi: 10.1139/er-2018-0034
[18] Zhou Z H. Machine learning[M]. Beijing: Tsinghua University Press, 2016 (in Chinese).
[19] Zhou Z H. Machine learning: recent progress in China and beyond[J]. National Science Review, 2018, 5(1): 20. doi: 10.1093/nsr/nwx132
[20] Wu X Z. Applied regression and classification with R[M]. Beijing: China People’s University Press, 2016 (in Chinese).
[21] Dobbertin M, Biging G S. Using the non-parametric classifier CART to model forest tree mortality[J]. Forest Science, 1998, 44(4): 507−516.
[22] Fan Z F, Kabrick J M, Shifley S R. Classification and regression tree based survival analysis in oak-dominated forests of Missouri’s Ozark highlands[J]. Canadian Journal of Forest Research, 2006, 36(7): 1740−1748. doi: 10.1139/x06-068
[23] Adamec Z, Drápela K. Comparison of parametric and nonparametric methods for modeling height-diameter relationships[J]. iForest-Biogeosciences and Forestry, 2017, 10(1): 1−8. doi: 10.3832/ifor1928-009
[24] Aertsen W, Kint V, van Orshoven J, et al. Comparison and ranking of different modelling techniques for prediction of site index in Mediterranean mountain forests[J]. Ecological Modelling, 2010, 221(8): 1119−1130. doi: 10.1016/j.ecolmodel.2010.01.007
[25] Räty M, Kangas A. Localizing general models with classification and regression trees[J]. Scandinavian Journal of Forest Research, 2008, 23(5): 419−430. doi: 10.1080/02827580802378826
[26] Piramuthu S. Input data for decision trees[J]. Expert Systems with Applications, 2008, 34(2): 1220−1226. doi: 10.1016/j.eswa.2006.12.030
[27] Rejwan C, Collins N C, Brunner L J, et al. Tree regression analysis on the nesting habitat of smallmouth bass[J]. Ecology, 1999, 80(1): 341−348. doi: 10.1890/0012-9658(1999)080[0341:TRAOTN]2.0.CO;2
[28] Friedman J H. Multivariate adaptive regression splines[J]. Annals of Statistics, 1991, 19(1): 1−67. doi: 10.1214/aos/1176347963
[29] Prasad A M, Iverson L R, Liaw A. Newer classification and regression tree techniques: bagging and random forests for ecological prediction[J]. Ecosystems, 2006, 9(2): 181−199. doi: 10.1007/s10021-005-0054-1
[30] Chojnacky D C, Heath L S. Estimating down deadwood from FIA forest inventory variables in Maine[J]. Environmental Pollution, 2002, 116(Suppl.1): S25−S30.
[31] Hart S J, Laroque C P. Searching for thresholds in climate-radial growth relationships of Engelmann spruce and subalpine fir, Jasper National Park, Alberta, Canada[J]. Dendrochronologia, 2013, 31(1): 9−15. doi: 10.1016/j.dendro.2012.04.005
[32] Moisen G G, Frescino T S. Comparing five modelling techniques for predicting forest characteristics[J]. Ecological Modelling, 2002, 157(2/3): 209−225.
[33] Ou Q X, Lei X D, Shen C C. Individual tree diameter growth models of larch-spruce-fir mixed forests based on machine learning algorithms[J]. Forests, 2019, 10(2): 187. doi: 10.3390/f10020187
[34] Lee T S, Chiu C C, Chou Y C, et al. Mining the customer credit using classification and regression tree and multivariate adaptive regression splines[J]. Computational Statistics and Data Analysis, 2006, 50(4): 1113−1130. doi: 10.1016/j.csda.2004.11.006
[35] Heddam S, Kisi O. Modelling daily dissolved oxygen concentration using least square support vector machine, multivariate adaptive regression splines and M5 model tree[J]. Journal of Hydrology, 2018, 559: 499−509. doi: 10.1016/j.jhydrol.2018.02.061
[36] Breiman L. Random forests[J]. Machine Learning, 2001, 45(1): 5−32. doi: 10.1023/A:1010933404324
[37] Friedman J H. Greedy function approximation: a gradient boosting machine[J]. Annals of Statistics, 2001, 29(5): 1189−1232.
[38] Goldstein A, Kapelner A, Bleich J, et al. Peeking inside the black box: visualizing statistical learning with plots of individual conditional expectation[J]. Journal of Computational and Graphical Statistics, 2015, 24(1): 44−65. doi: 10.1080/10618600.2014.907095
[39] Strobl C, Boulesteix A L, Kneib T, et al. Conditional variable importance for random forests[J]. BMC Bioinformatics, 2008, 9: 307. doi: 10.1186/1471-2105-9-307
[40] Weiskittel A R, Crookston N L, Radtke P J. Linking climate, gross primary productivity, and site index across forests of the western United States[J]. Canadian Journal of Forest Research, 2011, 41(8): 1710−1721. doi: 10.1139/x11-086
[41] Bond-Lamberty B, Rocha A V, Calvin K, et al. Disturbance legacies and climate jointly drive tree growth and mortality in an intensively studied boreal forest[J]. Global Change Biology, 2014, 20(1): 216−227. doi: 10.1111/gcb.12404
[42] Kilham P, Hartebrodt C, Kändler R G. Generating tree-level harvest predictions from forest inventories with random forests[J]. Forests, 2019, 10(1): 20.
[43] Ou Q X, Lei X D, Shen C C, et al. Individual tree DBH growth prediction of larch-spruce-fir mixed forests based on random forest algorithm[J]. Journal of Beijing Forestry University, 2019, 41(9): 9−19 (in Chinese).
[44] Nunes M H, Görgens E B. Artificial intelligence procedures for tree taper estimation within a complex vegetation mosaic in Brazil[J/OL]. PLoS One, 2016, 11(5): e0154738 [2019−10−02]. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0154738.
[45] De’ath G. Boosted trees for ecological modeling and prediction[J]. Ecology, 2007, 88(1): 243−251. doi: 10.1890/0012-9658(2007)88[243:BTFEMA]2.0.CO;2
[46] Freeman E A, Moisen G G, Coulston J W, et al. Random forests and stochastic gradient boosting for predicting tree canopy cover: comparing tuning processes and model performance[J]. Canadian Journal of Forest Research, 2016, 46(3): 323−339. doi: 10.1139/cjfr-2014-0562
[47] Kuhn M, Johnson K. Applied predictive modeling[M]. New York: Springer, 2013.
[48] Elith J, Leathwick J R, Hastie T. A working guide to boosted regression trees[J]. Journal of Animal Ecology, 2008, 77(4): 802−813. doi: 10.1111/j.1365-2656.2008.01390.x
[49] Mezei P, Grodzki W, Blaženec M, et al. Host and site factors affecting tree mortality caused by the spruce bark beetle (Ips typographus) in mountainous conditions[J]. Forest Ecology and Management, 2014, 331: 196−207. doi: 10.1016/j.foreco.2014.07.031
[50] Sproull G J, Adamus M, Bukowski M, et al. Tree and stand-level patterns and predictors of Norway spruce mortality caused by bark beetle infestation in the Tatra Mountains[J]. Forest Ecology and Management, 2015, 354: 261−271. doi: 10.1016/j.foreco.2015.06.006
[51] Oguro M, Imahiro S, Saito S, et al. Relative importance of multiple scale factors to oak tree mortality due to Japanese oak wilt disease[J]. Forest Ecology and Management, 2015, 356: 173−183. doi: 10.1016/j.foreco.2015.07.016
[52] Cai W H, Yang J, Liu Z H, et al. Post-fire tree recruitment of a boreal larch forest in Northeast China[J]. Forest Ecology and Management, 2013, 307: 20−29. doi: 10.1016/j.foreco.2013.06.056
[53] De Cauwer V, Fichtler E, Beeckman H, et al. Predicting site productivity of the timber tree Pterocarpus angolensis[J]. Southern Forests: a Journal of Forest Science, 2017, 79(3): 259−268. doi: 10.2989/20702620.2016.1256042
[54] Razakamanarivo R H, Grinand C, Razafindrakoto M A, et al. Mapping organic carbon stocks in eucalyptus plantations of the central highlands of Madagascar: a multiple regression approach[J]. Geoderma, 2011, 162(3−4): 335−346.
[55] Lin D M, Anderson-Teixeira K J, Lai J S, et al. Traits of dominant tree species predict local scale variation in forest aboveground and topsoil carbon stocks[J]. Plant and Soil, 2016, 409(1−2): 435−446.
[56] Ou Q X, Li H K, Yang Y. Factors affecting the biomass conversion and expansion factor of Masson pine in Fujian Province[J]. Acta Ecologica Sinica, 2017, 37(17): 5756−5764 (in Chinese).
[57] Ou Q X, Li H K, Lei X D, et al. Difference analysis in estimating biomass conversion and expansion factors of masson pine in Fujian Province, China based on national forest inventory data: a comparison of three decision tree models of ensemble learning[J]. Chinese Journal of Applied Ecology, 2018, 29(6): 2007−2016 (in Chinese).
[58] Ren Y, Chen S S, Wei X H, et al. Disentangling the factors that contribute to variation in forest biomass increments in the mid-subtropical forests of China[J]. Journal of Forestry Research, 2016, 27(4): 919−930. doi: 10.1007/s11676-016-0237-y
[59] Aertsen W, Kint V, De Vos B, et al. Predicting forest site productivity in temperate lowland from forest floor, soil and litterfall characteristics using boosted regression trees[J]. Plant and Soil, 2012, 354(1−2): 157−172.
[60] Mitsopoulos I, Xanthopoulos G. Effect of stand, topographic, and climatic factors on the fuel complex characteristics of Aleppo (Pinus halepensis Mill.) and Calabrian (Pinus brutia Ten.) pine forests of Greece[J]. Forest Ecology and Management, 2016, 360: 110−121. doi: 10.1016/j.foreco.2015.10.027
[61] Fricker G A, Synes N W, Serra-Diaz J M, et al. More than climate? Predictors of tree canopy height vary with scale in complex terrain, Sierra Nevada, CA (USA)[J]. Forest Ecology and Management, 2019, 434: 142−153. doi: 10.1016/j.foreco.2018.12.006
[62] Wang X. Big data analysis: methods and applications[M]. Beijing: Tsinghua University Press, 2013 (in Chinese).
[63] Ciaburro G, Venkateswaran B. Neural networks with R[M]. Li H C, trans. Beijing: China Machine Press, 2018 (in Chinese).
[64] Ciaburro G, Venkateswaran B. Neural networks with R: smart models using CNN, RNN, deep learning, and artificial intelligence principles[M]. Birmingham: Packt Publishing, 2017.
[65] Hasenauer H, Merkl D, Weingartner M. Estimating tree mortality of Norway spruce stands with neural networks[J]. Advances in Environmental Research, 2001, 5(4): 405−414. doi: 10.1016/S1093-0191(01)00092-2
[66] Castro R V O, Boechat Soares C P, Leite H G, et al. Individual growth model for Eucalyptus stands in Brazil using artificial neural network[J/OL]. ISRN Forestry, 2013, 2013: Article ID 196832 [2019−05−18]. https://www.hindawi.com/journals/isrn/2013/196832/.
[67] Reis L P, de Souza A L, dos Reis P C M, et al. Estimation of mortality and survival of individual trees after harvesting wood using artificial neural networks in the amazon rain forest[J]. Ecological Engineering, 2018, 112: 140−147. doi: 10.1016/j.ecoleng.2017.12.014
[68] da Rocha S J S S, Torres C M M E, Jacovine L A G, et al. Artificial neural networks: modeling tree survival and mortality in the Atlantic Forest biome in Brazil[J]. Science of the Total Environment, 2018, 645: 655−661. doi: 10.1016/j.scitotenv.2018.07.123
[69] Bayat M, Ghorbanpour M, Zare R, et al. Application of artificial neural networks for predicting tree survival and mortality in the Hyrcanian forest of Iran[J]. Computers and Electronics in Agriculture, 2019, 164: 104929. doi: 10.1016/j.compag.2019.104929
[70] Soares F A A M N, Flôres E L, Cabacinha C D, et al. Recursive diameter prediction and volume calculation of eucalyptus trees using multilayer perceptron networks[J]. Computers and Electronics in Agriculture, 2011, 78(1): 19−27. doi: 10.1016/j.compag.2011.05.008
[71] Ashraf M I, Zhao Z Y, Bourque C P A, et al. Integrating biophysical controls in forest growth and yield predictions with artificial intelligence technology[J]. Canadian Journal of Forest Research, 2013, 43(12): 1162−1171. doi: 10.1139/cjfr-2013-0090
[72] Vieira G C, de Mendonça A R, da Silva G F, et al. Prognoses of diameter and height of trees of eucalyptus using artificial intelligence[J]. Science of the Total Environment, 2018, 619: 1473−1481. doi: 10.1016/j.scitotenv.2017.11.138
[73] Ma X Y, Duan W Y, Cui J G. Study on the artificial neural network model of individual tree growth in the Betula platyphlla plantation[J]. Forest Engineering, 2009, 25(3): 30−33, 38 (in Chinese). doi: 10.3969/j.issn.1001-005X.2009.03.007
[74] Shen J B, Lei X D, Li Y T, et al. Prediction mean height for Larix olgensis plantation based on Bayesian-regularization BP neural network[J]. Journal of Nanjing Forestry University (Natural Sciences Edition), 2018, 42(2): 147−154 (in Chinese).
[75] Che S H, Zhang J G, Duan A G, et al. Modelling tree diameter growth for Chinese fir plantations with neural networks[J]. Journal of Northwest A&F University (Natural Sciences Edition), 2012, 40(3): 84−92 (in Chinese).
[76] Long T, Qin L H, Ye S M. Prediction for the growth of Eucalyptus plantations with continuous-planting rotations based on BP neural network[J]. Journal of Northeast Forestry University, 2012, 40(5): 122−125 (in Chinese). doi: 10.3969/j.issn.1000-5382.2012.05.030
[77] Reis L P, de Souza A L, Mazzei L, et al. Prognosis on the diameter of individual trees on the eastern region of the amazon using artificial neural networks[J]. Forest Ecology and Management, 2016, 382: 161−167. doi: 10.1016/j.foreco.2016.10.022
[78] Vahedi A A. Artificial neural network application in comparison with modeling allometric equations for predicting above-ground biomass in the Hyrcanian mixed-beech forests of Iran[J]. Biomass and Bioenergy, 2016, 88: 66−76. doi: 10.1016/j.biombioe.2016.03.020
[79] Özçelik R, Diamantopoulou M J, Eker M, et al. Artificial neural network models: an alternative approach for reliable aboveground pine tree biomass prediction[J]. Forest Science, 2017, 63(3): 291−302. doi: 10.5849/FS-16-006
[80] Wu C Y, Chen Y F, Peng C H, et al. Modeling and estimating aboveground biomass of Dacrydium pierrei in China using machine learning with climate change[J]. Journal of Environmental Management, 2019, 234: 167−179.
[81] Xu Q G, Lei X D, Guo H, et al. Stand biomass model of Larix olgensis plantations based on multi-layer perceptron networks[J]. Journal of Beijing Forestry University, 2019, 41(5): 97−107 (in Chinese).
[82] Hlásny T, Trombik J, Bošeľa M, et al. Climatic drivers of forest productivity in Central Europe[J]. Agricultural and Forest Meteorology, 2017, 234−235: 258−273. doi: 10.1016/j.agrformet.2016.12.024
[83] Lima M B D O, Junior I M L, Oliveira E M, et al. Artificial neural networks in whole-stand level modeling of Eucalyptus plants[J]. African Journal of Agricultural Research, 2017, 12(7): 524−534. doi: 10.5897/AJAR2016.12068
[84] Yousefpoor M, Shahraji T R, Eslam B A, et al. The use of artificial neural network to evaluate the effects of human and physiographic factors on forest stock volume[J]. Journal of Applied Sciences and Environmental Management, 2016, 20(4): 1017−1024.
[85] Lin Z, Wu C Z, Hong W, et al. Yield model of Cunninghamia lanceolata plantation based on back propagation neural network and support vector machine[J]. Journal of Beijing Forestry University, 2015, 37(1): 42−54 (in Chinese).
[86] Tavares J I D S, da Rocha J E C, Ebling A, et al. Artificial neural networks and linear regression reduce sample intensity to predict the commercial volume of Eucalyptus Clones[J]. Forests, 2019, 10(3): 268. doi: 10.3390/f10030268
[87] Liu X, Wang H Y, Lei X D, et al. Generalized height-diameter model for natural mixed spruce-fir coniferous and broadleaf forests based on BP neural network[J]. Forest Research, 2017, 30(3): 368−375 (in Chinese).
[88] Diamantopoulou M J, Özçelik R, Crecente-Campo F, et al. Estimation of Weibull function parameters for modelling tree diameter distribution using least squares and artificial neural networks methods[J]. Biosystems Engineering, 2015, 133: 33−45. doi: 10.1016/j.biosystemseng.2015.02.013
[89] Xue W. Data Mining method with R language and its application[M]. Beijing: Publishing House of Electronics Industry, 2016 (in Chinese).
[90] Che S H, Tan X H, Xiang C W, et al. Stand basal area modelling for Chinese fir plantations using an artificial neural network model[J]. Journal of Forestry Research, 2019, 30(5): 1641−1649. doi: 10.1007/s11676-018-0711-9
[91] Maltamo M, Kangas A. Methods based on k-nearest neighbor regression in the prediction of basal area diameter distribution[J]. Canadian Journal of Forest Research, 1998, 28(8): 1107−1115. doi: 10.1139/x98-085
[92] Lantz B. Machine learning with R[M]. Li H C, Xu J W, Li J, trans. Beijing: China Machine Press, 2017 (in Chinese).
[93] Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference and prediction[M]. 2nd ed. New York: Springer, 2009.
[94] Lantz B. Machine Learning with R[M]. Birmingham: Packt Publishing, 2013.
[95] Ridgeway G. Generalized boosted models: a guide to the GBM package[Z/OL]. [2019−10−13]. https://cran.r-project.org/web/packages/gbm/vignettes/gbm.pdf.
[96] Friedman J H. Stochastic gradient boosting[J]. Computational Statistics and Data Analysis, 2002, 38(4): 367−378. doi: 10.1016/S0167-9473(01)00065-2
[97] Thessen A E. Adoption of machine learning techniques in ecology and earth science[J/OL]. PeerJ PrePrints, 2016, 4: e1720v1 [2019−05−06]. https://peerj.com/preprints/1720.pdf.
[98] Fielding A H. Cluster and classification techniques for the biosciences[M]. London: Cambridge University Press, 2006.
[99] Corona-Núñez R O, Mendoza-Ponce A, López-Martínez R. Model selection changes the spatial heterogeneity and total potential carbon in a tropical dry forest[J]. Forest Ecology and Management, 2017, 405: 69−80. doi: 10.1016/j.foreco.2017.09.018
[100] Temesgen H, Ver Hoef J M. Evaluation of the spatial linear model, random forest and gradient nearest-neighbour methods for imputing potential productivity and biomass of the Pacific Northwest forests[J]. Forestry, 2015, 88(1): 131−142. doi: 10.1093/forestry/cpu036
[101] Jevšenak J, Levanič T. Should artificial neural networks replace linear models in tree ring based climate reconstructions?[J]. Dendrochronologia, 2016, 40: 102−109. doi: 10.1016/j.dendro.2016.08.002
[102] Görgens E B, Montaghi A, Rodriguez L C E. A performance comparison of machine learning methods to estimate the fast-growing forest plantation yield based on laser scanning metrics[J]. Computers and Electronics in Agriculture, 2015, 116: 221−227. doi: 10.1016/j.compag.2015.07.004
[103] Wang Y H, Raulier F, Ung C H. Evaluation of spatial predictions of site index obtained by parametric and nonparametric methods: a case study of lodgepole pine productivity[J]. Forest Ecology and Management, 2005, 214(1/3): 201−211.
[104] Gao R N, Xie Y S, Lei X D, et al. Study on prediction of natural forest productivity based on random forest model[J]. Journal of Central South University of Forestry and Technology, 2019, 39(4): 39−46 (in Chinese).
[105] Zhang H, Wang K L, Zeng Z X, et al. Large-scale patterns in forest growth rates are mainly driven by climatic variables and stand characteristics[J]. Forest Ecology and Management, 2019, 435: 120−127. doi: 10.1016/j.foreco.2018.12.054
[106] Guan H Y, Yu Y T, Ji Z, et al. Deep learning-based tree classification using mobile LiDAR data[J]. Remote Sensing Letters, 2015, 6(11): 864−873. doi: 10.1080/2150704X.2015.1088668
[107] Sun Y, Liu Y, Wang G, et al. Deep learning for plant identification in natural environment[J/OL]. Computational Intelligence and Neuroscience, 2017, 2017: Article ID 7361042 [2019−05−18]. https://www.hindawi.com/journals/cin/2017/7361042/.
[108] Pearline S A, Kumar V S, Harini S. A study on plant recognition using conventional image processing and deep learning approaches[J]. Journal of Intelligent and Fuzzy Systems, 2019, 36(3): 1997−2004. doi: 10.3233/JIFS-169911
[109] Wang G, Sun Y, Wang J X. Automatic image-based plant disease severity estimation using deep learning[J/OL]. Computational Intelligence and Neuroscience, 2017, 2017: Article ID 2917536 [2019−05−16]. https://www.hindawi.com/journals/cin/2017/2917536/.
[110] Asner G P, Brodrick P G, Philipson C, et al. Mapped aboveground carbon stocks to advance forest conservation and recovery in Malaysian Borneo[J]. Biological Conservation, 2018, 217: 289−310. doi: 10.1016/j.biocon.2017.10.020
-