Objective To improve the gap-filling accuracy of net ecosystem exchange (NEE) for long data gaps, this study combined an artificial neural network (ANN) with a bi-directional long short-term memory (Bi-LSTM) network to exploit both the environmental factors and the temporal characteristics of NEE, and proposed the ANN-BiLSTM model.
Method Using NEE and micrometeorological data from the Yanchi Observatory in Ningxia, northwestern China, this study created five artificial gap scenarios by randomly removing 7, 15, 30, 45 and 90 d of NEE data, and compared the gap-filling results of the ANN-BiLSTM model with those of random forest (RF), ANN, K-nearest neighbor (KNN), support vector regression (SVR) and marginal distribution sampling (MDS) under these long gaps.
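The artificial-gap step can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the half-hourly resolution (48 records per day), the one-year series length, the single contiguous gap per scenario, and the function name `make_artificial_gap` are all assumptions introduced here.

```python
import numpy as np

def make_artificial_gap(n_records, gap_days, records_per_day=48, seed=0):
    """Return a boolean mask that is True inside one randomly placed contiguous gap.

    records_per_day=48 assumes half-hourly flux data (an assumption, not from the text).
    """
    gap_len = gap_days * records_per_day
    if gap_len >= n_records:
        raise ValueError("gap is longer than the series")
    rng = np.random.default_rng(seed)
    # Pick a start index so that the whole gap fits inside the series.
    start = int(rng.integers(0, n_records - gap_len + 1))
    mask = np.zeros(n_records, dtype=bool)
    mask[start:start + gap_len] = True
    return mask

# One year of half-hourly records (assumed) and the five gap lengths from the text.
n_records = 365 * 48
masks = {d: make_artificial_gap(n_records, d, seed=d) for d in (7, 15, 30, 45, 90)}
```

Records where a mask is True would be withheld from training and used only to score each model's predictions against the observed NEE.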
Result When the gap length was ≤ 30 d, the gap-filling accuracy of every model was relatively reliable, and the ANN-BiLSTM model had the highest accuracy: its mean coefficient of determination (R²) was 0.48−0.56, and its root mean square error (RMSE) and mean absolute error (MAE) were 0.68−1.92 μmol/(m²·s) and 0.45−1.30 μmol/(m²·s), respectively. When the gap length was ≥ 45 d, MDS could not fill the missing values, and the gap-filling accuracy of the ANN-BiLSTM model was significantly higher than that of the other machine learning models: its mean R² was > 0.45, and its RMSE and MAE were 0.79−1.95 μmol/(m²·s) and 0.50−1.31 μmol/(m²·s), respectively.
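For reference, the three accuracy metrics can be computed as in the sketch below. It assumes R² is defined as 1 − SS_res/SS_tot; the study may instead use the squared correlation coefficient, which the abstract does not specify. The function name `gap_filling_scores` and the toy values are purely illustrative and are not data from the study.

```python
import numpy as np

def gap_filling_scores(observed, predicted):
    """Compute R2, RMSE and MAE between observed and gap-filled NEE values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = observed - predicted
    ss_res = np.sum(residual ** 2)                       # residual sum of squares
    ss_tot = np.sum((observed - observed.mean()) ** 2)   # total sum of squares
    return {
        "r2": float(1.0 - ss_res / ss_tot),  # assumed definition of R2
        "rmse": float(np.sqrt(np.mean(residual ** 2))),
        "mae": float(np.mean(np.abs(residual))),
    }

# Toy values for illustration only (not measurements from the study).
scores = gap_filling_scores([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

RMSE and MAE carry the units of NEE, μmol/(m²·s), while R² is dimensionless.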
Conclusion When gaps in NEE data from temperate desert shrub ecosystems exceed 30 d, we suggest using the ANN-BiLSTM model to fill them, as it can improve the accuracy of long-term NEE gap filling to a certain extent.