Panthera uncia recognition based on data expansion and ResNeSt with few samples
Abstract:
Objective Snow leopard monitoring images collected by infrared-triggered cameras vary in quality and are limited in number. To improve snow leopard recognition accuracy with few samples, this study proposes an automatic recognition method for snow leopard monitoring images based on data expansion.
Method The method is based on the ResNeSt50 model, which incorporates an attention mechanism. Snow leopard monitoring images from Qilian Mountain National Park in northwestern China served as the original dataset, images of non-snow-leopard terrestrial wildlife taken by infrared-triggered cameras served as extended negative samples, and snow leopard images collected from the internet served as extended positive samples. Three datasets were generated from these sources and compared in successive experiments, so that a suitable expansion method could be chosen to guide the model step by step toward the key features of individual snow leopards. Gradient-weighted class activation mapping (Grad-CAM) heat maps were used to further verify the effectiveness of the data expansion.
Result The model trained on the original dataset + extended negative samples + extended positive samples gave the best recognition results. Grad-CAM visualization showed that the model correctly focused on the coat patterns and spots of individual snow leopards. Compared with recognition models based on VGG16 and ResNet50, ResNeSt50 performed best: on the test set, accuracy reached 97.70%, precision 97.26%, and recall 97.59%.
Conclusion The model trained with the proposed expansion method (original dataset + extended negative samples + extended positive samples) can distinguish background from foreground, discriminates strongly on the features of the snow leopard itself, and has the best generalization ability.
Key words:
- Panthera uncia
- monitoring image
- few samples
- data expansion
- convolutional neural network
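The abstract relies on Grad-CAM heat maps to check where the model looks. As a rough illustration of that technique (not the authors' code), the sketch below applies the standard Grad-CAM formula to arrays that are assumed to have been extracted already from the last convolutional layer: each channel is weighted by its spatially averaged gradient, the weighted channels are summed, and negative values are clipped. The function name `grad_cam` and the toy inputs are hypothetical, and NumPy is used here only for convenience.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Return a Grad-CAM heat map scaled to [0, 1].

    activations: feature maps of one conv layer, shape (K, H, W).
    gradients:   gradient of the class score w.r.t. those maps, same shape.
    Both arrays are assumed to come from a trained network (not shown here).
    """
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must have the same shape")
    # Channel importance: global average of the gradient over H and W.
    channel_weights = gradients.mean(axis=(1, 2))              # (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(channel_weights, activations, axes=1)   # (H, W)
    # Keep only regions that raise the class score.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy input (hypothetical values): channel 0 supports the class, channel 1 opposes it.
acts = np.zeros((2, 2, 2))
acts[0, 0, 0] = 2.0
acts[1, 1, 1] = 3.0
grads = np.stack([np.ones((2, 2)), -np.ones((2, 2))])
heatmap = grad_cam(acts, grads)
```

In practice the heat map would be upsampled to the input image size and overlaid on the photograph; that step is omitted here.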
Figure 2. Basic unit of ResNeSt
(h, w, c), height, width, and number of channels of the input feature map; Cardinal K, the Kth cardinal group; Split S, the Sth split; C, number of feature-map channels in the intermediate convolutional layer; Concatenate, concatenation along the channel dimension; Split-Attention, split-attention block.
Figure 3. Structure of Split-Attention
$U_j$, the jth input feature of the Split-Attention block; $\widehat{U}^{k}$, combined feature of the kth cardinal group; Global pooling, global pooling layer; Dense C/K, fully connected layer; BN, batch normalization layer; ReLU, activation function; S-Softmax, classifier; H, W, C/K, height, width, and number of channels of the intermediate feature layer.
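The data flow named in the Figure 3 legend (sum the splits, global pooling, dense layer with ReLU, softmax across splits, weighted sum) can be sketched for a single cardinal group as follows. This is a simplified illustration under stated assumptions, not the paper's or the ResNeSt authors' implementation: batch normalization is omitted, the weight matrices `w_reduce` and `w_expand` are hypothetical stand-ins for the dense layers, and NumPy replaces a deep learning framework.

```python
import numpy as np


def split_attention(splits: np.ndarray, w_reduce: np.ndarray, w_expand: np.ndarray):
    """Simplified split-attention for one cardinal group.

    splits:   input features U_1..U_S, shape (S, C, H, W).
    w_reduce: hypothetical dense weights, shape (D, C).
    w_expand: hypothetical dense weights, shape (S * C, D).
    Returns the fused feature map (C, H, W) and the attention weights (S, C).
    """
    s, c, _, _ = splits.shape
    # Combine the splits by element-wise summation.
    fused = splits.sum(axis=0)                       # (C, H, W)
    # Global average pooling gives one statistic per channel.
    pooled = fused.mean(axis=(1, 2))                 # (C,)
    # Dense layer + ReLU (batch normalization omitted in this sketch).
    hidden = np.maximum(w_reduce @ pooled, 0.0)      # (D,)
    # One logit per (split, channel), then softmax across the splits.
    logits = (w_expand @ hidden).reshape(s, c)
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Channel-wise weighted sum of the splits.
    output = (weights[:, :, None, None] * splits).sum(axis=0)
    return output, weights


# Toy example with S=2 splits, C=4 channels, 3x3 maps, hidden size D=2.
rng = np.random.default_rng(0)
toy_splits = rng.normal(size=(2, 4, 3, 3))
toy_reduce = rng.normal(size=(2, 4))
toy_expand = rng.normal(size=(8, 2))
toy_out, toy_weights = split_attention(toy_splits, toy_reduce, toy_expand)
```

Because the softmax runs across splits for each channel, the attention weights of a channel always sum to 1; with all-zero `w_expand` the block reduces to a plain average of the splits.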
Figure 12. Recognition results of different models trained on the three datasets
1_1, results of model 1 on test set 1; 2_2, results of model 2 on test set 2; 3_3, results of model 3 on test set 3; 3_1, results of model 3 on test set 1.
Table 1. Dataset distribution

| Dataset | Number of positive samples | Number of negative samples |
| --- | --- | --- |
| Original dataset | 1 324 | 1 110 |
| Extended dataset | 310 | 524 |
| Whole dataset | 1 634 | 1 634 |

Table 2. Extended datasets used for training

| Dataset source | Dataset | Training set | Test set |
| --- | --- | --- | --- |
| Original dataset | Dataset 1 | Training set 1 | Test set 1 |
| Original dataset + extended negative samples | Dataset 2 | Training set 2 | Test set 2 |
| Original dataset + extended negative samples + extended positive samples | Dataset 3 | Training set 3 | Test set 3 |

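The composition of the three datasets can be worked through from the counts in Table 1. The short calculation below assumes that each dataset uses every image from its listed sources, which the tables suggest but do not state outright; the class and variable names are illustrative only. Under that assumption, adding the extended samples moves the data from a surplus of positives, through a surplus of negatives, to an exactly balanced set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Number of positive (snow leopard) and negative images."""
    positive: int
    negative: int

    @property
    def total(self) -> int:
        return self.positive + self.negative

    @property
    def positive_share(self) -> float:
        return self.positive / self.total


# Counts taken from Table 1.
ORIGINAL = Counts(positive=1324, negative=1110)
EXTENDED = Counts(positive=310, negative=524)

# Assumption: each dataset uses all images from its sources (Table 2).
dataset_1 = ORIGINAL
dataset_2 = Counts(ORIGINAL.positive, ORIGINAL.negative + EXTENDED.negative)
dataset_3 = Counts(ORIGINAL.positive + EXTENDED.positive,
                   ORIGINAL.negative + EXTENDED.negative)

for name, ds in [("Dataset 1", dataset_1), ("Dataset 2", dataset_2), ("Dataset 3", dataset_3)]:
    print(f"{name}: {ds.positive} positive, {ds.negative} negative, "
          f"positive share {ds.positive_share:.3f}")
```

Dataset 3 reproduces the "Whole dataset" row of Table 1 (1 634 positive and 1 634 negative images).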
Table 3. Recognition results of model 1 on test sets 1 and 2

| Test set | Accuracy/% | Precision/% | Recall/% |
| --- | --- | --- | --- |
| Test set 1 | 96.30 | 94.14 | 98.25 |
| Test set 2 | 88.85 | 78.39 | 98.25 |

Table 4. Recognition results of model 2 on test sets 2 and 3

| Test set | Accuracy/% | Precision/% | Recall/% |
| --- | --- | --- | --- |
| Test set 2 | 97.29 | 94.56 | 98.68 |
| Test set 3 | 94.03 | 95.32 | 91.06 |

Table 5. Recognition results of model 3 on test set 3

| Test set | Accuracy/% | Precision/% | Recall/% |
| --- | --- | --- | --- |
| Test set 3 | 97.70 | 97.26 | 97.59 |
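
The accuracy, precision, and recall reported in Tables 3 to 5 follow the usual confusion-matrix definitions, sketched below. The counts in the example are hypothetical and chosen only for easy arithmetic; the paper does not report its confusion matrices. The example illustrates one pattern visible in Table 3: if a test set gains only negative images, recall cannot change (true positives and false negatives stay fixed), while precision falls whenever some of the new negatives are misclassified as snow leopards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion counts, with 'snow leopard' as the positive class."""
    tp: int
    fp: int
    tn: int
    fn: int


def accuracy(c: Confusion) -> float:
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def precision(c: Confusion) -> float:
    predicted_positive = c.tp + c.fp
    return c.tp / predicted_positive if predicted_positive else 0.0


def recall(c: Confusion) -> float:
    actual_positive = c.tp + c.fn
    return c.tp / actual_positive if actual_positive else 0.0


# Hypothetical counts, not taken from the paper.
base = Confusion(tp=90, fp=10, tn=90, fn=10)
# Same positives; 100 extra negatives, 30 of which are misclassified.
harder = Confusion(tp=90, fp=40, tn=160, fn=10)

for name, c in [("base", base), ("harder", harder)]:
    print(f"{name}: accuracy={accuracy(c):.4f}, "
          f"precision={precision(c):.4f}, recall={recall(c):.4f}")
```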