    Yang Feng, Zhao Chaoyue, Wang Sheng, Han Boxun, Yang Linzhe, Xu Fu. Adaptive mask resolution with collaborative edge optimization for wildlife instance segmentation[J]. Journal of Beijing Forestry University. DOI: 10.12171/j.1000-1522.20250211

    Adaptive mask resolution with collaborative edge optimization for wildlife instance segmentation

    • Objective Wildlife instance segmentation underpins critical ecological analyses, including behavior tracking, habitat utilization assessment, and population dynamics monitoring. Existing encoder-decoder architectures extract features via multi-level downsampling and then apply fixed upsampling to generate masks. This pipeline degrades boundary continuity, yielding blurry and fragmented contours, and lacks difficulty-aware resource allocation based on pixel-wise occlusion complexity and edge sharpness. These limitations severely compromise data fidelity for high-precision ecological research.
      Method To address these challenges, we construct a comprehensive wildlife dataset comprising 4,231 images and propose a novel instance segmentation framework that fuses dynamic resolution with edge enhancement. Our approach enables accurate animal contour extraction through difficulty-adaptive resource scheduling. The framework integrates deformable convolutions within a Channel Attention Deep Adaptive Module (SE-DAM) to capture fine-grained spatial details. An Adaptive Resolution Module (ARM) dynamically adjusts mask resolution based on instance scale and occlusion characteristics, mitigating spatial information loss. Additionally, a Probability-Driven Collaborative Optimization (PDCO) module employs Discrete Cosine Transform (DCT) for pixel difficulty classification and models interdependencies between easy and hard regions. By prioritizing easy pixels to guide the refinement of hard regions, PDCO avoids redundant foreground/background computations. These three modules form a closed-loop optimization across feature extraction, resolution decision, and mask refinement, collectively enhancing robustness to occlusion, camouflage, and small targets.
      Result Experimental results demonstrate that with a ResNet-101-FPN backbone, our method achieves an mAP of 53.7% and an mIoU of 86.71%, surpassing Mask R-CNN by 11.2 and 7.7 percentage points, respectively. The computational cost increases by only 0.3 GFLOPs with an inference speed of 12.4 FPS, showing an effective balance between accuracy and efficiency. Using a Swin-B backbone, the mAP reaches 54.8%, outperforming state-of-the-art methods such as Mask Transfiner (50.8%) and QueryInst (52.4%). Our dynamic resolution strategy maintains superior performance in challenging scenarios, including overlapping individuals, branch occlusion, and infrared imagery, while achieving significantly lower computational costs than fixed-resolution approaches.
      Conclusion We present a wildlife instance segmentation method that integrates dynamic mask resolution with edge detail enhancement. Through a difficulty-aware adaptive resource allocation mechanism, our approach effectively mitigates boundary blur and computational redundancy inherent in conventional fixed-resolution methods. This yields significant improvements in recognition accuracy and contour quality under challenging conditions including occlusion, camouflage, and small targets. The framework provides a deployable solution for efficient, high-precision automatic wildlife monitoring and ecological research, offering clear practical value for intelligent ecological conservation.
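
    The abstract states that PDCO uses the DCT to classify pixel difficulty but does not say how. As a rough illustration only, the sketch below assumes one plausible reading: a coarse mask probability map is split into square patches, each patch is transformed with an orthonormal 2D DCT-II, and a patch is labelled "hard" when the share of its energy outside the DC coefficient exceeds a threshold. The function names, the 4x4 patch size, and the 0.1 threshold are hypothetical and are not taken from the paper.

```python
import math


def dct2(block):
    """Orthonormal 2D DCT-II of a square block (list of lists of floats)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # basis[k][i] = alpha(k) * cos(pi * (2i + 1) * k / (2n))
    basis = [[alpha(k) * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
              for i in range(n)] for k in range(n)]
    # The 2D transform is separable: transform each row, then each column.
    rows = [[sum(basis[v][j] * block[i][j] for j in range(n)) for v in range(n)]
            for i in range(n)]
    return [[sum(basis[u][i] * rows[i][v] for i in range(n)) for v in range(n)]
            for u in range(n)]


def high_freq_ratio(block):
    """Fraction of the block's DCT energy that lies outside the DC term."""
    coeffs = dct2(block)
    total = sum(c * c for row in coeffs for c in row)
    if total == 0.0:
        return 0.0  # an all-zero patch has no structure at all
    dc_energy = coeffs[0][0] ** 2
    return (total - dc_energy) / total


def classify_patches(prob_map, patch=4, threshold=0.1):
    """Label each non-overlapping patch 'hard' or 'easy' (assumed rule)."""
    height, width = len(prob_map), len(prob_map[0])
    labels = []
    for top in range(0, height - patch + 1, patch):
        row_labels = []
        for left in range(0, width - patch + 1, patch):
            block = [prob_map[top + i][left:left + patch] for i in range(patch)]
            ratio = high_freq_ratio(block)
            row_labels.append("hard" if ratio > threshold else "easy")
        labels.append(row_labels)
    return labels


# Toy 4x8 probability map: the left patch is uniform foreground, while the
# right patch contains a vertical background/foreground edge.
prob_map = [[1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0] for _ in range(4)]
labels = classify_patches(prob_map)
```

    In this toy case the uniform patch carries essentially all of its energy in the DC term and is labelled easy, while the patch containing the edge is labelled hard. Under this assumed rule, only hard patches would be passed on for further refinement, which is in the spirit of the abstract's point about avoiding redundant computation on plain foreground and background.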