Hierarchical Mask-Enhanced Dual Reconstruction Network for Few-Shot Fine-Grained Image Classification

arXiv cs.CV | 2026-06-05 | Authors: Ning Luo, Meiyin Hu, Huan Wan, Yanyan Yang, Zhuohang Jiang, Xin Wei

Abstract

arXiv:2506.20263v2

Few-shot fine-grained image classification (FS-FGIC) is challenging because it requires distinguishing visually similar subclasses from extremely limited labeled examples. Existing methods have critical limitations: metric-based methods lose spatial information and misalign local features, while reconstruction-based methods underuse hierarchical feature information and lack selective focus on discriminative key regions. We propose the Hierarchical Mask-enhanced Dual Reconstruction Network (HMDRN), which integrates dual-layer feature reconstruction with mask-enhanced feature processing. HMDRN combines complementary visual information from different network hierarchies via learnable weights, balancing high-level semantic representations with mid-level structural details. It also incorporates a spatial binary mask-enhanced transformer module that selectively enhances discriminative regions while filtering out background noise.
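The two ideas named in the abstract, a spatial binary mask and a learnable weighting of two feature levels, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the mask is computed or how the weights are normalized, so the top-k thresholding, the softmax over layer weights, and every function name below (`binary_mask`, `apply_mask`, `fuse_distances`, `predict`) are assumptions made purely for illustration. The per-class reconstruction errors are taken as given inputs rather than computed.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def binary_mask(scores, keep_ratio=0.5):
    """Assumed masking rule: keep the highest-scoring positions (1), drop the rest (0)."""
    k = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:k])
    return [1 if i in kept else 0 for i in range(len(scores))]


def apply_mask(features, mask):
    """Zero out the feature vector at every masked-out spatial position."""
    return [[v * m for v in vec] for vec, m in zip(features, mask)]


def fuse_distances(high_dists, mid_dists, layer_logits):
    """Blend per-class reconstruction errors from two layers.

    layer_logits stands in for the learnable weights; normalizing them
    with a softmax is an assumption, not something the abstract states.
    """
    w_high, w_mid = softmax(layer_logits)
    return [w_high * h + w_mid * m for h, m in zip(high_dists, mid_dists)]


def predict(high_dists, mid_dists, layer_logits):
    """Pick the class with the smallest fused reconstruction error."""
    fused = fuse_distances(high_dists, mid_dists, layer_logits)
    return min(range(len(fused)), key=lambda c: fused[c])
```

With equal layer weights, `predict([1.0, 3.0], [2.0, 1.0], [0.0, 0.0])` favors class 0; shifting weight toward the mid-level layer, as in `layer_logits=[0.0, 5.0]`, flips the decision to class 1. This shows why the balance between the two levels matters for the final prediction.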