BMCR: Adaptive Backbone Module Composition via Reinforcement Learning for Remote Sensing Object Detection

ArXiv CS.CV | 2026-06-05 | NEWS | en | Authors: Wenlin Liu, Xikun Hu, Ping Zhong

Details

Source site
ArXiv CS.CV
Authors
Wenlin Liu, Xikun Hu, Ping Zhong
Article type
NEWS
Language
en
Publication date
2026-06-05

Abstract

arXiv:2606.05586v1 (announce type: new)

In remote sensing object detection, Convolutional Neural Networks (CNNs) excel at capturing local details while Vision Transformers (ViTs) are better at global context modeling. However, existing detectors typically rely on a single fixed backbone or a manually designed hybrid architecture, and thus fail to adaptively exploit these complementary strengths across inputs of diverse complexity. To address this limitation, we propose Backbone Module Composition via Reinforcement Learning (BMCR). BMCR dynamically assembles input-adaptive inference paths from reusable modules decomposed from off-the-shelf CNN and ViT backbones. To enable such cross-family composition, we first construct an extensible module toolbox. Specifically, we decompose representative CNN and ViT backbones into reusable functional modules and encapsulate each module with explicit structural, semantic, and computational metadata for compatibility-aware assembly.
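
The idea of a module toolbox with metadata for compatibility-aware assembly can be illustrated with a minimal sketch. This is not the authors' implementation: the `ModuleSpec` fields, the compatibility rule (matching channel counts and consecutive stages), the module names, and all numeric values are hypothetical placeholders chosen for illustration. The reinforcement learning policy that would pick one path per input is left out, since the abstract excerpt does not specify it.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class ModuleSpec:
    """Hypothetical metadata record for one reusable backbone module."""
    name: str
    family: str        # "cnn" or "vit" (which backbone family it came from)
    in_channels: int   # structural metadata: expected input width
    out_channels: int  # structural metadata: produced output width
    stage: int         # semantic metadata: feature level in the hierarchy
    gflops: float      # computational metadata: illustrative cost only


def is_compatible(prev: ModuleSpec, nxt: ModuleSpec) -> bool:
    """Assumed rule: channels must match and stages must be consecutive."""
    return prev.out_channels == nxt.in_channels and nxt.stage == prev.stage + 1


def enumerate_paths(toolbox, depth):
    """Return every ordered module sequence of the given depth that is valid."""
    return [
        combo
        for combo in permutations(toolbox, depth)
        if all(is_compatible(a, b) for a, b in zip(combo, combo[1:]))
    ]


def path_cost(path) -> float:
    """Total illustrative compute cost of a path."""
    return sum(m.gflops for m in path)


# Placeholder toolbox: two CNN-derived and two ViT-derived modules.
TOOLBOX = [
    ModuleSpec("cnn_stage1", "cnn", 3, 64, 1, 0.5),
    ModuleSpec("vit_stage1", "vit", 3, 64, 1, 0.9),
    ModuleSpec("cnn_stage2", "cnn", 64, 128, 2, 0.7),
    ModuleSpec("vit_stage2", "vit", 64, 128, 2, 1.2),
]

if __name__ == "__main__":
    for path in enumerate_paths(TOOLBOX, depth=2):
        print([m.name for m in path], round(path_cost(path), 2))
```

Under these assumptions, the metadata check is what allows cross-family paths such as a CNN-derived first stage followed by a ViT-derived second stage. An input-adaptive selector would then choose among the valid paths per image, trading accuracy against the computational metadata.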