Abstract
arXiv:2606.02894v1
Small edge devices such as IoT surveillance nodes and search-and-rescue (SAR) platforms are increasingly expected to run computer vision locally. On ultra-low-end hardware, however, object detection is limited by available memory and compute, by communication costs when several devices cooperate, and by the loss of accuracy caused by occlusion. This work evaluates occlusion-robust object detection on devices with less than 1 MB of SRAM by combining an MCUNet backbone, a YOLOv2 detection head, and TensorFlow Lite quantisation. We compare two collaborative inference strategies: feature-level fusion, which concatenates intermediate feature maps, and decision-level fusion via Weighted Boxes Fusion (WBF). Under the tested occlusion settings, WBF outperforms feature-level fusion, with gains of up to +0.2736 mAP in asymmetric occlusion scenarios. Extending fusion to three views improves accuracy further (up to +0.