Multi-Modal Building Inspection via Perceiver IO Fusion of Satellite and Street-Level Imagery
Event: PRODUCT_LAUNCH | Date: 2026-05-27 | Impact: MEDIUM
arXiv:2605.26381v1 Announce Type: new
Abstract: We present a multi-modal classification framework that fuses satellite and street-level imagery through a Perceiver IO architecture operating on spatial patch tokens from a shared DINOv2 backbone. The design naturally handles a variable number of street-level views per building without padding or fixed-size pooling, and jointly predicts multi-label roof e
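
To illustrate why a latent cross-attention design can accept any number of street-level views without padding, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract gives no architectural details, so the function names (`cross_attend`, `fuse_building`), the dimensions, the random stand-in tokens, and the use of identity query/key/value projections in a single head are all assumptions made for illustration. A real Perceiver IO layer would use learned projections, multiple heads, latent self-attention, and output queries.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(latents, tokens):
    """Single-head cross-attention: each latent queries every input token.

    Assumption: identity projections stand in for the learned Q/K/V
    matrices, so the tokens act as both keys and values.
    """
    dim = len(latents[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in latents:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in tokens]
        weights = softmax(scores)
        fused.append(
            [sum(w * tok[d] for w, tok in zip(weights, tokens)) for d in range(dim)]
        )
    return fused


def fuse_building(satellite_tokens, street_views, latents):
    """Flatten all patch tokens into one sequence and attend over it.

    The sequence length grows with the number of views, but the output
    always has one vector per latent, so no padding or pooling is needed.
    """
    tokens = list(satellite_tokens)
    for view in street_views:
        tokens.extend(view)
    return cross_attend(latents, tokens)


# Hypothetical sizes; random vectors stand in for backbone patch embeddings.
DIM, NUM_LATENTS, PATCHES = 8, 4, 16
rng = random.Random(0)


def random_tokens(count, dim):
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


latents = random_tokens(NUM_LATENTS, DIM)
satellite = random_tokens(PATCHES, DIM)

one_view = fuse_building(satellite, [random_tokens(PATCHES, DIM)], latents)
three_views = fuse_building(
    satellite, [random_tokens(PATCHES, DIM) for _ in range(3)], latents
)

print(len(one_view), len(one_view[0]))
print(len(three_views), len(three_views[0]))
```

Both calls return the same number of fused vectors (one per latent), whether the building has one street-level view or three. A downstream classification head can therefore consume a fixed-size representation regardless of how many views were available.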