Two-Pass Zero-Shot Temporal-Spatial Grounding of Rare Traffic Events in Surveillance Video

PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM

arXiv:2605.01512v2 (Announce Type: replace)

Abstract: Grounding traffic accidents in real CCTV footage is a rare-event problem where training on labeled accident video is often prohibited, yet accurate joint localization in time, space, and collision type is required. We present a no-fine-tuning pipeline that elicits this joint output from frozen vision-language models through two ideas. First, a coarse-to-