TRACE: Evidence Grounding-Guided Multi-Video Event Understanding and Claim Generation

Event

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2605.16740v2

Abstract: Multi-video event understanding demands models that can locate and attribute query-relevant evidence scattered across long, heterogeneous video corpora. Existing large vision-language models (LVLMs) often underperform in this regime because they quickly exhaust their context budget and struggle to precisely localize evidentially important segments, frequently