TraRA: Trajectory-level Recognition Aggregation for Video Text Spotting in Urban Surveillance

ArXiv CS.CV | 2026-06-08 | NEWS | en | Authors: Duc Tri Tran, Trung Thanh Nguyen, Vijay John, Phi Le Nguyen, Yasutomo Kawanishi

Details

Source: ArXiv CS.CV
Authors: Duc Tri Tran, Trung Thanh Nguyen, Vijay John, Phi Le Nguyen, Yasutomo Kawanishi
Article type: NEWS
Language: en
Published: 2026-06-08

Abstract

arXiv:2606.07161v1 (announce type: new)

Video Text Spotting (VTS) is essential for urban surveillance and intelligent transportation systems, enabling automated reading of street signs, vehicle markings, and scene text in video streams. However, reliable recognition remains challenging because dynamic video factors common in surveillance scenarios, including motion blur, occlusion, and scale variation, degrade frame-level recognition. Existing VTS methods typically perform recognition independently on each frame, leading to inconsistent and inaccurate results across sequences. To address these limitations, we propose TraRA (Trajectory-level Recognition Aggregation for VTS), a plug-and-play method that performs trajectory-level text recognition by leveraging temporal and multimodal consistency. TraRA integrates two key modules: (1) Temporal Clustering and (2) Vision-Language Aggregation.
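The abstract does not describe how TraRA's two modules work internally, so the following is only a generic illustration of the underlying idea of replacing independent per-frame results with one result per trajectory. It is a minimal sketch of confidence-weighted voting, a simple baseline that is not claimed to be TraRA's Temporal Clustering or Vision-Language Aggregation. The function name `aggregate_trajectory` and the `(text, confidence)` input format are assumptions made for this example.

```python
from collections import defaultdict


def aggregate_trajectory(frame_reads):
    """Pick one transcription for a tracked text instance.

    frame_reads: list of (text, confidence) pairs, one per frame of a
    single trajectory (hypothetical format, assumed for illustration).
    Returns the string with the highest summed confidence, or "" if
    no frame produced a non-empty reading.
    """
    scores = defaultdict(float)
    for text, confidence in frame_reads:
        if text:  # ignore frames where recognition returned nothing
            scores[text] += confidence

    if not scores:
        return ""

    # Iterate candidates in sorted order so ties resolve deterministically
    # to the lexicographically smallest string.
    return max(sorted(scores), key=lambda t: scores[t])


if __name__ == "__main__":
    # Blurred or occluded frames yield noisy readings with low confidence.
    reads = [("STOP", 0.9), ("ST0P", 0.4), ("STOP", 0.8), ("", 0.1), ("SIOP", 0.3)]
    print(aggregate_trajectory(reads))
```

In this sketch a few low-confidence misreads from degraded frames are outvoted by consistent readings from clearer frames, which is the kind of cross-frame inconsistency the abstract attributes to per-frame recognition.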
