Benchmarking Visual State Tracking in Multimodal Video Understanding

ArXiv CS.CV · 2026-06-03 · Authors: Sihyun Yu, Nanye Ma, Pinzhi Huang, Hyunseok Lee, Shusheng Yang, June Suk Choi, Ellis Brown, Oscar Michel, Boyang Zheng, Jinwoo Shin, Saining Xie

Related Technologies