Rethinking Video-Language Model from the Language Input Perspective

ArXiv CS.CV · 2026-05-28 · Authors: Xiang Fang, Wanlong Fang, Changshuo Wang, Xiaoye Qu, Daizong Liu

Rethinking Video-Language Model from the Language Input Perspective · Related Techniques