VidMsg: A Benchmark for Implicit Message Inference in Short Videos

arXiv cs.CV | 2026-06-03 | Authors: Issar Tzachor, Michael Green, Rami Ben-Ari

Abstract

arXiv:2606.03635v1

Understanding short online videos involves more than identifying visible objects and actions; video makers often include an underlying message or purpose in the clip. We introduce VidMsg, a benchmark for evaluating implicit message understanding in short, internet-native video clips. VidMsg contains 400 YouTube-derived clips across 9 practical topic areas and 52 fine-grained target messages, covering domains such as career and finance, education, health and well-being, culture, safety, sustainability, and lifestyle. VidMsg is constructed through a message-first pipeline: an LLM first translates target messages into indirect search scenarios, which are used to retrieve candidate clips. Human annotators then retain clips that convey the intended message without being overly explicit.
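The message-first construction described in the abstract can be read as a three-stage data flow: target message, then indirect search scenarios, then candidate clips, followed by a human filter. The Python sketch below illustrates only that flow and is not the authors' implementation. Every name in it (`Candidate`, `build_candidates`, `filter_by_annotation`) is hypothetical, the LLM, the video search, and the annotator judgments are replaced by toy stand-ins, and the example message and clip IDs are invented for illustration rather than taken from the benchmark.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A retrieved clip paired with the message it is meant to convey."""
    clip_id: str
    target_message: str
    scenario: str


def build_candidates(
    messages: Iterable[str],
    to_scenarios: Callable[[str], List[str]],
    search: Callable[[str], List[str]],
) -> List[Candidate]:
    """Message-first retrieval: message -> indirect scenarios -> clips."""
    candidates: List[Candidate] = []
    seen = set()
    for message in messages:
        for scenario in to_scenarios(message):
            for clip_id in search(scenario):
                key = (clip_id, message)
                if key in seen:  # same clip found via another scenario
                    continue
                seen.add(key)
                candidates.append(Candidate(clip_id, message, scenario))
    return candidates


def filter_by_annotation(
    candidates: Iterable[Candidate],
    conveys_message: Callable[[Candidate], bool],
    is_too_explicit: Callable[[Candidate], bool],
) -> List[Candidate]:
    """Keep clips that convey the message without stating it outright."""
    return [
        c for c in candidates
        if conveys_message(c) and not is_too_explicit(c)
    ]


# Toy stand-ins (assumptions): a real pipeline would call an LLM,
# a video search backend, and human annotators at these three points.
TOY_SCENARIOS = {
    "Save money regularly": [
        "a person packing lunch at home",
        "someone repairing old shoes",
    ],
}
TOY_INDEX = {
    "a person packing lunch at home": ["clip_a", "clip_b"],
    "someone repairing old shoes": ["clip_b", "clip_c"],
}

candidates = build_candidates(
    messages=["Save money regularly"],
    to_scenarios=lambda m: TOY_SCENARIOS.get(m, []),
    search=lambda s: TOY_INDEX.get(s, []),
)
kept = filter_by_annotation(
    candidates,
    conveys_message=lambda c: c.clip_id in {"clip_a", "clip_b"},
    is_too_explicit=lambda c: c.clip_id == "clip_b",
)
```

Passing the LLM, search, and annotation steps in as callables keeps the sketch independent of any particular model or video API, so each stage can be swapped out without touching the others.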
