VCIFBench: Evaluating Complex Instruction Following for Video Understanding · Event

PRODUCT_LAUNCH · 2026-06-04 · Impact: MEDIUM

arXiv:2606.04588v1 · Announce Type: new

Abstract: Multimodal large language models have made rapid progress in video understanding, yet existing benchmarks largely rely on simple prompts and provide limited evidence about whether models can satisfy explicit output constraints. We introduce VCIFBench, a benchmark for evaluating complex instruction following in video understanding. VCIFBench constructs constraint-rich instr

VCIFBench: Evaluating Complex Instruction Following for Video Understanding · Related Technologies