CycliST: A Video Language Model Benchmark for Reasoning on Cyclical State Transitions

ArXiv CS.CV | 2026-06-16 | NEWS | en | Authors: Simon Kohaut, Daniel Ochs, Shun Zhang, Benedict Flade, Julian Eggert, Kristian Kersting, Devendra Singh Dhami

Details

Source site
ArXiv CS.CV
Authors
Simon Kohaut, Daniel Ochs, Shun Zhang, Benedict Flade, Julian Eggert, Kristian Kersting, Devendra Singh Dhami
Article type
NEWS
Language
en
Publication date
2026-06-16

Abstract

arXiv:2512.01095v2. We present CycliST, a novel benchmark dataset designed to evaluate Video Language Models (VLMs) on their ability to perform textual reasoning over cyclical state transitions. CycliST captures fundamental aspects of real-world processes by generating synthetic, richly structured video sequences featuring periodic patterns in object motion and visual attributes. It employs a tiered evaluation system that progressively increases difficulty by varying the number of cyclic objects, the scene clutter, and the lighting conditions, challenging state-of-the-art models on their spatio-temporal cognition. We conduct extensive experiments with current state-of-the-art VLMs, both open-source and proprietary, and reveal their limitations in generalizing to cyclical dynamics such as linear and orbital motion, as well as to time-dependent changes in visual attributes like color and scale.
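
To make the notion of a cyclical state transition concrete, the following is a minimal Python sketch of the four kinds of periodic change the abstract names (linear motion, orbital motion, color, and scale). It is not the CycliST generator: every function name, parameter, and default value here is hypothetical, chosen only to illustrate a state that repeats after a fixed number of frames.

```python
import math

# Illustrative only: these helpers are assumptions, not part of CycliST.
# `t` is an integer frame index and `period` is the cycle length in frames.

def orbital_position(t, period, radius=1.0, center=(0.0, 0.0)):
    """Point on a circular orbit; the same position recurs every `period` frames."""
    angle = 2.0 * math.pi * (t % period) / period
    return (center[0] + radius * math.cos(angle),
            center[1] + radius * math.sin(angle))

def linear_position(t, period, start=(0.0, 0.0), end=(1.0, 0.0)):
    """Back-and-forth motion from `start` to `end` and back within one period."""
    phase = (t % period) / period        # in [0, 1)
    s = 1.0 - abs(2.0 * phase - 1.0)     # triangle wave: 0 -> 1 -> 0
    return (start[0] + s * (end[0] - start[0]),
            start[1] + s * (end[1] - start[1]))

def cyclic_color(t, period, palette=("red", "green", "blue")):
    """Discrete color that steps through `palette` once per period."""
    index = int((t % period) * len(palette) / period)
    return palette[index]

def cyclic_scale(t, period, low=0.5, high=1.5):
    """Smooth scale oscillation from `low` up to `high` and back to `low`."""
    phase = (t % period) / period
    return low + (high - low) * 0.5 * (1.0 - math.cos(2.0 * math.pi * phase))
```

Under these assumptions, a question such as "what color will the object have one full cycle from now?" has the same answer as "what color does it have now?", since each function depends on `t` only through `t % period`.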
