Back to Parsimonious Latents: Learning Task-Centric World Models from Visual Foundations

ArXiv CS.AI | 2026-05-26 | Authors: Minghao Fu, Fan Feng, Nicklas Hansen, Biwei Huang

Abstract

arXiv:2605.25620v1

World models enable agents to predict future dynamics conditioned on actions, making the choice of latent representation central to planning and control. Such representations are often either learned directly from pixels with limited semantic structure or inherited from frozen visual foundation models with excessive task-irrelevant detail, yielding state spaces that are poorly matched to downstream planning and control. This is especially challenging in reward-free offline settings, where the model must learn from fixed trajectories without reward supervision or online interaction. To address this, we propose TC-WM, a framework for turning foundation-model embeddings into compact, task-sufficient world representations.
