Causal-JEPA: Learning World Models through Object-Level Latent Masking

ArXiv cs.AI, 2026-05-29. Authors: Heejeong Nam, Quentin Le Lidec, Lucas Maes, Yann LeCun, Randall Balestriero

Abstract

arXiv:2602.11389v2

World models require robust relational understanding to support prediction, reasoning, and control. While object-centric representations provide a useful abstraction, they are not sufficient to capture interaction-dependent dynamics. We therefore propose C-JEPA, a simple and flexible object-centric world model that extends masked joint embedding prediction from image patches to object-centric representations. By masking object-level latents and requiring each masked object state to be inferred from the surrounding context, C-JEPA imposes structured partial observability during training, creating counterfactual-like prediction queries that discourage shortcut solutions and make interaction-dependent prediction necessary under the learning objective.
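
To make the masking idea concrete, here is a minimal sketch of object-level latent masking with a loss restricted to the masked objects. It is not the authors' implementation: the abstract gives no architecture or hyperparameters, so the function names, the `mask_ratio` parameter, and the zero-vector mask token (standing in for what would likely be a learned token) are all assumptions made for illustration. The predictor network itself is omitted.

```python
import random


def mask_object_latents(slots, mask_ratio=0.5, mask_token=None, rng=None):
    """Replace a random subset of object latents with a mask token.

    slots: list of per-object latent vectors (lists of floats).
    Returns (context, masked_indices), where context has the same shape
    as slots but masked objects are replaced by the mask token.
    All names and defaults here are hypothetical.
    """
    if len(slots) < 2:
        raise ValueError("need at least two objects: one masked, one visible")
    if rng is None:
        rng = random.Random()
    num_objects = len(slots)
    dim = len(slots[0])
    if mask_token is None:
        # Assumption: a fixed zero vector stands in for a learned mask token.
        mask_token = [0.0] * dim
    # Mask at least one object and keep at least one visible as context.
    num_masked = min(max(1, round(mask_ratio * num_objects)), num_objects - 1)
    masked_indices = sorted(rng.sample(range(num_objects), num_masked))
    masked = set(masked_indices)
    context = [
        list(mask_token) if i in masked else list(slot)
        for i, slot in enumerate(slots)
    ]
    return context, masked_indices


def masked_prediction_loss(predicted, targets, masked_indices):
    """Mean squared error computed only over the masked objects."""
    total, count = 0.0, 0
    for i in masked_indices:
        for p, t in zip(predicted[i], targets[i]):
            total += (p - t) ** 2
            count += 1
    return total / count
```

In a training loop built on this sketch, a predictor would receive `context` and output one latent per object, and `masked_prediction_loss` would compare its outputs to the original `slots` at `masked_indices` only. Because the masked object's own state is hidden, the only way to reduce the loss is to use the visible objects, which is the interaction-dependent pressure the abstract describes.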
