Identifiable Token Correspondence for World Models

ArXiv CS.CV, 2026-05-27. Authors: Youngin Kim, Ray Sun, Inho Kim, Bumsoo Park, Hyun Oh Song

Abstract

arXiv:2605.16457v3 (announce type: replace-cross)

Token-based transformer world models have shown strong performance in visual reinforcement learning, but they often suffer from temporal inconsistency in long-horizon rollouts, including object duplication, disappearance, and transmutation. A key reason is that most existing approaches treat next-frame prediction purely as a token generation problem, without considering the persistence of tokens across time. We introduce Identifiable Token Correspondence (ITC), a decoding step for token-based transformer world models that formulates next-frame prediction as a structured assignment problem with latent token correspondence variables: each next-frame token is explained either by copying a token from the previous frame or by generating a new one. ITC leaves the transformer architecture and training procedure unchanged and can be added on top of existing backbones.
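The copy-or-generate idea in the abstract can be illustrated with a toy decoder. This is only a sketch under stated assumptions, not the authors' algorithm: the function name `decode_with_correspondence`, the score inputs `copy_scores` and `gen_scores`, the one-to-one constraint (each previous token copied at most once), and the greedy solver are all hypothetical choices made for illustration. The abstract does not specify how the assignment is scored or solved.

```python
from typing import List, Optional, Tuple


def decode_with_correspondence(
    prev_tokens: List[int],
    copy_scores: List[List[float]],
    gen_scores: List[float],
    gen_tokens: List[int],
) -> Tuple[List[int], List[Optional[int]]]:
    """Toy copy-or-generate decoding (hypothetical, not the paper's method).

    copy_scores[j][i]: assumed score for next-frame slot j copying previous token i.
    gen_scores[j]:     assumed score for slot j generating a fresh token.
    gen_tokens[j]:     the token the model would generate for slot j.
    Returns the decoded tokens and, per slot, the copied index or None.
    """
    n_next = len(gen_scores)

    # Collect every copy option that beats generating a new token for that slot.
    candidates = []
    for j in range(n_next):
        for i in range(len(prev_tokens)):
            gain = copy_scores[j][i] - gen_scores[j]
            if gain > 0:
                candidates.append((gain, j, i))

    # Greedy approximation of the assignment: largest gain first.
    candidates.sort(key=lambda c: -c[0])

    correspondence: List[Optional[int]] = [None] * n_next
    used_prev = set()
    for _gain, j, i in candidates:
        # Assumed constraint: one slot per previous token, one source per slot.
        if correspondence[j] is None and i not in used_prev:
            correspondence[j] = i
            used_prev.add(i)

    # Slots without a correspondence fall back to the generated token.
    next_tokens = [
        prev_tokens[c] if c is not None else gen_tokens[j]
        for j, c in enumerate(correspondence)
    ]
    return next_tokens, correspondence


# Made-up scores: slots 0 and 1 both prefer copying previous token 0.
prev_tokens = [11, 22, 33]
copy_scores = [
    [-0.1, -3.0, -3.0],
    [-0.2, -2.5, -3.0],
    [-3.0, -3.0, -0.3],
]
gen_scores = [-1.0, -0.5, -2.0]
gen_tokens = [44, 55, 66]

next_tokens, correspondence = decode_with_correspondence(
    prev_tokens, copy_scores, gen_scores, gen_tokens
)
print(next_tokens)      # [11, 55, 33]
print(correspondence)   # [0, None, 2]
```

In this made-up example, slots 0 and 1 both favor copying previous token 11, but the assumed one-to-one constraint lets only slot 0 keep it, so slot 1 generates a new token instead of duplicating the object. An exact assignment solver could replace the greedy loop; the greedy version is used here only to keep the sketch dependency-free.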
