Cosmos 3: Omnimodal World Models for Physical AI (Event)

PRODUCT_LAUNCH | 2026-06-03 | Impact: MEDIUM

arXiv:2606.02800v1 (Announce Type: new)

Abstract: We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and action sequences within a unified mixture-of-transformers architecture. By supporting highly flexible input-output configurations, Cosmos 3 seamlessly unifies critical modalities for Physical AI -- effectively subsuming vision-language models, video generators, world