MVP-LAM: Learning Action-Centric Latent Action via Cross-Viewpoint Reconstruction (Event)

PRODUCT_LAUNCH · 2026-05-28 · Impact: MEDIUM

arXiv:2602.03668v3 (announce type: replace-cross)

Abstract: Latent actions learned from diverse human videos serve as pseudo-labels for vision-language-action (VLA) pretraining, but they provide effective supervision only if they remain informative about the underlying ground-truth actions, even though those actions are themselves inaccessible.
