MVP-LAM: Learning Action-Centric Latent Action via Cross-Viewpoint Reconstruction
Event: PRODUCT_LAUNCH · 2026-05-28 · Impact: MEDIUM
arXiv:2602.03668v3. Announce Type: replace-cross.
Abstract: Latent actions learned from diverse human videos serve as pseudo-labels for vision-language-action (VLA) pretraining, but they provide effective supervision only if they remain informative about the underlying ground-truth actions, even though those actions are themselves inaccessible.
Related coverage
MVP-LAM: Learning Action-Centric Latent Action via Cross-Viewpoint Reconstruction
ArXiv CS.CV · 2026-05-28