Event: Efficient Long-Horizon Vision-Language-Action Models via Static-Dynamic Disentanglement

PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM

arXiv:2602.03983v3. Announce Type: replace-cross.

Abstract: Vision-Language-Action (VLA) models have recently emerged as a promising paradigm for generalist robotic control. Built upon vision-language model (VLM) architectures, VLAs predict actions conditioned on visual observations and language instructions, achieving strong performance and generalization across tasks. However, VLAs face two major challenges: