VLM4VLA: Revisiting Vision-Language-Models in Vision-Language-Action Models (Event)

PRODUCT_LAUNCH | 2026-06-02 | Impact: MEDIUM

arXiv:2601.03309v2 | Announce Type: replace

Abstract: Vision-Language-Action (VLA) models, which integrate pretrained large Vision-Language Models (VLMs) into their policy backbone, are gaining significant attention for their promising generalization capabilities. This paper revisits a fundamental yet seldom systematically studied question: how do VLM choice and competence translate to downstream VLA policy performance? We