Rethinking VLM Representation for VLA Initialization · Event

PRODUCT_LAUNCH · 2026-05-26 · Impact: MEDIUM

arXiv:2605.25802v1 · Announce Type: new

Abstract: Vision-Language-Action (VLA) models widely adopt pretrained Vision-Language Models (VLMs) as policy backbones, yet it remains unclear what kind of pretrained VLM representation is useful as a VLA initialization. In this paper, we study VLA initialization as a controlled representation-design problem along three axes: capability-level embodied VQA supervision, parameter-update strategy, and robot
