DRIFT: A Residual Flow Adapter for Decoding Continuous Outputs in Vision-Language Models

Event: PRODUCT_LAUNCH | Date: 2026-06-05 | Impact: MEDIUM

arXiv:2606.05758v1 (announce type: new)

Abstract: Many modern vision-language models (VLMs) build on autoregressive decoding of discrete tokens. While text-based output interfaces enable scalable pretraining and strong zero-shot generalization across diverse tasks, they are poorly suited for problems that require precise continuous outputs, such as localizing temporal boundaries of events or generating roboti