VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning

Event: PRODUCT_LAUNCH
Date: 2026-05-28
Impact: MEDIUM

arXiv:2605.28023v1
Announce Type: new

Abstract: Visual captioning requires models to capture visual content faithfully while minimizing both omission and hallucination. As the dominant paradigm for captioning, MLLMs have achieved strong performance through scaling and high-quality data. Recently, RL has emerged as a key route to driving MLLMs toward higher precision and broader coverage; however, existing reward designs for capti