Learning Deliberately, Acting Intuitively: Unlocking Test-Time Reasoning in Multimodal LLMs

ArXiv CS.CV | 2026-05-28 | NEWS | en | Authors: Yahan Yu, Yuyang Dong, Masafumi Oyamada

Details

Source site: ArXiv CS.CV
Authors: Yahan Yu, Yuyang Dong, Masafumi Oyamada
Article type: NEWS
Language: en
Publication date: 2026-05-28

Abstract

arXiv:2507.06999v2 (announce type: replace)

Reasoning is essential for large language models (LLMs), especially in complex tasks such as mathematical problem solving. However, multimodal reasoning still faces challenges in modality alignment and training scalability, as many existing methods rely on additional annotations or complex rule-based rewards. To address these issues, we propose the Deliberate-to-Intuitive reasoning framework (D2I), which improves the understanding and reasoning abilities of multimodal LLMs (MLLMs) without extra annotations or complex rewards. During training, D2I uses deliberate reasoning strategies supervised only by rule-based format rewards to enhance modality alignment. During inference, it shifts to intuitive reasoning by removing these explicit strategies, allowing the model to implicitly apply the acquired abilities in its responses.
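
To make the train/inference split described in the abstract more concrete, here is a minimal Python sketch of what a rule-based format reward and a strategy-free inference prompt could look like. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the output template, so the `<think>`/`<answer>` tags, the strategy instruction text, and the function names `format_reward` and `build_prompt` are all hypothetical.

```python
import re

# Hypothetical output template: these tag names are illustrative
# assumptions, not taken from the paper.
REQUIRED_TAGS = ("<think>", "</think>", "<answer>", "</answer>")
FORMAT_PATTERN = re.compile(
    r"\A\s*<think>(?P<reasoning>.+?)</think>\s*"
    r"<answer>(?P<answer>.+?)</answer>\s*\Z",
    re.DOTALL,
)

# Hypothetical "deliberate" strategy instruction used only at training time.
STRATEGY_INSTRUCTION = (
    "First describe the relevant parts of the image, then reason step by "
    "step inside <think>...</think>, and give the final result inside "
    "<answer>...</answer>."
)


def format_reward(response: str) -> float:
    """Return 1.0 if the response follows the template, else 0.0.

    The check looks only at structure; it never inspects whether the
    answer is correct, so no extra annotation is needed.
    """
    # Each tag must appear exactly once, which rules out repeated sections.
    if any(response.count(tag) != 1 for tag in REQUIRED_TAGS):
        return 0.0
    match = FORMAT_PATTERN.match(response)
    if match is None:
        return 0.0
    # Both sections must contain something other than whitespace.
    if not match.group("reasoning").strip():
        return 0.0
    if not match.group("answer").strip():
        return 0.0
    return 1.0


def build_prompt(question: str, deliberate: bool) -> str:
    """Build a training prompt (deliberate) or an inference prompt (intuitive)."""
    if deliberate:
        # Training: the explicit strategy is part of the prompt.
        return f"{question}\n\n{STRATEGY_INSTRUCTION}"
    # Inference: the explicit strategy is removed.
    return question
```

In this sketch, `build_prompt(question, deliberate=True)` would be used while training with `format_reward` as the only reward signal, and `build_prompt(question, deliberate=False)` at test time, mirroring the shift from deliberate to intuitive reasoning that the abstract describes.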
