MuCRASP: Multimodal Chain-of-thought Reasoning aware Structured Pruning

ArXiv CS.CL, 2026-06-01. Authors: Aritra Dutta, Somak Aditya

Abstract

arXiv:2605.25842v2

Vision-language models (VLMs) increasingly rely on chain-of-thought (CoT) reasoning to solve complex multimodal tasks, but their large parameter sizes make deployment expensive. Structured pruning offers a natural solution; however, existing methods fail to preserve CoT reasoning accuracy in VLMs. We identify two key reasons: (1) CoT consistency depends on sparse transition points (pivot tokens) in the generation trajectory, while existing pruning methods are CoT-agnostic; and (2) pruning methods designed for unimodal LLMs do not account for activation-distribution differences across visual and textual modalities. Motivated by these observations, we propose MuCRASP, a structured pruning framework that targets reasoning-critical components while preserving cross-modal alignment and accounting for layer-wise sensitivity under a global parameter budget.
