MorphoQuant: Modality-Aware Quantization for Omni-modal Large Language Models

arXiv cs.CV | 2026-06-04 | Authors: Yue Wu, Changyuan Wang, Zixuan Wang, Shilin Ma, Yansong Tang

Abstract

arXiv:2606.04349v1 (announce type: new)

Conventional Post-Training Quantization (PTQ) methods struggle with 4-bit Omni-modal Large Language Models (OLLMs) due to extreme distribution heterogeneity and disparate outlier patterns across modalities. To address this, we propose MorphoQuant, a modality-aware PTQ framework engineered to preserve cross-modal morphology and mitigate outlier loss. Specifically, we introduce Distribution-Aware Bias Compensation (DABC), which selectively absorbs long-tailed outliers into channel-wise biases. This mechanism safeguards outlier magnitudes while maintaining high-precision discretization for dense inliers, thereby preserving accurate discretization across diverse modal distributions. Complementing this, we propose Morphology-Directed Quantization Function Optimization (MDQFO) to co-optimize the quantization grid with the bias mask, ensuring fine-grained alignment across modalities. Extensive evaluations on Qwen2.
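The abstract gives no formulas for DABC, so the following is only a toy illustration of the general idea it names: moving an outlier channel's offset into a channel-wise bias so that the 4-bit grid can be spent on the dense inliers. It is not the authors' method. The function names, the mean-based bias, and the 3-sigma selection rule for the mask are all assumptions made for this sketch.

```python
import numpy as np

def quantize_uniform(x, n_bits=4):
    """Symmetric per-tensor uniform quantization, returned in dequantized form."""
    qmax = 2 ** (n_bits - 1) - 1
    max_abs = np.max(np.abs(x))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

def quantize_with_channel_bias(x, mask, n_bits=4):
    """Subtract a per-channel bias on masked channels, quantize the residual,
    then add the (full-precision) bias back. Hypothetical helper, not DABC itself."""
    bias = np.where(mask, x.mean(axis=0), 0.0)
    return quantize_uniform(x - bias, n_bits) + bias

# Synthetic activations: 16 channels of unit-variance inliers, one shifted far out.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(256, 16))
x[:, 3] += 40.0

# Assumed selection rule: flag channels whose mean is far from the typical spread.
mask = np.abs(x.mean(axis=0)) > 3.0 * x.std(axis=0).mean()

err_plain = np.mean((x - quantize_uniform(x)) ** 2)
err_bias = np.mean((x - quantize_with_channel_bias(x, mask)) ** 2)
print(f"MSE without bias: {err_plain:.4f}, with channel bias: {err_bias:.4f}")
```

In this toy setting the single shifted channel stretches the quantization range, so most inliers collapse onto a few grid points. Removing the offset first lets the same 4 bits cover only the inlier range.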
