ReSpinQuant: Efficient Layer-Wise LLM Quantization via Subspace Residual Rotation Approximation
Event: PRODUCT_LAUNCH | Date: 2026-05-29 | Impact: MEDIUM
arXiv:2604.11080v2
Announce Type: replace
Abstract: Rotation-based Post-Training Quantization (PTQ) has emerged as a promising solution for mitigating activation outliers in the quantization of Large Language Models (LLMs). Global rotation methods achieve inference efficiency by fusing activation rotations into attention and FFN blocks, but suffer from limited expressivity as they are constrained to