Hurwitz Quaternion Multiplicative Quantization for KV Cache Compression
Event: PRODUCT_LAUNCH, 2026-05-28, Impact: MEDIUM
arXiv:2605.27646v1. Announce Type: cross.
Abstract: We propose Hurwitz Quaternion Multiplicative Quantization (HQMQ), a calibration-free method for KV cache compression of large language models. HQMQ treats each 4-element chunk of K or V as a quaternion and quantizes its unit direction to the product $q_p \cdot q_s$, where $q_p$ ranges over the 24-element Hurwitz group $2T$ (the 24 vertices of the 24-
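To make the coarse factor $q_p$ concrete, here is a minimal Python sketch that enumerates the 24 unit Hurwitz quaternions, implements the Hamilton product, and snaps a 4-element chunk to its closest group element. It is an illustration under stated assumptions, not the paper's implementation: the function names (`hurwitz_units`, `qmul`, `nearest_unit`) are hypothetical, the nearest element is chosen by largest inner product (an assumed selection rule), and the second factor $q_s$ is left out because the excerpt above is cut off before describing it.

```python
import math
from itertools import product


def hurwitz_units():
    """Return the 24 unit Hurwitz quaternions as (w, x, y, z) tuples."""
    units = []
    # 8 axis-aligned units: +-1, +-i, +-j, +-k
    for axis in range(4):
        for sign in (1.0, -1.0):
            q = [0.0] * 4
            q[axis] = sign
            units.append(tuple(q))
    # 16 half-integer units: (+-1 +-i +-j +-k) / 2
    for signs in product((0.5, -0.5), repeat=4):
        units.append(signs)
    return units


def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def nearest_unit(chunk, units):
    """Return (index of the closest unit, norm of the chunk).

    Assumption: "closest" means largest inner product with the chunk,
    which for unit-length candidates is the smallest angular distance.
    """
    norm = math.sqrt(sum(c * c for c in chunk))
    if norm == 0.0:
        return 0, 0.0
    best = max(
        range(len(units)),
        key=lambda i: sum(c * u for c, u in zip(chunk, units[i])),
    )
    return best, norm


UNITS = hurwitz_units()
```

For example, `nearest_unit((1.0, 1.0, 1.0, 1.0), UNITS)` selects the half-integer unit `(0.5, 0.5, 0.5, 0.5)` and returns the chunk norm `2.0`, so only a 24-way index and one scale need to be stored for this coarse stage.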
Related coverage (1)
Hurwitz Quaternion Multiplicative Quantization for KV Cache Compression
ArXiv CS.AI, 2026-05-28