NeUQI: Near-Optimal Uniform Quantization Parameter Initialization for Low-Bit LLMs (Event)

PRODUCT_LAUNCH | 2026-06-01 | Impact: MEDIUM

arXiv:2505.17595v4 | Announce Type: replace-cross

Abstract: Large language models (LLMs) achieve impressive performance across domains but face significant challenges when deployed on consumer-grade GPUs or personal devices such as laptops, due to high memory consumption and inference costs. Post-training quantization (PTQ) of LLMs offers a promising solution that reduces their memory footprint and decoding latency
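
For readers unfamiliar with the terms in the title, the following is a minimal sketch of uniform quantization with a scale and a zero-point. It uses the conventional min-max initialization as an assumed baseline; it is not the NeUQI method, which the excerpt above does not describe. The function names (`minmax_params`, `quantize`, `dequantize`) are hypothetical and chosen only for illustration.

```python
def minmax_params(weights, bits):
    """Pick scale and zero-point from the min and max of the weights.

    Assumption: this is the common min-max baseline, not NeUQI.
    """
    levels = 2 ** bits - 1              # highest integer code, e.g. 15 for 4 bits
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    zero_point = round(-lo / scale)     # integer code that represents 0.0
    return scale, zero_point


def quantize(weights, scale, zero_point, bits):
    """Map each float weight to an integer code in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    return [min(max(round(w / scale) + zero_point, 0), levels) for w in weights]


def dequantize(codes, scale, zero_point):
    """Recover approximate float weights from the integer codes."""
    return [(q - zero_point) * scale for q in codes]


if __name__ == "__main__":
    w = [-0.8, -0.1, 0.0, 0.3, 1.2]
    scale, zp = minmax_params(w, bits=4)
    codes = quantize(w, scale, zp, bits=4)
    print(codes)
    print(dequantize(codes, scale, zp))
```

Storing each weight as a 4-bit code instead of a 16-bit float cuts weight storage by roughly a factor of four, before accounting for the per-group scale and zero-point overhead. How the scale and zero-point are chosen determines the reconstruction error, which is the step the title's "parameter initialization" refers to.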