Model-Preserving Adaptive Rounding (article)

ArXiv CS.AI | 2026-06-04 | NEWS | en | Authors: Albert Tseng, Zhaofeng Sun, Christopher De Sa

Details

Source site: ArXiv CS.AI
Authors: Albert Tseng, Zhaofeng Sun, Christopher De Sa
Article type: NEWS
Language: en
Publication date: 2026-06-04

Abstract

arXiv:2505.22988v3 (announce type: replace-cross)

The goal of quantization is to produce a compressed model whose output distribution is as close to the original model's as possible. To do this tractably, most quantization algorithms minimize the immediate activation error of each layer as a proxy for the end-to-end error. However, this ignores the effect of future layers, making it a poor proxy. In this work, we introduce Yet Another Quantization Algorithm (YAQA), an adaptive rounding algorithm that directly considers the error at the network's output. YAQA introduces a series of theoretical results that culminate in the first end-to-end error bounds for quantization algorithms. First, we characterize the convergence time of adaptive rounding algorithms via the structure of their Hessian approximations. We then show that the end-to-end error can be bounded by the approximation's cosine similarity to the true Hessian.
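
To make the two quantities in the abstract concrete, here is a minimal NumPy sketch. It is not YAQA and not the authors' code. It assumes a single linear layer with weights `W` and calibration inputs `X`, uses plain round-to-nearest as a stand-in quantizer, and treats `H = X X^T` as the layer-wise proxy Hessian. All function names are hypothetical. The last step only illustrates what "cosine similarity between a Hessian approximation and a reference Hessian" means as a metric; the reference here is the layer-wise `H`, not the end-to-end Hessian the abstract refers to.

```python
import numpy as np

def round_to_nearest(W, bits=4):
    """Uniform symmetric round-to-nearest quantization (illustrative baseline only)."""
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(W).max() / levels
    return np.clip(np.round(W / scale), -levels, levels) * scale

def layer_proxy_error(W, W_hat, X):
    """Immediate activation error ||(W - W_hat) X||_F^2 for one linear layer."""
    D = W - W_hat
    return float(np.linalg.norm(D @ X, "fro") ** 2)

def proxy_error_via_hessian(W, W_hat, X):
    """The same quantity written as tr(D H D^T) with H = X X^T."""
    D = W - W_hat
    H = X @ X.T
    return float(np.trace(D @ H @ D.T))

def cosine_similarity(A, B):
    """Cosine similarity of two matrices under the Frobenius inner product."""
    return float(np.sum(A * B) / (np.linalg.norm(A) * np.linalg.norm(B)))

# Synthetic data: 8 output features, 16 input features, 64 calibration samples.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
X = rng.standard_normal((16, 64))

W_hat = round_to_nearest(W, bits=4)
err_direct = layer_proxy_error(W, W_hat, X)
err_hessian = proxy_error_via_hessian(W, W_hat, X)

# A cruder approximation (diagonal only) compared against the full proxy Hessian.
H = X @ X.T
H_diag = np.diag(np.diag(H))
sim = cosine_similarity(H, H_diag)

print(f"proxy error (direct):  {err_direct:.4f}")
print(f"proxy error (Hessian): {err_hessian:.4f}")
print(f"cosine similarity of diagonal approximation: {sim:.4f}")
```

The two error values agree because the squared Frobenius norm of `(W - W_hat) X` equals `tr((W - W_hat) H (W - W_hat)^T)`. This is the layer-local objective the abstract describes as a proxy; it says nothing about how the error propagates through later layers, which is the gap the abstract says YAQA addresses.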
