Model-Preserving Adaptive Rounding (article)

ArXiv CS.AI | 2026-06-04 | NEWS | en | Authors: Albert Tseng, Zhaofeng Sun, Christopher De Sa

Details

Source site
ArXiv CS.AI
Authors
Albert Tseng, Zhaofeng Sun, Christopher De Sa
Article type
NEWS
Language
en
Publication date
2026-06-04

Abstract

arXiv:2505.22988v3 (announce type: replace-cross)

The goal of quantization is to produce a compressed model whose output distribution is as close to the original model's as possible. To do this tractably, most quantization algorithms minimize the immediate activation error of each layer as a proxy for the end-to-end error. However, this ignores the effect of future layers, making it a poor proxy. In this work, we introduce Yet Another Quantization Algorithm (YAQA), an adaptive rounding algorithm that directly considers the error at the network's output. YAQA introduces a series of theoretical results that culminate in the first end-to-end error bounds for quantization algorithms. First, we characterize the convergence time of adaptive rounding algorithms via the structure of their Hessian approximations. We then show that the end-to-end error can be bounded by the approximation's cosine similarity to the true Hessian.
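To make two quantities from the abstract concrete, here is a minimal pure-Python sketch. It is not YAQA and is not taken from the paper. It only illustrates (a) the layer-wise activation error that the abstract describes as the usual proxy, written as the squared Frobenius norm of (W - W_hat) X, and (b) a cosine similarity between two matrices under the Frobenius inner product. All names (`round_to_grid`, `layerwise_proxy_error`, `cosine_similarity`), the nearest-grid rounding, the random data, and the diagonal approximation are assumptions chosen for illustration. The matrix `H = X X^T` is only the layer-wise proxy Hessian up to a constant factor; it stands in for a reference matrix here and is not the end-to-end Hessian the abstract refers to.

```python
import math
import random


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_inner(a, b):
    """Sum of elementwise products of two same-shaped matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def cosine_similarity(a, b):
    """Cosine of the angle between two matrices viewed as flat vectors."""
    return frobenius_inner(a, b) / math.sqrt(
        frobenius_inner(a, a) * frobenius_inner(b, b)
    )


def round_to_grid(w, step):
    """Hypothetical baseline: round each weight to the nearest multiple of step."""
    return [[step * round(x / step) for x in row] for row in w]


def layerwise_proxy_error(w, w_hat, x):
    """Squared Frobenius norm of (W - W_hat) X, the per-layer activation error."""
    delta = [[p - q for p, q in zip(rw, rq)] for rw, rq in zip(w, w_hat)]
    out = matmul(delta, x)
    return frobenius_inner(out, out)


random.seed(0)
W = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]   # 3x4 weights
X = [[random.gauss(0, 1) for _ in range(16)] for _ in range(4)]  # 4x16 inputs
W_hat = round_to_grid(W, 0.5)

proxy = layerwise_proxy_error(W, W_hat, X)

# The same error written as tr(D H D^T) with D = W - W_hat and H = X X^T.
H = matmul(X, transpose(X))
D = [[p - q for p, q in zip(rw, rq)] for rw, rq in zip(W, W_hat)]
trace_form = frobenius_inner(matmul(D, H), D)

# A crude diagonal approximation of H, compared with H by cosine similarity.
H_diag = [[H[i][j] if i == j else 0.0 for j in range(4)] for i in range(4)]
similarity = cosine_similarity(H_diag, H)

print(proxy, trace_form, similarity)
```

The two error values printed agree up to floating-point rounding, which shows why the layer-wise proxy is a quadratic form in the weight perturbation. The similarity value lies in (0, 1] and equals 1 only if `H` is itself diagonal; how such a similarity relates to end-to-end error is the subject of the paper's bounds and is not reproduced here.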
