Token Inflation: How Dishonest Providers Can Overcharge for Large Language Model Usage

ArXiv CS.CL | 2026-05-29 | News | en | arXiv:2605.30040v1 (cross-listed)
Authors: Shahinul Hoque, Jinghuai Zhang, Jinyuan Sun, Fnu Suya

Abstract

Per-token billing is now the standard pricing model for commercial large language models (LLMs), so the honesty of reported token counts directly affects what users pay. We show that this kind of billing is hard to audit by design: providers hide the model, the tokenizer, and the execution to protect their IP, mitigate jailbreaks, and preserve user privacy, which means an auditor can only inspect proofs the provider supplies. The audit therefore reduces to a consistency check on the provider's own reports. We call this a trust paradox: every audit must trust some artifact, but current frameworks trust exactly the ones a provider has the strongest reason to manipulate. We study three recent token auditing frameworks and show that a provider with ordinary commercial capabilities can systematically inflate billed token counts. In the most permissive setting, hidden reasoning usage can be inflated by 1,469% on average without detection.
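To make the scale of that figure concrete, here is a minimal Python sketch of the arithmetic and of why a pure consistency check cannot catch inflation. It is a toy illustration only and does not reproduce any of the three frameworks the paper studies: the function names, the report layout (`billed_tokens`, `proof_items`), and the price used in the example are all hypothetical assumptions.

```python
def inflated_count(true_tokens: int, inflation_pct: float) -> int:
    """Billed token count after inflating the true count by inflation_pct percent."""
    return round(true_tokens * (1 + inflation_pct / 100))


def overcharge_usd(true_tokens: int, inflation_pct: float, usd_per_million: float) -> float:
    """Extra cost from inflation, given a hypothetical per-million-token price."""
    extra_tokens = inflated_count(true_tokens, inflation_pct) - true_tokens
    return extra_tokens * usd_per_million / 1_000_000


def make_report(true_tokens: int, inflation_pct: float = 0.0) -> dict:
    """Hypothetical provider report: a billed count plus opaque per-token proof items.

    The provider controls both fields, so it can pad them together.
    """
    billed = inflated_count(true_tokens, inflation_pct)
    return {"billed_tokens": billed, "proof_items": ["opaque"] * billed}


def consistency_audit(report: dict) -> bool:
    """Toy audit: checks only that the report agrees with itself.

    It never sees the true token count, so it cannot tell honest from inflated.
    """
    return report["billed_tokens"] == len(report["proof_items"])


honest = make_report(1_000)
inflated = make_report(1_000, inflation_pct=1469)

print(inflated["billed_tokens"])                           # billed count for 1,000 true tokens
print(consistency_audit(honest), consistency_audit(inflated))
```

Under these assumptions, a 1,469% inflation turns 1,000 hidden reasoning tokens into 15,690 billed tokens, roughly 15.7 times the true usage, and both the honest and the inflated report pass the toy audit because the check only compares provider-supplied artifacts against each other.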
