Before and After Temperature: A Distributional View of Creative LLM Generation

arXiv cs.CL, 2026-06-02. Authors: V. S. Raghu Parupudi, Harsha Ponnada, Aditi Kaushal, S. Shria Parupudi, Saiteja Dasari, Sahiti Bulusu

Abstract

arXiv:2606.01451v1 (announce type: new). Reference-free evaluation of large language model (LLM) creativity relies on perplexity, entropy, and top-1 margin. We show that a much stronger signal lives one step earlier in the pipeline: in how sampling temperature reshapes the model's token distribution before the next token is drawn. On Llama-3.1-8B-Instruct generations of 500 open-ended creative prompts at $T \in \{0.3, 0.8, 1.5\}$, a single per-token feature derived from this reshaping predicts the within-prompt creativity rank at Spearman $\rho = 0.918$ against an averaged gpt-4o / gemini-2.5-pro judge ($n = 500$) and $\rho = 0.870$ against a three-rater human-majority ranking ($n = 150$). Each of four standard reference-free baselines (self-perplexity, mean predictive entropy, top-1 margin, gzip compression ratio) tops out at $|\rho| \approx 0.76$ on both ground truths: a gap of $+0.165$ on averaged-LLM and $+0.
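As a rough illustration of what "reshaping" means here, the sketch below applies a temperature-scaled softmax to a made-up logit vector and compares two of the baseline quantities named in the abstract (predictive entropy and top-1 margin) before and after scaling. The logits, the helper names, and the choice of $T = 1$ as the "before" distribution are all assumptions for illustration. The excerpt does not specify the paper's per-token feature, and this sketch does not attempt to reproduce it.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: p_i proportional to exp(z_i / T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def top1_margin(probs):
    """Gap between the two most probable tokens."""
    first, second = sorted(probs, reverse=True)[:2]
    return first - second

# Hypothetical logits for a 5-token vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.5, 0.0, -1.0]

# Assumption: "before" means the unscaled distribution (T = 1).
before = softmax(logits, temperature=1.0)

# The three temperatures quoted in the abstract.
for T in (0.3, 0.8, 1.5):
    after = softmax(logits, temperature=T)
    print(
        f"T={T}: entropy {entropy(before):.3f} -> {entropy(after):.3f}, "
        f"top-1 margin {top1_margin(before):.3f} -> {top1_margin(after):.3f}"
    )
```

Lower temperatures concentrate mass on the top token (lower entropy, larger margin), while higher temperatures flatten the distribution. Any feature built on the before/after pair would start from quantities of this kind.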
